Introduction
Welcome to the Octivas API documentation. Octivas provides a powerful set of APIs for web extraction, crawling, and search. Whether you need to extract structured data from websites, crawl entire domains, or search the web programmatically, Octivas has you covered.
Key Features
- Extract: Pull structured data from any webpage using CSS selectors or AI-powered extraction.
- Crawl: Recursively crawl websites and collect data at scale.
- Search: Search the web and get structured results with content extraction.
Quickstart
Get started with Octivas in just a few steps. This guide will walk you through making your first API request.
1. Get your API key
Sign up for a free account and navigate to your dashboard to get your API key.
2. Install the SDK
```shell
npm install octivas
```
3. Make your first request
```javascript
import Octivas from 'octivas';

const client = new Octivas('your_api_key');

const result = await client.extract({
  url: 'https://example.com',
  schema: {
    title: 'string',
    description: 'string'
  }
});

console.log(result.data);
```
Authentication
Every API request must be authenticated. Send your API key as a Bearer token in the Authorization header of each request.
```shell
curl -X POST https://api.octivas.com/v1/extract \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
```
Security Note: Never expose your API key in client-side code. Always make API calls from your server.
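One way to follow the security note is to keep the key in a server-side environment variable and attach it when the request is built. The sketch below uses only Python's standard library and constructs the same request as the curl command without sending it. The environment variable name `OCTIVAS_API_KEY` and the helper `build_extract_request` are assumptions made for illustration, not part of the Octivas API.

```python
import json
import os
import urllib.request

API_URL = "https://api.octivas.com/v1/extract"  # endpoint from the curl example


def build_extract_request(target_url, api_key=None):
    """Build (but do not send) an authenticated POST request.

    OCTIVAS_API_KEY is a hypothetical environment variable name.
    """
    key = api_key or os.environ.get("OCTIVAS_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_extract_request("https://example.com", api_key="your_api_key")
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

To actually send the request, pass it to `urllib.request.urlopen(request)` from server-side code only.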
Extract API
The Extract API allows you to pull structured data from any webpage. Define a schema and we'll extract the data for you.
Endpoint
POST https://api.octivas.com/v1/extract
Parameters
| Parameter | Type | Description |
|---|---|---|
| url | string | The URL to extract data from |
| schema | object | The data schema to extract |
| render_js | boolean | Enable JavaScript rendering (default: false) |
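The three parameters above can be assembled into a JSON request body before it is sent. The following is a minimal sketch; the helper name `build_extract_payload` and its validation rules are assumptions for illustration. The table does not say which parameters are required, so this sketch treats `schema` as optional, mirroring the curl example that sends only `url`.

```python
import json
from urllib.parse import urlparse


def build_extract_payload(url, schema=None, render_js=False):
    """Assemble a JSON-serializable body from the documented parameters."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be an absolute http(s) URL, got {url!r}")
    if not isinstance(render_js, bool):
        raise TypeError("render_js must be a boolean")
    payload = {"url": url, "render_js": render_js}
    if schema is not None:
        if not isinstance(schema, dict):
            raise TypeError("schema must be an object (a dict in Python)")
        payload["schema"] = schema
    return payload


payload = build_extract_payload(
    "https://example.com",
    schema={"title": "string", "description": "string"},
)
print(json.dumps(payload, indent=2))
```

Validating locally like this catches a malformed URL or a non-boolean flag before any request is made.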
Example
```javascript
const result = await client.extract({
  url: 'https://news.example.com/article/123',
  schema: {
    title: 'string',
    author: 'string',
    publishedAt: 'date',
    content: 'string',
    tags: 'string[]'
  },
  render_js: true
});
```
Crawl API
The Crawl API lets you recursively crawl websites, following links and collecting data from multiple pages.
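To make "recursively following links" concrete, here is a small sketch of a breadth-first crawl over an in-memory link graph, bounded by a page limit and a link depth. The names `max_pages` and `depth` mirror options used in this documentation's SDK examples, but how Octivas actually schedules or bounds a crawl is not documented here; the `SITE` graph and the `crawl` function are purely illustrative.

```python
from collections import deque

# A tiny in-memory "site": each page maps to the links found on it.
SITE = {
    "/": ["/posts/a", "/posts/b", "/about"],
    "/posts/a": ["/posts/b", "/posts/c"],
    "/posts/b": ["/"],
    "/posts/c": [],
    "/about": ["/team"],
    "/team": [],
}


def crawl(start, links, max_pages, depth):
    """Breadth-first traversal bounded by page count and link depth."""
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, level = queue.popleft()
        if level == depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target in seen:
                continue  # each page is collected at most once
            if len(visited) >= max_pages:
                return visited
            seen.add(target)
            visited.append(target)
            queue.append((target, level + 1))
    return visited


print(crawl("/", SITE, max_pages=50, depth=1))
print(crawl("/", SITE, max_pages=3, depth=2))
```

With `depth=1` only pages linked directly from the start page are collected; with `max_pages=3` the traversal stops as soon as three pages have been gathered.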
Endpoint
POST https://api.octivas.com/v1/crawl
Example
```javascript
const job = await client.crawl({
  startUrl: 'https://blog.example.com',
  maxPages: 50,
  depth: 2,
  followLinks: true,
  includePatterns: ['/posts/*'],
  excludePatterns: ['/admin/*', '/login']
});

// Poll for results
const results = await job.waitForCompletion();
console.log(`Crawled ${results.pages.length} pages`);
```
Search API
The Search API allows you to search the web programmatically and get structured results with optional content extraction.
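As a rough picture of what "structured results" might look like on the consuming side, the sketch below parses a list of result objects into typed records. The `title`, `url`, and `snippet` fields match those used in this documentation's search example; the raw JSON shape, the optional `content` field, and the sample data are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    title: str
    url: str
    snippet: str
    # Assumed field: populated only if content extraction was requested.
    content: Optional[str] = None


def parse_results(raw_items):
    """Convert a list of dicts into SearchResult records."""
    return [
        SearchResult(
            title=item["title"],
            url=item["url"],
            snippet=item.get("snippet", ""),
            content=item.get("content"),
        )
        for item in raw_items
    ]


# Hypothetical sample data, not real API output.
sample = [
    {
        "title": "Intro to ML",
        "url": "https://example.com/ml",
        "snippet": "A beginner guide.",
        "content": "Full page text",
    },
    {"title": "ML by example", "url": "https://example.com/ml2"},
]

results = parse_results(sample)
for result in results:
    print(result.title, result.url)
```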
Endpoint
POST https://api.octivas.com/v1/search
Example
```javascript
const results = await client.search({
  query: 'machine learning tutorials',
  numResults: 10,
  extractContent: true,
  region: 'us'
});

results.forEach(result => {
  console.log(result.title);
  console.log(result.url);
  console.log(result.snippet);
});
```
Python SDK
Install and use the official Python SDK for Octivas.
Installation
```shell
pip install octivas
```
Usage
```python
import octivas

client = octivas.Client("your_api_key")

# Extract data
result = client.extract(
    url="https://example.com",
    schema={"title": "string", "price": "number"}
)
print(result.data)

# Crawl a website
job = client.crawl(
    start_url="https://example.com",
    max_pages=100
)
for page in job.pages():
    print(page.url)

# Search the web
results = client.search(query="python tutorials")
for result in results:
    print(result.title)
```
JavaScript SDK
Install and use the official JavaScript/TypeScript SDK for Octivas.
Installation
```shell
npm install octivas
```
Usage
```javascript
import Octivas from 'octivas';

const client = new Octivas('your_api_key');

// Extract data
const extractResult = await client.extract({
  url: 'https://example.com',
  schema: { title: 'string', price: 'number' }
});
console.log(extractResult.data);

// Crawl a website
const crawlJob = await client.crawl({
  startUrl: 'https://example.com',
  maxPages: 100
});
for await (const page of crawlJob.pages()) {
  console.log(page.url);
}

// Search the web
const searchResults = await client.search({
  query: 'javascript tutorials'
});
searchResults.forEach(result => console.log(result.title));
```