Introduction

Welcome to the Octivas API documentation. Octivas provides a powerful set of APIs for web extraction, crawling, and search. Whether you need to extract structured data from websites, crawl entire domains, or search the web programmatically, Octivas has you covered.

Key Features

  • Extract: Pull structured data from any webpage using CSS selectors or AI-powered extraction.
  • Crawl: Recursively crawl websites and collect data at scale.
  • Search: Search the web and get structured results with content extraction.

Quickstart

Get started with Octivas in just a few steps. This guide will walk you through making your first API request.

1. Get your API key

Sign up for a free account and navigate to your dashboard to get your API key.

2. Install the SDK

bash
npm install octivas

3. Make your first request

javascript
import Octivas from 'octivas';

const client = new Octivas('your_api_key');

const result = await client.extract({
  url: 'https://example.com',
  schema: {
    title: 'string',
    description: 'string'
  }
});

console.log(result.data);

Authentication

All API requests require authentication. Include your API key as a Bearer token in the Authorization header of each request.

bash
curl -X POST https://api.octivas.com/v1/extract \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Security Note: Never expose your API key in client-side code. Always make API calls from your server.
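In practice this means constructing the client on your server and reading the key from configuration rather than embedding it in shipped code. A minimal sketch for a Node.js server, assuming you store the key in an environment variable (the OCTIVAS_API_KEY name below is just an illustration, not something the SDK requires):

javascript
import Octivas from 'octivas';

// Load the key from the server environment instead of hard-coding it.
// OCTIVAS_API_KEY is an example variable name chosen for this sketch.
const client = new Octivas(process.env.OCTIVAS_API_KEY);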

Extract API

The Extract API allows you to pull structured data from any webpage. Define a schema and we'll extract the data for you.

Endpoint

POST https://api.octivas.com/v1/extract

Parameters

Parameter    Type      Description
url          string    The URL to extract data from
schema       object    The data schema to extract
render_js    boolean   Enable JavaScript rendering (default: false)
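If you call the REST endpoint directly rather than through an SDK, these parameters go in the JSON request body alongside the Bearer token shown in Authentication. A minimal sketch using fetch; the shape of the JSON response is an assumption here (mirroring the SDK's result.data), not documented behavior:

javascript
const response = await fetch('https://api.octivas.com/v1/extract', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer your_api_key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://example.com',
    schema: { title: 'string', description: 'string' },
    render_js: true
  })
});

// Assumption: the response body exposes the extracted fields under `data`,
// mirroring the SDK's result.data.
const { data } = await response.json();
console.log(data);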

Example

javascript
const result = await client.extract({
  url: 'https://news.example.com/article/123',
  schema: {
    title: 'string',
    author: 'string',
    publishedAt: 'date',
    content: 'string',
    tags: 'string[]'
  },
  render_js: true
});
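The extracted values come back keyed by the names in your schema. Assuming result.data mirrors those schema keys (as in the Quickstart's result.data), you would read them like this:

javascript
// Assumption: result.data is keyed by the schema fields defined above.
const { title, author, publishedAt, tags } = result.data;
console.log(`${title} by ${author} (${publishedAt})`);
console.log(`Tags: ${tags.join(', ')}`);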

Crawl API

The Crawl API lets you recursively crawl websites, following links and collecting data from multiple pages.

Endpoint

POST https://api.octivas.com/v1/crawl

Example

javascript
const job = await client.crawl({
  startUrl: 'https://blog.example.com',
  maxPages: 50,
  depth: 2,
  followLinks: true,
  includePatterns: ['/posts/*'],
  excludePatterns: ['/admin/*', '/login']
});

// Poll for results
const results = await job.waitForCompletion();
console.log(`Crawled ${results.pages.length} pages`);
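Each entry in results.pages represents one crawled page. Assuming crawled pages expose a url property, as the SDK examples later in this document do, you can walk the results like this:

javascript
// Assumption: each crawled page carries a `url` property, as in the SDK examples.
for (const page of results.pages) {
  console.log(page.url);
}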

Python SDK

Install and use the official Python SDK for Octivas.

Installation

bash
pip install octivas

Usage

python
import octivas

client = octivas.Client("your_api_key")

# Extract data
result = client.extract(
    url="https://example.com",
    schema={"title": "string", "price": "number"}
)
print(result.data)

# Crawl a website
job = client.crawl(
    start_url="https://example.com",
    max_pages=100
)
for page in job.pages():
    print(page.url)

# Search the web
results = client.search(query="python tutorials")
for result in results:
    print(result.title)

JavaScript SDK

Install and use the official JavaScript/TypeScript SDK for Octivas.

Installation

bash
npm install octivas

Usage

javascript
import Octivas from 'octivas';

const client = new Octivas('your_api_key');

// Extract data
const extractResult = await client.extract({
  url: 'https://example.com',
  schema: { title: 'string', price: 'number' }
});
console.log(extractResult.data);

// Crawl a website
const crawlJob = await client.crawl({
  startUrl: 'https://example.com',
  maxPages: 100
});
for await (const page of crawlJob.pages()) {
  console.log(page.url);
}

// Search the web
const searchResults = await client.search({
  query: 'javascript tutorials'
});
searchResults.forEach(result => console.log(result.title));
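Requests can fail for ordinary reasons such as network errors or unreachable URLs. A minimal sketch of handling this, assuming the SDK rejects its promise with a standard Error when a request fails:

javascript
// Assumption: failed requests reject with a standard Error.
try {
  const result = await client.extract({
    url: 'https://example.com',
    schema: { title: 'string' }
  });
  console.log(result.data);
} catch (err) {
  console.error('Extraction failed:', err.message);
}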