Introduction

Welcome to the Octivas API documentation. Octivas provides APIs for web extraction, crawling, and search: extract structured data from individual pages, crawl entire domains, or search the web programmatically.

Key Features

• Extract: Pull structured data from any webpage using CSS selectors or AI-powered extraction.
• Crawl: Recursively crawl websites and collect data at scale.
• Search: Search the web and get structured results with content extraction.

Quickstart

Get started with Octivas in just a few steps. This guide will walk you through making your first API request.

1. Get your API key

2. Install the SDK

bash

npm install octivas

3. Make your first request

javascript

import Octivas from 'octivas';

const client = new Octivas('your_api_key');

const result = await client.extract({
  url: 'https://example.com',
  schema: {
    title: 'string',
    description: 'string'
  }
});

console.log(result.data);

Authentication

All API requests require authentication using your API key. Include your API key in the Authorization header of each request.

bash

curl -X POST https://api.octivas.com/v1/extract \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Security Note: Never expose your API key in client-side code. Always make API calls from your server.
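
As a rough server-side sketch of the header format shown above, the helper below builds the same request as the curl example without sending it. The function names and the `OCTIVAS_API_KEY` environment variable are hypothetical and are not part of the Octivas SDK; only the endpoint, the `Authorization: Bearer` header, and the `url` body field come from this page.

```javascript
// Hypothetical helper (not part of the Octivas SDK): builds the headers
// described above and refuses to continue when no key is configured.
function buildAuthHeaders(apiKey) {
  if (typeof apiKey !== 'string' || apiKey.trim() === '') {
    throw new Error('Missing API key: configure it in the server environment');
  }
  return {
    Authorization: `Bearer ${apiKey.trim()}`,
    'Content-Type': 'application/json',
  };
}

// Builds the arguments for fetch() without performing any network call.
function buildExtractRequest(apiKey, url) {
  return {
    endpoint: 'https://api.octivas.com/v1/extract',
    options: {
      method: 'POST',
      headers: buildAuthHeaders(apiKey),
      body: JSON.stringify({ url }),
    },
  };
}

// Server-side usage (Node 18+ provides a global fetch).
// OCTIVAS_API_KEY is an assumed variable name:
// const { endpoint, options } =
//   buildExtractRequest(process.env.OCTIVAS_API_KEY, 'https://example.com');
// const response = await fetch(endpoint, options);
```

Reading the key from the server environment keeps it out of source control and out of any bundle shipped to the browser.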

Extract API

The Extract API allows you to pull structured data from any webpage. Define a schema and we'll extract the data for you.

Endpoint

POST https://api.octivas.com/v1/extract

Parameters

Parameter	Type	Description
`url`	string	The URL to extract data from
`schema`	object	The data schema to extract
`render_js`	boolean	Enable JavaScript rendering (default: false)
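
The parameters above could be checked on the client before a request is sent. The following is a minimal sketch, not Octivas's own validation: `normalizeExtractParams` is a hypothetical name, and treating `schema` as optional is an assumption based on the curl example in the Authentication section, which sends only `url`.

```javascript
// Hypothetical pre-flight check for the Extract parameters listed above.
function normalizeExtractParams({ url, schema, render_js = false } = {}) {
  // new URL() throws a TypeError when the input is not a valid absolute URL.
  const parsed = new URL(url);
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new TypeError('url must use http or https');
  }
  if (typeof render_js !== 'boolean') {
    throw new TypeError('render_js must be a boolean');
  }

  const body = { url, render_js };

  // Assumption: schema may be omitted; when present it must be a plain object.
  if (schema !== undefined) {
    if (schema === null || typeof schema !== 'object' || Array.isArray(schema)) {
      throw new TypeError('schema must be an object');
    }
    body.schema = schema;
  }
  return body;
}

const body = normalizeExtractParams({
  url: 'https://example.com',
  schema: { title: 'string' },
});
console.log(JSON.stringify(body));
```

Applying the documented default (`render_js: false`) in one place makes the request body explicit about whether JavaScript rendering was requested.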

Example

javascript

const result = await client.extract({
  url: 'https://news.example.com/article/123',
  schema: {
    title: 'string',
    author: 'string',
    publishedAt: 'date',
    content: 'string',
    tags: 'string[]'
  },
  render_js: true
});
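
Because extraction results depend on the page, it may be worth checking the returned data against the schema before using it. The sketch below is illustrative only: `findSchemaMismatches` is a hypothetical helper, the type names are taken from the examples on this page, and the assumption that `date` values arrive as strings parseable by `Date.parse` is not confirmed by this documentation.

```javascript
// Assumed runtime representation for each schema type name used on this page.
const typeChecks = {
  string: (v) => typeof v === 'string',
  number: (v) => typeof v === 'number' && Number.isFinite(v),
  // Assumption: dates are returned as strings that Date.parse understands.
  date: (v) => typeof v === 'string' && !Number.isNaN(Date.parse(v)),
  'string[]': (v) => Array.isArray(v) && v.every((item) => typeof item === 'string'),
};

// Returns the names of schema fields whose value is missing or of the wrong type.
function findSchemaMismatches(data, schema) {
  return Object.entries(schema)
    .filter(([field, type]) => {
      const check = typeChecks[type];
      // Unknown type names are reported as mismatches rather than ignored.
      return !check || !check(data[field]);
    })
    .map(([field]) => field);
}

const schema = { title: 'string', publishedAt: 'date', tags: 'string[]' };
const data = { title: 'Hello', publishedAt: '2024-01-15T10:00:00Z', tags: ['a', 1] };
console.log(findSchemaMismatches(data, schema));
```

In this sample, only `tags` is reported, because one of its items is a number rather than a string.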

Crawl API

The Crawl API lets you recursively crawl websites, following links and collecting data from multiple pages.

Endpoint

POST https://api.octivas.com/v1/crawl

Example

javascript

const job = await client.crawl({
  startUrl: 'https://blog.example.com',
  maxPages: 50,
  depth: 2,
  followLinks: true,
  includePatterns: ['/posts/*'],
  excludePatterns: ['/admin/*', '/login']
});

// Poll for results
const results = await job.waitForCompletion();
console.log(`Crawled ${results.pages.length} pages`);
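
To reason about which pages a crawl like the one above would visit, the include and exclude patterns can be modeled locally. This page does not document the matching rules, so the sketch below assumes that `*` matches any run of characters, that patterns are matched against the whole URL path, and that exclusions take precedence over inclusions. `patternToRegExp` and `shouldCrawl` are hypothetical helpers, and the real service may behave differently.

```javascript
// Converts a pattern such as '/posts/*' into an anchored regular expression.
// Assumption: '*' is the only wildcard and matches any run of characters.
function patternToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// Assumption: a path is crawled when no exclude pattern matches it and
// either no include patterns are given or at least one of them matches.
function shouldCrawl(path, { includePatterns = [], excludePatterns = [] } = {}) {
  const matchesAny = (patterns) =>
    patterns.some((pattern) => patternToRegExp(pattern).test(path));
  if (matchesAny(excludePatterns)) return false;
  return includePatterns.length === 0 || matchesAny(includePatterns);
}

const rules = {
  includePatterns: ['/posts/*'],
  excludePatterns: ['/admin/*', '/login'],
};
for (const path of ['/posts/hello-world', '/admin/users', '/login', '/about']) {
  console.log(path, shouldCrawl(path, rules));
}
```

Under these assumptions only `/posts/hello-world` would be crawled: `/admin/users` and `/login` are excluded, and `/about` matches no include pattern.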

Search API

The Search API allows you to search the web programmatically and get structured results with optional content extraction.

Endpoint

POST https://api.octivas.com/v1/search

Example

javascript

const results = await client.search({
  query: 'machine learning tutorials',
  numResults: 10,
  extractContent: true,
  region: 'us'
});

results.forEach(result => {
  console.log(result.title);
  console.log(result.url);
  console.log(result.snippet);
});
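
The loop above prints each field on its own line. As a small sketch of a more defensive alternative, the hypothetical helper below formats one result as a text block, tolerating a missing title or snippet and truncating long snippets. It relies only on the `title`, `url`, and `snippet` fields used in the example; whether those fields can actually be absent in a response is an assumption.

```javascript
// Hypothetical formatter for a single search result object.
function formatSearchResult(result, maxSnippetLength = 80) {
  const title = result.title || '(untitled)';
  const snippet = result.snippet || '';
  // Truncate long snippets so the total length stays within the limit.
  const shortSnippet = snippet.length > maxSnippetLength
    ? `${snippet.slice(0, maxSnippetLength - 3)}...`
    : snippet;
  // Drop empty parts so a missing snippet does not leave a blank line.
  return [title, result.url, shortSnippet].filter(Boolean).join('\n');
}

const sample = [
  { title: 'Intro to ML', url: 'https://example.com/ml', snippet: 'A short overview.' },
  { url: 'https://example.com/untitled' },
];
console.log(sample.map((item) => formatSearchResult(item)).join('\n\n'));
```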

Python SDK

Install and use the official Python SDK for Octivas.

Installation

bash

pip install octivas

Usage

python

import octivas

client = octivas.Client("your_api_key")

# Extract data
result = client.extract(
    url="https://example.com",
    schema={"title": "string", "price": "number"}
)
print(result.data)

# Crawl a website
job = client.crawl(
    start_url="https://example.com",
    max_pages=100
)
for page in job.pages():
    print(page.url)

# Search the web
results = client.search(query="python tutorials")
for result in results:
    print(result.title)

JavaScript SDK

Install and use the official JavaScript/TypeScript SDK for Octivas.

Installation

bash

npm install octivas

Usage

javascript

import Octivas from 'octivas';

const client = new Octivas('your_api_key');

// Extract data
const extractResult = await client.extract({
  url: 'https://example.com',
  schema: { title: 'string', price: 'number' }
});
console.log(extractResult.data);

// Crawl a website
const crawlJob = await client.crawl({
  startUrl: 'https://example.com',
  maxPages: 100
});
for await (const page of crawlJob.pages()) {
  console.log(page.url);
}

// Search the web
const searchResults = await client.search({
  query: 'javascript tutorials'
});
searchResults.forEach(result => console.log(result.title));