Enhanced Search API
Advanced search and scraping with extraction strategies, content filtering, and adaptive crawling.
Enhanced Search
POST /api/v1/enhanced/search
Search with advanced scraping features like extraction strategies, content filters, and link analysis.
Body Parameters
All parameters from the Search API plus:
| Parameter | Type | Default | Description |
|---|---|---|---|
| extraction_strategy | string | "none" | "none", "cosine", "json_css", "regex", "llm" |
| content_filter | string | "none" | "none", "pruning", "bm25", "llm" |
| markdown_generation | boolean | false | Generate clean markdown output |
| adaptive_crawling | boolean | false | Enable AI-guided adaptive crawling |
| virtual_scrolling | boolean | false | Handle infinite scroll pages |
| link_analysis | boolean | false | Analyze and score links |
Extraction Strategies
- cosine: Cluster content by semantic similarity, extract relevant clusters
- json_css: Extract structured data using CSS selectors and JSON schema
- regex: Extract data using regular expression patterns
- llm: Use an LLM to extract structured information
Content Filters
- pruning: Remove boilerplate, ads, navigation using heuristics
- bm25: Filter by BM25 relevance scoring against a query
- llm: Use an LLM to determine content relevance
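Because `extraction_strategy` and `content_filter` share some values ("none", "llm") but not others, it is easy to pass a value to the wrong parameter. The sketch below validates the options locally before a request is sent. The helper name `build_enhanced_search_payload` and the constants are hypothetical, not part of the API. The allowed values are copied from the table above, and whether the server rejects unknown values is not stated here.

```python
from typing import Any

# Allowed values copied from the parameter table above.
EXTRACTION_STRATEGIES = {"none", "cosine", "json_css", "regex", "llm"}
CONTENT_FILTERS = {"none", "pruning", "bm25", "llm"}
BOOLEAN_FLAGS = {
    "markdown_generation",
    "adaptive_crawling",
    "virtual_scrolling",
    "link_analysis",
}


def build_enhanced_search_payload(
    query: str,
    *,
    extraction_strategy: str = "none",
    content_filter: str = "none",
    **flags: bool,
) -> dict[str, Any]:
    """Hypothetical helper: check options locally and return the JSON body."""
    if extraction_strategy not in EXTRACTION_STRATEGIES:
        raise ValueError(f"unknown extraction_strategy: {extraction_strategy!r}")
    if content_filter not in CONTENT_FILTERS:
        raise ValueError(f"unknown content_filter: {content_filter!r}")
    unknown = set(flags) - BOOLEAN_FLAGS
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    payload: dict[str, Any] = {
        "query": query,
        "extraction_strategy": extraction_strategy,
        "content_filter": content_filter,
    }
    payload.update(flags)  # only the boolean flags the caller set explicitly
    return payload


payload = build_enhanced_search_payload(
    "latest LLM benchmarks 2025",
    extraction_strategy="cosine",
    content_filter="bm25",
    markdown_generation=True,
)
```

Passing `content_filter="cosine"` raises `ValueError` here, since "cosine" is only listed as an extraction strategy.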
Example Request
curl -X POST 'https://api.unsearch.dev/api/v1/enhanced/search' \
  -H 'X-API-Key: sk_live_xxxxx' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "latest LLM benchmarks 2025",
    "max_results": 5,
    "scrape_content": true,
    "extraction_strategy": "cosine",
    "content_filter": "bm25",
    "markdown_generation": true
  }'
Enhanced Scrape
POST /api/v1/enhanced/scrape
Advanced multi-engine scraping with full configuration control.
Body Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| urls | string[] | required | URLs to scrape (1-50) |
| extract_text | boolean | true | Extract text content |
| extract_images | boolean | true | Extract images |
| extract_links | boolean | true | Extract links |
| extract_metadata | boolean | true | Extract page metadata |
| javascript_rendering | boolean | false | Render JavaScript |
| response_format | string | "json" | "json" or "markdown" |
| screenshot | boolean | false | Capture screenshot |
| pdf | boolean | false | Generate PDF |
| include_html | boolean | false | Include raw HTML |
| wait_until | string | - | "load", "domcontentloaded", "networkidle0", "networkidle2" |
| cache_mode | string | "enabled" | "enabled", "read_only", "write_only", "bypass", "disabled" |
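Since `urls` accepts between 1 and 50 entries, a longer list has to be split across several requests. The following is a minimal sketch of that batching plus local validation of the enumerated options. `build_scrape_payload` and `batch_urls` are hypothetical helper names, and the allowed values are copied from the table above.

```python
from typing import Any

WAIT_UNTIL = {"load", "domcontentloaded", "networkidle0", "networkidle2"}
CACHE_MODES = {"enabled", "read_only", "write_only", "bypass", "disabled"}
RESPONSE_FORMATS = {"json", "markdown"}
MAX_URLS_PER_REQUEST = 50  # documented range is 1-50


def build_scrape_payload(urls: list[str], **options: Any) -> dict[str, Any]:
    """Hypothetical helper: validate one request body for the scrape endpoint."""
    if not 1 <= len(urls) <= MAX_URLS_PER_REQUEST:
        raise ValueError(f"urls must contain 1-{MAX_URLS_PER_REQUEST} entries")
    enumerated = {
        "wait_until": WAIT_UNTIL,
        "cache_mode": CACHE_MODES,
        "response_format": RESPONSE_FORMATS,
    }
    for name, allowed in enumerated.items():
        if name in options and options[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    return {"urls": list(urls), **options}


def batch_urls(urls: list[str], size: int = MAX_URLS_PER_REQUEST) -> list[list[str]]:
    """Split a URL list into consecutive batches of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


# 120 placeholder URLs become three request bodies.
urls = [f"https://example.com/page/{n}" for n in range(120)]
payloads = [
    build_scrape_payload(batch, response_format="markdown")
    for batch in batch_urls(urls)
]
```

Each element of `payloads` can then be sent as the JSON body of its own POST request.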
Example Request
curl -X POST 'https://api.unsearch.dev/api/v1/enhanced/scrape' \
  -H 'X-API-Key: sk_live_xxxxx' \
  -H 'Content-Type: application/json' \
  -d '{
    "urls": ["https://example.com/article"],
    "javascript_rendering": true,
    "response_format": "markdown",
    "extract_metadata": true
  }'
Response
{
  "request_id": "req_abc123",
  "scraped_content": [
    {
      "url": "https://example.com/article",
      "title": "Article Title",
      "text": "Full extracted text...",
      "images": ["https://example.com/img1.png"],
      "links": ["https://example.com/related"],
      "metadata": {
        "title": "Article Title",
        "description": "Meta description...",
        "author": "Author Name",
        "published_date": "2025-03-01",
        "keywords": ["AI", "LLM"]
      },
      "word_count": 2500,
      "extraction_success": true,
      "extraction_time_ms": 1200
    }
  ],
  "metadata": {
    "total_urls": 1,
    "successful": 1,
    "failed": 0
  }
}
Extract Tables
POST /api/v1/enhanced/extract-tables
Extract structured tables from HTML content.
Body Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| html_content | string | required | HTML to extract tables from |
| base_url | string | - | Base URL for resolving relative links |
| strategy | string | "auto" | Extraction strategy |
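No example request is given for this endpoint, so here is a hedged sketch of how one might be prepared with Python's standard library. It assumes the same host and `X-API-Key` header as the curl examples elsewhere in this document. `extract_tables_request` is a hypothetical helper, and the request is built but not sent.

```python
import json
import urllib.request
from typing import Optional

API_ROOT = "https://api.unsearch.dev"  # host taken from the curl examples


def extract_tables_request(
    html_content: str, api_key: str, base_url: Optional[str] = None
) -> urllib.request.Request:
    """Hypothetical helper: prepare, but do not send, the POST request."""
    body = {"html_content": html_content}
    if base_url is not None:
        body["base_url"] = base_url  # optional, so omitted when not given
    return urllib.request.Request(
        f"{API_ROOT}/api/v1/enhanced/extract-tables",
        data=json.dumps(body).encode("utf-8"),
        # Assumption: same auth header as the search and scrape examples.
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


html = (
    "<table><tr><th>Model</th><th>Score</th></tr>"
    "<tr><td>A</td><td>0.9</td></tr></table>"
)
request = extract_tables_request(html, "sk_live_xxxxx", base_url="https://example.com/")
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

The `strategy` parameter is left at its default because only "auto" is documented here.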
Chunk Content
POST /api/v1/enhanced/chunk-content
Split text into optimized chunks for vector storage.
Body Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| text | string | required | Text to chunk |
| strategy | string | "auto" | Chunking strategy |
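As a small illustration, the request body for this endpoint could be serialized as shown below. `chunk_content_body` is a hypothetical helper. Rejecting blank text is a local assumption based on `text` being required, and only the "auto" strategy is documented here.

```python
import json


def chunk_content_body(text: str, strategy: str = "auto") -> bytes:
    """Hypothetical helper: serialize the chunk-content request body as UTF-8 JSON."""
    if not text.strip():
        # Assumption: a blank string is treated the same as a missing required field.
        raise ValueError("text is required and must not be blank")
    # ensure_ascii=False keeps non-ASCII text readable; the result is UTF-8 encoded.
    return json.dumps({"text": text, "strategy": strategy}, ensure_ascii=False).encode("utf-8")


body = chunk_content_body("Première section.\n\nSecond section of the document.")
```

The resulting bytes can be sent as the body of a POST to `/api/v1/enhanced/chunk-content` with a `Content-Type: application/json` header.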
Discover URLs
POST /api/v1/enhanced/discover-urls
Discover URLs from a source page.
Body Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| base_url | string | required | Starting URL |
| max_urls | integer | 100 | Max URLs to discover |
| pattern | string | - | URL pattern filter |
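A request body for this endpoint might be assembled as in the sketch below. `discover_urls_payload` is a hypothetical helper, and the local checks (an http(s) scheme, a positive `max_urls`) are assumptions rather than documented server rules. The syntax of `pattern` is not specified here, so the helper passes it through unchanged.

```python
from typing import Any, Optional


def discover_urls_payload(
    base_url: str, max_urls: int = 100, pattern: Optional[str] = None
) -> dict[str, Any]:
    """Hypothetical helper: build the JSON body for the discover-urls endpoint."""
    if not base_url.startswith(("http://", "https://")):
        raise ValueError("base_url should be an absolute http(s) URL")
    if max_urls < 1:
        raise ValueError("max_urls must be a positive integer")
    payload: dict[str, Any] = {"base_url": base_url, "max_urls": max_urls}
    if pattern is not None:
        payload["pattern"] = pattern  # passed through as-is; syntax is not documented here
    return payload


payload = discover_urls_payload("https://example.com/docs/", max_urls=25)
```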
List Features
GET /api/v1/enhanced/features
Returns all available enhanced features and their configurations.
Performance Metrics
GET /api/v1/enhanced/performance
Returns performance metrics for the enhanced search service.
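The two GET endpoints (features and performance) take no body, so a request needs only the path and, presumably, the API key. The sketch below prepares such requests without sending them. `enhanced_get_request` is a hypothetical helper, and reusing the `X-API-Key` header from the POST examples is an assumption.

```python
import urllib.request

API_ROOT = "https://api.unsearch.dev"  # host taken from the curl examples
GET_ENDPOINTS = {
    "features": "/api/v1/enhanced/features",
    "performance": "/api/v1/enhanced/performance",
}


def enhanced_get_request(name: str, api_key: str) -> urllib.request.Request:
    """Hypothetical helper: prepare, but do not send, a GET request."""
    if name not in GET_ENDPOINTS:
        raise ValueError(f"unknown endpoint: {name!r}")
    return urllib.request.Request(
        API_ROOT + GET_ENDPOINTS[name],
        headers={"X-API-Key": api_key},  # assumption: same auth header as the POST examples
        method="GET",
    )


features_request = enhanced_get_request("features", "sk_live_xxxxx")
performance_request = enhanced_get_request("performance", "sk_live_xxxxx")
```

Either request can then be passed to `urllib.request.urlopen`. The response schema is not documented in this section, so parsing is left out.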