UnSearch Docs

Enhanced Search API

Advanced search and scraping with extraction strategies, content filtering, and adaptive crawling.

POST /api/v1/enhanced/search

Search with advanced scraping features like extraction strategies, content filters, and link analysis.

Body Parameters

All parameters from the Search API plus:

| Parameter | Type | Default | Description |
|---|---|---|---|
| extraction_strategy | string | "none" | One of "none", "cosine", "json_css", "regex", "llm" |
| content_filter | string | "none" | One of "none", "pruning", "bm25", "llm" |
| markdown_generation | boolean | false | Generate clean markdown output |
| adaptive_crawling | boolean | false | Enable AI-guided adaptive crawling |
| virtual_scrolling | boolean | false | Handle infinite-scroll pages |
| link_analysis | boolean | false | Analyze and score links |

Extraction Strategies

  • cosine — Cluster content by semantic similarity, extract relevant clusters
  • json_css — Extract structured data using CSS selectors and JSON schema
  • regex — Extract data using regular expression patterns
  • llm — Use an LLM to extract structured information

Content Filters

  • pruning — Remove boilerplate, ads, navigation using heuristics
  • bm25 — Filter by BM25 relevance scoring against a query
  • llm — Use an LLM to determine content relevance
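To make the bm25 filter more concrete, here is a minimal sketch of Okapi BM25 scoring over a few text chunks. It illustrates the general technique only and is not the service's implementation; the whitespace tokenizer, the `k1` and `b` defaults, and the "keep anything with a positive score" rule are all assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (illustrative only)."""
    if not documents:
        return []
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            norm = term_freq[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


chunks = [
    "LLM benchmarks compare model accuracy on reasoning tasks",
    "Subscribe to our newsletter for weekly updates",
    "New benchmarks for LLM inference latency were published",
]
scores = bm25_scores("LLM benchmarks", chunks)
# Assumption: a chunk is kept when it shares at least one term with the query.
kept = [chunk for chunk, score in zip(chunks, scores) if score > 0]
print(kept)
```

In this sketch the newsletter chunk scores zero because it shares no terms with the query, which is the kind of boilerplate a relevance filter is meant to drop.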

Example Request

curl -X POST 'https://api.unsearch.dev/api/v1/enhanced/search' \
  -H 'X-API-Key: sk_live_xxxxx' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "latest LLM benchmarks 2025",
    "max_results": 5,
    "scrape_content": true,
    "extraction_strategy": "cosine",
    "content_filter": "bm25",
    "markdown_generation": true
  }'
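The same request body can be assembled in Python with a small client-side check of the enumerated values. This is a sketch, not an official client: `build_enhanced_search_body` is a hypothetical helper, and the allowed values are copied from the parameter table on this page.

```python
import json

# Allowed values as listed in the Body Parameters table.
EXTRACTION_STRATEGIES = {"none", "cosine", "json_css", "regex", "llm"}
CONTENT_FILTERS = {"none", "pruning", "bm25", "llm"}


def build_enhanced_search_body(query, extraction_strategy="none",
                               content_filter="none", **options):
    """Return a JSON-serializable body for POST /api/v1/enhanced/search."""
    if extraction_strategy not in EXTRACTION_STRATEGIES:
        raise ValueError(f"unknown extraction_strategy: {extraction_strategy!r}")
    if content_filter not in CONTENT_FILTERS:
        raise ValueError(f"unknown content_filter: {content_filter!r}")
    body = {
        "query": query,
        "extraction_strategy": extraction_strategy,
        "content_filter": content_filter,
    }
    # Other parameters (max_results, markdown_generation, ...) pass through unchanged.
    body.update(options)
    return body


body = build_enhanced_search_body(
    "latest LLM benchmarks 2025",
    extraction_strategy="cosine",
    content_filter="bm25",
    max_results=5,
    scrape_content=True,
    markdown_generation=True,
)
print(json.dumps(body, indent=2))
```

Validating the enum values locally catches a misspelled strategy before the request is sent.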

Enhanced Scrape

POST /api/v1/enhanced/scrape

Advanced multi-engine scraping with full configuration control.

Body Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| urls | string[] | required | URLs to scrape (1-50) |
| extract_text | boolean | true | Extract text content |
| extract_images | boolean | true | Extract images |
| extract_links | boolean | true | Extract links |
| extract_metadata | boolean | true | Extract page metadata |
| javascript_rendering | boolean | false | Render JavaScript |
| response_format | string | "json" | "json" or "markdown" |
| screenshot | boolean | false | Capture screenshot |
| pdf | boolean | false | Generate PDF |
| include_html | boolean | false | Include raw HTML |
| wait_until | string |  | One of "load", "domcontentloaded", "networkidle0", "networkidle2" |
| cache_mode | string | "enabled" | One of "enabled", "read_only", "write_only", "bypass", "disabled" |

Example Request

curl -X POST 'https://api.unsearch.dev/api/v1/enhanced/scrape' \
  -H 'X-API-Key: sk_live_xxxxx' \
  -H 'Content-Type: application/json' \
  -d '{
    "urls": ["https://example.com/article"],
    "javascript_rendering": true,
    "response_format": "markdown",
    "extract_metadata": true
  }'
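Because `urls` accepts between 1 and 50 entries, a longer list has to be split across several requests. The sketch below shows one way to do that on the client side; `batch_scrape_bodies` is a hypothetical helper, and it only builds request bodies without sending them.

```python
RESPONSE_FORMATS = {"json", "markdown"}
MAX_URLS_PER_REQUEST = 50  # upper bound from the urls parameter description


def batch_scrape_bodies(urls, response_format="json", **options):
    """Split urls into request bodies of at most 50 URLs each."""
    if not urls:
        raise ValueError("at least one URL is required")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")
    bodies = []
    for start in range(0, len(urls), MAX_URLS_PER_REQUEST):
        bodies.append({
            "urls": urls[start:start + MAX_URLS_PER_REQUEST],
            "response_format": response_format,
            **options,  # e.g. javascript_rendering, extract_metadata
        })
    return bodies


urls = [f"https://example.com/page/{i}" for i in range(120)]
bodies = batch_scrape_bodies(urls, response_format="markdown", javascript_rendering=True)
print([len(b["urls"]) for b in bodies])  # → [50, 50, 20]
```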

Response

{
  "request_id": "req_abc123",
  "scraped_content": [
    {
      "url": "https://example.com/article",
      "title": "Article Title",
      "text": "Full extracted text...",
      "images": ["https://example.com/img1.png"],
      "links": ["https://example.com/related"],
      "metadata": {
        "title": "Article Title",
        "description": "Meta description...",
        "author": "Author Name",
        "published_date": "2025-03-01",
        "keywords": ["AI", "LLM"]
      },
      "word_count": 2500,
      "extraction_success": true,
      "extraction_time_ms": 1200
    }
  ],
  "metadata": {
    "total_urls": 1,
    "successful": 1,
    "failed": 0
  }
}
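A response of this shape can be summarized with a few lines of Python. The sketch below parses a trimmed copy of the sample response; `summarize_scrape` is a hypothetical helper, and it assumes that failed pages still appear in `scraped_content` with `extraction_success` set to false, which this page does not confirm.

```python
import json

# Trimmed copy of the sample response shown above.
response_text = """
{
  "request_id": "req_abc123",
  "scraped_content": [
    {"url": "https://example.com/article", "title": "Article Title",
     "word_count": 2500, "extraction_success": true, "extraction_time_ms": 1200}
  ],
  "metadata": {"total_urls": 1, "successful": 1, "failed": 0}
}
"""


def summarize_scrape(response):
    """Collect successful and failed URLs plus a total word count."""
    pages = response["scraped_content"]
    succeeded = [p for p in pages if p.get("extraction_success")]
    return {
        "request_id": response["request_id"],
        "succeeded": [p["url"] for p in succeeded],
        # Assumption: failures are reported inline with extraction_success=false.
        "failed": [p["url"] for p in pages if not p.get("extraction_success")],
        "total_words": sum(p.get("word_count", 0) for p in succeeded),
    }


summary = summarize_scrape(json.loads(response_text))
print(summary)
```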

Extract Tables

POST /api/v1/enhanced/extract-tables

Extract structured tables from HTML content.

Body Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| html_content | string | required | HTML to extract tables from |
| base_url | string |  | Base URL for resolving relative links |
| strategy | string | "auto" | Extraction strategy |
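As a rough sketch, the request for this endpoint could be prepared with Python's standard library as follows. The host and the `X-API-Key` header are taken from the curl examples on this page; `build_extract_tables_request` is a hypothetical helper, and the response format is not documented here, so the sketch stops before sending.

```python
import json
import urllib.request

API_BASE = "https://api.unsearch.dev"  # host used in the curl examples


def build_extract_tables_request(html_content, api_key, base_url=None, strategy="auto"):
    """Prepare (but do not send) a POST to /api/v1/enhanced/extract-tables."""
    body = {"html_content": html_content, "strategy": strategy}
    if base_url is not None:
        body["base_url"] = base_url  # optional: used to resolve relative links
    return urllib.request.Request(
        f"{API_BASE}/api/v1/enhanced/extract-tables",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


html = "<table><tr><th>Model</th><th>Score</th></tr><tr><td>A</td><td>0.9</td></tr></table>"
request = build_extract_tables_request(
    html, api_key="sk_live_xxxxx", base_url="https://example.com"
)
# urllib.request.urlopen(request) would perform the call; it is omitted here.
print(request.get_method(), request.full_url)
```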

Chunk Content

POST /api/v1/enhanced/chunk-content

Split text into optimized chunks for vector storage.

Body Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| text | string | required | Text to chunk |
| strategy | string | "auto" | Chunking strategy |
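A likely use of this endpoint is to chunk text that came back from Enhanced Scrape. The sketch below builds the request body from an item shaped like the `scraped_content` entries in the scrape response; `build_chunk_body` is a hypothetical helper, and rejecting empty text is a client-side assumption rather than documented behavior.

```python
import json


def build_chunk_body(text, strategy="auto"):
    """Return a JSON-serializable body for POST /api/v1/enhanced/chunk-content."""
    if not text.strip():
        # Assumption: empty text is rejected locally instead of being sent.
        raise ValueError("text must not be empty")
    return {"text": text, "strategy": strategy}


# One item shaped like an entry of scraped_content in the Enhanced Scrape response.
page = {"url": "https://example.com/article", "text": "Full extracted text..."}
chunk_body = build_chunk_body(page["text"])
print(json.dumps(chunk_body))
```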

Discover URLs

POST /api/v1/enhanced/discover-urls

Discover URLs from a source page.

Body Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| base_url | string | required | Starting URL |
| max_urls | integer | 100 | Max URLs to discover |
| pattern | string |  | URL pattern filter |
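A request body for this endpoint might be put together as in the sketch below. `build_discover_body` is a hypothetical helper; the pattern syntax is not specified on this page, so the `"/blog/"` value is a placeholder and not a documented format.

```python
import json


def build_discover_body(base_url, max_urls=100, pattern=None):
    """Return a JSON-serializable body for POST /api/v1/enhanced/discover-urls."""
    if max_urls < 1:
        raise ValueError("max_urls must be a positive integer")
    body = {"base_url": base_url, "max_urls": max_urls}
    if pattern is not None:
        body["pattern"] = pattern  # omitted entirely when no filter is wanted
    return body


# Placeholder pattern: the accepted syntax (glob, regex, substring) is an assumption.
discover_body = build_discover_body("https://example.com", max_urls=25, pattern="/blog/")
print(json.dumps(discover_body))
```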

List Features

GET /api/v1/enhanced/features

Returns all available enhanced features and their configurations.

Performance Metrics

GET /api/v1/enhanced/performance

Returns performance metrics for the enhanced search service.
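Both GET endpoints take no body, so a request only needs the path and credentials. The sketch below prepares the two requests without sending them; it assumes these endpoints expect the same `X-API-Key` header as the POST examples, which this page does not state, and `build_get_request` is a hypothetical helper.

```python
import urllib.request

API_BASE = "https://api.unsearch.dev"  # host used in the curl examples


def build_get_request(path, api_key):
    """Prepare (but do not send) a GET request to an enhanced endpoint."""
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={"X-API-Key": api_key},  # assumption: same auth header as the POST calls
        method="GET",
    )


features_request = build_get_request("/api/v1/enhanced/features", "sk_live_xxxxx")
performance_request = build_get_request("/api/v1/enhanced/performance", "sk_live_xxxxx")
# urllib.request.urlopen(features_request) would perform the call; it is omitted here.
print(features_request.full_url)
print(performance_request.full_url)
```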
