Thordata Firecrawl - Turn websites into LLM-ready data

Main Features

Scrape

Get LLM-ready data from websites. Markdown, JSON, HTML, screenshots, and more.

from thordata_firecrawl import ThordataCrawl

client = ThordataCrawl(api_key="td-YOUR_API_KEY")

result = client.scrape(
    url="https://www.thordata.com",
    formats=["markdown", "json"]
)

print(result["data"]["markdown"])

curl -X POST "https://thordata-firecrawl-api.onrender.com/v1/scrape" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.thordata.com",
    "formats": ["markdown", "json"]
  }'

const response = await fetch('https://thordata-firecrawl-api.onrender.com/v1/scrape', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://www.thordata.com',
    formats: ['markdown', 'json']
  })
});

const data = await response.json();
console.log(data.data.markdown);
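Where the SDK is not installed, the same call can be assembled with Python's standard library. The following is a minimal sketch that mirrors the cURL request above; `build_scrape_request` is a hypothetical helper name and not part of the SDK, and the commented-out response handling assumes the same `data.markdown` field that the examples print.

```python
import json
import urllib.request

# Base URL taken from the cURL example above.
BASE_URL = "https://thordata-firecrawl-api.onrender.com"


def build_scrape_request(api_key, url, formats, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /v1/scrape endpoint."""
    payload = json.dumps({"url": url, "formats": formats}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/scrape",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_scrape_request("YOUR_API_KEY", "https://www.thordata.com", ["markdown"])
print(request.get_method(), request.full_url)

# Sending requires network access and a valid key:
# with urllib.request.urlopen(request, timeout=30) as response:
#     body = json.loads(response.read())
#     print(body["data"]["markdown"])
```

Keeping the builder separate from the network call makes the payload and headers easy to inspect before anything is sent.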

Search

Search the web and get full content from results. Powered by Thordata SERP API.

# Reuses the client created in the Scrape example
result = client.search(
    query="Thordata web scraping API",
    limit=5,
    engine="google"
)

for item in result["data"]:
    print(item["title"], item["url"])

curl -X POST "https://thordata-firecrawl-api.onrender.com/v1/search" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Thordata web scraping API",
    "limit": 5,
    "engine": "google"
  }'

const response = await fetch('https://thordata-firecrawl-api.onrender.com/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    query: 'Thordata web scraping API',
    limit: 5,
    engine: 'google'
  })
});

const data = await response.json();
console.log(data.data);

Map

Discover all URLs on a website. Build sitemaps and understand site structure.

# Reuses the client created in the Scrape example
result = client.map(
    url="https://www.thordata.com"
)

for url in result["data"]:
    print(url)

curl -X POST "https://thordata-firecrawl-api.onrender.com/v1/map" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.thordata.com"
  }'

const response = await fetch('https://thordata-firecrawl-api.onrender.com/v1/map', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://www.thordata.com'
  })
});

const data = await response.json();
console.log(data.data);
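To get a feel for site structure from a Map result, the returned URLs can be grouped by their first path segment. This is a rough sketch that assumes `result["data"]` is a flat list of URL strings, as the loop above suggests; `group_by_section` and the sample URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_section(urls):
    """Group URLs by their first path segment ('/' for the site root)."""
    sections = defaultdict(list)
    for url in urls:
        path = urlparse(url).path.strip("/")
        key = "/" + path.split("/")[0] if path else "/"
        sections[key].append(url)
    return dict(sections)


# Hypothetical sample in the same shape as result["data"] above.
sample_urls = [
    "https://www.thordata.com",
    "https://www.thordata.com/pricing",
    "https://www.thordata.com/blog/post-1",
    "https://www.thordata.com/blog/post-2",
]

for section, members in sorted(group_by_section(sample_urls).items()):
    print(section, len(members))
```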

Crawl

Crawl entire websites with breadth-first (BFS) traversal. Crawls run as asynchronous jobs with webhook callbacks.

# Reuses the client created in the Scrape example
job = client.crawl(
    url="https://www.thordata.com",
    limit=10,
    formats=["markdown"]
)

# Check job status
status = client.get_crawl_status(job["jobId"])
print(status)

curl -X POST "https://thordata-firecrawl-api.onrender.com/v1/crawl" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.thordata.com",
    "limit": 10,
    "formats": ["markdown"]
  }'

const response = await fetch('https://thordata-firecrawl-api.onrender.com/v1/crawl', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://www.thordata.com',
    limit: 10,
    formats: ['markdown']
  })
});

const job = await response.json();
console.log(job.jobId);
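Because a crawl runs as an asynchronous job, callers without a webhook usually poll the status until the job finishes. The sketch below shows one way to do that with a timeout. The status payload is not documented here, so the `"status"` field and its `"completed"` and `"failed"` values are assumptions, and `wait_for_job` is a hypothetical helper. A stub stands in for `client.get_crawl_status(job["jobId"])` so the example runs offline.

```python
import time


def wait_for_job(fetch_status, is_finished, interval=2.0, timeout=300.0):
    """Call fetch_status() until is_finished(status) is true or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if is_finished(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("crawl job did not finish before the timeout")
        time.sleep(interval)


# Stand-in for client.get_crawl_status(job["jobId"]); the "status" field and
# its values are assumptions, not the documented response shape.
fake_responses = iter(
    [{"status": "running"}, {"status": "running"}, {"status": "completed"}]
)

final = wait_for_job(
    fetch_status=lambda: next(fake_responses),
    is_finished=lambda s: s.get("status") in {"completed", "failed"},
    interval=0.01,
)
print(final)
```

With the real client, `fetch_status` would be `lambda: client.get_crawl_status(job["jobId"])`, and `is_finished` would test whatever terminal states the API actually reports.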

Agent

Extract structured data using LLM prompts. Schema-based extraction with AI.

# Reuses the client created in the Scrape example
result = client.agent(
    prompt="Extract company name and description",
    urls=["https://www.thordata.com"],
    schema={
        "type": "object",
        "properties": {
            "company_name": {"type": "string"},
            "description": {"type": "string"}
        }
    }
)

print(result["data"])

curl -X POST "https://thordata-firecrawl-api.onrender.com/v1/agent" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Extract company name and description",
    "urls": ["https://www.thordata.com"],
    "schema": {
      "type": "object",
      "properties": {
        "company_name": {"type": "string"},
        "description": {"type": "string"}
      }
    }
  }'

const response = await fetch('https://thordata-firecrawl-api.onrender.com/v1/agent', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    prompt: 'Extract company name and description',
    urls: ['https://www.thordata.com'],
    schema: {
      type: 'object',
      properties: {
        company_name: { type: 'string' },
        description: { type: 'string' }
      }
    }
  })
});

const data = await response.json();
console.log(data.data);
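LLM-based extraction can omit or mistype fields, so it may be worth checking the returned record against the schema before using it. This is a minimal sketch, not a full JSON Schema validator: `check_against_schema` is a hypothetical helper, it treats every listed property as required (the schema above does not say so), and it only checks `string` and `boolean` types. It also assumes `result["data"]` is a single dictionary.

```python
# Maps a small subset of JSON Schema type names to Python types.
JSON_TYPES = {"string": str, "boolean": bool}

schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "description": {"type": "string"},
    },
}


def check_against_schema(record, schema):
    """Return a list of problems found; an empty list means the record passed."""
    problems = []
    for name, spec in schema.get("properties", {}).items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        # Types outside JSON_TYPES are skipped rather than guessed at.
        if expected and not isinstance(record[name], expected):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems


# Hypothetical record standing in for result["data"].
print(check_against_schema({"company_name": "Example Co"}, schema))
```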

Quick Start

Install

pip install thordata-firecrawl

Scrape Example (Python)

from thordata_firecrawl import ThordataCrawl

client = ThordataCrawl(api_key="td-YOUR_API_KEY")

result = client.scrape(
    url="https://www.thordata.com",
    formats=["markdown"]
)

print(result["data"]["markdown"])

Scrape Example (cURL)

curl -X POST "https://thordata-firecrawl-api.onrender.com/v1/scrape" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.thordata.com",
    "formats": ["markdown"]
  }'

Crawl Example (cURL)

curl -X POST "https://thordata-firecrawl-api.onrender.com/v1/crawl" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.thordata.com",
    "limit": 10,
    "formats": ["markdown"]
  }'

💡 For local development, replace the URL with http://localhost:3002
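One way to switch between the hosted API and a local instance without editing each call is to read the base URL from the environment. The sketch below assumes the `/v1/...` paths are the same on both hosts; the `THORDATA_FIRECRAWL_URL` variable name and the `endpoint` helper are hypothetical, not part of the project.

```python
import os

HOSTED_URL = "https://thordata-firecrawl-api.onrender.com"
LOCAL_URL = "http://localhost:3002"


def endpoint(path, env=os.environ):
    """Join an API path onto the base URL chosen by THORDATA_FIRECRAWL_URL."""
    base = env.get("THORDATA_FIRECRAWL_URL", HOSTED_URL).rstrip("/")
    return f"{base}/{path.lstrip('/')}"


# Reads the real environment; falls back to the hosted URL when the variable is unset.
print(endpoint("/v1/scrape"))
# Passing a mapping explicitly shows the local-development case.
print(endpoint("/v1/scrape", {"THORDATA_FIRECRAWL_URL": LOCAL_URL}))
```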

Scrape Example (JavaScript)

const response = await fetch('https://thordata-firecrawl-api.onrender.com/v1/scrape', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://www.thordata.com',
    formats: ['markdown']
  })
});

const data = await response.json();
console.log(data.data.markdown);

Frequently Asked Questions

Thordata Firecrawl is a Firecrawl-compatible web scraping API that turns websites into LLM-ready data. It provides clean Markdown, JSON, HTML, and screenshots from any website, making it perfect for AI applications, RAG systems, and data pipelines.

Thordata Firecrawl is designed to be API-compatible with Firecrawl, making migration easy. It's powered by Thordata's web data infrastructure and is fully open-source (MIT license) and self-hostable. You can deploy it anywhere without vendor lock-in.

Thordata Firecrawl is open-source and free to self-host. You only need a Thordata API key for the underlying scraping infrastructure. The code itself is MIT-licensed and can be used in commercial projects.

Deployment is supported through Docker and a Render Blueprint (`render.yaml`) for one-click cloud deployment. You can also deploy to any platform that supports Python or Docker, including AWS, GCP, Azure, and Fly.io.

Thordata Firecrawl supports Markdown (LLM-ready), JSON (structured data), HTML (raw), and screenshots. You can request multiple formats in a single API call.

Thordata's infrastructure handles JavaScript-rendered content automatically, with no extra configuration.

Built to outperform

Speed that feels invisible

No proxy headaches

Zero configuration

Use Cases

🤖 AI Platforms

🔍 SEO Teams

📊 Competitive Intelligence

🔬 Deep Research

📈 Lead Enrichment

🔄 Data Sync

Integrations

Python SDK

CLI Tool

REST API

LangChain
