Extraction API · /extract + /generate

Stop building the extraction layer. Just call the endpoint.

Define a schema, pass a URL, and get back JSON that matches. No parsing code, no downstream LLM call, no extraction layer to maintain. The intelligence happens inside the call.

Schema-matched JSON · Clean Markdown Output · Reasoning with /generate

/extract/json

Request

url: nike.com/t/pegasus-premium-womens-road-running-shoes-OHKEqA2b/HQ2593-004
schema: { name, price, sizes: [{ size, in_stock }] }

Response

matches your schema
Nike Pegasus Premium · $220
sizes
  • W 7 / M 5.5: in stock
  • W 8 / M 6.5: sold out
  • W 9 / M 7.5: in stock

One call. Down to per-size stock.
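The shorthand schema in the request above would need to be spelled out as JSON Schema before it is sent. Here is a minimal sketch of what that could look like, in the same json_schema style as the SDK examples on this page. The field types (price as a number, in_stock as a boolean), the required lists, and the inStockSizes helper are assumptions for illustration, not part of the Tabstack API.

```typescript
// Assumed JSON Schema expansion of { name, price, sizes: [{ size, in_stock }] }.
const productSchema = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    price: { type: 'number', description: 'Price in USD' },
    sizes: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          size: { type: 'string' },
          in_stock: { type: 'boolean' },
        },
        required: ['size', 'in_stock'],
      },
    },
  },
  required: ['name', 'price', 'sizes'],
}

// TypeScript shape of a response that matches the schema above.
interface Product {
  name: string
  price: number
  sizes: { size: string; in_stock: boolean }[]
}

// Hypothetical helper: list the sizes that are currently in stock.
function inStockSizes(product: Product): string[] {
  return product.sizes.filter((s) => s.in_stock).map((s) => s.size)
}

// Sample values copied from the response shown above.
const sample: Product = {
  name: 'Nike Pegasus Premium',
  price: 220,
  sizes: [
    { size: 'W 7 / M 5.5', in_stock: true },
    { size: 'W 8 / M 6.5', in_stock: false },
    { size: 'W 9 / M 7.5', in_stock: true },
  ],
}

console.log(productSchema.required.join(', ')) // name, price, sizes
console.log(inStockSizes(sample)) // [ 'W 7 / M 5.5', 'W 9 / M 7.5' ]
```

Because the response already has a fixed shape, downstream code like inStockSizes can stay a plain filter with no parsing step.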

Schema-driven

You define the shape. Tabstack returns it.

Pass a URL and the JSON schema you need. Tabstack enforces it on server-rendered, client-rendered, and JS-heavy pages, with no parsing code, no Zod pass, and no prompt engineering on your side. You get the output. You never touch what produced it.

Features

  • Schema-driven extraction from any URL with /extract/json
  • You define the json_schema; Tabstack enforces the shape
  • Schema compliance on every call, even when the page changes
extract.ts
TypeScript
import Tabstack from '@tabstack/sdk'

const client = new Tabstack()

try {
  const pricing = await client.extract.json({
    url: 'https://competitor.com/pricing',
    json_schema: {
      type: 'object',
      properties: {
        plans: { type: 'array', items: {
          type: 'object', properties: {
            name: { type: 'string' },
            price: { type: 'number', description: 'Monthly USD' },
            features: { type: 'array', items: { type: 'string' } }
          }
        }}
      }
    }
  })
  console.log(pricing.plans)
} catch (err) { console.error(err) }
generate.ts
TypeScript
import Tabstack from '@tabstack/sdk'

const client = new Tabstack()

try {
  const analysis = await client.generate.json({
    url: 'https://competitor.com/pricing',
    instructions: 'Analyze the pricing. What segment does each tier target, and why?',
    json_schema: {
      type: 'object',
      properties: { tiers: { type: 'array', items: { type: 'object', properties: {
        name: { type: 'string' },
        target_segment: { type: 'string' },
        positioning_rationale: { type: 'string' }
      }}}}
    }
  })
  console.log(analysis.tiers)
} catch (err) { console.error(err) }

Reasoning

Go beyond fields. Get structured answers, not just values.

/generate/json adds instructions on top of the URL, so you get output that required reasoning, not just a pulled field. It is the move from getting the price to learning what that pricing reveals about who they sell to.

Features

  • AI transformation with your custom instructions
  • Reasoning over content, not just field extraction
  • Clean Markdown for LLM input when you need the full page

Control

Keep your extraction running without babysitting it.

nocache forces fresh data for monitoring and change detection. effort scales cost to what the page needs. geo_target fetches pages as seen from any country. The call adapts to what the page requires, not what you hardcoded.

Features

  • nocache: true bypasses cache for fresh data every call
  • effort (min / standard / max): pay for what the page needs
  • geo_target: fetch a page as seen from a specific country
monitor.ts
TypeScript
import Tabstack from '@tabstack/sdk'

const client = new Tabstack()

// Monitoring: fresh data every run, cost scaled to the page
try {
  const current = await client.extract.json({
    url: 'https://competitor.com/pricing',
    json_schema: { /* your schema */ },
    nocache: true,            // always fresh
    effort: 'standard',       // scale to page complexity
    geo_target: { country: 'US' }
  })
  console.log(current)
} catch (err) { console.error(err) }
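
Change detection on top of a fresh extraction can be as small as a diff between the previous result and the current one. The sketch below works under stated assumptions: the Plan shape mirrors the pricing schema used in extract.ts, while diffPlans and the sample snapshots are hypothetical and not part of the SDK.

```typescript
interface Plan {
  name: string
  price: number // monthly USD, as in the extract.ts schema
}

type PlanChange =
  | { kind: 'added'; name: string; price: number }
  | { kind: 'removed'; name: string; price: number }
  | { kind: 'price_changed'; name: string; from: number; to: number }

// Hypothetical helper: compare two snapshots of extracted plans by name.
function diffPlans(previous: Plan[], current: Plan[]): PlanChange[] {
  const changes: PlanChange[] = []

  // Plans that are new or whose price differs from the previous snapshot.
  for (const plan of current) {
    const old = previous.find((p) => p.name === plan.name)
    if (!old) {
      changes.push({ kind: 'added', name: plan.name, price: plan.price })
    } else if (old.price !== plan.price) {
      changes.push({ kind: 'price_changed', name: plan.name, from: old.price, to: plan.price })
    }
  }

  // Plans that disappeared since the previous snapshot.
  for (const plan of previous) {
    if (!current.some((p) => p.name === plan.name)) {
      changes.push({ kind: 'removed', name: plan.name, price: plan.price })
    }
  }
  return changes
}

// Made-up snapshots standing in for two monitoring runs.
const lastRun: Plan[] = [
  { name: 'Starter', price: 19 },
  { name: 'Growth', price: 49 },
]
const thisRun: Plan[] = [
  { name: 'Starter', price: 19 },
  { name: 'Growth', price: 59 },
  { name: 'Scale', price: 199 },
]

console.log(diffPlans(lastRun, thisRun))
```

With these snapshots the diff reports a price change for Growth (49 to 59) and Scale as an added plan. Matching by name is a simplification; a renamed plan would show up as one removal plus one addition.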

Who builds on this

Teams that turn web pages into structured data.

Price & catalog monitoring

Track competitor pricing and inventory as it changes.

Lead enrichment

Turn a URL into structured company and contact data.

Listings & marketplace data

Pull products, jobs, and listings into a fixed shape.

RAG & content ingestion

Clean, structured input for retrieval and indexing.

Mozilla-backed

Privacy, Transparency, and Control

When you build on /extract and /generate, the pages you fetch and the data you pull stay yours. Tabstack is a Mozilla-backed platform, and nothing you send is sold or used to train models.

Private by default. Requests and fetched pages are used to build your response and support you, then purged. Never sold, never used to train models.

Transparent by design. Mozilla-documented data practices and robots.txt compliance by default. See exactly how every endpoint sources and handles your data.

Yours to control. You set the schema, effort, and scope of every call. No retained corpus, no lock-in, just clean data you own.

See exactly how we source and handle data in the documentation.

Mozilla Manifesto

Pricing & Plans

All prices in USD.

Free Trial

Every new account includes 10,000 free credits to explore the full platform.

Individual

$0 / month

Pay-as-you-go

  • $0.35 / 1k credits
  • Access to all API endpoints
  • Fast research mode
  • Standard rate limits

For tinkerers & hobbyists who want to simply connect their systems to the internet.

Get Started

Team

$99 / month

500,000 credits included

  • Everything in Individual, plus:
  • Fast + Balanced research modes
  • Increased rate limits
  • $0.30 / 1k credits overage

Low-latency, predictable-cost automation so you can focus on your core product.

Get Started

Pro

$499 / month

3,000,000 credits included

  • Everything in Team, plus:
  • Highest rate limits
  • $0.25 / 1k credits overage

For teams deploying and managing production workloads efficiently at scale.

Get Started
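
To compare the three plans for a given volume, the arithmetic above can be sketched as a small calculator. It assumes overage is billed linearly per 1,000 credits beyond the included amount and ignores the free trial credits; the pricing table does not confirm either assumption, and monthlyCostUsd is an illustrative helper only.

```typescript
// Plan figures copied from the pricing table; amounts kept in cents
// so the arithmetic stays in whole numbers for round credit counts.
const PLANS = {
  individual: { baseCents: 0, includedCredits: 0, centsPer1k: 35 },
  team: { baseCents: 9900, includedCredits: 500000, centsPer1k: 30 },
  pro: { baseCents: 49900, includedCredits: 3000000, centsPer1k: 25 },
}

type PlanName = keyof typeof PLANS

// Assumption: credits beyond the included amount are billed linearly
// at the plan's per-1k rate, with no rounding to whole blocks.
function monthlyCostUsd(plan: PlanName, creditsUsed: number): number {
  const p = PLANS[plan]
  const billableCredits = Math.max(0, creditsUsed - p.includedCredits)
  const cents = p.baseCents + (billableCredits * p.centsPer1k) / 1000
  return cents / 100
}

// Example: 600,000 credits in one month.
console.log(monthlyCostUsd('individual', 600000)) // 210
console.log(monthlyCostUsd('team', 600000)) // 129
console.log(monthlyCostUsd('pro', 600000)) // 499
```

Under these assumptions, Team is the cheapest option at 600,000 credits per month: $99 plus 100,000 overage credits at $0.30 per 1k.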

Enterprise Plan

Need Custom Pricing?

Custom API quotas, dedicated support, and SLAs for high-volume teams.

Contact Sales

From the developers

I signed in with Google and had a working API key right away. Then I made my first call and got back exactly what I asked for in minutes.

Developer Experience Researcher

Tech Stackups

Start extracting in minutes.

Define a schema, call one endpoint, and get matching JSON back. Free to start.