API Reference

Scraping

The scrape endpoint loads a URL in a headless browser and extracts structured content. It returns raw HTML, cleaned HTML, Markdown, readability text, links, and page metadata. No session management is required.

POST /v1/scrape

Scrape a URL and extract structured content.

Parameters

Name      Type                       Required  Default  Description
url       string                     Yes       -        The URL to scrape.
waitFor   string | number            No        -        CSS selector to wait for, or milliseconds to wait after page load.
headers   Record<string, string>     No        -        Custom HTTP headers to set on the page request.
cookies   Cookie[]                   No        -        Cookies to inject before navigation. Each: { name, value, domain }.
proxyUrl  string                     No        -        Proxy URL for this request (HTTP or SOCKS5).
stealth   "none" | "basic" | "full"  No        "full"   Anti-detection level.
timeout   number                     No        30000    Navigation timeout in milliseconds.
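
The parameter table can be mirrored client-side so that malformed requests fail before they reach the API. The following is a minimal sketch in Python, not an official client: the function name `build_scrape_payload` and its snake_case arguments are hypothetical, and the validation rules are only what the table states (required `url`, the three `stealth` levels, `waitFor` as a selector string or a number of milliseconds). Optional fields are left out when unset, on the assumption that the server then applies the documented defaults.

```python
from __future__ import annotations

from typing import Any, Optional, Union

# Allowed values for "stealth", taken from the parameter table.
STEALTH_LEVELS = {"none", "basic", "full"}


def build_scrape_payload(
    url: str,
    wait_for: Optional[Union[str, int]] = None,
    headers: Optional[dict[str, str]] = None,
    cookies: Optional[list[dict[str, str]]] = None,
    proxy_url: Optional[str] = None,
    stealth: Optional[str] = None,
    timeout: Optional[int] = None,
) -> dict[str, Any]:
    """Build a JSON-serializable body for POST /v1/scrape (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    if stealth is not None and stealth not in STEALTH_LEVELS:
        raise ValueError(f"stealth must be one of {sorted(STEALTH_LEVELS)}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if wait_for is not None and (
        isinstance(wait_for, bool) or not isinstance(wait_for, (str, int))
    ):
        raise TypeError("wait_for must be a CSS selector (str) or milliseconds (int)")

    payload: dict[str, Any] = {"url": url}
    optional = {
        "waitFor": wait_for,
        "headers": headers,
        "cookies": cookies,
        "proxyUrl": proxy_url,
        "stealth": stealth,
        "timeout": timeout,
    }
    # Omit unset fields so documented defaults (stealth "full", timeout 30000)
    # are left to the server. This is an assumption about server behavior.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

For instance, `build_scrape_payload("https://example.com", wait_for="#content")` would produce a body that waits for a CSS selector, while `wait_for=2000` would wait 2000 ms after page load.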

Request Body

{
  "url": "https://example.com",
  "waitFor": 2000,
  "stealth": "full",
  "headers": {
    "Accept-Language": "en-US"
  }
}

Response

{
  "url": "https://example.com/",
  "statusCode": 200,
  "title": "Example Domain",
  "html": "<!doctype html>\n<html>\n<head>...",
  "cleanedHtml": "<h1>Example Domain</h1>\n<p>This domain is for use in...",
  "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
  "readability": "Example Domain\n\nThis domain is for use in illustrative examples in documents.",
  "links": [
    { "href": "https://www.iana.org/domains/example", "text": "More information..." }
  ],
  "metadata": {
    "description": "Example domain for documentation",
    "ogImage": null,
    "canonical": "https://example.com/"
  }
}

Example Request

curl -X POST https://api.browsefleet.com/v1/scrape \
  -H "Content-Type: application/json" \
  -H "x-api-key: bf_your_api_key" \
  -d '{
    "url": "https://example.com",
    "waitFor": 2000
  }'
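
The same request can be issued from code. Below is a rough Python equivalent of the curl command, using only the standard library. The endpoint URL and the `Content-Type` and `x-api-key` headers come from the curl example. The function names `build_scrape_request` and `scrape` and the `client_timeout` argument are hypothetical, and error handling is left out for brevity.

```python
import json
import urllib.request

# Endpoint taken from the curl example.
API_URL = "https://api.browsefleet.com/v1/scrape"


def build_scrape_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request mirroring the curl command."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def scrape(api_key: str, payload: dict, client_timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body.

    client_timeout is the local socket timeout in seconds. It is separate
    from the API's own "timeout" parameter, which is in milliseconds.
    urlopen raises urllib.error.HTTPError on non-2xx responses.
    """
    request = build_scrape_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=client_timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `scrape("bf_your_api_key", {"url": "https://example.com", "waitFor": 2000})` would send the same body as the curl example. Keeping request construction separate from sending makes the headers and body easy to inspect or unit-test without network access.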

Response Fields

Field        Type    Description
url          string  Final URL after redirects
statusCode   number  HTTP status code of the page
title        string  Page title from the document
html         string  Full raw HTML of the page
cleanedHtml  string  HTML with scripts, styles, and non-content elements removed
markdown     string  Page content converted to Markdown
readability  string  Plain text extracted via readability algorithm
links        Array   All links on the page, each with href and text
metadata     object  Page metadata: description, ogImage, canonical
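
On the client side, the response fields map naturally onto a small typed structure. The sketch below parses a response body into Python dataclasses. The class names `Link` and `ScrapeResult` and the function `parse_scrape_response` are hypothetical, and the sketch assumes every field in the table is present in a successful response, as in the example shown earlier.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any


@dataclass
class Link:
    href: str
    text: str


@dataclass
class ScrapeResult:
    url: str            # final URL after redirects
    status_code: int    # "statusCode" in the JSON
    title: str
    html: str
    cleaned_html: str   # "cleanedHtml" in the JSON
    markdown: str
    readability: str
    links: list[Link]
    metadata: dict[str, Any]  # description, ogImage, canonical (values may be null)


def parse_scrape_response(raw: str) -> ScrapeResult:
    """Convert a JSON response body into a ScrapeResult.

    Raises KeyError if a documented field is missing.
    """
    data = json.loads(raw)
    return ScrapeResult(
        url=data["url"],
        status_code=data["statusCode"],
        title=data["title"],
        html=data["html"],
        cleaned_html=data["cleanedHtml"],
        markdown=data["markdown"],
        readability=data["readability"],
        links=[Link(href=item["href"], text=item["text"]) for item in data["links"]],
        metadata=data["metadata"],
    )
```

A caller could then read `result.markdown` for the converted content, or iterate over `result.links` to collect each `href`.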

Scrape vs Sessions

Use Case                           Recommended
Extract content from a single URL  Scrape endpoint
Navigate through multiple pages    Session + Puppeteer
Fill forms and interact with UI    Session + Computer API
One-off screenshot or PDF          Quick action endpoints
Long-running automation            Session with extended timeout