API Reference

Agent

The Agent API provides autonomous browser automation powered by vision-capable LLMs. Give it a task in natural language, and it will iteratively screenshot the browser, reason about what to do, execute actions, and repeat until the task is complete. It supports Claude (Anthropic) and GPT-4o (OpenAI).

How the Agent Works

1. Create a browser session (stealth: full, viewport: 1280x900)
2. Navigate to the starting URL (if provided)
3. Take a screenshot of the current page
4. Send screenshot + task description to the LLM
5. LLM returns structured actions with reasoning
6. Execute the actions (click, type, scroll, navigate, etc.)
7. Repeat steps 3-6 until:
   - The LLM returns a "done" action with a result
   - The LLM returns a "fail" action with a reason
   - Maximum iterations reached (default: 15, max: 30)
8. Release the session and return the result
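
The loop above can be sketched in a few lines. This is an illustration only, not the service's implementation: `browser` and `llm` are hypothetical interfaces supplied by the caller (the session from step 1 is assumed to exist already), and the shape of the failure result and the clamping of `maxIterations` to 30 are assumptions.

```javascript
// Sketch of the screenshot -> reason -> act loop (steps 2-8).
// `browser` and `llm` are hypothetical, caller-supplied objects.
async function runAgentLoop({ task, url, maxIterations = 15 }, browser, llm) {
  const limit = Math.min(maxIterations, 30); // documented ceiling; clamping is assumed
  const steps = [];
  try {
    if (url) await browser.navigate(url); // step 2
    for (let iteration = 0; iteration < limit; iteration++) {
      const screenshot = await browser.screenshot(); // step 3
      const { reasoning, actions } = await llm.decide(task, screenshot); // steps 4-5
      steps.push({ iteration, reasoning, actions });
      for (const action of actions) { // step 6
        if (action.type === 'done') {
          return { success: true, result: action.result, steps, totalIterations: iteration + 1 };
        }
        if (action.type === 'fail') {
          // Failure shape is an assumption; only the success response is documented.
          return { success: false, reason: action.reason, steps, totalIterations: iteration + 1 };
        }
        await browser.execute(action);
      }
    }
    return { success: false, reason: 'Maximum iterations reached', steps, totalIterations: limit };
  } finally {
    await browser.release(); // step 8 runs on every exit path
  }
}
```

Placing the release in `finally` means the session is freed whether the task finishes, fails, or throws.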

Autonomous Agent

POST /v1/agent

Run an autonomous agent task. Creates a session automatically, runs the task, and releases the session when done.

Parameters

| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| task | string | Yes | | Natural language description of the task to perform. |
| url | string | No | | Starting URL. The agent navigates here before beginning the task. |
| provider | "anthropic" or "openai" | No | "anthropic" | LLM provider to use. |
| model | string | No | "claude-sonnet-4-20250514" or "gpt-4o" | Model ID. Defaults based on provider. |
| maxIterations | number | No | 15 | Maximum number of screenshot-action loops (max: 30). |
| apiKey | string | No | | LLM API key. If omitted, uses server-configured ANTHROPIC_API_KEY or OPENAI_API_KEY. |
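
The documented defaults can also be applied on the client before a request is sent. The helper below is a minimal sketch: `buildAgentRequest` is a hypothetical name, and rejecting out-of-range values locally (including the lower bound of 1) is an assumption, since the reference does not say how the server treats them.

```javascript
// Default model per provider, as listed in this reference.
const DEFAULT_MODELS = {
  anthropic: 'claude-sonnet-4-20250514',
  openai: 'gpt-4o',
};

// Hypothetical client-side helper: validates input and fills in documented defaults.
function buildAgentRequest({ task, url, provider = 'anthropic', model, maxIterations = 15, apiKey } = {}) {
  if (typeof task !== 'string' || task.trim() === '') {
    throw new TypeError('task is required and must be a non-empty string');
  }
  if (!Object.prototype.hasOwnProperty.call(DEFAULT_MODELS, provider)) {
    throw new RangeError(`provider must be "anthropic" or "openai", got "${provider}"`);
  }
  // Assumption: reject locally instead of relying on server-side handling.
  if (!Number.isInteger(maxIterations) || maxIterations < 1 || maxIterations > 30) {
    throw new RangeError('maxIterations must be an integer from 1 to 30');
  }
  const body = { task, provider, model: model ?? DEFAULT_MODELS[provider], maxIterations };
  if (url !== undefined) body.url = url;       // optional fields are sent only when set
  if (apiKey !== undefined) body.apiKey = apiKey;
  return body;
}
```

For example, `buildAgentRequest({ task: 'Find the top post', provider: 'openai' })` yields a body with `model` set to `gpt-4o` and `maxIterations` set to 15.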

Request Body

{
  "task": "Go to Hacker News and find the top post title",
  "url": "https://news.ycombinator.com",
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "maxIterations": 10
}

Response

{
  "success": true,
  "result": "The top post is 'Show HN: BrowseFleet - Cloud Browser API'",
  "steps": [
    {
      "iteration": 0,
      "reasoning": "I see the Hacker News homepage. The top post is visible.",
      "actions": [
        { "type": "done", "result": "The top post is 'Show HN: BrowseFleet - Cloud Browser API'" }
      ],
      "screenshot": "<base64-png>"
    }
  ],
  "totalIterations": 1,
  "sessionId": "sess_abc123"
}

Example

curl -X POST https://api.browsefleet.com/v1/agent \
  -H "Content-Type: application/json" \
  -H "x-api-key: bf_your_api_key" \
  -d '{
    "task": "Go to Hacker News and find the top post title",
    "url": "https://news.ycombinator.com",
    "provider": "anthropic",
    "maxIterations": 10
  }'
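
Because every step carries a base64 screenshot, a raw response is awkward to log. The sketch below condenses the response shape shown above into a few printable lines. `summarizeRun` is a hypothetical helper, and since the reference does not show a failed response, the helper does not assume any field names for that case.

```javascript
// Hypothetical helper: condenses a /v1/agent response into printable lines,
// leaving out the base64 screenshots.
function summarizeRun(run) {
  const lines = run.steps.map((step) => {
    const actionTypes = step.actions.map((action) => action.type).join(', ');
    return `#${step.iteration} [${actionTypes}] ${step.reasoning}`;
  });
  lines.push(run.success ? `Result: ${run.result}` : 'Task did not succeed');
  return lines;
}

// Usage with the sample response from this section.
const sample = {
  success: true,
  result: "The top post is 'Show HN: BrowseFleet - Cloud Browser API'",
  steps: [
    {
      iteration: 0,
      reasoning: 'I see the Hacker News homepage. The top post is visible.',
      actions: [{ type: 'done', result: "The top post is 'Show HN: BrowseFleet - Cloud Browser API'" }],
      screenshot: '<base64-png>',
    },
  ],
  totalIterations: 1,
  sessionId: 'sess_abc123',
};
console.log(summarizeRun(sample).join('\n'));
```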

Agent on Existing Session

POST /v1/sessions/:id/agent

Run an agent task on an already-created session. The session is not released automatically; you control its lifecycle.

Parameters

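| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| task | string | Yes | | Natural language description of the task. |
| url | string | No | | URL to navigate to before starting. |
| provider | "anthropic" or "openai" | No | "anthropic" | LLM provider. |
| model | string | No | | Model ID. |
| maxIterations | number | No | 15 | Maximum iterations. |
| apiKey | string | No | | LLM API key override. |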

Request Body

{
  "task": "Fill in the contact form with test data and submit it",
  "provider": "anthropic"
}

Response

{
  "success": true,
  "result": "Successfully filled and submitted the contact form",
  "steps": [ ... ],
  "totalIterations": 5
}

Example

# First create a session
SESSION_ID=$(curl -s -X POST https://api.browsefleet.com/v1/sessions \
  -H "Content-Type: application/json" \
  -H "x-api-key: bf_your_api_key" \
  -d '{"stealth":"full"}' | jq -r .id)

# Run agent on the session
curl -X POST "https://api.browsefleet.com/v1/sessions/$SESSION_ID/agent" \
  -H "Content-Type: application/json" \
  -H "x-api-key: bf_your_api_key" \
  -d '{
    "task": "Fill in the contact form with test data and submit it"
  }'

# Release the session when done
curl -X POST "https://api.browsefleet.com/v1/sessions/$SESSION_ID/release" \
  -H "x-api-key: bf_your_api_key"
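
Since this endpoint leaves the session open, client code should release it even when the agent call fails. The sketch below wraps the three calls from the curl example in `try`/`finally`. `runOnSession` and the injectable `fetchImpl` parameter are hypothetical, and the code assumes the session and agent endpoints return JSON as shown above.

```javascript
const BASE_URL = 'https://api.browsefleet.com';

// Runs one agent task on a fresh session and always releases the session.
// `fetchImpl` is injectable so the flow can be exercised without network access.
async function runOnSession(task, apiKey, fetchImpl = fetch) {
  const auth = { 'x-api-key': apiKey };
  const post = async (path, body) => {
    const res = await fetchImpl(BASE_URL + path, {
      method: 'POST',
      headers: body ? { ...auth, 'Content-Type': 'application/json' } : auth,
      body: body ? JSON.stringify(body) : undefined,
    });
    if (!res.ok) throw new Error(`POST ${path} failed with status ${res.status}`);
    return res;
  };

  const session = await (await post('/v1/sessions', { stealth: 'full' })).json();
  try {
    const res = await post(`/v1/sessions/${session.id}/agent`, { task });
    return await res.json();
  } finally {
    // Runs on success and on failure, so the session is never leaked.
    await post(`/v1/sessions/${session.id}/release`);
  }
}
```

A call such as `runOnSession('Fill in the contact form with test data and submit it', 'bf_your_api_key')` resolves to the agent response; if the agent request throws, the release call is still made before the error propagates.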

Streaming Agent (SSE)

POST /v1/agent/stream

Run an agent task with Server-Sent Events streaming. Receive screenshots, reasoning, and actions in real time as the agent works.

Parameters

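| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| task | string | Yes | | Natural language description of the task. |
| url | string | No | | Starting URL. |
| provider | "anthropic" or "openai" | No | "anthropic" | LLM provider. |
| model | string | No | | Model ID. |
| maxIterations | number | No | 15 | Maximum iterations. |
| apiKey | string | No | | LLM API key override. |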

SSE Event Types

| Event Type | Fields | Description |
|------------|--------|-------------|
| screenshot | iteration, screenshot | Screenshot taken before LLM call |
| step | iteration, reasoning, actions | LLM response with reasoning and planned actions |
| done | result, totalIterations | Task completed successfully |
| fail | reason, totalIterations | Task could not be completed |
| error | error, iteration? | An error occurred |

Example Event Stream

data: {"type":"screenshot","iteration":0,"screenshot":"<base64>"}

data: {"type":"step","iteration":0,"reasoning":"I see the HN homepage...","actions":[{"type":"click","x":200,"y":100}]}

data: {"type":"screenshot","iteration":1,"screenshot":"<base64>"}

data: {"type":"step","iteration":1,"reasoning":"I can now see...","actions":[{"type":"done","result":"Found the answer"}]}

data: {"type":"done","result":"Found the answer","totalIterations":2}

Example

const response = await fetch('https://api.browsefleet.com/v1/agent/stream', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-api-key': 'bf_your_api_key',
  },
  body: JSON.stringify({
    task: 'Find the price of the first product on the page',
    url: 'https://example-shop.com',
  }),
});

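const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // A network chunk can end mid-event (screenshots are large), so keep the
  // trailing partial line in the buffer until the rest of it arrives.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop();

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const event = JSON.parse(line.slice(6));
      console.log(event.type, event.reasoning || event.result || event.reason || event.error || '');
    }
  }
}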
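
Once events are parsed, each one can be handled according to its `type`. The sketch below folds a sequence of events into a single summary object using only the event types and fields listed in the SSE Event Types table. `reduceAgentEvents` and the shape of the summary object are hypothetical.

```javascript
// Folds parsed stream events into one summary object.
function reduceAgentEvents(events) {
  const summary = { status: 'running', screenshots: 0, steps: [] };
  for (const event of events) {
    switch (event.type) {
      case 'screenshot':
        summary.screenshots += 1; // count only; the base64 payload is not retained
        break;
      case 'step':
        summary.steps.push({
          iteration: event.iteration,
          reasoning: event.reasoning,
          actions: event.actions,
        });
        break;
      case 'done':
        summary.status = 'done';
        summary.result = event.result;
        summary.totalIterations = event.totalIterations;
        break;
      case 'fail':
        summary.status = 'fail';
        summary.reason = event.reason;
        summary.totalIterations = event.totalIterations;
        break;
      case 'error':
        summary.status = 'error';
        summary.error = event.error;
        break;
      default:
        break; // ignore event types this sketch does not know about
    }
  }
  return summary;
}
```

Applied to the example event stream above, the summary ends with `status` set to `done`, two recorded steps, and `totalIterations` equal to 2.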

Supported Providers

| Provider | Default Model | Env Variable |
|----------|---------------|--------------|
| anthropic | claude-sonnet-4-20250514 | ANTHROPIC_API_KEY |
| openai | gpt-4o | OPENAI_API_KEY |

You can pass the LLM API key in the request body (apiKey), or configure it on the server via environment variables. Request-level keys take precedence.
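
The precedence rule can be written as a small lookup. This sketch illustrates the documented behavior and is not the server's code: `resolveLlmKey` is a hypothetical name, and throwing when no key is available is an assumption about error handling.

```javascript
// Environment variable per provider, as listed in the table above.
const KEY_ENV_VARS = {
  anthropic: 'ANTHROPIC_API_KEY',
  openai: 'OPENAI_API_KEY',
};

// A request-level key wins; otherwise fall back to the provider's env variable.
function resolveLlmKey({ provider = 'anthropic', apiKey } = {}, env = process.env) {
  if (apiKey) return apiKey;
  if (!Object.prototype.hasOwnProperty.call(KEY_ENV_VARS, provider)) {
    throw new Error(`Unsupported provider: ${provider}`);
  }
  const envVar = KEY_ENV_VARS[provider];
  const key = env[envVar];
  if (!key) throw new Error(`No LLM API key: pass apiKey or set ${envVar}`);
  return key;
}
```

With `OPENAI_API_KEY` set on the server, a request that includes `apiKey` uses the request value, and one that omits it uses the environment value.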