API Reference
Agent
The Agent API provides autonomous browser automation powered by vision-capable LLMs. Give it a task in natural language, and it iteratively screenshots the browser, reasons about what to do, executes actions, and repeats until the task is complete. It supports Claude (Anthropic) and GPT-4o (OpenAI).
How the Agent Works
1. Create a browser session (stealth: full, viewport: 1280x900)
2. Navigate to the starting URL (if provided)
3. Take a screenshot of the current page
4. Send screenshot + task description to the LLM
5. LLM returns structured actions with reasoning
6. Execute the actions (click, type, scroll, navigate, etc.)
7. Repeat steps 3-6 until:
- The LLM returns a "done" action with a result
- The LLM returns a "fail" action with a reason
- The maximum number of iterations is reached (default: 15, max: 30)
8. Release the session and return the result
Autonomous Agent
POST /v1/agent
Run an autonomous agent task. The endpoint creates a session automatically, runs the task, and releases the session when done.
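The loop described under "How the Agent Works" can be sketched roughly as follows. This is an illustration only, not BrowseFleet's server code: `runAgentLoop`, `browser.screenshot`, `browser.execute`, and `askModel` are hypothetical names, and the shape of the failure result is an assumption.

```javascript
// Hypothetical sketch of the screenshot -> reason -> act loop.
// `browser` and `askModel` are injected stand-ins for the session and the LLM call.
async function runAgentLoop({ task, browser, askModel, maxIterations = 15 }) {
  // The documented hard cap is 30 iterations.
  const limit = Math.min(maxIterations, 30);
  const steps = [];
  for (let iteration = 0; iteration < limit; iteration++) {
    const screenshot = await browser.screenshot();
    const { reasoning, actions } = await askModel({ task, screenshot });
    steps.push({ iteration, reasoning, actions, screenshot });
    for (const action of actions) {
      if (action.type === 'done') {
        return { success: true, result: action.result, steps, totalIterations: iteration + 1 };
      }
      if (action.type === 'fail') {
        // Assumed failure shape; the reference only documents the success response.
        return { success: false, reason: action.reason, steps, totalIterations: iteration + 1 };
      }
      await browser.execute(action); // click, type, scroll, navigate, ...
    }
  }
  return { success: false, reason: 'Maximum iterations reached', steps, totalIterations: limit };
}
```

Injecting the browser and the model call keeps the control flow easy to test in isolation from any real session or LLM provider.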
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| task | string | Yes | — | Natural language description of the task to perform. |
| url | string | No | — | Starting URL. The agent navigates here before beginning the task. |
| provider | "anthropic" \| "openai" | No | "anthropic" | LLM provider to use. |
| model | string | No | "claude-sonnet-4-20250514" or "gpt-4o" | Model ID. Defaults based on provider. |
| maxIterations | number | No | 15 | Maximum number of screenshot-action loops (max: 30). |
| apiKey | string | No | — | LLM API key. If omitted, uses server-configured ANTHROPIC_API_KEY or OPENAI_API_KEY. |
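As a client-side convenience, the parameter rules in the table can be checked before a request is sent. The helper below is a minimal sketch: `buildAgentRequest` is a hypothetical name, the lower bound of 1 for `maxIterations` is an assumption, and whether the server rejects or clamps values above 30 is not stated in this reference.

```javascript
// Default models per provider, taken from the parameter table.
const DEFAULT_MODELS = {
  anthropic: 'claude-sonnet-4-20250514',
  openai: 'gpt-4o',
};
const MAX_ITERATIONS_LIMIT = 30;

// Hypothetical helper: validates options and fills in the documented defaults.
function buildAgentRequest({ task, url, provider = 'anthropic', model, maxIterations = 15, apiKey } = {}) {
  if (typeof task !== 'string' || task.trim() === '') {
    throw new TypeError('task is required and must be a non-empty string');
  }
  if (!Object.prototype.hasOwnProperty.call(DEFAULT_MODELS, provider)) {
    throw new RangeError(`provider must be one of: ${Object.keys(DEFAULT_MODELS).join(', ')}`);
  }
  // Assumption: at least one iteration is required; the documented maximum is 30.
  if (!Number.isInteger(maxIterations) || maxIterations < 1 || maxIterations > MAX_ITERATIONS_LIMIT) {
    throw new RangeError(`maxIterations must be an integer between 1 and ${MAX_ITERATIONS_LIMIT}`);
  }
  const body = { task, provider, model: model ?? DEFAULT_MODELS[provider], maxIterations };
  // Optional fields are only included when the caller supplied them.
  if (url !== undefined) body.url = url;
  if (apiKey !== undefined) body.apiKey = apiKey;
  return body;
}
```

The returned object can be passed to `JSON.stringify` and sent as the request body.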
Request Body
{
  "task": "Go to Hacker News and find the top post title",
  "url": "https://news.ycombinator.com",
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "maxIterations": 10
}
Response
{
  "success": true,
  "result": "The top post is 'Show HN: BrowseFleet - Cloud Browser API'",
  "steps": [
    {
      "iteration": 0,
      "reasoning": "I see the Hacker News homepage. The top post is visible.",
      "actions": [
        { "type": "done", "result": "The top post is 'Show HN: BrowseFleet - Cloud Browser API'" }
      ],
      "screenshot": "<base64-png>"
    }
  ],
  "totalIterations": 1,
  "sessionId": "sess_abc123"
}
curl -X POST https://api.browsefleet.com/v1/agent \
  -H "Content-Type: application/json" \
  -H "x-api-key: bf_your_api_key" \
  -d '{
    "task": "Go to Hacker News and find the top post title",
    "url": "https://news.ycombinator.com",
    "provider": "anthropic",
    "maxIterations": 10
  }'
Agent on Existing Session
POST /v1/sessions/:id/agent
Run an agent task on an already-created session. The session is not released automatically; you control its lifecycle.
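Because the caller owns the session here, it is worth making sure the session is released even when the agent call fails. The sketch below wraps the three calls (create, run, release) in `try`/`finally`. The function name `runTaskOnOwnSession` and the injectable `fetchImpl` parameter are illustrative assumptions; the endpoints and the `id` field of the session response are the ones used elsewhere on this page.

```javascript
const BASE_URL = 'https://api.browsefleet.com';

// Sketch: create a session, run one agent task on it, and always release it.
// `fetchImpl` is injectable so the flow can be exercised without network access.
async function runTaskOnOwnSession(apiKey, task, fetchImpl = fetch) {
  const headers = { 'Content-Type': 'application/json', 'x-api-key': apiKey };
  const post = async (path, body) => {
    const res = await fetchImpl(`${BASE_URL}${path}`, {
      method: 'POST',
      headers,
      body: body === undefined ? undefined : JSON.stringify(body),
    });
    if (!res.ok) throw new Error(`POST ${path} failed: HTTP ${res.status}`);
    return res;
  };

  const session = await (await post('/v1/sessions', { stealth: 'full' })).json();
  try {
    const res = await post(`/v1/sessions/${session.id}/agent`, { task });
    return await res.json();
  } finally {
    // Runs on success and on error, so the session is not left open.
    await post(`/v1/sessions/${session.id}/release`);
  }
}
```

Several agent tasks can be run on the same session by moving more calls inside the `try` block before the release.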
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| task | string | Yes | — | Natural language description of the task. |
| url | string | No | — | URL to navigate to before starting. |
| provider | "anthropic" \| "openai" | No | "anthropic" | LLM provider. |
| model | string | No | — | Model ID. |
| maxIterations | number | No | 15 | Maximum iterations. |
| apiKey | string | No | — | LLM API key override. |
Request Body
{
  "task": "Fill in the contact form with test data and submit it",
  "provider": "anthropic"
}
Response
{
  "success": true,
  "result": "Successfully filled and submitted the contact form",
  "steps": [ ... ],
  "totalIterations": 5
}
# First create a session
SESSION_ID=$(curl -s -X POST https://api.browsefleet.com/v1/sessions \
  -H "Content-Type: application/json" \
  -H "x-api-key: bf_your_api_key" \
  -d '{"stealth":"full"}' | jq -r .id)
# Run agent on the session
curl -X POST "https://api.browsefleet.com/v1/sessions/$SESSION_ID/agent" \
  -H "Content-Type: application/json" \
  -H "x-api-key: bf_your_api_key" \
  -d '{
    "task": "Fill in the contact form with test data and submit it"
  }'
# Release the session when done
curl -X POST "https://api.browsefleet.com/v1/sessions/$SESSION_ID/release" \
  -H "x-api-key: bf_your_api_key"
Streaming Agent (SSE)
POST /v1/agent/stream
Run an agent task with Server-Sent Events (SSE) streaming. Receive screenshots, reasoning, and actions in real time as the agent works.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| task | string | Yes | — | Natural language description of the task. |
| url | string | No | — | Starting URL. |
| provider | "anthropic" \| "openai" | No | "anthropic" | LLM provider. |
| model | string | No | — | Model ID. |
| maxIterations | number | No | 15 | Maximum iterations. |
| apiKey | string | No | — | LLM API key override. |
SSE Event Types
| Event Type | Fields | Description |
|---|---|---|
| screenshot | iteration, screenshot | Screenshot taken before the LLM call |
| step | iteration, reasoning, actions | LLM response with reasoning and planned actions |
| done | result, totalIterations | Task completed successfully |
| fail | reason, totalIterations | Task could not be completed |
| error | error, iteration? | An error occurred |
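One way to consume these events is to fold them into a single summary object, switching on the `type` field. The sketch below assumes the events have already been parsed from their `data:` lines into plain objects. `summarizeAgentEvents` and the shape of the summary are hypothetical; the field names are the ones listed in the table.

```javascript
// Hypothetical helper: folds a sequence of parsed SSE events into a summary.
function summarizeAgentEvents(events) {
  const summary = { status: 'running', steps: [], screenshots: 0, result: null, reason: null, error: null };
  for (const event of events) {
    switch (event.type) {
      case 'screenshot':
        summary.screenshots += 1;
        break;
      case 'step':
        summary.steps.push({ iteration: event.iteration, reasoning: event.reasoning, actions: event.actions });
        break;
      case 'done':
        summary.status = 'done';
        summary.result = event.result;
        break;
      case 'fail':
        summary.status = 'failed';
        summary.reason = event.reason;
        break;
      case 'error':
        summary.status = 'error';
        summary.error = event.error;
        break;
      default:
        // Unknown event types are ignored rather than treated as errors.
        break;
    }
  }
  return summary;
}
```

A summary whose `status` is still `'running'` after the stream closes indicates that no terminal event (`done`, `fail`, or `error`) was received.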
data: {"type":"screenshot","iteration":0,"screenshot":"<base64>"}
data: {"type":"step","iteration":0,"reasoning":"I see the HN homepage...","actions":[{"type":"click","x":200,"y":100}]}
data: {"type":"screenshot","iteration":1,"screenshot":"<base64>"}
data: {"type":"step","iteration":1,"reasoning":"I can now see...","actions":[{"type":"done","result":"Found the answer"}]}
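data: {"type":"done","result":"Found the answer","totalIterations":2}
const response = await fetch('https://api.browsefleet.com/v1/agent/stream', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-api-key': 'bf_your_api_key',
  },
  body: JSON.stringify({
    task: 'Find the price of the first product on the page',
    url: 'https://example-shop.com',
  }),
});
if (!response.ok) {
  throw new Error(`Agent stream request failed: HTTP ${response.status}`);
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
// A network chunk can end in the middle of a line (screenshot events are large),
// so keep the unfinished tail in a buffer and parse only complete lines.
let buffer = '';
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters that span two chunks intact.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // the last element is an incomplete line (or '')
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const event = JSON.parse(line.slice(6));
      console.log(event.type, event.reasoning || event.result || event.reason || event.error || '');
    }
  }
}
Supported Providers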
| Provider | Default Model | Env Variable |
|---|---|---|
| anthropic | claude-sonnet-4-20250514 | ANTHROPIC_API_KEY |
| openai | gpt-4o | OPENAI_API_KEY |
You can pass the LLM API key in the request body (apiKey), or configure it on the server via environment variables. Request-level keys take precedence.
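The precedence rule can be pictured as a small lookup: a request-level key wins, otherwise the provider's environment variable is used. The server's actual implementation is not shown in this reference, so the function below (`resolveLlmApiKey`, a hypothetical name) is only an illustration of the documented behavior, and the error raised when no key is found is an assumption.

```javascript
// Environment variable per provider, from the table above.
const PROVIDER_ENV_VARS = {
  anthropic: 'ANTHROPIC_API_KEY',
  openai: 'OPENAI_API_KEY',
};

// Illustrative only: mirrors the documented precedence, not the real server code.
function resolveLlmApiKey(provider, requestApiKey, env = process.env) {
  if (!Object.prototype.hasOwnProperty.call(PROVIDER_ENV_VARS, provider)) {
    throw new Error(`Unsupported provider: ${provider}`);
  }
  const envVar = PROVIDER_ENV_VARS[provider];
  // A key in the request body takes precedence over the server-configured one.
  const key = requestApiKey || env[envVar];
  if (!key) {
    // Assumption: a missing key is treated as an error.
    throw new Error(`No LLM API key: pass apiKey in the request or set ${envVar}`);
  }
  return key;
}
```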