API Reference

Computer API

The Computer API lets you interact with browser sessions through structured actions. Click, type, scroll, navigate, and take screenshots. Every action returns a screenshot of the resulting browser state, making it ideal for AI agents that use vision models.

POST/v1/sessions/:id/actions

Execute one or more browser actions on an active session.

Parameters

NameTypeRequiredDefaultDescription
actionsBrowserAction[]YesArray of actions to execute sequentially. Each action returns a result.

Request Body

{
  "actions": [
    { "type": "navigate", "url": "https://example.com" },
    { "type": "screenshot" },
    { "type": "click", "x": 200, "y": 350 },
    { "type": "type", "text": "hello world" },
    { "type": "press_key", "key": "Enter" },
    { "type": "screenshot" }
  ]
}

Response

{
  "results": [
    { "type": "navigate", "success": true, "screenshot": "<base64-png>" },
    { "type": "screenshot", "success": true, "screenshot": "<base64-png>" },
    { "type": "click", "success": true, "screenshot": "<base64-png>" },
    { "type": "type", "success": true, "screenshot": "<base64-png>" },
    { "type": "press_key", "success": true, "screenshot": "<base64-png>" },
    { "type": "screenshot", "success": true, "screenshot": "<base64-png>" }
  ]
}
curl -X POST https://api.browsefleet.com/v1/sessions/sess_abc123/actions \
  -H "Content-Type: application/json" \
  -H "x-api-key: bf_your_api_key" \
  -d '{
    "actions": [
      { "type": "navigate", "url": "https://example.com" },
      { "type": "screenshot" }
    ]
  }'

Action Types

screenshot

Capture a PNG screenshot of the current page state.

json
{ "type": "screenshot" }

click

Click at the specified coordinates. Returns a screenshot after the click.

FieldTypeRequiredDescription
xnumberYesX coordinate in pixels
ynumberYesY coordinate in pixels
button"left" | "right" | "middle"NoMouse button (default: "left")
clickCountnumberNoNumber of clicks (default: 1, use 2 for double-click)
json
{ "type": "click", "x": 500, "y": 300, "button": "left", "clickCount": 1 }

type

Type text into the currently focused element. Characters are typed with a 30ms delay for realism.

FieldTypeRequiredDescription
textstringYesText to type
json
{ "type": "type", "text": "hello world" }

press_key

Press a keyboard key. Supports all Puppeteer key names (Enter, Tab, Escape, ArrowDown, etc.).

FieldTypeRequiredDescription
keystringYesKey name (e.g. "Enter", "Tab", "Escape")
json
{ "type": "press_key", "key": "Enter" }

scroll

Scroll the page by the specified delta. Waits 500ms after scrolling for content to load.

FieldTypeRequiredDescription
deltaXnumberNoHorizontal scroll pixels (default: 0)
deltaYnumberNoVertical scroll pixels (default: 0, positive = down)
json
{ "type": "scroll", "deltaX": 0, "deltaY": 500 }

move_mouse

Move the mouse cursor to the specified coordinates without clicking.

json
{ "type": "move_mouse", "x": 500, "y": 300 }

wait

Wait for the specified duration before continuing. Maximum 30 seconds.

json
{ "type": "wait", "duration": 2000 }

navigate

Navigate the browser to a new URL. Waits for network idle before completing.

json
{ "type": "navigate", "url": "https://example.com/page" }

AI Agent Integration

The Computer API is designed for integration with vision-capable LLMs. The typical loop is: take a screenshot, send it to the LLM for analysis, execute the LLM's recommended actions, take another screenshot, and repeat.

import Anthropic from '@anthropic-ai/sdk';
import { BrowseFleet } from 'browsefleet';

const bf = new BrowseFleet({ apiKey: 'bf_...' });
const anthropic = new Anthropic();

const session = await bf.sessions.create({ stealth: 'full' });

// Navigate and get initial screenshot
let result = await bf.sessions.actions(session.id, [
  { type: 'navigate', url: 'https://example.com' },
  { type: 'screenshot' },
]);

let screenshot = result.results[1].screenshot;

// Send to Claude for analysis
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      { type: 'image', source: { type: 'base64', media_type: 'image/png', data: screenshot } },
      { type: 'text', text: 'Click the "Learn More" link on this page.' },
    ],
  }],
});

// Parse Claude's response and execute actions
// ... parse coordinates from response ...
await bf.sessions.actions(session.id, [
  { type: 'click', x: parsedX, y: parsedY },
  { type: 'screenshot' },
]);