# OpenBrowser - Claude Code Plugin

AI-powered browser automation for Claude Code. Control real web browsers directly from Claude -- navigate websites, fill forms, extract data, inspect accessibility trees, and automate multi-step workflows.

## Prerequisites

- **Chrome or Chromium** installed on your system
- **Python 3.12+** with [uv](https://docs.astral.sh/uv/) package manager
- **Claude Code** CLI

## Installation

### From GitHub marketplace

```bash
# Add the OpenBrowser marketplace (one-time)
claude plugin marketplace add billy-enrizky/openbrowser-ai

# Install the plugin
claude plugin install openbrowser@openbrowser-ai
```

This installs the MCP server, 5 skills, and auto-enables the plugin. Restart Claude Code to activate.

### Local development

```bash
# Test from a local clone without installing
claude --plugin-dir /path/to/openbrowser-ai/plugin
```

### OpenClaw

[OpenClaw](https://openclaw.ai) does not natively support MCP servers, but the community
[openclaw-mcp-adapter](https://github.com/androidStern-personal/openclaw-mcp-adapter) plugin
bridges MCP servers to OpenClaw agents.

1. Install the MCP adapter plugin (see its README for setup).

2. Add OpenBrowser as an MCP server in `~/.openclaw/openclaw.json`:

   ```json
   {
     "plugins": {
       "entries": {
         "mcp-adapter": {
           "enabled": true,
           "config": {
             "servers": [
               {
                 "name": "openbrowser",
                 "transport": "stdio",
                 "command": "uvx",
                 "args": ["openbrowser-ai[mcp]", "--mcp"]
               }
             ]
           }
         }
       }
     }
   }
   ```

The `execute_code` tool will be registered as a native OpenClaw agent tool.

For OpenClaw plugin documentation, see [docs.openclaw.ai/tools/plugin](https://docs.openclaw.ai/tools/plugin).

### Standalone MCP server (without plugin)

Add to your project's `.mcp.json`:

```json
{
  "mcpServers": {
    "openbrowser": {
      "command": "uvx",
      "args": ["openbrowser-ai[mcp]", "--mcp"]
    }
  }
}
```

## Available Tool

The MCP server exposes a single `execute_code` tool that runs Python code in a persistent namespace with browser automation functions. The LLM writes Python code to navigate, interact, and extract data.

**Functions** (all async, use `await`):

| Category | Functions |
|----------|-----------|
| **Navigation** | `navigate(url, new_tab)`, `go_back()`, `wait(seconds)` |
| **Interaction** | `click(index)`, `input_text(index, text, clear)`, `scroll(down, pages, index)`, `send_keys(keys)`, `upload_file(index, path)` |
| **Dropdowns** | `select_dropdown(index, text)`, `dropdown_options(index)` |
| **Tabs** | `switch(tab_id)`, `close(tab_id)` |
| **JavaScript** | `evaluate(code)` -- run JS in page context, returns Python objects |
| **State** | `browser.get_browser_state_summary()` -- page metadata and interactive elements |
| **CSS** | `get_selector_from_index(index)` -- CSS selector for an element |
| **Completion** | `done(text, success)` -- signal task completion |

**Pre-imported libraries**: `json`, `csv`, `re`, `datetime`, `asyncio`, `Path`, `requests`, `numpy`, `pandas`, `matplotlib`, `BeautifulSoup`

## Benchmark: Token Efficiency

<<<<<<< HEAD

### E2E LLM Benchmark (6 Real-World Tasks, N=5 runs)

Six browser tasks run through Claude Sonnet 4.6 on AWS Bedrock (Converse API). The LLM autonomously decides which tools to call. All three servers pass **6/6 tasks**. 5 runs per server with 10,000-sample bootstrap CIs. Bedrock API tokens measured from the Converse API `usage` field.

| MCP Server | Tools | Bedrock API Tokens | Tool Calls (mean) | vs OpenBrowser |
|------------|------:|-------------------:|-----------:|---------------:|
| **Playwright MCP** | 22 | 158,787 | 9.4 | **3.2x more tokens** |
| **Chrome DevTools MCP** (Google) | 26 | 299,486 | 19.4 | **6.0x more tokens** |
| **OpenBrowser MCP** | 1 | **50,195** | 13.8 | baseline |

### Cost per Benchmark Run (6 Tasks)

Based on Bedrock API token usage (input + output tokens at respective rates).

| Model | Playwright MCP | Chrome DevTools MCP | OpenBrowser MCP |
|-------|---------------:|--------------------:|----------------:|
| Claude Sonnet 4.6 ($3/$15 per M) | $0.50 | $0.92 | **$0.18** |
| Claude Opus 4.6 ($5/$25 per M) | $0.83 | $1.53 | **$0.30** |

### Per-Task MCP Response Size

MCP tool response sizes show the architectural difference. Playwright and Chrome DevTools dump full page snapshots; OpenBrowser returns only extracted data.

| Task | Playwright MCP | Chrome DevTools MCP | OpenBrowser MCP |
|------|---------------:|--------------------:|----------------:|
| fact_lookup | 520,742 chars | 509,058 chars | 3,144 chars |
| form_fill | 4,075 chars | 3,150 chars | 2,305 chars |
| multi_page_extract | 58,392 chars | 38,880 chars | 294 chars |
| search_navigate | 519,241 chars | 595,590 chars | 2,848 chars |
| deep_navigation | 14,875 chars | 195 chars | 113 chars |
| content_analysis | 485 chars | 501 chars | 499 chars |

=======

### E2E LLM Benchmark (6 Real-World Tasks)

Six browser tasks run through Claude Sonnet 4.6 on AWS Bedrock. The LLM autonomously decides which tools to call. All three servers pass **6/6 tasks**. Token usage measured from actual MCP tool response sizes.

| MCP Server | Tools | Response Tokens | Tool Calls | vs OpenBrowser |
|------------|------:|----------------:|-----------:|---------------:|
| **Playwright MCP** | 22 | 283,853 | 10 | **170x more tokens** |
| **Chrome DevTools MCP** (Google) | 26 | 301,030 | 21 | **181x more tokens** |
| **OpenBrowser MCP** | 1 | **1,665** | 20 | baseline |

### Cost per Benchmark Run (6 Tasks)

| Model | Playwright MCP | Chrome DevTools MCP | OpenBrowser MCP |
|-------|---------------:|--------------------:|----------------:|
| Claude Sonnet ($3/M) | $0.852 | $0.903 | **$0.005** |
| Claude Opus ($15/M) | $4.258 | $4.515 | **$0.025** |

### Per-Task Response Size

| Task | Playwright MCP | Chrome DevTools MCP | OpenBrowser MCP |
|------|---------------:|--------------------:|----------------:|
| fact_lookup | 477,003 chars | 509,059 chars | 1,041 chars |
| form_fill | 4,075 chars | 3,150 chars | 2,410 chars |
| multi_page_extract | 58,099 chars | 38,593 chars | 513 chars |
| search_navigate | 518,461 chars | 594,458 chars | 1,996 chars |
| deep_navigation | 77,292 chars | 58,359 chars | 113 chars |
| content_analysis | 493 chars | 513 chars | 594 chars |

>>>>>>> origin/main

Playwright completes tasks in fewer tool calls (1-2 per task) because it dumps the full a11y snapshot on every navigation. OpenBrowser takes more round-trips but each response is compact -- the code extracts only what's needed.

[Full comparison with methodology](https://docs.openbrowser.me/comparison)

## Configuration

Optional environment variables:

| Variable | Description |
|----------|-------------|
| `OPENBROWSER_HEADLESS` | Set to `true` to run browser without GUI |
| `OPENBROWSER_ALLOWED_DOMAINS` | Comma-separated domain whitelist |

Set these in your `.mcp.json`:

```json
{
  "mcpServers": {
    "openbrowser": {
      "command": "uvx",
      "args": ["openbrowser-ai[mcp]", "--mcp"],
      "env": {
        "OPENBROWSER_HEADLESS": "true"
      }
    }
  }
}
```

## Skills

The plugin includes 5 built-in skills that provide guided workflows for common browser automation tasks. Each skill is triggered automatically when the user's request matches its description.

| Skill | Directory | Description |
|-------|-----------|-------------|
| `web-scraping` | `skills/web-scraping/` | Extract structured data from websites, handle pagination, and multi-tab scraping |
| `form-filling` | `skills/form-filling/` | Fill out web forms, handle login/registration flows, and multi-step wizards |
| `e2e-testing` | `skills/e2e-testing/` | Test web applications end-to-end by simulating user interactions and verifying outcomes |
| `page-analysis` | `skills/page-analysis/` | Analyze page content, structure, metadata, and interactive elements |
| `accessibility-audit` | `skills/accessibility-audit/` | Audit pages for WCAG compliance, heading structure, labels, alt text, ARIA, and landmarks |

Each skill file (`SKILL.md`) contains YAML frontmatter with trigger conditions and a step-by-step workflow using the `execute_code` tool.

## Testing and Benchmarks

```bash
# E2E test the MCP server against the published PyPI package
uv run python benchmarks/e2e_published_test.py

# Run MCP benchmarks (5-step Wikipedia workflow)
uv run python benchmarks/openbrowser_benchmark.py
uv run python benchmarks/playwright_benchmark.py
uv run python benchmarks/cdp_benchmark.py
```

## Troubleshooting

**Browser does not launch**: Ensure Chrome or Chromium is installed and accessible from PATH.

**MCP server not found**: Verify `uvx` is installed (`pip install uv`) and the MCP server starts (`uvx openbrowser-ai[mcp] --mcp`).

**Session timeout**: Browser sessions auto-close after 10 minutes of inactivity. Use any tool to keep the session alive.

## License

MIT
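As a cross-check on the "vs OpenBrowser" column in the first (N=5 runs) E2E LLM Benchmark table, the multipliers appear to be plain ratios of each server's Bedrock token total to OpenBrowser's. The sketch below recomputes them from the figures quoted in that table; the helper name `ratio_vs_baseline` is illustrative and not part of OpenBrowser, and the rounding to one decimal place is an assumption about how the table was produced.

```python
# Token totals copied from the "Bedrock API Tokens" column of the
# N=5 runs benchmark table in this README.
BEDROCK_TOKENS = {
    "Playwright MCP": 158_787,
    "Chrome DevTools MCP": 299_486,
    "OpenBrowser MCP": 50_195,
}


def ratio_vs_baseline(tokens: dict[str, int], baseline: str) -> dict[str, float]:
    """Divide each server's token total by the baseline's, rounded to 1 decimal.

    Hypothetical helper for illustration; one-decimal rounding is assumed.
    """
    base = tokens[baseline]
    return {name: round(count / base, 1) for name, count in tokens.items()}


ratios = ratio_vs_baseline(BEDROCK_TOKENS, "OpenBrowser MCP")
for name, value in ratios.items():
    print(f"{name}: {value}x")
```

Under these assumptions the computed ratios match the multipliers printed in that table, with the baseline row evaluating to 1.0.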