Add this skill: `npx mdskills install TickTockBent/charlotte`

Token-efficient browser automation MCP with 42 tools, structured page representation, and 25-182x data reduction.
# Charlotte

**The Web, Readable.**

Your AI agent spends 60,000 tokens just to look at a web page. Charlotte does it in 336.

Charlotte is an MCP server that gives AI agents structured, token-efficient access to the web.
Instead of dumping the full accessibility tree on every call, Charlotte returns only what
the agent needs: a compact page summary on arrival, targeted queries for specific elements,
and full detail only when explicitly requested. The result is 25-182x less data per page
compared to [Playwright MCP](https://github.com/anthropics/playwright-mcp), saving thousands of dollars across production workloads.

## Why Charlotte?

Most browser MCP servers dump the entire accessibility tree on every call — a flat text blob that can exceed a million characters on content-heavy pages. Agents pay for all of it whether they need it or not.

Charlotte decomposes each page into a typed, structured representation — landmarks, headings, interactive elements, forms, content summaries — and lets agents control how much they receive with three detail levels. When an agent navigates to a new page, it gets a compact orientation (336 characters for Hacker News) instead of the full element dump (61,000+ characters). When it needs specifics, it asks for them.

### Benchmarks

Charlotte v0.5.0 vs Playwright MCP, measured by characters returned per tool call on real websites:

**Navigation** (first contact with a page):

| Site | Charlotte `navigate` | Playwright `browser_navigate` |
|:---|---:|---:|
| example.com | 612 | 817 |
| Wikipedia (AI article) | 7,667 | 1,040,636 |
| Hacker News | 336 | 61,230 |
| GitHub repo | 3,185 | 80,297 |

Charlotte's `navigate` returns minimal detail by default — landmarks, headings, and interactive element counts grouped by page region. Enough to orient, not enough to overwhelm. On Wikipedia, that's **135x smaller** than Playwright's response.

**Tool definition overhead** (invisible cost per API call):

| Profile | Tools | Def. tokens/call | Savings vs full |
|:---|---:|---:|---:|
| full | 42 | ~7,400 | — |
| browse (default) | 23 | ~3,900 | **~47%** |
| core | 7 | 1,677 | **~77%** |

Tool definitions are sent on every API round-trip. With the default `browse` profile, Charlotte carries ~47% less definition overhead than loading all tools. Over a 20-call browsing session, that's **~38% fewer total tokens**. See the [profile benchmark report](docs/charlotte-profile-benchmark-report.md) for full results.

**The workflow difference:** Playwright agents receive 61K+ characters every time they look at Hacker News, whether they're reading headlines or looking for a login button. Charlotte agents get 336 characters on arrival, call `find({ type: "link", text: "login" })` to get exactly what they need, and never pay for the rest.

## How It Works

Charlotte maintains a persistent headless Chromium session and acts as a translation layer between the visual web and the agent's text-native reasoning. Every page is decomposed into a structured representation:

```
┌─────────────┐     MCP Protocol     ┌──────────────────┐
│   AI Agent  │<────────────────────>│    Charlotte     │
└─────────────┘                      │                  │
                                     │  ┌────────────┐  │
                                     │  │  Renderer  │  │
                                     │  │  Pipeline  │  │
                                     │  └─────┬──────┘  │
                                     │        │         │
                                     │  ┌─────▼──────┐  │
                                     │  │  Headless  │  │
                                     │  │  Chromium  │  │
                                     │  └────────────┘  │
                                     └──────────────────┘
```

Agents receive landmarks, headings, interactive elements with typed metadata, bounding boxes, form structures, and content summaries — all derived from what the browser already knows about every page.

## Features

**Navigation** — `navigate`, `back`, `forward`, `reload`

**Observation** — `observe` (3 detail levels, structural tree view), `find` (spatial + semantic search, CSS selector mode), `screenshot` (with persistent artifact management), `screenshots`, `screenshot_get`, `screenshot_delete`, `diff` (structural comparison against snapshots)

**Interaction** — `click`, `click_at` (coordinate-based), `type`, `select`, `toggle`, `submit`, `scroll`, `hover`, `drag`, `key` (single/sequence with element targeting), `wait_for` (async condition polling), `upload` (file input), `dialog` (accept/dismiss JS dialogs)

**Monitoring** — `console` (all severity levels, filtering, timestamps), `requests` (full HTTP history, method/status/resource type filtering)

**Session Management** — `tabs`, `tab_open`, `tab_switch`, `tab_close`, `viewport` (device presets), `network` (throttling, URL blocking), `set_cookies`, `get_cookies`, `clear_cookies`, `set_headers`, `configure`

**Development Mode** — `dev_serve` (static server + file watching with auto-reload), `dev_inject` (CSS/JS injection), `dev_audit` (a11y, performance, SEO, contrast, broken links)

**Utilities** — `evaluate` (arbitrary JS execution in page context)

## Tool Profiles

Charlotte ships 42 tools (41 registered + the `charlotte:tools` meta-tool), but most workflows only need a subset. Startup profiles control which tools load into the agent's context, reducing definition overhead by up to 77%.

```bash
charlotte --profile browse    # 23 tools (default) — navigate, observe, interact, tabs
charlotte --profile core      # 7 tools — navigate, observe, find, click, type, submit
charlotte --profile full      # 42 tools — everything
charlotte --profile interact  # 30 tools — full interaction + dialog + evaluate
charlotte --profile develop   # 33 tools — interact + dev_serve, dev_inject, dev_audit
charlotte --profile audit     # 14 tools — navigation + observation + dev_audit + viewport
```

Agents can activate more tools mid-session without restarting:

```
charlotte:tools enable dev_mode    → activates dev_serve, dev_audit, dev_inject
charlotte:tools disable dev_mode   → deactivates them
charlotte:tools list               → see what's loaded
```

## Quick Start

### Prerequisites

- Node.js >= 22
- npm

### Installation

Charlotte is listed on the [MCP Registry](https://registry.modelcontextprotocol.io) as `io.github.TickTockBent/charlotte` and published on npm as [`@ticktockbent/charlotte`](https://www.npmjs.com/package/@ticktockbent/charlotte):

```bash
npm install -g @ticktockbent/charlotte
```

Docker images are available on [Docker Hub](https://hub.docker.com/r/ticktockbent/charlotte) and [GitHub Container Registry](https://github.com/ticktockbent/charlotte/pkgs/container/charlotte):

```bash
# Alpine (default, smaller)
docker pull ticktockbent/charlotte:alpine

# Debian (if you need glibc compatibility)
docker pull ticktockbent/charlotte:debian

# Or from GHCR
docker pull ghcr.io/ticktockbent/charlotte:latest
```

Or install from source:

```bash
git clone https://github.com/ticktockbent/charlotte.git
cd charlotte
npm install
npm run build
```

### Run

Charlotte communicates over stdio using the MCP protocol:

```bash
# If installed globally (default browse profile)
charlotte

# With a specific profile
charlotte --profile core

# If installed from source
npm start
```

### MCP Client Configuration

#### Claude Code

Create `.mcp.json` in your project root:

```json
{
  "mcpServers": {
    "charlotte": {
      "type": "stdio",
      "command": "npx",
      "args": ["@ticktockbent/charlotte"],
      "env": {}
    }
  }
}
```

#### Claude Desktop

Add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "charlotte": {
      "command": "npx",
      "args": ["@ticktockbent/charlotte"]
    }
  }
}
```

#### Cursor

Add to `.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "charlotte": {
      "command": "npx",
      "args": ["@ticktockbent/charlotte"]
    }
  }
}
```

#### Windsurf

Add to `~/.codeium/windsurf/mcp_config.json`:

```json
{
  "mcpServers": {
    "charlotte": {
      "command": "npx",
      "args": ["@ticktockbent/charlotte"]
    }
  }
}
```

#### VS Code (Copilot)

Add to `.vscode/mcp.json`:

```json
{
  "servers": {
    "charlotte": {
      "type": "stdio",
      "command": "npx",
      "args": ["@ticktockbent/charlotte"]
    }
  }
}
```

#### Cline

Add to Cline MCP settings (via the Cline sidebar > MCP Servers > Configure):

```json
{
  "mcpServers": {
    "charlotte": {
      "command": "npx",
      "args": ["@ticktockbent/charlotte"]
    }
  }
}
```

#### Amp

Add to `~/.amp/settings.json`:

```json
{
  "mcpServers": {
    "charlotte": {
      "command": "npx",
      "args": ["@ticktockbent/charlotte"]
    }
  }
}
```

See [docs/mcp-setup.md](docs/mcp-setup.md) for the full setup guide, including development mode, generic MCP clients, verification steps, and troubleshooting.

## Usage Examples

Once connected, an agent can use Charlotte's tools:

### Browse a website

```
navigate({ url: "https://example.com" })
// → 612 chars: landmarks, headings, interactive element counts

find({ type: "link", text: "More information" })
// → just the matching element with its ID

click({ element_id: "lnk-a3f1" })
```

### Fill out a form

```
navigate({ url: "https://httpbin.org/forms/post" })
find({ type: "text_input" })
type({ element_id: "inp-c7e2", text: "hello@example.com" })
select({ element_id: "sel-e8a3", value: "option-2" })
submit({ form_id: "frm-b1d4" })
```

### Local development feedback loop

```
dev_serve({ path: "./my-site", watch: true })
observe({ detail: "full" })
dev_audit({ checks: ["a11y", "contrast"] })
dev_inject({ css: "body { font-size: 18px; }" })
```

## Page Representation

Charlotte returns structured representations with three detail levels that let agents control how much context they consume:

### Minimal (default for `navigate`)

Landmarks, headings, and interactive element counts grouped by page region. Designed for orientation — "what's on this page?" — without listing every element.

```json
{
  "url": "https://news.ycombinator.com",
  "title": "Hacker News",
  "viewport": { "width": 1280, "height": 720 },
  "structure": {
    "headings": [{ "level": 1, "text": "Hacker News", "id": "h-a1b2" }]
  },
  "interactive_summary": {
    "total": 93,
    "by_landmark": {
      "(page root)": { "link": 91, "text_input": 1, "button": 1 }
    }
  }
}
```

### Summary (default for `observe`)

Full interactive element list with typed metadata, form structures, and content summaries.

```json
{
  "url": "https://example.com/dashboard",
  "title": "Dashboard",
  "viewport": { "width": 1280, "height": 720 },
  "structure": {
    "landmarks": [
      { "id": "rgn-b2c1", "role": "banner", "label": "Site header", "bounds": { "x": 0, "y": 0, "w": 1280, "h": 64 } },
      { "id": "rgn-d4e5", "role": "main", "label": "Content", "bounds": { "x": 240, "y": 64, "w": 1040, "h": 656 } }
    ],
    "headings": [{ "level": 1, "text": "Dashboard", "id": "h-1a2b" }],
    "content_summary": "main: 2 headings, 5 links, 1 form"
  },
  "interactive": [
    {
      "id": "btn-a3f1",
      "type": "button",
      "label": "Create Project",
      "bounds": { "x": 960, "y": 80, "w": 160, "h": 40 },
      "state": {}
    }
  ],
  "forms": []
}
```

### Full

Everything in summary, plus all visible text content on the page.

## Detail Levels

| Level | Tokens | Use case |
|:---|:---|:---|
| `minimal` | ~50-200 | Orientation after navigation. What regions exist? How many interactive elements? |
| `summary` | ~500-5000 | Working with the page. Full element list, form structures, content summaries. |
| `full` | variable | Reading page content. All visible text included. |

Navigation tools default to `minimal`. The `observe` tool defaults to `summary`. Both accept an optional `detail` parameter to override.

## Element IDs

Element IDs are stable across minor DOM mutations. They're generated by hashing a composite key of element type, ARIA role, accessible name, and DOM path signature:

```
btn-a3f1  (button)      inp-c7e2  (text input)
lnk-d4b9  (link)        sel-e8a3  (select)
chk-f1a2  (checkbox)    frm-b1d4  (form)
rgn-e0d2  (landmark)    hdg-0f40  (heading)
dom-b2c3  (DOM element, from CSS selector queries)
```

IDs survive unrelated DOM changes and element reordering within the same container. When an agent navigates at minimal detail (no individual element IDs), it uses `find` to locate elements by text, type, or spatial proximity — the returned elements include IDs ready for interaction.

## Development

```bash
# Run in watch mode
npm run dev

# Run all tests
npm test

# Run only unit tests
npm run test:unit

# Run only integration tests
npm run test:integration

# Type check
npx tsc --noEmit
```

### Project Structure

```
src/
  browser/          # Puppeteer lifecycle, tab management, CDP sessions
  renderer/         # Accessibility tree extraction, layout, content, element IDs
  state/            # Snapshot store, structural differ
  tools/            # MCP tool definitions (navigation, observation, interaction, session, dev-mode)
  dev/              # Static server, file watcher, auditor
  types/            # TypeScript interfaces
  utils/            # Logger, hash, wait utilities
tests/
  unit/             # Fast tests with mocks
  integration/      # Full Puppeteer tests against fixture HTML
  fixtures/pages/   # Test HTML files
```

### Architecture

The **Renderer Pipeline** is the core — it calls extractors in order and assembles a `PageRepresentation`:

1. Accessibility tree extraction (CDP `Accessibility.getFullAXTree`)
2. Layout extraction (CDP `DOM.getBoxModel`)
3. Landmark, heading, interactive element, and content extraction
4. Element ID generation (hash-based, stable across re-renders)

All tools go through `renderActivePage()` which handles snapshots, reload events, dialog detection, and response formatting.

## Sandbox

Charlotte includes a test website in `tests/sandbox/` that exercises all tools without touching the public internet. Serve it locally with:

```
dev_serve({ path: "tests/sandbox" })
```

Four pages cover navigation, forms, interactive elements, delayed content, scroll containers, and more. See [docs/sandbox.md](docs/sandbox.md) for the full page reference and a tool-by-tool exercise checklist.

## Known Issues

**Tool naming convention** — Charlotte uses `:` as a namespace separator in tool names (e.g., `charlotte:navigate`, `charlotte:observe`). MCP SDK v1.26.0+ logs validation warnings for this character, as the emerging [SEP standard](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/986) restricts tool names to `[A-Za-z0-9_.-]`. This does not affect functionality — all tools work correctly — but produces stderr warnings on server startup. Will be addressed in a future release to comply with the SEP standard.

**Shadow DOM** — Open shadow DOM works transparently. Chromium's accessibility tree pierces open shadow boundaries, so web components (e.g., GitHub's `<relative-time>`, `<tool-tip>`) render their content into Charlotte's representation without special handling. Closed shadow roots are opaque to the accessibility tree and will not be captured.

## Roadmap

### Interaction Gaps

**Batch Form Fill** — Add a `charlotte:fill_form` tool that accepts an array of `{element_id, value}` pairs and fills an entire form in a single tool call, reducing N sequential `type`/`select`/`toggle` calls to one.

**Slow Typing** — Add a `slowly` or `character_delay` parameter to `charlotte:type` for character-by-character input. Required for sites with key-by-key event handlers (autocomplete, search-as-you-type, input validation).

### Session & Configuration

**Connect to Existing Browser** — Add a `--cdp-endpoint` CLI argument so Charlotte can attach to an already-running browser via `puppeteer.connect()` instead of always launching a new instance. Enables working with logged-in sessions and browser extensions.

**Persistent Init Scripts** — Add a `--init-script` CLI argument to inject JavaScript on every page load via `page.evaluateOnNewDocument()`. Charlotte's `dev_inject` currently applies CSS/JS once and does not persist across navigations.

**Configuration File** — Support a `--config` CLI argument to load settings from a JSON file, simplifying repeatable setups and CI/CD integration.

**Full Device Emulation** — Extend `charlotte:viewport` to accept named devices (e.g., "iPhone 15") and configure user agent, touch support, and device pixel ratio via CDP, not just viewport dimensions.

### Feature Roadmap

**Video Recording** — Record interactions as video, capturing the full sequence of agent-driven navigation and manipulation for debugging, documentation, and review.

**ARM64 Docker Images** — Add `linux/arm64` platform support to the Docker publish workflow for native performance on Apple Silicon Macs and ARM servers.

See [docs/playwright-mcp-gap-analysis.md](docs/playwright-mcp-gap-analysis.md) for the full gap analysis against Playwright MCP, including lower-priority items (vision tools, testing/verification, tracing, transport, security) and areas where Charlotte has advantages.

## Full Specification

See [docs/CHARLOTTE_SPEC.md](docs/CHARLOTTE_SPEC.md) for the complete specification including all tool parameters, the page representation format, element identity strategy, and architecture details.

## License

[MIT](LICENSE)

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

---

*Part of a growing suite of literary-named MCP servers. See more at [github.com/TickTockBent](https://github.com/TickTockBent).*
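As a sanity check on the reduction figures quoted in the Benchmarks section, the ratios can be recomputed from the navigation table. The sketch below is illustrative only: the `NavBenchmark` type and `reductionFactor` helper are hypothetical names, not part of Charlotte's API, and the character counts are copied from the table as published.

```typescript
// Hypothetical helper for re-deriving the benchmark ratios; not Charlotte code.
interface NavBenchmark {
  site: string;
  charlotteChars: number;  // characters returned by Charlotte `navigate`
  playwrightChars: number; // characters returned by Playwright `browser_navigate`
}

// Figures copied from the Navigation benchmark table.
const navBenchmarks: NavBenchmark[] = [
  { site: "example.com", charlotteChars: 612, playwrightChars: 817 },
  { site: "Wikipedia (AI article)", charlotteChars: 7667, playwrightChars: 1040636 },
  { site: "Hacker News", charlotteChars: 336, playwrightChars: 61230 },
  { site: "GitHub repo", charlotteChars: 3185, playwrightChars: 80297 },
];

// How many times larger the Playwright response is than Charlotte's.
function reductionFactor(b: NavBenchmark): number {
  return b.playwrightChars / b.charlotteChars;
}

for (const b of navBenchmarks) {
  console.log(`${b.site}: ${reductionFactor(b).toFixed(1)}x`);
}
// example.com: 1.3x
// Wikipedia (AI article): 135.7x
// Hacker News: 182.2x
// GitHub repo: 25.2x
```

Read this way, the 25-182x range in the introduction corresponds to the GitHub and Hacker News rows, and the 135x figure to Wikipedia. The near-empty example.com page shows only about a 1.3x difference, so the savings appear to scale with page complexity.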