Spreadsheet automation for AI agents. Read, profile, edit, and recalculate .xlsx workbooks with tooling designed to be token-efficient, structurally aware, and agent-friendly. spreadsheet-kit ships two surfaces: Both share the same core engine and support .xlsx / .xlsm (read + write) and .xls / .xlsb (discovery only). Downloads a prebuilt native binary for your platform. No Rust toolchain required
Add this skill
npx mdskills install PSU3D0/spreadsheet-mcpComprehensive spreadsheet automation MCP server with rich CLI and structured data operations
1# spreadsheet-kit23[](https://github.com/PSU3D0/spreadsheet-mcp/actions/workflows/ci.yml)4[](https://crates.io/crates/spreadsheet-mcp)5[](https://www.npmjs.com/package/agent-spreadsheet)6[](https://github.com/PSU3D0/spreadsheet-mcp/blob/main/LICENSE)78910**Spreadsheet automation for AI agents.** Read, profile, edit, and recalculate `.xlsx` workbooks with tooling designed to be token-efficient, structurally aware, and agent-friendly.1112spreadsheet-kit ships two surfaces:1314| Surface | Binary | Mode | Best for |15| --- | --- | --- | --- |16| **[agent-spreadsheet](#quickstart-cli)** | `agent-spreadsheet` | Stateless CLI | Scripts, pipelines, one-shot agent tasks |17| **[spreadsheet-mcp](#quickstart-mcp-server)** | `spreadsheet-mcp` | Stateful MCP server | Multi-turn agent sessions with caching and fork/recalc |1819Both share the same core engine and support `.xlsx` / `.xlsm` (read + write) and `.xls` / `.xlsb` (discovery only).2021---2223## Install2425### npm (recommended for CLI)2627```bash28npm i -g agent-spreadsheet29agent-spreadsheet --help30```3132Downloads a prebuilt native binary for your platform. No Rust toolchain required.3334### Cargo3536```bash37# CLI38cargo install agent-spreadsheet3940# MCP server41cargo install spreadsheet-mcp42```4344Formualizer (native Rust recalc engine) is included by default. To build without it, use `--no-default-features`.4546### Prebuilt binaries4748Download from [GitHub Releases](https://github.com/PSU3D0/spreadsheet-mcp/releases).49Builds are published for Linux x86_64, macOS x86_64/aarch64, and Windows x86_64.5051### Docker (MCP server)5253```bash54# Read-only (~15 MB)55docker pull ghcr.io/psu3d0/spreadsheet-mcp:latest5657# With write/recalc support (~800 MB, includes LibreOffice)58docker pull ghcr.io/psu3d0/spreadsheet-mcp:full59```6061---6263## Quickstart: CLI6465The CLI is the fastest path to working with spreadsheets from code. Every command is stateless — pass a file, get JSON.6667```bash68# List sheets69agent-spreadsheet list-sheets data.xlsx7071# Profile structure and detected regions72agent-spreadsheet sheet-overview data.xlsx "Sheet1"7374# Read a table as structured data75agent-spreadsheet read-table data.xlsx --sheet "Sheet1"7677# Read one or more raw ranges78agent-spreadsheet range-values data.xlsx Sheet1 A1:C207980# Search values directly (default mode is value)81agent-spreadsheet find-value data.xlsx "Revenue" --mode value8283# Label lookup: match a label cell, then read an adjacent value84agent-spreadsheet find-value data.xlsx "Net Income" --mode label --label-direction below8586# Describe workbook metadata87agent-spreadsheet describe data.xlsx88```8990### Deterministic pagination loops9192```bash93# sheet-page continuation94agent-spreadsheet sheet-page data.xlsx Sheet1 --format compact --page-size 20095agent-spreadsheet sheet-page data.xlsx Sheet1 --format compact --page-size 200 --start-row 2019697# read-table continuation98agent-spreadsheet read-table data.xlsx --sheet "Sheet1" --table-format values --limit 200 --offset 099agent-spreadsheet read-table data.xlsx --sheet "Sheet1" --table-format values --limit 200 --offset 200100```101102### Edit → recalculate → diff103104```bash105agent-spreadsheet copy data.xlsx /tmp/draft.xlsx106agent-spreadsheet edit /tmp/draft.xlsx Sheet1 "B2=500" "C2==B2*1.1"107agent-spreadsheet recalculate /tmp/draft.xlsx108agent-spreadsheet diff data.xlsx /tmp/draft.xlsx109```110111### Stateless batch writes (`--ops @...`)112113```bash114agent-spreadsheet transform-batch data.xlsx --ops @ops.json --dry-run115agent-spreadsheet style-batch data.xlsx --ops @style_ops.json --dry-run116agent-spreadsheet apply-formula-pattern data.xlsx --ops @formula_ops.json --in-place117agent-spreadsheet structure-batch data.xlsx --ops @structure_ops.json --dry-run118agent-spreadsheet column-size-batch data.xlsx --ops @column_size_ops.json --output resized.xlsx119agent-spreadsheet sheet-layout-batch data.xlsx --ops @layout_ops.json --dry-run120agent-spreadsheet rules-batch data.xlsx --ops @rules_ops.json --output ruled.xlsx --force121```122123#### Batch payload examples (JSON body passed via `--ops @file.json`)124125All batch payloads use a top-level envelope object. Most commands require `{"ops":[...]}`; `column-size-batch` prefers `{"sheet_name":"...","ops":[...]}` and also accepts per-op `sheet_name` inside `{"ops":[...]}`.126127##### transform-batch payloads (`@transform_ops.json`)128- Minimal: `{"ops":[{"kind":"fill_range","sheet_name":"Sheet1","target":{"kind":"range","range":"B2:B4"},"value":"0"}]}`129- Advanced: `{"ops":[{"kind":"replace_in_range","sheet_name":"Sheet1","target":{"kind":"region","region_id":1},"find":"N/A","replace":"","match_mode":"contains","case_sensitive":false,"include_formulas":true}]}`130131##### style-batch payloads (`@style_ops.json`)132- Minimal: `{"ops":[{"sheet_name":"Sheet1","target":{"kind":"range","range":"B2:B2"},"patch":{"font":{"bold":true}}}]}`133- Advanced: `{"ops":[{"sheet_name":"Sheet1","target":{"kind":"cells","cells":["B2","B3"]},"patch":{"number_format":"$#,##0.00","alignment":{"horizontal":"right"}},"op_mode":"merge"}]}`134135##### apply-formula-pattern payloads (`@formula_ops.json`)136- Minimal: `{"ops":[{"sheet_name":"Sheet1","target_range":"C2:C4","anchor_cell":"C2","base_formula":"B2*2"}]}`137- Advanced: `{"ops":[{"sheet_name":"Sheet1","target_range":"C2:E4","anchor_cell":"C2","base_formula":"B2*2","fill_direction":"both","relative_mode":"excel"}]}`138- `relative_mode` valid values: `excel`, `abs_cols`, `abs_rows`139140##### structure-batch payloads (`@structure_ops.json`)141- Minimal: `{"ops":[{"kind":"rename_sheet","old_name":"Summary","new_name":"Dashboard"}]}`142- Advanced: `{"ops":[{"kind":"copy_range","sheet_name":"Sheet1","dest_sheet_name":"Summary","src_range":"A1:C4","dest_anchor":"A1","include_styles":true,"include_formulas":true}]}`143144##### column-size-batch payloads (`@column_size_ops.json`)145- Minimal (preferred): `{"sheet_name":"Sheet1","ops":[{"range":"A:A","size":{"kind":"width","width_chars":12.0}}]}`146- Advanced (preferred): `{"sheet_name":"Sheet1","ops":[{"target":{"kind":"columns","range":"A:C"},"size":{"kind":"auto","min_width_chars":8.0,"max_width_chars":24.0}}]}`147- Also accepted (harmonized shape): `{"ops":[{"sheet_name":"Sheet1","range":"A:A","size":{"kind":"width","width_chars":12.0}}]}`148149##### sheet-layout-batch payloads (`@layout_ops.json`)150- Minimal: `{"ops":[{"kind":"freeze_panes","sheet_name":"Sheet1","freeze_rows":1,"freeze_cols":1}]}`151- Advanced: `{"ops":[{"kind":"set_page_setup","sheet_name":"Sheet1","orientation":"landscape","fit_to_width":1,"fit_to_height":1}]}`152153##### rules-batch payloads (`@rules_ops.json`)154- Minimal: `{"ops":[{"kind":"set_data_validation","sheet_name":"Sheet1","target_range":"B2:B4","validation":{"kind":"list","formula1":"\"A,B,C\""}}]}`155- Advanced: `{"ops":[{"kind":"set_conditional_format","sheet_name":"Sheet1","target_range":"C2:C10","rule":{"kind":"expression","formula":"C2>100"},"style":{"fill_color":"#FFF2CC","bold":true}}]}`156157`apply-formula-pattern` clears cached results for touched formula cells; run `recalculate` to refresh computed values.158159All output is JSON by default.160Use `--shape canonical|compact` (default: `canonical`) to control response shape.161162### Formula parse policy163164Commands that tokenize or validate formulas accept `--formula-parse-policy <fail|warn|off>`:165166| Mode | Behavior | Default for |167| --- | --- | --- |168| **fail** | Abort on any formula parse error | `edit` (single-write) |169| **warn** | Continue; attach `formula_parse_diagnostics` to the response | `scan-volatiles`, `formula-map`, `formula-trace`, `transform-batch`, `structure-batch`, `rules-batch` |170| **off** | Silently skip parse errors | — |171172When the policy is `warn` (or `fail` with errors), the response includes a `formula_parse_diagnostics` object:173174```json175{176 "formula_parse_diagnostics": {177 "policy": "warn",178 "total_errors": 400,179 "groups_truncated": false,180 "groups": [181 {182 "error_code": "FORMULA_PARSE_FAILED",183 "error_message": "No matching opener for closer at position 161",184 "sheet_name": "Assessments",185 "formula_preview": "=IF(C4=\"\",\"\",IF(C4=\"N/A\",…",186 "count": 400,187 "sample_addresses": ["D4", "D5", "D10", "D11", "D12"]188 }189 ]190 }191}192```193194Errors are grouped by structural pattern — formulas that differ only in cell references (e.g., `=IF(C4=…)` and `=IF(C5=…)`) collapse into a single group with a count and sample addresses.195196Shape policy:197- **Canonical (default/omitted):** preserve the full response schema.198- **range-values canonical:** return `values: [...]` when entries are present; omit `values` when all requested ranges are pruned (for example, invalid ranges).199- **range-values `--include-formulas`:** adds a `formulas` matrix aligned to `rows` (formula cells have formula text; literal cells are `null`).200- **Compact (single range):** flatten that entry to top-level fields (`range`, payload, and optional `next_start_row`).201- **Compact (multiple ranges):** keep `values: [...]` with per-entry `range` for correlation.202- **read-table and sheet-page: compact preserves the active branch and continuation fields (`next_offset`, `next_start_row`)**.203- **formula-trace compact:** omits per-layer `highlights` while preserving `layers` and `next_cursor`.204205#### `sheet-page` machine contract206- Inspect top-level `format` before reading payload fields.207- `format=full`: read top-level `rows` plus optional `header_row` and `next_start_row`.208- `format=compact`: read `compact.headers`, `compact.header_row`, `compact.rows` plus optional `next_start_row`.209- `format=values_only`: read `values_only.rows` plus optional `next_start_row`.210- Continuation is always driven by top-level `next_start_row` when present.211- Global `--shape compact` preserves the active `sheet-page` branch; it does not flatten `sheet-page` payloads.212213Machine continuation example:2141. Request page 1 without `--start-row`.2152. If `next_start_row` is present, call `sheet-page` again with `--start-row <next_start_row>`.2163. Stop when `next_start_row` is omitted.217218Use `--compact` to minimize whitespace and `--quiet` to suppress warnings.219Global `--output-format csv` is currently unsupported; use command-specific CSV options like `read-table --table-format csv`.220221### CLI command reference222223| Command | Description |224| --- | --- |225| `list-sheets <file>` | List sheets with summaries |226| `sheet-overview <file> <sheet>` | Region detection + orientation |227| `describe <file>` | Workbook metadata |228| `read-table <file> [--sheet S] [--range R] [--table-name T] [--region-id ID] [--limit N] [--offset N] [--sample-mode first\|last\|distributed] [--table-format json\|values\|csv]` | Structured table read with deterministic offset pagination |229| `sheet-page <file> <sheet> --format <full|compact|values_only> [--start-row ROW] [--page-size N]` | Deterministic row paging with `next_start_row` continuation |230| `range-values <file> <sheet> <range> [range...] [--include-formulas]` | Raw cell values, optionally with aligned formula text matrix |231| `inspect-cells <file> <sheet> <range>` | Unified per-cell formula/value/cached/style snapshot for triage |232| `find-value <file> <query> [--sheet S] [--mode value\|label] [--label-direction right\|below\|any]` | Search cell values (`value`) or match labels and return adjacent values (`label`) |233| `named-ranges <file> [--sheet S] [--name-prefix P]` | List named ranges/tables/formula items |234| `find-formula <file> <query> [--sheet S] [--limit N] [--offset N]` | Formula text search with continuation |235| `scan-volatiles <file> [--sheet S] [--limit N] [--offset N] [--formula-parse-policy P]` | Scan formulas for volatile functions |236| `sheet-statistics <file> <sheet>` | Per-sheet density and type stats |237| `formula-map <file> <sheet> [--sort-by complexity\|count] [--limit N] [--formula-parse-policy P]` | Formula inventory summary |238| `formula-trace <file> <sheet> <cell> <precedents\|dependents> [--depth N] [--page-size N] [--cursor-depth N --cursor-offset N] [--formula-parse-policy P]` | Trace formula dependencies with cursor continuation |239| `table-profile <file> [--sheet S]` | Column types, cardinality, distributions |240| `create-workbook <path> [--sheets Inputs,Calc,...] [--overwrite]` | Create a blank workbook with configurable initial sheets |241| `copy <source> <dest>` | Copy workbook (for edit workflows) |242| `edit <file> <sheet> [--dry-run\|--in-place\|--output PATH] [--force] <edits...> [--formula-parse-policy P]` | Apply cell edits with preview/output safety modes (`A1=42` literal, `B2==SUM(...)` formula) |243| `transform-batch <file> --ops @ops.json (--dry-run\|--in-place\|--output PATH) [--formula-parse-policy P]` | Generic stateless transform batch pipeline |244| `style-batch <file> --ops @ops.json (--dry-run|--in-place|--output PATH)` | Stateless style operations |245| `apply-formula-pattern <file> --ops @ops.json (--dry-run|--in-place|--output PATH)` | Stateless formula fill/pattern operations (clears touched formula caches; run `recalculate`) |246| `structure-batch <file> --ops @ops.json (--dry-run\|--in-place\|--output PATH) [--formula-parse-policy P]` | Stateless structure operations (sheet rows/columns) |247| `column-size-batch <file> --ops @ops.json (--dry-run|--in-place|--output PATH)` | Stateless column sizing operations |248| `sheet-layout-batch <file> --ops @ops.json (--dry-run|--in-place|--output PATH)` | Stateless layout operations (freeze/split/hide/view) |249| `rules-batch <file> --ops @ops.json (--dry-run\|--in-place\|--output PATH) [--formula-parse-policy P]` | Stateless validation/conditional-format operations |250| `recalculate <file>` | Recalculate formulas via backend |251| `diff <original> <modified>` | Diff two workbook versions |252253#### Formula write-path provenance (`write_path_provenance`)254Formula-writing commands emit optional provenance metadata for troubleshooting:255- `written_via`: write path (`edit`, `transform_batch`, `apply_formula_pattern`)256- `formula_targets`: sheet/cell or sheet/range targets touched by formula writes257258Debug compare workflow (same-target by write path):2591. Apply the same formula target via two paths (for example `edit` vs `apply-formula-pattern`).2602. Compare `write_path_provenance.written_via` and `formula_targets` in responses.2613. Use `inspect-cells` plus `recalculate` to compare resulting behavior.262263#### Financial presentation starter defaults264- Keep label columns (often column A) explicitly sized (roughly `24–36` chars) to prevent clipping.265- Apply consistent number formats by semantic type:266 - Currency: `"$"#,##0.00_);[Red]("$"#,##0.00)`267 - Percent: `0.0%`268 - Integer/count: `#,##0`269- Apply `sheet-layout-batch` freeze panes after header layout stabilizes.270271---272273## Quickstart: MCP Server274275The MCP server provides agents a stateful session with workbook caching, fork management, and recalculation. Connect any MCP-compatible client.276277### Claude Code / Claude Desktop278279Add to `~/.claude.json` or project `.mcp.json`:280281```json282{283 "mcpServers": {284 "spreadsheet": {285 "command": "spreadsheet-mcp",286 "args": ["--workspace-root", "/path/to/workbooks", "--transport", "stdio"]287 }288 }289}290```291292Or with Docker:293294```json295{296 "mcpServers": {297 "spreadsheet": {298 "command": "docker",299 "args": [300 "run", "-i", "--rm",301 "-v", "/path/to/workbooks:/data",302 "ghcr.io/psu3d0/spreadsheet-mcp:latest",303 "--transport", "stdio"304 ]305 }306 }307}308```309310### Cursor / VS Code311312```json313{314 "mcp.servers": {315 "spreadsheet": {316 "command": "spreadsheet-mcp",317 "args": ["--workspace-root", "${workspaceFolder}", "--transport", "stdio"]318 }319 }320}321```322323### HTTP mode324325```bash326spreadsheet-mcp --workspace-root /path/to/workbooks327# → http://127.0.0.1:8079 — POST /mcp328```329330<details>331<summary>More MCP client configurations</summary>332333**Claude Code — Docker with VBA:**334```json335{336 "mcpServers": {337 "spreadsheet": {338 "command": "docker",339 "args": ["run", "-i", "--rm", "-v", "/path/to/workbooks:/data", "ghcr.io/psu3d0/spreadsheet-mcp:latest", "--transport", "stdio", "--vba-enabled"]340 }341 }342}343```344345**Claude Code — Docker with write/recalc:**346```json347{348 "mcpServers": {349 "spreadsheet": {350 "command": "docker",351 "args": ["run", "-i", "--rm", "-v", "/path/to/workbooks:/data", "ghcr.io/psu3d0/spreadsheet-mcp:full", "--transport", "stdio", "--recalc-enabled"]352 }353 }354}355```356357**Cursor / VS Code — Docker read-only:**358```json359{360 "mcp.servers": {361 "spreadsheet": {362 "command": "docker",363 "args": ["run", "-i", "--rm", "-v", "${workspaceFolder}:/data", "ghcr.io/psu3d0/spreadsheet-mcp:latest", "--transport", "stdio"]364 }365 }366}367```368369**Cursor / VS Code — Docker write/recalc:**370```json371{372 "mcp.servers": {373 "spreadsheet": {374 "command": "docker",375 "args": ["run", "-i", "--rm", "-v", "${workspaceFolder}:/data", "ghcr.io/psu3d0/spreadsheet-mcp:full", "--transport", "stdio", "--recalc-enabled"]376 }377 }378}379```380381</details>382383---384385## When to use what386387| You want to… | Use |388| --- | --- |389| One-shot reads from scripts or pipelines | **CLI** (`agent-spreadsheet`) |390| Agent sessions with caching across calls | **MCP** (`spreadsheet-mcp`) |391| Fork → edit → recalc → diff workflows | **MCP** (fork lifecycle + recalc backend) |392| Embed in an LLM tool-use loop | **MCP** (designed for multi-turn agent use) |393| Quick CLI edits without running a server | **CLI** (`copy` → `edit` → `diff`) |394| npm/npx install with zero Rust toolchain | **npm** (`npm i -g agent-spreadsheet`) |395396---397398## Token efficiency399400Dumping a 50,000-row spreadsheet into context is expensive and usually unnecessary. spreadsheet-kit tools are built around **discover → profile → extract** — agents get structural awareness without burning tokens on cells they don't need.401402### Output profiles403404The server defaults to **token-dense** output:405406- `read_table` → CSV format (flat string, minimal overhead)407- `range_values` → values array (no metadata wrapper)408- `sheet_page` → compact format (no formulas/styles unless requested)409- `table_profile` / `sheet_statistics` → summary only (no samples unless requested)410- Pagination fields (`next_offset`, `next_start_row`) only appear when more data exists411412Switch to verbose output with `--output-profile verbose` or `SPREADSHEET_MCP_OUTPUT_PROFILE=verbose`.413414### Recommended agent workflow4154164174181. `list_workbooks` → `list_sheets` → `workbook_summary` for orientation4192. `sheet_overview` to get detected regions (ids, bounds, kind, confidence)4203. `table_profile` → `read_table` with `region_id` and small `limit`4214. `find_value` (label mode) or `range_values` for targeted pulls4225. Reserve `sheet_page` for unknown layouts; prefer `compact` format4236. Page and filter — avoid full-sheet reads424425---426427## Tool surface (MCP)428429### Read-only tools (always available)430431| Tool | Purpose |432| --- | --- |433| `list_workbooks` | List spreadsheet files in workspace |434| `describe_workbook` | Workbook metadata |435| `list_sheets` | Sheets with summaries |436| `workbook_summary` | Summary + optional entry points / named ranges |437| `sheet_overview` | Orientation + region detection (cached) |438| `sheet_page` | Page through cells (compact / full / values_only) |439| `read_table` | Structured region/table read (csv / values / json) |440| `range_values` | Raw cell values for specific ranges |441| `table_profile` | Column types, cardinality, sample distributions |442| `find_value` | Search values or labels |443| `find_formula` | Search formulas with paging |444| `sheet_statistics` | Density, nulls, duplicates |445| `sheet_formula_map` | Formulas by complexity / count |446| `formula_trace` | Precedent / dependent tracing |447| `scan_volatiles` | Find volatile formulas (NOW, RAND, etc.) |448| `named_ranges` | Defined names + tables |449| `sheet_styles` | Style inspection (sheet-scoped) |450| `workbook_style_summary` | Workbook-wide style summary |451| `get_manifest_stub` | Generate SheetPort manifest scaffold |452| `close_workbook` | Evict from cache |453454### VBA tools (opt-in: `--vba-enabled`)455456| Tool | Purpose |457| --- | --- |458| `vba_project_summary` | VBA project metadata + module list |459| `vba_module_source` | Paged VBA module source extraction |460461Read-only — does not execute macros, only extracts source text from `.xlsm` files.462463### Write & recalc tools (opt-in: `--recalc-enabled`)464465| Tool | Purpose |466| --- | --- |467| `create_fork` / `list_forks` / `discard_fork` | Fork lifecycle |468| `checkpoint_fork` / `restore_checkpoint` / `list_checkpoints` / `delete_checkpoint` | Snapshot + rollback |469| `edit_batch` | Cell value / formula edits |470| `transform_batch` | Range-first clear / fill / replace |471| `style_batch` | Batch style edits (range / region / cells) |472| `apply_formula_pattern` | Autofill-like formula fills |473| `structure_batch` | Rows / cols / sheets + copy / move ranges |474| `column_size_batch` | Column width operations |475| `sheet_layout_batch` | Freeze panes, zoom, print area, page setup |476| `rules_batch` | Data validation + conditional formatting |477| `recalculate` | Trigger formula recalculation |478| `get_changeset` | Diff fork vs original (paged, filterable) |479| `screenshot_sheet` | Render range to PNG (max 100x30 cells) |480| `save_fork` | Export fork to file |481| `list_staged_changes` / `apply_staged_change` / `discard_staged_change` | Manage previewed changes |482| `get_edits` | List applied edits on a fork |483484---485486## Recalc backends487488Formula recalculation is handled by a pluggable backend. Two are supported:489490| Backend | How | Default | When to use |491| --- | --- | --- | --- |492| **Formualizer** | Native Rust engine (320+ functions) | **Yes** (default feature) | Fast, zero external deps. Ships with every `cargo install`. |493| **LibreOffice** | Headless `soffice` process | Docker `:full` only | Full Excel formula compatibility. Used by the `:full` Docker image. |494495**Default behavior:** `cargo install` includes Formualizer out of the box — recalculation works immediately with no extra setup. LibreOffice is only used in the Docker `:full` image, which bundles `soffice` with pre-configured macros for maximum formula coverage.496497Compile-time feature flags:498- `recalc-formualizer` (**default**) — pure Rust, bundled via Formualizer499- `recalc-libreoffice` — uses LibreOffice (requires `soffice` on PATH)500- `--no-default-features` — read-only mode, no recalc support501502If recalc is disabled, all read, edit, and diff operations still work — only `recalculate` requires a backend.503504---505506## Deployment modes507508| Mode | Binary | State | Recalc | Transport |509| --- | --- | --- | --- | --- |510| **CLI** | `agent-spreadsheet` | Stateless | Yes (Formualizer, default) | stdin → stdout JSON |511| **MCP (stdio)** | `spreadsheet-mcp` | Stateful (LRU cache) | Optional (`--recalc-enabled`) | MCP over stdio |512| **MCP (HTTP)** | `spreadsheet-mcp` | Stateful (LRU cache) | Optional (`--recalc-enabled`) | Streamable HTTP `POST /mcp` |513| **Docker slim** | `spreadsheet-mcp` | Stateful | No | HTTP (default) or stdio |514| **Docker full** | `spreadsheet-mcp` | Stateful | Yes (LibreOffice) | HTTP (default) or stdio |515| **WASM** | — | — | — | *Planned* |516517---518519## Workspace layout520521```522spreadsheet-kit/523├── crates/524│ ├── spreadsheet-kit/ # Shared engine + agent-spreadsheet CLI binary525│ └── spreadsheet-mcp/ # Stateful MCP server adapter526├── npm/527│ └── agent-spreadsheet/ # npm binary distribution package528├── formualizer/ # Formula recalc engine (default backend)529├── docs/ # Architecture and design docs530└── .github/workflows/ # CI, release, Docker builds531```532533| Crate | Role |534| --- | --- |535| [`spreadsheet-kit`](crates/spreadsheet-kit/) | Shared engine + `agent-spreadsheet` CLI binary |536| [`spreadsheet-mcp`](crates/spreadsheet-mcp/) | MCP server adapter + transport layer |537| [`agent-spreadsheet` (npm)](npm/agent-spreadsheet/) | npm wrapper — downloads prebuilt native binary on install |538539---540541## Region detection542543544545Spreadsheets often contain multiple logical tables, parameter blocks, and output areas on a single sheet. The server detects these automatically:5465471. **Gutter detection** — scans for empty rows/columns separating content blocks5482. **Recursive splitting** — subdivides areas along detected gutters5493. **Border trimming** — removes sparse edges to tighten bounds5504. **Header detection** — identifies header rows (including multi-row merged headers)5515. **Classification** — labels each region: `data`, `parameters`, `outputs`, `calculator`, `metadata`5526. **Confidence scoring** — higher scores for well-structured regions with clear headers553554Regions are cached per sheet. Tools like `read_table` accept a `region_id` to scope reads without manually specifying ranges. See [docs/HEURISTICS.md](docs/HEURISTICS.md) for details.555556---557558## Architecture559560561562- **LRU cache** keeps recently-accessed workbooks in memory (configurable capacity)563- **Lazy sheet metrics** computed once per sheet, reused across tools564- **Region detection on demand** runs for `sheet_overview` and is cached for `region_id` lookups565- **Fork isolation** — write operations work on copies, never mutate originals in-place566- **Sampling modes** — `distributed` sampling reads evenly across rows without loading everything567- **Output caps** — truncated by default; use tool params to expand568569---570571## Configuration reference572573All flags can also be set via environment variables prefixed with `SPREADSHEET_MCP_`.574575| Flag | Env | Default | Description |576| --- | --- | --- | --- |577| `--workspace-root <DIR>` | `SPREADSHEET_MCP_WORKSPACE` | cwd | Directory to scan for workbooks |578| `--cache-capacity <N>` | `SPREADSHEET_MCP_CACHE_CAPACITY` | `5` | LRU workbook cache size |579| `--extensions <csv>` | `SPREADSHEET_MCP_EXTENSIONS` | `xlsx,xlsm,xls,xlsb` | Allowed file extensions |580| `--workbook <FILE>` | `SPREADSHEET_MCP_WORKBOOK` | — | Single-workbook mode |581| `--enabled-tools <csv>` | `SPREADSHEET_MCP_ENABLED_TOOLS` | all | Whitelist exposed tools |582| `--transport <T>` | `SPREADSHEET_MCP_TRANSPORT` | `http` | `http` or `stdio` |583| `--http-bind <ADDR>` | `SPREADSHEET_MCP_HTTP_BIND` | `127.0.0.1:8079` | HTTP bind address |584| `--output-profile <P>` | `SPREADSHEET_MCP_OUTPUT_PROFILE` | `token-dense` | `token-dense` or `verbose` |585| `--recalc-enabled` | `SPREADSHEET_MCP_RECALC_ENABLED` | `false` | Enable write/recalc tools |586| `--max-concurrent-recalcs <N>` | `SPREADSHEET_MCP_MAX_CONCURRENT_RECALCS` | `2` | Parallel recalc limit |587| `--tool-timeout-ms <MS>` | `SPREADSHEET_MCP_TOOL_TIMEOUT_MS` | `30000` | Per-tool timeout (0 = disabled) |588| `--max-response-bytes <N>` | `SPREADSHEET_MCP_MAX_RESPONSE_BYTES` | `1000000` | Max response size (0 = disabled) |589| `--allow-overwrite` | `SPREADSHEET_MCP_ALLOW_OVERWRITE` | `false` | Allow `save_fork` to overwrite originals |590| `--vba-enabled` | `SPREADSHEET_MCP_VBA_ENABLED` | `false` | Enable VBA inspection tools |591| `--screenshot-dir <DIR>` | `SPREADSHEET_MCP_SCREENSHOT_DIR` | `<workspace>/screenshots` | Screenshot output directory |592| `--path-map <MAP>` | `SPREADSHEET_MCP_PATH_MAP` | — | Docker path remapping (`/data=/host/path`) |593594---595596## Docker deployment597598### Image variants599600| Image | Size | Recalc | Use case |601| --- | --- | --- | --- |602| `ghcr.io/psu3d0/spreadsheet-mcp:latest` | ~15 MB | No | Read-only analysis |603| `ghcr.io/psu3d0/spreadsheet-mcp:full` | ~800 MB | Yes (LibreOffice) | Write + recalc + screenshots |604605### Basic usage606607```bash608# Read-only609docker run -v /path/to/workbooks:/data -p 8079:8079 \610 ghcr.io/psu3d0/spreadsheet-mcp:latest611612# With VBA tools613docker run -v /path/to/workbooks:/data -p 8079:8079 \614 -e SPREADSHEET_MCP_VBA_ENABLED=true \615 ghcr.io/psu3d0/spreadsheet-mcp:latest616617# Write + recalc618docker run -v /path/to/workbooks:/data -p 8079:8079 \619 ghcr.io/psu3d0/spreadsheet-mcp:full620```621622### Path mapping623624When the server runs in Docker but your agent reads files from the host, configure path mapping so responses include host-visible paths:625626```bash627docker run \628 -v /path/to/workbooks:/data \629 -e SPREADSHEET_MCP_PATH_MAP="/data=/path/to/workbooks" \630 -p 8079:8079 \631 ghcr.io/psu3d0/spreadsheet-mcp:full632```633634This adds `client_path`, `client_output_path`, and `client_saved_to` fields to tool responses.635636### Separate screenshot output637638```bash639docker run \640 -v /path/to/workbooks:/data \641 -v /path/to/screenshots:/screenshots \642 -e SPREADSHEET_MCP_SCREENSHOT_DIR=/screenshots \643 -e SPREADSHEET_MCP_PATH_MAP="/data=/path/to/workbooks,/screenshots=/path/to/screenshots" \644 -p 8079:8079 \645 ghcr.io/psu3d0/spreadsheet-mcp:full646```647648### Privilege handling649650The `:full` image entrypoint drops privileges to match the owner of the mounted workspace directory. Override with `SPREADSHEET_MCP_RUN_UID` / `SPREADSHEET_MCP_RUN_GID` or `docker run --user`.651652---653654## Write & recalc workflows655656Write tools use a **fork-based** model for safety. Edits never mutate the original file — work on a fork, inspect changes, and export when satisfied.657658```659create_fork → edit_batch / transform_batch → recalculate → get_changeset → save_fork660 ↑ |661 checkpoint_fork ←──── restore_checkpoint ←────────────┘662```663664### Edit shorthand665666`edit_batch` accepts both canonical objects and shorthand strings:667668```json669{670 "edits": [671 { "address": "A1", "value": "Revenue" },672 { "address": "B2", "formula": "SUM(B3:B10)" },673 "C1=100",674 "D1==SUM(A1:A2)"675 ]676}677```678679`"A1=100"` sets a value. `"A1==SUM(...)"` sets a formula (double `=`).680681<details>682<summary>Write tool shapes reference</summary>683684**edit_batch**685```json686{687 "tool": "edit_batch",688 "arguments": {689 "fork_id": "fork-123",690 "sheet_name": "Inputs",691 "edits": [692 { "address": "A1", "value": "Financial Model Inputs" },693 { "address": "B2", "formula": "SUM(B3:B10)" },694 "C1=100",695 "D1==SUM(A1:A2)"696 ]697 }698}699```700701**style_batch**702```json703{704 "tool": "style_batch",705 "arguments": {706 "fork_id": "fork-123",707 "ops": [708 {709 "sheet_name": "Accounts",710 "target": { "kind": "range", "range": "A2:F2" },711 "patch": {712 "font": { "bold": true },713 "fill": { "kind": "pattern", "pattern_type": "solid", "foreground_color": "FFF5F7FA" }714 }715 },716 {717 "sheet_name": "Accounts",718 "range": "A3:F3",719 "style": { "fill": { "color": "#F5F7FA" } }720 }721 ]722 }723}724```725726**structure_batch**727```json728{729 "tool": "structure_batch",730 "arguments": {731 "fork_id": "fork-123",732 "ops": [733 { "kind": "create_sheet", "name": "Inputs" },734 { "kind": "insert_rows", "sheet_name": "Data", "start": 5, "count": 3 },735 { "kind": "copy_range", "sheet_name": "Data", "source": "A1:D1", "target": "A5" }736 ]737 }738}739```740741**column_size_batch**742```json743{744 "tool": "column_size_batch",745 "arguments": {746 "fork_id": "fork-123",747 "sheet_name": "Accounts",748 "mode": "apply",749 "ops": [750 { "target": { "kind": "columns", "range": "A:C" }, "size": { "kind": "auto", "max_width_chars": 40.0 } },751 { "range": "D:D", "size": { "kind": "width", "width_chars": 24.0 } }752 ]753 }754}755```756757**sheet_layout_batch**758```json759{760 "tool": "sheet_layout_batch",761 "arguments": {762 "fork_id": "fork-123",763 "mode": "preview",764 "ops": [765 { "kind": "freeze_panes", "sheet_name": "Dashboard", "freeze_rows": 1, "freeze_cols": 1 },766 { "kind": "set_zoom", "sheet_name": "Dashboard", "zoom_percent": 110 },767 { "kind": "set_print_area", "sheet_name": "Dashboard", "range": "A1:G30" },768 { "kind": "set_page_setup", "sheet_name": "Dashboard", "orientation": "landscape", "fit_to_width": 1, "fit_to_height": 1 }769 ]770 }771}772```773774**rules_batch**775```json776{777 "tool": "rules_batch",778 "arguments": {779 "fork_id": "fork-123",780 "mode": "apply",781 "ops": [782 {783 "kind": "set_data_validation",784 "sheet_name": "Inputs",785 "target_range": "B3:B100",786 "validation": { "kind": "list", "formula1": "=Lists!$A$1:$A$10", "allow_blank": false }787 },788 {789 "kind": "set_conditional_format",790 "sheet_name": "Dashboard",791 "target_range": "D3:D100",792 "rule": { "kind": "cell_is", "operator": "less_than", "formula": "0" },793 "style": { "fill_color": "#FFE0E0", "font_color": "#8A0000", "bold": true }794 }795 ]796 }797}798```799800</details>801802### Screenshot tool803804`screenshot_sheet` renders a range to a cropped PNG via LibreOffice (requires `:full` image or `--recalc-enabled`).805806- Max range: **100 rows x 30 columns** per screenshot807- Pixel guard: **4096 px** per side, **12 MP** area (override via `SPREADSHEET_MCP_MAX_PNG_DIM_PX` / `SPREADSHEET_MCP_MAX_PNG_AREA_PX`)808- On rejection, the tool returns suggested sub-range splits809810See [docs/RECALC.md](docs/RECALC.md) for architecture details.811812---813814## Example815816**Request:** Profile a detected region817818```json819{820 "tool": "table_profile",821 "arguments": {822 "workbook_id": "wb-23456789ab",823 "sheet_name": "Q1 Actuals",824 "region_id": 1,825 "sample_size": 10,826 "sample_mode": "distributed"827 }828}829```830831**Response:**832833```json834{835 "sheet_name": "Q1 Actuals",836 "headers": ["Date", "Category", "Amount", "Notes"],837 "column_types": [838 { "name": "Date", "inferred_type": "date", "nulls": 0, "distinct": 87 },839 { "name": "Category", "inferred_type": "text", "nulls": 2, "distinct": 12, "top_values": ["Payroll", "Marketing", "Infrastructure"] },840 { "name": "Amount", "inferred_type": "number", "nulls": 0, "min": 150.0, "max": 84500.0, "mean": 12847.32 },841 { "name": "Notes", "inferred_type": "text", "nulls": 45, "distinct": 38 }842 ],843 "row_count": 1247844}845```846847The agent now knows column types, cardinality, and value distributions — without reading 1,247 rows.848849---850851## Development852853```bash854# Run tests855cargo test856857# Build all crates858cargo build --release859860# Test local binary with an MCP client861```862863Point your MCP client config at the local binary:864865```json866{867 "mcpServers": {868 "spreadsheet": {869 "command": "./target/release/spreadsheet-mcp",870 "args": ["--workspace-root", "/path/to/workbooks", "--transport", "stdio"]871 }872 }873}874```875876Or use the Docker rebuild script for live iteration:877878```bash879WORKSPACE_ROOT=/path/to/workbooks ./scripts/local-docker-mcp.sh880```881882---883884## License885886Apache-2.0887
Full transparency — inspect the skill content before installing.