# Context CLI

[Tests](https://github.com/hanselhansel/context-cli/actions/workflows/test.yml) · [Python](https://www.python.org/downloads/) · [License](LICENSE) · [PyPI](https://pypi.org/project/context-cli/)

**Lint any URL for LLM readiness. Get a 0-100 score for token efficiency, RAG readiness, agent compatibility, and LLM extraction quality.**

## What is Context CLI?

Context CLI is an LLM Readiness Linter that checks how well a URL is structured for AI consumption. As LLM-powered search engines, RAG pipelines, and AI agents become primary consumers of web content, your pages need to be optimized for token efficiency, structured data extraction, agent interoperability, and machine-readable formatting.

Context CLI analyzes your content across five pillars (V3 scoring) and returns a structured score from 0 to 100.

## Features

- **Robots.txt AI bot access** -- checks 13 AI crawlers (GPTBot, ClaudeBot, DeepSeek-AI, Grok, and more)
- **llms.txt & llms-full.txt** -- detects both standard and extended LLM instruction files
- **Schema.org JSON-LD** -- extracts and evaluates structured data with high-value type weighting (Product, Article, FAQ, HowTo)
- **Content density** -- measures useful content vs. boilerplate with readability scoring, heading structure analysis, and answer-first detection
- **Agent Readiness (V3)** -- 20-point pillar checking AGENTS.md, `Accept: text/markdown`, MCP endpoints, semantic HTML, x402 payment signaling, and NLWeb support
- **Markdown-for-Agents engine** -- convert any URL to clean, token-efficient markdown; open-source alternative to Cloudflare's Markdown for Agents
- **Serve modes** -- reverse proxy, ASGI middleware, and WSGI middleware that serve markdown to agents via `Accept: text/markdown`
- **Batch mode** -- lint multiple URLs from a file with `--file` and configurable `--concurrency`
- **Custom bot list** -- override default bots with `--bots` for targeted checks
- **Verbose output** -- detailed per-pillar breakdown with scoring explanations and recommendations
- **Rich CLI output** -- formatted tables and scores via Rich
- **JSON / CSV / Markdown output** -- machine-readable results for pipelines
- **MCP server** -- expose the linter as a tool for AI agents via FastMCP (8 tools, including agent readiness, markdown conversion, and AGENTS.md generation)
- **Context Compiler** -- LLM-powered `llms.txt`, `schema.jsonld`, and `AGENTS.md` generation, with batch mode for multiple URLs
- **Web server config generation** -- generate nginx, Apache, and Caddy configs for `Accept: text/markdown` routing
- **x402 payment config generation** -- generate payment signaling configuration for monetizing agent access
- **CI/CD integration** -- `--fail-under` threshold, `--fail-on-blocked-bots`, per-pillar thresholds, baseline regression detection, GitHub Step Summary
- **GitHub Action** -- composite action for CI pipelines with baseline support
- **Citation Radar** -- query AI models to see what they cite and recommend, with brand tracking and domain classification
- **Share-of-Recommendation Benchmark** -- track how often AI models mention and recommend your brand vs. competitors, with LLM-as-judge analysis

## Installation

```bash
pip install context-linter
```

Context CLI uses a headless browser for content extraction. After installing, run:

```bash
crawl4ai-setup
```

### Development Install

```bash
git clone https://github.com/your-org/context-cli.git
cd context-cli
pip install -e ".[dev]"
crawl4ai-setup
```

## Quick Start

```bash
context-cli lint example.com
```

This runs a full lint and prints a Rich-formatted report with your LLM readiness score.

## CLI Usage

### Single Page Lint

Lint only the specified URL (skip multi-page discovery):

```bash
context-cli lint example.com --single
```

### Multi-Page Site Lint (default)

Discover pages via sitemap/spider and lint up to 10 pages:

```bash
context-cli lint example.com
```

### Limit Pages

```bash
context-cli lint example.com --max-pages 5
```

### JSON Output

Get structured JSON for CI pipelines, dashboards, or scripting:

```bash
context-cli lint example.com --json
```

### CSV / Markdown Output

```bash
context-cli lint example.com --format csv
context-cli lint example.com --format markdown
```

### Verbose Mode

Show a detailed per-pillar breakdown with scoring explanations:

```bash
context-cli lint example.com --single --verbose
```

### Timeout

Set the HTTP timeout (default: 15 seconds):

```bash
context-cli lint example.com --timeout 30
```

### Custom Bot List

Override the default 13 bots with a custom list:

```bash
context-cli lint example.com --bots "GPTBot,ClaudeBot,PerplexityBot"
```

### Batch Mode

Lint multiple URLs from a file (one URL per line, `.txt` or `.csv`):

```bash
context-cli lint --file urls.txt
context-cli lint --file urls.txt --concurrency 5
context-cli lint --file urls.txt --format csv
```

### CI Mode

Fail the build if the score is below a threshold:

```bash
context-cli lint example.com --fail-under 60
```

Fail if any AI bot is blocked:

```bash
context-cli lint example.com --fail-on-blocked-bots
```

#### Per-Pillar Thresholds

Gate CI on individual pillar scores:

```bash
context-cli lint example.com --robots-min 20 --content-min 30 --overall-min 60
```

Available: `--robots-min`, `--schema-min`, `--content-min`, `--llms-min`, `--overall-min`.

#### Baseline Regression Detection

Save a baseline and detect score regressions in future lints:

```bash
# Save current scores as baseline
context-cli lint example.com --single --save-baseline .context-baseline.json

# Compare against baseline (exit 1 if any pillar drops > 5 points)
context-cli lint example.com --single --baseline .context-baseline.json

# Custom regression threshold
context-cli lint example.com --single --baseline .context-baseline.json --regression-threshold 10
```

Exit codes: 0 = pass, 1 = score below threshold or regression detected, 2 = bots blocked.

When running in GitHub Actions, a markdown summary is automatically written to `$GITHUB_STEP_SUMMARY`.

### Quiet Mode

Suppress output; exit code 0 if the score is >= 50, 1 otherwise:

```bash
context-cli lint example.com --quiet
```

Use `--fail-under` with `--quiet` to override the default threshold:

```bash
context-cli lint example.com --quiet --fail-under 70
```

### Markdown Conversion

Convert any URL to clean, token-efficient markdown optimized for LLM consumption:

```bash
context-cli markdown https://example.com
```

Show token reduction statistics (raw HTML tokens vs. clean markdown tokens):

```bash
context-cli markdown https://example.com --stats
```

Generate a static markdown site (one `.md` file per discovered page):

```bash
context-cli markdown https://example.com --static -o ./output/
```

The markdown engine uses a three-stage pipeline (Sanitize, Extract, Convert) to strip boilerplate, navigation, ads, and scripts, producing clean markdown that typically achieves 70%+ token reduction. See [docs/markdown-engine.md](docs/markdown-engine.md) for details.

### Reverse Proxy Server

Serve markdown to AI agents automatically via `Accept: text/markdown` content negotiation:

```bash
context-cli serve --upstream https://example.com --port 8080
```

When an AI agent sends a request with `Accept: text/markdown`, the proxy fetches the upstream HTML, converts it through the markdown engine, and returns clean markdown. Regular browser requests receive the original HTML unchanged.

### V3 Scoring

Use the V3 scoring model with the Agent Readiness pillar:

```bash
context-cli lint https://example.com --scoring v3
```

V3 adds a 20-point Agent Readiness pillar and rebalances the existing pillars.
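For CI gates beyond the built-in flags, the `--json` report can be post-processed in a short script -- for example, to gate specifically on the new Agent Readiness pillar. This is a sketch only: the field names used here (`overall`, `pillars`, `agent_readiness`) are assumptions, not documented output, so verify them against real `context-cli lint --json` output first.

```python
import json

# Sketch of a custom CI gate over the linter's JSON report.
# NOTE: the field names ("overall", "pillars", "agent_readiness") are
# assumptions -- check the real shape of `context-cli lint --json` output.
def gate(report: dict, overall_min: int = 60, agent_min: int = 10) -> bool:
    """Return True when both the overall score and the Agent Readiness
    pillar clear their thresholds."""
    overall = report.get("overall", 0)
    agent = report.get("pillars", {}).get("agent_readiness", 0)
    return overall >= overall_min and agent >= agent_min

# Example with a hand-written report standing in for real linter output:
report = json.loads('{"overall": 72, "pillars": {"agent_readiness": 14}}')
```

To wire this into a pipeline, pipe the linter's JSON into the script and call `sys.exit(0 if gate(report) else 1)`.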
See [docs/scoring-v3.md](docs/scoring-v3.md) for the full methodology.

### Start MCP Server

```bash
context-cli mcp
```

Launches a FastMCP stdio server exposing the linter as a tool for AI agents.

## MCP Integration

To use Context CLI as a tool in Claude Desktop, add this to your Claude Desktop config (`claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "context-cli": {
      "command": "context-cli",
      "args": ["mcp"]
    }
  }
}
```

Once configured, Claude can call the `audit_url` tool directly to check any URL's LLM readiness.

### New MCP Tools (v3.0)

In addition to the existing tools (`audit`, `generate`, `compare`, `history`, `recommend`), v3.0 adds:

- **`agent_readiness_audit`** -- run agent readiness checks (AGENTS.md, `Accept: text/markdown`, MCP endpoints, semantic HTML, x402, NLWeb) against a URL
- **`convert_to_markdown`** -- convert any URL's HTML to clean, token-efficient markdown
- **`generate_agents_md`** -- generate an AGENTS.md file for a given URL based on its content and structure

See [docs/mcp-integration.md](docs/mcp-integration.md) for full tool documentation.

## Context Compiler (Generate)

Generate `llms.txt` and `schema.jsonld` files from any URL using LLM analysis:

```bash
pip install context-linter[generate]
context-cli generate example.com
```

This crawls the URL, sends the content to an LLM, and writes optimized files to `./context-output/`.

### Batch Generate

Generate assets for multiple URLs from a file:

```bash
context-cli generate-batch urls.txt
context-cli generate-batch urls.txt --concurrency 5 --profile ecommerce
context-cli generate-batch urls.txt --json
```

Each URL's output goes to a subdirectory under `--output-dir`.

### BYOK (Bring Your Own Key)

The generate command auto-detects your LLM provider from environment variables:

| Priority | Env Variable | Model Used |
|----------|--------------|------------|
| 1 | `OPENAI_API_KEY` | gpt-4o-mini |
| 2 | `ANTHROPIC_API_KEY` | claude-3-haiku-20240307 |
| 3 | Ollama running locally | ollama/llama3.2 |

Override with `--model`:

```bash
context-cli generate example.com --model gpt-4o
```

### Industry Profiles

Tailor the output with `--profile`:

```bash
context-cli generate example.com --profile saas
context-cli generate example.com --profile ecommerce
```

Available: `generic`, `cpg`, `saas`, `ecommerce`, `blog`.

### AGENTS.md Generation

Generate an [AGENTS.md](https://docs.google.com/document/d/1ON2MRbDC2RVJpKMIoluHFz-bGDAELz3RjMLErxEDqn4) file that tells AI agents how to interact with your site:

```bash
context-cli generate example.com --agents-md
```

### Web Server Config Generation

Generate web server configuration snippets for routing `Accept: text/markdown` requests:

```bash
context-cli generate-config nginx
context-cli generate-config apache
context-cli generate-config caddy
```

Each command generates a config snippet that detects `Accept: text/markdown` in incoming requests and routes them to the Context CLI markdown endpoint or a local markdown-serving backend.

### x402 Payment Config Generation

Generate x402 payment signaling configuration for monetizing AI agent access:

```bash
context-cli generate-x402
```

## Serve Modes

Context CLI provides three ways to serve markdown to AI agents that send `Accept: text/markdown` requests.

### Reverse Proxy

Run a standalone reverse proxy that sits in front of your existing site:

```bash
context-cli serve --upstream https://example.com --port 8080
```

Requests with `Accept: text/markdown` receive converted markdown.
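The negotiation the proxy performs can be sketched as a simple Accept-header check. This is an illustrative simplification of the idea, not the proxy's actual implementation -- real content negotiation also honors q-values and wildcard media types:

```python
def wants_markdown(accept_header: str) -> bool:
    """Return True if the client's Accept header lists text/markdown.
    Simplified sketch: ignores q-value weighting and wildcards."""
    media_types = [part.split(";")[0].strip().lower()
                   for part in accept_header.split(",")]
    return "text/markdown" in media_types
```

Under this check, a browser sending `Accept: text/html,application/xhtml+xml` falls through to the HTML path, while an agent sending `Accept: text/markdown` gets the converted response.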
All other requests are proxied to the upstream unchanged.

### ASGI Middleware (FastAPI / Starlette)

Add markdown serving to any ASGI application:

```python
from fastapi import FastAPI

from context_cli.middleware import MarkdownASGIMiddleware

app = FastAPI()
app = MarkdownASGIMiddleware(app)
```

### WSGI Middleware (Django / Flask)

Add markdown serving to any WSGI application:

```python
from context_cli.middleware import MarkdownWSGIMiddleware

app = MarkdownWSGIMiddleware(app)
```

Both middleware variants intercept requests with `Accept: text/markdown`, convert the response HTML through the markdown engine, and return clean markdown with `Content-Type: text/markdown`.

## Citation Radar

Query AI models to see what they cite and recommend for any search prompt:

```bash
pip install context-linter[generate]
context-cli radar "best project management tools" --brand Asana --brand Monday --model gpt-4o-mini
```

Options:

- `--brand/-b`: Brand name to track (repeatable)
- `--model/-m`: LLM model to query (repeatable; default: gpt-4o-mini)
- `--runs/-r`: Runs per model for statistical significance
- `--json`: Output as JSON

## Share-of-Recommendation Benchmark

Track how AI models mention and recommend your brand across multiple prompts:

```bash
pip install context-linter[generate]
context-cli benchmark prompts.txt -b "YourBrand" -c "Competitor1" -c "Competitor2"
```

Options:

- `prompts.txt`: CSV (with `prompt,category,intent` columns) or plain text (one prompt per line)
- `--brand/-b`: Target brand to track (required)
- `--competitor/-c`: Competitor brand (repeatable)
- `--model/-m`: LLM model to query (repeatable; default: gpt-4o-mini)
- `--runs/-r`: Runs per model per prompt (default: 3)
- `--yes/-y`: Skip the cost confirmation prompt
- `--json`: Output as JSON

## GitHub Action

Use Context CLI in your CI pipeline:

```yaml
- name: Run Context Lint
  uses: hanselhansel/context-cli@main
  with:
    url: 'https://your-site.com'
    fail-under: '60'
```

With baseline regression detection:

```yaml
- name: Run Context Lint
  uses: hanselhansel/context-cli@main
  with:
    url: 'https://your-site.com'
    baseline-file: '.context-baseline.json'
    save-baseline: '.context-baseline.json'
    regression-threshold: '5'
```

The action sets up Python, installs context-cli, and runs the lint. It outputs `score` and `report-json` for downstream steps. See [docs/ci-integration.md](docs/ci-integration.md) for full documentation.

## Score Breakdown

Context CLI supports two scoring models. V2 (the default) uses four pillars; V3 adds a fifth pillar for agent readiness.

### V2 Scoring (default)

| Pillar | Max Points | What it measures |
|---|---|---|
| Content density | 40 | Quality and depth of extractable text content |
| Robots.txt AI bot access | 25 | Whether AI crawlers are allowed in robots.txt |
| Schema.org JSON-LD | 25 | Structured data markup (Product, Article, FAQ, etc.) |
| llms.txt presence | 10 | Whether a /llms.txt file exists for LLM guidance |

### V3 Scoring (`--scoring v3`)

| Pillar | Max Points | What it measures |
|---|---|---|
| Content density | 35 | Quality and depth of extractable text content |
| Robots.txt AI bot access | 20 | Whether AI crawlers are allowed in robots.txt |
| Schema.org JSON-LD | 20 | Structured data markup (Product, Article, FAQ, etc.) |
| Agent Readiness | 20 | Preparedness for autonomous AI agent interaction |
| llms.txt presence | 5 | Whether a /llms.txt file exists for LLM guidance |

The Agent Readiness pillar checks six sub-signals:

| Sub-check | Points | What it detects |
|---|---|---|
| AGENTS.md | 5 | Presence of an AGENTS.md file describing agent interaction |
| Accept: text/markdown | 5 | Server responds to `Accept: text/markdown` with markdown content |
| MCP endpoint | 4 | Presence of a discoverable MCP (Model Context Protocol) endpoint |
| Semantic HTML | 3 | Quality of semantic HTML structure (landmark elements, ARIA roles) |
| x402 payment signaling | 2 | HTTP 402 or x402 headers indicating payment-gated agent access |
| NLWeb support | 1 | Support for the NLWeb protocol for natural language web queries |

See [docs/scoring-v3.md](docs/scoring-v3.md) for the full V3 methodology.

### Scoring Rationale

The V2 weights reflect how AI search engines (ChatGPT, Perplexity, Claude) actually consume web content:

- **Content density (40 pts)** is weighted highest because it is what LLMs extract and cite when answering questions. Rich, well-structured content with headings and lists gives AI better material to work with.
- **Robots.txt (25 pts)** is the gatekeeper -- if a bot is blocked, it literally cannot crawl. It's critical but largely binary (either you're blocking or you're not).
- **Schema.org (25 pts)** provides structured "cheat sheets" that help AI understand entities. High-value types (Product, Article, FAQ, HowTo, Recipe) receive bonus weighting. Valuable but not required for citation.
- **llms.txt (10 pts)** is an emerging standard. Both `/llms.txt` and `/llms-full.txt` are checked.
No major AI search engine heavily weights it yet, but it signals forward-thinking AI readiness.

V3 rebalances these weights to accommodate the new Agent Readiness pillar, reflecting the growing importance of direct agent interaction alongside traditional crawl-and-index patterns.

## AI Bots Checked

Context CLI checks access rules for 13 AI crawlers:

- GPTBot
- ChatGPT-User
- Google-Extended
- ClaudeBot
- PerplexityBot
- Amazonbot
- OAI-SearchBot
- DeepSeek-AI
- Grok
- Meta-ExternalAgent
- cohere-ai
- AI2Bot
- ByteSpider

## Development

```bash
# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Lint
ruff check src/ tests/
```

## License

MIT