Add this skill:

`npx mdskills install just-every/mcp-read-website-fast`

Well-documented MCP server converting web pages to clean Markdown with smart caching and crawling.
# @just-every/mcp-read-website-fast

Fast, token-efficient web content extraction for AI agents - converts websites to clean Markdown.

[npm package](https://www.npmjs.com/package/@just-every/mcp-read-website-fast)
[GitHub Actions](https://github.com/just-every/mcp-read-website-fast/actions)

<a href="https://glama.ai/mcp/servers/@just-every/mcp-read-website-fast">
  <img width="380" height="200" src="https://glama.ai/mcp/servers/@just-every/mcp-read-website-fast/badge" alt="read-website-fast MCP server" />
</a>

## Overview

Existing MCP web crawlers are slow and consume large quantities of tokens. This stalls development and produces incomplete results, because LLMs have to parse entire web pages.

This MCP package fetches web pages locally, strips noise, and converts content to clean Markdown while preserving links. It is designed for Claude Code, IDEs and LLM pipelines, with a minimal token footprint. Crawl sites locally with minimal dependencies.

**Note:** This package now uses [@just-every/crawl](https://www.npmjs.com/package/@just-every/crawl) for its core crawling and markdown conversion functionality.

## Installation

### Claude Code

```bash
claude mcp add read-website-fast -s user -- npx -y @just-every/mcp-read-website-fast
```

### VS Code

```bash
code --add-mcp '{"name":"read-website-fast","command":"npx","args":["-y","@just-every/mcp-read-website-fast"]}'
```

### Cursor

```bash
cursor://anysphere.cursor-deeplink/mcp/install?name=read-website-fast&config=eyJyZWFkLXdlYnNpdGUtZmFzdCI6eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqdXN0LWV2ZXJ5L21jcC1yZWFkLXdlYnNpdGUtZmFzdCJdfX0=
```

### JetBrains IDEs

Settings → Tools → AI Assistant → Model Context Protocol (MCP) → Add

Choose “As JSON” and paste:

```json
{"command":"npx","args":["-y","@just-every/mcp-read-website-fast"]}
```

Or, in the chat window, type `/add` and fill in the same JSON. Both paths add the server in a single step.

### Raw JSON (works in any MCP client)

```json
{
  "mcpServers": {
    "read-website-fast": {
      "command": "npx",
      "args": ["-y", "@just-every/mcp-read-website-fast"]
    }
  }
}
```

Drop this into your client’s `mcp.json` (e.g. `.vscode/mcp.json`, `~/.cursor/mcp.json`, or `.mcp.json` for Claude).

## Features

- **Fast startup** using official MCP SDK with lazy loading for optimal performance
- **Content extraction** using Mozilla Readability (same as Firefox Reader View)
- **HTML to Markdown** conversion with Turndown + GFM support
- **Smart caching** with SHA-256 hashed URLs
- **Polite crawling** with robots.txt support and rate limiting
- **Concurrent fetching** with configurable depth crawling
- **Stream-first design** for low memory usage
- **Link preservation** for knowledge graphs
- **Optional chunking** for downstream processing

### Available Tools

- `read_website` - Fetches a webpage and converts it to clean markdown
  - Parameters:
    - `url` (required): The HTTP/HTTPS URL to fetch
    - `pages` (optional): Maximum number of pages to crawl (default: 1, max: 100)

### Available Resources

- `read-website-fast://status` - Get cache statistics
- `read-website-fast://clear-cache` - Clear the cache directory

## Development Usage

### Install

```bash
npm install
npm run build
```

### Single page fetch
```bash
npm run dev fetch https://example.com/article
```

### Crawl with depth
```bash
npm run dev fetch https://example.com --depth 2 --concurrency 5
```

### Output formats
```bash
# Markdown only (default)
npm run dev fetch https://example.com

# JSON output with metadata
npm run dev fetch https://example.com --output json

# Both URL and markdown
npm run dev fetch https://example.com --output both
```

### CLI Options

- `-p, --pages <number>` - Maximum number of pages to crawl (default: 1)
- `-c, --concurrency <number>` - Max concurrent requests (default: 3)
- `--no-robots` - Ignore robots.txt
- `--all-origins` - Allow cross-origin crawling
- `-u, --user-agent <string>` - Custom user agent
- `--cache-dir <path>` - Cache directory (default: .cache)
- `-t, --timeout <ms>` - Request timeout in milliseconds (default: 30000)
- `-o, --output <format>` - Output format: json, markdown, or both (default: markdown)

### Clear cache
```bash
npm run dev clear-cache
```

## Auto-Restart Feature

The MCP server restarts automatically by default for improved reliability:

- Automatically restarts the server if it crashes
- Handles unhandled exceptions and promise rejections
- Implements exponential backoff (max 10 attempts in 1 minute)
- Logs all restart attempts for monitoring
- Gracefully handles shutdown signals (SIGINT, SIGTERM)

For development/debugging without auto-restart:
```bash
# Run directly without restart wrapper
npm run serve:dev
```

## Architecture

```
mcp/
├── src/
│   ├── crawler/          # URL fetching, queue management, robots.txt
│   ├── parser/           # DOM parsing, Readability, Turndown conversion
│   ├── cache/            # Disk-based caching with SHA-256 keys
│   ├── utils/            # Logger, chunker utilities
│   ├── index.ts          # CLI entry point
│   ├── serve.ts          # MCP server entry point
│   └── serve-restart.ts  # Auto-restart wrapper
```

## Development

```bash
# Run in development mode
npm run dev fetch https://example.com

# Build for production
npm run build

# Run tests
npm test

# Type checking
npm run typecheck

# Linting
npm run lint
```

## Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Submit a pull request

## Troubleshooting

### Cache Issues
```bash
npm run dev clear-cache
```

### Timeout Errors
- Increase timeout with `-t` flag
- Check network connectivity
- Verify URL is accessible

### Content Not Extracted
- Some sites block automated access
- Try custom user agent with `-u` flag
- Check if site requires JavaScript (not supported)

## License

MIT
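
As an illustration of the "SHA-256 hashed URLs" caching mentioned under Features and Architecture, the following is a minimal TypeScript sketch of how a cache file path could be derived from a URL. It is not the package's actual code: the helper names `cacheKeyForUrl` and `cachePathForUrl`, the URL normalization step, and the `.md` file extension are assumptions made for this example. Only the use of SHA-256 and the `.cache` default directory come from the text above.

```typescript
import { createHash } from "node:crypto";
import { join } from "node:path";

// Hypothetical helper (not part of the package's documented API):
// derive a fixed-length, filesystem-safe cache key from a URL.
function cacheKeyForUrl(url: string): string {
  // Assumption: normalize through the URL parser first, so that
  // "HTTPS://Example.com" and "https://example.com/" share one key.
  const normalized = new URL(url).href;
  return createHash("sha256").update(normalized).digest("hex");
}

// Hypothetical helper: map a URL to a file inside the cache directory.
// ".cache" mirrors the documented --cache-dir default; ".md" is assumed.
function cachePathForUrl(url: string, cacheDir: string = ".cache"): string {
  return join(cacheDir, `${cacheKeyForUrl(url)}.md`);
}

// Example: the same page always maps to the same cache file.
console.log(cachePathForUrl("https://example.com/article"));
```

Hashing the URL yields a 64-character hexadecimal name regardless of how long the URL is or which characters it contains, which is one plausible reason to key a disk cache this way.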