# Large File MCP Server

A Model Context Protocol (MCP) server for intelligent handling of large files with smart chunking, navigation, and streaming capabilities.

<a href="https://glama.ai/mcp/servers/@willianpinho/large-file-mcp">
  <img width="380" height="200" src="https://glama.ai/mcp/servers/@willianpinho/large-file-mcp/badge" alt="Large File MCP Server" />
</a>

> **[Full Documentation](https://willianpinho.github.io/large-file-mcp/)** | [API Reference](https://willianpinho.github.io/large-file-mcp/api/reference) | [Examples](https://willianpinho.github.io/large-file-mcp/examples/use-cases)

## Features

- **Smart Chunking** - Automatically determines optimal chunk size based on file type
- **Intelligent Navigation** - Jump to specific lines with surrounding context
- **Powerful Search** - Regex support with context lines before/after matches
- **File Analysis** - Comprehensive metadata and statistical analysis
- **Memory Efficient** - Stream files of any size without loading into memory
- **Performance Optimized** - Built-in LRU caching for frequently accessed chunks
- **Type Safe** - Written in TypeScript with strict typing
- **Cross-Platform** - Works on Windows, macOS, and Linux

## Installation

```bash
npm install -g @willianpinho/large-file-mcp
```

Or use directly with npx:

```bash
npx @willianpinho/large-file-mcp
```

## Quick Start

### Claude Code CLI

Add the MCP server using the CLI:

```bash
# Add for current project only (local scope)
claude mcp add --transport stdio --scope local large-file-mcp -- npx -y @willianpinho/large-file-mcp

# Add globally for all projects (user scope)
claude mcp add --transport stdio --scope user large-file-mcp -- npx -y @willianpinho/large-file-mcp
```

**Verify installation:**

```bash
claude mcp list
claude mcp get large-file-mcp
```

**Remove if needed:**

```bash
# Remove from local scope
claude mcp remove large-file-mcp -s local

# Remove from user scope
claude mcp remove large-file-mcp -s user
```

**MCP Scopes:**

- `local` - Available only in the current project directory
- `user` - Available globally for all projects
- `project` - Defined in `.mcp.json` for team sharing

### Claude Desktop

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "large-file": {
      "command": "npx",
      "args": ["-y", "@willianpinho/large-file-mcp"]
    }
  }
}
```

**Config file locations:**

- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`

Restart Claude Desktop after editing.

### Other AI Platforms

**Gemini:**

```json
{
  "tools": [
    {
      "name": "large-file-mcp",
      "command": "npx @willianpinho/large-file-mcp",
      "protocol": "mcp"
    }
  ]
}
```

## Usage

Once configured, you can use natural language to interact with large files:

```text
Read the first chunk of /var/log/system.log
```

```text
Find all ERROR messages in /var/log/app.log
```

```text
Show me line 1234 of /code/app.ts with context
```

```text
Get the structure of /data/sales.csv
```

## Available Tools

### read_large_file_chunk

Read a specific chunk of a large file with intelligent chunking.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `chunkIndex` (optional): Zero-based chunk index (default: 0)
- `linesPerChunk` (optional): Lines per chunk (auto-detected if not provided)
- `includeLineNumbers` (optional): Include line numbers (default: false)

**Example:**

```json
{
  "filePath": "/var/log/system.log",
  "chunkIndex": 0,
  "includeLineNumbers": true
}
```

### search_in_large_file

Search for patterns in large files with context.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `pattern` (required): Search pattern
- `caseSensitive` (optional): Case sensitive search (default: false)
- `regex` (optional): Use regex pattern (default: false)
- `maxResults` (optional): Maximum results (default: 100)
- `contextBefore` (optional): Context lines before match (default: 2)
- `contextAfter` (optional): Context lines after match (default: 2)

**Example:**

```json
{
  "filePath": "/var/log/error.log",
  "pattern": "ERROR.*database",
  "regex": true,
  "maxResults": 50
}
```

### get_file_structure

Analyze file structure and get comprehensive metadata.

**Parameters:**

- `filePath` (required): Absolute path to the file

**Returns:** File metadata, line statistics, recommended chunk size, and sample lines.

### navigate_to_line

Jump to a specific line with surrounding context.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `lineNumber` (required): Line number to navigate to (1-indexed)
- `contextLines` (optional): Context lines before/after (default: 5)

### get_file_summary

Get comprehensive statistical summary of a file.

**Parameters:**

- `filePath` (required): Absolute path to the file

**Returns:** File metadata, line statistics, character statistics, and word count.

### stream_large_file

Stream a file in chunks for processing very large files.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `chunkSize` (optional): Chunk size in bytes (default: 64KB)
- `startOffset` (optional): Starting byte offset (default: 0)
- `maxChunks` (optional): Maximum chunks to return (default: 10)

## Supported File Types

The server intelligently detects and optimizes for:

- Text files (.txt) - 500 lines/chunk
- Log files (.log) - 500 lines/chunk
- Code files (.ts, .js, .py, .java, .cpp, .go, .rs, etc.) - 300 lines/chunk
- CSV files (.csv) - 1000 lines/chunk
- JSON files (.json) - 100 lines/chunk
- XML files (.xml) - 200 lines/chunk
- Markdown files (.md) - 500 lines/chunk
- Configuration files (.yml, .yaml, .sh, .bash) - 300 lines/chunk

## Configuration

Customize behavior using environment variables:

| Variable | Description | Default |
|----------|-------------|---------|
| `CHUNK_SIZE` | Default lines per chunk | `500` |
| `OVERLAP_LINES` | Overlap between chunks | `10` |
| `MAX_FILE_SIZE` | Maximum file size in bytes | `10737418240` (10GB) |
| `CACHE_SIZE` | Cache size in bytes | `104857600` (100MB) |
| `CACHE_TTL` | Cache TTL in milliseconds | `300000` (5 minutes) |
| `CACHE_ENABLED` | Enable/disable caching | `true` |

**Example with custom settings (Claude Desktop):**

```json
{
  "mcpServers": {
    "large-file": {
      "command": "npx",
      "args": ["-y", "@willianpinho/large-file-mcp"],
      "env": {
        "CHUNK_SIZE": "1000",
        "CACHE_ENABLED": "true"
      }
    }
  }
}
```

**Example with custom settings (Claude Code CLI):**

```bash
claude mcp add --transport stdio --scope user large-file-mcp \
  --env CHUNK_SIZE=1000 \
  --env CACHE_ENABLED=true \
  -- npx -y @willianpinho/large-file-mcp
```

## Examples

### Analyzing Log Files

```text
Analyze /var/log/nginx/access.log and find all 404 errors
```

The AI will use the search tool to find patterns and provide context around each match.

### Code Navigation

```text
Find all function definitions in /project/src/main.py
```

Uses regex search to locate function definitions with surrounding code context.

### CSV Data Exploration

```text
Show me the structure of /data/sales.csv
```

Returns metadata, line count, sample rows, and recommended chunk size.

### Large File Processing

```text
Stream the first 100MB of /data/huge_dataset.json
```

Uses streaming mode to handle very large files efficiently.

## Performance

### Caching

- **LRU Cache** with configurable size (default 100MB)
- **TTL-based expiration** (default 5 minutes)
- **80-90% hit rate** for repeated access
- Significant performance improvement for frequently accessed files

### Memory Management

- **Streaming architecture** - files are read line-by-line, never fully loaded
- **Configurable chunk sizes** - adjust based on your use case
- **Smart buffering** - minimal memory footprint for search operations

### File Size Handling

| File Size | Operation Time | Method |
|-----------|----------------|--------|
| < 1MB | < 100ms | Direct read |
| 1-100MB | < 500ms | Streaming |
| 100MB-1GB | 1-3s | Streaming + cache |
| > 1GB | Progressive | AsyncGenerator |

## Development

### Building from Source

```bash
git clone https://github.com/willianpinho/large-file-mcp.git
cd large-file-mcp
pnpm install
pnpm build
```

### Development Mode

```bash
pnpm dev    # Watch mode
pnpm lint   # Run linter
pnpm start  # Run server
```

### Project Structure

```text
src/
├── index.ts          # Entry point
├── server.ts         # MCP server implementation
├── fileHandler.ts    # Core file handling logic
├── cacheManager.ts   # Caching implementation
└── types.ts          # TypeScript type definitions
```

## Troubleshooting

### File not accessible

Ensure the file path is absolute and the file has read permissions:

```bash
chmod +r /path/to/file
```

### Out of memory

1. Reduce `CHUNK_SIZE` environment variable
2. Disable cache with `CACHE_ENABLED=false`
3. Use `stream_large_file` for very large files

### Slow search performance

1. Reduce `maxResults` parameter
2. Use `startLine` and `endLine` to limit search range
3. Ensure caching is enabled

### Claude Code CLI: MCP server not found

Check if the server is installed:

```bash
claude mcp list
```

If not listed, reinstall:

```bash
claude mcp add --transport stdio --scope user large-file-mcp -- npx -y @willianpinho/large-file-mcp
```

Check server health:

```bash
claude mcp get large-file-mcp
```

## Usage Metrics

This MCP server is actively maintained and monitored for usage patterns to improve functionality. Usage metrics help us:

- Understand which tools are most valuable
- Identify performance bottlenecks
- Prioritize feature development
- Ensure reliability and stability

### Monitoring in Production

The server provides comprehensive logging and telemetry through environment variables:

- **CACHE_ENABLED**: Enable/disable caching (default: `true`)
- **CACHE_SIZE**: Cache size in bytes (default: `104857600` - 100MB)
- **CACHE_TTL**: Cache TTL in milliseconds (default: `300000` - 5 minutes)
- **CHUNK_SIZE**: Default lines per chunk (default: `500`)
- **MAX_FILE_SIZE**: Maximum file size in bytes (default: `10737418240` - 10GB)
- **OVERLAP_LINES**: Overlap between chunks (default: `10`)

### Usage Examples

Recent usage patterns show the server is particularly effective for:

- **Log Analysis**: Processing multi-GB log files with search and navigation
- **Data Processing**: Reading large CSV/JSON files in manageable chunks
- **Code Review**: Navigating large codebases efficiently
- **System Monitoring**: Analyzing system logs and debug outputs
- **Document Analysis**: Processing large text documents

For detailed analytics and usage trends, visit the [Glama.ai dashboard](https://glama.ai/mcp/servers/@willianpinho/large-file-mcp).

## Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

### Development Workflow

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Ensure code builds and lints successfully
5. Submit a pull request

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.

## License

MIT

## Support

- **Issues:** [GitHub Issues](https://github.com/willianpinho/large-file-mcp/issues)
- **Documentation:** This README and inline code documentation
- **Examples:** Check the `examples/` directory

## Acknowledgments

Built with the [Model Context Protocol SDK](https://github.com/modelcontextprotocol/sdk).

---

Made for the AI developer community.
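
As a rough illustration of how the per-type defaults under Supported File Types relate to the zero-based `chunkIndex` of `read_large_file_chunk` and the 1-indexed `lineNumber` of `navigate_to_line`, the sketch below maps a file extension to a lines-per-chunk value and works out which chunk would contain a given line. The helper names `defaultLinesPerChunk` and `chunkIndexForLine` are hypothetical and not part of the server's API. The sketch also assumes chunks are consecutive with no overlap (it ignores `OVERLAP_LINES`) and that unlisted extensions fall back to the documented `CHUNK_SIZE` default of 500; the server's own detection logic may differ.

```typescript
// Illustrative only: these helpers are hypothetical and are not exported by large-file-mcp.
// The values are copied from the "Supported File Types" list in this README.
const LINES_PER_CHUNK: Record<string, number> = {
  ".txt": 500,
  ".log": 500,
  ".md": 500,
  ".csv": 1000,
  ".json": 100,
  ".xml": 200,
  // Code files (only the extensions named explicitly in the list)
  ".ts": 300, ".js": 300, ".py": 300, ".java": 300, ".cpp": 300, ".go": 300, ".rs": 300,
  // Configuration files
  ".yml": 300, ".yaml": 300, ".sh": 300, ".bash": 300,
};

// Assumption: unknown extensions use the documented CHUNK_SIZE default.
const FALLBACK_LINES_PER_CHUNK = 500;

function defaultLinesPerChunk(filePath: string): number {
  const dot = filePath.lastIndexOf(".");
  const slash = Math.max(filePath.lastIndexOf("/"), filePath.lastIndexOf("\\"));
  // Only treat the dot as an extension separator if it is in the last path segment.
  const ext = dot > slash ? filePath.slice(dot).toLowerCase() : "";
  return LINES_PER_CHUNK[ext] ?? FALLBACK_LINES_PER_CHUNK;
}

// Convert a 1-indexed line number into a zero-based chunk index,
// assuming non-overlapping chunks of equal size.
function chunkIndexForLine(lineNumber: number, linesPerChunk: number): number {
  if (!Number.isInteger(lineNumber) || lineNumber < 1) {
    throw new RangeError("lineNumber must be a positive integer (1-indexed)");
  }
  if (!Number.isInteger(linesPerChunk) || linesPerChunk < 1) {
    throw new RangeError("linesPerChunk must be a positive integer");
  }
  return Math.floor((lineNumber - 1) / linesPerChunk);
}

// Build read_large_file_chunk arguments for the chunk containing line 1234.
const filePath = "/code/app.ts";
const linesPerChunk = defaultLinesPerChunk(filePath);
const request = {
  filePath,
  chunkIndex: chunkIndexForLine(1234, linesPerChunk),
  linesPerChunk,
  includeLineNumbers: true,
};
console.log(JSON.stringify(request, null, 2));
```

Under these assumptions, a `.ts` file uses 300 lines per chunk, so line 1234 falls in chunk 4 (lines 1201 to 1500). Passing `linesPerChunk` explicitly in the request keeps the computed index meaningful even if the server's auto-detected value is different.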