Give any AI instant automated knowledge of your entire codebase (and infrastructure) — at scale, zero configuration, fully private, completely free. Kindly sponsored by Altaire Limited One thing, done well: deep codebase intelligence — zero setup, no bloat, fully automatic. SocratiCode gives AI assistants deep semantic understanding of your codebase — hybrid search, polyglot code dependency graphs
Add this skill
npx mdskills install giancarloerra/socraticodeEnterprise-grade codebase intelligence with hybrid search, dependency graphs, and zero-config Docker deployment
![]()
"There is only one good, knowledge, and one evil, ignorance." — Socrates
Give any AI instant automated knowledge of your entire codebase (and infrastructure) — at scale, zero configuration, fully private, completely free.
Kindly sponsored by Altaire Limited
If SocratiCode has been useful to you, please ⭐ star this repo — it helps others discover it — and share it with your dev team and fellow developers!
One thing, done well: deep codebase intelligence — zero setup, no bloat, fully automatic. SocratiCode gives AI assistants deep semantic understanding of your codebase — hybrid search, polyglot code dependency graphs, and searchable context artifacts (database schemas, API specs, infra configs, architecture docs). Zero configuration — add it to any MCP host and it manages everything automatically.
Production-ready, battle-tested on enterprise-level large repositories (up to and over ~40 million lines of code). Batched, automatic resumable indexing checkpoints progress — pauses, crashes, restarts, and interruptions don't lose work. The file watcher keeps the index automatically updated at every file change and across sessions.
Private and local by default — Docker handles everything, no API keys required, no data leaves your machine. Cloud ready for embeddings (OpenAI, Google Gemini) and Qdrant, and a full suite of configuration options are all available when you need them.
The first Qdrant‑based MCP server that pairs auto‑managed, zero‑config local Docker deployment with AST‑aware code chunking, hybrid semantic + BM25 (RRF‑fused) code search, polyglot dependency graphs with circular‑dependency visualization, and searchable infra/API/database artifacts in a single focused, zero-config and easy to use code intelligence engine.
Benchmarked on VS Code (2.45M lines): SocratiCode uses 61% less context, 84% fewer tool calls, and is 37x faster than grep‑based exploration — tested live with Claude Opus 4.6. See the full benchmark →
Only Docker (running) required.
One-click install — VS Code and Cursor:
All MCP hosts — add the following to your mcpServers (Claude Desktop, Windsurf, Cline, Roo Code) or servers (VS Code project-local .vscode/mcp.json) config:
"socraticode": {
"command": "npx",
"args": ["-y", "socraticode"]
}
Claude Code — run this command:
claude mcp add socraticode -- npx -y socraticode
OpenAI Codex CLI — add to ~/.codex/config.toml:
[mcp_servers.socraticode]
command = "npx"
args = ["-y", "socraticode"]
Restart your host. On first use SocratiCode automatically pulls Docker images, starts its own Qdrant and Ollama containers, and downloads the embedding model — one-time setup, ~5 minutes depending on your connection. After that, it starts in seconds.
First time on a project — ask your AI: "Index this codebase". Indexing runs in the background; ask "What is the codebase index status?" to monitor progress. Depending on codebase size and whether you're using GPU-accelerated Ollama or cloud embeddings, first-time indexing can take anywhere from a few seconds to a few minutes (it takes under 10 minutes to first-index +3 million lines of code on a Macbook Pro M4). Once complete it doesn't need to be run again, you can search, explore the dependency graph, and query context artifacts.
Every time after that — just use the tools (search, graph, etc.). On server startup SocratiCode automatically detects previously indexed projects, restarts the file watcher, and runs an incremental update to catch any changes made while the server was down. If indexing was interrupted, it resumes automatically from the last checkpoint. You can also explicitly start or restart the watcher with codebase_watch { action: "start" }.
macOS / Windows on large codebases: Docker containers can't use the GPU. For medium-to-large repos, install native Ollama (auto-detected, no config change needed) for Metal/CUDA acceleration, or use OpenAI embeddings for speed without a local install. Full details.
Recommended: For best results, add the Agent Instructions to your AI assistant's system prompt or project instructions file (
CLAUDE.md,AGENTS.md, etc.). The key principle — search before reading — helps your AI use SocratiCode's tools effectively and avoid unnecessary file reads.
Advanced: cloud embeddings (OpenAI / Google), external Qdrant, remote Ollama, native Ollama, and dozens of tuning options are all available. See Configuration below.
I built SocratiCode because I regularly work on existing, large, and complex codebases across different languages and need to quickly understand them and act. Existing solutions were either too limited, insufficiently tested for production use, or bloated with unnecessary complexity. I wanted a single focused tool that does deep codebase intelligence well — zero setup, no bloat, fully automatic — and gets out of the way.
QDRANT_MODE, QDRANT_URL, and QDRANT_API_KEY environment variables.OLLAMA_MODE, OLLAMA_URL, EMBEDDING_MODEL and EMBEDDING_DIMENSIONS environment variables.text-embedding-3-small) or Google Generative AI (gemini-embedding-001) for cloud-based embeddings. Just set EMBEDDING_PROVIDER and your API key..gitignore files (root + nested), plus an optional .socraticodeignore for additional exclusions. Includes sensible built-in defaults. .gitignore processing can be disabled via RESPECT_GITIGNORE=false..tpl, .blade) can be included via EXTRA_EXTENSIONS env var or extraExtensions tool parameter. Works for both indexing and code graph.codebase_index completes, after codebase_update, and on the first codebase_search, codebase_status, or graph query. You can also start it manually with codebase_watch { action: "start" } if needed.codebase_graph_build manually unless you want to force a rebuild.proper-lockfile) prevents multiple MCP instances from simultaneously indexing or watching the same project. Stale locks from crashed processes are automatically reclaimed. When another MCP process is already watching a project, codebase_status reports "active (watched by another process)" instead of incorrectly showing "inactive."codebase_index while indexing is already running, it returns the current progress instead of starting a second operation.codebase_stop. The current batch finishes and checkpoints, preserving all progress. Re-run codebase_index to resume from where it left off.SOCRATICODE_LOG_LEVEL.| Dependency | Purpose | Install |
|---|---|---|
| Docker | Runs Qdrant (vector DB) and by default Ollama (embeddings) | docker.com |
| Node.js 18+ | Runs the MCP server | nodejs.org |
Docker must be running when you use the server in the default managed mode.
The Qdrant container is managed automatically. If you set QDRANT_MODE=external and point QDRANT_URL at a remote or cloud Qdrant instance, Docker is only needed for Ollama (embeddings) in that case.
The Ollama container (embeddings) is also managed automatically in the default auto mode. SocratiCode first checks if Ollama is already running natively — if so it uses it. Otherwise it manages a Docker container for you. First-time download of the docker images or embedding models may take a few minutes, depending on your internet speed, and is required only at first launch.
Docker containers on macOS and Windows cannot access the GPU (no Metal or CUDA passthrough). For small projects this is fine, but for medium-to-large codebases the CPU-only container is noticeably slower.
For best performance, install native Ollama: download and run the installer from ollama.com/download. Once Ollama is running, SocratiCode will automatically detect and use it — no extra configuration needed (first-time download of the embedding model, if not present, might take a few minutes). This gives you Metal GPU acceleration on macOS and CUDA on Windows/Linux.
If you prefer speed without a local install, see OpenAI Embeddings and Google Generative AI Embeddings below for cloud-based options. OpenAI is very fast with no local setup required. Google’s free tier is functional but rate-limited. See Environment Variables for configuration details.
All tools default projectPath to the current working directory, so you never need to specify a path for the active project.
User: "Index this project"
→ codebase_index {}
⚡ Indexing started in the background — call codebase_status to check progress
→ codebase_status {}
⚠ Full index in progress — Phase: generating embeddings (batch 1/1)
Progress: 247/1847 chunks embedded (13%) — Elapsed: 12s
→ codebase_status {}
✓ Indexing complete: 342 files, 1,847 chunks (took 115.2s)
File watcher: active (auto-updating on changes)
User: "Search for how authentication is handled"
→ codebase_search { query: "authentication handling" }
Runs dense semantic search + BM25 keyword search in parallel, fuses results with RRF
Returns top 10 results ranked by combined relevance
User: "What files depend on the auth middleware?"
→ codebase_graph_query { filePath: "src/middleware/auth.ts" }
Returns imports and dependents
(graph was auto-built after indexing — no manual build needed)
User: "Show me the dependency graph"
→ codebase_graph_visualize {}
Returns a Mermaid diagram color-coded by language
User: "Are there any circular dependencies?"
→ codebase_graph_circular {}
Found 2 cycles: src/a.ts → src/b.ts → src/a.ts
For best results, add instructions like the following to your AI assistant's system prompt, CLAUDE.md, AGENTS.md, or equivalent instructions file. The core principle: search before reading. The index gives you a map of the codebase in milliseconds; raw file reading is expensive and context-consuming.
## Codebase Search (SocratiCode)
This project is indexed with SocratiCode. Always use its MCP tools to explore the codebase
before reading any files directly.
### Workflow
1. **Start most explorations with `codebase_search`.**
Hybrid semantic + keyword search (vector + BM25, RRF-fused) runs in a single call.
- Use broad, conceptual queries for orientation: "how is authentication handled",
"database connection setup", "error handling patterns".
- Use precise queries for symbol lookups: exact function names, constants, type names.
- Prefer search results to infer which files to read — do not speculatively open files.
- **When to use grep instead**: If you already know the exact identifier, error string,
or regex pattern, grep/ripgrep is faster and more precise — no semantic gap to bridge.
Use `codebase_search` when you're exploring, asking conceptual questions, or don't
know which files to look in.
2. **Follow the graph before following imports.**
Use `codebase_graph_query` to see what a file imports and what depends on it before
diving into its contents. This prevents unnecessary reading of transitive dependencies.
3. **Read files only after narrowing down via search.**
Once search results clearly point to 1–3 files, read only the relevant sections.
Never read a file just to find out if it's relevant — search first.
4. **Use `codebase_graph_circular` when debugging unexpected behavior.**
Circular dependencies cause subtle runtime issues; check for them proactively.
5. **Check `codebase_status` if search returns no results.**
The project may not be indexed yet. Run `codebase_index` if needed, then wait for
`codebase_status` to confirm completion before searching.
6. **Leverage context artifacts for non-code knowledge.**
Projects can define a `.socraticodecontextartifacts.json` config to expose database
schemas, API specs, infrastructure configs, architecture docs, and other project
knowledge that lives outside source code. These artifacts are auto-indexed alongside
code during `codebase_index` and `codebase_update`.
- Run `codebase_context` early to see what artifacts are available.
- Use `codebase_context_search` to find specific schemas, endpoints, or configs
before asking about database structure or API contracts.
- If `codebase_status` shows artifacts are stale, run `codebase_context_index` to
refresh them.
### When to use each tool
| Goal | Tool |
|------|------|
| Understand what a codebase does / where a feature lives | `codebase_search` (broad query) |
| Find a specific function, constant, or type | `codebase_search` (exact name) or grep if you know already the exact string |
| Find exact error messages, log strings, or regex patterns | grep / ripgrep |
| See what a file imports or what depends on it | `codebase_graph_query` |
| Spot architectural problems | `codebase_graph_circular`, `codebase_graph_stats` |
| Visualise module structure | `codebase_graph_visualize` |
| Verify index is up to date | `codebase_status` |
| Discover what project knowledge (schemas, specs, configs) is available | `codebase_context` |
| Find database tables, API endpoints, infra configs | `codebase_context_search` |
Why semantic search first? A single
codebase_searchcall returns ranked, deduplicated snippets from across the entire codebase in milliseconds. This gives you a broad map at negligible token cost — far cheaper than opening files speculatively. Once you know which files matter, targeted reading is both faster and more accurate. That said, grep remains the right tool when you have an exact string or pattern — use whichever fits the query.
Keep the connection alive during indexing. Indexing runs in the background — the MCP server continues working even when not actively responding to tool calls. However, some MCP hosts might disconnect an idle MCP connection after a period of inactivity, which might cut off the background process. Instruct your AI to call
codebase_statusroughly every 60 seconds after startingcodebase_indexuntil it completes. This keeps the host connection active and provides real-time progress.
Requires Node.js 18+ and Docker (running). Already covered in Quick Start above, add the following to your mcpServers (Claude Desktop, Windsurf, Cline, Roo Code) or servers (VS Code project-local .vscode/mcp.json) config:
"socraticode": {
"command": "npx",
"args": ["-y", "socraticode"]
}
git clone https://github.com/giancarloerra/socraticode.git
cd socraticode
npm install
npm run build
Then use node /absolute/path/to/socraticode/dist/index.js in place of npx -y socraticode in the config examples below.
All
envoptions below apply equally to thenpxinstall. Just add the"env"block to the npx config shown above.
Add to your MCP settings - mcpServers (Claude Desktop, Windsurf, Cline, Roo Code) or servers (VS Code project-local .vscode/mcp.json):
Using npx? Your config is already in Quick Start. Add any
"env"block from the examples below as needed.
{
"mcpServers": {
"socraticode": {
"command": "node",
"args": ["/absolute/path/to/socraticode/dist/index.js"]
}
}
}
Tip: The default
OLLAMA_MODE=autodetects native Ollama (port 11434) on startup and uses it if available, otherwise falls back to a managed Docker container. To make your config self-documenting, add an"env"block with explicit values. See Environment Variables for all options.
If you have Ollama installed natively, set OLLAMA_MODE=external and point to your instance:
{
"mcpServers": {
"socraticode": {
"command": "node",
"args": ["/absolute/path/to/socraticode/dist/index.js"],
"env": {
"OLLAMA_MODE": "external",
"OLLAMA_URL": "http://localhost:11434"
}
}
}
}
The embedding model is pulled automatically on first use. To pre-download: ollama pull nomic-embed-text
{
"mcpServers": {
"socraticode": {
"command": "node",
"args": ["/absolute/path/to/socraticode/dist/index.js"],
"env": {
"OLLAMA_MODE": "external",
"OLLAMA_URL": "http://gpu-server.local:11434"
}
}
}
}
Use OpenAI's cloud embedding API instead of local Ollama. Requires an API key.
{
"mcpServers": {
"socraticode": {
"command": "node",
"args": ["/absolute/path/to/socraticode/dist/index.js"],
"env": {
"EMBEDDING_PROVIDER": "openai",
"OPENAI_API_KEY": "sk-..."
}
}
}
}
Defaults:
EMBEDDING_MODEL=text-embedding-3-small,EMBEDDING_DIMENSIONS=1536. For higher quality, usetext-embedding-3-largewithEMBEDDING_DIMENSIONS=3072.
Use Google's Gemini embedding API. Requires an API key.
{
"mcpServers": {
"socraticode": {
"command": "node",
"args": ["/absolute/path/to/socraticode/dist/index.js"],
"env": {
"EMBEDDING_PROVIDER": "google",
"GOOGLE_API_KEY": "AIza..."
}
}
}
}
Defaults:
EMBEDDING_MODEL=gemini-embedding-001,EMBEDDING_DIMENSIONS=3072.
Once connected, 21 tools are available to your AI assistant:
| Tool | Description |
|---|---|
codebase_index | Start indexing a codebase in the background (poll codebase_status for progress) |
codebase_stop | Gracefully stop an in-progress indexing operation (current batch finishes and checkpoints; resume with codebase_index) |
codebase_update | Incremental update — only re-indexes changed files |
codebase_remove | Remove a project's index (safely stops watcher, cancels in-flight indexing/update, waits for graph build) |
codebase_watch | Start/stop file watching — on start, catches up missed changes then watches for future ones |
| Tool | Description |
|---|---|
codebase_search | Hybrid semantic + keyword search (dense + BM25, RRF-fused) with optional file path and language filters |
codebase_status | Check index status and chunk count |
| Tool | Description |
|---|---|
codebase_graph_build | Build a polyglot dependency graph (runs in background — poll with codebase_graph_status) |
codebase_graph_query | Query imports and dependents for a specific file |
codebase_graph_stats | Get graph statistics (most connected files, orphans, language breakdown) |
codebase_graph_circular | Detect circular dependencies |
codebase_graph_visualize | Generate a Mermaid diagram of the dependency graph |
codebase_graph_status | Check graph build progress or persisted graph metadata |
codebase_graph_remove | Remove a project's persisted code graph (waits for in-flight graph build to finish first) |
| Tool | Description |
|---|---|
codebase_health | Check Docker, Qdrant, and embedding provider status |
codebase_list_projects | List all indexed projects with paths and metadata |
codebase_about | Display info about SocratiCode |
| Tool | Description |
|---|---|
codebase_context | List all context artifacts defined in .socraticodecontextartifacts.json with names, descriptions, and index status |
codebase_context_search | Semantic search across context artifacts (auto-indexes on first use, auto-detects staleness) |
codebase_context_index | Index or re-index all artifacts from .socraticodecontextartifacts.json |
codebase_context_remove | Remove all indexed context artifacts for a project (blocked while indexing is in progress) |
SocratiCode supports languages at three levels:
JavaScript, TypeScript, TSX, Python, Java, Kotlin, Scala, C, C++, C#, Go, Rust, Ruby, PHP, Swift, Bash/Shell, HTML, CSS/SCSS
Dart (import/export/part), Lua (require/dofile/loadfile)
Vue, Svelte, SASS, LESS, JSON, YAML, TOML, XML, INI/CFG, Markdown/MDX, RST, SQL, R, Dockerfile, TXT, and any file matching a supported extension or special filename (Dockerfile, Makefile, Gemfile, Rakefile, etc.)
54 file extensions + 8 special filenames supported out of the box.
The indexer combines three layers of ignore rules:
node_modules, .git, dist, build, lock files, IDE folders, etc..gitignore — All .gitignore files in the project (root and nested subdirectories). Set RESPECT_GITIGNORE=false to skip .gitignore processing entirely..socraticodeignore — Optional file for indexer-specific exclusions. Same syntax as .gitignore.Give the AI awareness of project knowledge beyond source code — database schemas, API specs, infrastructure configs, architecture docs, and more.
Create a .socraticodecontextartifacts.json file in your project root (see .socraticodecontextartifacts.json.example for a starter template):
{
"artifacts": [
{
"name": "database-schema",
"path": "./docs/schema.sql",
"description": "Complete PostgreSQL schema — all tables, indexes, constraints, foreign keys. Use to understand what data the app stores and how tables relate."
},
{
"name": "api-spec",
"path": "./docs/openapi.yaml",
"description": "OpenAPI 3.0 spec for the REST API. All endpoints, request/response schemas, auth requirements."
},
{
"name": "k8s-manifests",
"path": "./deploy/k8s/",
"description": "Kubernetes deployment manifests. Shows how services are deployed, scaled, and networked."
}
]
}
Each artifact has:
name — Unique identifier (used to filter searches)path — Path to a file or directory (relative to project root, or absolute). Directories are read recursively.description — Tells the AI what this artifact is and how to use itArtifacts are chunked and embedded into Qdrant using the same hybrid dense + BM25 search as code. On first search, artifacts are auto-indexed. On subsequent searches, staleness is auto-detected via content hashing — changed files are re-indexed transparently.
codebase_context — lists all defined artifacts and their index statuscodebase_context_search — semantic search across all artifacts (or filter by name)codebase_context_index — force re-index (usually not needed, auto-indexing handles it)codebase_context_remove — remove all indexed artifacts| Category | Examples |
|---|---|
| Database | SQL schema dumps (pg_dump --schema-only), Prisma schemas, Rails schema.rb, Django model dumps, migration files |
| API Contracts | OpenAPI/Swagger specs, GraphQL schemas, Protobuf definitions, AsyncAPI specs (Kafka, RabbitMQ) |
| Infrastructure | Terraform/Pulumi configs, Kubernetes manifests, Docker Compose files, CI/CD pipeline configs |
| Architecture | Architecture Decision Records (ADRs), service topology docs, data flow diagrams, domain glossaries |
| Operations | Monitoring/alerting rules, RBAC/permission matrices, auth flow documentation, feature flag configs |
| External | Third-party API docs, compliance requirements (SOC2, HIPAA, GDPR), SLA definitions |
Tip: For database schemas, every major database can export its entire schema to a single file:
pg_dump --schema-only(PostgreSQL),mysqldump --no-data(MySQL),sqlite3 db.sqlite .schema(SQLite). ORM schemas (Prisma, Rails, Django) are often already in your repo.
| Variable | Default | Description |
|---|---|---|
EMBEDDING_PROVIDER | ollama | Embedding backend: ollama (local, default), openai, or google |
EMBEDDING_MODEL | (per provider) | Model name. Defaults: nomic-embed-text (ollama), text-embedding-3-small (openai), gemini-embedding-001 (google) |
EMBEDDING_DIMENSIONS | (per provider) | Vector dimensions. Defaults: 768 (ollama), 1536 (openai), 3072 (google) |
EMBEDDING_CONTEXT_LENGTH | (auto-detected) | Model context window in tokens. Auto-detected for known models. Set manually for custom models. |
EMBEDDING_PROVIDER=ollama)| Variable | Default | Description |
|---|---|---|
OLLAMA_MODE | auto | auto = use native Ollama on port 11434 if available, otherwise manage a Docker container (recommended). docker = always use managed Docker container on port 11435. external = user-managed Ollama instance (native, remote, etc.) |
OLLAMA_URL | http://localhost:11434 (auto/external) / http://localhost:11435 (docker) | Full Ollama API endpoint |
OLLAMA_PORT | 11435 | Ollama container port (Docker mode). Ignored when OLLAMA_URL is set explicitly. |
OLLAMA_HOST | http://localhost:{OLLAMA_PORT} | Ollama base URL (alternative to OLLAMA_URL) |
OLLAMA_API_KEY | (none) | Optional API key for authenticated Ollama proxies |
| Variable | Default | Description |
|---|---|---|
OPENAI_API_KEY | (none) | Required when EMBEDDING_PROVIDER=openai. Get from platform.openai.com |
GOOGLE_API_KEY | (none) | Required when EMBEDDING_PROVIDER=google. Get from aistudio.google.com |
| Variable | Default | Description |
|---|---|---|
QDRANT_MODE | managed | managed = Docker-managed local Qdrant (default). external = user-provided remote or cloud Qdrant (no Docker management). |
QDRANT_URL | (none) | Full URL of a remote/cloud Qdrant instance (e.g. https://xyz.aws.cloud.qdrant.io:6333). When set, takes precedence over QDRANT_HOST + QDRANT_PORT. Required (or set QDRANT_HOST) when QDRANT_MODE=external. |
QDRANT_PORT | 16333 | Qdrant REST API port (managed mode, or external without QDRANT_URL) |
QDRANT_GRPC_PORT | 16334 | Qdrant gRPC port (managed mode only) |
QDRANT_HOST | localhost | Qdrant hostname (alternative to QDRANT_URL for non-HTTPS external instances) |
QDRANT_API_KEY | (none) | Qdrant API key (required for Qdrant Cloud and other authenticated deployments) |
| Variable | Default | Description |
|---|---|---|
RESPECT_GITIGNORE | true | Set to false to skip .gitignore processing. Built-in defaults and .socraticodeignore still apply. |
EXTRA_EXTENSIONS | (none) | Comma-separated list of additional file extensions to scan (e.g. .tpl,.blade,.hbs). Applies to both indexing and code graph. Files with extra extensions are indexed as plaintext and appear as leaf nodes in the code graph. Can also be passed per-operation via the extraExtensions tool parameter. |
MAX_FILE_SIZE_MB | 5 | Maximum file size in MB. Files larger than this are skipped during indexing. Increase for repos with large generated or data files you want indexed. |
SEARCH_DEFAULT_LIMIT | 10 | Default number of results returned by codebase_search (1-50). Each result is a ranked code chunk with file path, line range, and content. Higher values give broader coverage but produce more output. Can still be overridden per-query via the limit tool parameter. |
SEARCH_MIN_SCORE | 0.10 | Minimum RRF (Reciprocal Rank Fusion) score threshold (0-1). Results below this score are filtered out. Helps remove low-relevance noise from search results. Set to 0 to disable filtering (returns all results up to limit). Can be overridden per-query via the minScore tool parameter. Works together with limit: results are first filtered by score, then capped at limit. |
SOCRATICODE_LOG_LEVEL | info | Log verbosity: debug, info, warn, error |
SOCRATICODE_LOG_FILE | (none) | Absolute path to a log file. When set, all log entries are appended to this file (a session separator is written on each server start). Useful for debugging when the MCP host doesn't surface log notifications. |
Important: If you change
EMBEDDING_PROVIDER,EMBEDDING_MODEL, orEMBEDDING_DIMENSIONSafter indexing, you must re-index your projects (codebase_removethencodebase_index) since existing vectors have different dimensions.
SocratiCode manages Docker containers and persistent volumes:
| Resource | Name | Purpose | When |
|---|---|---|---|
| Container | socraticode-qdrant | Qdrant vector database (pinned v1.17.0) | managed mode only |
| Container | socraticode-ollama | Ollama embedding server | docker mode only |
| Volume | socraticode_qdrant_data | Persistent vector storage | managed mode only |
| Volume | socraticode_ollama_data | Persistent model storage | docker mode only |
In QDRANT_MODE=external mode, the Qdrant container and volume are not created or started — SocratiCode connects directly to the configured remote endpoint. Server-side BM25 inference (used for hybrid search) requires Qdrant v1.15.2 or later. The managed container runs v1.17.0. If you bring your own Qdrant instance, ensure it meets this minimum.
All containers use --restart unless-stopped for automatic recovery.
Why non-standard ports? SocratiCode intentionally uses non-default ports for its managed containers —
16333/16334instead of Qdrant's defaults (6333/6334), and11435instead of Ollama's default (11434). This avoids conflicts with any Qdrant or Ollama instance you may already be running locally. All ports are overridable via environment variables if needed.
SocratiCode has a comprehensive test suite with 634 tests across unit, integration, and end-to-end layers.
# Run all tests
npm test
# Run only unit tests (no Docker needed)
npm run test:unit
# Run integration tests (requires Docker)
npm run test:integration
# Run end-to-end tests (requires Docker)
npm run test:e2e
# Watch mode (re-runs on file changes)
npm run test:watch
# With coverage report
npm run test:coverage
| Layer | Tests | Docker? | Description |
|---|---|---|---|
Unit (tests/unit/) | 477 | No | Config, constants, ignore rules, cross-process locking, logging, graph analysis, import extraction, path resolution, embedding config, indexer utilities, embeddings, startup lifecycle, watcher cross-process awareness |
Integration (tests/integration/) | 137 | Yes | Docker/Ollama setup, Qdrant CRUD, real embeddings, indexer, watcher, code graph, all MCP tools |
E2E (tests/e2e/) | 20 | Yes | Complete lifecycle: health → index → search → graph → watch → remove |
Integration and E2E tests that require Docker are automatically skipped when Docker is not available.
Modern evaluations on real repositories show that hybrid lexical + semantic code search consistently outperforms plain grep once you care about natural-language queries, large codebases, or coding agents: reports show ~20% search-quality gains from BM25F ranking at scale, AST-aware retrieval improving recall and bug-fix performance on RepoEval and SWE-bench, and hybrid approach with grep (the default in SocratiCode) beats grep in 70% of agentic code-search tasks while cutting search operations by over half.
Running a head-to-head comparison against the VS Code codebase (~2.45 million lines of TypeScript/JavaScript across 5,300+ files, 55,437 indexed chunks) to measure what a Claude Opus 4.6 AI agent actually consumes when answering architectural questions.
Methodology: For each question, the grep approach follows the realistic multi-step workflow an AI agent uses today: grep -rl to find matching files, identify core files, read them in chunks (200 lines at a time), and repeat until it has enough context. The SocratiCode approach performs a single semantic search call that returns the 10 most relevant code chunks from across the entire codebase.
| Question | Grep (bytes) | SocratiCode (bytes) | Reduction | Speedup |
|---|---|---|---|---|
| How does VS Code implement workspace trust restrictions? | 56,383 | 21,149 | 62.5% | 49.7x |
| How does the diff editor compute and display text differences? | 37,650 | 15,961 | 57.6% | 40.2x |
| How does VS Code handle extension activation and lifecycle? | 36,231 | 16,181 | 55.3% | 34.4x |
| How does the integrated terminal spawn and manage shells? | 50,159 | 22,518 | 55.1% | 31.1x |
| How does VS Code implement the command palette and quick pick? | 70,087 | 20,676 | 70.5% | 31.7x |
| Total | 250,510 | 96,485 | 61.5% | 37.2x |
Key findings:
Note: This benchmark is conservative for the grep approach. It assumes the agent already knows which files to read. In practice, a real AI agent needs additional exploratory grep calls, follows dead ends, reads irrelevant files, and often needs multiple rounds of narrowing. The actual savings might be larger.
Natural-language and conceptual queries — Queries like "Where do we handle database connection pooling?" or "How does this library implement exponential backoff?" describe behavior rather than naming a function. Evaluations on repository-level benchmarks (RepoEval, SWE-bench) show that AST-aware semantic retrieval improves recall by up to 4.3 points and downstream code-generation accuracy by ~2.7 points compared to fixed line-based chunks. Agentic evaluations on real open-source repos show a 70% win rate for hybrid search over vanilla grep on hard, conceptual questions — with 56% fewer search operations and ~60,000 fewer tokens per complex query.
Large repos and monorepos — At multi-million LOC scale, full-text scans become expensive. Production search engines report ~20% relevance improvement from BM25F ranking over previous approaches, and use it as the first-stage retriever for semantic reranking. Hybrid search backed by inverted and vector indexes avoids full scans entirely, making it both faster and more precise at scale. Industry practitioners explicitly note that grep and find "don't scale well to millions of files" and that optimized embedding-based indexes can be faster at that scale.
Cross-file and cross-language reasoning — Finding all code paths that eventually call an internal helper across services, or mapping a natural-language spec to implementations in Go and SQL, requires understanding that goes beyond string matching. Evaluations show that hybrid pipelines with tree-sitter parsing and dependency context outperform grep when naming is non-obvious and semantic understanding is needed. AST-based chunking with learned retrievers improves retrieval in cross-language benchmarks, and multi-vector semantic models show large gains over BM25 alone across diverse code search tasks (AppsRetrieval, CodeSearchNet, CosQA) where queries are in natural language and targets span many languages.
Mixed code + context artifacts — Questions like "Where is rate-limiting configured?" might match Nginx configs, Terraform files, or YAML — not just application code. Hybrid search over mixed technical corpora (structured fields + free text) consistently outperforms pure lexical or pure vector approaches in published evaluations.
The same research makes clear when grep (or ripgrep) is entirely reasonable — and sometimes optimal:
On easy or directly-named queries, grep can match or beat semantic methods. That's why the best architectures don't replace grep — they extend it. SocratiCode's hybrid approach runs both BM25 keyword search and dense semantic search on every query, fusing results via RRF, so you get the precision of exact matching and the recall of semantic understanding in a single call.
Yes. Indexing automatically resumes from where it left off. The indexer checkpoints file hashes after every batch of files. When you ask your AI to index again (e.g. "index this project"), it detects the existing data, skips every file that was already successfully embedded, and only re-processes the files that weren't checkpointed before the failure. Already-indexed chunks are never deleted or re-embedded. Just ask your AI to index again and it will pick up where it stopped.
Indexing runs in the background on the MCP server. However, some MCP hosts (VS Code, Claude Desktop, etc.) disconnect an idle connection after a period of inactivity, which kills the background process. To keep the connection alive, ask your AI to check status (e.g. "check indexing status") roughly every 60 seconds after starting indexing until it completes. If the connection does drop and indexing is interrupted, just ask your AI to index again — it resumes automatically (see above).
If indexing repeatedly fails, throws errors on resume, or gets stuck in a loop, the simplest fix is to start fresh: ask your AI to "remove the index for this project", then ask it to index again. This clears all stored chunks and metadata for the project and begins a clean re-index. It won't affect other indexed projects.
Yes. You can stop indexing at any time and resume it later without losing progress:
This makes indexing very large codebases practical even on slower hardware — you can index in multiple sessions across hours or days, and no work is ever repeated or lost.
The file watcher auto-starts on first tool use for any previously indexed project. When it starts, it catches up all files modified while SocratiCode was down before watching for future changes.
If you want to force an immediate catch-up before searching, ask your AI to "start watching this project" or "update the index" — both run an incremental update synchronously and then start watching.
The watcher will not auto-start if a full index or incremental update is currently in progress, if the project has not been indexed yet, or if another MCP process is already watching the same project.
Yes. SocratiCode maintains a separate isolated collection for each project path. Ask your AI to "list all indexed projects" to see everything currently indexed.
Each collection is created with a fixed vector size matching the model used at index time.
If you change EMBEDDING_PROVIDER, EMBEDDING_MODEL, or EMBEDDING_DIMENSIONS in your
MCP config, any projects indexed with the old model will return a dimension mismatch error.
Ask your AI to "remove the index for this project" and then to index again with the new
model. Projects you haven't touched are unaffected.
The code you see behind Socrates is part of the original Apollo 11 guidance computer (AGC) source code for Command Module (Comanche055)!
SocratiCode is dual-licensed:
Open Source — AGPL-3.0. Free to use, modify, and distribute. If you modify SocratiCode and offer it as a network service, you must release your modifications under AGPL-3.0.
Commercial — For organizations that need to use SocratiCode in proprietary products or services without AGPL obligations. See LICENSE-COMMERCIAL or contact giancarlo@altaire.com.
Copyright (C) 2026 Giancarlo Erra - Altaire Limited.
SocratiCode includes open-source dependencies under their own licenses (MIT, Apache 2.0, ISC). See THIRD-PARTY-LICENSES for details.
Contributions are welcome. By submitting a pull request, you agree to the Contributor License Agreement.
Install via CLI
npx mdskills install giancarloerra/socraticodeSocratiCode is a free, open-source AI agent skill. Give any AI instant automated knowledge of your entire codebase (and infrastructure) — at scale, zero configuration, fully private, completely free. Kindly sponsored by Altaire Limited One thing, done well: deep codebase intelligence — zero setup, no bloat, fully automatic. SocratiCode gives AI assistants deep semantic understanding of your codebase — hybrid search, polyglot code dependency graphs
Install SocratiCode with a single command:
npx mdskills install giancarloerra/socraticodeThis downloads the skill files into your project and your AI agent picks them up automatically.
SocratiCode works with Claude Code, Claude Desktop, Cursor, Vscode Copilot, Windsurf, Continue Dev, Gemini Cli, Amp, Roo Code, Goose. Skills use the open SKILL.md format which is compatible with any AI coding agent that reads markdown instructions.