# 🧠 Memory MCP Server

[Release](https://github.com/pomazanbohdan/memory-mcp-1file/actions/workflows/release.yml) · [Container](https://github.com/pomazanbohdan/memory-mcp-1file/pkgs/container/memory-mcp-1file) · [MIT License](https://opensource.org/licenses/MIT) · [Rust](https://www.rust-lang.org)

A high-performance, **pure Rust** Model Context Protocol (MCP) server that provides persistent, semantic, and graph-based memory for AI agents.

Works perfectly with:

* **Claude Desktop**
* **Claude Code** (CLI)
* **Gemini CLI**
* **Cursor**
* **OpenCode**
* **Cline** / **Roo Code**
* Any other MCP-compliant client.

### 🏆 The "All-in-One" Advantage

Unlike other memory solutions that require a complex stack (Python + Vector DB + Graph DB), this project is **a single, self-contained executable**.

* ✅ **No External Database** (SurrealDB is embedded)
* ✅ **No API Keys, No Cloud, No Python** — Everything runs **100% locally** via an embedded ONNX runtime. The embedding model runs on CPU, and nothing leaves your machine.
* ✅ **Zero Setup** (Just run one Docker container or binary)

It combines:

1. **Vector Search** (FastEmbed) for semantic similarity.
2. **Knowledge Graph** (PetGraph) for entity relationships.
3. **Code Indexing** with a **symbol graph** (calls, extends, implements) for deep codebase understanding.
4. **Hybrid Retrieval** (Reciprocal Rank Fusion) for best results.

### 🏗️ Architecture

```mermaid
graph TD
    User[AI Agent / IDE]

    subgraph "Memory MCP Server"
        MS[MCP Server]

        subgraph "Core Engines"
            ES[Embedding Service]
            GS[Graph Service]
            CS[Codebase Service]
        end

        MS -- "Store / Search" --> ES
        MS -- "Relate Entities" --> GS
        MS -- "Index" --> CS

        ES -- "Vectorize Text" --> SDB[(SurrealDB Embedded)]
        GS -- "Knowledge Graph" --> SDB
        CS -- "AST Chunks" --> SDB
    end

    User -- "MCP Protocol" --> MS
```

> **[Click here for the Detailed Architecture Documentation](./ARCHITECTURE.md)**

---

## 🤖 Agent Integration (System Prompt)

Memory is useless if your agent doesn't check it. To get the "Long-Term Memory" effect, you must instruct your agent to follow a strict protocol.

We provide a battle-tested **[Memory Protocol (AGENTS.md)](./AGENTS.md)** that you can adapt.

### 🛡️ Core Workflows (Context Protection)

The protocol implements specific flows to handle **Context Window Compaction** and **Session Restarts**:

1. **🚀 Session Startup**: The agent *must* search for `TASK: in_progress` immediately. This restores the full context of what was happening before the last session ended or the context was compacted.
2. **⏳ Auto-Continue**: A safety mechanism where the agent presents the found task to the user and waits (or auto-continues), ensuring it doesn't hallucinate a new task.
3. **🔄 Triple Sync**: Updates **Memory**, **Todo List**, and **Files** simultaneously. If one fails (e.g., context lost), the others serve as backups.
4. **🧱 Prefix System**: All memories use prefixes (`TASK:`, `DECISION:`, `RESEARCH:`) so semantic search can precisely target the right type of information, reducing noise.

These workflows turn the agent from a "stateless chatbot" into a "stateful worker" that survives restarts and context clearing.

### Recommended System Prompt Snippet

Instead of scattering instructions across IDE-specific files (like `.cursorrules`), establish `AGENTS.md` as the **Single Source of Truth**.

Instruct your agent (in its base system prompt) to:

1. **Read `AGENTS.md`** at the start of every session.
2. **Follow the protocols** defined therein.

Here is a minimal reference prompt to bootstrap this behavior:

```markdown
# 🧠 Memory & Protocol
You have access to a persistent memory server and a protocol definition file.

1. **Protocol Adherence**:
   - READ `AGENTS.md` immediately upon starting.
   - Strictly follow the "Session Startup" and "Sync" protocols defined there.

2. **Context Restoration**:
   - Run `search_text("TASK: in_progress")` to restore context.
   - Do NOT ask the user "what should I do?" if a task is already in progress.
```

### Why this matters

Without this protocol, the agent loses context after compaction or session restarts. With it, the agent maintains the **full context of the current task**, ensuring no steps or details are lost even when the chat history is cleared.

---

## 🔌 Client Configuration

### Universal Docker Configuration (Any IDE/CLI)

To use this MCP server with any client (**Claude Code**, **OpenCode**, **Cline**, etc.), use the following Docker command structure.

**Key Requirements:**

1. **Memory Volume**: `-v mcp-data:/data` (Persists your graph, embeddings, **and cached model weights**)
2. **Project Volume**: `-v $(pwd):/project:ro` (Allows the server to read and index your code)
3. **Init Process**: `--init` (Ensures the server shuts down cleanly)

> [!TIP]
> **Model Caching**: The embedding model (~1 GB) is stored in `/data/models/`. Using a **named volume** (`mcp-data:/data`) ensures the model is downloaded only once. Without a named volume, Docker creates a new anonymous volume on each `docker run`, causing the model to re-download every time.

#### JSON Configuration (Claude Desktop, etc.)

Add this to your configuration file (e.g., `claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "memory": {
      "command": "docker",
      "args": [
        "run",
        "--init",
        "-i",
        "--rm",
        "--memory=3g",
        "-v", "mcp-data:/data",
        "-v", "/absolute/path/to/your/project:/project:ro",
        "ghcr.io/pomazanbohdan/memory-mcp-1file:latest"
      ]
    }
  }
}
```

> **Note:** Replace `/absolute/path/to/your/project` with the actual path you want to index. In some environments (like Cursor or VS Code extensions), you might be able to use variables like `${workspaceFolder}`, but absolute paths are most reliable for Docker.

### Cursor (Specific Instructions)

1. Go to **Cursor Settings** > **Features** > **MCP Servers**.
2. Click **+ Add New MCP Server**.
3. **Type**: `stdio`
4. **Name**: `memory`
5. **Command**:

   ```bash
   docker run --init -i --rm --memory=3g -v mcp-data:/data -v "/Users/yourname/projects/current:/project:ro" ghcr.io/pomazanbohdan/memory-mcp-1file:latest
   ```

   *(Remember to update the project path when switching workspaces if you need code indexing)*

### OpenCode / CLI

```bash
docker run --init -i --rm --memory=3g \
  -v mcp-data:/data \
  -v $(pwd):/project:ro \
  ghcr.io/pomazanbohdan/memory-mcp-1file:latest
```

### NPX / Bunx (No Docker required)

You can run the server directly via `npx` or `bunx`.
The npm package automatically downloads the correct pre-compiled binary for your platform.

#### Claude Desktop

Add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```

#### Claude Code (CLI)

```bash
claude mcp add memory -- npx -y memory-mcp-1file
```

#### Cursor

1. Go to **Cursor Settings** > **Features** > **MCP Servers**.
2. Click **+ Add New MCP Server**.
3. **Type**: `command`
4. **Name**: `memory`
5. **Command**: `npx -y memory-mcp-1file`

Or add to `.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```

#### Windsurf / VS Code

Add to your MCP settings:

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```

#### Bun

```json
{
  "mcpServers": {
    "memory": {
      "command": "bunx",
      "args": ["memory-mcp-1file"]
    }
  }
}
```

> **Note:** Unlike Docker, `npx`/`bunx` runs the binary **locally** — it already has access to your filesystem, so no directory mounting is needed.
> To customize the data storage path, pass `--data-dir` via args:
>
> ```json
> "args": ["-y", "memory-mcp-1file", "--", "--data-dir", "/path/to/data"]
> ```

### Gemini CLI

Add to your `~/.gemini/settings.json`:

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```

Or with Docker:

```json
{
  "mcpServers": {
    "memory": {
      "command": "docker",
      "args": [
        "run", "--init", "-i", "--rm", "--memory=3g",
        "-v", "mcp-data:/data",
        "-v", "${workspaceFolder}:/project:ro",
        "ghcr.io/pomazanbohdan/memory-mcp-1file:latest"
      ]
    }
  }
}
```

---

## ✨ Key Features

- **Semantic Memory**: Stores text with vector embeddings (`qwen3` by default) for "vibe-based" retrieval.
- **Graph Memory**: Tracks entities (`User`, `Project`, `Tech`) and their relations (`uses`, `likes`). Supports PageRank-based traversal.
- **Code Intelligence**: Indexes local project directories (AST-based chunking) for Rust, Python, TypeScript, JavaScript, Go, Java, and **Dart/Flutter**. Tracks **calls, imports, extends, implements, and mixin** relationships between symbols.
- **Temporal Validity**: Memories can have `valid_from` and `valid_until` dates.
- **SurrealDB Backend**: Fast, embedded, single-file database.

---

## 🛠️ Tools Available

The server exposes **27 tools** to the AI model, organized into logical categories.

### 🧠 Core Memory Management

| Tool | Description |
|------|-------------|
| `store_memory` | Store a new memory with content and optional metadata. |
| `update_memory` | Update an existing memory (only provided fields). |
| `delete_memory` | Delete a memory by its ID. |
| `list_memories` | List memories with pagination (newest first). |
| `get_memory` | Get a specific memory by ID. |
| `invalidate` | Soft-delete a memory (mark as invalid). |
| `get_valid` | Get currently active memories (filters out expired ones). |
| `get_valid_at` | Get memories that were valid at a specific past timestamp. |

### 🔎 Search & Retrieval

| Tool | Description |
|------|-------------|
| `recall` | **Hybrid search** (Vector + Keyword + Graph). Best for general questions. |
| `search` | Pure semantic vector search. |
| `search_text` | Exact keyword match (BM25). |

### 🕸️ Knowledge Graph

| Tool | Description |
|------|-------------|
| `create_entity` | Define a node (e.g., "React", "Authentication"). |
| `create_relation` | Link nodes (e.g., "Project" -> "uses" -> "React"). |
| `get_related` | Find connected concepts via graph traversal. |
| `detect_communities` | Detect communities in the graph using the Leiden algorithm. |

### 💻 Codebase Intelligence

| Tool | Description |
|------|-------------|
| `index_project` | Scan and index a local folder for code search. |
| `get_index_status` | Check whether indexing is in progress or has failed. |
| `list_projects` | List all indexed projects. |
| `delete_project` | Remove a project and its code chunks from the index. |
| `search_code` | Semantic search over code chunks. |
| `recall_code` | **Hybrid code search** (Vector + BM25 + Symbol Graph PageRank via RRF). Best-quality code retrieval. |
| `search_symbols` | Search for functions/classes by name. |
| `get_callers` | Find functions that call a given symbol. |
| `get_callees` | Find functions called by a given symbol. |
| `get_related_symbols` | Get related symbols via graph traversal (calls, extends, implements). |

### ⚙️ System & Maintenance

| Tool | Description |
|------|-------------|
| `get_status` | Get server health and loading status. |
| `reset_all_memory` | **DANGER**: Wipes all data (memories, graph, code). |
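
All of these tools are invoked through MCP's standard `tools/call` JSON-RPC method. As a rough illustration only (the argument names `content` and `query` are assumptions for this sketch, not this server's documented schema), a client request for `store_memory` followed by a `recall` might be built like this:

```python
import json

def tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request, as MCP clients send over stdio."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

# Store a prefixed memory, then retrieve it later with hybrid search.
store_req = tool_call(1, "store_memory", {"content": "TASK: in_progress - migrate auth to OAuth2"})
recall_req = tool_call(2, "recall", {"query": "which task is currently in progress?"})
print(store_req)
```

In practice your MCP client (Claude Desktop, Cursor, etc.) builds and sends these frames for you; the sketch only shows what crosses the stdio boundary.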

---

## ⚙️ Configuration

Environment variables or CLI args:

| Arg | Env | Default | Description |
|-----|-----|---------|-------------|
| `--data-dir` | `DATA_DIR` | `./data` | DB location |
| `--model` | `EMBEDDING_MODEL` | `qwen3` | Embedding model (`qwen3`, `gemma`, `bge_m3`, `nomic`, `e5_multi`, `e5_small`) |
| `--mrl-dim` | `MRL_DIM` | *(native)* | Output dimension for MRL-supported models (e.g. 64, 128, 256, 512, or 1024 for Qwen3). Defaults to the model's native maximum dimension (1024 for Qwen3). |
| `--batch-size` | `BATCH_SIZE` | `8` | Maximum batch size for embedding inference |
| `--cache-size` | `CACHE_SIZE` | `1000` | LRU cache capacity for embeddings |
| `--timeout` | `TIMEOUT_MS` | `30000` | Timeout in milliseconds |
| `--idle-timeout` | `IDLE_TIMEOUT` | `0` | Idle timeout in minutes. 0 = disabled |
| `--log-level` | `LOG_LEVEL` | `info` | Verbosity |
| *(none)* | `HF_TOKEN` | *(none)* | Hugging Face token (ONLY required for gated models like `gemma`) |

### 🧠 Available Models

You can switch the embedding model using the `--model` arg or `EMBEDDING_MODEL` env var.

| Argument Value | HuggingFace Repo | Dimensions | Size | Use Case |
| :--- | :--- | :--- | :--- | :--- |
| `qwen3` | `Qwen/Qwen3-Embedding-0.6B` | 1024 (MRL) | 1.2 GB | **Default**. Top open-source model, 32K context, MRL support. |
| `gemma` | `onnx-community/embeddinggemma-300m-ONNX` | 768 (MRL) | ~195 MB | Lighter alternative with MRL support. (Requires a proprietary license agreement) |
| `bge_m3` | `BAAI/bge-m3` | 1024 | 2.3 GB | State-of-the-art multilingual hybrid retrieval. Heavy. |
| `nomic` | `nomic-ai/nomic-embed-text-v1.5` | 768 | 1.9 GB | High-quality long-context, BERT-compatible. |
| `e5_multi` | `intfloat/multilingual-e5-base` | 768 | 1.1 GB | Legacy; kept for backward compatibility. |
| `e5_small` | `intfloat/multilingual-e5-small` | 384 | 134 MB | Fastest, minimal RAM. Good for dev/testing. |
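
Every setting can come from either a CLI arg or an env var. The table does not state which wins when both are set; assuming the conventional precedence (CLI arg over env var over built-in default), resolution can be sketched as:

```python
import os

# Defaults taken from the configuration table above.
DEFAULTS = {
    "DATA_DIR": "./data",
    "EMBEDDING_MODEL": "qwen3",
    "BATCH_SIZE": "8",
    "CACHE_SIZE": "1000",
    "TIMEOUT_MS": "30000",
    "IDLE_TIMEOUT": "0",
    "LOG_LEVEL": "info",
}

def resolve(env_name, cli_value=None):
    """Assumed precedence: explicit CLI arg > environment variable > default."""
    if cli_value is not None:
        return cli_value
    return os.environ.get(env_name, DEFAULTS[env_name])

# e.g. `--model gemma` on the command line overrides EMBEDDING_MODEL in the environment.
model = resolve("EMBEDDING_MODEL", cli_value="gemma")
```

This precedence order is an assumption for illustration; check the server's `--help` output if the distinction matters for your setup.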
### 📉 Matryoshka Representation Learning (MRL)

Models marked with **(MRL)** support dynamically truncating the output embedding vector to a smaller dimension (e.g., 512, 256, 128) with minimal loss of accuracy. This saves database storage and speeds up vector search.

Use the `--mrl-dim` argument to specify the desired size. If omitted, the default is the model's native base dimension (e.g., 1024 for Qwen3).

**Warning:** Once your database is created with a specific dimension, you cannot change it without wiping the data directory.

### 🔒 Gated Models & Authentication (Gemma)

By default, the server uses **Qwen3**, which is fully open-source and downloads automatically without any authentication.

However, if you choose to use **Gemma** (`--model gemma`), you must authenticate because it is a "Gated Model" with a proprietary license.

To use Gemma:

1. Go to [google/embeddinggemma-300m](https://huggingface.co/google/embeddinggemma-300m) on Hugging Face.
2. Log in and click **"Agree to access repository"**.
3. Generate an Access Token at [HF Tokens](https://huggingface.co/settings/tokens) (Read access is enough).
4. Start the server with the token:

```bash
# Using environment variable
HF_TOKEN="hf_your_token_here" memory-mcp --model gemma

# Or via .env file (see .env.example)
```

> [!WARNING]
> **Changing Models & Data Compatibility**
>
> If you switch to a model with different dimensions (e.g., from `e5_small` to `e5_multi`), **your existing database will be incompatible**.
> You must delete the data directory (volume) and re-index your data.
>
> Switching between models with the same dimensions (e.g., `e5_multi` <-> `nomic`) is theoretically possible but not recommended, as semantic spaces differ.

## 🔮 Future Roadmap (Research & Ideas)

Based on analysis of advanced memory systems like [Hindsight](https://hindsight.vectorize.io/) (see their documentation for details on these mechanisms), we are exploring these "Cognitive Architecture" features for future releases:

### 1. Meta-Cognitive Reflection (Consolidation)

* **Problem:** Raw memories accumulate noise over time (e.g., 10 separate memories about fixing the same bug).
* **Solution:** Implement a `reflect` background process (or tool) that periodically scans recent memories to:
  * **De-duplicate** redundant entries.
  * **Resolve conflicts** (if two memories contradict, keep the newer one or flag for review).
  * **Synthesize** low-level facts into high-level "Insights" (e.g., "User prefers Rust over Python" derived from 5 code choices).

### 2. Temporal Decay & "Presence"

* **Problem:** Old memories can sometimes drown out current context in semantic search.
* **Solution:** Integrate **Time Decay** into the Reciprocal Rank Fusion (RRF) algorithm.
  * Give a calculated boost to recent memories for queries implying "current state".
  * Allow the agent to prioritize "working memory" over "historical archives" dynamically.
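
To make the time-decay idea concrete, here is a minimal sketch (not the server's implementation; `k=60`, the 30-day half-life, and the decay weight are illustrative choices) of folding an exponential recency boost into standard RRF scoring:

```python
import math

def rrf_time_decay(rankings, ages_days, k=60, half_life_days=30.0, w_decay=0.5):
    """Reciprocal Rank Fusion with an exponential recency boost.

    rankings:  list of ranked result lists (memory ids), one per retriever
    ages_days: mapping of memory id -> age of the memory in days
    """
    scores = {}
    # Standard RRF: each retriever contributes 1 / (k + rank).
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Recency boost: halves every `half_life_days`, scaled to the RRF range.
    for doc, age in ages_days.items():
        if doc in scores:
            scores[doc] += w_decay * (1.0 / k) * math.exp(-math.log(2) * age / half_life_days)
    return sorted(scores, key=scores.get, reverse=True)

# A fresh memory outranks an old one that tied with it in the base fusion.
order = rrf_time_decay(
    rankings=[["old", "new"], ["new", "old"]],
    ages_days={"old": 365, "new": 1},
)
print(order)  # ['new', 'old']
```

The two candidates tie under plain RRF (each is ranked first by one retriever), so the recency term alone breaks the tie in favor of the newer memory, which is exactly the "current state" behavior described above.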
### 3. Namespaced Memory Banks

* **Problem:** Running one Docker container per project is resource-heavy.
* **Solution:** Add support for `namespace` or `project_id` scoping.
  * Allows a single server instance to host isolated "Memory Banks" for different projects or agent personas.
  * Enables "switching context" without restarting the container.

### 4. Epistemic Confidence Scoring

* **Problem:** The agent treats a guess the same as a verified fact.
* **Solution:** Add a `confidence` score (0.0 - 1.0) to memory schemas.
  * Allows storing hypotheses ("I think the bug is in auth.rs", confidence: 0.3).
  * Retrieval tools can filter out low-confidence memories when answering factual questions.

---

## License

MIT