# 🧠 Memory MCP Server

[Release](https://github.com/pomazanbohdan/memory-mcp-1file/actions/workflows/release.yml) · [Container](https://github.com/pomazanbohdan/memory-mcp-1file/pkgs/container/memory-mcp-1file) · [MIT License](https://opensource.org/licenses/MIT) · [Rust](https://www.rust-lang.org)

A high-performance, **pure Rust** Model Context Protocol (MCP) server that provides persistent, semantic, and graph-based memory for AI agents.

Works perfectly with:

* **Claude Desktop**
* **Claude Code** (CLI)
* **Gemini CLI**
* **Cursor**
* **OpenCode**
* **Cline** / **Roo Code**
* Any other MCP-compliant client.

### 🏆 The "All-in-One" Advantage

Unlike other memory solutions that require a complex stack (Python + Vector DB + Graph DB), this project is **a single, self-contained executable**.

* ✅ **No External Database** (SurrealDB is embedded)
* ✅ **No API Keys, No Cloud, No Python** — Everything runs **100% locally** via an embedded ONNX runtime. The embedding model runs on CPU, and nothing leaves your machine.
* ✅ **Zero Setup** (Just run one Docker container or binary)

It combines:

1. **Vector Search** (FastEmbed) for semantic similarity.
2. **Knowledge Graph** (PetGraph) for entity relationships.
3. **Code Indexing** with a **symbol graph** (calls, extends, implements) for deep codebase understanding.
4. **Hybrid Retrieval** (Reciprocal Rank Fusion) for best results.

### 🏗️ Architecture

```mermaid
graph TD
    User[AI Agent / IDE]

    subgraph "Memory MCP Server"
        MS[MCP Server]

        subgraph "Core Engines"
            ES[Embedding Service]
            GS[Graph Service]
            CS[Codebase Service]
        end

        MS -- "Store / Search" --> ES
        MS -- "Relate Entities" --> GS
        MS -- "Index" --> CS

        ES -- "Vectorize Text" --> SDB[(SurrealDB Embedded)]
        GS -- "Knowledge Graph" --> SDB
        CS -- "AST Chunks" --> SDB
    end

    User -- "MCP Protocol" --> MS
```

> **[Click here for the Detailed Architecture Documentation](./ARCHITECTURE.md)**

---

## 🤖 Agent Integration (System Prompt)

Memory is useless if your agent doesn't check it. To get the "Long-Term Memory" effect, you must instruct your agent to follow a strict protocol.

We provide a battle-tested **[Memory Protocol (AGENTS.md)](./AGENTS.md)** that you can adapt.

### 🛡️ Core Workflows (Context Protection)

The protocol implements specific flows to handle **Context Window Compaction** and **Session Restarts**:

1. **🚀 Session Startup**: The agent *must* search for `TASK: in_progress` immediately. This restores the full context of what was happening before the last session ended or the context was compacted.
2. **⏳ Auto-Continue**: A safety mechanism where the agent presents the found task to the user and waits (or auto-continues), ensuring it doesn't hallucinate a new task.
3. **🔄 Triple Sync**: Updates **Memory**, **Todo List**, and **Files** simultaneously. If one fails (e.g., context lost), the others serve as backups.
4. **🧱 Prefix System**: All memories use prefixes (`TASK:`, `DECISION:`, `RESEARCH:`) so semantic search can precisely target the right type of information, reducing noise.

These workflows turn the agent from a "stateless chatbot" into a "stateful worker" that survives restarts and context clearing.

### Recommended System Prompt Snippet

Instead of scattering instructions across IDE-specific files (like `.cursorrules`), establish `AGENTS.md` as the **Single Source of Truth**.

Instruct your agent (in its base system prompt) to:

1. **Read `AGENTS.md`** at the start of every session.
2. **Follow the protocols** defined therein.

Here is a minimal reference prompt to bootstrap this behavior:

```markdown
# 🧠 Memory & Protocol
You have access to a persistent memory server and a protocol definition file.

1. **Protocol Adherence**:
   - READ `AGENTS.md` immediately upon starting.
   - Strictly follow the "Session Startup" and "Sync" protocols defined there.

2. **Context Restoration**:
   - Run `search_text("TASK: in_progress")` to restore context.
   - Do NOT ask the user "what should I do?" if a task is already in progress.
```

### Why this matters

Without this protocol, the agent loses context after compaction or session restarts. With it, the agent maintains the **full context of the current task**, ensuring no steps or details are lost even when the chat history is cleared.

---

## 🔌 Client Configuration

### Universal Docker Configuration (Any IDE/CLI)

To use this MCP server with any client (**Claude Code**, **OpenCode**, **Cline**, etc.), use the following Docker command structure.

**Key Requirements:**

1. **Memory Volume**: `-v mcp-data:/data` (Persists your graph, embeddings, **and cached model weights**)
2. **Project Volume**: `-v $(pwd):/project:ro` (Allows the server to read and index your code)
3. **Init Process**: `--init` (Ensures the server shuts down cleanly)

> [!TIP]
> **Model Caching**: The embedding model (~1 GB) is stored in `/data/models/`. Using a **named volume** (`mcp-data:/data`) ensures the model is downloaded only once. Without a named volume, Docker creates a new anonymous volume on each `docker run`, causing the model to re-download every time.

#### JSON Configuration (Claude Desktop, etc.)

Add this to your configuration file (e.g., `claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "memory": {
      "command": "docker",
      "args": [
        "run",
        "--init",
        "-i",
        "--rm",
        "--memory=3g",
        "-v", "mcp-data:/data",
        "-v", "/absolute/path/to/your/project:/project:ro",
        "ghcr.io/pomazanbohdan/memory-mcp-1file:latest"
      ]
    }
  }
}
```

> **Note:** Replace `/absolute/path/to/your/project` with the actual path you want to index. In some environments (like Cursor or VS Code extensions), you might be able to use variables like `${workspaceFolder}`, but absolute paths are most reliable for Docker.

### Cursor (Specific Instructions)

1. Go to **Cursor Settings** > **Features** > **MCP Servers**.
2. Click **+ Add New MCP Server**.
3. **Type**: `stdio`
4. **Name**: `memory`
5. **Command**:

   ```bash
   docker run --init -i --rm --memory=3g -v mcp-data:/data -v "/Users/yourname/projects/current:/project:ro" ghcr.io/pomazanbohdan/memory-mcp-1file:latest
   ```

   *(Remember to update the project path when switching workspaces if you need code indexing)*

### OpenCode / CLI

```bash
docker run --init -i --rm --memory=3g \
  -v mcp-data:/data \
  -v $(pwd):/project:ro \
  ghcr.io/pomazanbohdan/memory-mcp-1file:latest
```

### NPX / Bunx (No Docker required)

You can run the server directly via `npx` or `bunx`.
The npm package automatically downloads the correct pre-compiled binary for your platform.

#### Claude Desktop

Add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```

#### Claude Code (CLI)

```bash
claude mcp add memory -- npx -y memory-mcp-1file
```

#### Cursor

1. Go to **Cursor Settings** > **Features** > **MCP Servers**.
2. Click **+ Add New MCP Server**.
3. **Type**: `command`
4. **Name**: `memory`
5. **Command**: `npx -y memory-mcp-1file`

Or add to `.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```

#### Windsurf / VS Code

Add to your MCP settings:

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```

#### Bun

```json
{
  "mcpServers": {
    "memory": {
      "command": "bunx",
      "args": ["memory-mcp-1file"]
    }
  }
}
```

> **Note:** Unlike Docker, `npx`/`bunx` runs the binary **locally** — it already has access to your filesystem, so no directory mounting is needed.
> To customize the data storage path, pass `--data-dir` via args:
>
> ```json
> "args": ["-y", "memory-mcp-1file", "--", "--data-dir", "/path/to/data"]
> ```

### Gemini CLI

Add to your `~/.gemini/settings.json`:

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```

Or with Docker:

```json
{
  "mcpServers": {
    "memory": {
      "command": "docker",
      "args": [
        "run", "--init", "-i", "--rm", "--memory=3g",
        "-v", "mcp-data:/data",
        "-v", "${workspaceFolder}:/project:ro",
        "ghcr.io/pomazanbohdan/memory-mcp-1file:latest"
      ]
    }
  }
}
```

---

## ✨ Key Features

- **Semantic Memory**: Stores text with vector embeddings (`qwen3` by default) for "vibe-based" retrieval.
- **Graph Memory**: Tracks entities (`User`, `Project`, `Tech`) and their relations (`uses`, `likes`). Supports PageRank-based traversal.
- **Code Intelligence**: Indexes local project directories (AST-based chunking) for Rust, Python, TypeScript, JavaScript, Go, Java, and **Dart/Flutter**. Tracks **calls, imports, extends, implements, and mixin** relationships between symbols.
- **Temporal Validity**: Memories can have `valid_from` and `valid_until` dates.
- **SurrealDB Backend**: Fast, embedded, single-file database.

---

## 🛠️ Tools Available

The server exposes **27 tools** to the AI model, organized into logical categories.

### 🧠 Core Memory Management

| Tool | Description |
|------|-------------|
| `store_memory` | Store a new memory with content and optional metadata. |
| `update_memory` | Update an existing memory (only provided fields). |
| `delete_memory` | Delete a memory by its ID. |
| `list_memories` | List memories with pagination (newest first). |
| `get_memory` | Get a specific memory by ID. |
| `invalidate` | Soft-delete a memory (mark as invalid). |
| `get_valid` | Get currently active memories (filters out expired ones). |
| `get_valid_at` | Get memories that were valid at a specific past timestamp. |

### 🔎 Search & Retrieval

| Tool | Description |
|------|-------------|
| `recall` | **Hybrid search** (Vector + Keyword + Graph). Best for general questions. |
| `search` | Pure semantic vector search. |
| `search_text` | Exact keyword match (BM25). |

### 🕸️ Knowledge Graph

| Tool | Description |
|------|-------------|
| `create_entity` | Define a node (e.g., "React", "Authentication"). |
| `create_relation` | Link nodes (e.g., "Project" -> "uses" -> "React"). |
| `get_related` | Find connected concepts via graph traversal. |
| `detect_communities` | Detect communities in the graph using the Leiden algorithm. |

### 💻 Codebase Intelligence

| Tool | Description |
|------|-------------|
| `index_project` | Scan and index a local folder for code search. |
| `get_index_status` | Check whether indexing is in progress or has failed. |
| `list_projects` | List all indexed projects. |
| `delete_project` | Remove a project and its code chunks from the index. |
| `search_code` | Semantic search over code chunks. |
| `recall_code` | **Hybrid code search** (Vector + BM25 + Symbol Graph PageRank via RRF). Best-quality code retrieval. |
| `search_symbols` | Search for functions/classes by name. |
| `get_callers` | Find functions that call a given symbol. |
| `get_callees` | Find functions called by a given symbol. |
| `get_related_symbols` | Get related symbols via graph traversal (calls, extends, implements). |

### ⚙️ System & Maintenance

| Tool | Description |
|------|-------------|
| `get_status` | Get server health and loading status. |
| `reset_all_memory` | **DANGER**: Wipes all data (memories, graph, code). |
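
All of these tools are invoked through MCP's standard `tools/call` JSON-RPC method. As a rough illustration only (the argument names `content` and `query` are assumptions for this sketch, not this server's documented schema), a client request for `store_memory` followed by a `recall` might be built like this:

```python
import json

def tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request, as MCP clients send over stdio."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

# Store a prefixed memory, then retrieve it later with hybrid search.
store_req = tool_call(1, "store_memory", {"content": "TASK: in_progress - migrate auth to OAuth2"})
recall_req = tool_call(2, "recall", {"query": "which task is currently in progress?"})
print(store_req)
```

In practice your MCP client (Claude Desktop, Cursor, etc.) builds and sends these frames for you; the sketch only shows what crosses the stdio boundary.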

---

## ⚙️ Configuration

Environment variables or CLI args:

| Arg | Env | Default | Description |
|-----|-----|---------|-------------|
| `--data-dir` | `DATA_DIR` | `./data` | DB location |
| `--model` | `EMBEDDING_MODEL` | `qwen3` | Embedding model (`qwen3`, `gemma`, `bge_m3`, `nomic`, `e5_multi`, `e5_small`) |
| `--mrl-dim` | `MRL_DIM` | *(native)* | Output dimension for MRL-supported models (e.g. 64, 128, 256, 512, or 1024 for Qwen3). Defaults to the model's native maximum dimension (1024 for Qwen3). |
| `--batch-size` | `BATCH_SIZE` | `8` | Maximum batch size for embedding inference |
| `--cache-size` | `CACHE_SIZE` | `1000` | LRU cache capacity for embeddings |
| `--timeout` | `TIMEOUT_MS` | `30000` | Timeout in milliseconds |
| `--idle-timeout` | `IDLE_TIMEOUT` | `0` | Idle timeout in minutes. 0 = disabled |
| `--log-level` | `LOG_LEVEL` | `info` | Verbosity |
| *(none)* | `HF_TOKEN` | *(none)* | Hugging Face token (ONLY required for gated models like `gemma`) |

### 🧠 Available Models

You can switch the embedding model using the `--model` arg or `EMBEDDING_MODEL` env var.

| Argument Value | HuggingFace Repo | Dimensions | Size | Use Case |
| :--- | :--- | :--- | :--- | :--- |
| `qwen3` | `Qwen/Qwen3-Embedding-0.6B` | 1024 (MRL) | 1.2 GB | **Default**. Top open-source model, 32K context, MRL support. |
| `gemma` | `onnx-community/embeddinggemma-300m-ONNX` | 768 (MRL) | ~195 MB | Lighter alternative with MRL support. (Requires a proprietary license agreement) |
| `bge_m3` | `BAAI/bge-m3` | 1024 | 2.3 GB | State-of-the-art multilingual hybrid retrieval. Heavy. |
| `nomic` | `nomic-ai/nomic-embed-text-v1.5` | 768 | 1.9 GB | High-quality long-context, BERT-compatible. |
| `e5_multi` | `intfloat/multilingual-e5-base` | 768 | 1.1 GB | Legacy; kept for backward compatibility. |
| `e5_small` | `intfloat/multilingual-e5-small` | 384 | 134 MB | Fastest, minimal RAM. Good for dev/testing. |
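
Every setting can come from either a CLI arg or an env var. The table does not state which wins when both are set; assuming the conventional precedence (CLI arg over env var over built-in default), resolution can be sketched as:

```python
import os

# Defaults taken from the configuration table above.
DEFAULTS = {
    "DATA_DIR": "./data",
    "EMBEDDING_MODEL": "qwen3",
    "BATCH_SIZE": "8",
    "CACHE_SIZE": "1000",
    "TIMEOUT_MS": "30000",
    "IDLE_TIMEOUT": "0",
    "LOG_LEVEL": "info",
}

def resolve(env_name, cli_value=None):
    """Assumed precedence: explicit CLI arg > environment variable > default."""
    if cli_value is not None:
        return cli_value
    return os.environ.get(env_name, DEFAULTS[env_name])

# e.g. `--model gemma` on the command line overrides EMBEDDING_MODEL in the environment.
model = resolve("EMBEDDING_MODEL", cli_value="gemma")
```

This precedence order is an assumption for illustration; check the server's `--help` output if the distinction matters for your setup.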
### 📉 Matryoshka Representation Learning (MRL)

Models marked with **(MRL)** support dynamically truncating the output embedding vector to a smaller dimension (e.g., 512, 256, 128) with minimal loss of accuracy. This saves database storage and speeds up vector search.

Use the `--mrl-dim` argument to specify the desired size. If omitted, the default is the model's native base dimension (e.g., 1024 for Qwen3).

**Warning:** Once your database is created with a specific dimension, you cannot change it without wiping the data directory.

### 🔒 Gated Models & Authentication (Gemma)

By default, the server uses **Qwen3**, which is fully open-source and downloads automatically without any authentication.

However, if you choose to use **Gemma** (`--model gemma`), you must authenticate because it is a "Gated Model" with a proprietary license.

To use Gemma:

1. Go to [google/embeddinggemma-300m](https://huggingface.co/google/embeddinggemma-300m) on Hugging Face.
2. Log in and click **"Agree to access repository"**.
3. Generate an Access Token at [HF Tokens](https://huggingface.co/settings/tokens) (Read access is enough).
4. Start the server with the token:

```bash
# Using environment variable
HF_TOKEN="hf_your_token_here" memory-mcp --model gemma

# Or via .env file (see .env.example)
```

> [!WARNING]
> **Changing Models & Data Compatibility**
>
> If you switch to a model with different dimensions (e.g., from `e5_small` to `e5_multi`), **your existing database will be incompatible**.
> You must delete the data directory (volume) and re-index your data.
>
> Switching between models with the same dimensions (e.g., `e5_multi` <-> `nomic`) is theoretically possible but not recommended, as semantic spaces differ.

## 🔮 Future Roadmap (Research & Ideas)

Based on analysis of advanced memory systems like [Hindsight](https://hindsight.vectorize.io/) (see their documentation for details on these mechanisms), we are exploring these "Cognitive Architecture" features for future releases:

### 1. Meta-Cognitive Reflection (Consolidation)

* **Problem:** Raw memories accumulate noise over time (e.g., 10 separate memories about fixing the same bug).
* **Solution:** Implement a `reflect` background process (or tool) that periodically scans recent memories to:
  * **De-duplicate** redundant entries.
  * **Resolve conflicts** (if two memories contradict, keep the newer one or flag for review).
  * **Synthesize** low-level facts into high-level "Insights" (e.g., "User prefers Rust over Python" derived from 5 code choices).

### 2. Temporal Decay & "Presence"

* **Problem:** Old memories can sometimes drown out current context in semantic search.
* **Solution:** Integrate **Time Decay** into the Reciprocal Rank Fusion (RRF) algorithm.
  * Give a calculated boost to recent memories for queries implying "current state".
  * Allow the agent to prioritize "working memory" over "historical archives" dynamically.
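
To make the time-decay idea concrete, here is a minimal sketch (not the server's implementation; `k=60`, the 30-day half-life, and the decay weight are illustrative choices) of folding an exponential recency boost into standard RRF scoring:

```python
import math

def rrf_time_decay(rankings, ages_days, k=60, half_life_days=30.0, w_decay=0.5):
    """Reciprocal Rank Fusion with an exponential recency boost.

    rankings:  list of ranked result lists (memory ids), one per retriever
    ages_days: mapping of memory id -> age of the memory in days
    """
    scores = {}
    # Standard RRF: each retriever contributes 1 / (k + rank).
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Recency boost: halves every `half_life_days`, scaled to the RRF range.
    for doc, age in ages_days.items():
        if doc in scores:
            scores[doc] += w_decay * (1.0 / k) * math.exp(-math.log(2) * age / half_life_days)
    return sorted(scores, key=scores.get, reverse=True)

# A fresh memory outranks an old one that tied with it in the base fusion.
order = rrf_time_decay(
    rankings=[["old", "new"], ["new", "old"]],
    ages_days={"old": 365, "new": 1},
)
print(order)  # ['new', 'old']
```

The two candidates tie under plain RRF (each is ranked first by one retriever), so the recency term alone breaks the tie in favor of the newer memory, which is exactly the "current state" behavior described above.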
### 3. Namespaced Memory Banks

* **Problem:** Running one Docker container per project is resource-heavy.
* **Solution:** Add support for `namespace` or `project_id` scoping.
  * Allows a single server instance to host isolated "Memory Banks" for different projects or agent personas.
  * Enables "switching context" without restarting the container.

### 4. Epistemic Confidence Scoring

* **Problem:** The agent treats a guess the same as a verified fact.
* **Solution:** Add a `confidence` score (0.0 - 1.0) to memory schemas.
  * Allows storing hypotheses ("I think the bug is in auth.rs", confidence: 0.3).
  * Retrieval tools can filter out low-confidence memories when answering factual questions.

---

## License

MIT