A high-performance, pure Rust Model Context Protocol (MCP) server that provides persistent, semantic, and graph-based memory for AI agents.
Works perfectly with:

- Claude Desktop
- Claude Code (CLI)
- Cline / Roo Code
- Any other MCP-compliant client
Unlike other memory solutions that require a complex stack (Python + Vector DB + Graph DB), this project is a single, self-contained executable.
It combines:

- **Semantic search** over stored memories
- A **knowledge graph** of entities and relations
- **Code indexing** for project-aware retrieval
```mermaid
graph TD
    User[AI Agent / IDE]
    subgraph "Memory MCP Server"
        MS[MCP Server]
        subgraph "Core Engines"
            ES[Embedding Service]
            GS[Graph Service]
            CS[Codebase Service]
        end
        MS -- "Store / Search" --> ES
        MS -- "Relate Entities" --> GS
        MS -- "Index" --> CS
        ES -- "Vectorize Text" --> SDB[(SurrealDB Embedded)]
        GS -- "Knowledge Graph" --> SDB
        CS -- "AST Chunks" --> SDB
    end
    User -- "MCP Protocol" --> MS
```
Memory is useless if your agent doesn't check it. To get the "Long-Term Memory" effect, you must instruct your agent to follow a strict protocol.
We provide a battle-tested Memory Protocol (AGENTS.md) that you can adapt.
The protocol implements specific flows to handle Context Window Compaction and Session Restarts:
- **Restore active tasks:** search for `TASK: in_progress` immediately. This restores the full context of what was happening before the last session ended or the context was compacted.
- **Typed prefixes:** tag memories with prefixes (`TASK:`, `DECISION:`, `RESEARCH:`) so semantic search can precisely target the right type of information, reducing noise.

These workflows turn the agent from a "stateless chatbot" into a "stateful worker" that survives restarts and context clearing.
Instead of scattering instructions across IDE-specific files (like .cursorrules), establish AGENTS.md as the Single Source of Truth.
Instruct your agent (in its base system prompt) to:
- Read `AGENTS.md` at the start of every session.

Here is a minimal reference prompt to bootstrap this behavior:
```markdown
# 🧠 Memory & Protocol
You have access to a persistent memory server and a protocol definition file.

1. **Protocol Adherence**:
   - READ `AGENTS.md` immediately upon starting.
   - Strictly follow the "Session Startup" and "Sync" protocols defined there.
2. **Context Restoration**:
   - Run `search_text("TASK: in_progress")` to restore context.
   - Do NOT ask the user "what should I do?" if a task is already in progress.
```
Without this protocol, the agent loses context after compaction or session restarts. With this protocol, it maintains the full context of the current task, ensuring no steps or details are lost, even when the chat history is cleared.
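On the wire, the context-restoration step is an ordinary MCP `tools/call` request. As a rough sketch (the JSON-RPC shape follows the MCP specification; the helper function is illustrative, not part of this server):

```python
import json

def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request as an MCP client
    would send it over stdio (one JSON object per line)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# The startup search the protocol prescribes:
msg = make_tool_call(1, "search_text", {"query": "TASK: in_progress"})
```

Your MCP client performs this call for you; the sketch only shows what "run `search_text(...)`" means at the protocol level.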
To use this MCP server with any client (Claude Code, OpenCode, Cline, etc.), use the following Docker command structure.
Key Requirements:
- `-v mcp-data:/data` — persists your graph, embeddings, and cached model weights
- `-v $(pwd):/project:ro` — allows the server to read and index your code
- `--init` — ensures the server shuts down cleanly

Tip — Model Caching: the embedding model (~1 GB) is stored in `/data/models/`. Using a named volume (`mcp-data:/data`) ensures the model is downloaded only once. Without a named volume, Docker creates a new anonymous volume on each `docker run`, causing the model to re-download every time.
Add this to your configuration file (e.g., `claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "memory": {
      "command": "docker",
      "args": [
        "run",
        "--init",
        "-i",
        "--rm",
        "--memory=3g",
        "-v", "mcp-data:/data",
        "-v", "/absolute/path/to/your/project:/project:ro",
        "ghcr.io/pomazanbohdan/memory-mcp-1file:latest"
      ]
    }
  }
}
```
Note: Replace `/absolute/path/to/your/project` with the actual path you want to index. In some environments (like Cursor or VS Code extensions), you might be able to use variables like `${workspaceFolder}`, but absolute paths are most reliable for Docker.
- Type: `stdio`
- Name: `memory`
- Command: `docker run --init -i --rm --memory=3g -v mcp-data:/data -v "/Users/yourname/projects/current:/project:ro" ghcr.io/pomazanbohdan/memory-mcp-1file:latest`

(Remember to update the project path when switching workspaces if you need code indexing.)

```bash
docker run --init -i --rm --memory=3g \
  -v mcp-data:/data \
  -v $(pwd):/project:ro \
  ghcr.io/pomazanbohdan/memory-mcp-1file:latest
```
You can run the server directly via `npx` or `bunx`. The npm package automatically downloads the correct pre-compiled binary for your platform.

Add to `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```
```bash
claude mcp add memory -- npx -y memory-mcp-1file
```
Or configure manually:

- Type: `command`
- Name: `memory`
- Command: `npx -y memory-mcp-1file`

Or add to `.cursor/mcp.json`:
```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```
Add to your MCP settings:
```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```
Or with `bunx`:

```json
{
  "mcpServers": {
    "memory": {
      "command": "bunx",
      "args": ["memory-mcp-1file"]
    }
  }
}
```
Note: Unlike Docker, `npx`/`bunx` runs the binary locally — it already has access to your filesystem, so no directory mounting is needed. To customize the data storage path, pass `--data-dir` via args: `"args": ["-y", "memory-mcp-1file", "--", "--data-dir", "/path/to/data"]`
Add to your `~/.gemini/settings.json`:

```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memory-mcp-1file"]
    }
  }
}
```
Or with Docker:
```json
{
  "mcpServers": {
    "memory": {
      "command": "docker",
      "args": [
        "run", "--init", "-i", "--rm", "--memory=3g",
        "-v", "mcp-data:/data",
        "-v", "${workspaceFolder}:/project:ro",
        "ghcr.io/pomazanbohdan/memory-mcp-1file:latest"
      ]
    }
  }
}
```
- **Semantic Search:** local embedding models (`qwen3` by default) for "vibe-based" retrieval.
- **Knowledge Graph:** entities (`User`, `Project`, `Tech`) and their relations (`uses`, `likes`). Supports PageRank-based traversal.
- **Temporal Memory:** entries carry `valid_from` and `valid_until` dates.

The server exposes 27 tools to the AI model, organized into logical categories.
**Memory Management**

| Tool | Description |
|---|---|
| `store_memory` | Store a new memory with content and optional metadata. |
| `update_memory` | Update an existing memory (only provided fields). |
| `delete_memory` | Delete a memory by its ID. |
| `list_memories` | List memories with pagination (newest first). |
| `get_memory` | Get a specific memory by ID. |
| `invalidate` | Soft-delete a memory (mark as invalid). |
| `get_valid` | Get currently active memories (filters out expired ones). |
| `get_valid_at` | Get memories that were valid at a specific past timestamp. |
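The temporal tools rest on a simple idea: each memory has a validity window, and queries filter by it. A minimal sketch (field and function names are illustrative, not the server's actual schema):

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Memory:
    content: str
    valid_from: datetime
    valid_until: Optional[datetime] = None  # None means still valid

def valid_at(memories: list, when: datetime) -> list:
    """Return memories whose validity window covers `when` —
    the idea behind get_valid_at / get_valid."""
    return [
        m for m in memories
        if m.valid_from <= when
        and (m.valid_until is None or when < m.valid_until)
    ]

mems = [
    Memory("old decision", datetime(2023, 1, 1), datetime(2024, 1, 1)),
    Memory("current decision", datetime(2024, 1, 1)),
]
hits = valid_at(mems, datetime(2024, 6, 1))  # only "current decision"
```

This is why `invalidate` is a soft delete: setting `valid_until` preserves history while excluding the entry from "current" queries.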
**Search**

| Tool | Description |
|---|---|
| `recall` | Hybrid search (Vector + Keyword + Graph). Best for general questions. |
| `search` | Pure semantic vector search. |
| `search_text` | Exact keyword match (BM25). |
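A common way to fuse vector and keyword result lists into one hybrid ranking is Reciprocal Rank Fusion (RRF). The sketch below shows the technique in general; whether this server fuses `recall` results exactly this way is not specified here (it does name RRF for `recall_code`):

```python
def rrf(rankings: list, k: float = 60.0) -> list:
    """Reciprocal Rank Fusion: each ranked list contributes
    1 / (k + rank) to a document's fused score."""
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["m7", "m2", "m9"]   # semantic search order
keyword_hits = ["m2", "m7", "m4"]  # BM25 order
fused = rrf([vector_hits, keyword_hits])  # m7 and m2 share the top
```

Documents that appear high in *both* lists dominate, which is exactly the behavior you want from a hybrid `recall`.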
**Knowledge Graph**

| Tool | Description |
|---|---|
| `create_entity` | Define a node (e.g., "React", "Authentication"). |
| `create_relation` | Link nodes (e.g., "Project" -> "uses" -> "React"). |
| `get_related` | Find connected concepts via graph traversal. |
| `detect_communities` | Detect communities in the graph using the Leiden algorithm. |
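"PageRank-based traversal" means heavily-linked entities are surfaced first. A bare-bones power-iteration PageRank over an edge list (a sketch of the general algorithm, not this server's implementation):

```python
def pagerank(edges: list, nodes: list, damping: float = 0.85,
             iters: int = 50) -> dict:
    """Plain power-iteration PageRank over a directed edge list."""
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            if out[n]:
                share = damping * rank[n] / len(out[n])
                for m in out[n]:
                    nxt[m] += share
            else:  # dangling node: spread its rank uniformly
                for m in nodes:
                    nxt[m] += damping * rank[n] / len(nodes)
        rank = nxt
    return rank

edges = [("Project", "React"), ("Project", "Auth"), ("Auth", "React")]
r = pagerank(edges, ["Project", "React", "Auth"])
# "React" receives the most in-links, so it gets the highest rank
```

When `get_related` traverses the graph, weighting hops by a score like this keeps central concepts ahead of peripheral ones.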
**Code Intelligence**

| Tool | Description |
|---|---|
| `index_project` | Scan and index a local folder for code search. |
| `get_index_status` | Check if indexing is in progress or failed. |
| `list_projects` | List all indexed projects. |
| `delete_project` | Remove a project and its code chunks from the index. |
| `search_code` | Semantic search over code chunks. |
| `recall_code` | Hybrid code search (Vector + BM25 + Symbol Graph PageRank via RRF). Best-quality code retrieval. |
| `search_symbols` | Search for functions/classes by name. |
| `get_callers` | Find functions that call a given symbol. |
| `get_callees` | Find functions called by a given symbol. |
| `get_related_symbols` | Get related symbols via graph traversal (calls, extends, implements). |
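`get_callers` and `get_callees` are two views of the same call graph: forward and reverse adjacency over (caller, callee) edges. A minimal sketch (the function names in the sample graph are made up):

```python
from collections import defaultdict

# A tiny call graph as (caller, callee) pairs — names are illustrative.
calls = [("main", "parse_args"), ("main", "run"), ("run", "parse_args")]

callees = defaultdict(set)  # forward edges: who does X call?
callers = defaultdict(set)  # reverse edges: who calls X?
for caller, callee in calls:
    callees[caller].add(callee)
    callers[callee].add(caller)

callers["parse_args"]  # {"main", "run"}
callees["main"]        # {"parse_args", "run"}
```

Keeping both index directions makes each lookup O(1) instead of a scan, which matters once the symbol graph covers a whole project.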
**System**

| Tool | Description |
|---|---|
| `get_status` | Get server health and loading status. |
| `reset_all_memory` | DANGER: wipes all data (memories, graph, code). |
Environment variables or CLI args:
| Arg | Env | Default | Description |
|---|---|---|---|
| `--data-dir` | `DATA_DIR` | `./data` | Database location |
| `--model` | `EMBEDDING_MODEL` | `qwen3` | Embedding model (`qwen3`, `gemma`, `bge_m3`, `nomic`, `e5_multi`, `e5_small`) |
| `--mrl-dim` | `MRL_DIM` | (native) | Output dimension for MRL-supported models (e.g. 64, 128, 256, 512, 1024 for Qwen3). Defaults to the model's native maximum dimension (1024 for Qwen3). |
| `--batch-size` | `BATCH_SIZE` | `8` | Maximum batch size for embedding inference |
| `--cache-size` | `CACHE_SIZE` | `1000` | LRU cache capacity for embeddings |
| `--timeout` | `TIMEOUT_MS` | `30000` | Timeout in milliseconds |
| `--idle-timeout` | `IDLE_TIMEOUT` | `0` | Idle timeout in minutes; `0` = disabled |
| `--log-level` | `LOG_LEVEL` | `info` | Verbosity |
| (none) | `HF_TOKEN` | (none) | HuggingFace token (ONLY required for gated models like `gemma`) |
You can switch the embedding model using the --model arg or EMBEDDING_MODEL env var.
| Argument Value | HuggingFace Repo | Dimensions | Size | Use Case |
|---|---|---|---|---|
| `qwen3` | Qwen/Qwen3-Embedding-0.6B | 1024 (MRL) | 1.2 GB | Default. Top open-source 2026 model, 32K context, MRL support. |
| `gemma` | onnx-community/embeddinggemma-300m-ONNX | 768 (MRL) | ~195 MB | Lighter alternative with MRL support. (Requires a proprietary license agreement.) |
| `bge_m3` | BAAI/bge-m3 | 1024 | 2.3 GB | State-of-the-art multilingual hybrid retrieval. Heavy. |
| `nomic` | nomic-ai/nomic-embed-text-v1.5 | 768 | 1.9 GB | High-quality long-context, BERT-compatible. |
| `e5_multi` | intfloat/multilingual-e5-base | 768 | 1.1 GB | Legacy; kept for backward compatibility. |
| `e5_small` | intfloat/multilingual-e5-small | 384 | 134 MB | Fastest, minimal RAM. Good for dev/testing. |
Models marked with (MRL) support dynamically truncating the output embedding vector to a smaller dimension (e.g., 512, 256, 128) with minimal loss of accuracy. This saves database storage and speeds up vector search.
Use the --mrl-dim argument to specify the desired size. If omitted, the default is the model's native base dimension (e.g., 1024 for Qwen3).
Warning: Once your database is created with a specific dimension, you cannot change it without wiping the data directory.
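MRL truncation itself is simple: keep the first `dim` components and L2-renormalize. A sketch of the general technique (not this server's internal code):

```python
import math

def mrl_truncate(embedding: list, dim: int) -> list:
    """Matryoshka-style truncation: keep the first `dim` components,
    then L2-renormalize so cosine similarity stays meaningful."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]

# A 1024-d vector cut to 256 dims: 4x less storage per memory.
small = mrl_truncate([0.03125] * 1024, 256)
```

This is why the dimension is fixed at database-creation time: every stored vector must have been truncated to the same length for distance comparisons to make sense.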
By default, the server uses Qwen3, which is fully open-source and downloads automatically without any authentication.
However, if you choose to use Gemma (--model gemma), you must authenticate because it is a "Gated Model" with a proprietary license.
To use Gemma:
```bash
# Using an environment variable
HF_TOKEN="hf_your_token_here" memory-mcp --model gemma

# Or via a .env file (see .env.example)
```
Warning: Changing Models & Data Compatibility

If you switch to a model with different dimensions (e.g., from `e5_small` to `e5_multi`), your existing database will be incompatible. You must delete the data directory (volume) and re-index your data.

Switching between models with the same dimensions (e.g., `e5_multi` and `nomic`) is theoretically possible but not recommended, as their semantic spaces differ.
Based on analysis of advanced memory systems like Hindsight (see their documentation for details on these mechanisms), we are exploring these "Cognitive Architecture" features for future releases:
- A `reflect` background process (or tool) that periodically scans recent memories.
- `namespace` or `project_id` scoping.
- A `confidence` score (0.0 – 1.0) added to memory schemas.
MIT