Add this skill:

```bash
npx mdskills install doobidoo/mcp-memory-service
```

Comprehensive memory persistence backend with REST API, knowledge graphs, and multi-agent support.
Open-source memory backend for multi-agent systems. Agents store decisions, share causal knowledge graphs, and retrieve context in 5ms — without cloud lock-in or API costs.
Works with LangGraph · CrewAI · AutoGen · any HTTP client · Claude Desktop
| Without mcp-memory-service | With mcp-memory-service |
|---|---|
| Each agent run starts from zero | Agents retrieve prior decisions in 5ms |
| Memory is local to one graph/run | Memory is shared across all agents and runs |
| You manage Redis + Pinecone + glue code | One self-hosted service, zero cloud cost |
| No causal relationships between facts | Knowledge graph with typed edges (causes, fixes, contradicts) |
| Context window limits create amnesia | Autonomous consolidation compresses old memories |
Key capabilities for agent pipelines:

- Framework-agnostic REST API — 15 endpoints, no MCP client library needed
- Knowledge graph — agents share causal knowledge with typed edges
- `X-Agent-ID` header — auto-tag memories by agent identity for scoped retrieval
- `conversation_id` — bypass deduplication for incremental conversation storage

```bash
pip install mcp-memory-service
MCP_ALLOW_ANONYMOUS_ACCESS=true memory server --http
# REST API running at http://localhost:8000
```
```python
import asyncio

import httpx

BASE_URL = "http://localhost:8000"

async def main():
    async with httpx.AsyncClient() as client:
        # Store — auto-tag with X-Agent-ID header
        await client.post(f"{BASE_URL}/api/memories", json={
            "content": "API rate limit is 100 req/min",
            "tags": ["api", "limits"],
        }, headers={"X-Agent-ID": "researcher"})
        # Stored with tags: ["api", "limits", "agent:researcher"]

        # Search — scope to a specific agent (same client session)
        results = await client.post(f"{BASE_URL}/api/memories/search", json={
            "query": "API rate limits",
            "tags": ["agent:researcher"],
        })
        print(results.json()["memories"])

asyncio.run(main())
```
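For incremental conversation saves, the `conversation_id` capability above lets agents opt out of semantic deduplication. A minimal sketch of the store payload; the field name comes from the capability list, but its exact placement in the JSON body is an assumption here:

```python
def incremental_memory_payload(turn: str, conversation_id: str) -> dict:
    """Build a store payload that opts out of semantic deduplication.

    Assumption: conversation_id is passed as a top-level field in the
    /api/memories body; verify against the REST API docs.
    """
    return {
        "content": turn,
        "conversation_id": conversation_id,
        "tags": ["conversation"],
    }

# POST each turn with the same conversation_id so near-duplicate
# incremental updates are all stored, e.g. with the client above:
#   await client.post(f"{BASE_URL}/api/memories",
#                     json=incremental_memory_payload(turn, "conv-42"))
```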
Framework-specific guides: docs/agents/
| | Mem0 | Zep | DIY Redis+Pinecone | mcp-memory-service |
|---|---|---|---|---|
| License | Proprietary | Enterprise | — | Apache 2.0 |
| Cost | Per-call API | Enterprise | Infra costs | $0 |
| Framework integration | SDK | SDK | Manual | REST API (any HTTP client) |
| Knowledge graph | No | Limited | No | Yes (typed edges) |
| Auto consolidation | No | No | No | Yes (decay + compression) |
| On-premise embeddings | No | No | Manual | Yes (ONNX, local) |
| Privacy | Cloud | Cloud | Partial | 100% local |
| Hybrid search | No | Yes | Manual | Yes (BM25 + vector) |
| MCP protocol | No | No | No | Yes |
| REST API | Yes | Yes | Manual | Yes (15 endpoints) |
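The "decay + compression" consolidation row above can be pictured as an exponential-decay relevance score. This is an illustrative sketch only, not the service's actual algorithm; `half_life_days` and the access bonus are invented parameters:

```python
import math

def relevance(age_days: float, access_count: int,
              half_life_days: float = 30.0) -> float:
    """Toy consolidation score: old, rarely accessed memories decay toward 0."""
    # Exponential decay: score halves every half_life_days
    decay = math.exp(-math.log(2) * age_days / half_life_days)
    # Frequent access slows forgetting (logarithmic bonus)
    bonus = math.log1p(access_count)
    return decay * (1.0 + bonus)

fresh = relevance(0, 0)    # base score 1.0
month = relevance(30, 0)   # halved after one half-life: 0.5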
Your AI assistant forgets everything when you start a new chat. After 50 tool uses, context explodes to 500k+ tokens—Claude slows down, you restart, and now it remembers nothing. You spend 10 minutes re-explaining your architecture. Again.
MCP Memory Service solves this.
It automatically captures your project context, architecture decisions, and code patterns. When you start fresh sessions, your AI already knows everything—no re-explaining, no context loss, no wasted time.

Technical showcase: Performance, Architecture, AI/ML Intelligence & Developer Experience
LangGraph · CrewAI · AutoGen · Any HTTP Client · OpenClaw/Nanobot · Custom Pipelines
Claude Code · Gemini Code Assist · Aider · GitHub Copilot CLI · Amp · Continue · Zed · Cody
Claude Desktop · VS Code · Cursor · Windsurf · Raycast · JetBrains · Sourcegraph · Qodo
ChatGPT (Developer Mode) · Claude Web
Works seamlessly with any MCP-compatible client or HTTP client, whether you're building agent pipelines or coding in the terminal, an IDE, or the browser.
💡 NEW: ChatGPT now supports MCP! Enable Developer Mode to connect your memory service directly. See setup guide →
Express Install (recommended for most users):

```bash
pip install mcp-memory-service

# Auto-configure for Claude Desktop (macOS/Linux)
python -m mcp_memory_service.scripts.installation.install --quick
```
What just happened?
Next: Restart Claude Desktop. Your AI now remembers everything across sessions.
📦 Alternative: PyPI + Manual Configuration

```bash
pip install mcp-memory-service
```
Then add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS):

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory",
      "args": ["server"]
    }
  }
}
```
🔧 Advanced: Custom Backends & Team Setup
For production deployments, team collaboration, or cloud sync:
```bash
git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
python scripts/installation/install.py
```
Choose from:
Without MCP Memory Service:

| Session 1 | Session 2 (Fresh Start) |
|---|---|
| You: "We're building a Next.js app with Prisma and tRPC" | AI: "What's your tech stack?" ❌ |
| AI: "Got it, I see you're using App Router" | You: Explains architecture again for 10 minutes 😤 |
| You: "Add authentication with NextAuth" | AI: "Should I use Pages Router or App Router?" ❌ |
With MCP Memory Service:

| Session 1 | Session 2 (Fresh Start) |
|---|---|
| You: "We're building a Next.js app with Prisma and tRPC" | AI: "I remember—Next.js App Router with Prisma and tRPC. What should we build?" ✅ |
| AI: "Got it, I see you're using App Router" | You: "Add OAuth login" |
| You: "Add authentication with NextAuth" | AI: "I'll integrate NextAuth with your existing Prisma setup." ✅ |
Result: Zero re-explaining. Zero context loss. Just continuous, intelligent collaboration.
MCP Memory Service is fully compatible with the SHODH Unified Memory API Specification v1.0.0, enabling seamless interoperability across the SHODH ecosystem.
| Implementation | Backend | Embeddings | Use Case |
|---|---|---|---|
| shodh-memory | RocksDB | MiniLM-L6-v2 (ONNX) | Reference implementation |
| shodh-cloudflare | Cloudflare Workers + Vectorize | Workers AI (bge-small) | Edge deployment, multi-device sync |
| mcp-memory-service (this) | SQLite-vec / Hybrid | MiniLM-L6-v2 (ONNX) | Desktop AI assistants (MCP) |
All SHODH implementations share the same memory schema:
- `emotion`, `emotional_valence`, `emotional_arousal`
- `episode_id`, `sequence_number`, `preceding_memory_id`
- `source_type`, `credibility`
- `quality_score`, `access_count`, `last_accessed_at`

Interoperability Example: Export memories from mcp-memory-service → Import to shodh-cloudflare → Sync across devices → Full fidelity preservation of `emotional_valence`, `episode_id`, and all spec fields.
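A SHODH-style memory record carrying the shared fields above might look like the following. Field values (and the assumed value ranges) are illustrative; consult the spec for exact types:

```python
from datetime import datetime, timezone

# Illustrative record using the shared SHODH schema fields listed above.
memory = {
    "content": "Switched auth to OAuth 2.1 after token-leak incident.",
    # Emotional context
    "emotion": "concern",
    "emotional_valence": -0.4,   # negative..positive, range assumed [-1, 1]
    "emotional_arousal": 0.7,    # calm..activated, range assumed [0, 1]
    # Episodic linkage
    "episode_id": "ep-auth-migration",
    "sequence_number": 3,
    "preceding_memory_id": "mem-0042",
    # Provenance
    "source_type": "conversation",
    "credibility": 0.9,
    # Usage statistics
    "quality_score": 0.85,
    "access_count": 12,
    "last_accessed_at": datetime.now(timezone.utc).isoformat(),
}

spec_fields = {"emotion", "emotional_valence", "emotional_arousal",
               "episode_id", "sequence_number", "preceding_memory_id",
               "source_type", "credibility",
               "quality_score", "access_count", "last_accessed_at"}
assert spec_fields <= memory.keys()
```

Because every implementation shares this schema, a record exported from one backend can be imported by another without dropping fields.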
🧠 Persistent Memory – Context survives across sessions with semantic search
🔍 Smart Retrieval – Finds relevant context automatically using AI embeddings
⚡ 5ms Speed – Instant context injection, no latency
🔄 Multi-Client – Works across 13+ AI applications
☁️ Cloud Sync – Optional Cloudflare backend for team collaboration
🔒 Privacy-First – Local-first, you control your data
📊 Web Dashboard – Visualize and manage memories at http://localhost:8000
🧬 Knowledge Graph – Interactive D3.js visualization of memory relationships 🆕
8 Dashboard Tabs: Dashboard • Search • Browse • Documents • Manage • Analytics • Quality (NEW) • API Docs
📖 See Web Dashboard Guide for complete documentation.
Security: Fix minimatch ReDoS and Replace Abandoned PyPDF2 with pypdf
What's New:
- Upgraded `minimatch` to ^10.2.1 in npm test packages, eliminating a ReDoS attack vector.
- Replaced the abandoned `PyPDF2` with its official successor `pypdf`; no functional change to PDF ingestion.

Previous Releases:
- `MCP_INIT_TIMEOUT` env override, 7 unit tests
- `validate_config()` at startup, `safe_get_int_env()`, 8 new robustness tests
- `conversation_id` support for incremental conversation saves (semantic dedup bypass, metadata storage, all backends)
- `memory server --http` flag (easier UX, single command)

Use `--with-ml` for full ML capabilities. For resource-constrained environments (CI/CD, edge devices):
```bash
pip install mcp-memory-service-lite
```
Benefits:
Trade-offs:
```bash
# For local development/single-user: Enable anonymous access
export MCP_ALLOW_ANONYMOUS_ACCESS=true

# Start HTTP dashboard server (separate from MCP server)
memory server --http

# Access interactive dashboard
open http://127.0.0.1:8000/

# Upload documents via CLI
curl -X POST http://127.0.0.1:8000/api/documents/upload \
  -F "file=@document.pdf" \
  -F "tags=documentation,reference"

# Search document content
curl -X POST http://127.0.0.1:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication flow", "limit": 10}'
```
⚠️ Authentication Required: The HTTP dashboard requires authentication by default. For local development, set `MCP_ALLOW_ANONYMOUS_ACCESS=true`. For production, use API key authentication (`MCP_API_KEY`) or OAuth. See Configuration for details.
```bash
# Start OAuth-enabled HTTP server for team collaboration
export MCP_OAUTH_ENABLED=true
memory server --http

# Claude Code team members connect via HTTP transport
claude mcp add --transport http memory-service http://your-server:8000/mcp
# → Automatic OAuth discovery, registration, and authentication
```
```bash
# Store a memory
uv run memory store "Fixed race condition in authentication by adding mutex locks"

# Search for relevant memories (hybrid search - default in v10.8.0+)
uv run memory recall "authentication race condition"

# Use hybrid search via HTTP API for exact match + semantic
curl -X POST http://127.0.0.1:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "OAuth 2.1 authentication",
    "mode": "hybrid",
    "limit": 10
  }'

# Search by tags
uv run memory search --tags python debugging

# Check system health (shows OAuth status)
uv run memory health
```
Recommended approach - Add to your Claude Desktop config (`~/.claude/config.json`):

```json
{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["-m", "mcp_memory_service.server"],
      "env": {
        "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec"
      }
    }
  }
}
```
Alternative approaches:

Option 1: UV tooling (if using UV)

```json
{
  "mcpServers": {
    "memory": {
      "command": "uv",
      "args": ["--directory", "/path/to/mcp-memory-service", "run", "memory", "server"],
      "env": {
        "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec"
      }
    }
  }
}
```

Option 2: Direct script path (v6.17.0+)

```json
{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["/path/to/mcp-memory-service/scripts/server/run_memory_server.py"],
      "env": {
        "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec"
      }
    }
  }
}
```
Hybrid Backend (v8.9.0+ RECOMMENDED):

```bash
# Hybrid backend with auto-configured pragmas
export MCP_MEMORY_STORAGE_BACKEND=hybrid
export MCP_MEMORY_SQLITE_PRAGMAS="busy_timeout=15000,cache_size=20000"

# Cloudflare credentials (required for hybrid)
export CLOUDFLARE_API_TOKEN="your-token"
export CLOUDFLARE_ACCOUNT_ID="your-account"
export CLOUDFLARE_D1_DATABASE_ID="your-db-id"
export CLOUDFLARE_VECTORIZE_INDEX="mcp-memory-index"

# Enable HTTP API
export MCP_HTTP_ENABLED=true
export MCP_HTTP_PORT=8000

# Security (choose one authentication method)
# Option 1: API Key authentication (recommended for production)
export MCP_API_KEY="your-secure-key"
# Option 2: Anonymous access (local development only)
# export MCP_ALLOW_ANONYMOUS_ACCESS=true
# Option 3: OAuth team collaboration
# export MCP_OAUTH_ENABLED=true
```
SQLite-vec Only (Local):

```bash
# Local-only storage
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
export MCP_MEMORY_SQLITE_PRAGMAS="busy_timeout=15000,cache_size=20000"
```
Hybrid Search (v10.8.0+):

```bash
# Enable hybrid BM25 + vector search (default: enabled)
export MCP_HYBRID_SEARCH_ENABLED=true

# Configure score fusion weights (must sum to ~1.0)
export MCP_HYBRID_KEYWORD_WEIGHT=0.3   # BM25 keyword match weight
export MCP_HYBRID_SEMANTIC_WEIGHT=0.7  # Vector similarity weight

# Adjust weights based on your use case:
# - More keyword-focused: 0.5 keyword / 0.5 semantic
# - More semantic-focused: 0.2 keyword / 0.8 semantic
# - Default balanced: 0.3 keyword / 0.7 semantic (recommended)
```
Note: Hybrid search is only available with the `sqlite_vec` and `hybrid` backends. It automatically combines BM25 keyword matching with vector similarity for better exact match scoring while maintaining semantic capabilities.
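The weighted fusion above can be sketched as a convex combination of the two normalized scores. This is illustrative; the service's internal fusion may normalize or rank differently:

```python
def fuse(bm25_score: float, vector_score: float,
         keyword_weight: float = 0.3, semantic_weight: float = 0.7) -> float:
    """Combine normalized BM25 and vector-similarity scores.

    Defaults mirror MCP_HYBRID_KEYWORD_WEIGHT / MCP_HYBRID_SEMANTIC_WEIGHT;
    both input scores are assumed normalized to [0, 1].
    """
    assert abs(keyword_weight + semantic_weight - 1.0) < 1e-6, \
        "weights must sum to ~1.0"
    return keyword_weight * bm25_score + semantic_weight * vector_score

# An exact keyword hit with a weak semantic match still ranks reasonably:
fuse(1.0, 0.2)   # 0.3*1.0 + 0.7*0.2 = 0.44
```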
Control maximum response size to prevent context overflow:
```bash
# Limit response size (recommended: 30000-50000)
export MCP_MAX_RESPONSE_CHARS=50000  # Default: unlimited
```
Applies to all retrieval tools:

- `retrieve_memory`, `recall_memory`, `retrieve_with_quality_boost`
- `search_by_tag`, `recall_by_timeframe`

Behavior:
Use external embedding services instead of running models locally:
```bash
# vLLM example
export MCP_EXTERNAL_EMBEDDING_URL=http://localhost:8890/v1/embeddings
export MCP_EXTERNAL_EMBEDDING_MODEL=nomic-ai/nomic-embed-text-v1.5

# Ollama example
export MCP_EXTERNAL_EMBEDDING_URL=http://localhost:11434/v1/embeddings
export MCP_EXTERNAL_EMBEDDING_MODEL=nomic-embed-text

# OpenAI example
export MCP_EXTERNAL_EMBEDDING_URL=https://api.openai.com/v1/embeddings
export MCP_EXTERNAL_EMBEDDING_MODEL=text-embedding-3-small
export MCP_EXTERNAL_EMBEDDING_API_KEY=sk-xxx
```
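All three services above expose an OpenAI-compatible `/v1/embeddings` endpoint. A minimal stdlib client sketch, reading the env vars from the examples; the request and response shapes follow the OpenAI embeddings API:

```python
import json
import os
import urllib.request

def extract_embeddings(response_body: dict) -> list[list[float]]:
    """Pull vectors out of an OpenAI-style embeddings response body."""
    return [item["embedding"] for item in response_body["data"]]

def embed(texts: list[str]) -> list[list[float]]:
    """POST texts to the endpoint configured by the env vars above."""
    payload = json.dumps({
        "model": os.environ["MCP_EXTERNAL_EMBEDDING_MODEL"],
        "input": texts,
    }).encode()
    req = urllib.request.Request(
        os.environ["MCP_EXTERNAL_EMBEDDING_URL"],
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    if api_key := os.environ.get("MCP_EXTERNAL_EMBEDDING_API_KEY"):
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req) as resp:
        return extract_embeddings(json.load(resp))
```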
Benefits:
Note: Only supported with the `sqlite_vec` backend. See `docs/deployment/external-embeddings.md` for detailed setup.
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   AI Clients    │    │  MCP Memory     │    │ Storage Backend │
│                 │    │  Service v8.9   │    │                 │
│ • Claude Desktop│◄──►│ • MCP Protocol  │◄──►│ • Hybrid 🌟     │
│ • Claude Code   │    │ • HTTP Transport│    │   (5ms local +  │
│   (HTTP/OAuth)  │    │ • OAuth 2.1 Auth│    │    cloud sync)  │
│ • VS Code       │    │ • Memory Store  │    │ • SQLite-vec    │
│ • Cursor        │    │ • Semantic      │    │ • Cloudflare    │
│ • 13+ AI Apps   │    │   Search        │    │                 │
│ • Web Dashboard │    │ • Doc Ingestion │    │ Zero DB Locks ✅│
│   (Port 8000)   │    │ • Zero DB Locks │    │ Auto-Config ✅  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```
```
mcp-memory-service/
├── src/mcp_memory_service/  # Core application
│   ├── models/              # Data models
│   ├── storage/             # Storage backends
│   ├── web/                 # HTTP API & dashboard
│   └── server.py            # MCP server
├── scripts/                 # Utilities & installation
├── tests/                   # Test suite
└── tools/docker/            # Docker configuration
```
See CONTRIBUTING.md for detailed guidelines.
- Run `python scripts/validation/validate_configuration_complete.py` to check your setup

Real-world metrics from active deployments:
Important (Version 8.64.0+):
Best practices:
- `scripts/maintenance/` - Auto-retagging and cleanup tools

Install via CLI
MCP Memory Service is a free, open-source AI agent skill. Install it with a single command:

```bash
npx mdskills install doobidoo/mcp-memory-service
```

This downloads the skill files into your project and your AI agent picks them up automatically.
MCP Memory Service works with Claude Code, Claude Desktop, Cursor, VS Code Copilot, Windsurf, Continue.dev, Gemini CLI, Amp, Roo Code, and Goose. Skills use the open SKILL.md format, which is compatible with any AI coding agent that reads markdown instructions.