Add this skill: `npx mdskills install Vektor-Memory/vektor-memory`

Hardware-accelerated local memory system with comprehensive MCP tools, strong benchmarks, and excellent documentation.
# vektor-slipstream

Hardware-accelerated persistent memory for AI agents. Local-first. No cloud. One-time payment.

[npm](https://www.npmjs.com/package/vektor-slipstream) · [Pricing](https://vektormemory.com/product#pricing)

**66.9% on LoCoMo benchmark (adjusted). Under 1ms retrieval. Zero cloud dependency.**

---

## Install

```bash
npm install vektor-slipstream
npx vektor setup
```

## Quick Start

```js
const { createMemory } = require('vektor-slipstream');

// CommonJS has no top-level await, so the calls run inside an async function.
async function main() {
  const memory = await createMemory({
    agentId: 'my-agent',
    licenceKey: process.env.VEKTOR_LICENCE_KEY,
  });

  // Store a memory
  await memory.remember('User prefers TypeScript over JavaScript');

  // Recall by semantic similarity: sub-1ms, fully local
  const results = await memory.recall('coding preferences', 5);
  // → [{ content, score, id }]

  // Traverse the MAGMA graph
  const graph = await memory.graph('TypeScript', { hops: 2 });

  // What changed in 7 days?
  const delta = await memory.delta('project decisions', 7);

  // Morning briefing
  const brief = await memory.briefing();

  // Graph stats
  const stats = memory.graphStats();
  // → { nodes, edges, entities }
}

main().catch(console.error);
```

---

## What's New in v1.5.0

**Retrieval pipeline rebuilt from scratch.**

- bge-small-en-v1.5 bi-encoder + ms-marco cross-encoder reranker (spec-decode architecture)
- BM25 + Porter-stemmed BM25 + named entity injection, fused via RRF
- MAGMA graph layer: co-occurrence and temporal edges between entities in SQLite
- Persistent entity index (`vektor_entities`) for guaranteed named-entity recall
- Foresight extraction: future-tense statements stored for temporal queries
- Question type classifier: routes single-hop vs multi-hop to optimal retrieval path
- ADD-only contradiction detection: conflicting facts survive with timestamps (no silent deletes)
- Agentic sufficiency check: reformulates query if key entities missing from top results

**LoCoMo benchmark results (conv 0, 154 valid questions):**

| Category | Judge Accuracy |
|---|---|
| Multi-hop | 79.1% |
| Adversarial | 70.4% |
| Temporal | 46.2% |
| Single-hop | 51.6% |
| **Adjusted total** | **66.9%** |

Under 1ms retrieval latency with zero cloud API calls at query time.

---

## CLI Chat: Persistent Memory Terminal

Chat with any LLM with full memory across every session. Zero configuration.

```bash
npx vektor chat                          # start chat (auto-detects Ollama)
npx vektor chat --provider claude        # use Anthropic Claude
npx vektor chat --provider groq --model llama-3.3-70b-versatile
npx vektor chat --provider gemini
npx vektor chat --provider openai
```

### Providers

| Provider | Details |
|---|---|
| `ollama` | Default: free, local, no API key. Auto-detects best installed model. |
| `claude` | Anthropic Claude: set `ANTHROPIC_API_KEY` |
| `openai` | OpenAI GPT: set `OPENAI_API_KEY` |
| `groq` | Groq LLaMA: set `GROQ_API_KEY` (free tier available) |
| `gemini` | Google Gemini: set `GEMINI_API_KEY` |

Set a default provider for the current shell session (add the line to your shell profile to make it permanent):

```bash
# Windows (PowerShell)
$env:VEKTOR_PROVIDER = "claude"

# macOS/Linux
export VEKTOR_PROVIDER=claude
```

### In-chat commands

Type `/` to see available commands with autocomplete. Press Tab to select and use the arrow keys to navigate.

| Command | Action |
|---|---|
| `/recall <query>` | Search MAGMA memory mid-conversation |
| `/stats` | Show memory node count, edges, pinned |
| `/briefing` | Generate memory briefing inline |
| `/exit` | Exit chat (Ctrl+C also works) |

### One-liner commands

```bash
# Store a fact
npx vektor remember "I prefer TypeScript over JavaScript"
npx vektor remember "deadline is Friday" --importance 5

# Pipe support
cat meeting-notes.txt | npx vektor remember

# One-shot recall + LLM answer
npx vektor ask "what stack am I using?"
npx vektor ask "what did we decide about the database?"

# Autonomous goal executor
npx vektor agent "summarise everything I know about project Alpha"
npx vektor agent "research AI memory tools" --steps 15 --provider groq
```

### Ollama auto-detection

VEKTOR queries `http://localhost:11434/api/tags` and picks the best available model:
`qwen3` → `qwen2` → `llama` → `mistral` → first available.

Override:

```bash
# Windows (PowerShell)
$env:OLLAMA_MODEL = "qwen3.5:4b"

# macOS/Linux
export OLLAMA_MODEL=qwen3.5:4b
```

---

## All CLI Commands

```bash
npx vektor setup      # First-run wizard: licence, hardware, integrations
npx vektor activate   # Activate licence key on this machine
npx vektor test       # Test memory engine with progress bar
npx vektor status     # System health check
npx vektor mcp        # Start Claude Desktop MCP server
npx vektor rem        # Run REM dream cycle
npx vektor chat       # Persistent memory chat (all LLMs)
npx vektor remember   # Store a fact
npx vektor ask        # Query memory + LLM answer
npx vektor agent      # Autonomous goal executor
npx vektor help       # All commands
```

---

## Claude Desktop Extension (DXT)

Install the `.dxt` extension for zero-config memory in every Claude Desktop session.

**Install:** drag `vektor-slipstream.dxt` onto the Claude Desktop Extensions page.

Once installed, Claude automatically:
- Recalls relevant context at session start
- Stores facts and decisions during conversation
- Summarises at session end

All 44 tools are available in Claude Desktop, with no configuration needed beyond your licence key.

**User config fields:**

| Field | Purpose |
|---|---|
| `licence_key` | Your Polar licence key (required) |
| `db_path` | Memory DB path (defaults to `~/vektor-slipstream-memory.db`) |
| `project_path` | Default path for `cloak_cortex` project scanning (optional) |

Download the latest `.dxt` from [vektormemory.com/docs/dxt](https://vektormemory.com/docs/dxt).

---

## MCP Tools: All 44

### Memory Tools

| Tool | Function |
|---|---|
| `vektor_recall` | Semantic + BM25 + graph search across MAGMA memory |
| `vektor_recall_rrf` | BM25+RRF dual-channel recall with cross-encoder rerank |
| `vektor_store` | Store memory with importance score |
| `vektor_ingest` | Batch ingest conversation turns with session date |
| `vektor_graph` | Traverse associative memory graph |
| `vektor_delta` | See what changed on a topic over time |
| `vektor_briefing` | Generate morning briefing from recent memories |
| `vektor_stats` | Memory DB stats: node count, edges, entities |
| `vektor_graph_stats` | MAGMA graph node/edge/entity counts |
| `vektor_timeline` | Query memories by date range |

### CLOAK Core

| Tool | Function |
|---|---|
| `cloak_fetch` | Stealth headless browser fetch via Playwright |
| `cloak_fetch_smart` | Checks `llms.txt` first, falls back to stealth browser |
| `cloak_render` | Full CSS/DOM layout sensor |
| `cloak_diff` | Semantic diff of URL since last fetch |
| `cloak_diff_text` | Structural diff between two text blobs |
| `cloak_passport` | AES-256-GCM credential vault (get/set/delete/list) |
| `cloak_ssh_exec` | Execute commands on remote server via SSH |
| `cloak_ssh_upload` | Upload file to remote server via SFTP |
| `tokens_saved` | Token efficiency ROI calculator |

### Identity Tools

| Tool | Function |
|---|---|
| `cloak_identity_create` | Create persistent browser fingerprint identity |
| `cloak_identity_use` | Apply saved identity to a fetch call |
| `cloak_identity_list` | List all saved identities with trust summary |

### Behaviour Tools

| Tool | Function |
|---|---|
| `cloak_inject_behaviour` | Human mouse/scroll injection for reCAPTCHA/Cloudflare bypass |
| `cloak_behaviour_stats` | List available patterns and categories |
| `cloak_load_pattern` | Load custom recorded behaviour pattern |
| `cloak_pattern_stats` | Self-improving pattern store tier breakdown |
| `cloak_pattern_list` | List patterns with scores and tier |
| `cloak_pattern_prune` | Remove stale/low-scoring patterns |
| `cloak_pattern_seed` | Seed store with built-in patterns |

### CAPTCHA Tools

| Tool | Function |
|---|---|
| `cloak_detect_captcha` | Detect CAPTCHA type and sitekey |
| `cloak_solve_captcha` | Solve via vision AI (Claude/GPT-4o/2captcha) |

### Compression and Cortex Tools

| Tool | Function |
|---|---|
| `turbo_quant_compress` | PolarQuant vector compression (~75% smaller) |
| `turbo_quant_stats` | Compression ratio and savings stats |
| `cloak_cortex` | Scan project directory into MAGMA entity graph |
| `cloak_cortex_anatomy` | Get cached file anatomy without rescanning |

### Multimodal Tools

| Tool | Function |
|---|---|
| `vektor_text` | Text generation across providers (OpenAI/Claude/Groq/Gemini/NVIDIA NIM) |
| `vektor_image` | Image generation (DALL-E, Stability, NVIDIA) |
| `vektor_vision` | Image understanding and analysis |
| `vektor_speech` | Text-to-speech and transcription |
| `vektor_search` | Web search with memory integration |
| `vektor_providers` | List available multimodal providers and status |

### Agent Tools

| Tool | Function |
|---|---|
| `vektor_agent_run` | Run autonomous goal executor with memory |
| `vektor_swarm` | Launch multi-agent swarm task |
| `vektor_watch` | File system watcher: auto-ingest on change |

---

## Claude Code Setup

Add to `.claude/settings.json` in your project:

```json
{
  "mcpServers": {
    "vektor": {
      "command": "node",
      "args": ["/path/to/node_modules/vektor-slipstream/index.js"],
      "env": {
        "VEKTOR_LICENCE_KEY": "your-licence-key",
        "CLOAK_PROJECT_PATH": "/path/to/your/project"
      }
    }
  }
}
```

All 44 tools are available in Claude Code via this config.

---

## What's Included

### Memory Core (MAGMA)

- 4-layer associative graph: semantic, causal, temporal, entity
- MAGMA graph bridge: co-occurrence and temporal edges in SQLite (`vektor-magma-bridge.js`)
- bge-small-en-v1.5 bi-encoder + ms-marco cross-encoder reranker (`vektor-embedder.js`)
- BM25 + stemmed BM25 + RRF fusion: keyword + semantic dual-channel recall
- Persistent entity index: guaranteed named-entity retrieval
- Foresight extraction: future-tense statements stored with temporal metadata
- ADD-only contradiction detection: full history preserved, no silent overwrites
- AUDN curation loop: zero contradictions, zero duplicates
- REM dream cycle: up to 50:1 compression
- Sub-1ms recall: local SQLite, no network required
- Local ONNX embeddings: $0 embedding cost, no API key required

### Integrations

- **Claude Desktop**: DXT extension, 44 tools, auto-memory system prompt
- **Claude Code**: MCP server, all 44 tools
- **CLI**: `chat`, `remember`, `ask`, `agent` commands
- **LangChain**: v1 + v2 adapter included
- **OpenAI Agents SDK**: drop-in integration
- **Gemini · Groq · Ollama · NVIDIA NIM**: provider agnostic

---

## Performance

| Metric | Value |
|---|---|
| Recall latency | sub-1ms (local SQLite + ONNX) |
| Embedding cost | $0, fully local ONNX |
| Embedding latency | ~10ms GPU / ~25ms CPU |
| LoCoMo benchmark | 66.9% adjusted judge accuracy |
| vs Mem0 | Beats Mem0's older algorithm (62.47%) |
| First run | ~2 min (downloads ~25MB model once) |
| Subsequent boots | <100ms |

## Hardware Auto-Detection

Zero config. VEKTOR detects and uses the best available accelerator:

- **NVIDIA CUDA**: GPU acceleration
- **Apple Silicon**: CoreML
- **CPU**: optimised fallback, works everywhere

---

## Environment Variables

| Variable | Default | Purpose |
|---|---|---|
| `VEKTOR_SUMMARIZE` | `false` | Enable LLM session summarization on ingest |
| `VEKTOR_TRIPLES` | `true` | Enable batch triple extraction on ingest |
| `VEKTOR_FORESIGHT` | `true` | Extract future-tense foresight signals |
| `VEKTOR_TEMPORAL` | `true` | Enable temporal index and date boosting |
| `VEKTOR_CONTRADICT` | `true` | Enable ADD-only contradiction detection |
| `VEKTOR_DEBUG` | (unset) | Enable verbose retrieval debug output |
| `VEKTOR_MODEL` | `Xenova/bge-small-en-v1.5` | Swap embedding model (e.g. bge-large for higher accuracy) |
| `VEKTOR_RERANK` | `true` | Enable cross-encoder reranking |

---

## Licence

Commercial licence granted.
Monthly fee, all updates included.

| Plan | Price | Licences |
|---|---|---|
| Solo | $9/mo | 3 |
| Team | $35/mo | 5 |
| Studio | $59/mo | 10 |
| Enterprise | $99/mo | 25 |

- Purchase: [vektormemory.com/product#pricing](https://vektormemory.com/product#pricing)
- Docs: [vektormemory.com/docs](https://vektormemory.com/docs)
- Support: hello@vektormemory.com

---

## Research

Built on peer-reviewed research:

- [MAGMA (arxiv:2601.03236)](https://arxiv.org/abs/2601.03236): Multi-Graph Agentic Memory Architecture
- [EverMemOS (arxiv:2601.02163)](https://arxiv.org/abs/2601.02163): Self-Organizing Memory OS
- [HippoRAG (arxiv:2405.14831)](https://arxiv.org/abs/2405.14831): Neurobiologically Inspired Long-Term Memory (NeurIPS 2024)
- [Mem0 (arxiv:2504.19413)](https://arxiv.org/abs/2504.19413): Production-Ready Agent Memory
- [LoCoMo Benchmark](https://arxiv.org/abs/2402.17753): Long-Context Conversational Memory evaluation
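
The README says the BM25, stemmed BM25, and semantic channels are "fused via RRF" (Reciprocal Rank Fusion) but does not show the step. The sketch below illustrates the general technique only; it is not taken from the package source. The function name `rrfFuse`, the sample memory IDs, and the constant `k = 60` (a value commonly used for RRF) are all assumptions made for illustration.

```javascript
// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank) to a
// document's score, where rank is the 1-based position in that list.
// Documents missing from a list simply receive no contribution from it.
function rrfFuse(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical ranked ID lists, one per retrieval channel.
const dense = ['m3', 'm1', 'm7'];        // bi-encoder similarity
const bm25 = ['m1', 'm3', 'm9'];         // plain BM25
const stemmedBm25 = ['m1', 'm9', 'm3'];  // Porter-stemmed BM25

const fused = rrfFuse([dense, bm25, stemmedBm25]);
console.log(fused.map((r) => r.id));
// m1 comes first: it sits at or near the top of all three lists.
```

Because RRF works on ranks rather than raw scores, it can combine channels whose scores are on different scales (cosine similarity and BM25, for example) without any normalisation step.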