Caching strategies for LLM prompts, including Anthropic prompt caching, response caching, and CAG (Cache Augmented Generation). Use when: prompt caching, cache prompt, response cache, cag, cache augmented.
Add this skill
npx mdskills install sickn33/prompt-caching

Strong caching framework with anti-patterns and edge cases, but lacks actionable implementation steps.
---
name: prompt-caching
description: "Caching strategies for LLM prompts including Anthropic prompt caching, response caching, and CAG (Cache Augmented Generation) Use when: prompt caching, cache prompt, response cache, cag, cache augmented."
source: vibeship-spawner-skills (Apache 2.0)
---

# Prompt Caching

You're a caching specialist who has reduced LLM costs by 90% through strategic caching.
You've implemented systems that cache at multiple levels: prompt prefixes, full responses,
and semantic similarity matches.

You understand that LLM caching is different from traditional caching: prompts have
prefixes that can be cached, responses vary with temperature, and semantic similarity
often matters more than exact match.

Your core principles:
1. Cache at the right level: prefix, response, or both
2. K

## Capabilities

- prompt-cache
- response-cache
- kv-cache
- cag-patterns
- cache-invalidation

## Patterns

### Anthropic Prompt Caching

Use Claude's native prompt caching for repeated prefixes

### Response Caching

Cache full LLM responses for identical or similar queries

### Cache Augmented Generation (CAG)

Pre-cache documents in prompt instead of RAG retrieval

## Anti-Patterns

### ❌ Caching with High Temperature

### ❌ No Cache Invalidation

### ❌ Caching Everything

## ⚠️ Sharp Edges

| Issue | Severity | Solution |
|-------|----------|----------|
| Cache miss causes latency spike with additional overhead | high | Optimize for cache misses, not just hits |
| Cached responses become incorrect over time | high | Implement proper cache invalidation |
| Prompt caching doesn't work due to prefix changes | medium | Structure prompts for optimal caching |

## Related Skills

Works well with: `context-window-management`, `rag-implementation`, `conversation-memory`
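
The skill names the Response Caching pattern and the "Caching with High Temperature" and "No Cache Invalidation" anti-patterns without showing an implementation. The following is a minimal sketch of how those three ideas might fit together, not the skill author's code. `ResponseCache`, `cached_completion`, and the `call_model` callback are hypothetical names introduced for illustration. The sketch assumes an in-memory store, exact-match keys, and a fixed TTL as the invalidation policy.

```python
import hashlib
import json
import time
from typing import Callable, Dict, Optional, Tuple


class ResponseCache:
    """Exact-match response cache with TTL-based invalidation (illustrative only)."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def make_key(model: str, prompt: str, temperature: float) -> str:
        # Hash every input that affects the output so different settings never collide.
        payload = json.dumps(
            {"model": model, "prompt": prompt, "temperature": temperature},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self._clock() - stored_at > self.ttl_seconds:
            del self._store[key]  # expired: drop it rather than serve stale text
            return None
        return response

    def put(self, key: str, response: str) -> None:
        self._store[key] = (self._clock(), response)


def cached_completion(cache: ResponseCache,
                      call_model: Callable[[str, str, float], str],
                      model: str, prompt: str, temperature: float = 0.0) -> str:
    # Assumption: sampled outputs (temperature > 0) are meant to vary,
    # so those calls bypass the cache entirely.
    if temperature > 0:
        return call_model(model, prompt, temperature)
    key = cache.make_key(model, prompt, temperature)
    hit = cache.get(key)
    if hit is not None:
        return hit
    response = call_model(model, prompt, temperature)
    cache.put(key, response)
    return response
```

In use, `call_model` would wrap whatever client actually sends the request. Keeping it as a callback leaves the cache logic independent of any particular SDK. A semantic-similarity cache or a shared store such as Redis would replace the dictionary lookup and is outside the scope of this sketch.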