Debug AI traces, find exceptions, analyze sessions, and manage prompts via Langfuse MCP. Also handles MCP setup and configuration.
Add this skill:

```bash
npx mdskills install avivsinai/langfuse
```

Comprehensive LLM observability guide with practical integration patterns and clear examples.
---
name: langfuse
version: 1.0.2
description: Debug AI traces, find exceptions, analyze sessions, and manage prompts via Langfuse MCP. Also handles MCP setup and configuration.
metadata:
  short-description: Langfuse observability via MCP
  compatibility: claude-code, codex-cli
---

# Langfuse Skill

Debug your AI systems through Langfuse observability.

**Triggers:** langfuse, traces, debug AI, find exceptions, set up langfuse, what went wrong, why is it slow, datasets, evaluation sets

## Setup

**Step 1:** Get credentials from https://cloud.langfuse.com → Settings → API Keys

If self-hosted, use your instance URL for `LANGFUSE_HOST` and create keys there.

**Step 2:** Install MCP (pick one):

```bash
# Claude Code (project-scoped, shared via .mcp.json)
claude mcp add \
  --scope project \
  --env LANGFUSE_PUBLIC_KEY=pk-... \
  --env LANGFUSE_SECRET_KEY=sk-... \
  --env LANGFUSE_HOST=https://cloud.langfuse.com \
  langfuse -- uvx --python 3.11 langfuse-mcp

# Codex CLI (user-scoped, stored in ~/.codex/config.toml)
codex mcp add langfuse \
  --env LANGFUSE_PUBLIC_KEY=pk-... \
  --env LANGFUSE_SECRET_KEY=sk-... \
  --env LANGFUSE_HOST=https://cloud.langfuse.com \
  -- uvx --python 3.11 langfuse-mcp
```

**Step 3:** Restart the CLI, then verify with `/mcp` (Claude) or `codex mcp list` (Codex)

**Step 4:** Test: `fetch_traces(age=60)`

### Read-Only Mode

For safer observability without risk of modifying prompts or datasets, enable read-only mode:

```bash
# CLI flag
langfuse-mcp --read-only

# Or environment variable
LANGFUSE_MCP_READ_ONLY=true
```

This disables the write tools: `create_text_prompt`, `create_chat_prompt`, `update_prompt_labels`, `create_dataset`, `create_dataset_item`, `delete_dataset_item`.

For manual `.mcp.json` setup or troubleshooting, see `references/setup.md`.

---

## Playbooks

### "Where are the errors?"

```
find_exceptions(age=1440, group_by="file")
```
→ Shows error counts by file.
Pick the worst offender.

```
find_exceptions_in_file(filepath="src/ai/chat.py", age=1440)
```
→ Lists specific exceptions. Grab a trace_id.

```
get_exception_details(trace_id="...")
```
→ Full stacktrace and context.

---

### "What happened in this interaction?"

```
fetch_traces(age=60, user_id="...")
```
→ Find the trace. Note the trace_id.

If you don't know the user_id, start with:
```
fetch_traces(age=60)
```

```
fetch_trace(trace_id="...", include_observations=true)
```
→ See all LLM calls in the trace.

```
fetch_observation(observation_id="...")
```
→ Inspect a specific generation's input/output.

---

### "Why is it slow?"

```
fetch_observations(age=60, type="GENERATION")
```
→ Find recent LLM calls. Look for high latency.

```
fetch_observation(observation_id="...")
```
→ Check token counts, model, timing.

---

### "What's this user experiencing?"

```
get_user_sessions(user_id="...", age=1440)
```
→ List their sessions.

```
get_session_details(session_id="...")
```
→ See all traces in the session.

---

### "Manage datasets"

```
list_datasets()
```
→ See all datasets.

```
get_dataset(name="evaluation-set-v1")
```
→ Get dataset details.

```
list_dataset_items(dataset_name="evaluation-set-v1", page=1, limit=10)
```
→ Browse items in the dataset.

```
create_dataset(name="qa-test-cases", description="QA evaluation set")
```
→ Create a new dataset.

```
create_dataset_item(
    dataset_name="qa-test-cases",
    input={"question": "What is 2+2?"},
    expected_output={"answer": "4"}
)
```
→ Add test cases.

```
create_dataset_item(
    dataset_name="qa-test-cases",
    item_id="item_123",
    input={"question": "What is 3+3?"},
    expected_output={"answer": "6"}
)
```
→ Upsert: updates the existing item by id, or creates it if missing.

---

### "Manage prompts"

```
list_prompts()
```
→ See all prompts with labels.

```
get_prompt(name="...", label="production")
```
→ Fetch the current production version.

```
create_text_prompt(name="...", prompt="...", labels=["staging"])
```
→ Create a new version in staging.

```
update_prompt_labels(name="...", version=N, labels=["production"])
```
→ Promote to production. (Rollback = re-apply the label to an older version.)

---

## Quick Reference

| Task | Tool |
|------|------|
| List traces | `fetch_traces(age=N)` |
| Get trace details | `fetch_trace(trace_id="...", include_observations=true)` |
| List LLM calls | `fetch_observations(age=N, type="GENERATION")` |
| Get observation | `fetch_observation(observation_id="...")` |
| Error count | `get_error_count(age=N)` |
| Find exceptions | `find_exceptions(age=N, group_by="file")` |
| List sessions | `fetch_sessions(age=N)` |
| User sessions | `get_user_sessions(user_id="...", age=N)` |
| List prompts | `list_prompts()` |
| Get prompt | `get_prompt(name="...", label="production")` |
| List datasets | `list_datasets()` |
| Get dataset | `get_dataset(name="...")` |
| List dataset items | `list_dataset_items(dataset_name="...", limit=N)` |
| Create/update dataset item | `create_dataset_item(dataset_name="...", item_id="...")` |

`age` = minutes to look back (max 10080 = 7 days)

---

## References

- `references/tool-reference.md` — Full parameter docs, filter semantics, response schemas
- `references/setup.md` — Manual setup, troubleshooting, advanced configuration
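For manual setup, the project-scoped Claude Code install in Step 2 corresponds to an entry in `.mcp.json` along these lines. This is a sketch using the standard `mcpServers` schema with illustrative placeholder keys; `references/setup.md` remains the authoritative guide:

```json
{
  "mcpServers": {
    "langfuse": {
      "command": "uvx",
      "args": ["--python", "3.11", "langfuse-mcp"],
      "env": {
        "LANGFUSE_PUBLIC_KEY": "pk-...",
        "LANGFUSE_SECRET_KEY": "sk-...",
        "LANGFUSE_HOST": "https://cloud.langfuse.com"
      }
    }
  }
}
```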
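Every lookback in the skill takes `age` in minutes, capped at 10080 (7 days). A small hypothetical helper (not part of the skill or the MCP server) can make those conversions explicit when composing calls:

```python
MAX_AGE_MINUTES = 10_080  # Langfuse MCP caps lookback at 7 days


def age(minutes: int = 0, hours: int = 0, days: int = 0) -> int:
    """Convert a human-friendly lookback into the `age` parameter (minutes),
    clamped to the 7-day maximum the tools accept."""
    total = minutes + hours * 60 + days * 24 * 60
    return min(total, MAX_AGE_MINUTES)


# Build the exception-triage call from the first playbook:
print(f'find_exceptions(age={age(days=1)}, group_by="file")')
# → find_exceptions(age=1440, group_by="file")
```

So `age(days=1)` yields the 1440 used throughout the playbooks, and any lookback beyond a week is clamped rather than rejected.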