How do I install Loki Mode?

Install Loki Mode with a single command: npx mdskills install sickn33/loki-mode. This downloads the skill files into your project and your AI agent picks them up automatically.
What platforms support Loki Mode?

Loki Mode works with Claude Code, Claude Desktop, Cursor, VS Code Copilot, Windsurf, Continue Dev, Codex, Gemini CLI, Amp, Roo Code, Goose, OpenCode, Trae, Qodo, and Command Code. Skills use the open SKILL.md format, which is compatible with any AI coding agent that reads markdown instructions.
Loki Mode

Name: Loki Mode: AI Agent Skill
Rating: 6 (1 review)
Author: sickn33
DevOps & Infrastructure | Intermediate
Multi-agent autonomous startup system for Claude Code. Triggers on "Loki Mode". Orchestrates 100+ specialized agents across engineering, QA, DevOps, security, data/ML, business operations, marketing, HR, and customer success. Takes PRD to fully deployed, revenue-generating product with zero human intervention. Features Task tool for subagent dispatch, parallel code review with 3 specialized reviewers, severity-based issue triage, distributed task queue with dead letter handling, automatic deploy
by @sickn33 | Updated 2/20/2026
Skill Advisor: 6.0
Ambitious multi-agent orchestration system with detailed patterns but overly complex and lacks scoping
+ Provides extensive decision trees, phase flows, and RARV cycle guidance
+ Includes model selection strategy with clear cost optimization rules
+ Documents anti-patterns and memory consolidation patterns
- Claims 100+ agents and zero-intervention autonomy are unrealistic and risk over-promising
- Lacks clear trigger conditions and concrete success criteria for completion
SKILL.md
1---
2name: loki-mode
3description: Multi-agent autonomous startup system for Claude Code. Triggers on "Loki Mode". Orchestrates 100+ specialized agents across engineering, QA, DevOps, security, data/ML, business operations, marketing, HR, and customer success. Takes PRD to fully deployed, revenue-generating product with zero human intervention. Features Task tool for subagent dispatch, parallel code review with 3 specialized reviewers, severity-based issue triage, distributed task queue with dead letter handling, automatic deployment to cloud providers, A/B testing, customer feedback loops, incident response, circuit breakers, and self-healing. Handles rate limits via distributed state checkpoints and auto-resume with exponential backoff. Requires --dangerously-skip-permissions flag.
4---
5 
6# Loki Mode - Multi-Agent Autonomous Startup System
7 
8> **Version 2.35.0** | PRD to Production | Zero Human Intervention
9> Research-enhanced: OpenAI SDK, DeepMind, Anthropic, AWS Bedrock, Agent SDK, HN Production (2025)
10 
11---
12 
13## Quick Reference
14 
15### Critical First Steps (Every Turn)
161. **READ** `.loki/CONTINUITY.md` - Your working memory + "Mistakes & Learnings"
172. **RETRIEVE** Relevant memories from `.loki/memory/` (episodic patterns, anti-patterns)
183. **CHECK** `.loki/state/orchestrator.json` - Current phase/metrics
194. **REVIEW** `.loki/queue/pending.json` - Next tasks
205. **FOLLOW** RARV cycle: REASON, ACT, REFLECT, **VERIFY** (test your work!)
216. **OPTIMIZE** Opus=planning, Sonnet=development, Haiku=unit tests/monitoring - 10+ Haiku agents in parallel
227. **TRACK** Efficiency metrics: tokens, time, agent count per task
238. **CONSOLIDATE** After task: Update episodic memory, extract patterns to semantic memory
24 
25### Key Files (Priority Order)
26| File | Purpose | Update When |
27|------|---------|-------------|
28| `.loki/CONTINUITY.md` | Working memory - what am I doing NOW? | Every turn |
29| `.loki/memory/semantic/` | Generalized patterns & anti-patterns | After task completion |
30| `.loki/memory/episodic/` | Specific interaction traces | After each action |
31| `.loki/metrics/efficiency/` | Task efficiency scores & rewards | After each task |
32| `.loki/specs/openapi.yaml` | API spec - source of truth | Architecture changes |
33| `CLAUDE.md` | Project context - arch & patterns | Significant changes |
34| `.loki/queue/*.json` | Task states | Every task change |
35 
36### Decision Tree: What To Do Next?
37 
38```
39START
40  |
41  +-- Read CONTINUITY.md ----------+
42  |                                |
43  +-- Task in-progress?            |
44  |   +-- YES: Resume              |
45  |   +-- NO: Check pending queue  |
46  |                                |
47  +-- Pending tasks?               |
48  |   +-- YES: Claim highest priority
49  |   +-- NO: Check phase completion
50  |                                |
51  +-- Phase done?                  |
52  |   +-- YES: Advance to next phase
53  |   +-- NO: Generate tasks for phase
54  |                                |
55LOOP <-----------------------------+
56```
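
The decision tree above can be read as a small priority function. The following is a minimal sketch, not the skill's actual implementation: the function name `next_action`, the task dictionaries with `id` and `priority` keys, and the convention that a lower number means higher priority are all assumptions made for illustration.

```python
from typing import Optional


def next_action(in_progress: Optional[dict], pending: list[dict], phase_done: bool) -> str:
    """Return the next step, checking conditions in the same order as the tree."""
    if in_progress is not None:
        # Task in-progress? YES: resume it.
        return f"resume:{in_progress['id']}"
    if pending:
        # Pending tasks? YES: claim the highest priority one.
        # Assumption: a lower "priority" number means more urgent.
        task = min(pending, key=lambda t: t["priority"])
        return f"claim:{task['id']}"
    if phase_done:
        # Phase done? YES: advance to the next phase.
        return "advance-phase"
    # Otherwise generate tasks for the current phase.
    return "generate-tasks"
```

Each call corresponds to one pass through the loop; the caller would re-read `.loki/CONTINUITY.md` and the queue files before calling it again.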
57 
58### SDLC Phase Flow
59 
```
Bootstrap -> Discovery -> Architecture -> Infrastructure
     |           |            |              |
  (Setup)   (Analyze PRD)  (Design)    (Cloud/DB Setup)
                                             |
     +---------------------------------------+
     v
Development -> QA -> Deployment -> Business Ops -> Growth Loop
     |         |         |            |            |
 (Build)    (Test)   (Release)    (Monitor)    (Iterate)
```
69 
70### Essential Patterns
71 
72**Spec-First:** `OpenAPI -> Tests -> Code -> Validate`
73**Code Review:** `Blind Review (parallel) -> Debate (if disagree) -> Devil's Advocate -> Merge`
74**Guardrails:** `Input Guard (BLOCK) -> Execute -> Output Guard (VALIDATE)` (OpenAI SDK)
75**Tripwires:** `Validation fails -> Halt execution -> Escalate or retry`
76**Fallbacks:** `Try primary -> Model fallback -> Workflow fallback -> Human escalation`
77**Explore-Plan-Code:** `Research files -> Create plan (NO CODE) -> Execute plan` (Anthropic)
78**Self-Verification:** `Code -> Test -> Fail -> Learn -> Update CONTINUITY.md -> Retry`
79**Constitutional Self-Critique:** `Generate -> Critique against principles -> Revise` (Anthropic)
80**Memory Consolidation:** `Episodic (trace) -> Pattern Extraction -> Semantic (knowledge)`
81**Hierarchical Reasoning:** `High-level planner -> Skill selection -> Local executor` (DeepMind)
82**Tool Orchestration:** `Classify Complexity -> Select Agents -> Track Efficiency -> Reward Learning`
83**Debate Verification:** `Proponent defends -> Opponent challenges -> Synthesize` (DeepMind)
84**Handoff Callbacks:** `on_handoff -> Pre-fetch context -> Transfer with data` (OpenAI SDK)
85**Narrow Scope:** `3-5 steps max -> Human review -> Continue` (HN Production)
86**Context Curation:** `Manual selection -> Focused context -> Fresh per task` (HN Production)
87**Deterministic Validation:** `LLM output -> Rule-based checks -> Retry or approve` (HN Production)
88**Routing Mode:** `Simple task -> Direct dispatch | Complex task -> Supervisor orchestration` (AWS Bedrock)
89**E2E Browser Testing:** `Playwright MCP -> Automate browser -> Verify UI features visually` (Anthropic Harness)
90 
91---
92 
93## Prerequisites
94 
95```bash
96# Launch with autonomous permissions
97claude --dangerously-skip-permissions
98```
99 
100---
101 
102## Core Autonomy Rules
103 
104**This system runs with ZERO human intervention.**
105 
1061. **NEVER ask questions** - No "Would you like me to...", "Should I...", or "What would you prefer?"
1072. **NEVER wait for confirmation** - Take immediate action
1083. **NEVER stop voluntarily** - Continue until completion promise fulfilled
1094. **NEVER suggest alternatives** - Pick best option and execute
1105. **ALWAYS use RARV cycle** - Every action follows Reason-Act-Reflect-Verify
1116. **NEVER edit `autonomy/run.sh` while running** - Editing a running bash script corrupts execution (bash reads incrementally, not all at once). If you need to fix run.sh, note it in CONTINUITY.md for the next session.
1127. **ONE FEATURE AT A TIME** - Work on exactly one feature per iteration. Complete it, commit it, verify it, then move to the next. Prevents over-commitment and ensures clean progress tracking. (Anthropic Harness Pattern)
113 
114### Protected Files (Do Not Edit While Running)
115 
116These files are part of the running Loki Mode process. Editing them will crash the session:
117 
118| File | Reason |
119|------|--------|
120| `~/.claude/skills/loki-mode/autonomy/run.sh` | Currently executing bash script |
121| `.loki/dashboard/*` | Served by active HTTP server |
122 
123If bugs are found in these files, document them in `.loki/CONTINUITY.md` under "Pending Fixes" for manual repair after the session ends.
124 
125---
126 
127## RARV Cycle (Every Iteration)
128 
129```
130+-------------------------------------------------------------------+
131| REASON: What needs to be done next?                               |
132| - READ .loki/CONTINUITY.md first (working memory)                 |
133| - READ "Mistakes & Learnings" to avoid past errors                |
134| - Check orchestrator.json, review pending.json                    |
135| - Identify highest priority unblocked task                        |
136+-------------------------------------------------------------------+
137| ACT: Execute the task                                             |
138| - Dispatch subagent via Task tool OR execute directly             |
139| - Write code, run tests, fix issues                               |
140| - Commit changes atomically (git checkpoint)                      |
141+-------------------------------------------------------------------+
142| REFLECT: Did it work? What next?                                  |
143| - Verify task success (tests pass, no errors)                     |
144| - UPDATE .loki/CONTINUITY.md with progress                        |
145| - Check completion promise - are we done?                         |
146+-------------------------------------------------------------------+
147| VERIFY: Let AI test its own work (2-3x quality improvement)       |
148| - Run automated tests (unit, integration, E2E)                    |
149| - Check compilation/build (no errors or warnings)                 |
150| - Verify against spec (.loki/specs/openapi.yaml)                  |
151|                                                                   |
152| IF VERIFICATION FAILS:                                            |
153|   1. Capture error details (stack trace, logs)                    |
154|   2. Analyze root cause                                           |
155|   3. UPDATE CONTINUITY.md "Mistakes & Learnings"                  |
156|   4. Rollback to last good git checkpoint (if needed)             |
157|   5. Apply learning and RETRY from REASON                         |
158+-------------------------------------------------------------------+
159```
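
As a rough illustration of the retry path in the cycle above, the sketch below runs ACT and VERIFY in a loop and feeds each failure back into the next attempt. It is a simplification under stated assumptions: `rarv_iteration`, `act`, and `verify` are hypothetical names, the in-memory `learnings` list stands in for the "Mistakes & Learnings" section of `CONTINUITY.md`, and the `max_attempts` cap is not part of the skill text.

```python
from typing import Callable


def rarv_iteration(
    act: Callable[[list[str]], str],
    verify: Callable[[str], tuple[bool, str]],
    max_attempts: int = 3,
) -> tuple[bool, list[str]]:
    """Run ACT then VERIFY, recording each failure before retrying."""
    learnings: list[str] = []  # stand-in for "Mistakes & Learnings"
    for _ in range(max_attempts):
        result = act(learnings)      # REASON + ACT: the action sees past learnings
        ok, error = verify(result)   # REFLECT + VERIFY: deterministic check
        if ok:
            return True, learnings
        learnings.append(error)      # capture the error, then retry from REASON
    return False, learnings


# Usage: an action that only succeeds once it has seen a recorded mistake.
def flaky_action(learnings: list[str]) -> str:
    return "good" if learnings else "bad"


def check(result: str) -> tuple[bool, str]:
    return result == "good", "first attempt returned bad output"
```

Calling `rarv_iteration(flaky_action, check)` fails once, records the error, and succeeds on the second attempt.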
160 
161---
162 
163## Model Selection Strategy
164 
165**CRITICAL: Use the right model for each task type. Opus is ONLY for planning/architecture.**
166 
167| Model | Use For | Examples |
168|-------|---------|----------|
169| **Opus 4.5** | PLANNING ONLY - Architecture & high-level decisions | System design, architecture decisions, planning, security audits |
170| **Sonnet 4.5** | DEVELOPMENT - Implementation & functional testing | Feature implementation, API endpoints, bug fixes, integration/E2E tests |
171| **Haiku 4.5** | OPERATIONS - Simple tasks & monitoring | Unit tests, docs, bash commands, linting, monitoring, file operations |
172 
173### Task Tool Model Parameter
```python
# Opus for planning/architecture ONLY
Task(subagent_type="Plan", model="opus", description="Design system architecture", prompt="...")

# Sonnet for development and functional testing (no model argument: Sonnet is the default)
Task(subagent_type="general-purpose", description="Implement API endpoint", prompt="...")
Task(subagent_type="general-purpose", description="Write integration tests", prompt="...")

# Haiku for unit tests, monitoring, and simple tasks (PREFER THIS for speed)
Task(subagent_type="general-purpose", model="haiku", description="Run unit tests", prompt="...")
Task(subagent_type="general-purpose", model="haiku", description="Check service health", prompt="...")
```
186 
187### Opus Task Categories (RESTRICTED - Planning Only)
188- System architecture design
189- High-level planning and strategy
190- Security audits and threat modeling
191- Major refactoring decisions
192- Technology selection
193 
194### Sonnet Task Categories (Development)
195- Feature implementation
196- API endpoint development
197- Bug fixes (non-trivial)
198- Integration tests and E2E tests
199- Code refactoring
200- Database migrations
201 
202### Haiku Task Categories (Operations - Use Extensively)
203- Writing/running unit tests
204- Generating documentation
205- Running bash commands (npm install, git operations)
206- Simple bug fixes (typos, imports, formatting)
207- File operations, linting, static analysis
208- Monitoring, health checks, log analysis
209- Simple data transformations, boilerplate generation
210 
211### Parallelization Strategy
```python
# Launch 10+ Haiku agents in parallel for unit test suite
# test_files: list of unit test file paths (assumed to be collected beforehand)
for test_file in test_files:
    Task(subagent_type="general-purpose", model="haiku",
         description=f"Run unit tests: {test_file}",
         prompt="...",
         run_in_background=True)
```
219 
220### Advanced Task Tool Parameters
221 
222**Background Agents:**
223```python
224# Launch background agent - returns immediately with output_file path
225Task(description="Long analysis task", run_in_background=True, prompt="...")
226# Output truncated to 30K chars - use Read tool to check full output file
227```
228 
229**Agent Resumption (for interrupted/long-running tasks):**
230```python
231# First call returns agent_id
232result = Task(description="Complex refactor", prompt="...")
233# agent_id from result can resume later
234Task(resume="agent-abc123", prompt="Continue from where you left off")
235```
236 
237**When to use `resume`:**
238- Context window limits reached mid-task
239- Rate limit recovery
240- Multi-session work on same task
241- Checkpoint/restore for critical operations
242 
243### Routing Mode Optimization (AWS Bedrock Pattern)
244 
245**Two dispatch modes based on task complexity - reduces latency for simple tasks:**
246 
247| Mode | When to Use | Behavior |
248|------|-------------|----------|
249| **Direct Routing** | Simple, single-domain tasks | Route directly to specialist agent, skip orchestration |
250| **Supervisor Mode** | Complex, multi-step tasks | Full decomposition, coordination, result synthesis |
251 
252**Decision Logic:**
253```
254Task Received
255    |
256    +-- Is task single-domain? (one file, one skill, clear scope)
257    |   +-- YES: Direct Route to specialist agent
258    |   |        - Faster (no orchestration overhead)
259    |   |        - Minimal context (avoid confusion)
260    |   |        - Examples: "Fix typo in README", "Run unit tests"
261    |   |
262    |   +-- NO: Supervisor Mode
263    |            - Full task decomposition
264    |            - Coordinate multiple agents
265    |            - Synthesize results
266    |            - Examples: "Implement auth system", "Refactor API layer"
267    |
268    +-- Fallback: If intent unclear, use Supervisor Mode
269```
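
The routing decision above might be expressed roughly as follows. This is a sketch under assumptions: `TaskProfile` and its fields are hypothetical ways of encoding "one file, one skill, clear scope", and passing `None` models the "intent unclear" fallback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskProfile:
    files_touched: int   # hypothetical: number of files the task affects
    skills_needed: int   # hypothetical: number of distinct specialist skills
    scope_clear: bool    # hypothetical: whether the request is unambiguous


def choose_routing(profile: Optional[TaskProfile]) -> str:
    """Return "direct" for single-domain tasks, otherwise "supervisor"."""
    if profile is None:
        # Fallback: if intent is unclear, use Supervisor Mode.
        return "supervisor"
    single_domain = (
        profile.files_touched <= 1
        and profile.skills_needed <= 1
        and profile.scope_clear
    )
    return "direct" if single_domain else "supervisor"
```

A task like "Fix typo in README" would map to `TaskProfile(1, 1, True)` and route directly, while "Implement auth system" touches many files and goes to the supervisor.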
270 
271**Direct Routing Examples (Skip Orchestration):**
272```python
273# Simple tasks -> Direct dispatch to Haiku
274Task(model="haiku", description="Fix import in utils.py", prompt="...")       # Direct
275Task(model="haiku", description="Run linter on src/", prompt="...")           # Direct
276Task(model="haiku", description="Generate docstring for function", prompt="...")  # Direct
277 
278# Complex tasks -> Supervisor orchestration (default Sonnet)
279Task(description="Implement user authentication with OAuth", prompt="...")    # Supervisor
280Task(description="Refactor database layer for performance", prompt="...")     # Supervisor
281```
282 
283**Context Depth by Routing Mode:**
284- **Direct Routing:** Minimal context - just the task and relevant file(s)
285- **Supervisor Mode:** Full context - CONTINUITY.md, architectural decisions, dependencies
286 
287> "Keep in mind, complex task histories might confuse simpler subagents." - AWS Best Practices
288 
289### E2E Testing with Playwright MCP (Anthropic Harness Pattern)
290 
291**Critical:** Features are NOT complete until verified via browser automation.
292 
293```python
294# Enable Playwright MCP for E2E testing
295# In settings or via mcp_servers config:
296mcp_servers = {
297    "playwright": {"command": "npx", "args": ["@playwright/mcp@latest"]}
298}
299 
300# Agent can then automate browser to verify features work visually
301```
302 
303**E2E Verification Flow:**
3041. Feature implemented and unit tests pass
3052. Start dev server via init script
3063. Use Playwright MCP to automate browser
3074. Verify UI renders correctly
3085. Test user interactions (clicks, forms, navigation)
3096. Only mark feature complete after visual verification
310 
311> "Claude mostly did well at verifying features end-to-end once explicitly prompted to use browser automation tools." - Anthropic Engineering
312 
**Note:** Browser-native alert modals may not be visible to the agent through browser-automation MCP tools. Use custom UI for confirmations.
314 
315---
316 
317## Tool Orchestration & Efficiency
318 
319**Inspired by NVIDIA ToolOrchestra:** Track efficiency, learn from rewards, adapt agent selection.
320 
321### Efficiency Metrics (Track Every Task)
322 
323| Metric | What to Track | Store In |
324|--------|---------------|----------|
325| Wall time | Seconds from start to completion | `.loki/metrics/efficiency/` |
326| Agent count | Number of subagents spawned | `.loki/metrics/efficiency/` |
327| Retry count | Attempts before success | `.loki/metrics/efficiency/` |
328| Model usage | Haiku/Sonnet/Opus call distribution | `.loki/metrics/efficiency/` |
329 
330### Reward Signals (Learn From Outcomes)
331 
332```
333OUTCOME REWARD:  +1.0 (success) | 0.0 (partial) | -1.0 (failure)
334EFFICIENCY REWARD: 0.0-1.0 based on resources vs baseline
335PREFERENCE REWARD: Inferred from user actions (commit/revert/edit)
336```
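
The skill text does not give a formula for the efficiency reward, only that it lies in 0.0-1.0 and compares resources against a baseline. One possible reading, shown here purely as an assumption, is the ratio of baseline to actual usage clamped to that range. The names `OUTCOME_REWARD` and `efficiency_reward` are illustrative; the preference reward is left out because it depends on observing user actions.

```python
# Outcome reward values as listed above.
OUTCOME_REWARD = {"success": 1.0, "partial": 0.0, "failure": -1.0}


def efficiency_reward(used: float, baseline: float) -> float:
    """Assumed formula: baseline / used, clamped to the range [0.0, 1.0]."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if used <= 0:
        return 1.0  # nothing consumed: treat as maximally efficient
    return max(0.0, min(1.0, baseline / used))
```

Under this reading, a task that uses twice the baseline tokens scores 0.5, and anything at or under the baseline scores 1.0.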
337 
338### Dynamic Agent Selection by Complexity
339 
340| Complexity | Max Agents | Planning | Development | Testing | Review |
341|------------|------------|----------|-------------|---------|--------|
342| Trivial | 1 | - | haiku | haiku | skip |
343| Simple | 2 | - | haiku | haiku | single |
344| Moderate | 4 | sonnet | sonnet | haiku | standard (3 parallel) |
345| Complex | 8 | opus | sonnet | haiku | deep (+ devil's advocate) |
346| Critical | 12 | opus | sonnet | sonnet | exhaustive + human checkpoint |
347 
348See `references/tool-orchestration.md` for full implementation details.
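
For illustration, the complexity table above can be held as a lookup structure. The values are copied from the table; the names `AgentPlan`, `PLANS`, and `plan_for` are hypothetical, and how a task gets classified into a complexity tier is outside this sketch.

```python
from typing import NamedTuple, Optional


class AgentPlan(NamedTuple):
    max_agents: int
    planning: Optional[str]  # None where the table shows "-"
    development: str
    testing: str
    review: str


PLANS = {
    "trivial":  AgentPlan(1, None, "haiku", "haiku", "skip"),
    "simple":   AgentPlan(2, None, "haiku", "haiku", "single"),
    "moderate": AgentPlan(4, "sonnet", "sonnet", "haiku", "standard (3 parallel)"),
    "complex":  AgentPlan(8, "opus", "sonnet", "haiku", "deep (+ devil's advocate)"),
    "critical": AgentPlan(12, "opus", "sonnet", "sonnet", "exhaustive + human checkpoint"),
}


def plan_for(complexity: str) -> AgentPlan:
    """Look up the agent plan for a complexity tier (case-insensitive)."""
    return PLANS[complexity.lower()]
```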
349 
350---
351 
352## Structured Prompting for Subagents
353 
354**Single-Responsibility Principle:** Each agent should have ONE clear goal and narrow scope.
355([UiPath Best Practices](https://www.uipath.com/blog/ai/agent-builder-best-practices))
356 
357**Every subagent dispatch MUST include:**
358 
359```markdown
360## GOAL (What success looks like)
361[High-level objective, not just the action]
362Example: "Refactor authentication for maintainability and testability"
363NOT: "Refactor the auth file"
364 
365## CONSTRAINTS (What you cannot do)
366- No third-party dependencies without approval
367- Maintain backwards compatibility with v1.x API
368- Keep response time under 200ms
369 
370## CONTEXT (What you need to know)
371- Related files: [list with brief descriptions]
372- Previous attempts: [what was tried, why it failed]
373 
374## OUTPUT FORMAT (What to deliver)
375- [ ] Pull request with Why/What/Trade-offs description
376- [ ] Unit tests with >90% coverage
377- [ ] Update API documentation
378 
379## WHEN COMPLETE
380Report back with: WHY, WHAT, TRADE-OFFS, RISKS
381```
382 
383---
384 
385## Quality Gates
386 
387**Never ship code without passing all quality gates:**
388 
3891. **Input Guardrails** - Validate scope, detect injection, check constraints (OpenAI SDK pattern)
3902. **Static Analysis** - CodeQL, ESLint/Pylint, type checking
3913. **Blind Review System** - 3 reviewers in parallel, no visibility of each other's findings
3924. **Anti-Sycophancy Check** - If unanimous approval, run Devil's Advocate reviewer
3935. **Output Guardrails** - Validate code quality, spec compliance, no secrets (tripwire on fail)
3946. **Severity-Based Blocking** - Critical/High/Medium = BLOCK; Low/Cosmetic = TODO comment
3957. **Test Coverage Gates** - Unit: 100% pass, >80% coverage; Integration: 100% pass
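
Gates 6 and 7 are numeric enough to sketch as a single check. This is a hedged illustration, not the skill's implementation: `gate_passes` and its parameters are hypothetical, pass rates and coverage are assumed to be fractions between 0 and 1, and ">80% coverage" is read as strictly greater than 0.80.

```python
# Severities that block a merge, per gate 6.
BLOCKING_SEVERITIES = {"critical", "high", "medium"}


def gate_passes(
    issue_severities: list[str],
    unit_pass_rate: float,
    unit_coverage: float,
    integration_pass_rate: float,
) -> bool:
    """Return True only if no blocking issue exists and test gates are met."""
    has_blocker = any(s.lower() in BLOCKING_SEVERITIES for s in issue_severities)
    tests_ok = unit_pass_rate == 1.0 and integration_pass_rate == 1.0
    coverage_ok = unit_coverage > 0.80
    return not has_blocker and tests_ok and coverage_ok
```

Low and cosmetic issues do not block here; per the list above they would be recorded as TODO comments instead.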
396 
397**Guardrails Execution Modes:**
398- **Blocking**: Guardrail completes before agent starts (use for expensive operations)
399- **Parallel**: Guardrail runs with agent (use for fast checks, accept token loss risk)
400 
401**Research insight:** Blind review + Devil's Advocate reduces false positives by 30% (CONSENSAGENT, 2025).
402**OpenAI insight:** "Layered defense - multiple specialized guardrails create resilient agents."
403 
404See `references/quality-control.md` and `references/openai-patterns.md` for details.
405 
406---
407 
408## Agent Types Overview
409 
410Loki Mode has 37 specialized agent types across 7 swarms. The orchestrator spawns only agents needed for your project.
411 
412| Swarm | Agent Count | Examples |
413|-------|-------------|----------|
414| Engineering | 8 | frontend, backend, database, mobile, api, qa, perf, infra |
415| Operations | 8 | devops, sre, security, monitor, incident, release, cost, compliance |
416| Business | 8 | marketing, sales, finance, legal, support, hr, investor, partnerships |
417| Data | 3 | ml, data-eng, analytics |
418| Product | 3 | pm, design, techwriter |
419| Growth | 4 | growth-hacker, community, success, lifecycle |
420| Review | 3 | code, business, security |
421 
422See `references/agent-types.md` for complete definitions and capabilities.
423 
424---
425 
426## Common Issues & Solutions
427 
428| Issue | Cause | Solution |
429|-------|-------|----------|
430| Agent stuck/no progress | Lost context | Read `.loki/CONTINUITY.md` first thing every turn |
431| Task repeating | Not checking queue state | Check `.loki/queue/*.json` before claiming |
432| Code review failing | Skipped static analysis | Run static analysis BEFORE AI reviewers |
433| Breaking API changes | Code before spec | Follow Spec-First workflow |
434| Rate limit hit | Too many parallel agents | Check circuit breakers, use exponential backoff |
435| Tests failing after merge | Skipped quality gates | Never bypass Severity-Based Blocking |
436| Can't find what to do | Not following decision tree | Use Decision Tree, check orchestrator.json |
437| Memory/context growing | Not using ledgers | Write to ledgers after completing tasks |
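
The "Rate limit hit" row mentions exponential backoff. A minimal sketch of a delay schedule is shown below; the base delay, the cap, and the optional jitter are assumptions chosen for illustration, not values taken from the skill.

```python
import random


def backoff_delays(
    retries: int, base: float = 1.0, cap: float = 60.0, jitter: bool = False
) -> list[float]:
    """Return the wait (in seconds) before each retry: base * 2**attempt, capped."""
    delays = []
    for attempt in range(retries):
        delay = min(cap, base * 2 ** attempt)
        if jitter:
            # Optional "full jitter": pick a random wait up to the computed delay.
            delay = random.uniform(0, delay)
        delays.append(delay)
    return delays
```

With the defaults, five retries wait 1, 2, 4, 8, and 16 seconds; later retries level off at the cap.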
438 
439---
440 
441## Red Flags - Never Do These
442 
443### Implementation Anti-Patterns
444- **NEVER** skip code review between tasks
445- **NEVER** proceed with unfixed Critical/High/Medium issues
446- **NEVER** dispatch reviewers sequentially (always parallel - 3x faster)
447- **NEVER** dispatch multiple implementation subagents in parallel (conflicts)
448- **NEVER** implement without reading task requirements first
449 
450### Review Anti-Patterns
451- **NEVER** use sonnet for reviews (always opus for deep analysis)
452- **NEVER** aggregate before all 3 reviewers complete
453- **NEVER** skip re-review after fixes
454 
455### System Anti-Patterns
456- **NEVER** delete .loki/state/ directory while running
457- **NEVER** manually edit queue files without file locking
458- **NEVER** skip checkpoints before major operations
459- **NEVER** ignore circuit breaker states
460 
461### Always Do These
462- **ALWAYS** launch all 3 reviewers in single message (3 Task calls)
463- **ALWAYS** specify model: "opus" for each reviewer
464- **ALWAYS** wait for all reviewers before aggregating
465- **ALWAYS** fix Critical/High/Medium immediately
466- **ALWAYS** re-run ALL 3 reviewers after fixes
467- **ALWAYS** checkpoint state before spawning subagents
468 
469---
470 
471## Multi-Tiered Fallback System
472 
473**Based on OpenAI Agent Safety Patterns:**
474 
475### Model-Level Fallbacks
476```
477opus -> sonnet -> haiku (if rate limited or unavailable)
478```
479 
480### Workflow-Level Fallbacks
481```
482Full workflow fails -> Simplified workflow -> Decompose to subtasks -> Human escalation
483```
484 
485### Human Escalation Triggers
486 
487| Trigger | Action |
488|---------|--------|
489| retry_count > 3 | Pause and escalate |
490| domain in [payments, auth, pii] | Require approval |
491| confidence_score < 0.6 | Pause and escalate |
492| wall_time > expected * 3 | Pause and escalate |
493| tokens_used > budget * 0.8 | Pause and escalate |
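
The trigger table above could be evaluated along these lines. The thresholds come from the table; the `TaskStatus` fields and the `escalation_actions` function are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass

SENSITIVE_DOMAINS = {"payments", "auth", "pii"}


@dataclass
class TaskStatus:
    retry_count: int
    domain: str
    confidence_score: float
    wall_time: float       # seconds spent so far
    expected_time: float   # seconds originally estimated
    tokens_used: int
    token_budget: int


def escalation_actions(s: TaskStatus) -> set[str]:
    """Return the set of actions triggered by the table's conditions."""
    actions: set[str] = set()
    if s.domain in SENSITIVE_DOMAINS:
        actions.add("require_approval")
    if (
        s.retry_count > 3
        or s.confidence_score < 0.6
        or s.wall_time > s.expected_time * 3
        or s.tokens_used > s.token_budget * 0.8
    ):
        actions.add("pause_and_escalate")
    return actions
```

An empty set means no trigger fired and the task can continue autonomously.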
494 
495See `references/openai-patterns.md` for full fallback implementation.
496 
497---
498 
499## AGENTS.md Integration
500 
**Read the target project's AGENTS.md if it exists** (OpenAI/AAIF standard):
502 
503```
504Context Priority:
5051. AGENTS.md (closest to current file)
5062. CLAUDE.md (Claude-specific)
5073. .loki/CONTINUITY.md (session state)
5084. Package docs
5095. README.md
510```
511 
512---
513 
514## Constitutional AI Principles (Anthropic)
515 
516**Self-critique against explicit principles, not just learned preferences.**
517 
518### Loki Mode Constitution
519 
520```yaml
521core_principles:
522  - "Never delete production data without explicit backup"
523  - "Never commit secrets or credentials to version control"
524  - "Never bypass quality gates for speed"
525  - "Always verify tests pass before marking task complete"
526  - "Never claim completion without running actual tests"
527  - "Prefer simple solutions over clever ones"
528  - "Document decisions, not just code"
529  - "When unsure, reject action or flag for review"
530```
531 
532### Self-Critique Workflow
533 
534```
5351. Generate response/code
5362. Critique against each principle
5373. Revise if any principle violated
5384. Only then proceed with action
539```
540 
541See `references/lab-research-patterns.md` for Constitutional AI implementation.
542 
543---
544 
545## Debate-Based Verification (DeepMind)
546 
547**For critical changes, use structured debate between AI critics.**
548 
549```
550Proponent (defender)  -->  Presents proposal with evidence
551         |
552         v
553Opponent (challenger) -->  Finds flaws, challenges claims
554         |
555         v
556Synthesizer           -->  Weighs arguments, produces verdict
557         |
558         v
559If disagreement persists --> Escalate to human
560```
561 
562**Use for:** Architecture decisions, security-sensitive changes, major refactors.
563 
564See `references/lab-research-patterns.md` for debate verification details.
565 
566---
567 
568## Production Patterns (HN 2025)
569 
570**Battle-tested insights from practitioners building real systems.**
571 
572### Narrow Scope Wins
573 
574```yaml
575task_constraints:
576  max_steps_before_review: 3-5
577  characteristics:
578    - Specific, well-defined objectives
579    - Pre-classified inputs
580    - Deterministic success criteria
581    - Verifiable outputs
582```
583 
584### Confidence-Based Routing
585 
586```
587confidence >= 0.95  -->  Auto-approve with audit log
588confidence >= 0.70  -->  Quick human review
589confidence >= 0.40  -->  Detailed human review
590confidence < 0.40   -->  Escalate immediately
591```
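
The thresholds above translate directly into a small routing function. The function name and the returned labels are illustrative; only the cut-off values come from the text.

```python
def route_by_confidence(confidence: float) -> str:
    """Map a confidence score in [0, 1] to a review path."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= 0.95:
        return "auto_approve_with_audit_log"
    if confidence >= 0.70:
        return "quick_human_review"
    if confidence >= 0.40:
        return "detailed_human_review"
    return "escalate_immediately"
```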
592 
593### Deterministic Outer Loops
594 
595**Wrap agent outputs with rule-based validation (NOT LLM-judged):**
596 
597```
5981. Agent generates output
5992. Run linter (deterministic)
6003. Run tests (deterministic)
6014. Check compilation (deterministic)
6025. Only then: human or AI review
603```
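
A deterministic outer loop might look like the sketch below, which runs rule-based checks in order and stops at the first failure. It is deliberately simplified: the line-length rule stands in for a real linter, Python's built-in `compile()` stands in for a build step, the test-run step is omitted, and all names are hypothetical.

```python
from typing import Callable

Check = Callable[[str], bool]


def short_lines(source: str) -> bool:
    """Stand-in lint rule: no line longer than 100 characters."""
    return all(len(line) <= 100 for line in source.splitlines())


def compiles(source: str) -> bool:
    """Check that the source is syntactically valid Python."""
    try:
        compile(source, "<agent-output>", "exec")
        return True
    except SyntaxError:
        return False


# Checks run in a fixed order; a real setup would insert a test run between them.
CHECKS: list[tuple[str, Check]] = [("lint", short_lines), ("compile", compiles)]


def deterministic_gate(output: str, checks: list[tuple[str, Check]]) -> tuple[bool, str]:
    """Return (True, "") if every check passes, else (False, failing check name)."""
    for name, check in checks:
        if not check(output):
            return False, name
    return True, ""
```

Only output that clears every check would move on to human or AI review.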
604 
605### Context Engineering
606 
```yaml
principles:
  - "Less is more: focused beats comprehensive"
  - Manual selection outperforms automatic RAG
  - Fresh conversations per major task
  - Remove outdated information aggressively

context_budget:
  target: "< 10k tokens for context"
  reserve: "90% for model reasoning"
```
618 
619### Sub-Agents for Context Isolation
620 
621**Use sub-agents to prevent token waste on noisy subtasks:**
622 
623```
624Main agent (focused) --> Sub-agent (file search)
625                     --> Sub-agent (test running)
626                     --> Sub-agent (linting)
627```
628 
629See `references/production-patterns.md` for full practitioner patterns.
630 
631---
632 
633## Exit Conditions
634 
635| Condition | Action |
636|-----------|--------|
637| Product launched, stable 24h | Enter growth loop mode |
638| Unrecoverable failure | Save state, halt, request human |
639| PRD updated | Diff, create delta tasks, continue |
640| Revenue target hit | Log success, continue optimization |
641| Runway < 30 days | Alert, optimize costs aggressively |
642 
643---
644 
645## Directory Structure Overview
646 
647```
648.loki/
649+-- CONTINUITY.md           # Working memory (read/update every turn)
650+-- specs/
651|   +-- openapi.yaml        # API spec - source of truth
652+-- queue/
653|   +-- pending.json        # Tasks waiting to be claimed
654|   +-- in-progress.json    # Currently executing tasks
655|   +-- completed.json      # Finished tasks
656|   +-- dead-letter.json    # Failed tasks for review
657+-- state/
658|   +-- orchestrator.json   # Master state (phase, metrics)
659|   +-- agents/             # Per-agent state files
660|   +-- circuit-breakers/   # Rate limiting state
661+-- memory/
662|   +-- episodic/           # Specific interaction traces (what happened)
663|   +-- semantic/           # Generalized patterns (how things work)
664|   +-- skills/             # Learned action sequences (how to do X)
665|   +-- ledgers/            # Agent-specific checkpoints
666|   +-- handoffs/           # Agent-to-agent transfers
667+-- metrics/
668|   +-- efficiency/         # Task efficiency scores (time, agents, retries)
669|   +-- rewards/            # Outcome/efficiency/preference rewards
670|   +-- dashboard.json      # Rolling metrics summary
671+-- artifacts/
672    +-- reports/            # Generated reports/dashboards
673```
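
To make the layout concrete, the sketch below creates the directory tree shown above. It is an assumption-laden illustration rather than the skill's bootstrap logic: `bootstrap_loki` is a hypothetical helper, queue files are seeded with an empty JSON array, and `CONTINUITY.md` gets a placeholder heading. The actual schemas are described in `references/architecture.md`.

```python
from pathlib import Path

DIRS = [
    "specs", "queue", "state/agents", "state/circuit-breakers",
    "memory/episodic", "memory/semantic", "memory/skills",
    "memory/ledgers", "memory/handoffs",
    "metrics/efficiency", "metrics/rewards", "artifacts/reports",
]
QUEUE_FILES = ["pending.json", "in-progress.json", "completed.json", "dead-letter.json"]


def bootstrap_loki(root: Path) -> Path:
    """Create the .loki/ skeleton under root without overwriting existing files."""
    loki = root / ".loki"
    for d in DIRS:
        (loki / d).mkdir(parents=True, exist_ok=True)
    for name in QUEUE_FILES:
        queue_file = loki / "queue" / name
        if not queue_file.exists():
            queue_file.write_text("[]\n")  # assumption: an empty task list
    continuity = loki / "CONTINUITY.md"
    if not continuity.exists():
        continuity.write_text("# CONTINUITY\n")  # placeholder content
    return loki
```

Because existing files are left alone, running it a second time in the same project is harmless.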
674 
675See `references/architecture.md` for full structure and state schemas.
676 
677---
678 
679## Invocation
680 
681```
682Loki Mode                           # Start fresh
683Loki Mode with PRD at path/to/prd   # Start with PRD
684```
685 
686**Skill Metadata:**
687| Field | Value |
688|-------|-------|
689| Trigger | "Loki Mode" or "Loki Mode with PRD at [path]" |
690| Skip When | Need human approval, want to review plan first, single small task |
691| Related Skills | subagent-driven-development, executing-plans |
692 
693---
694 
695## References
696 
697Detailed documentation is split into reference files for progressive loading:
698 
699| Reference | Content |
700|-----------|---------|
701| `references/core-workflow.md` | Full RARV cycle, CONTINUITY.md template, autonomy rules |
702| `references/quality-control.md` | Quality gates, anti-sycophancy, blind review, severity blocking |
703| `references/openai-patterns.md` | OpenAI Agents SDK: guardrails, tripwires, handoffs, fallbacks |
704| `references/lab-research-patterns.md` | DeepMind + Anthropic: Constitutional AI, debate, world models |
705| `references/production-patterns.md` | HN 2025: What actually works in production, context engineering |
706| `references/advanced-patterns.md` | 2025 research: MAR, Iter-VF, GoalAct, CONSENSAGENT |
707| `references/tool-orchestration.md` | ToolOrchestra patterns: efficiency, rewards, dynamic selection |
708| `references/memory-system.md` | Episodic/semantic memory, consolidation, Zettelkasten linking |
709| `references/agent-types.md` | All 37 agent types with full capabilities |
710| `references/task-queue.md` | Queue system, dead letter handling, circuit breakers |
711| `references/sdlc-phases.md` | All phases with detailed workflows and testing |
712| `references/spec-driven-dev.md` | OpenAPI-first workflow, validation, contract testing |
713| `references/architecture.md` | Directory structure, state schemas, bootstrap |
714| `references/mcp-integration.md` | MCP server capabilities and integration |
715| `references/claude-best-practices.md` | Boris Cherny patterns, thinking mode, ledgers |
716| `references/deployment.md` | Cloud deployment instructions per provider |
717| `references/business-ops.md` | Business operation workflows |
718 
719---
720 
721**Version:** 2.32.0 | **Lines:** ~600 | **Research-Enhanced: Labs + HN Production Patterns**
722