Multi-agent autonomous startup system for Claude Code. Triggers on "Loki Mode". Orchestrates 100+ specialized agents across engineering, QA, DevOps, security, data/ML, business operations, marketing, HR, and customer success. Takes PRD to fully deployed, revenue-generating product with zero human intervention. Features Task tool for subagent dispatch, parallel code review with 3 specialized reviewers, severity-based issue triage, distributed task queue with dead letter handling, automatic deploy
Add this skill
npx mdskills install sickn33/loki-mode
Ambitious multi-agent orchestration system with detailed patterns, but overly complex and lacking scoping.
The First Truly Autonomous Multi-Agent Startup System
PRD → Deployed Product with Zero Human Intervention
Loki Mode transforms a Product Requirements Document into a fully built, tested, deployed, and revenue-generating product while you sleep. No manual steps. No intervention. Just results.
| System | Pass@1 | Details |
|---|---|---|
| Loki Mode (Multi-Agent) | 98.78% | 162/164 problems, RARV cycle recovered 2 |
| Direct Claude | 98.17% | 161/164 problems (baseline) |
| MetaGPT | 85.9-87.7% | Published benchmark |
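Loki Mode scores 11-13 percentage points above MetaGPT's published Pass@1 range and one problem (162 vs. 161 of 164) above the direct Claude baseline. The RARV (Reason-Act-Reflect-Verify) cycle recovered 2 of the solved problems.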
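The percentages in the table follow from the solved/total counts, and the gap to MetaGPT is a difference in percentage points. A quick arithmetic check, as a sketch (the helper name `pass_at_1` is illustrative and not part of Loki Mode):

```python
def pass_at_1(solved: int, total: int) -> float:
    """Return Pass@1 as a percentage, rounded to two decimals."""
    return round(100 * solved / total, 2)


loki = pass_at_1(162, 164)    # multi-agent with RARV
direct = pass_at_1(161, 164)  # single-agent baseline

# Gap to MetaGPT's published range, in percentage points.
metagpt_low, metagpt_high = 85.9, 87.7
gap_min = round(loki - metagpt_high, 2)
gap_max = round(loki - metagpt_low, 2)

print(loki, direct, gap_min, gap_max)
```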
| Benchmark | Score | Details |
|---|---|---|
| Loki Mode HumanEval | 98.78% Pass@1 | 162/164 (multi-agent with RARV) |
| Direct Claude HumanEval | 98.17% Pass@1 | 161/164 (single agent baseline) |
| Direct Claude SWE-bench | 99.67% patch gen | 299/300 problems |
| Loki Mode SWE-bench | 99.67% patch gen | 299/300 problems |

Model: Claude Opus 4.5
Key Finding: Multi-agent RARV matches single-agent performance on both benchmarks after timeout optimization. The 4-agent pipeline (Architect->Engineer->QA->Reviewer) achieves the same 99.67% patch generation as direct Claude.
See benchmarks/results/ for full methodology and solutions.
Loki Mode is a Claude Code skill that orchestrates 37 specialized AI agent types across 6 swarms to autonomously build, test, deploy, and scale complete startups. It dynamically spawns only the agents you need—5-10 for simple projects, 100+ for complex startups—working in parallel with continuous self-verification.
PRD → Research → Architecture → Development → Testing → Deployment → Marketing → Revenue
Just say "Loki Mode" and point to a PRD. Walk away. Come back to a deployed product.
| What Others Do | What Loki Mode Does |
|---|---|
| Single agent writes code linearly | 100+ agents work in parallel across engineering, ops, business, data, product, and growth |
| Manual deployment required | Autonomous deployment to AWS, GCP, Azure, Vercel, Railway with blue-green and canary strategies |
| No testing or basic unit tests | 14 automated quality gates: security scans, load tests, accessibility audits, code reviews |
| Code only - you handle the rest | Full business operations: marketing, sales, legal, HR, finance, investor relations |
| Stops on errors | Self-healing: circuit breakers, dead letter queues, exponential backoff, automatic recovery |
| No visibility into progress | Real-time dashboard with agent monitoring, task queues, and live status updates |
| "Done" when code is written | Never "done": continuous optimization, A/B testing, customer feedback loops, perpetual improvement |
Monitor your autonomous startup being built in real time through the Loki Mode dashboard. It tracks all active agents and shows tasks in a four-column kanban view.
# Watch status updates in terminal
watch -n 2 cat .loki/STATUS.txt
╔════════════════════════════════════════════════════════════════╗
║ LOKI MODE STATUS ║
╚════════════════════════════════════════════════════════════════╝
Phase: DEVELOPMENT
Active Agents: 47
├─ Engineering: 18
├─ Operations: 12
├─ QA: 8
└─ Business: 9
Tasks:
├─ Pending: 10
├─ In Progress: 47
├─ Completed: 203
└─ Failed: 0
Last Updated: 2026-01-04 20:45:32
Access the dashboard:
# Automatically opens when running autonomously
./autonomy/run.sh ./docs/requirements.md
# Or open manually
open .loki/dashboard/index.html
Auto-refreshes every 3 seconds. Works with any modern browser.
Loki Mode doesn't just write code—it thinks, acts, learns, and verifies:
1. REASON
└─ Read .loki/CONTINUITY.md including "Mistakes & Learnings"
└─ Check .loki/state/ and .loki/queue/
└─ Identify next task or improvement
2. ACT
└─ Execute task, write code
└─ Commit changes atomically (git checkpoint)
3. REFLECT
└─ Update .loki/CONTINUITY.md with progress
└─ Update state files
└─ Identify NEXT improvement
4. VERIFY
└─ Run automated tests (unit, integration, E2E)
└─ Check compilation/build
└─ Verify against spec
IF VERIFICATION FAILS:
├─ Capture error details (stack trace, logs)
├─ Analyze root cause
├─ UPDATE "Mistakes & Learnings" in CONTINUITY.md
├─ Rollback to last good git checkpoint if needed
└─ Apply learning and RETRY from REASON
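The cycle above can be sketched as a retry loop in which each failed verification feeds a learning back into the next attempt. This is a minimal illustration under assumptions, not Loki Mode's implementation: `Continuity`, `rarv`, `act`, and `verify` are hypothetical names, the in-memory object stands in for .loki/CONTINUITY.md, and the git checkpoint and rollback steps are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Continuity:
    """In-memory stand-in for .loki/CONTINUITY.md (hypothetical structure)."""
    progress: List[str] = field(default_factory=list)
    learnings: List[str] = field(default_factory=list)


def rarv(task: str,
         act: Callable[[str, List[str]], str],
         verify: Callable[[str], Optional[str]],
         memory: Continuity,
         max_attempts: int = 3) -> Optional[str]:
    """Run one task through Reason-Act-Reflect-Verify, retrying on failure."""
    for attempt in range(1, max_attempts + 1):
        # REASON: consult earlier mistakes before acting.
        known = list(memory.learnings)
        # ACT: produce a candidate result.
        result = act(task, known)
        # REFLECT: record progress.
        memory.progress.append(f"{task}: attempt {attempt}")
        # VERIFY: None means success, otherwise an error description.
        error = verify(result)
        if error is None:
            return result
        # Failure: store the learning, then retry from REASON.
        memory.learnings.append(error)
    return None


def act(task: str, learnings: List[str]) -> str:
    # Toy worker: handles the edge case only after it was recorded as a mistake.
    return "handles-empty" if "missing empty-input check" in learnings else "naive"


def verify(result: str) -> Optional[str]:
    return None if result == "handles-empty" else "missing empty-input check"


memory = Continuity()
outcome = rarv("implement parser", act, verify, memory)
print(outcome, memory.learnings)
```

In this toy run the first attempt fails verification, the error is stored as a learning, and the second attempt passes because the worker reads that learning first.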
Result: 2-3x quality improvement through continuous self-verification.
There is never a "finished" state. After completing the PRD, Loki Mode moves into continuous optimization, A/B testing, and customer feedback loops.
It keeps going until you stop it.
Rate limits? Exponential backoff and automatic resume. Errors? Circuit breakers, dead letter queues, retry logic. Interruptions? State checkpoints every 5 seconds—just restart.
# Start autonomous mode
./autonomy/run.sh ./docs/requirements.md
# Hit rate limit? Script automatically:
# ├─ Saves state checkpoint
# ├─ Waits with exponential backoff (60s → 120s → 240s...)
# ├─ Resumes from exact point
# └─ Continues until completion or max retries (default: 50)
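Resuming "from the exact point" depends on state being written in a way that survives an interruption mid-write. A minimal sketch of one common approach, write to a temp file and rename it over the target, follows. The file name `checkpoint.json` and the state fields are assumptions for illustration; the source does not describe how Loki Mode stores its checkpoints.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(path: Path, state: dict) -> None:
    """Write state atomically: temp file first, then rename over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(state, tmp)
    os.replace(tmp_name, path)  # atomic when both paths share a filesystem


def load_checkpoint(path: Path, default: dict) -> dict:
    """Return the last saved state, or a copy of the default when none exists."""
    if not path.exists():
        return dict(default)
    return json.loads(path.read_text())


workdir = Path(tempfile.mkdtemp())
checkpoint = workdir / ".loki" / "state" / "checkpoint.json"  # hypothetical file name
initial = {"phase": "BOOTSTRAP", "retries": 0}

state = load_checkpoint(checkpoint, initial)
state["phase"] = "DEVELOPMENT"
save_checkpoint(checkpoint, state)

# A restarted process picks up where the previous one stopped.
resumed = load_checkpoint(checkpoint, initial)
print(resumed)
```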
# Clone to your Claude Code skills directory
git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
See INSTALLATION.md for other installation methods (Web, API Console, minimal curl install).
# Product: AI-Powered Todo App
## Overview
Build a todo app with AI-powered task suggestions and deadline predictions.
## Features
- User authentication (email/password)
- Create, read, update, delete todos
- AI suggests next tasks based on patterns
- Smart deadline predictions
- Mobile-responsive design
## Tech Stack
- Next.js 14 with TypeScript
- PostgreSQL database
- OpenAI API for suggestions
- Deploy to Vercel
Save as my-prd.md.
# Autonomous mode (recommended)
./autonomy/run.sh ./my-prd.md
# Or manual mode
claude --dangerously-skip-permissions
> Loki Mode with PRD at ./my-prd.md
Open the dashboard in your browser (auto-opens) or check status:
watch -n 2 cat .loki/STATUS.txt
Seriously. Go get coffee. It'll be deployed when you get back.
That's it. No configuration. No manual steps. No intervention.
Loki Mode has 37 predefined agent types organized into 6 specialized swarms. The orchestrator spawns only what you need—simple projects use 5-10 agents, complex startups spawn 100+.
| Group | Agent types |
|---|---|
| Engineering (8) | eng-frontend, eng-backend, eng-database, eng-mobile, eng-api, eng-qa, eng-perf, eng-infra |
| Operations (8) | ops-devops, ops-sre, ops-security, ops-monitor, ops-incident, ops-release, ops-cost, ops-compliance |
| Business (8) | biz-marketing, biz-sales, biz-finance, biz-legal, biz-support, biz-hr, biz-investor, biz-partnerships |
| Data (3) | data-ml, data-eng, data-analytics |
| Product (3) | prod-pm, prod-design, prod-techwriter |
| Growth (4) | growth-hacker, growth-community, growth-success, growth-lifecycle |
| Review (3) | review-code, review-business, review-security |
See references/agents.md for complete agent type definitions.
| Phase | Description |
|---|---|
| 0. Bootstrap | Create .loki/ directory structure, initialize state |
| 1. Discovery | Parse PRD, competitive research via web search |
| 2. Architecture | Tech stack selection with self-reflection |
| 3. Infrastructure | Provision cloud, CI/CD, monitoring |
| 4. Development | Implement with TDD, parallel code review |
| 5. QA | 14 quality gates, security audit, load testing |
| 6. Deployment | Blue-green deploy, auto-rollback on errors |
| 7. Business | Marketing, sales, legal, support setup |
| 8. Growth | Continuous optimization, A/B testing, feedback loops |
Every code change goes through 3 specialized reviewers simultaneously:
IMPLEMENT → REVIEW (parallel) → AGGREGATE → FIX → RE-REVIEW → COMPLETE
│
├─ code-reviewer (Opus) - Code quality, patterns, best practices
├─ business-logic-reviewer (Opus) - Requirements, edge cases, UX
└─ security-reviewer (Opus) - Vulnerabilities, OWASP Top 10
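The parallel review and aggregate steps can be sketched with a thread pool that runs three reviewers on the same change and merges their findings. Everything here is an illustrative assumption: the reviewers are local stub functions rather than subagents, and the severity names in `BLOCKING` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Finding = Dict[str, str]  # keys: "reviewer", "severity", "note"


# Stub reviewers; in Loki Mode these would be subagents, not local functions.
def code_reviewer(diff: str) -> List[Finding]:
    if "eval(" in diff:
        return [{"reviewer": "code", "severity": "high", "note": "avoid eval"}]
    return []


def business_logic_reviewer(diff: str) -> List[Finding]:
    return []


def security_reviewer(diff: str) -> List[Finding]:
    if "eval(" in diff:
        return [{"reviewer": "security", "severity": "critical", "note": "code injection"}]
    return []


REVIEWERS: List[Callable[[str], List[Finding]]] = [
    code_reviewer, business_logic_reviewer, security_reviewer,
]
BLOCKING = {"critical", "high"}  # assumed severity names


def review(diff: str) -> List[Finding]:
    """Run all reviewers concurrently and aggregate their findings."""
    with ThreadPoolExecutor(max_workers=len(REVIEWERS)) as pool:
        results = list(pool.map(lambda reviewer: reviewer(diff), REVIEWERS))
    return [finding for findings in results for finding in findings]


def needs_fix(findings: List[Finding]) -> bool:
    """A change goes back to FIX when any finding has a blocking severity."""
    return any(f["severity"] in BLOCKING for f in findings)


findings = review("result = eval(user_input)")
print(len(findings), needs_fix(findings))
```

`pool.map` returns results in reviewer order, so the aggregated list is deterministic even though the reviewers run concurrently.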
Severity-based issue handling: findings that do not block are annotated in the code and work continues. One severity level gets a // TODO(review): ... comment; nitpicks get a // FIXME(nitpick): ... comment.
.loki/
├── state/ # Orchestrator and agent states
├── queue/ # Task queue (pending, in-progress, completed, dead-letter)
├── memory/ # Episodic, semantic, and procedural memory
├── metrics/ # Efficiency tracking and reward signals
├── messages/ # Inter-agent communication
├── logs/ # Audit logs
├── config/ # Configuration files
├── prompts/ # Agent role prompts
├── artifacts/ # Releases, reports, backups
├── dashboard/ # Real-time monitoring dashboard
└── scripts/ # Helper scripts
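The Bootstrap phase creates this layout before any agent runs. A minimal sketch of an idempotent bootstrap step is shown below; the function name `bootstrap` is hypothetical, and representing the four queue states as subdirectories of queue/ is an assumption (the tree only says the queue holds them).

```python
import tempfile
from pathlib import Path

# Top-level directories, taken from the tree above.
SUBDIRS = [
    "state", "queue", "memory", "metrics", "messages", "logs",
    "config", "prompts", "artifacts", "dashboard", "scripts",
]
# Assumed to be subdirectories of queue/.
QUEUE_STATES = ["pending", "in-progress", "completed", "dead-letter"]


def bootstrap(project_root: Path) -> Path:
    """Create the .loki/ layout; safe to call again on an existing project."""
    loki = project_root / ".loki"
    for name in SUBDIRS:
        (loki / name).mkdir(parents=True, exist_ok=True)
    for state in QUEUE_STATES:
        (loki / "queue" / state).mkdir(exist_ok=True)
    return loki


root = Path(tempfile.mkdtemp())
loki_dir = bootstrap(root)
bootstrap(root)  # second call is a no-op
print(sorted(p.name for p in loki_dir.iterdir()))
```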
Test Loki Mode with these pre-built PRDs in the examples/ directory:
| PRD | Complexity | Est. Time | Description |
|---|---|---|---|
| simple-todo-app.md | Low | ~10 min | Basic todo app - tests core functionality |
| api-only.md | Low | ~10 min | REST API only - tests backend agents |
| static-landing-page.md | Low | ~5 min | HTML/CSS only - tests frontend/marketing |
| full-stack-demo.md | Medium | ~30-60 min | Complete bookmark manager - full test |
# Example: Run with simple todo app
./autonomy/run.sh examples/simple-todo-app.md
Customize the autonomous runner with environment variables:
LOKI_MAX_RETRIES=100 \
LOKI_BASE_WAIT=120 \
LOKI_MAX_WAIT=7200 \
./autonomy/run.sh ./docs/requirements.md
| Variable | Default | Description |
|---|---|---|
| LOKI_MAX_RETRIES | 50 | Maximum retry attempts before giving up |
| LOKI_BASE_WAIT | 60 | Base wait time in seconds |
| LOKI_MAX_WAIT | 3600 | Maximum wait time in seconds (1 hour) |
| LOKI_SKIP_PREREQS | false | Skip prerequisite checks |
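These three variables are enough to derive the whole wait schedule. The sketch below assumes plain doubling from the base wait, capped at the maximum, which matches the 60s → 120s → 240s progression described earlier; whether the real runner adds jitter is not stated, and the function name `backoff_schedule` is hypothetical.

```python
import os


def backoff_schedule(env=os.environ):
    """Yield the wait in seconds before each retry, doubling up to the cap."""
    max_retries = int(env.get("LOKI_MAX_RETRIES", "50"))
    base_wait = int(env.get("LOKI_BASE_WAIT", "60"))
    max_wait = int(env.get("LOKI_MAX_WAIT", "3600"))
    for attempt in range(max_retries):
        yield min(base_wait * 2 ** attempt, max_wait)


# Defaults from the table above.
defaults = list(backoff_schedule(env={}))

# The overrides used in the example command above.
custom = list(backoff_schedule(env={
    "LOKI_MAX_RETRIES": "100",
    "LOKI_BASE_WAIT": "120",
    "LOKI_MAX_WAIT": "7200",
}))

print(defaults[:8])
print(custom[:8])
```

With the defaults, the wait reaches the one-hour cap on the seventh retry and stays there for the remaining attempts.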
# .loki/config/circuit-breakers.yaml
defaults:
  failureThreshold: 5
  cooldownSeconds: 300
# .loki/config/alerting.yaml
channels:
  slack:
    webhook_url: "${SLACK_WEBHOOK_URL}"
    severity: [critical, high]
  pagerduty:
    integration_key: "${PAGERDUTY_KEY}"
    severity: [critical]
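The `failureThreshold` and `cooldownSeconds` defaults in circuit-breakers.yaml correspond to the two parameters of a conventional circuit breaker. A minimal sketch of that pattern follows, with the usual semantics assumed: open after N consecutive failures, allow a trial call once the cooldown has passed. The class and its methods are illustrative, not Loki Mode's code.

```python
import time


class CircuitBreaker:
    """Open after N consecutive failures; allow a trial call after a cooldown."""

    def __init__(self, failure_threshold=5, cooldown_seconds=300, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Half-open: permit a trial call once the cooldown has elapsed.
        return self.clock() - self.opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


# Fake clock so the example runs instantly.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for _ in range(5):
    breaker.record_failure()
blocked = breaker.allow()   # circuit is open
now[0] += 300
trial = breaker.allow()     # cooldown elapsed, trial call permitted
breaker.record_success()    # trial succeeded, circuit closes
print(blocked, trial, breaker.allow())
```

Injecting the clock keeps the cooldown logic testable without sleeping for five minutes.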
Loki Mode requires Claude Code started with the --dangerously-skip-permissions flag.
Optional but recommended:
Integrate with Vibe Kanban for a visual kanban board:
# Install Vibe Kanban
npx vibe-kanban
# Export Loki tasks to Vibe Kanban
./scripts/export-to-vibe-kanban.sh
See integrations/vibe-kanban.md for the full setup guide.
Run the comprehensive test suite:
# Run all tests
./tests/run-all-tests.sh
# Or run individual test suites
./tests/test-bootstrap.sh # Directory structure, state init
./tests/test-task-queue.sh # Queue operations, priorities
./tests/test-circuit-breaker.sh # Failure handling, recovery
./tests/test-agent-timeout.sh # Timeout, stuck process handling
./tests/test-state-recovery.sh # Checkpoints, recovery
Contributions welcome!
MIT License - see LICENSE for details.
Loki Mode incorporates research and patterns from leading AI labs and practitioners:
| Source | Key Contribution |
|---|---|
| Anthropic: Building Effective Agents | Evaluator-optimizer pattern, parallelization |
| Anthropic: Constitutional AI | Self-critique against principles |
| DeepMind: Scalable Oversight via Debate | Debate-based verification |
| DeepMind: SIMA 2 | Self-improvement loop |
| OpenAI: Agents SDK | Guardrails, tripwires, tracing |
| NVIDIA ToolOrchestra | Efficiency metrics, reward signals |
| CONSENSAGENT (ACL 2025) | Anti-sycophancy, blind review |
| GoalAct | Hierarchical planning |
Full Acknowledgements - Complete list of 50+ research papers, articles, and resources
Built for the Claude Code ecosystem, powered by Anthropic's Claude models (Sonnet, Haiku, Opus).
Ready to build a startup while you sleep?
git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
./autonomy/run.sh your-prd.md
Keywords: claude-code, claude-skills, ai-agents, autonomous-development, multi-agent-system, sdlc-automation, startup-automation, devops, mlops, deployment-automation, self-healing, perpetual-improvement
Loki Mode is a free, open-source AI agent skill. Install it with a single command:
npx mdskills install sickn33/loki-mode
This downloads the skill files into your project, and your AI agent picks them up automatically.
Loki Mode works with Claude Code, Claude Desktop, Cursor, VS Code Copilot, Windsurf, Continue Dev, Codex, Gemini CLI, Amp, Roo Code, Goose, OpenCode, Trae, Qodo, and Command Code. Skills use the open SKILL.md format, which is compatible with any AI coding agent that reads markdown instructions.