<p align="center">
  <img src="site/assets/logo.png" alt="AgentSys" width="120">
</p>

<h1 align="center">AgentSys</h1>

<p align="center">
  <strong>A modular runtime and orchestration system for AI agents.</strong>
</p>

> **Renamed from `awesome-slash`** — The `awesome-` prefix implies a curated list of links, but this project is a functional software suite and runtime. Please update your installs: `npm install -g agentsys`

<p align="center">
  <a href="https://www.npmjs.com/package/agentsys"><img src="https://img.shields.io/npm/v/agentsys.svg" alt="npm version"></a>
  <a href="https://www.npmjs.com/package/agentsys"><img src="https://img.shields.io/npm/dm/agentsys.svg" alt="npm downloads"></a>
  <a href="https://github.com/avifenesh/agentsys/actions/workflows/ci.yml"><img src="https://github.com/avifenesh/agentsys/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
  <a href="https://github.com/avifenesh/agentsys/stargazers"><img src="https://img.shields.io/github/stars/avifenesh/agentsys.svg" alt="GitHub stars"></a>
  <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT"></a>
  <a href="https://avifenesh.github.io/agentsys/"><img src="https://img.shields.io/badge/Website-AgentSys-blue?style=flat&logo=github" alt="Website"></a>
  <a href="https://github.com/hesreallyhim/awesome-claude-code"><img src="https://awesome.re/mentioned-badge.svg" alt="Mentioned in Awesome Claude Code"></a>
</p>

<p align="center">
  <b>13 plugins · 42 agents · 28 skills · 26k lines of lib code · 3,357 tests · 3 platforms</b>
</p>

<p align="center">
  <a href="#commands">Commands</a> · <a href="#installation">Installation</a> · <a href="https://avifenesh.github.io/agentsys/">Website</a> · <a href="https://github.com/avifenesh/agentsys/discussions">Discussions</a>
</p>

<p align="center">
  <b>Built for Claude Code · Codex CLI · OpenCode</b>
</p>

<p align="center"><em>New skills, agents, and
integrations ship constantly. Follow for real-time updates:</em></p>

<p align="center">
  <a href="https://x.com/avi_fenesh"><img src="https://img.shields.io/badge/Follow-@avi__fenesh-1DA1F2?style=for-the-badge&logo=x&logoColor=white" alt="Follow on X"></a>
</p>

---

AI models can write code. That's not the hard part anymore. The hard part is everything around it — task selection, branch management, code review, artifact cleanup, CI, PR comments, deployment. **AgentSys is the runtime that orchestrates agents to handle all of it** — structured pipelines, gated phases, specialized agents, and persistent state that survives session boundaries.

---

> Building custom skills, agents, hooks, or MCP tools? [agnix](https://github.com/avifenesh/agnix) is the CLI + LSP linter that catches config errors before they fail silently: real-time IDE validation, auto-suggestions, auto-fix, and 155 rules for Cursor, Claude Code, Cline, Copilot, Codex, Windsurf, and more.

## What This Is

An agent orchestration system — 13 plugins, 42 agents, and 28 skills that compose into structured pipelines for software development.

Each agent has a single responsibility, a specific model assignment, and defined inputs and outputs. Pipelines enforce phase gates so agents can't skip steps. State persists across sessions so work survives interruptions.

The system runs on Claude Code, OpenCode, and Codex CLI. Install the plugins, get the runtime.

---

## The Approach

**Code does code work.
AI does AI work.**

- **Detection**: regex, AST analysis, static analysis—fast, deterministic, no tokens wasted
- **Judgment**: LLM calls for synthesis, planning, review—where reasoning matters
- **Result**: 77% fewer tokens for [/drift-detect](#drift-detect) vs multi-agent approaches, certainty-graded findings throughout

**Certainty levels exist because not all findings are equal:**

| Level | Meaning | Action |
|-------|---------|--------|
| HIGH | Definitely a problem | Safe to auto-fix |
| MEDIUM | Probably a problem | Needs context |
| LOW | Might be a problem | Needs human judgment |

This grading came from testing on 1,000+ repositories.

---

## Commands

<!-- GEN:START:readme-commands -->
| Command | What it does |
|---------|--------------|
| [`/next-task`](#next-task) | Task → exploration → plan → implementation → review → ship |
| [`/agnix`](#agnix) | **Lint agent configs** - 155 rules for Skills, Memory, Hooks, MCP across 10+ AI tools |
| [`/ship`](#ship) | Branch → PR → CI → reviews addressed → merge → cleanup |
| [`/deslop`](#deslop) | 3-phase detection pipeline, certainty-graded findings |
| [`/perf`](#perf) | 10-phase performance investigation with baselines and profiling |
| [`/drift-detect`](#drift-detect) | AST-based plan-vs-code analysis, finds what's documented but not built |
| [`/audit-project`](#audit-project) | Multi-agent code review, iterates until issues resolved |
| [`/enhance`](#enhance) | Analyzes prompts, agents, plugins, docs, hooks, skills |
| [`/repo-map`](#repo-map) | AST symbol and import mapping via ast-grep |
| [`/sync-docs`](#sync-docs) | Finds outdated references, stale examples, missing CHANGELOG entries |
| [`/learn`](#learn) | Research any topic, gather online sources, create a learning guide with a RAG index |
| [`/consult`](#consult) | Consult another AI CLI tool for a second opinion. Use it to cross-check ideas, get alternative approaches, or validate decisions with Gemini, Codex, Claude, OpenCode, or Copilot. |
| [`/debate`](#debate) | Structured debate between two AI tools with proposer/challenger roles and a verdict. Use when you want to "debate", "argue about", "compare perspectives", "stress-test an idea", play "devil's advocate", or pit tool vs tool. |
<!-- GEN:END:readme-commands -->

Each command works standalone. Together, they compose into end-to-end pipelines.

---

## Skills

<!-- GEN:START:readme-skills -->
28 skills included across the plugins:

| Category | Skills |
|----------|--------|
| **Performance** | `perf:perf-analyzer`, `perf:perf-baseline-manager`, `perf:perf-benchmarker`, `perf:perf-code-paths`, `perf:perf-investigation-logger`, `perf:perf-profiler`, `perf:perf-theory-gatherer`, `perf:perf-theory-tester` |
| **Enhancement** | `enhance:enhance-agent-prompts`, `enhance:enhance-claude-memory`, `enhance:enhance-cross-file`, `enhance:enhance-docs`, `enhance:enhance-hooks`, `enhance:enhance-orchestrator`, `enhance:enhance-plugins`, `enhance:enhance-prompts`, `enhance:enhance-skills` |
| **Workflow** | `next-task:discover-tasks`, `next-task:orchestrate-review`, `next-task:validate-delivery` |
| **Cleanup** | `deslop:deslop`, `sync-docs:sync-docs` |
| **Analysis** | `debate:debate`, `drift-detect:drift-analysis`, `repo-map:repo-mapping` |
| **Productivity** | `consult:consult` |
| **Learning** | `learn:learn` |
| **Linting** | `agnix:agnix` |
<!-- GEN:END:readme-skills -->

Skills are the reusable implementation units. Agents invoke skills; commands orchestrate agents.
When you install a plugin, its skills become available to all agents in that session.

---

## Quick Navigation

| Section | What's there |
|---------|--------------|
| [The Approach](#the-approach) | Why it's built this way |
| [Commands](#commands) | All 13 commands at a glance |
| [Skills](#skills) | 28 skills across plugins |
| [Command Details](#command-details) | Deep dive into each command |
| [How Commands Work Together](#how-commands-work-together) | Standalone vs integrated |
| [Design Philosophy](#design-philosophy) | The thinking behind the architecture |
| [Installation](#installation) | Get started |
| [Research & Testing](#research--testing) | What went into building this |
| [Documentation](#documentation) | Links to detailed docs |

---

## Command Details

### /next-task

**Purpose:** Complete task-to-production automation.

**What happens when you run it:**

1. **Policy Selection** - Choose the task source (GitHub issues, GitLab, local file), priority filter, and stopping point
2. **Task Discovery** - Shows the top 5 prioritized tasks; you pick one
3. **Worktree Setup** - Creates an isolated branch and working directory
4. **Exploration** - Deep codebase analysis to understand context
5. **Planning** - Designs the implementation approach
6. **User Approval** - You review and approve the plan (the last human interaction)
7. **Implementation** - Executes the plan
8. **Pre-Review** - Runs the [deslop](#deslop)-agent and test-coverage-checker
9. **Review Loop** - Multi-agent review iterates until clean
10. **Delivery Validation** - Verifies tests pass, the build passes, and requirements are met
11. **Docs Update** - Updates the CHANGELOG and related documentation
12. **[Ship](#ship)** - Creates the PR, monitors CI, addresses comments, merges

Phase 9 uses the `orchestrate-review` skill to spawn parallel reviewers (code quality, security, performance, test coverage) plus conditional specialists.

**Agents involved:**

| Agent | Model | Role |
|-------|-------|------|
| task-discoverer | sonnet | Finds and ranks tasks from your source |
| worktree-manager | haiku | Creates git worktrees and branches |
| exploration-agent | opus | Deep codebase analysis before planning |
| planning-agent | opus | Designs the step-by-step implementation plan |
| implementation-agent | opus | Writes the actual code |
| test-coverage-checker | sonnet | Validates that tests exist and are meaningful |
| delivery-validator | sonnet | Final checks before shipping |
| ci-monitor | haiku | Watches CI status |
| ci-fixer | sonnet | Fixes CI failures and review comments |
| simple-fixer | haiku | Executes mechanical edits |

**Cross-plugin agents:**

| Agent | Plugin | Role |
|-------|--------|------|
| deslop-agent | deslop | Removes AI artifacts before review |
| sync-docs-agent | sync-docs | Updates documentation |

**Usage:**

```bash
/next-task          # Start new workflow
/next-task --resume # Resume interrupted workflow
/next-task --status # Check current state
/next-task --abort  # Cancel and cleanup
```

[Full workflow documentation →](./docs/workflows/NEXT-TASK.md)

---

### /agnix

**Purpose:** Lint agent configurations before they break your workflow. The first dedicated linter for AI agent configs.

**[agnix](https://github.com/avifenesh/agnix)** is a standalone open-source project that provides the validation engine. This plugin integrates it into your workflow.

**The problem it solves:**

Agent configurations are code. They affect behavior, security, and reliability. But unlike application code, they have no linting.
You find out your SKILL.md is malformed when the agent fails. You discover your hooks have security issues when they're exploited. You realize your CLAUDE.md has conflicting rules when the AI behaves unexpectedly.

agnix catches these issues before they cause problems.

**What it validates:**

| Category | What It Checks |
|----------|----------------|
| **Structure** | Required fields, valid YAML/JSON, proper frontmatter |
| **Security** | Prompt injection vectors, overpermissive tools, exposed secrets |
| **Consistency** | Conflicting rules, duplicate definitions, broken references |
| **Best Practices** | Tool restrictions, model selection, trigger phrase quality |
| **Cross-Platform** | Compatibility across Claude Code, Cursor, Copilot, Codex, OpenCode, Gemini CLI, Cline, and more |

**155 validation rules** (57 auto-fixable) derived from:

- Official tool specifications (Claude Code, Cursor, GitHub Copilot, Codex CLI, OpenCode, Gemini CLI, and more)
- Research papers on agent reliability and prompt injection
- Real-world testing across 500+ repositories
- Community-reported issues and edge cases

**Supported files:**

| File Type | Examples |
|-----------|----------|
| Skills | `SKILL.md`, `*/SKILL.md` |
| Memory | `CLAUDE.md`, `AGENTS.md`, `.github/CLAUDE.md` |
| Hooks | `.claude/settings.json`, hooks configuration |
| MCP | `*.mcp.json`, MCP server configs |
| Cursor | `.cursor/rules/*.mdc`, `.cursorrules` |
| Copilot | `.github/copilot-instructions.md` |

**CI/CD Integration:**

agnix outputs SARIF format for GitHub Code Scanning.
Add it to your workflow:

```yaml
- name: Lint agent configs
  run: agnix --format sarif > results.sarif
- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif
```

**Usage:**

```bash
/agnix                      # Validate current project
/agnix --fix                # Auto-fix fixable issues
/agnix --strict             # Treat warnings as errors
/agnix --target claude-code # Only Claude Code rules
/agnix --format sarif       # Output for GitHub Code Scanning
```

**Agent:** agnix-agent (sonnet model)

**External tool:** Requires the [agnix CLI](https://github.com/avifenesh/agnix)

```bash
npm install -g agnix    # Install via npm
# or
cargo install agnix-cli # Install via Cargo
# or
brew install agnix      # Install via Homebrew (macOS)
```

**Why use agnix:**

- Catch config errors before they cause agent failures
- Enforce security best practices across your team
- Maintain consistency as your agent configs grow
- Integrate validation into CI/CD pipelines
- Support multiple AI tools from one linter

---

### /ship

**Purpose:** Takes your current branch from "ready to commit" to "merged PR."

**What happens when you run it:**

1. **Pre-flight** - Detects the CI platform, deployment platform, and branch strategy
2. **Commit** - Stages and commits with a generated message (if there are uncommitted changes)
3. **Push & PR** - Pushes the branch, creates a pull request
4. **CI Monitor** - Waits for CI, retries on transient failures
5. **Review Wait** - Waits 3 minutes for auto-reviewers (Copilot, Claude, Gemini, Codex)
6. **Address Comments** - Handles every comment from every reviewer
7. **Merge** - Merges when all comments are resolved and CI passes
8. **Deploy** - Deploys and validates (if multi-branch workflow)
9. **Cleanup** - Removes the worktree, closes the issue, deletes the branch

**Platform Detection:**

| Type | Detected |
|------|----------|
| CI | GitHub Actions, GitLab CI, CircleCI, Jenkins, Travis |
| Deploy | Railway, Vercel, Netlify, Fly.io, Render |
| Project | Node.js, Python, Rust, Go, Java |

**Review Comment Handling:**

Every comment gets addressed. No exceptions. The workflow categorizes comments and handles each:

- Code fixes get implemented
- Style suggestions get applied
- Questions get answered
- False positives get explained

If something can't be fixed, the workflow replies explaining why and resolves the thread.

**Usage:**

```bash
/ship                   # Full workflow
/ship --dry-run         # Preview without executing
/ship --strategy rebase # Use rebase instead of squash
```

[Full workflow documentation →](./docs/workflows/SHIP.md)

---

### /deslop

**Purpose:** Finds AI slop—debug statements, placeholder text, verbose comments, TODOs—and removes it.

**How detection works:**

Three phases run in sequence:

1. **Phase 1: Regex Patterns** (HIGH certainty)
   - `console.log`, `print()`, `dbg!()`, `println!()`
   - `// TODO`, `// FIXME`, `// HACK`
   - Empty catch blocks, disabled linters
   - Hardcoded secrets (API keys, tokens)

2. **Phase 2: Multi-Pass Analyzers** (MEDIUM certainty)
   - Doc-to-code ratio (excessive comments)
   - Verbosity ratio (AI preambles)
   - Over-engineering patterns
   - Buzzword inflation
   - Dead code detection
   - Stub functions

3. **Phase 3: CLI Tools** (LOW certainty, optional)
   - jscpd, madge, escomplex (JS/TS)
   - pylint, radon (Python)
   - golangci-lint (Go)
   - clippy (Rust)

**Languages supported:** JavaScript/TypeScript, Python, Rust, Go, Java

**Usage:**

```bash
/deslop               # Report only (safe)
/deslop apply         # Fix HIGH certainty issues
/deslop apply src/ 10 # Fix 10 issues in src/
```

**Thoroughness levels:**

- `quick` - Phase 1 only (fastest)
- `normal` - Phase 1 + Phase 2 (default)
- `deep` - All phases, if tools are available

[Pattern reference →](./docs/reference/SLOP-PATTERNS.md)

---

### /perf

**Purpose:** Structured performance investigation with baselines, profiling, and evidence-backed decisions.

**10-phase methodology** (based on recorded real-world performance investigation sessions):

1. **Setup** - Confirm the scenario, success criteria, and benchmark command
2. **Baseline** - 60s minimum runs, PERF_METRICS markers required
3. **Breaking Point** - Binary search to find the failure threshold
4. **Constraints** - CPU/memory limits, measure delta vs baseline
5. **Hypotheses** - Generate up to 5 hypotheses with evidence and confidence
6. **Code Paths** - Use repo-map to identify entrypoints and hot files
7. **Profiling** - Language-specific tools (--cpu-prof, JFR, cProfile, pprof)
8. **Optimization** - One change per experiment, 2+ validation passes
9. **Decision** - Continue or stop based on measurable improvement
10. **Consolidation** - Final baseline, evidence log, investigation complete

**Agents and skills:**

| Component | Role |
|-----------|------|
| perf-orchestrator | Coordinates all phases |
| perf-theory-gatherer | Generates hypotheses from git history and code |
| perf-theory-tester | Validates hypotheses with controlled experiments |
| perf-analyzer | Synthesizes findings into recommendations |
| perf-code-paths | Maps entrypoints and likely hot paths |
| perf-investigation-logger | Structured evidence logging |

**Usage:**

```bash
/perf          # Start new investigation
/perf --resume # Resume previous investigation
```

**Phase flags (advanced):**

```bash
/perf --phase baseline --command "npm run bench" --version v1.2.0
/perf --phase breaking-point --param-min 1 --param-max 500
/perf --phase constraints --cpu 1 --memory 1GB
/perf --phase hypotheses --hypotheses-file perf-hypotheses.json
/perf --phase optimization --change "reduce allocations"
/perf --phase decision --verdict stop --rationale "no measurable improvement"
```

---

### /drift-detect

**Purpose:** Compares your documentation and plans to what's actually in the code.

**The problem it solves:**

Your roadmap says "user authentication: done." But is it actually implemented? Your GitHub issue says "add dark mode." Is it already in the codebase? Plans drift from reality. This command finds the drift.

**How it works:**

1. **JavaScript collectors** gather data (fast, token-efficient)
   - GitHub issues and their labels
   - Documentation files
   - Actual code exports and implementations

2. **Single Opus call** performs semantic analysis
   - Matches concepts, not strings ("user auth" matches `auth/`, `login.js`, `session.ts`)
   - Identifies what's implemented but not documented
   - Identifies what's documented but not implemented
   - Finds stale issues that should be closed

**Why this approach:**

Multi-agent collection wastes tokens on coordination. JavaScript collectors are fast and deterministic. One well-prompted LLM call does the actual analysis. Result: 77% token reduction vs multi-agent approaches.

**Tested on 1,000+ repositories** before release.

**Usage:**

```bash
/drift-detect               # Full analysis
/drift-detect --depth quick # Quick scan
```

---

### /audit-project

**Purpose:** Multi-agent code review that iterates until issues are resolved.

**What happens when you run it:**

Up to 10 specialized role-based agents run, depending on your project:

| Agent | When Active | Focus Area |
|-------|-------------|------------|
| code-quality-reviewer | Always | Code quality, error handling |
| security-expert | Always | Vulnerabilities, auth, secrets |
| performance-engineer | Always | N+1 queries, memory, blocking ops |
| test-quality-guardian | Always | Coverage, edge cases, mocking |
| architecture-reviewer | If 50+ files | Modularity, patterns, SOLID |
| database-specialist | If DB detected | Queries, indexes, transactions |
| api-designer | If API detected | REST, errors, pagination |
| frontend-specialist | If frontend detected | Components, state, UX |
| backend-specialist | If backend detected | Services, domain logic |
| devops-reviewer | If CI/CD detected | Pipelines, configs, secrets |

Findings are collected and categorized by severity (critical/high/medium/low). All non-false-positive issues get fixed automatically.
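That loop can be sketched in miniature. `runReviewers` and `applyFix` below are hypothetical stand-ins for the real reviewer agents and fixer, not the plugin's actual interfaces:

```javascript
// Illustrative sketch of the audit loop: review, fix everything that is not
// a false positive (most severe first), and repeat until a pass comes back clean.
// `runReviewers` and `applyFix` are hypothetical stand-ins for the real agents.
function severityRank(severity) {
  return ["critical", "high", "medium", "low"].indexOf(severity);
}

function auditUntilClean(runReviewers, applyFix, maxPasses = 10) {
  for (let pass = 1; pass <= maxPasses; pass++) {
    const findings = runReviewers()
      .filter((f) => !f.falsePositive)
      .sort((a, b) => severityRank(a.severity) - severityRank(b.severity));
    if (findings.length === 0) return { pass, clean: true };
    findings.forEach(applyFix); // most severe issues fixed first
  }
  return { pass: maxPasses, clean: false };
}
```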
The loop repeats until no open issues remain.

**Usage:**

```bash
/audit-project                   # Full review
/audit-project --quick           # Single pass
/audit-project --resume          # Resume from queue file
/audit-project --domain security # Security focus only
/audit-project --recent          # Only recent changes
```

[Agent reference →](./docs/reference/AGENTS.md#audit-project-plugin-agents)

---

### /enhance

**Purpose:** Analyzes your prompts, plugins, agents, docs, hooks, and skills for improvement opportunities.

**Seven analyzers run in parallel:**

| Analyzer | What it checks |
|----------|----------------|
| plugin-enhancer | Plugin structure, MCP tool definitions, security patterns |
| agent-enhancer | Agent frontmatter, prompt quality |
| claudemd-enhancer | CLAUDE.md/AGENTS.md structure, token efficiency |
| docs-enhancer | Documentation readability, RAG optimization |
| prompt-enhancer | Prompt engineering patterns, clarity, examples |
| hooks-enhancer | Hook frontmatter, structure, safety |
| skills-enhancer | SKILL.md structure, trigger phrases |

**Each finding includes:**

- Certainty level (HIGH/MEDIUM/LOW)
- Specific location (file:line)
- What's wrong
- How to fix it
- Whether it can be auto-fixed

**Auto-learning:** Detects obvious false positives (pattern docs, workflow gates) and saves them for future runs.
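A minimal version of that memory might look like the following sketch; the fingerprint fields and storage shape are assumptions for illustration, not the plugin's real schema:

```javascript
// Illustrative sketch of auto-learned suppression: remember confirmed false
// positives by a stable fingerprint and drop them from future runs.
// The fingerprint fields are an assumption, not the plugin's actual format.
function fingerprint(finding) {
  return `${finding.analyzer}:${finding.rule}:${finding.file}`;
}

function learnFalsePositives(findings, learned) {
  for (const f of findings) {
    if (f.falsePositive) learned.add(fingerprint(f));
  }
  return learned;
}

function filterLearned(findings, learned) {
  return findings.filter((f) => !learned.has(fingerprint(f)));
}
```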
Reduces noise over time without manual suppression files.

**Usage:**

```bash
/enhance                   # Run all analyzers
/enhance --focus=agent     # Just agent prompts
/enhance --apply           # Apply HIGH certainty fixes
/enhance --show-suppressed # Show what's being filtered
/enhance --no-learn        # Analyze but don't save false positives
```

---

### /repo-map

**Purpose:** Builds an AST-based map of symbols and imports for fast repo analysis.

**What it generates:**

- Cached file→symbols map (exports, functions, classes)
- Import graph for dependency hints

Output is cached at `{state-dir}/repo-map.json` and exposed via the MCP `repo_map` tool.

**Why it matters:**

Tools like `/drift-detect` and planners can use the map instead of re-scanning the repo every time.

**Usage:**

```bash
/repo-map init   # First-time map generation
/repo-map update # Incremental update
/repo-map status # Check freshness
```

**Required:** ast-grep (`sg`) must be installed.

---

### /sync-docs

**Purpose:** Sync documentation with actual code changes—find outdated refs, update the CHANGELOG, flag stale examples.

**The problem it solves:**

You refactor `auth.js` into `auth/index.js`; your README still says `import from './auth'`. You rename a function; three docs still reference the old name. You ship a feature; the CHANGELOG doesn't mention it. Documentation drifts from code.
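One of the simpler checks, the version mismatch, can be sketched like this (the regex and data shapes are illustrative, not the plugin's implementation):

```javascript
// Hypothetical sketch of a version-mismatch check: flag docs whose quoted
// version differs from the version in package.json. Shapes are illustrative.
function findVersionMismatches(pkgVersion, docs) {
  const versionPattern = /\bv?(\d+\.\d+(?:\.\d+)?)\b/g;
  const mismatches = [];
  for (const { file, text } of docs) {
    for (const match of text.matchAll(versionPattern)) {
      if (match[1] !== pkgVersion) {
        mismatches.push({ file, found: match[1], expected: pkgVersion });
      }
    }
  }
  return mismatches;
}
```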
This command finds the drift.

**What it detects:**

| Category | Examples |
|----------|----------|
| Broken references | Imports to moved/renamed files, deleted exports |
| Version mismatches | Doc says v2.0, package.json says v2.1 |
| Stale code examples | Import paths that no longer exist |
| Missing CHANGELOG | `feat:` and `fix:` commits without entries |

**Auto-fixable vs flagged:**

| Auto-fixable (apply mode) | Flagged for review |
|---------------------------|--------------------|
| Version number updates | Removed exports referenced in docs |
| CHANGELOG entries for commits | Code examples needing context |
| | Function renames |

**Usage:**

```bash
/sync-docs             # Check what docs need updates (safe)
/sync-docs apply       # Apply safe fixes
/sync-docs report src/ # Check docs related to src/
/sync-docs --all       # Full codebase scan
```

---

### /learn

**Purpose:** Research any topic online and create a comprehensive learning guide with RAG-optimized indexes.

**What it does:**

1. **Progressive Discovery** - Uses a funnel approach (broad → specific → deep) to find quality sources
2. **Quality Scoring** - Scores sources by authority, recency, depth, examples, and uniqueness
3. **Just-In-Time Extraction** - Fetches only high-scoring sources to save tokens
4. **Synthesis** - Creates a structured learning guide with examples and best practices
5. **RAG Index** - Updates the CLAUDE.md/AGENTS.md master index for future lookups
6. **Enhancement** - Runs enhance:enhance-docs and enhance:enhance-prompts

**Depth levels:**

| Depth | Sources | Use Case |
|-------|---------|----------|
| brief | 10 | Quick overview |
| medium | 20 | Default, balanced |
| deep | 40 | Comprehensive |

**Output structure:**

```
agent-knowledge/
  CLAUDE.md                # Master index (updated each run)
  AGENTS.md                # Index for OpenCode/Codex
  recursion.md             # Topic-specific guide
  resources/
    recursion-sources.json # Source metadata with quality scores
```

**Usage:**

```bash
/learn recursion                 # Default (20 sources)
/learn react hooks --depth=deep  # Comprehensive (40 sources)
/learn kubernetes --depth=brief  # Quick overview (10 sources)
/learn python async --no-enhance # Skip enhancement pass
```

**Agent:** learn-agent (opus model for research quality)

---

### /consult

**Purpose:** Get a second opinion from another AI CLI tool without leaving your current session.

**What it does:**

1. **Tool Detection** - Detects which AI CLI tools are installed (cross-platform)
2. **Interactive Picker** - If no tool is specified, shows only installed tools to choose from
3. **Effort Mapping** - Maps effort levels to per-provider models and reasoning flags
4. **Execution** - Runs the consultation with safe-mode defaults and a 120s timeout
5. **Session Continuity** - Saves session state for Claude and Gemini (supports `--continue`)

**Supported tools:**

| Tool | Default Model (high) | Reasoning Control |
|------|---------------------|-------------------|
| Claude | opus | max-turns |
| Gemini | gemini-3-pro | built-in |
| Codex | gpt-5.3-codex | model_reasoning_effort |
| OpenCode | github-copilot/claude-opus-4-6 | --variant |
| Copilot | (default) | none |

**Usage:**

```bash
/consult "Is this the right approach?" --tool=gemini --effort=high
/consult "Review for performance issues" --tool=codex
/consult "Suggest alternatives" --tool=claude --effort=max
/consult "Continue from where we left off" --continue
/consult "Explain this error" --context=diff --tool=gemini
```

**Agent:** consult-agent (sonnet model for orchestration)

---

### /debate

**Purpose:** Stress-test ideas through structured multi-round debate between two AI CLI tools.

**What it does:**

1. **Tool Detection** - Detects which AI CLI tools are installed (cross-platform)
2. **Interactive Picker** - If no tools are specified, prompts for proposer, challenger, effort, rounds, and context in a single batch question
3. **Proposer/Challenger Format** - The first tool argues for the topic; the second challenges with evidence
4. **Multi-Round Exchange** - Each round, the proposer defends and the challenger responds (1–5 rounds)
5. **Verdict** - The orchestrator delivers a final synthesis, picking a winner with reasoning

**Usage:**

```bash
# Natural language
/debate codex vs gemini about microservices vs monolith
/debate with claude and codex about our auth implementation
/debate thoroughly gemini vs codex about database schema design
/debate codex vs gemini 3 rounds about event sourcing

# Explicit flags
/debate "Should we use event sourcing?" --tools=claude,gemini --rounds=3 --effort=high
/debate "Valkey vs PostgreSQL for caching" --tools=codex,opencode

# With codebase context
/debate "Is our current approach correct?" --tools=gemini,codex --context=diff
```

**Options:**

| Flag | Description |
|------|-------------|
| `--tools=TOOL1,TOOL2` | Proposer and challenger (comma-separated) |
| `--rounds=N` | Number of debate rounds, 1–5 (default: 2) |
| `--effort=low\|medium\|high\|max` | Reasoning depth per tool call |
| `--context=diff\|file=PATH\|none` | Codebase context passed to both tools |

**Agent:** debate-orchestrator (opus model for orchestration)

---

## How Commands Work Together

**Standalone use:**

```bash
/deslop apply  # Just clean up your code
/sync-docs     # Just check if docs need updates
/ship          # Just ship this branch
/audit-project # Just review the codebase
```

**Integrated workflow:**

When you run [`/next-task`](#next-task), it orchestrates everything:

```
/next-task picks task → explores codebase → plans implementation
        ↓
implementation-agent writes code
        ↓
deslop-agent cleans AI artifacts
        ↓
Phase 9 review loop iterates until approved
        ↓
delivery-validator checks requirements
        ↓
sync-docs-agent syncs documentation
        ↓
/ship creates PR → monitors CI → merges
```

The workflow tracks state so you can resume from any point.

---

## Design Philosophy

<details>
<summary><strong>Architecture decisions and trade-offs</strong> (click to expand)</summary>

### The Actual Problem

Frontier models write good code. That's solved.
What's not solved:

- **Context management** - Models forget what they're doing mid-session
- **Compaction amnesia** - Long sessions get summarized, losing critical state
- **Task drift** - Without structure, agents wander from the actual goal
- **Skipped steps** - Agents skip reviews, tests, or cleanup when not enforced
- **Token waste** - Using LLM calls for work that static analysis can do faster
- **Babysitting** - Manually orchestrating each phase of development
- **Repetitive requests** - Asking for the same workflow every single session

### How This Addresses It

**1. One agent, one job, done extremely well**

Same principle as good code: single responsibility. The exploration-agent explores. The implementation-agent implements. Phase 9 spawns multiple focused reviewers. No agent tries to do everything. Specialized agents, each with a narrow scope and clear success criteria.

**2. Pipeline with gates, not a monolith**

Same principle as DevOps: each step must pass before the next begins. Can't push before review. Can't merge before CI passes. Hooks enforce this—agents literally cannot skip phases.

**3. Tools do tool work, agents do agent work**

If static analysis, a regex, or a shell command can do it, don't ask an LLM. Pattern detection uses pre-indexed regexes. File discovery uses glob. Platform detection uses file-existence checks. The LLM only handles what requires judgment.

**4. Agents don't need to know how tools work**

The slop detector returns findings with certainty levels. The agent doesn't need to understand the three-phase pipeline, the regex patterns, or the analyzer heuristics. Good tool design means the consumer doesn't need implementation details.

**5. Build tools where tools don't exist**

Many tasks lack existing tools. JavaScript collectors for drift-detect. Multi-pass analyzers for slop detection.
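As a toy illustration of the idea (the patterns and output shape here are invented for the sketch, not the real detector's):

```javascript
// Toy detector in this style: a deterministic regex scan that returns
// certainty-graded findings rather than raw text. Patterns are illustrative.
const PATTERNS = [
  { rule: "debug-log", regex: /console\.log\(/, certainty: "HIGH" },
  { rule: "todo-comment", regex: /\/\/\s*(TODO|FIXME|HACK)/, certainty: "HIGH" },
];

function detectSlop(source, file) {
  const findings = [];
  source.split("\n").forEach((line, i) => {
    for (const { rule, regex, certainty } of PATTERNS) {
      if (regex.test(line)) findings.push({ file, line: i + 1, rule, certainty });
    }
  });
  return findings;
}
```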
The result: agents receive structured data, not raw problems to figure out.

**6. Research-backed prompt engineering**

Documented techniques that measurably improve results:

- **Progressive disclosure** - Agents see only what's needed for the current step
- **Structured output** - JSON between delimiters, XML tags for sections
- **Explicit constraints** - What agents MUST NOT do matters as much as what they do
- **Few-shot examples** - Where patterns aren't obvious
- **Tool calling over generation** - Let the model use tools rather than generate tool-like output

**7. Validate plan and results, not every step**

Approve the plan. See the results. The middle is automated. One plan approval unlocks autonomous execution through implementation, review, cleanup, and shipping.

**8. Right model for the task**

Match model capability to task complexity:

- **opus** - Exploration, planning, implementation, review orchestration
- **sonnet** - Pattern matching, validation, discovery
- **haiku** - Git operations, file moves, CI polling

Quality compounds. Poor exploration → poor plan → poor implementation → review cycles. Early phases deserve the best model.

**9. Persistent state survives sessions**

Two JSON files track everything: what task, what phase. Sessions can die and resume. Multiple sessions run in parallel on different tasks using separate worktrees.

**10. Delegate everything automatable**

Agents don't just write code. They:

- Clean their own output (deslop-agent)
- Update documentation (sync-docs-agent)
- Fix CI failures (ci-fixer)
- Respond to review comments
- Check for plan drift ([/drift-detect](#drift-detect))
- Analyze their own prompts ([/enhance](#enhance))

If it can be specified, it can be delegated.

**11. Orchestrator stays high-level**

The main workflow orchestrator doesn't read files, search code, or write implementations.
It launches specialized agents and receives their outputs. This keeps the orchestrator's context window available for coordination rather than filled with file contents.

**12. Composable, not monolithic**

Every command works standalone. [`/deslop`](#deslop) cleans code without needing [`/next-task`](#next-task). [`/ship`](#ship) merges PRs without needing the full workflow. Pieces compose together, but each piece is useful on its own.

### What This Gets You

- **Run multiple sessions** - Different tasks in different worktrees, no interference
- **Fast iteration** - Approve plan, check results, repeat
- **Stay in the interesting parts** - Policy decisions, architecture choices, edge cases
- **Minimal review burden** - Most issues caught and fixed before you see the output
- **No repetitive requests** - The workflow you want, without asking each time
- **Scale horizontally** - More sessions, more tasks, same oversight level

</details>

---

## Installation

### Claude Code (Recommended)

```bash
/plugin marketplace add avifenesh/agentsys
/plugin install next-task@agentsys
/plugin install ship@agentsys
```

### All Platforms (npm)

```bash
npm install -g agentsys && agentsys
```

This launches an interactive installer for Claude Code, OpenCode, and Codex CLI.

```bash
# Non-interactive install
agentsys --tool claude               # Single tool
agentsys --tools "claude,opencode"   # Multiple tools
agentsys --development               # Dev mode (bypasses marketplace)
```

[Full installation guide →](./docs/INSTALLATION.md)

---

## Requirements

**Required:**
- Git
- Node.js 18+

**For GitHub workflows:**
- GitHub CLI (`gh`) authenticated

**For GitLab workflows:**
- GitLab CLI (`glab`) authenticated

**For /repo-map:**
- ast-grep (`sg`) installed

**For /agnix:**
- [agnix CLI](https://github.com/avifenesh/agnix) installed (`cargo install agnix-cli` or `brew install agnix`)

**Local diagnostics (optional):**

```bash
npm run detect   # Platform detection (CI, deploy, project type)
npm run verify   # Tool availability + versions
```

---

## Research & Testing

The system is built on research, not guesswork.

**Knowledge base** (`agent-docs/`): 8,000 lines of curated documentation from Anthropic, OpenAI, Google, and Microsoft covering:
- Agent architecture and design patterns
- Prompt engineering techniques
- Function calling and tool use
- Context efficiency and token optimization
- Multi-agent systems and orchestration
- Instruction following reliability

**Testing:**
- 1,818 tests passing
- Drift-detect validated on 1,000+ repositories
- E2E workflow testing across all commands
- Cross-platform validation (Claude Code, OpenCode, Codex CLI)

**Methodology:**
- `/perf` investigation phases based on recordings of real performance-investigation sessions
- Certainty levels derived from pattern analysis across repositories
- Token optimization measured and validated (77% reduction in drift-detect)

---

## Documentation

| Topic | Link |
|-------|------|
| Installation | [docs/INSTALLATION.md](./docs/INSTALLATION.md) |
| Cross-Platform Setup | [docs/CROSS_PLATFORM.md](./docs/CROSS_PLATFORM.md) |
| Usage Examples | [docs/USAGE.md](./docs/USAGE.md) |
| Architecture | [docs/ARCHITECTURE.md](./docs/ARCHITECTURE.md) |

### Workflow Deep-Dives

| Workflow | Link |
|----------|------|
| /next-task Flow | [docs/workflows/NEXT-TASK.md](./docs/workflows/NEXT-TASK.md) |
| /ship Flow | [docs/workflows/SHIP.md](./docs/workflows/SHIP.md) |

### Reference

| Topic | Link |
|-------|------|
| Slop Patterns | [docs/reference/SLOP-PATTERNS.md](./docs/reference/SLOP-PATTERNS.md) |
| Agent Reference | [docs/reference/AGENTS.md](./docs/reference/AGENTS.md) |

---

## Support

- **Issues:** [github.com/avifenesh/agentsys/issues](https://github.com/avifenesh/agentsys/issues)
- **Discussions:** [github.com/avifenesh/agentsys/discussions](https://github.com/avifenesh/agentsys/discussions)

---

MIT License | Made by [Avi Fenesh](https://github.com/avifenesh)