AI-assisted coding is transforming how software gets built. But the data tells a sobering story: - 1.7x more bugs in AI-generated code compared to human-written code (CodeRabbit, 2025) - 40% of AI-generated code contains security vulnerabilities - 2026 is "the year of technical debt from vibe coding" (Forrester) The root cause is not the AI. It is the absence of engineering discipline between sess
Add this skill
npx mdskills install Vimalk0703/shipworthyComprehensive engineering framework enforcing 55+ production practices across security, testing, and architecture with auto-activation.
Vibe coding is how you start; engineering is what keeps it alive.
AI-assisted coding is transforming how software gets built. But the data tells a sobering story:
The root cause is not the AI. It is the absence of engineering discipline between sessions. The AI forgets your architecture the moment the session ends. It skips tests because you did not ask for them. It introduces security holes because nobody told it your auth strategy. Every new session starts from zero.
You end up building the same feature three times: once to get it working, once to fix what it broke, and once more when you realize the fix broke something else.
Shipworthy is a Claude Code plugin that auto-activates every session and enforces production engineering practices with full transparency. It detects your project type, generates an architecture spec, and maintains it across sessions. You vibe code at full speed -- the plugin handles TDD, security, quality gates, and 55 engineering skills while showing you exactly what it's doing. No configuration, no ceremony, no workflow changes.
# Any AI agent (CLI setup — hooks + skills + quality gates)
npx shipworthy init
# Specific agent
npx shipworthy init --agent cursor
npx shipworthy init --agent copilot
npx shipworthy init --agent codex
npx shipworthy init --agent windsurf
npx shipworthy init --agent gemini
| Agent | Setup | Hooks | Skills | Quality Gates |
|---|---|---|---|---|
| Claude Code | npx shipworthy init | Full | Full (55) | Automated |
| Cursor | npx shipworthy init --agent cursor | Rules | Full | Manual |
| GitHub Copilot | npx shipworthy init --agent copilot | Rules | Full | Manual |
| OpenAI Codex | npx shipworthy init --agent codex | Rules | Full | Manual |
| Windsurf | npx shipworthy init --agent windsurf | Rules | Full | Manual |
| Gemini CLI | npx shipworthy init --agent gemini | Rules | Full | Manual |
That is the only setup. Here is what happens next:
.shipworthy/architecture.md captures your project's conventions, mandatory rules, and structure.Engineering guardrails activate automatically based on what you are doing. Writing a new feature triggers brainstorming, then planning, then TDD. Creating an API endpoint activates API design standards and security. You never invoke these manually -- they fire when relevant and stay silent when they are not.
Every Shipworthy action is visible. Hooks log color-coded activity to your terminal in real time — security scans, compliance checks, push validation. Skills announce themselves before activating. Commands, agents, templates, and adapters all identify when they're contributing. You always know what Shipworthy is doing and why.
┌─ ⚓ shipworthy ─────────────────────────────┐
│ Tier: ENGINEER │ Health: all passed │
│ Skills: 55 │ Hooks: 6 active │
└──────────────────────────────────────────────┘
⚓ shipworthy 14:32:05 pre-tool-use › Scanning: service.ts
⚓ shipworthy 14:32:05 pre-tool-use › All checks passed ✓
⚓ shipworthy › skill:
api-design-standards+security-first-development— designing secure endpoint
Toggle off with SHIPWORTHY_TRANSPARENCY=0 or "transparency": false in .shipworthy/config.json.
The architecture spec is Claude's long-term memory for your project. Mandatory rules, directory conventions, naming patterns, tech choices -- all persisted and enforced. Session 5 knows everything session 1 decided. No more "Claude forgot we use Prisma" or "it put the route in the wrong directory again."
Inspired by production agent memory architectures, Shipworthy manages a .shipworthy/ directory as persistent project memory:
description and last_updated fields. The description feeds into INDEX.md for one-line scanning without reading full files./retro offers to merge duplicates, prune stale entries, fix relative dates, and remove facts contradicted by current code.YYYY-MM-DD, never "yesterday" or "last week". Relative dates become meaningless across sessions.A weekend prototype should not face the same ceremony as an enterprise platform. The plugin scales its enforcement: lightweight checks for small projects, full quality gates as your codebase grows. You start fast and the guardrails tighten as complexity demands it.
| Tier | Who | Experience |
|---|---|---|
| Builder | Non-technical, prototyping | Guardrails are silent. Tests happen invisibly. Plain language feedback when something needs attention. |
| Maker | Some experience, growing project | Moderate ceremony. Explains why tests matter. Offers choices on architecture decisions. |
| Engineer | Production codebase, CI/CD | Full TDD, quality gates, architecture enforcement. Every PR is verified before completion. |
| Skill | What It Does |
|---|---|
| using-shipworthy | Master router -- loaded every session, dispatches to relevant skills |
| architecture-awareness | Auto-detects project type, generates and enforces architecture spec |
| intent-to-spec | Converts vague requests into detailed specs (invisible for Builder, shown for Engineer) |
| Skill | What It Does |
|---|---|
| brainstorming | 5-step design discovery with HARD-GATE approval before proceeding |
| writing-plans | Breaks work into bite-sized TDD implementation plans with HARD-GATE |
| executing-plans | Systematic task execution with verification at each step |
| design-documents | Creates Architecture Decision Records (ADRs) |
| decision-frameworks | Structured decision-making for trade-offs |
| Skill | What It Does |
|---|---|
| test-driven-development | RED-GREEN-REFACTOR discipline for every feature |
| quality-gates | Graduated pre-commit checks that scale with project size |
| verification-before-completion | Requires evidence (passing tests, clean build) before marking work done |
| error-handling-patterns | Structured errors, recovery strategies, and user-facing messages |
| code-complexity | Identifies and refactors complex code |
| Skill | What It Does |
|---|---|
| security-first-development | OWASP-aware coding -- input validation, auth, secrets management |
| adaptive-security | Auto-detects app type (web/API/GraphQL/mobile/CLI/IoT/desktop/IaC/container) and applies type-specific security profiles |
| secrets-management | Comprehensive lifecycle: rotation, vault integration, leak detection |
| dependency-management | Vet, audit, and pin packages before adding them |
| supply-chain-security | Lock file integrity, typosquatting detection, SBOM, license compliance |
| pii-detection | Identifies and protects personally identifiable data |
| threat-modeling | Structured threat analysis |
| compliance-awareness | HIPAA, PCI-DSS, SOC2, GDPR guidance |
| container-security | Docker/container-specific hardening |
| Skill | What It Does |
|---|---|
| api-design-standards | REST conventions, type-safe contracts, consistent error responses |
| database-design | Schemas, migrations, indexing, N+1 prevention |
| performance-budgets | Bundle size limits, response time targets, query count caps |
| observability-by-default | Structured logging, tracing, health checks from day one |
| resilience-patterns | Circuit breakers, bulkheads, retries, timeouts, graceful degradation |
| twelve-factor-app | Stateless design, env config, backing services |
| distributed-systems | Multi-service coordination, eventual consistency |
| api-versioning | Breaking change management |
| api-backward-compatibility | Non-breaking API evolution |
| Skill | What It Does |
|---|---|
| subagent-driven-development | Dispatch specialized agents with 2-stage review |
| dispatching-parallel-agents | Run independent tasks concurrently for speed |
| requesting-code-review | Structured review via the code-reviewer agent |
| receiving-code-review | Technical verification over performative agreement |
| Skill | What It Does |
|---|---|
| using-git-worktrees | Isolated workspaces for parallel development branches |
| finishing-a-development-branch | 5-step completion workflow: tests, cleanup, docs, PR, verify |
| ci-cd-awareness | Pipeline design, rollback strategies, feature flags |
| tech-debt-tracking | Document shortcuts so they get fixed, not forgotten |
| session-memory | Cross-session persistence via .shipworthy/ with INDEX.md, pruning, and consolidation |
| production-readiness | Pre-deployment checklist |
| migration-strategies | Database migration safety |
| zero-downtime-migrations | Gradual migration patterns |
| environment-setup | Local, staging, production configuration |
| feature-flag-discipline | Gradual rollout, kill switches |
| incident-response | Outage response procedures |
| slo-sli-definition | Service level objectives and indicators |
| Skill | What It Does |
|---|---|
| accessibility | WCAG 2.1 AA baseline for every UI component |
| frontend-standards | Component patterns, state management, rendering best practices |
| Skill | What It Does |
|---|---|
| documentation-as-code | JSDoc, README sync, ADRs, changelog -- documentation that stays current |
| Skill | What It Does |
|---|---|
| systematic-debugging | 4-phase root cause investigation: reproduce, isolate, fix, verify |
| Skill | What It Does |
|---|---|
| writing-skills | TDD for documentation -- create new skills using the RED-GREEN-REFACTOR process |
| retrospective | Self-improving loop -- extracts signals from each session, saves learnings, consolidates memory |
| Level | Threshold | What Gets Checked |
|---|---|---|
| 0 | Any project | Build runs, no obvious errors (Builder-friendly) |
| 1 | Always | Tests pass, build clean, no hardcoded secrets |
| 2 | 10+ files | Coverage > 70%, no untracked TODOs, lint clean |
| 3 | 50+ files | Bundle budgets enforced, no circular imports, API contracts validated |
| 4 | 100+ files | Performance benchmarks, accessibility audit, security scan, dependency audit |
Pre-built architecture specs for common stacks. The plugin selects the right one automatically, or you can run /scaffold to choose.
| Template | Stack |
|---|---|
nextjs.md | Next.js (App Router, Server Components, API Routes) |
express.md | Express.js (REST API, middleware patterns) |
fastapi.md | FastAPI (Python async API, Pydantic models) |
go-service.md | Go (standard library HTTP, clean architecture) |
react-spa.md | React SPA (client-side routing, state management) |
generic-typescript.md | TypeScript (general-purpose, library or CLI) |
generic-python.md | Python (general-purpose, scripts or packages) |
monorepo.md | Monorepo (multi-package, shared dependencies) |
Specialized AI personas dispatched by skills for focused review:
| Agent | Role |
|---|---|
| code-reviewer | Line-by-line review for correctness, style, and maintainability |
| architecture-analyzer | Validates structural decisions against the architecture spec |
| security-auditor | Scans for vulnerabilities, secrets, auth gaps, injection risks |
| test-strategist | Evaluates test coverage, suggests missing test cases, reviews test quality |
| project-doctor | Infrastructure gap analysis with auto-fix recommendations |
| pre-push-validator | Runs 7-check validation suite (hooks, frontmatter, CSO, routing, cross-refs, quality, structure) |
| Command | What It Does |
|---|---|
/scaffold | Generate or regenerate the architecture specification for your project |
/audit | Run a full quality audit across all dimensions (tests, security, architecture, performance) |
/health | Quick project health dashboard -- see where you stand at a glance |
/diagnose | Infrastructure gap analysis with auto-fix options via project-doctor agent |
/retro | Run a retrospective -- extract signals, save learnings, consolidate memory |
/validate | Pre-push validation gate -- runs the full 7-check suite before pushing |
Without Shipworthy:
With Shipworthy:
We tested the plugin with an unbiased benchmark: same prompt, same starter project, scored by 15 automated checks. The only variable is whether the plugin is loaded.
Task 01 — Build a REST API with CRUD (Express + TypeScript):
| With Plugin | Without Plugin | |
|---|---|---|
| Score | 22/25 (A) | 12/25 (C) |
| Tests | 22 tests, all passing | 0 tests |
| Input validation | Zod schemas | Manual if/else |
| Error handling | 3 structured error types | 1 basic class |
| Architecture | 8 files, separated concerns | 5 files, simpler |
+83% score improvement. The plugin's TDD skill drove test creation, the security skill enforced Zod validation, and the API design skill produced proper status codes and error formatting.
Full methodology, all 10 task definitions, and reproducible benchmark scripts: BENCHMARKS.md
# Run benchmarks yourself
cd benchmarks && ./run-benchmark.sh --task 1 --both
We welcome contributions. See CONTRIBUTING.md for guidelines on writing new skills, adding templates, proposing agents, and submitting pull requests.
Good first contributions: add a new architecture template, improve a skill's edge case coverage, or add code examples to existing skills.
If this plugin helps you ship production-quality code, consider giving it a star.
Best experience: Claude Code
/plugin marketplace add Vimalk0703/shipworthyThen /plugin menu → select skill → restart. Use /skill-name:init for first-time setup.
Other platforms
Install via CLI
npx mdskills install Vimalk0703/shipworthyShipworthy is a free, open-source AI agent skill. AI-assisted coding is transforming how software gets built. But the data tells a sobering story: - 1.7x more bugs in AI-generated code compared to human-written code (CodeRabbit, 2025) - 40% of AI-generated code contains security vulnerabilities - 2026 is "the year of technical debt from vibe coding" (Forrester) The root cause is not the AI. It is the absence of engineering discipline between sess
Install Shipworthy with a single command:
npx mdskills install Vimalk0703/shipworthyThis downloads the skill files into your project and your AI agent picks them up automatically.
Shipworthy works with Claude Code, Claude Desktop, Cursor, Vscode Copilot, Windsurf, Continue Dev, Codex, Gemini Cli, Amp, Roo Code, Goose, Opencode, Trae, Qodo, Command Code. Skills use the open SKILL.md format which is compatible with any AI coding agent that reads markdown instructions.