Systematic improvement of existing agents through performance analysis, prompt engineering, and continuous iteration.
Add this skill
npx mdskills install sickn33/agent-orchestration-improve-agentComprehensive agent optimization workflow with metrics, A/B testing, and staged deployment protocols
Systematic improvement of existing agents through performance analysis, prompt engineering, and continuous iteration.
[Extended thinking: Agent optimization requires a data-driven approach combining performance metrics, user feedback analysis, and advanced prompt engineering techniques. Success depends on systematic evaluation, targeted improvements, and rigorous testing with rollback capabilities for production safety.]
Comprehensive analysis of agent performance using context-manager for historical data collection.
Use: context-manager
Command: analyze-agent-performance $ARGUMENTS --days 30
Collect metrics including:
Identify recurring patterns in user interactions:
Categorize failures by root cause:
Generate quantitative baseline metrics:
Performance Baseline:
- Task Success Rate: [X%]
- Average Corrections per Task: [Y]
- Tool Call Efficiency: [Z%]
- User Satisfaction Score: [1-10]
- Average Response Latency: [Xms]
- Token Efficiency Ratio: [X:Y]
Apply advanced prompt optimization techniques using prompt-engineer agent.
Implement structured reasoning patterns:
Use: prompt-engineer
Technique: chain-of-thought-optimization
Curate high-quality examples from successful interactions:
Example structure:
Good Example:
Input: [User request]
Reasoning: [Step-by-step thought process]
Output: [Successful response]
Why this works: [Key success factors]
Bad Example:
Input: [Similar request]
Output: [Failed response]
Why this fails: [Specific issues]
Correct approach: [Fixed version]
Strengthen agent identity and capabilities:
Implement self-correction mechanisms:
Constitutional Principles:
1. Verify factual accuracy before responding
2. Self-check for potential biases or harmful content
3. Validate output format matches requirements
4. Ensure response completeness
5. Maintain consistency with previous responses
Add critique-and-revise loops:
Optimize response structure:
Comprehensive testing framework with A/B comparison.
Create representative test scenarios:
Test Categories:
1. Golden path scenarios (common successful cases)
2. Previously failed tasks (regression testing)
3. Edge cases and corner scenarios
4. Stress tests (complex, multi-step tasks)
5. Adversarial inputs (potential breaking points)
6. Cross-domain tasks (combining capabilities)
Compare original vs improved agent:
Use: parallel-test-runner
Config:
- Agent A: Original version
- Agent B: Improved version
- Test set: 100 representative tasks
- Metrics: Success rate, speed, token usage
- Evaluation: Blind human review + automated scoring
Statistical significance testing:
Rollback Process:
### 4.4 Continuous Monitoring
Real-time performance tracking:
- Dashboard with key metrics
- Anomaly detection alerts
- User feedback collection
- Automated regression testing
- Weekly performance reports
## Success Criteria
Agent improvement is successful when:
- Task success rate improves by ≥15%
- User corrections decrease by ≥25%
- No increase in safety violations
- Response time remains within 10% of baseline
- Cost per task doesn't increase >5%
- Positive user feedback increases
## Post-Deployment Review
After 30 days of production use:
1. Analyze accumulated performance data
2. Compare against baseline and targets
3. Identify new improvement opportunities
4. Document lessons learned
5. Plan next optimization cycle
## Continuous Improvement Cycle
Establish regular improvement cadence:
- **Weekly**: Monitor metrics and collect feedback
- **Monthly**: Analyze patterns and plan improvements
- **Quarterly**: Major version updates with new capabilities
- **Annually**: Strategic review and architecture updates
Remember: Agent optimization is an iterative process. Each cycle builds upon previous learnings, gradually improving performance while maintaining stability and safety.
Install via CLI
npx mdskills install sickn33/agent-orchestration-improve-agentAgent Orchestration Improve Agent is a free, open-source AI agent skill. Systematic improvement of existing agents through performance analysis, prompt engineering, and continuous iteration.
Install Agent Orchestration Improve Agent with a single command:
npx mdskills install sickn33/agent-orchestration-improve-agentThis downloads the skill files into your project and your AI agent picks them up automatically.
Agent Orchestration Improve Agent works with Claude Code, Claude Desktop, Cursor, Vscode Copilot, Windsurf, Continue Dev, Codex, Gemini Cli, Amp, Roo Code, Goose, Opencode, Trae, Qodo, Command Code. Skills use the open SKILL.md format which is compatible with any AI coding agent that reads markdown instructions.