SKILL.md Best Practices
Everything we’ve learned about what separates skills that agents use well from skills that agents stumble through. These are opinionated recommendations from the community — not spec requirements.
1. Start with the description
The description field is the most important line in your entire skill. When an agent has 50 or 100 skills loaded, it reads every description to decide which one to activate. Yours needs to win that competition.
A good description includes both what the skill does and when to trigger it. Include the specific nouns and verbs a user would actually type. If your skill processes Excel files, the words “Excel” and “.xlsx” both need to be there.
Always write in third person. Descriptions written in first person (“I can help you...”) cause real discovery problems because the description is injected into the system prompt — mixing points of view confuses the agent about who’s speaking.
2. Be concise — context window is shared
Your skill shares the context window with everything else the agent needs: the system prompt, conversation history, other skills’ metadata, and the user’s actual request. Every unnecessary token in your skill pushes out something the user said.
The default assumption: agents already know how to code, what common file formats are, and how libraries work. Only add context the agent doesn’t already have. Challenge every paragraph — “Does the agent really need me to explain what a PDF is?”
Keep the main SKILL.md under 500 lines. This isn’t a soft guideline — skills that exceed this measurably degrade agent performance. If you have more content, split it into separate reference files.
3. Match freedom to fragility
Not every instruction needs to be precise. The right level of specificity depends on how fragile the operation is. Think of it like navigation: a narrow bridge with cliffs on both sides needs exact step-by-step instructions. An open field with no hazards just needs a general direction.
High freedom — multiple valid approaches
Use for creative tasks, analysis, and code reviews where context determines the best path.
Medium freedom — preferred pattern with room to adapt
Use when a pattern exists but details depend on the situation.
Low freedom — fragile operations
Use for database migrations, deployments, and anything where one wrong flag breaks things.
4. Structure for progressive disclosure
SKILL.md is the table of contents, not the encyclopedia. It should tell the agent what’s available and where to find it. The agent loads the main file on activation, then reads reference files only when they’re actually needed — so bundled files cost zero tokens until accessed.
Three rules that matter:
- Keep references one level deep. If SKILL.md links to A, and A links to B, the agent may only partially read B. Link everything directly from SKILL.md.
- Add a table of contents to any reference file over 100 lines. The agent can scan it before deciding which section to read in full.
- Name files descriptively. Use
form_validation_rules.md, notdoc2.md. The agent uses file names to decide what to read.
5. Use workflows for multi-step tasks
For complex operations, break the work into numbered steps. Agents follow sequential instructions far more reliably than freeform paragraphs. For particularly complex workflows, provide a copyable checklist — the agent can paste it into its response and track progress.
If workflows become large, push them into separate files and tell the agent to read the appropriate one based on the task at hand.
6. Build in feedback loops
The single most effective quality improvement is adding a validation step. The pattern is simple: do the thing, check the result, fix if needed, check again. Without explicit loop instructions, agents tend to validate once and move on regardless of the result.
This works for both code-based skills (run a script) and instruction-based skills (review against a checklist). The key is explicitly telling the agent to repeat until clean.
7. Name things consistently
Skill names must use lowercase letters, numbers, and hyphens. Consider the gerund form — it reads naturally as a capability:
Avoid vague names like helper, utils, or tools. When someone is browsing the marketplace, processing-pdfs tells them what they’re getting. pdf-helper doesn’t.
Inside your skill, pick one term and use it everywhere. If you call it an “API endpoint” in one place, don’t call it a “route” somewhere else. Inconsistent terminology confuses agents the same way it confuses people.
8. Common patterns that work
Template pattern
Provide a template when output format consistency matters. Be explicit about how strict the template is:
Examples pattern
Input/output pairs work in skills the same way few-shot prompting works in conversations. Show 2-3 examples so the agent understands the expected style:
Conditional workflow pattern
When a task branches based on context, route the agent explicitly rather than leaving it to figure out the right path:
9. Test with real usage, not assumptions
The best way to write a skill: do the task once with an agent using normal prompting. Notice what context you keep providing over and over. That repeated context is what should go in the skill.
Then test it for real. Use one agent instance to help write the skill, and a fresh instance (with the skill loaded) to test it on actual tasks. Watch how the agent navigates your skill:
- Does it find the right reference files?
- Does it skip important sections?
- Does it read files in an order you didn’t expect?
- Does a smaller model need more detail than a larger one?
Adjust based on what you observe, not what you assume. If you plan to support multiple models (Haiku, Sonnet, Opus), test with all of them — what works for a powerful model might be too vague for a faster one.
10. Mistakes we see often
Explaining things agents already know
You don’t need to explain what JSON is, how REST APIs work, or what a CSV file contains. Every line of explanation the agent doesn’t need is a line of conversation history it loses.
Descriptions that are too vague
“Helps with documents” tells the agent nothing. Be specific about the format, the operation, and the trigger. A skill that “Extracts text and tables from PDF files” is findable. A skill that “helps with documents” is invisible.
Offering too many choices
“You can use pypdf, or pdfplumber, or PyMuPDF, or pdf2image...” Pick one default. Mention an alternative only if there’s a genuine reason to choose it (e.g., scanned PDFs requiring OCR).
Windows-style backslash paths
Always use forward slashes: scripts/helper.py, not scripts\helper.py. Forward slashes work everywhere. Backslashes break on Unix.
Time-sensitive instructions
“If you’re doing this before August 2025, use the old API” will be wrong eventually. Put current instructions front and center. Push legacy info into a collapsible section or separate file.
Deeply nested file references
SKILL.md links to A, A links to B, B has the actual content. The agent may only partially read B. Keep references one level deep from SKILL.md.
11. Declare permissions honestly
Every skill declares which permissions it needs — filesystem read/write, shell execution, network access, git write. The Skill Advisor cross-references these declarations against what your instructions actually do. Mismatches are the #1 reason skills get flagged.
Only request what you need
If your skill only reads files, don’t declare filesystem_write. If it never runs shell commands, don’t declare shell_exec. Over-scoped permissions erode trust and get flagged in reviews.
Shell commands need guardrails
If your skill runs shell commands, use exact commands — never interpolate user input directly into a shell string. Show the agent exactly what to run.
Never hardcode credentials
Always use environment variables for secrets. Never log or display credential values. Tell the agent how to access secrets safely.
Guard against prompt injection
If your skill processes untrusted content (user uploads, web pages, external files), add explicit instructions telling the agent to treat that content as data, not instructions.
Document your network endpoints
If your skill makes HTTP requests, declare network_access and document which endpoints are called and why. Users and reviewers should be able to verify that network calls are expected and necessary.