How do I install Skill Seekers?

Install Skill Seekers with a single command: npx mdskills install yusufkaraaslan/skill-seekers. This downloads the skill files into your project and your AI agent picks them up automatically.
What platforms support Skill Seekers?

Skill Seekers works with Claude Code, Claude Desktop, Cursor, VS Code Copilot, Windsurf, Continue.dev, Codex, Gemini CLI, Amp, Roo Code, Goose, OpenCode, Trae, Qodo, Command Code, and ChatGPT. Skills use the open SKILL.md format, which is compatible with any AI coding agent that reads Markdown instructions.
Skill Seekers

Name: Skill Seekers: AI Agent Skill
Rating: 2 (1 review)
Author: yusufkaraaslan
AI & Machine Learning · Intermediate
by @yusufkaraaslan · Updated 2/20/2026
Skill Advisor: 2.0
Minimal placeholder with no actionable instructions or implementation details
+Declares relevant permissions for documentation conversion tasks
-Provides no actual implementation, steps, or guidance for converting documentation
-Contains only self-referential descriptions without actionable instructions
SKILL.md
1<p align="center">
2  <img src="docs/assets/logo.png" alt="Skill Seekers" width="200"/>
3</p>
4 
5# Skill Seekers
6 
7English | [简体中文](https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/README.zh-CN.md)
8 
9[![Version](https://img.shields.io/badge/version-3.1.0--dev-blue.svg)](https://github.com/yusufkaraaslan/Skill_Seekers/releases)
10[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
11[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
12[![MCP Integration](https://img.shields.io/badge/MCP-Integrated-blue.svg)](https://modelcontextprotocol.io)
13[![Tested](https://img.shields.io/badge/Tests-1880%2B%20Passing-brightgreen.svg)](tests/)
14[![Project Board](https://img.shields.io/badge/Project-Board-purple.svg)](https://github.com/users/yusufkaraaslan/projects/2)
15[![PyPI version](https://badge.fury.io/py/skill-seekers.svg)](https://pypi.org/project/skill-seekers/)
16[![PyPI - Downloads](https://img.shields.io/pypi/dm/skill-seekers.svg)](https://pypi.org/project/skill-seekers/)
17[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/skill-seekers.svg)](https://pypi.org/project/skill-seekers/)
18[![Website](https://img.shields.io/badge/Website-skillseekersweb.com-blue.svg)](https://skillseekersweb.com/)
19[![Twitter Follow](https://img.shields.io/twitter/follow/_yUSyUS_?style=social)](https://x.com/_yUSyUS_)
20[![GitHub Repo stars](https://img.shields.io/github/stars/yusufkaraaslan/Skill_Seekers?style=social)](https://github.com/yusufkaraaslan/Skill_Seekers)
21 
22**🧠 The data layer for AI systems.** Skill Seekers turns any documentation, GitHub repo, or PDF into structured knowledge assets—ready to power AI Skills (Claude, Gemini, OpenAI), RAG pipelines (LangChain, LlamaIndex, Pinecone), and AI coding assistants (Cursor, Windsurf, Cline) in minutes, not hours.
23 
24> 🌐 **[Visit SkillSeekersWeb.com](https://skillseekersweb.com/)** - Browse 24+ preset configs, share your configs, and access complete documentation!
25 
26> 📋 **[View Development Roadmap & Tasks](https://github.com/users/yusufkaraaslan/projects/2)** - 134 tasks across 10 categories, pick any to contribute!
27 
28## 🧠 The Data Layer for AI Systems
29 
30**Skill Seekers is the universal preprocessing layer** that sits between raw documentation and every AI system that consumes it. Whether you are building Claude skills, a LangChain RAG pipeline, or a Cursor `.cursorrules` file — the data preparation is identical. You do it once, and export to all targets.
31 
```bash
# One command → structured knowledge asset
skill-seekers create https://react.dev/
# or: skill-seekers create facebook/react
# or: skill-seekers create ./my-project

# Export to any AI system
skill-seekers package output/react --target claude      # → Claude AI Skill (ZIP)
skill-seekers package output/react --target langchain   # → LangChain Documents
skill-seekers package output/react --target llama-index # → LlamaIndex TextNodes
skill-seekers package output/react --target cursor      # → .cursorrules
```
44 
45### What gets built
46 
47| Output | Target | What it powers |
48|--------|--------|---------------|
49| **Claude Skill** (ZIP + YAML) | `--target claude` | Claude Code, Claude API |
50| **Gemini Skill** (tar.gz) | `--target gemini` | Google Gemini |
51| **OpenAI / Custom GPT** (ZIP) | `--target openai` | GPT-4o, custom assistants |
52| **LangChain Documents** | `--target langchain` | QA chains, agents, retrievers |
53| **LlamaIndex TextNodes** | `--target llama-index` | Query engines, chat engines |
54| **Haystack Documents** | `--target haystack` | Enterprise RAG pipelines |
55| **Pinecone-ready** (Markdown) | `--target markdown` | Vector upsert |
56| **ChromaDB / FAISS / Qdrant** | `--format chroma/faiss/qdrant` | Local vector DBs |
57| **Cursor** `.cursorrules` | `--target claude` → copy | Cursor IDE AI context |
58| **Windsurf / Cline / Continue** | `--target claude` → copy | VS Code, IntelliJ, Vim |
59 
60### Why it matters
61 
62- ⚡ **99% faster** — Days of manual data prep → 15–45 minutes
63- 🎯 **AI Skill quality** — 500+ line SKILL.md files with examples, patterns, and guides
64- 📊 **RAG-ready chunks** — Smart chunking preserves code blocks and maintains context
65- 🔄 **Multi-source** — Combine docs + GitHub + PDFs into one knowledge asset
66- 🌐 **One prep, every target** — Export the same asset to 16 platforms without re-scraping
67- ✅ **Battle-tested** — 1,880+ tests, 24+ framework presets, production-ready
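
To make the "smart chunking preserves code blocks" point concrete, here is a minimal sketch of one way to split Markdown without cutting through a fenced code block. It is an illustration only, not Skill Seekers' actual chunker; the function name `chunk_markdown` and the `max_chars` threshold are assumptions made for this example.

```python
FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable

def chunk_markdown(text: str, max_chars: int = 500) -> list[str]:
    """Split text into chunks of roughly max_chars without cutting a fenced code block."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    in_code = False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_code = not in_code  # toggle on every opening or closing fence
        current.append(line)
        size += len(line) + 1
        # Only break between lines that sit outside a code block.
        if size >= max_chars and not in_code:
            chunks.append("\n".join(current))
            current, size = [], 0
    if current:
        chunks.append("\n".join(current))
    return chunks

doc = "\n".join([
    "Intro paragraph about hooks.",
    FENCE + "js",
    "const [n, setN] = useState(0);",
    "setN(n + 1);",
    FENCE,
    "Closing paragraph.",
])
chunks = chunk_markdown(doc, max_chars=40)
print(len(chunks))
```

A chunk may exceed `max_chars` when a code block is long; the sketch accepts that so the code stays intact.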
68 
69## 🚀 Quick Start (3 Commands)
70 
```bash
# 1. Install
pip install skill-seekers

# 2. Create skill from any source
skill-seekers create https://docs.djangoproject.com/

# 3. Package for your AI platform
skill-seekers package output/django --target claude
```
81 
82**That's it!** You now have `output/django-claude.zip` ready to use.
83 
84### Other Sources
85 
86```bash
87# GitHub repository
88skill-seekers create facebook/react
89 
90# Local project
91skill-seekers create ./my-project
92 
93# PDF document
94skill-seekers create manual.pdf
95```
96 
97### Export Everywhere
98 
99```bash
100# Package for multiple platforms
101for platform in claude gemini openai langchain; do
102  skill-seekers package output/django --target $platform
103done
104```
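
When packaging is one step in a larger script, the same loop can be driven from Python. The sketch below only builds the argument lists, using the `skill-seekers package <dir> --target <platform>` form shown above. The helper names `build_package_commands` and `run_all` are hypothetical, and running the commands for real assumes the CLI is installed and on `PATH`.

```python
import subprocess
from typing import Iterable

def build_package_commands(skill_dir: str, targets: Iterable[str]) -> list[list[str]]:
    """Return one `skill-seekers package` argv list per target platform."""
    return [
        ["skill-seekers", "package", skill_dir, "--target", target]
        for target in targets
    ]

def run_all(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True raises CalledProcessError if packaging fails.
            subprocess.run(argv, check=True)

commands = build_package_commands(
    "output/django", ["claude", "gemini", "openai", "langchain"]
)
run_all(commands)  # dry run: prints the four commands without executing them
```

Passing arguments as a list avoids shell quoting issues if a skill directory contains spaces.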
105 
106## What is Skill Seekers?
107 
108Skill Seekers is the **data layer for AI systems**. It transforms documentation websites, GitHub repositories, and PDF files into structured knowledge assets for every AI target:
109 
110| Use Case | What you get | Examples |
111|----------|-------------|---------|
112| **AI Skills** | Comprehensive SKILL.md + references | Claude Code, Gemini, GPT |
113| **RAG Pipelines** | Chunked documents with rich metadata | LangChain, LlamaIndex, Haystack |
114| **Vector Databases** | Pre-formatted data ready for upsert | Pinecone, Chroma, Weaviate, FAISS |
115| **AI Coding Assistants** | Context files your IDE AI reads automatically | Cursor, Windsurf, Cline, Continue.dev |
116 
117## 📚 Documentation
118 
119| I want to... | Read this |
120|--------------|-----------|
121| **Get started quickly** | [Quick Start](docs/getting-started/02-quick-start.md) - 3 commands to first skill |
122| **Understand concepts** | [Core Concepts](docs/user-guide/01-core-concepts.md) - How it works |
123| **Scrape sources** | [Scraping Guide](docs/user-guide/02-scraping.md) - All source types |
124| **Enhance skills** | [Enhancement Guide](docs/user-guide/03-enhancement.md) - AI enhancement |
125| **Export skills** | [Packaging Guide](docs/user-guide/04-packaging.md) - Platform export |
126| **Look up commands** | [CLI Reference](docs/reference/CLI_REFERENCE.md) - All 20 commands |
127| **Configure** | [Config Format](docs/reference/CONFIG_FORMAT.md) - JSON specification |
128| **Fix issues** | [Troubleshooting](docs/user-guide/06-troubleshooting.md) - Common problems |
129 
130**Complete documentation:** [docs/README.md](docs/README.md)
131 
132Instead of spending days on manual preprocessing, Skill Seekers:
133 
1341. **Ingests** — docs, GitHub repos, local codebases, PDFs
1352. **Analyzes** — deep AST parsing, pattern detection, API extraction
1363. **Structures** — categorized reference files with metadata
1374. **Enhances** — AI-powered SKILL.md generation (Claude, Gemini, or local)
1385. **Exports** — 16 platform-specific formats from one asset
139 
140## Why Use This?
141 
142### For AI Skill Builders (Claude, Gemini, OpenAI)
143 
144- 🎯 **Production-grade Skills** — 500+ line SKILL.md files with code examples, patterns, and guides
145- 🔄 **Enhancement Workflows** — Apply `security-focus`, `architecture-comprehensive`, or custom YAML presets
146- 🎮 **Any Domain** — Game engines (Godot, Unity), frameworks (React, Django), internal tools
147- 🔧 **Teams** — Combine internal docs + code into a single source of truth
148- 📚 **Quality** — AI-enhanced with examples, quick reference, and navigation guidance
149 
150### For RAG Builders & AI Engineers
151 
152- 🤖 **RAG-ready data** — Pre-chunked LangChain `Documents`, LlamaIndex `TextNodes`, Haystack `Documents`
153- 🚀 **99% faster** — Days of preprocessing → 15–45 minutes
154- 📊 **Smart metadata** — Categories, sources, types → better retrieval accuracy
155- 🔄 **Multi-source** — Combine docs + GitHub + PDFs in one pipeline
156- 🌐 **Platform-agnostic** — Export to any vector DB or framework without re-scraping
157 
158### For AI Coding Assistant Users
159 
160- 💻 **Cursor / Windsurf / Cline** — Generate `.cursorrules` / `.windsurfrules` / `.clinerules` automatically
161- 🎯 **Persistent context** — AI "knows" your frameworks without repeated prompting
162- 📚 **Always current** — Update context in minutes when docs change
163 
164## Key Features
165 
166### 🌐 Documentation Scraping
167- ✅ **llms.txt Support** - Automatically detects and uses LLM-ready documentation files (10x faster)
168- ✅ **Universal Scraper** - Works with ANY documentation website
169- ✅ **Smart Categorization** - Automatically organizes content by topic
170- ✅ **Code Language Detection** - Recognizes Python, JavaScript, C++, GDScript, etc.
171- ✅ **24+ Ready-to-Use Presets** - Godot, React, Vue, Django, FastAPI, and more
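
As a rough illustration of llms.txt detection, a scraper can build a short list of candidate URLs at the site root and try them before falling back to crawling HTML. This is a sketch under assumptions: the file names and the order in which they are tried are choices made for this example, not a description of how Skill Seekers probes a site.

```python
from urllib.parse import urljoin

# File names commonly associated with the llms.txt convention; which variants
# a site publishes, and the order tried here, are assumptions of this sketch.
LLMS_CANDIDATES = ("llms-full.txt", "llms.txt")

def llms_txt_candidates(base_url: str) -> list[str]:
    """Return candidate llms.txt URLs at the root of the site hosting base_url."""
    return [urljoin(base_url, "/" + name) for name in LLMS_CANDIDATES]

urls = llms_txt_candidates("https://react.dev/learn/")
for url in urls:
    print(url)
```

A real scraper would request each candidate in turn and use the first one that responds successfully.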
172 
173### 📄 PDF Support
174- ✅ **Basic PDF Extraction** - Extract text, code, and images from PDF files
175- ✅ **OCR for Scanned PDFs** - Extract text from scanned documents
176- ✅ **Password-Protected PDFs** - Handle encrypted PDFs
177- ✅ **Table Extraction** - Extract complex tables from PDFs
178- ✅ **Parallel Processing** - 3x faster for large PDFs
179- ✅ **Intelligent Caching** - 50% faster on re-runs
180 
181### 🐙 GitHub Repository Analysis
182- ✅ **Deep Code Analysis** - AST parsing for Python, JavaScript, TypeScript, Java, C++, Go
183- ✅ **API Extraction** - Functions, classes, methods with parameters and types
184- ✅ **Repository Metadata** - README, file tree, language breakdown, stars/forks
185- ✅ **GitHub Issues & PRs** - Fetch open/closed issues with labels and milestones
186- ✅ **CHANGELOG & Releases** - Automatically extract version history
187- ✅ **Conflict Detection** - Compare documented APIs vs actual code implementation
188- ✅ **MCP Integration** - Natural language: "Scrape GitHub repo facebook/react"
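
For a sense of what AST-based API extraction involves, the sketch below uses Python's standard `ast` module to list functions and class methods with their parameter names. It covers Python source only and is not the project's analyzer; `extract_api` is a hypothetical name.

```python
import ast

def extract_api(source: str) -> list[dict]:
    """List top-level functions and class methods with their parameter names."""
    api = []
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            api.append({"name": node.name, "params": [a.arg for a in node.args.args]})
        elif isinstance(node, ast.ClassDef):
            for item in node.body:
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    api.append({
                        "name": f"{node.name}.{item.name}",
                        "params": [a.arg for a in item.args.args],
                    })
    return api

sample = '''
def fetch(url, timeout=30):
    return url

class Client:
    def get(self, path):
        return path
'''
api = extract_api(sample)
for entry in api:
    print(entry["name"], entry["params"])
```

Because the code is parsed rather than imported, nothing in the analyzed repository is executed.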
189 
190### 🔄 Unified Multi-Source Scraping
191- ✅ **Combine Multiple Sources** - Mix documentation + GitHub + PDF in one skill
192- ✅ **Conflict Detection** - Automatically finds discrepancies between docs and code
193- ✅ **Intelligent Merging** - Rule-based or AI-powered conflict resolution
194- ✅ **Transparent Reporting** - Side-by-side comparison with ⚠️ warnings
195- ✅ **Documentation Gap Analysis** - Identifies outdated docs and undocumented features
196- ✅ **Single Source of Truth** - One skill showing both intent (docs) and reality (code)
197- ✅ **Backward Compatible** - Legacy single-source configs still work
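
Conflict detection between docs and code can be pictured as a comparison of two maps from API name to parameter list. The following is a minimal sketch of that idea, assuming both sides have already been extracted; the function name and message wording are hypothetical and do not reflect the tool's real report format.

```python
def find_conflicts(
    documented: dict[str, list[str]],
    implemented: dict[str, list[str]],
) -> list[str]:
    """Compare documented parameter lists against those found in the code."""
    conflicts = []
    for name in sorted(documented.keys() | implemented.keys()):
        if name not in implemented:
            conflicts.append(f"{name}: documented but missing from code")
        elif name not in documented:
            conflicts.append(f"{name}: implemented but undocumented")
        elif documented[name] != implemented[name]:
            conflicts.append(
                f"{name}: docs say {documented[name]}, code has {implemented[name]}"
            )
    return conflicts

docs_api = {"fetch": ["url"], "close": []}
code_api = {"fetch": ["url", "timeout"], "retry": ["times"]}
conflicts = find_conflicts(docs_api, code_api)
for line in conflicts:
    print(line)
```

The three branches correspond to outdated docs, undocumented features, and signature mismatches.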
198 
199### 🤖 Multi-LLM Platform Support
200- ✅ **4 LLM Platforms** - Claude AI, Google Gemini, OpenAI ChatGPT, Generic Markdown
201- ✅ **Universal Scraping** - Same documentation works for all platforms
202- ✅ **Platform-Specific Packaging** - Optimized formats for each LLM
203- ✅ **One-Command Export** - `--target` flag selects platform
204- ✅ **Optional Dependencies** - Install only what you need
205- ✅ **100% Backward Compatible** - Existing Claude workflows unchanged
206 
207| Platform | Format | Upload | Enhancement | API Key | Custom Endpoint |
208|----------|--------|--------|-------------|---------|-----------------|
209| **Claude AI** | ZIP + YAML | ✅ Auto | ✅ Yes | ANTHROPIC_API_KEY | ANTHROPIC_BASE_URL |
210| **Google Gemini** | tar.gz | ✅ Auto | ✅ Yes | GOOGLE_API_KEY | - |
211| **OpenAI ChatGPT** | ZIP + Vector Store | ✅ Auto | ✅ Yes | OPENAI_API_KEY | - |
212| **Generic Markdown** | ZIP | ❌ Manual | ❌ No | - | - |
213 
214```bash
215# Claude (default - no changes needed!)
216skill-seekers package output/react/
217skill-seekers upload react.zip
218 
219# Google Gemini
220pip install skill-seekers[gemini]
221skill-seekers package output/react/ --target gemini
222skill-seekers upload react-gemini.tar.gz --target gemini
223 
224# OpenAI ChatGPT
225pip install skill-seekers[openai]
226skill-seekers package output/react/ --target openai
227skill-seekers upload react-openai.zip --target openai
228 
229# Generic Markdown (universal export)
230skill-seekers package output/react/ --target markdown
231# Use the markdown files directly in any LLM
232```
233 
234<details>
235<summary>🔧 <strong>Environment Variables for Claude-Compatible APIs (e.g., GLM-4.7)</strong></summary>
236 
237Skill Seekers supports any Claude-compatible API endpoint:
238 
239```bash
240# Option 1: Official Anthropic API (default)
241export ANTHROPIC_API_KEY=sk-ant-...
242 
243# Option 2: GLM-4.7 Claude-compatible API
244export ANTHROPIC_API_KEY=your-glm-47-api-key
245export ANTHROPIC_BASE_URL=https://glm-4-7-endpoint.com/v1
246 
247# All AI enhancement features will use the configured endpoint
248skill-seekers enhance output/react/
249skill-seekers analyze --directory . --enhance
250```
251 
252**Note**: Setting `ANTHROPIC_BASE_URL` allows you to use any Claude-compatible API endpoint, such as GLM-4.7 (智谱 AI) or other compatible services.
253 
254</details>
255 
256**Installation:**
257```bash
258# Install with Gemini support
259pip install skill-seekers[gemini]
260 
261# Install with OpenAI support
262pip install skill-seekers[openai]
263 
264# Install with all LLM platforms
265pip install skill-seekers[all-llms]
266```
267 
268### 🔗 RAG Framework Integrations
269 
270- ✅ **LangChain Documents** - Direct export to `Document` format with `page_content` + metadata
271  - Perfect for: QA chains, retrievers, vector stores, agents
272  - Example: [LangChain RAG Pipeline](examples/langchain-rag-pipeline/)
273  - Guide: [LangChain Integration](docs/integrations/LANGCHAIN.md)
274 
275- ✅ **LlamaIndex TextNodes** - Export to `TextNode` format with unique IDs + embeddings
276  - Perfect for: Query engines, chat engines, storage context
277  - Example: [LlamaIndex Query Engine](examples/llama-index-query-engine/)
278  - Guide: [LlamaIndex Integration](docs/integrations/LLAMA_INDEX.md)
279 
280- ✅ **Pinecone-Ready Format** - Optimized for vector database upsert
281  - Perfect for: Production vector search, semantic search, hybrid search
282  - Example: [Pinecone Upsert](examples/pinecone-upsert/)
283  - Guide: [Pinecone Integration](docs/integrations/PINECONE.md)
284 
285**Quick Export:**
286```bash
287# LangChain Documents (JSON)
288skill-seekers package output/django --target langchain
289# → output/django-langchain.json
290 
291# LlamaIndex TextNodes (JSON)
292skill-seekers package output/django --target llama-index
293# → output/django-llama-index.json
294 
295# Markdown (Universal)
296skill-seekers package output/django --target markdown
297# → output/django-markdown/SKILL.md + references/
298```
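
The `page_content` + metadata shape mentioned above can be sketched with plain dictionaries. This example is illustrative: the metadata keys (`source`, `category`, `chunk_index`) and the helper name are assumptions, and the actual layout of the exported JSON file may differ.

```python
import json

def to_document_records(chunks: list[str], source: str, category: str) -> list[dict]:
    """Wrap text chunks in the page_content + metadata shape used by LangChain Documents."""
    return [
        {
            "page_content": chunk,
            "metadata": {"source": source, "category": category, "chunk_index": i},
        }
        for i, chunk in enumerate(chunks)
    ]

records = to_document_records(
    ["Models map to database tables.", "Views handle requests."],
    source="https://docs.djangoproject.com/",
    category="overview",
)
payload = json.dumps(records, indent=2)
print(payload)
```

Each record maps directly onto LangChain's `Document(page_content=..., metadata=...)` constructor if the library is installed.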
299 
300**Complete RAG Pipeline Guide:** [RAG Pipelines Documentation](docs/integrations/RAG_PIPELINES.md)
301 
302---
303 
304### 🧠 AI Coding Assistant Integrations
305 
306Transform any framework documentation into expert coding context for 4+ AI assistants:
307 
308- ✅ **Cursor IDE** - Generate `.cursorrules` for AI-powered code suggestions
309  - Perfect for: Framework-specific code generation, consistent patterns
310  - Works with: Cursor IDE (VS Code fork)
311  - Guide: [Cursor Integration](docs/integrations/CURSOR.md)
312  - Example: [Cursor React Skill](examples/cursor-react-skill/)
313 
314- ✅ **Windsurf** - Customize Windsurf's AI assistant context with `.windsurfrules`
315  - Perfect for: IDE-native AI assistance, flow-based coding
316  - Works with: Windsurf IDE by Codeium
317  - Guide: [Windsurf Integration](docs/integrations/WINDSURF.md)
318  - Example: [Windsurf FastAPI Context](examples/windsurf-fastapi-context/)
319 
320- ✅ **Cline (VS Code)** - System prompts + MCP for VS Code agent
321  - Perfect for: Agentic code generation in VS Code
322  - Works with: Cline extension for VS Code
323  - Guide: [Cline Integration](docs/integrations/CLINE.md)
324  - Example: [Cline Django Assistant](examples/cline-django-assistant/)
325 
326- ✅ **Continue.dev** - Context servers for IDE-agnostic AI
327  - Perfect for: Multi-IDE environments (VS Code, JetBrains, Vim), custom LLM providers
328  - Works with: Any IDE with Continue.dev plugin
329  - Guide: [Continue Integration](docs/integrations/CONTINUE_DEV.md)
330  - Example: [Continue Universal Context](examples/continue-dev-universal/)
331 
332**Quick Export for AI Coding Tools:**
333```bash
334# For any AI coding assistant (Cursor, Windsurf, Cline, Continue.dev)
335skill-seekers scrape --config configs/django.json
336skill-seekers package output/django --target claude  # or --target markdown
337 
338# Copy to your project (example for Cursor)
339cp output/django-claude/SKILL.md my-project/.cursorrules
340 
341# Or for Windsurf
342cp output/django-claude/SKILL.md my-project/.windsurf/rules/django.md
343 
344# Or for Cline
345cp output/django-claude/SKILL.md my-project/.clinerules
346 
347# Or for Continue.dev (HTTP server)
348python examples/continue-dev-universal/context_server.py
349# Configure in ~/.continue/config.json
350```
351 
352**Integration Hub:** [All AI System Integrations](docs/integrations/INTEGRATIONS.md)
353 
354---
355 
356### 🌊 Three-Stream GitHub Architecture
357- ✅ **Triple-Stream Analysis** - Split GitHub repos into Code, Docs, and Insights streams
358- ✅ **Unified Codebase Analyzer** - Works with GitHub URLs AND local paths
359- ✅ **C3.x as Analysis Depth** - Choose 'basic' (1-2 min) or 'c3x' (20-60 min) analysis
360- ✅ **Enhanced Router Generation** - GitHub metadata, README quick start, common issues
361- ✅ **Issue Integration** - Top problems and solutions from GitHub issues
362- ✅ **Smart Routing Keywords** - GitHub labels weighted 2x for better topic detection
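
The 2x label weighting can be illustrated with a simple counter: each keyword found in the docs scores one point and each GitHub label scores two. This is a sketch of the weighting idea only; the function name, the inputs, and the ranking step are assumptions, not the router's actual logic.

```python
from collections import Counter

def routing_keywords(
    doc_terms: list[str], github_labels: list[str], top_n: int = 3
) -> list[str]:
    """Rank keywords, counting each GitHub label occurrence twice."""
    scores: Counter = Counter()
    for term in doc_terms:
        scores[term.lower()] += 1
    for label in github_labels:
        scores[label.lower()] += 2  # labels carry double weight
    return [term for term, _ in scores.most_common(top_n)]

keywords = routing_keywords(
    doc_terms=["hooks", "hooks", "hooks", "hooks", "router", "state"],
    github_labels=["router", "router", "state"],
)
print(keywords)
```

Here `router` appears once in the docs but twice as a label, so it outranks `hooks` despite fewer mentions in the text.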
363 
364**Three Streams Explained:**
365- **Stream 1: Code** - Deep C3.x analysis (patterns, examples, guides, configs, architecture)
366- **Stream 2: Docs** - Repository documentation (README, CONTRIBUTING, docs/*.md)
367- **Stream 3: Insights** - Community knowledge (issues, labels, stars, forks)
368 
369```python
370from skill_seekers.cli.unified_codebase_analyzer import UnifiedCodebaseAnalyzer
371 
372# Analyze GitHub repo with all three streams
373analyzer = UnifiedCodebaseAnalyzer()
374result = analyzer.analyze(
375    source="https://github.com/facebook/react",
376    depth="c3x",  # or "basic" for fast analysis
377    fetch_github_metadata=True
378)
379 
380# Access code stream (C3.x analysis)
381print(f"Design patterns: {len(result.code_analysis['c3_1_patterns'])}")
382print(f"Test examples: {result.code_analysis['c3_2_examples_count']}")
383 
384# Access docs stream (repository docs)
385print(f"README: {result.github_docs['readme'][:100]}")
386 
387# Access insights stream (GitHub metadata)
388print(f"Stars: {result.github_insights['metadata']['stars']}")
389print(f"Common issues: {len(result.github_insights['common_problems'])}")
390```
391 
392**See complete documentation**: [Three-Stream Implementation Summary](docs/IMPLEMENTATION_SUMMARY_THREE_STREAM.md)
393 
394### 🔐 Smart Rate Limit Management & Configuration
395- ✅ **Multi-Token Configuration System** - Manage multiple GitHub accounts (personal, work, OSS)
396  - Secure config storage at `~/.config/skill-seekers/config.json` (600 permissions)
397  - Per-profile rate limit strategies: `prompt`, `wait`, `switch`, `fail`
398  - Configurable timeout per profile (default: 30 min, prevents indefinite waits)
399  - Smart fallback chain: CLI arg → Env var → Config file → Prompt
400  - API key management for Claude, Gemini, OpenAI
401- ✅ **Interactive Configuration Wizard** - Beautiful terminal UI for easy setup
402  - Browser integration for token creation (auto-opens GitHub, etc.)
403  - Token validation and connection testing
404  - Visual status display with color coding
405- ✅ **Intelligent Rate Limit Handler** - No more indefinite waits!
406  - Upfront warning about rate limits (60/hour vs 5000/hour)
407  - Real-time detection from GitHub API responses
408  - Live countdown timers with progress
409  - Automatic profile switching when rate limited
410  - Four strategies: prompt (ask), wait (countdown), switch (try another), fail (abort)
411- ✅ **Resume Capability** - Continue interrupted jobs
412  - Auto-save progress at configurable intervals (default: 60 sec)
413  - List all resumable jobs with progress details
414  - Auto-cleanup of old jobs (default: 7 days)
415- ✅ **CI/CD Support** - Non-interactive mode for automation
416  - `--non-interactive` flag fails fast without prompts
417  - `--profile` flag to select specific GitHub account
418  - Clear error messages for pipeline logs
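
The fallback chain listed above (CLI arg → Env var → Config file → Prompt) amounts to returning the first source that has a value. A minimal sketch, assuming a flat config dictionary with a `token` key and a caller-supplied prompt function; all names here, including the `DEMO_GITHUB_TOKEN` variable, are hypothetical.

```python
import os
from typing import Callable, Optional

def resolve_token(
    cli_arg: Optional[str],
    env_var: str,
    config: dict,
    prompt: Callable[[], str],
) -> tuple[str, str]:
    """Return (token, origin) using the first source in the chain that has a value."""
    if cli_arg:
        return cli_arg, "cli"
    from_env = os.environ.get(env_var)
    if from_env:
        return from_env, "env"
    from_config = config.get("token")
    if from_config:
        return from_config, "config"
    return prompt(), "prompt"

os.environ.pop("DEMO_GITHUB_TOKEN", None)  # make sure the demo variable is unset
token, origin = resolve_token(
    None, "DEMO_GITHUB_TOKEN", {"token": "from-file"}, lambda: "typed-in"
)
print(origin)
```

Returning the origin alongside the token makes it easy to log which source was used without printing the secret itself.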
419 
420**Quick Setup:**
421```bash
422# One-time configuration (5 minutes)
423skill-seekers config --github
424 
425# Use specific profile for private repos
426skill-seekers github --repo mycompany/private-repo --profile work
427 
428# CI/CD mode (fail fast, no prompts)
429skill-seekers github --repo owner/repo --non-interactive
430 
431# Resume interrupted job
432skill-seekers resume --list
433skill-seekers resume github_react_20260117_143022
434```
435 
436**Rate Limit Strategies Explained:**
437- **prompt** (default) - Ask what to do when rate limited (wait, switch, setup token, cancel)
438- **wait** - Automatically wait with countdown timer (respects timeout)
439- **switch** - Automatically try next available profile (for multi-account setups)
440- **fail** - Fail immediately with clear error (perfect for CI/CD)
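
The four strategies can be read as a small decision function. The sketch below is one possible reading: the action strings, and the choice to abort when a wait would exceed the timeout or when no spare profile exists, are assumptions made for illustration.

```python
def handle_rate_limit(
    strategy: str, reset_in_s: int, timeout_s: int, spare_profiles: list[str]
) -> str:
    """Decide what to do when the GitHub API reports the rate limit is exhausted."""
    if strategy == "fail":
        return "abort"
    if strategy == "wait":
        # Waiting is only acceptable if the reset arrives before the timeout.
        return f"wait {reset_in_s}s" if reset_in_s <= timeout_s else "abort"
    if strategy == "switch":
        return f"switch to {spare_profiles[0]}" if spare_profiles else "abort"
    if strategy == "prompt":
        return "ask user"
    raise ValueError(f"unknown strategy: {strategy}")

# 1800 seconds matches the 30-minute default timeout mentioned above.
actions = [
    handle_rate_limit("wait", reset_in_s=600, timeout_s=1800, spare_profiles=[]),
    handle_rate_limit("switch", reset_in_s=600, timeout_s=1800, spare_profiles=["work"]),
    handle_rate_limit("fail", reset_in_s=600, timeout_s=1800, spare_profiles=["work"]),
]
print(actions)
```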
441 
442### 🎯 Bootstrap Skill - Self-Hosting
443 
444Generate skill-seekers as a Claude Code skill to use within Claude:
445 
446```bash
447# Generate the skill
448./scripts/bootstrap_skill.sh
449 
450# Install to Claude Code
451cp -r output/skill-seekers ~/.claude/skills/
452```
453 
454**What you get:**
455- ✅ **Complete skill documentation** - All CLI commands and usage patterns
456- ✅ **CLI command reference** - Every tool and its options documented
457- ✅ **Quick start examples** - Common workflows and best practices
458- ✅ **Auto-generated API docs** - Code analysis, patterns, and examples
459 
460### 🔐 Private Config Repositories
461- ✅ **Git-Based Config Sources** - Fetch configs from private/team git repositories
462- ✅ **Multi-Source Management** - Register unlimited GitHub, GitLab, Bitbucket repos
463- ✅ **Team Collaboration** - Share custom configs across 3-5 person teams
464- ✅ **Enterprise Support** - Scale to 500+ developers with priority-based resolution
465- ✅ **Secure Authentication** - Environment variable tokens (GITHUB_TOKEN, GITLAB_TOKEN)
466- ✅ **Intelligent Caching** - Clone once, pull updates automatically
467- ✅ **Offline Mode** - Work with cached configs when offline
468 
469### 🤖 Codebase Analysis (C3.x)
470 
471**C3.4: Configuration Pattern Extraction with AI Enhancement**
472- ✅ **9 Config Formats** - JSON, YAML, TOML, ENV, INI, Python, JavaScript, Dockerfile, Docker Compose
473- ✅ **7 Pattern Types** - Database, API, logging, cache, email, auth, server configurations
474- ✅ **AI Enhancement** - Optional dual-mode AI analysis (API + LOCAL)
475  - Explains what each config does
476  - Suggests best practices and improvements
477  - **Security analysis** - Finds hardcoded secrets, exposed credentials
478- ✅ **Auto-Documentation** - Generates JSON + Markdown documentation of all configs
479- ✅ **MCP Integration** - `extract_config_patterns` tool with enhancement support
480 
481**C3.3: AI-Enhanced How-To Guides**
482- ✅ **Comprehensive AI Enhancement** - Transforms basic guides into professional tutorials
483- ✅ **5 Automatic Improvements** - Step descriptions, troubleshooting, prerequisites, next steps, use cases
484- ✅ **Dual-Mode Support** - API mode (Claude API) or LOCAL mode (Claude Code CLI)
485- ✅ **No API Costs with LOCAL Mode** - FREE enhancement using your Claude Code Max plan
486- ✅ **Quality Transformation** - 75-line templates → 500+ line comprehensive guides
487 
488**Usage:**
489```bash
490# Quick analysis (1-2 min, basic features only)
491skill-seekers analyze --directory tests/ --quick
492 
493# Comprehensive analysis with AI (20-60 min, all features)
494skill-seekers analyze --directory tests/ --comprehensive
495 
496# With AI enhancement
497skill-seekers analyze --directory tests/ --enhance
498```
499 
500**Full Documentation:** [docs/HOW_TO_GUIDES.md](docs/HOW_TO_GUIDES.md#ai-enhancement-new)
501 
502### 🔄 Enhancement Workflow Presets
503 
504Reusable YAML-defined enhancement pipelines that control how AI transforms your raw documentation into a polished skill.
505 
506- ✅ **5 Bundled Presets** — `default`, `minimal`, `security-focus`, `architecture-comprehensive`, `api-documentation`
507- ✅ **User-Defined Presets** — add custom workflows to `~/.config/skill-seekers/workflows/`
508- ✅ **Multiple Workflows** — chain two or more workflows in one command
509- ✅ **Fully Managed CLI** — list, inspect, copy, add, remove, and validate workflows
510 
511```bash
512# Apply a single workflow
513skill-seekers create ./my-project --enhance-workflow security-focus
514 
515# Chain multiple workflows (applied in order)
516skill-seekers create ./my-project \
517  --enhance-workflow security-focus \
518  --enhance-workflow minimal
519 
520# Manage presets
521skill-seekers workflows list                          # List all (bundled + user)
522skill-seekers workflows show security-focus           # Print YAML content
523skill-seekers workflows copy security-focus           # Copy to user dir for editing
524skill-seekers workflows add ./my-workflow.yaml        # Install a custom preset
525skill-seekers workflows remove my-workflow            # Remove a user preset
526skill-seekers workflows validate security-focus       # Validate preset structure
527 
528# Copy multiple at once
529skill-seekers workflows copy security-focus minimal api-documentation
530 
531# Add multiple files at once
532skill-seekers workflows add ./wf-a.yaml ./wf-b.yaml
533 
534# Remove multiple at once
535skill-seekers workflows remove my-wf-a my-wf-b
536```
537 
538**YAML preset format:**
539```yaml
540name: security-focus
541description: "Security-focused review: vulnerabilities, auth, data handling"
542version: "1.0"
543stages:
544  - name: vulnerabilities
545    type: custom
546    prompt: "Review for OWASP top 10 and common security vulnerabilities..."
547  - name: auth-review
548    type: custom
549    prompt: "Examine authentication and authorisation patterns..."
550    uses_history: true
551```
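
Before installing a custom preset it can help to sanity-check its structure. The sketch below checks a preset that has already been parsed into a dictionary (for example with a YAML library). Which keys are truly required is an assumption based on the example above; `skill-seekers workflows validate` remains the authoritative check.

```python
REQUIRED_STAGE_KEYS = {"name", "type", "prompt"}  # assumed from the example preset

def validate_preset(preset: dict) -> list[str]:
    """Return a list of problems found in a workflow preset; empty means it looks valid."""
    problems = []
    for key in ("name", "stages"):
        if key not in preset:
            problems.append(f"missing top-level key: {key}")
    for i, stage in enumerate(preset.get("stages", [])):
        missing = REQUIRED_STAGE_KEYS - stage.keys()
        if missing:
            problems.append(f"stage {i} missing: {', '.join(sorted(missing))}")
    return problems

preset = {
    "name": "security-focus",
    "version": "1.0",
    "stages": [
        {"name": "vulnerabilities", "type": "custom", "prompt": "Review for OWASP top 10..."},
        {"name": "auth-review", "type": "custom", "uses_history": True},  # prompt omitted
    ],
}
problems = validate_preset(preset)
print(problems)
```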
552 
553### ⚡ Performance & Scale
554- ✅ **Async Mode** - 2-3x faster scraping with async/await (use `--async` flag)
555- ✅ **Large Documentation Support** - Handle 10K-40K+ page docs with intelligent splitting
556- ✅ **Router/Hub Skills** - Intelligent routing to specialized sub-skills
557- ✅ **Parallel Scraping** - Process multiple skills simultaneously
558- ✅ **Checkpoint/Resume** - Never lose progress on long scrapes
559- ✅ **Caching System** - Scrape once, rebuild instantly
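
The speedup from async mode comes from keeping several requests in flight at once. Here is a minimal `asyncio` sketch of bounded-concurrency fetching; `fetch_page` is a stand-in that sleeps instead of making a network request, and the `workers` parameter only mirrors the idea behind the `--workers` flag, not its implementation.

```python
import asyncio

async def fetch_page(url: str) -> str:
    """Stand-in for a real HTTP request; sleeps briefly instead of hitting the network."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"

async def scrape_all(urls: list[str], workers: int = 8) -> list[str]:
    """Fetch all URLs concurrently, with at most `workers` requests in flight."""
    limit = asyncio.Semaphore(workers)

    async def bounded(url: str) -> str:
        async with limit:
            return await fetch_page(url)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(bounded(u) for u in urls))

pages = asyncio.run(
    scrape_all([f"https://example.com/page/{i}" for i in range(20)], workers=8)
)
print(len(pages))
```

The semaphore caps concurrency so a large site is not hit with every request at once.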
560 
561### ✅ Quality Assurance
562- ✅ **Fully Tested** - 1,880+ tests with comprehensive coverage
563 
564---
565 
566## 📦 Installation
567 
568```bash
569# Basic install (documentation scraping, GitHub analysis, PDF, packaging)
570pip install skill-seekers
571 
572# With all LLM platform support
573pip install skill-seekers[all-llms]
574 
575# With MCP server
576pip install skill-seekers[mcp]
577 
578# Everything
579pip install skill-seekers[all]
580```
581 
582**Need help choosing?** Run the setup wizard:
583```bash
584skill-seekers-setup
585```
586 
587### Installation Options
588 
589| Install | Features |
590|---------|----------|
591| `pip install skill-seekers` | Scraping, GitHub analysis, PDF, all platforms |
592| `pip install skill-seekers[gemini]` | + Google Gemini support |
593| `pip install skill-seekers[openai]` | + OpenAI ChatGPT support |
594| `pip install skill-seekers[all-llms]` | + All LLM platforms |
595| `pip install skill-seekers[mcp]` | + MCP server for Claude Code, Cursor, etc. |
596| `pip install skill-seekers[all]` | Everything enabled |
597 
598---
599 
600## 🚀 One-Command Install Workflow
601 
602**The fastest way to go from config to uploaded skill - complete automation:**
603 
604```bash
605# Install React skill from official configs (auto-uploads to Claude)
606skill-seekers install --config react
607 
608# Install from local config file
609skill-seekers install --config configs/custom.json
610 
611# Install without uploading (package only)
612skill-seekers install --config django --no-upload
613 
614# Preview workflow without executing
615skill-seekers install --config react --dry-run
616```
617 
618**Time:** 20-45 minutes total | **Quality:** Production-ready (9/10) | **Cost:** Free
619 
620**Phases executed:**
621```
622📥 PHASE 1: Fetch Config (if config name provided)
623📖 PHASE 2: Scrape Documentation
624✨ PHASE 3: AI Enhancement (MANDATORY - no skip option)
625📦 PHASE 4: Package Skill
626☁️  PHASE 5: Upload to Claude (optional, requires API key)
627```
628 
629**Requirements:**
630- ANTHROPIC_API_KEY environment variable (for auto-upload)
631- Claude Code Max plan (for local AI enhancement)
632 
633---
634 
635## 📊 Feature Matrix
636 
637Skill Seekers supports **4 LLM platforms** and **5 skill modes** with full feature parity.
638 
639**Platforms:** Claude AI, Google Gemini, OpenAI ChatGPT, Generic Markdown
640**Skill Modes:** Documentation, GitHub, PDF, Unified Multi-Source, Local Repository
641 
642See [Complete Feature Matrix](docs/FEATURE_MATRIX.md) for detailed platform and feature support.
643 
644### Quick Platform Comparison
645 
646| Feature | Claude | Gemini | OpenAI | Markdown |
647|---------|--------|--------|--------|----------|
648| Format | ZIP + YAML | tar.gz | ZIP + Vector | ZIP |
649| Upload | ✅ API | ✅ API | ✅ API | ❌ Manual |
650| Enhancement | ✅ Sonnet 4 | ✅ 2.0 Flash | ✅ GPT-4o | ❌ None |
651| All Skill Modes | ✅ | ✅ | ✅ | ✅ |
652 
653---
654 
655## Usage Examples
656 
657### Documentation Scraping
658 
```bash
# Scrape documentation website
skill-seekers scrape --config configs/react.json

# Quick scrape without config
skill-seekers scrape --url https://react.dev --name react

# With async mode (2-3x faster)
skill-seekers scrape --config configs/godot.json --async --workers 8
```
669 
### PDF Extraction

```bash
# Basic PDF extraction
skill-seekers pdf --pdf docs/manual.pdf --name myskill

# Advanced features: extract tables, process in parallel, use 8 CPU cores
# (comments cannot follow a trailing backslash, so they are listed here)
skill-seekers pdf --pdf docs/manual.pdf --name myskill \
    --extract-tables \
    --parallel \
    --workers 8

# Scanned PDFs (requires: pip install pytesseract Pillow)
skill-seekers pdf --pdf docs/scanned.pdf --name myskill --ocr
```

### GitHub Repository Analysis

```bash
# Basic repository scraping
skill-seekers github --repo facebook/react

# With authentication (higher rate limits)
export GITHUB_TOKEN=ghp_your_token_here
skill-seekers github --repo facebook/react

# Customize what to include: GitHub Issues (limited to 100) and CHANGELOG.md
# (comments cannot follow a trailing backslash, so they are listed here)
skill-seekers github --repo django/django \
    --include-issues \
    --max-issues 100 \
    --include-changelog
```

### Unified Multi-Source Scraping

**Combine documentation + GitHub + PDF into one unified skill with conflict detection:**

```bash
# Use existing unified configs
skill-seekers unified --config configs/react_unified.json
skill-seekers unified --config configs/django_unified.json

# Or create unified config
cat > configs/myframework_unified.json << 'EOF'
{
  "name": "myframework",
  "merge_mode": "rule-based",
  "sources": [
    {
      "type": "documentation",
      "base_url": "https://docs.myframework.com/",
      "max_pages": 200
    },
    {
      "type": "github",
      "repo": "owner/myframework",
      "code_analysis_depth": "surface"
    }
  ]
}
EOF

skill-seekers unified --config configs/myframework_unified.json
```

**Conflict Detection automatically finds:**
- 🔴 **Missing in code** (high): Documented but not implemented
- 🟡 **Missing in docs** (medium): Implemented but not documented
- ⚠️ **Signature mismatch**: Different parameters/types
- ℹ️ **Description mismatch**: Different explanations

**Full Guide:** See [docs/UNIFIED_SCRAPING.md](docs/UNIFIED_SCRAPING.md) for complete documentation.

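To make the first three conflict categories concrete, here is a simplified sketch of the comparison. It assumes the documented and implemented APIs have already been reduced to `{name: signature}` dictionaries; the `Conflict` class, the `detect_conflicts` function, and the `"unrated"` label are illustrative and not the tool's actual implementation. Description mismatches are left out because they need a semantic comparison rather than a string check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conflict:
    name: str
    kind: str      # "missing_in_code", "missing_in_docs", or "signature_mismatch"
    severity: str  # "high", "medium", or "unrated"


def detect_conflicts(documented: dict[str, str], implemented: dict[str, str]) -> list[Conflict]:
    """Compare {api_name: signature} maps extracted from docs and from code."""
    conflicts = []
    for name in sorted(documented.keys() | implemented.keys()):
        if name not in implemented:
            conflicts.append(Conflict(name, "missing_in_code", "high"))
        elif name not in documented:
            conflicts.append(Conflict(name, "missing_in_docs", "medium"))
        elif documented[name] != implemented[name]:
            # The list above gives no severity for this category.
            conflicts.append(Conflict(name, "signature_mismatch", "unrated"))
    return conflicts


docs_api = {"render": "render(node, container)", "hydrate": "hydrate(node)"}
code_api = {"render": "render(node, container, callback)", "flush": "flush()"}
for conflict in detect_conflicts(docs_api, code_api):
    print(conflict.kind, conflict.name, conflict.severity)
```

Sorting the names keeps the report stable between runs, which makes it easier to diff two scans of the same project.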
### Private Config Repositories

**Share custom configs across teams using private git repositories:**

```
# Option 1: Using MCP tools (recommended)
# These are MCP tool calls, not shell commands.
# Register your team's private repo
add_config_source(
    name="team",
    git_url="https://github.com/mycompany/skill-configs.git",
    token_env="GITHUB_TOKEN"
)

# Fetch config from team repo
fetch_config(source="team", config_name="internal-api")
```

**Supported Platforms:**
- GitHub (`GITHUB_TOKEN`)
- GitLab (`GITLAB_TOKEN`)
- Gitea (`GITEA_TOKEN`)
- Bitbucket (`BITBUCKET_TOKEN`)

**Full Guide:** See [docs/GIT_CONFIG_SOURCES.md](docs/GIT_CONFIG_SOURCES.md) for complete documentation.

## How It Works

```mermaid
graph LR
    A[Documentation Website] --> B[Skill Seekers]
    B --> C[Scraper]
    B --> D[AI Enhancement]
    B --> E[Packager]
    C --> F[Organized References]
    D --> F
    F --> E
    E --> G[Claude Skill .zip]
    G --> H[Upload to Claude AI]
```

0. **Detect llms.txt**: Checks for `llms-full.txt`, `llms.txt`, and `llms-small.txt` first
1. **Scrape**: Extracts all pages from the documentation
2. **Categorize**: Organizes content into topics (API, guides, tutorials, etc.)
3. **Enhance**: AI analyzes the docs and creates a comprehensive SKILL.md with examples
4. **Package**: Bundles everything into a Claude-ready `.zip` file

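Step 0 can be pictured as an ordered lookup. The sketch below assumes the llms.txt variants sit at the site root and takes an `exists` callback (for example, an HTTP HEAD check) so that it runs offline; `llms_txt_candidates` and `pick_first_available` are hypothetical names, not functions from the tool.

```python
from urllib.parse import urljoin

# Order taken from step 0 above.
LLMS_TXT_VARIANTS = ("llms-full.txt", "llms.txt", "llms-small.txt")


def llms_txt_candidates(base_url: str) -> list[str]:
    """Return candidate llms.txt URLs for a docs site, in lookup order."""
    # Assumption: the files live at the site root; the tool may look elsewhere.
    root = urljoin(base_url, "/")
    return [urljoin(root, name) for name in LLMS_TXT_VARIANTS]


def pick_first_available(candidates, exists):
    """Return the first candidate for which exists(url) is true, else None."""
    for url in candidates:
        if exists(url):
            return url
    return None  # Nothing found: fall back to regular page scraping.


for url in llms_txt_candidates("https://docs.myframework.com/guide/intro"):
    print(url)
```

Injecting `exists` keeps the ordering logic separate from the network code, so the lookup order can be tested without a live site.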
## 📋 Prerequisites

**Before you start, make sure you have:**

1. **Python 3.10 or higher** - [Download](https://www.python.org/downloads/) | Check: `python3 --version`
2. **Git** - [Download](https://git-scm.com/) | Check: `git --version`
3. **15-30 minutes** for first-time setup

**First-time user?** → **[Start Here: Bulletproof Quick Start Guide](BULLETPROOF_QUICKSTART.md)** 🎯

---

## 📤 Uploading Skills to Claude

Once your skill is packaged, upload it to Claude using one of the options below.

### Option 1: Automatic Upload (API-based)

```bash
# Set your API key (lasts for the current shell session)
export ANTHROPIC_API_KEY=sk-ant-...

# Package and upload automatically
skill-seekers package output/react/ --upload

# Or upload an existing .zip
skill-seekers upload output/react.zip
```

### Option 2: Manual Upload (No API Key)

```bash
# Package skill
skill-seekers package output/react/
# → Creates output/react.zip

# Then manually upload:
# - Go to https://claude.ai/skills
# - Click "Upload Skill"
# - Select output/react.zip
```

### Option 3: MCP (Claude Code)

```
In Claude Code, just ask:
"Package and upload the React skill"
```

---

## 🤖 Installing to AI Agents

Skill Seekers can automatically install skills to 10+ AI coding agents.

```bash
# Install to specific agent
skill-seekers install-agent output/react/ --agent cursor

# Install to all agents at once
skill-seekers install-agent output/react/ --agent all

# Preview without installing
skill-seekers install-agent output/react/ --agent cursor --dry-run
```

### Supported Agents

| Agent | Path | Type |
|-------|------|------|
| **Claude Code** | `~/.claude/skills/` | Global |
| **Cursor** | `.cursor/skills/` | Project |
| **VS Code / Copilot** | `.github/skills/` | Project |
| **Amp** | `~/.amp/skills/` | Global |
| **Goose** | `~/.config/goose/skills/` | Global |
| **OpenCode** | `~/.opencode/skills/` | Global |
| **Windsurf** | `~/.windsurf/skills/` | Global |

---

## 🔌 MCP Integration (26 Tools)

Skill Seekers ships an MCP server for use from Claude Code, Cursor, Windsurf, VS Code + Cline, or IntelliJ IDEA.

```bash
# stdio mode (Claude Code, VS Code + Cline)
python -m skill_seekers.mcp.server_fastmcp

# HTTP mode (Cursor, Windsurf, IntelliJ)
python -m skill_seekers.mcp.server_fastmcp --transport http --port 8765

# Auto-configure all agents at once
./setup_mcp.sh
```

**All 26 tools available:**
- **Core (9):** `list_configs`, `generate_config`, `validate_config`, `estimate_pages`, `scrape_docs`, `package_skill`, `upload_skill`, `enhance_skill`, `install_skill`
- **Extended (10):** `scrape_github`, `scrape_pdf`, `unified_scrape`, `merge_sources`, `detect_conflicts`, `add_config_source`, `fetch_config`, `list_config_sources`, `remove_config_source`, `split_config`
- **Vector DB (4):** `export_to_chroma`, `export_to_weaviate`, `export_to_faiss`, `export_to_qdrant`
- **Cloud (3):** `cloud_upload`, `cloud_download`, `cloud_list`

**Full Guide:** [docs/MCP_SETUP.md](docs/MCP_SETUP.md)

---

## ⚙️ Configuration

### Available Presets (24+)

```bash
# List all presets
skill-seekers list-configs
```

| Category | Presets |
|----------|---------|
| **Web Frameworks** | `react`, `vue`, `angular`, `svelte`, `nextjs` |
| **Python** | `django`, `flask`, `fastapi`, `sqlalchemy`, `pytest` |
| **Game Development** | `godot`, `pygame`, `unity` |
| **Tools & DevOps** | `docker`, `kubernetes`, `terraform`, `ansible` |
| **Unified (Docs + GitHub)** | `react-unified`, `vue-unified`, `nextjs-unified`, and more |

### Creating Your Own Config

```bash
# Option 1: Interactive
skill-seekers scrape --interactive

# Option 2: Copy and edit a preset
cp configs/react.json configs/myframework.json
nano configs/myframework.json
skill-seekers scrape --config configs/myframework.json
```

### Config File Structure

```json
{
  "name": "myframework",
  "description": "When to use this skill",
  "base_url": "https://docs.myframework.com/",
  "selectors": {
    "main_content": "article",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/docs", "/guide"],
    "exclude": ["/blog", "/about"]
  },
  "categories": {
    "getting_started": ["intro", "quickstart"],
    "api": ["api", "reference"]
  },
  "rate_limit": 0.5,
  "max_pages": 500
}
```

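The sketch below shows one plausible reading of `url_patterns` and `categories`: substring matching against the URL path, with the first matching category winning. These semantics, the helper names `should_scrape` and `categorize`, and the `"other"` fallback are assumptions for illustration; the scraper's real matching rules may differ.

```python
import json
from urllib.parse import urlparse

CONFIG_JSON = """
{
  "name": "myframework",
  "base_url": "https://docs.myframework.com/",
  "url_patterns": {"include": ["/docs", "/guide"], "exclude": ["/blog", "/about"]},
  "categories": {"getting_started": ["intro", "quickstart"], "api": ["api", "reference"]}
}
"""


def should_scrape(url: str, patterns: dict) -> bool:
    """Assumed rule: the path must match an include pattern and no exclude pattern."""
    path = urlparse(url).path  # Match on the path so the hostname cannot trigger a pattern.
    if any(p in path for p in patterns.get("exclude", [])):
        return False
    include = patterns.get("include", [])
    return not include or any(p in path for p in include)


def categorize(url: str, categories: dict, default: str = "other") -> str:
    """Assumed rule: the first category with a keyword found in the path wins."""
    path = urlparse(url).path.lower()
    for category, keywords in categories.items():
        if any(keyword in path for keyword in keywords):
            return category
    return default


config = json.loads(CONFIG_JSON)
for url in (
    "https://docs.myframework.com/docs/quickstart",
    "https://docs.myframework.com/docs/api/widgets",
    "https://docs.myframework.com/blog/release-notes",
):
    if should_scrape(url, config["url_patterns"]):
        print(url, "->", categorize(url, config["categories"]))
    else:
        print(url, "-> skipped")
```

Matching on the parsed path rather than the full URL avoids a subtle trap: a hostname such as `docs.myframework.com` would otherwise satisfy the `/docs` include pattern for every page on the site.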
### Where to Store Configs

The tool searches in this order:

1. Exact path as provided
2. `./configs/` (current directory)
3. `~/.config/skill-seekers/configs/` (user config directory)
4. SkillSeekersWeb.com API (preset configs)

---

## 📊 What Gets Created

```
output/
├── godot_data/              # Scraped raw data
│   ├── pages/              # JSON files (one per page)
│   └── summary.json        # Overview
│
└── godot/                   # The skill
    ├── SKILL.md            # Enhanced with real examples
    ├── references/         # Categorized docs
    │   ├── index.md
    │   ├── getting_started.md
    │   ├── scripting.md
    │   └── ...
    ├── scripts/            # Empty (add your own)
    └── assets/             # Empty (add your own)
```

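A quick sanity check of a generated skill folder might look like the sketch below. `check_skill_layout` is a hypothetical helper, and treating `SKILL.md` and a non-empty `references/` folder as the only required parts is an assumption based on the tree above.

```python
from pathlib import Path


def check_skill_layout(skill_dir: Path) -> list[str]:
    """Return a list of problems found in a generated skill directory."""
    problems = []
    if not (skill_dir / "SKILL.md").is_file():
        problems.append("missing SKILL.md")
    references = skill_dir / "references"
    if not references.is_dir():
        problems.append("missing references/")
    elif not any(references.glob("*.md")):
        problems.append("references/ contains no .md files")
    # scripts/ and assets/ start out empty, so they are not checked here.
    return problems


if __name__ == "__main__":
    issues = check_skill_layout(Path("output/godot"))
    print("OK" if not issues else "\n".join(issues))
```
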
---

## 🐛 Troubleshooting

### No Content Extracted?
- Check your `main_content` selector
- Try: `article`, `main`, `div[role="main"]`

### Existing Data Not Being Used?
```bash
# Force re-scrape
rm -rf output/myframework_data/
skill-seekers scrape --config configs/myframework.json
```

### Categories Poorly Matched?
Edit the `categories` section of your config and use more specific keywords.

### Want to Update Docs?
```bash
# Delete old data and re-scrape
rm -rf output/godot_data/
skill-seekers scrape --config configs/godot.json
```

### Enhancement Not Working?
```bash
# Check if API key is set (without printing the secret)
[ -n "$ANTHROPIC_API_KEY" ] && echo "ANTHROPIC_API_KEY is set" || echo "ANTHROPIC_API_KEY is not set"

# Try LOCAL mode instead (uses Claude Code Max, no API key needed)
skill-seekers enhance output/react/ --mode LOCAL

# Monitor background enhancement status
skill-seekers enhance-status output/react/ --watch
```

### GitHub Rate Limit Issues?
```bash
# Set a GitHub token (5000 req/hour vs 60/hour anonymous)
export GITHUB_TOKEN=ghp_your_token_here

# Or configure multiple profiles
skill-seekers config --github
```

---

## 📈 Performance

| Task | Time | Notes |
|------|------|-------|
| Scraping (sync) | 15-45 min | First time only, thread-based |
| Scraping (async) | 5-15 min | 2-3x faster with `--async` flag |
| Building | 1-3 min | Fast rebuild from cache |
| Re-building | <1 min | With `--skip-scrape` |
| Enhancement (LOCAL) | 30-60 sec | Uses Claude Code Max |
| Enhancement (API) | 20-40 sec | Requires API key |
| Packaging | 5-10 sec | Final .zip creation |

---

## 📚 Documentation

### Getting Started
- **[BULLETPROOF_QUICKSTART.md](BULLETPROOF_QUICKSTART.md)** - 🎯 **START HERE** if you're new!
- **[QUICKSTART.md](QUICKSTART.md)** - Quick start for experienced users
- **[TROUBLESHOOTING.md](TROUBLESHOOTING.md)** - Common issues and solutions
- **[docs/QUICK_REFERENCE.md](docs/QUICK_REFERENCE.md)** - One-page cheat sheet

### Guides
- **[docs/LARGE_DOCUMENTATION.md](docs/LARGE_DOCUMENTATION.md)** - Handle 10K-40K+ page docs
- **[ASYNC_SUPPORT.md](ASYNC_SUPPORT.md)** - Async mode guide (2-3x faster scraping)
- **[docs/ENHANCEMENT_MODES.md](docs/ENHANCEMENT_MODES.md)** - AI enhancement modes guide
- **[docs/MCP_SETUP.md](docs/MCP_SETUP.md)** - MCP integration setup
- **[docs/UNIFIED_SCRAPING.md](docs/UNIFIED_SCRAPING.md)** - Multi-source scraping

### Integration Guides
- **[docs/integrations/LANGCHAIN.md](docs/integrations/LANGCHAIN.md)** - LangChain RAG
- **[docs/integrations/CURSOR.md](docs/integrations/CURSOR.md)** - Cursor IDE
- **[docs/integrations/WINDSURF.md](docs/integrations/WINDSURF.md)** - Windsurf IDE
- **[docs/integrations/CLINE.md](docs/integrations/CLINE.md)** - Cline (VS Code)
- **[docs/integrations/RAG_PIPELINES.md](docs/integrations/RAG_PIPELINES.md)** - All RAG pipelines

---

## 📝 License

MIT License - see the [LICENSE](LICENSE) file for details.

---

Happy skill building! 🚀

---

## 🔒 Security

[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/yusufkaraaslan-skill-seekers-badge.png)](https://mseep.ai/app/yusufkaraaslan-skill-seekers)