<p align="center">
  <img src="docs/assets/logo.png" alt="Skill Seekers" width="200"/>
</p>

# Skill Seekers

English | [简体中文](https://github.com/yusufkaraaslan/Skill_Seekers/blob/main/README.zh-CN.md)

**The data layer for AI systems.** Skill Seekers turns any documentation, GitHub repo, or PDF into structured knowledge assets, ready to power AI Skills (Claude, Gemini, OpenAI), RAG pipelines (LangChain, LlamaIndex, Pinecone), and AI coding assistants (Cursor, Windsurf, Cline) in minutes, not hours.

> **[Visit SkillSeekersWeb.com](https://skillseekersweb.com/)** - Browse 24+ preset configs, share your configs, and access complete documentation!

> **[View Development Roadmap & Tasks](https://github.com/users/yusufkaraaslan/projects/2)** - 134 tasks across 10 categories, pick any to contribute!

## The Data Layer for AI Systems

**Skill Seekers is the universal preprocessing layer** that sits between raw documentation and every AI system that consumes it. Whether you are building Claude skills, a LangChain RAG pipeline, or a Cursor `.cursorrules` file, the data preparation is identical.
You do it once, and export to all targets.

```bash
# One command → structured knowledge asset
skill-seekers create https://react.dev/
# or: skill-seekers create facebook/react
# or: skill-seekers create ./my-project

# Export to any AI system
skill-seekers package output/react --target claude       # → Claude AI Skill (ZIP)
skill-seekers package output/react --target langchain    # → LangChain Documents
skill-seekers package output/react --target llama-index  # → LlamaIndex TextNodes
skill-seekers package output/react --target cursor       # → .cursorrules
```

### What gets built

| Output | Target | What it powers |
|--------|--------|---------------|
| **Claude Skill** (ZIP + YAML) | `--target claude` | Claude Code, Claude API |
| **Gemini Skill** (tar.gz) | `--target gemini` | Google Gemini |
| **OpenAI / Custom GPT** (ZIP) | `--target openai` | GPT-4o, custom assistants |
| **LangChain Documents** | `--target langchain` | QA chains, agents, retrievers |
| **LlamaIndex TextNodes** | `--target llama-index` | Query engines, chat engines |
| **Haystack Documents** | `--target haystack` | Enterprise RAG pipelines |
| **Pinecone-ready** (Markdown) | `--target markdown` | Vector upsert |
| **ChromaDB / FAISS / Qdrant** | `--format chroma/faiss/qdrant` | Local vector DBs |
| **Cursor** `.cursorrules` | `--target claude` → copy | Cursor IDE AI context |
| **Windsurf / Cline / Continue** | `--target claude` → copy | VS Code, IntelliJ, Vim |

### Why it matters

- **99% faster** - Days of manual data prep → 15-45 minutes
- **AI Skill quality** - 500+ line SKILL.md files with examples, patterns, and guides
- **RAG-ready chunks** - Smart chunking preserves code blocks and maintains context
- **Multi-source** - Combine docs + GitHub + PDFs into one knowledge asset
- **One prep, every target** - Export the same asset to 16 platforms without re-scraping
- **Battle-tested** - 1,880+ tests, 24+ framework presets, production-ready

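The export commands above can also be generated from a script, which may be convenient when the same asset is packaged for many targets. The following is a minimal sketch, not part of Skill Seekers itself: the helper names `build_package_commands` and `render` are hypothetical, and only the `skill-seekers package <dir> --target <name>` form and the target names are taken from the table above. It only builds and prints the commands; it does not run them.

```python
import shlex

# Target names copied from the "What gets built" table; they are not validated here.
TARGETS = ("claude", "gemini", "openai", "langchain", "llama-index", "haystack", "markdown")


def build_package_commands(skill_dir, targets=TARGETS):
    """Return one `skill-seekers package` argv list per target (hypothetical helper)."""
    return [["skill-seekers", "package", skill_dir, "--target", t] for t in targets]


def render(commands):
    """Format argv lists as shell-safe command lines for review or logging."""
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    for line in render(build_package_commands("output/react")):
        print(line)
```

To execute the commands instead of printing them, each argv list could be passed to `subprocess.run(cmd, check=True)`, assuming `skill-seekers` is installed and on `PATH`.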
## Quick Start (3 Commands)

```bash
# 1. Install
pip install skill-seekers

# 2. Create skill from any source
skill-seekers create https://docs.djangoproject.com/

# 3. Package for your AI platform
skill-seekers package output/django --target claude
```

**That's it!** You now have `output/django-claude.zip` ready to use.

### Other Sources

```bash
# GitHub repository
skill-seekers create facebook/react

# Local project
skill-seekers create ./my-project

# PDF document
skill-seekers create manual.pdf
```

### Export Everywhere

```bash
# Package for multiple platforms
for platform in claude gemini openai langchain; do
  skill-seekers package output/django --target "$platform"
done
```

## What is Skill Seekers?

Skill Seekers is the **data layer for AI systems**. It transforms documentation websites, GitHub repositories, and PDF files into structured knowledge assets for every AI target:

| Use Case | What you get | Examples |
|----------|-------------|---------|
| **AI Skills** | Comprehensive SKILL.md + references | Claude Code, Gemini, GPT |
| **RAG Pipelines** | Chunked documents with rich metadata | LangChain, LlamaIndex, Haystack |
| **Vector Databases** | Pre-formatted data ready for upsert | Pinecone, Chroma, Weaviate, FAISS |
| **AI Coding Assistants** | Context files your IDE AI reads automatically | Cursor, Windsurf, Cline, Continue.dev |

## Documentation

| I want to... | Read this |
|--------------|-----------|
| **Get started quickly** | [Quick Start](docs/getting-started/02-quick-start.md) - 3 commands to first skill |
| **Understand concepts** | [Core Concepts](docs/user-guide/01-core-concepts.md) - How it works |
| **Scrape sources** | [Scraping Guide](docs/user-guide/02-scraping.md) - All source types |
| **Enhance skills** | [Enhancement Guide](docs/user-guide/03-enhancement.md) - AI enhancement |
| **Export skills** | [Packaging Guide](docs/user-guide/04-packaging.md) - Platform export |
| **Look up commands** | [CLI Reference](docs/reference/CLI_REFERENCE.md) - All 20 commands |
| **Configure** | [Config Format](docs/reference/CONFIG_FORMAT.md) - JSON specification |
| **Fix issues** | [Troubleshooting](docs/user-guide/06-troubleshooting.md) - Common problems |

**Complete documentation:** [docs/README.md](docs/README.md)

Instead of spending days on manual preprocessing, Skill Seekers:

1. **Ingests** - docs, GitHub repos, local codebases, PDFs
2. **Analyzes** - deep AST parsing, pattern detection, API extraction
3. **Structures** - categorized reference files with metadata
4. **Enhances** - AI-powered SKILL.md generation (Claude, Gemini, or local)
5. **Exports** - 16 platform-specific formats from one asset

## Why Use This?

### For AI Skill Builders (Claude, Gemini, OpenAI)

- **Production-grade Skills** - 500+ line SKILL.md files with code examples, patterns, and guides
- **Enhancement Workflows** - Apply `security-focus`, `architecture-comprehensive`, or custom YAML presets
- **Any Domain** - Game engines (Godot, Unity), frameworks (React, Django), internal tools
- **Teams** - Combine internal docs + code into a single source of truth
- **Quality** - AI-enhanced with examples, quick reference, and navigation guidance

### For RAG Builders & AI Engineers

- **RAG-ready data** - Pre-chunked LangChain `Documents`, LlamaIndex `TextNodes`, Haystack `Documents`
- **99% faster** - Days of preprocessing → 15-45 minutes
- **Smart metadata** - Categories, sources, types → better retrieval accuracy
- **Multi-source** - Combine docs + GitHub + PDFs in one pipeline
- **Platform-agnostic** - Export to any vector DB or framework without re-scraping

### For AI Coding Assistant Users

- **Cursor / Windsurf / Cline** - Generate `.cursorrules` / `.windsurfrules` / `.clinerules` automatically
- **Persistent context** - AI "knows" your frameworks without repeated prompting
- **Always current** - Update context in minutes when docs change

## Key Features

### Documentation Scraping
- ✅ **llms.txt Support** - Automatically detects and uses LLM-ready documentation files (10x faster)
- ✅ **Universal Scraper** - Works with ANY documentation website
- ✅ **Smart Categorization** - Automatically organizes content by topic
- ✅ **Code Language Detection** - Recognizes Python, JavaScript, C++, GDScript, etc.
- ✅ **24+ Ready-to-Use Presets** - Godot, React, Vue, Django, FastAPI, and more

### PDF Support
- ✅ **Basic PDF Extraction** - Extract text, code, and images from PDF files
- ✅ **OCR for Scanned PDFs** - Extract text from
  scanned documents
- ✅ **Password-Protected PDFs** - Handle encrypted PDFs
- ✅ **Table Extraction** - Extract complex tables from PDFs
- ✅ **Parallel Processing** - 3x faster for large PDFs
- ✅ **Intelligent Caching** - 50% faster on re-runs

### GitHub Repository Analysis
- ✅ **Deep Code Analysis** - AST parsing for Python, JavaScript, TypeScript, Java, C++, Go
- ✅ **API Extraction** - Functions, classes, methods with parameters and types
- ✅ **Repository Metadata** - README, file tree, language breakdown, stars/forks
- ✅ **GitHub Issues & PRs** - Fetch open/closed issues with labels and milestones
- ✅ **CHANGELOG & Releases** - Automatically extract version history
- ✅ **Conflict Detection** - Compare documented APIs vs actual code implementation
- ✅ **MCP Integration** - Natural language: "Scrape GitHub repo facebook/react"

### Unified Multi-Source Scraping
- ✅ **Combine Multiple Sources** - Mix documentation + GitHub + PDF in one skill
- ✅ **Conflict Detection** - Automatically finds discrepancies between docs and code
- ✅ **Intelligent Merging** - Rule-based or AI-powered conflict resolution
- ✅ **Transparent Reporting** - Side-by-side comparison with ⚠️ warnings
- ✅ **Documentation Gap Analysis** - Identifies outdated docs and undocumented features
- ✅ **Single Source of Truth** - One skill showing both intent (docs) and reality (code)
- ✅ **Backward Compatible** - Legacy single-source configs still work

### Multi-LLM Platform Support
- ✅ **4 LLM Platforms** - Claude AI, Google Gemini, OpenAI ChatGPT, Generic Markdown
- ✅ **Universal Scraping** - Same documentation works for all platforms
- ✅ **Platform-Specific Packaging** - Optimized formats for each LLM
- ✅ **One-Command Export** - `--target` flag selects platform
- ✅ **Optional Dependencies** - Install only what you need
- ✅ **100% Backward Compatible** - Existing Claude workflows unchanged

| Platform | Format | Upload | Enhancement | API Key | Custom Endpoint |
|----------|--------|--------|-------------|---------|-----------------|
| **Claude AI** | ZIP + YAML | ✅ Auto | ✅ Yes | ANTHROPIC_API_KEY | ANTHROPIC_BASE_URL |
| **Google Gemini** | tar.gz | ✅ Auto | ✅ Yes | GOOGLE_API_KEY | - |
| **OpenAI ChatGPT** | ZIP + Vector Store | ✅ Auto | ✅ Yes | OPENAI_API_KEY | - |
| **Generic Markdown** | ZIP | ❌ Manual | ❌ No | - | - |

```bash
# Claude (default - no changes needed!)
skill-seekers package output/react/
skill-seekers upload react.zip

# Google Gemini
pip install skill-seekers[gemini]
skill-seekers package output/react/ --target gemini
skill-seekers upload react-gemini.tar.gz --target gemini

# OpenAI ChatGPT
pip install skill-seekers[openai]
skill-seekers package output/react/ --target openai
skill-seekers upload react-openai.zip --target openai

# Generic Markdown (universal export)
skill-seekers package output/react/ --target markdown
# Use the markdown files directly in any LLM
```

<details>
<summary><strong>Environment Variables for Claude-Compatible APIs (e.g., GLM-4.7)</strong></summary>

Skill Seekers supports any Claude-compatible API endpoint:

```bash
# Option 1: Official Anthropic API (default)
export ANTHROPIC_API_KEY=sk-ant-...

# Option 2: GLM-4.7 Claude-compatible API
export ANTHROPIC_API_KEY=your-glm-47-api-key
export ANTHROPIC_BASE_URL=https://glm-4-7-endpoint.com/v1

# All AI enhancement features will use the configured endpoint
skill-seekers enhance output/react/
skill-seekers analyze --directory . --enhance
```

**Note**: Setting `ANTHROPIC_BASE_URL` allows you to use any Claude-compatible API endpoint, such as GLM-4.7 from Zhipu AI (智谱 AI) or other compatible services.

</details>

**Installation:**
```bash
# Install with Gemini support
pip install skill-seekers[gemini]

# Install with OpenAI support
pip install skill-seekers[openai]

# Install with all LLM platforms
pip install skill-seekers[all-llms]
```

### RAG Framework Integrations

- ✅ **LangChain Documents** - Direct export to `Document` format with `page_content` + metadata
  - Perfect for: QA chains, retrievers, vector stores, agents
  - Example: [LangChain RAG Pipeline](examples/langchain-rag-pipeline/)
  - Guide: [LangChain Integration](docs/integrations/LANGCHAIN.md)

- ✅ **LlamaIndex TextNodes** - Export to `TextNode` format with unique IDs + embeddings
  - Perfect for: Query engines, chat engines, storage context
  - Example: [LlamaIndex Query Engine](examples/llama-index-query-engine/)
  - Guide: [LlamaIndex Integration](docs/integrations/LLAMA_INDEX.md)

- ✅ **Pinecone-Ready Format** - Optimized for vector database upsert
  - Perfect for: Production vector search, semantic search, hybrid search
  - Example: [Pinecone Upsert](examples/pinecone-upsert/)
  - Guide: [Pinecone Integration](docs/integrations/PINECONE.md)

**Quick Export:**
```bash
# LangChain Documents (JSON)
skill-seekers package output/django --target langchain
# → output/django-langchain.json

# LlamaIndex TextNodes (JSON)
skill-seekers package output/django --target llama-index
# → output/django-llama-index.json

# Markdown (Universal)
skill-seekers package output/django --target markdown
# → output/django-markdown/SKILL.md + references/
```

**Complete RAG Pipeline Guide:** [RAG Pipelines Documentation](docs/integrations/RAG_PIPELINES.md)

---

### AI Coding Assistant Integrations

Transform any framework documentation into
expert coding context for 4+ AI assistants:

- ✅ **Cursor IDE** - Generate `.cursorrules` for AI-powered code suggestions
  - Perfect for: Framework-specific code generation, consistent patterns
  - Works with: Cursor IDE (VS Code fork)
  - Guide: [Cursor Integration](docs/integrations/CURSOR.md)
  - Example: [Cursor React Skill](examples/cursor-react-skill/)

- ✅ **Windsurf** - Customize Windsurf's AI assistant context with `.windsurfrules`
  - Perfect for: IDE-native AI assistance, flow-based coding
  - Works with: Windsurf IDE by Codeium
  - Guide: [Windsurf Integration](docs/integrations/WINDSURF.md)
  - Example: [Windsurf FastAPI Context](examples/windsurf-fastapi-context/)

- ✅ **Cline (VS Code)** - System prompts + MCP for VS Code agent
  - Perfect for: Agentic code generation in VS Code
  - Works with: Cline extension for VS Code
  - Guide: [Cline Integration](docs/integrations/CLINE.md)
  - Example: [Cline Django Assistant](examples/cline-django-assistant/)

- ✅ **Continue.dev** - Context servers for IDE-agnostic AI
  - Perfect for: Multi-IDE environments (VS Code, JetBrains, Vim), custom LLM providers
  - Works with: Any IDE with Continue.dev plugin
  - Guide: [Continue Integration](docs/integrations/CONTINUE_DEV.md)
  - Example: [Continue Universal Context](examples/continue-dev-universal/)

**Quick Export for AI Coding Tools:**
```bash
# For any AI coding assistant (Cursor, Windsurf, Cline, Continue.dev)
skill-seekers scrape --config configs/django.json
skill-seekers package output/django --target claude  # or --target markdown

# Copy to your project (example for Cursor)
cp output/django-claude/SKILL.md my-project/.cursorrules

# Or for Windsurf
cp output/django-claude/SKILL.md my-project/.windsurf/rules/django.md

# Or for Cline
cp output/django-claude/SKILL.md my-project/.clinerules

# Or for Continue.dev (HTTP server)
python examples/continue-dev-universal/context_server.py
# Configure in ~/.continue/config.json
```

**Integration Hub:** [All AI System Integrations](docs/integrations/INTEGRATIONS.md)

---

### Three-Stream GitHub Architecture
- ✅ **Triple-Stream Analysis** - Split GitHub repos into Code, Docs, and Insights streams
- ✅ **Unified Codebase Analyzer** - Works with GitHub URLs AND local paths
- ✅ **C3.x as Analysis Depth** - Choose 'basic' (1-2 min) or 'c3x' (20-60 min) analysis
- ✅ **Enhanced Router Generation** - GitHub metadata, README quick start, common issues
- ✅ **Issue Integration** - Top problems and solutions from GitHub issues
- ✅ **Smart Routing Keywords** - GitHub labels weighted 2x for better topic detection

**Three Streams Explained:**
- **Stream 1: Code** - Deep C3.x analysis (patterns, examples, guides, configs, architecture)
- **Stream 2: Docs** - Repository documentation (README, CONTRIBUTING, docs/*.md)
- **Stream 3: Insights** - Community knowledge (issues, labels, stars, forks)

```python
from skill_seekers.cli.unified_codebase_analyzer import UnifiedCodebaseAnalyzer

# Analyze GitHub repo with all three streams
analyzer = UnifiedCodebaseAnalyzer()
result = analyzer.analyze(
    source="https://github.com/facebook/react",
    depth="c3x",  # or "basic" for fast analysis
    fetch_github_metadata=True
)

# Access code stream (C3.x analysis)
print(f"Design patterns: {len(result.code_analysis['c3_1_patterns'])}")
print(f"Test examples: {result.code_analysis['c3_2_examples_count']}")

# Access docs stream (repository docs)
print(f"README: {result.github_docs['readme'][:100]}")

# Access insights stream (GitHub metadata)
print(f"Stars: {result.github_insights['metadata']['stars']}")
print(f"Common issues: {len(result.github_insights['common_problems'])}")
```

**See complete documentation**: [Three-Stream Implementation Summary](docs/IMPLEMENTATION_SUMMARY_THREE_STREAM.md)

### Smart Rate Limit Management & Configuration
- ✅ **Multi-Token Configuration System** - Manage multiple GitHub accounts (personal, work, OSS)
  - Secure config storage at `~/.config/skill-seekers/config.json` (600 permissions)
  - Per-profile rate limit strategies: `prompt`, `wait`, `switch`, `fail`
  - Configurable timeout per profile (default: 30 min, prevents indefinite waits)
  - Smart fallback chain: CLI arg → Env var → Config file → Prompt
  - API key management for Claude, Gemini, OpenAI
- ✅ **Interactive Configuration Wizard** - Beautiful terminal UI for easy setup
  - Browser integration for token creation (auto-opens GitHub, etc.)
  - Token validation and connection testing
  - Visual status display with color coding
- ✅ **Intelligent Rate Limit Handler** - No more indefinite waits!
  - Upfront warning about rate limits (60/hour vs 5000/hour)
  - Real-time detection from GitHub API responses
  - Live countdown timers with progress
  - Automatic profile switching when rate limited
  - Four strategies: prompt (ask), wait (countdown), switch (try another), fail (abort)
- ✅ **Resume Capability** - Continue interrupted jobs
  - Auto-save progress at configurable intervals (default: 60 sec)
  - List all resumable jobs with progress details
  - Auto-cleanup of old jobs (default: 7 days)
- ✅ **CI/CD Support** - Non-interactive mode for automation
  - `--non-interactive` flag fails fast without prompts
  - `--profile` flag to select specific GitHub account
  - Clear error messages for pipeline logs

**Quick Setup:**
```bash
# One-time configuration (5 minutes)
skill-seekers config --github

# Use specific profile for private repos
skill-seekers github --repo mycompany/private-repo --profile work

# CI/CD mode (fail fast, no prompts)
skill-seekers github --repo owner/repo --non-interactive

# Resume interrupted job
skill-seekers resume --list
skill-seekers resume github_react_20260117_143022
```

**Rate Limit Strategies Explained:**
- **prompt**
  (default) - Ask what to do when rate limited (wait, switch, setup token, cancel)
- **wait** - Automatically wait with countdown timer (respects timeout)
- **switch** - Automatically try next available profile (for multi-account setups)
- **fail** - Fail immediately with clear error (perfect for CI/CD)

### Bootstrap Skill - Self-Hosting

Generate skill-seekers as a Claude Code skill to use within Claude:

```bash
# Generate the skill
./scripts/bootstrap_skill.sh

# Install to Claude Code
cp -r output/skill-seekers ~/.claude/skills/
```

**What you get:**
- ✅ **Complete skill documentation** - All CLI commands and usage patterns
- ✅ **CLI command reference** - Every tool and its options documented
- ✅ **Quick start examples** - Common workflows and best practices
- ✅ **Auto-generated API docs** - Code analysis, patterns, and examples

### Private Config Repositories
- ✅ **Git-Based Config Sources** - Fetch configs from private/team git repositories
- ✅ **Multi-Source Management** - Register unlimited GitHub, GitLab, Bitbucket repos
- ✅ **Team Collaboration** - Share custom configs across 3-5 person teams
- ✅ **Enterprise Support** - Scale to 500+ developers with priority-based resolution
- ✅ **Secure Authentication** - Environment variable tokens (GITHUB_TOKEN, GITLAB_TOKEN)
- ✅ **Intelligent Caching** - Clone once, pull updates automatically
- ✅ **Offline Mode** - Work with cached configs when offline

### Codebase Analysis (C3.x)

**C3.4: Configuration Pattern Extraction with AI Enhancement**
- ✅ **9 Config Formats** - JSON, YAML, TOML, ENV, INI, Python, JavaScript, Dockerfile, Docker Compose
- ✅ **7 Pattern Types** - Database, API, logging, cache, email, auth, server configurations
- ✅ **AI Enhancement** - Optional dual-mode AI analysis (API + LOCAL)
  - Explains what each config does
  - Suggests best practices and improvements
  - **Security analysis** - Finds hardcoded
    secrets, exposed credentials
- ✅ **Auto-Documentation** - Generates JSON + Markdown documentation of all configs
- ✅ **MCP Integration** - `extract_config_patterns` tool with enhancement support

**C3.3: AI-Enhanced How-To Guides**
- ✅ **Comprehensive AI Enhancement** - Transforms basic guides into professional tutorials
- ✅ **5 Automatic Improvements** - Step descriptions, troubleshooting, prerequisites, next steps, use cases
- ✅ **Dual-Mode Support** - API mode (Claude API) or LOCAL mode (Claude Code CLI)
- ✅ **No API Costs with LOCAL Mode** - FREE enhancement using your Claude Code Max plan
- ✅ **Quality Transformation** - 75-line templates → 500+ line comprehensive guides

**Usage:**
```bash
# Quick analysis (1-2 min, basic features only)
skill-seekers analyze --directory tests/ --quick

# Comprehensive analysis with AI (20-60 min, all features)
skill-seekers analyze --directory tests/ --comprehensive

# With AI enhancement
skill-seekers analyze --directory tests/ --enhance
```

**Full Documentation:** [docs/HOW_TO_GUIDES.md](docs/HOW_TO_GUIDES.md#ai-enhancement-new)

### Enhancement Workflow Presets

Reusable YAML-defined enhancement pipelines that control how AI transforms your raw documentation into a polished skill.

- ✅ **5 Bundled Presets** - `default`, `minimal`, `security-focus`, `architecture-comprehensive`, `api-documentation`
- ✅ **User-Defined Presets** - add custom workflows to `~/.config/skill-seekers/workflows/`
- ✅ **Multiple Workflows** - chain two or more workflows in one command
- ✅ **Fully Managed CLI** - list, inspect, copy, add, remove, and validate workflows

```bash
# Apply a single workflow
skill-seekers create ./my-project --enhance-workflow security-focus

# Chain multiple workflows (applied in order)
skill-seekers create ./my-project \
  --enhance-workflow security-focus \
  --enhance-workflow minimal

# Manage presets
skill-seekers workflows list                      # List all (bundled + user)
skill-seekers workflows show security-focus       # Print YAML content
skill-seekers workflows copy security-focus       # Copy to user dir for editing
skill-seekers workflows add ./my-workflow.yaml    # Install a custom preset
skill-seekers workflows remove my-workflow        # Remove a user preset
skill-seekers workflows validate security-focus   # Validate preset structure

# Copy multiple at once
skill-seekers workflows copy security-focus minimal api-documentation

# Add multiple files at once
skill-seekers workflows add ./wf-a.yaml ./wf-b.yaml

# Remove multiple at once
skill-seekers workflows remove my-wf-a my-wf-b
```

**YAML preset format:**
```yaml
name: security-focus
description: "Security-focused review: vulnerabilities, auth, data handling"
version: "1.0"
stages:
  - name: vulnerabilities
    type: custom
    prompt: "Review for OWASP top 10 and common security vulnerabilities..."
  - name: auth-review
    type: custom
    prompt: "Examine authentication and authorisation patterns..."
    uses_history: true
```

### Performance & Scale
- ✅ **Async Mode** - 2-3x faster scraping with async/await (use `--async` flag)
- ✅ **Large Documentation Support** - Handle 10K-40K+ page docs with intelligent splitting
- ✅ **Router/Hub Skills** - Intelligent routing to specialized sub-skills
- ✅ **Parallel Scraping** - Process multiple skills simultaneously
- ✅ **Checkpoint/Resume** - Never lose progress on long scrapes
- ✅ **Caching System** - Scrape once, rebuild instantly

### Quality Assurance
- ✅ **Fully Tested** - 1,880+ tests with comprehensive coverage

---

## Installation

```bash
# Basic install (documentation scraping, GitHub analysis, PDF, packaging)
pip install skill-seekers

# With all LLM platform support
pip install skill-seekers[all-llms]

# With MCP server
pip install skill-seekers[mcp]

# Everything
pip install skill-seekers[all]
```

**Need help choosing?** Run the setup wizard:
```bash
skill-seekers-setup
```

### Installation Options

| Install | Features |
|---------|----------|
| `pip install skill-seekers` | Scraping, GitHub analysis, PDF, all platforms |
| `pip install skill-seekers[gemini]` | + Google Gemini support |
| `pip install skill-seekers[openai]` | + OpenAI ChatGPT support |
| `pip install skill-seekers[all-llms]` | + All LLM platforms |
| `pip install skill-seekers[mcp]` | + MCP server for Claude Code, Cursor, etc. |
| `pip install skill-seekers[all]` | Everything enabled |

---

## One-Command Install Workflow

**The fastest way to go from config to uploaded skill - complete automation:**

```bash
# Install React skill from official configs (auto-uploads to Claude)
skill-seekers install --config react

# Install from local config file
skill-seekers install --config configs/custom.json

# Install without uploading (package only)
skill-seekers install --config django --no-upload

# Preview workflow without executing
skill-seekers install --config react --dry-run
```

**Time:** 20-45 minutes total | **Quality:** Production-ready (9/10) | **Cost:** Free

**Phases executed:**
```
PHASE 1: Fetch Config (if config name provided)
PHASE 2: Scrape Documentation
PHASE 3: AI Enhancement (MANDATORY - no skip option)
PHASE 4: Package Skill
PHASE 5: Upload to Claude (optional, requires API key)
```

**Requirements:**
- ANTHROPIC_API_KEY environment variable (for auto-upload)
- Claude Code Max plan (for local AI enhancement)

---

## Feature Matrix

Skill Seekers supports **4 LLM platforms** and **5 skill modes** with full feature parity.

**Platforms:** Claude AI, Google Gemini, OpenAI ChatGPT, Generic Markdown
**Skill Modes:** Documentation, GitHub, PDF, Unified Multi-Source, Local Repository

See [Complete Feature
Matrix](docs/FEATURE_MATRIX.md) for detailed platform and feature support.

### Quick Platform Comparison

| Feature | Claude | Gemini | OpenAI | Markdown |
|---------|--------|--------|--------|----------|
| Format | ZIP + YAML | tar.gz | ZIP + Vector | ZIP |
| Upload | ✅ API | ✅ API | ✅ API | ❌ Manual |
| Enhancement | ✅ Sonnet 4 | ✅ 2.0 Flash | ✅ GPT-4o | ❌ None |
| All Skill Modes | ✅ | ✅ | ✅ | ✅ |

---

## Usage Examples

### Documentation Scraping

```bash
# Scrape documentation website
skill-seekers scrape --config configs/react.json

# Quick scrape without config
skill-seekers scrape --url https://react.dev --name react

# With async mode (2-3x faster)
skill-seekers scrape --config configs/godot.json --async --workers 8
```

### PDF Extraction

```bash
# Basic PDF extraction
skill-seekers pdf --pdf docs/manual.pdf --name myskill

# Advanced features: table extraction and parallel processing on 8 CPU cores
# (comments cannot follow a trailing backslash, so they are kept on their own line)
skill-seekers pdf --pdf docs/manual.pdf --name myskill \
    --extract-tables \
    --parallel \
    --workers 8

# Scanned PDFs (requires: pip install pytesseract Pillow)
skill-seekers pdf --pdf docs/scanned.pdf --name myskill --ocr
```

### GitHub Repository Analysis

```bash
# Basic repository scraping
skill-seekers github --repo facebook/react

# With authentication (higher rate limits)
export GITHUB_TOKEN=ghp_your_token_here
skill-seekers github --repo facebook/react

# Customize what to include: GitHub Issues (capped at 100) and CHANGELOG.md
skill-seekers github --repo django/django \
    --include-issues \
    --max-issues 100 \
    --include-changelog
```

### Unified Multi-Source Scraping

**Combine documentation + GitHub + PDF into one unified skill with conflict detection:**

```bash
# Use existing unified configs
skill-seekers unified --config configs/react_unified.json
skill-seekers unified --config configs/django_unified.json

# Or create unified config
cat > configs/myframework_unified.json << 'EOF'
{
  "name": "myframework",
  "merge_mode": "rule-based",
  "sources": [
    {
      "type": "documentation",
      "base_url": "https://docs.myframework.com/",
      "max_pages": 200
    },
    {
      "type": "github",
      "repo": "owner/myframework",
      "code_analysis_depth": "surface"
    }
  ]
}
EOF

skill-seekers unified --config configs/myframework_unified.json
```

**Conflict Detection automatically finds:**
- 🔴 **Missing in code** (high): Documented but not implemented
- 🟡 **Missing in docs** (medium): Implemented but not documented
- ⚠️ **Signature mismatch**: Different parameters/types
- ℹ️ **Description mismatch**: Different explanations

**Full Guide:** See [docs/UNIFIED_SCRAPING.md](docs/UNIFIED_SCRAPING.md) for complete documentation.

### Private Config Repositories

**Share custom configs across teams using private git repositories:**

```python
# Option 1: Using MCP tools (recommended)
# Register your team's private repo
add_config_source(
    name="team",
    git_url="https://github.com/mycompany/skill-configs.git",
    token_env="GITHUB_TOKEN"
)

# Fetch config from team repo
fetch_config(source="team", config_name="internal-api")
```

**Supported Platforms:**
- GitHub (`GITHUB_TOKEN`), GitLab (`GITLAB_TOKEN`), Gitea (`GITEA_TOKEN`), Bitbucket (`BITBUCKET_TOKEN`)

**Full Guide:** See [docs/GIT_CONFIG_SOURCES.md](docs/GIT_CONFIG_SOURCES.md) for complete documentation.

## How It Works

```mermaid
graph LR
    A[Documentation Website] --> B[Skill Seekers]
    B --> C[Scraper]
    B --> D[AI Enhancement]
    B --> E[Packager]
    C --> F[Organized References]
    D --> F
    F --> E
    E --> G[Claude Skill .zip]
    G --> H[Upload to Claude AI]
```

0. **Detect llms.txt**: Checks for llms-full.txt, llms.txt, llms-small.txt first
**Scrape**: Extracts all pages from documentation7822. **Categorize**: Organizes content into topics (API, guides, tutorials, etc.)7833. **Enhance**: AI analyzes docs and creates comprehensive SKILL.md with examples7844. **Package**: Bundles everything into a Claude-ready `.zip` file785786## ๐ Prerequisites787788**Before you start, make sure you have:**7897901. **Python 3.10 or higher** - [Download](https://www.python.org/downloads/) | Check: `python3 --version`7912. **Git** - [Download](https://git-scm.com/) | Check: `git --version`7923. **15-30 minutes** for first-time setup793794**First time user?** โ **[Start Here: Bulletproof Quick Start Guide](BULLETPROOF_QUICKSTART.md)** ๐ฏ795796---797798## ๐ค Uploading Skills to Claude799800Once your skill is packaged, you need to upload it to Claude:801802### Option 1: Automatic Upload (API-based)803804```bash805# Set your API key (one-time)806export ANTHROPIC_API_KEY=sk-ant-...807808# Package and upload automatically809skill-seekers package output/react/ --upload810811# OR upload existing .zip812skill-seekers upload output/react.zip813```814815### Option 2: Manual Upload (No API Key)816817```bash818# Package skill819skill-seekers package output/react/820# โ Creates output/react.zip821822# Then manually upload:823# - Go to https://claude.ai/skills824# - Click "Upload Skill"825# - Select output/react.zip826```827828### Option 3: MCP (Claude Code)829830```831In Claude Code, just ask:832"Package and upload the React skill"833```834835---836837## ๐ค Installing to AI Agents838839Skill Seekers can automatically install skills to 10+ AI coding agents.840841```bash842# Install to specific agent843skill-seekers install-agent output/react/ --agent cursor844845# Install to all agents at once846skill-seekers install-agent output/react/ --agent all847848# Preview without installing849skill-seekers install-agent output/react/ --agent cursor --dry-run850```851852### Supported Agents853854| Agent | Path | Type 
|855|-------|------|------|856| **Claude Code** | `~/.claude/skills/` | Global |857| **Cursor** | `.cursor/skills/` | Project |858| **VS Code / Copilot** | `.github/skills/` | Project |859| **Amp** | `~/.amp/skills/` | Global |860| **Goose** | `~/.config/goose/skills/` | Global |861| **OpenCode** | `~/.opencode/skills/` | Global |862| **Windsurf** | `~/.windsurf/skills/` | Global |863864---865866## ๐ MCP Integration (26 Tools)867868Skill Seekers ships an MCP server for use from Claude Code, Cursor, Windsurf, VS Code + Cline, or IntelliJ IDEA.869870```bash871# stdio mode (Claude Code, VS Code + Cline)872python -m skill_seekers.mcp.server_fastmcp873874# HTTP mode (Cursor, Windsurf, IntelliJ)875python -m skill_seekers.mcp.server_fastmcp --transport http --port 8765876877# Auto-configure all agents at once878./setup_mcp.sh879```880881**All 26 tools available:**882- **Core (9):** `list_configs`, `generate_config`, `validate_config`, `estimate_pages`, `scrape_docs`, `package_skill`, `upload_skill`, `enhance_skill`, `install_skill`883- **Extended (10):** `scrape_github`, `scrape_pdf`, `unified_scrape`, `merge_sources`, `detect_conflicts`, `add_config_source`, `fetch_config`, `list_config_sources`, `remove_config_source`, `split_config`884- **Vector DB (4):** `export_to_chroma`, `export_to_weaviate`, `export_to_faiss`, `export_to_qdrant`885- **Cloud (3):** `cloud_upload`, `cloud_download`, `cloud_list`886887**Full Guide:** [docs/MCP_SETUP.md](docs/MCP_SETUP.md)888889---890891## โ๏ธ Configuration892893### Available Presets (24+)894895```bash896# List all presets897skill-seekers list-configs898```899900| Category | Presets |901|----------|---------|902| **Web Frameworks** | `react`, `vue`, `angular`, `svelte`, `nextjs` |903| **Python** | `django`, `flask`, `fastapi`, `sqlalchemy`, `pytest` |904| **Game Development** | `godot`, `pygame`, `unity` |905| **Tools & DevOps** | `docker`, `kubernetes`, `terraform`, `ansible` |906| **Unified (Docs + GitHub)** | `react-unified`, 
`vue-unified`, `nextjs-unified`, and more |907908### Creating Your Own Config909910```bash911# Option 1: Interactive912skill-seekers scrape --interactive913914# Option 2: Copy and edit a preset915cp configs/react.json configs/myframework.json916nano configs/myframework.json917skill-seekers scrape --config configs/myframework.json918```919920### Config File Structure921922```json923{924 "name": "myframework",925 "description": "When to use this skill",926 "base_url": "https://docs.myframework.com/",927 "selectors": {928 "main_content": "article",929 "title": "h1",930 "code_blocks": "pre code"931 },932 "url_patterns": {933 "include": ["/docs", "/guide"],934 "exclude": ["/blog", "/about"]935 },936 "categories": {937 "getting_started": ["intro", "quickstart"],938 "api": ["api", "reference"]939 },940 "rate_limit": 0.5,941 "max_pages": 500942}943```944945### Where to Store Configs946947The tool searches in this order:9481. Exact path as provided9492. `./configs/` (current directory)9503. `~/.config/skill-seekers/configs/` (user config directory)9514. 
SkillSeekersWeb.com API (preset configs)952953---954955## ๐ What Gets Created956957```958output/959โโโ godot_data/ # Scraped raw data960โ โโโ pages/ # JSON files (one per page)961โ โโโ summary.json # Overview962โ963โโโ godot/ # The skill964 โโโ SKILL.md # Enhanced with real examples965 โโโ references/ # Categorized docs966 โ โโโ index.md967 โ โโโ getting_started.md968 โ โโโ scripting.md969 โ โโโ ...970 โโโ scripts/ # Empty (add your own)971 โโโ assets/ # Empty (add your own)972```973974---975976## ๐ Troubleshooting977978### No Content Extracted?979- Check your `main_content` selector980- Try: `article`, `main`, `div[role="main"]`981982### Data Exists But Won't Use It?983```bash984# Force re-scrape985rm -rf output/myframework_data/986skill-seekers scrape --config configs/myframework.json987```988989### Categories Not Good?990Edit the config `categories` section with better keywords.991992### Want to Update Docs?993```bash994# Delete old data and re-scrape995rm -rf output/godot_data/996skill-seekers scrape --config configs/godot.json997```998999### Enhancement Not Working?1000```bash1001# Check if API key is set1002echo $ANTHROPIC_API_KEY10031004# Try LOCAL mode instead (uses Claude Code Max, no API key needed)1005skill-seekers enhance output/react/ --mode LOCAL10061007# Monitor background enhancement status1008skill-seekers enhance-status output/react/ --watch1009```10101011### GitHub Rate Limit Issues?1012```bash1013# Set a GitHub token (5000 req/hour vs 60/hour anonymous)1014export GITHUB_TOKEN=ghp_your_token_here10151016# Or configure multiple profiles1017skill-seekers config --github1018```10191020---10211022## ๐ Performance10231024| Task | Time | Notes |1025|------|------|-------|1026| Scraping (sync) | 15-45 min | First time only, thread-based |1027| Scraping (async) | 5-15 min | 2-3x faster with `--async` flag |1028| Building | 1-3 min | Fast rebuild from cache |1029| Re-building | <1 min | With `--skip-scrape` |1030| Enhancement (LOCAL) | 30-60 sec | Uses 
Claude Code Max |1031| Enhancement (API) | 20-40 sec | Requires API key |1032| Packaging | 5-10 sec | Final .zip creation |10331034---10351036## ๐ Documentation10371038### Getting Started1039- **[BULLETPROOF_QUICKSTART.md](BULLETPROOF_QUICKSTART.md)** - ๐ฏ **START HERE** if you're new!1040- **[QUICKSTART.md](QUICKSTART.md)** - Quick start for experienced users1041- **[TROUBLESHOOTING.md](TROUBLESHOOTING.md)** - Common issues and solutions1042- **[docs/QUICK_REFERENCE.md](docs/QUICK_REFERENCE.md)** - One-page cheat sheet10431044### Guides1045- **[docs/LARGE_DOCUMENTATION.md](docs/LARGE_DOCUMENTATION.md)** - Handle 10K-40K+ page docs1046- **[ASYNC_SUPPORT.md](ASYNC_SUPPORT.md)** - Async mode guide (2-3x faster scraping)1047- **[docs/ENHANCEMENT_MODES.md](docs/ENHANCEMENT_MODES.md)** - AI enhancement modes guide1048- **[docs/MCP_SETUP.md](docs/MCP_SETUP.md)** - MCP integration setup1049- **[docs/UNIFIED_SCRAPING.md](docs/UNIFIED_SCRAPING.md)** - Multi-source scraping10501051### Integration Guides1052- **[docs/integrations/LANGCHAIN.md](docs/integrations/LANGCHAIN.md)** - LangChain RAG1053- **[docs/integrations/CURSOR.md](docs/integrations/CURSOR.md)** - Cursor IDE1054- **[docs/integrations/WINDSURF.md](docs/integrations/WINDSURF.md)** - Windsurf IDE1055- **[docs/integrations/CLINE.md](docs/integrations/CLINE.md)** - Cline (VS Code)1056- **[docs/integrations/RAG_PIPELINES.md](docs/integrations/RAG_PIPELINES.md)** - All RAG pipelines10571058---10591060## ๐ License10611062MIT License - see [LICENSE](LICENSE) file for details10631064---10651066Happy skill building! ๐10671068---10691070## ๐ Security10711072[](https://mseep.ai/app/yusufkaraaslan-skill-seekers)1073
Full transparency โ inspect the skill content before installing.
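
One way to do that inspection on a packaged skill is to list the archive's contents before uploading or installing it. The sketch below is only an illustration and is not part of Skill Seekers: `list_skill_contents` is a hypothetical helper, and `output/react.zip` is assumed to be the archive produced by `skill-seekers package output/react/`. It relies solely on Python's standard `zipfile` module.

```python
import zipfile
from pathlib import Path


def list_skill_contents(archive_path):
    """Return (name, size_in_bytes) for every file in a packaged skill .zip.

    Hypothetical helper for illustration; not part of the skill-seekers CLI.
    """
    with zipfile.ZipFile(archive_path) as archive:
        # infolist() reads the archive index without extracting any files
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        ]


if __name__ == "__main__":
    archive = Path("output/react.zip")  # assumed path from the packaging step
    if archive.exists():
        for name, size in list_skill_contents(archive):
            print(f"{size:>10}  {name}")
    else:
        print(f"{archive} not found; package a skill first")
```

Listing the index rather than extracting keeps the check read-only, so nothing from the archive touches the filesystem until you have reviewed what it contains.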