How do I install Content Core?

Install Content Core with a single command: npx mdskills install lfnovo/content-core. This downloads the skill files into your project and your AI agent picks them up automatically.
What platforms support Content Core?

Content Core works with Claude Code, Claude Desktop, Cursor, VS Code Copilot, Windsurf, Continue Dev, Codex, Gemini CLI, Amp, Roo Code, Goose, OpenCode, Trae, Qodo, and Command Code. Skills use the open SKILL.md format, which is compatible with any AI coding agent that reads markdown instructions.
Content Core

Name: Content Core: AI Agent Skill
Rating: 8 (1 review)
Author: lfnovo
Verified
File Processing · Intermediate
Content Core is a powerful, AI-powered content extraction and processing platform that transforms any source into clean, structured content. Extract text from websites, transcribe videos, process documents, and generate AI summaries, all through a unified interface with multiple integration options.
by @lfnovo · 0 · Updated 2/24/2026
Add this skill
npx mdskills install lfnovo/content-core
Skill Advisor: 8.0
Comprehensive multi-format content extraction tool with excellent documentation and multiple integration paths
+ Supports extensive format range with intelligent fallback chains
+ Provides multiple usage patterns (CLI, Python, MCP, macOS integration)
+ Clear examples and setup instructions for each integration method
- Broad permissions declared without specific validation details in documentation
SKILL.md
1# Content Core
2 
3[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
4[![PyPI version](https://badge.fury.io/py/content-core.svg)](https://badge.fury.io/py/content-core)
5[![Downloads](https://pepy.tech/badge/content-core)](https://pepy.tech/project/content-core)
6[![Downloads](https://pepy.tech/badge/content-core/month)](https://pepy.tech/project/content-core)
7[![GitHub stars](https://img.shields.io/github/stars/lfnovo/content-core?style=social)](https://github.com/lfnovo/content-core)
8[![GitHub forks](https://img.shields.io/github/forks/lfnovo/content-core?style=social)](https://github.com/lfnovo/content-core)
9[![GitHub issues](https://img.shields.io/github/issues/lfnovo/content-core)](https://github.com/lfnovo/content-core/issues)
10[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
11[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
12 
13**Content Core** is a powerful, AI-powered content extraction and processing platform that transforms any source into clean, structured content. Extract text from websites, transcribe videos, process documents, and generate AI summaries—all through a unified interface with multiple integration options.
14 
15## 🚀 What You Can Do
16 
17**Extract content from anywhere:**
18- 📄 **Documents** - PDF, Word, PowerPoint, Excel, Markdown, HTML, EPUB
19- 🎥 **Media** - Videos (MP4, AVI, MOV) with automatic transcription  
20- 🎵 **Audio** - MP3, WAV, M4A with speech-to-text conversion
21- 🌐 **Web** - Any URL with intelligent content extraction
22- 🖼️ **Images** - JPG, PNG, TIFF with OCR text recognition
23- 📦 **Archives** - ZIP, TAR, GZ with content analysis
24 
25**Process with AI:**
26- ✨ **Clean & format** extracted content automatically
27- 📝 **Generate summaries** with customizable styles (bullet points, executive summary, etc.)
28- 🎯 **Context-aware processing** - explain to a child, technical summary, action items
29- 🔄 **Smart engine selection** - automatically chooses the best extraction method
30 
31## 🛠️ Multiple Ways to Use
32 
33### 🖥️ Command Line (Zero Install)
34```bash
35# Extract content from any source
36uvx --from "content-core" ccore https://example.com
37uvx --from "content-core" ccore document.pdf
38 
39# Generate AI summaries  
40uvx --from "content-core" csum video.mp4 --context "bullet points"
41```
42 
43### 🤖 Claude Desktop Integration
44One-click setup with Model Context Protocol (MCP) - extract content directly in Claude conversations.
45 
46### 🔍 Raycast Extension  
47Smart auto-detection commands:
48- **Extract Content** - Full interface with format options
49- **Summarize Content** - 9 summary styles available
50- **Quick Extract** - Instant clipboard extraction
51 
52### 🖱️ macOS Right-Click Integration
53Right-click any file in Finder → Services → Extract or Summarize content instantly.
54 
55### 🐍 Python Library
56```python
57import content_core as cc
58 
59# Extract from any source
60result = await cc.extract("https://example.com/article")
61summary = await cc.summarize_content(result, context="explain to a child")
62```
63 
64## ⚡ Key Features
65 
66*   **🎯 Intelligent Auto-Detection:** Automatically selects the best extraction method based on content type and available services
67*   **🔧 Smart Engine Selection:** 
68    * **URLs:** Firecrawl → Jina → Crawl4AI (optional) → BeautifulSoup fallback chain
69    * **Documents:** Docling → Enhanced PyMuPDF → Simple extraction fallback  
70    * **Media:** OpenAI Whisper transcription
71    * **Images:** OCR with multiple engine support
72*   **📊 Enhanced PDF Processing:** Advanced PyMuPDF engine with quality flags, table detection, and optional OCR for mathematical formulas
73*   **🌍 Multiple Integrations:** CLI, Python library, MCP server, Raycast extension, macOS Services
74*   **⚡ Zero-Install Options:** Use `uvx` for instant access without installation
75*   **🧠 AI-Powered Processing:** LLM integration for content cleaning and summarization
76*   **🔄 Asynchronous:** Built with `asyncio` for efficient processing
77*   **🐍 Pure Python Implementation:** No system dependencies required - simplified installation across all platforms
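
The fallback chains listed under Smart Engine Selection follow a common pattern: try each engine in order and return the first successful result. The snippet below is a minimal sketch of that pattern only. It is not Content Core's actual implementation, and the names `extract_with_fallback`, `failing_engine`, and `simple_engine` are hypothetical stand-ins.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical engine type: a callable that returns extracted text or raises.
Engine = Callable[[str], str]


def extract_with_fallback(source: str, engines: Iterable[Tuple[str, Engine]]) -> Tuple[str, str]:
    """Try each (name, engine) pair in order and return (name, text) from the first success."""
    errors = []
    for name, engine in engines:
        try:
            return name, engine(source)
        except Exception as exc:  # a real implementation would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all engines failed: " + "; ".join(errors))


def failing_engine(source: str) -> str:
    # Stand-in for a remote service that is not configured or not reachable.
    raise ConnectionError("service unavailable")


def simple_engine(source: str) -> str:
    # Stand-in for the last-resort local extractor.
    return f"plain text from {source}"


chain = [("firecrawl", failing_engine), ("jina", failing_engine), ("simple", simple_engine)]
print(extract_with_fallback("https://example.com", chain))
```

Ordering the chain from the richest engine to the simplest means a missing API key or optional dependency degrades output quality rather than failing the whole extraction.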
78 
79## Getting Started
80 
81### Installation
82 
83Install Content Core using `pip` - **no system dependencies required!**
84 
```bash
# Basic installation (PyMuPDF + BeautifulSoup/Jina extraction)
pip install content-core

# With enhanced document processing (adds Docling)
# Quotes keep shells such as zsh from expanding the square brackets
pip install "content-core[docling]"

# With local browser-based URL extraction (adds Crawl4AI)
# Note: also requires Playwright browsers (~300MB), installed by the second command
pip install "content-core[crawl4ai]"
python -m playwright install --with-deps

# Full installation (with all optional features)
pip install "content-core[docling,crawl4ai]"
```
100 
101> **Note:** The core installation uses pure Python implementations and doesn't require system libraries like libmagic, ensuring consistent, hassle-free installation across Windows, macOS, and Linux. Optional features like Crawl4AI (browser automation) may require additional system dependencies.
102 
103Alternatively, if you’re developing locally:
104 
105```bash
106# Clone the repository
107git clone https://github.com/lfnovo/content-core
108cd content-core
109 
110# Install with uv
111uv sync
112```
113 
114### Command-Line Interface
115 
Content Core provides three CLI commands for extracting, cleaning, and summarizing content: `ccore`, `cclean`, and `csum`. These commands accept input as text, URLs, file paths, or piped data (e.g., `cat file | command`).
118 
119**Zero-install usage with uvx:**
120```bash
121# Extract content
122uvx --from "content-core" ccore https://example.com
123 
124# Clean content  
125uvx --from "content-core" cclean "messy content"
126 
127# Summarize content
128uvx --from "content-core" csum "long text" --context "bullet points"
129```
130 
131#### ccore - Extract Content
132 
133Extracts content from text, URLs, or files, with optional formatting.
134Usage:
135```bash
136ccore [-f|--format xml|json|text] [-d|--debug] [content]
137```
138Options:
139- `-f`, `--format`: Output format (xml, json, or text). Default: text.
140- `-d`, `--debug`: Enable debug logging.
141- `content`: Input content (text, URL, or file path). If omitted, reads from stdin.
142 
143Examples:
144 
145```bash
146# Extract from a URL as text
147ccore https://example.com
148 
149# Extract from a file as JSON
150ccore -f json document.pdf
151 
152# Extract from piped text as XML
153echo "Sample text" | ccore --format xml
154```
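
If you want to call `ccore` from a Python script rather than a shell, one option is to shell out with `subprocess`. The sketch below assumes `uvx` is on your `PATH` and uses only the flags documented above. The helper names `build_ccore_command` and `run_ccore` are hypothetical and not part of Content Core.

```python
import subprocess
from typing import List, Optional


def build_ccore_command(source: Optional[str] = None, fmt: str = "text", debug: bool = False) -> List[str]:
    """Build the argument list for a zero-install `ccore` call via uvx."""
    if fmt not in {"xml", "json", "text"}:
        raise ValueError(f"unsupported format: {fmt}")
    cmd = ["uvx", "--from", "content-core", "ccore", "--format", fmt]
    if debug:
        cmd.append("--debug")
    if source is not None:
        # When source is omitted, ccore reads from stdin instead.
        cmd.append(source)
    return cmd


def run_ccore(source: str, fmt: str = "text") -> str:
    """Run ccore and return its standard output (requires uvx and network access)."""
    result = subprocess.run(build_ccore_command(source, fmt), capture_output=True, text=True, check=True)
    return result.stdout


print(build_ccore_command("document.pdf", fmt="json"))
```

Passing the arguments as a list avoids shell quoting problems with URLs and file names that contain spaces.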
155 
156#### cclean - Clean Content
157Cleans content by removing unnecessary formatting, spaces, or artifacts. Accepts text, JSON, XML input, URLs, or file paths.
158Usage:
159 
160```bash
161cclean [-d|--debug] [content]
162```
163 
164Options:
165- `-d`, `--debug`: Enable debug logging.
166- `content`: Input content to clean (text, URL, file path, JSON, or XML). If omitted, reads from stdin.
167 
168Examples:
169 
170```bash
171# Clean a text string
172cclean "  messy   text   "
173 
174# Clean piped JSON
175echo '{"content": "  messy   text   "}' | cclean
176 
177# Clean content from a URL
178cclean https://example.com
179 
180# Clean a file’s content
181cclean document.txt
182```
183 
#### csum - Summarize Content
185 
186Summarizes content with an optional context to guide the summary style. Accepts text, JSON, XML input, URLs, or file paths.
187 
188Usage:
189 
190```bash
191csum [--context "context text"] [-d|--debug] [content]
192```
193 
194Options:
195- `--context`: Context for summarization (e.g., "explain to a child"). Default: none.
196- `-d`, `--debug`: Enable debug logging.
197- `content`: Input content to summarize (text, URL, file path, JSON, or XML). If omitted, reads from stdin.
198 
199Examples:
200 
201```bash
202# Summarize text
203csum "AI is transforming industries."
204 
205# Summarize with context
206csum --context "in bullet points" "AI is transforming industries."
207 
208# Summarize piped content
209cat article.txt | csum --context "one sentence"
210 
211# Summarize content from URL
212csum https://example.com
213 
214# Summarize a file's content
215csum document.txt
216```
217 
218## Quick Start
219 
220You can quickly integrate `content-core` into your Python projects to extract, clean, and summarize content from various sources.
221 
```python
import asyncio

import content_core as cc
from content_core.common import ProcessSourceInput


async def main():
    # Extract content from a URL, file, or text
    result = await cc.extract("https://example.com/article")

    # Clean messy content
    cleaned_text = await cc.clean("...messy text with [brackets] and extra spaces...")

    # Summarize content with optional context
    summary = await cc.summarize_content("long article text", context="explain to a child")

    # Extract audio with a custom speech-to-text model
    audio_result = await cc.extract(ProcessSourceInput(
        file_path="interview.mp3",
        audio_provider="openai",
        audio_model="whisper-1"
    ))


# The calls are coroutines, so run them inside an event loop
asyncio.run(main())
```
242 
243## Documentation
244 
245For more information on how to use the Content Core library, including details on AI model configuration and customization, refer to our [Usage Documentation](docs/usage.md).
246 
247## MCP Server Integration
248 
249Content Core includes a Model Context Protocol (MCP) server that enables seamless integration with Claude Desktop and other MCP-compatible applications. The MCP server exposes Content Core's powerful extraction capabilities through a standardized protocol.
250 
251<a href="https://glama.ai/mcp/servers/@lfnovo/content-core">
252  <img width="380" height="200" src="https://glama.ai/mcp/servers/@lfnovo/content-core/badge" />
253</a>
254 
255### Quick Setup with Claude Desktop
256 
257```bash
258# Install Content Core (MCP server included)
259pip install content-core
260 
261# Or use directly with uvx (no installation required)
262uvx --from "content-core" content-core-mcp
263```
264 
265Add to your `claude_desktop_config.json`:
266```json
267{
268  "mcpServers": {
269    "content-core": {
270      "command": "uvx",
271      "args": [
272        "--from",
273        "content-core",
274        "content-core-mcp"
275      ]
276    }
277  }
278}
279```
280 
281For detailed setup instructions, configuration options, and usage examples, see our [MCP Documentation](docs/mcp.md).
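
If your `claude_desktop_config.json` already lists other MCP servers, the new entry has to be merged in rather than pasted over the file. The following is a small sketch of that merge on an in-memory dictionary. The function name `add_content_core_server` and the `other-server` entry are illustrative assumptions, and reading or writing the real file is left out because its location depends on your platform.

```python
import json
from copy import deepcopy

# The server entry shown in the JSON snippet above.
CONTENT_CORE_SERVER = {
    "command": "uvx",
    "args": ["--from", "content-core", "content-core-mcp"],
}


def add_content_core_server(config: dict) -> dict:
    """Return a copy of the config with the content-core MCP server added or replaced."""
    merged = deepcopy(config)
    merged.setdefault("mcpServers", {})["content-core"] = deepcopy(CONTENT_CORE_SERVER)
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_content_core_server(existing), indent=2))
```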
282 
283## Enhanced PDF Processing
284 
285Content Core features an optimized PyMuPDF extraction engine with significant improvements for scientific documents and complex PDFs.
286 
287### Key Improvements
288 
289- **🔬 Mathematical Formula Extraction**: Enhanced quality flags eliminate `<!-- formula-not-decoded -->` placeholders
290- **📊 Automatic Table Detection**: Tables converted to markdown format for LLM consumption
291- **🔧 Quality Text Rendering**: Better ligature, whitespace, and image-text integration
292- **⚡ Optional OCR Enhancement**: Selective OCR for formula-heavy pages (requires Tesseract)
293 
294### Configuration for Scientific Documents
295 
296For documents with heavy mathematical content, enable OCR enhancement:
297 
298```yaml
299# In cc_config.yaml
300extraction:
301  pymupdf:
302    enable_formula_ocr: true      # Enable OCR for formula-heavy pages
303    formula_threshold: 3          # Min formulas per page to trigger OCR
304    ocr_fallback: true           # Graceful fallback if OCR fails
305```
306 
307```python
308# Runtime configuration
309from content_core.config import set_pymupdf_ocr_enabled
310set_pymupdf_ocr_enabled(True)
311```
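
To make the `formula_threshold` setting concrete, here is a rough sketch of how a per-page trigger could work. It assumes, purely for illustration, that a page's formula count is the number of `<!-- formula-not-decoded -->` placeholders in its extracted text. The function `should_ocr_page` is hypothetical and may not match how Content Core actually decides.

```python
# Placeholder text mentioned under Key Improvements.
PLACEHOLDER = "<!-- formula-not-decoded -->"


def should_ocr_page(page_text: str, enable_formula_ocr: bool = True, formula_threshold: int = 3) -> bool:
    """Return True when OCR is enabled and the page has at least `formula_threshold` undecoded formulas."""
    if not enable_formula_ocr:
        return False
    return page_text.count(PLACEHOLDER) >= formula_threshold


page = f"Intro {PLACEHOLDER} text {PLACEHOLDER} more {PLACEHOLDER}"
print(should_ocr_page(page))
print(should_ocr_page(page, formula_threshold=4))
```

A threshold like this keeps OCR, which is comparatively slow, limited to the pages where it is most likely to help.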
312 
313### Requirements for OCR Enhancement
314 
315```bash
316# Install Tesseract OCR (optional, for formula enhancement)
317# macOS
318brew install tesseract
319 
320# Ubuntu/Debian
321sudo apt-get install tesseract-ocr
322```
323 
324**Note**: OCR is optional - you get improved PDF extraction automatically without any additional setup.
325 
326## macOS Services Integration
327 
328Content Core provides powerful right-click integration with macOS Finder, allowing you to extract and summarize content from any file without installation. Choose between clipboard or TextEdit output for maximum flexibility.
329 
330### Available Services
331 
332Create **4 convenient services** for different workflows:
333 
334- **Extract Content → Clipboard** - Quick copy for immediate pasting
335- **Extract Content → TextEdit** - Review before using  
336- **Summarize Content → Clipboard** - Quick summary copying
337- **Summarize Content → TextEdit** - Formatted summary with headers
338 
339### Quick Setup
340 
3411. **Install uv** (if not already installed):
342   ```bash
343   curl -LsSf https://astral.sh/uv/install.sh | sh
344   ```
345 
2. **Create services manually** using Automator (about a 5-minute setup)
347 
348### Usage
349 
350**Right-click any supported file** in Finder → **Services** → Choose your option:
351 
352- **PDFs, Word docs** - Instant text extraction
353- **Videos, audio files** - Automatic transcription  
354- **Images** - OCR text recognition
355- **Web content** - Clean text extraction
356- **Multiple files** - Batch processing support
357 
358### Features
359 
360- **Zero-install processing**: Uses `uvx` for isolated execution
361- **Multiple output options**: Clipboard or TextEdit display
362- **System notifications**: Visual feedback on completion
363- **Wide format support**: 20+ file types supported
364- **Batch processing**: Handle multiple files at once
365- **Keyboard shortcuts**: Assignable hotkeys for power users
366 
367For complete setup instructions with copy-paste scripts, see [macOS Services Documentation](docs/macos.md).
368 
369## Raycast Extension
370 
371Content Core provides a powerful Raycast extension with smart auto-detection that handles both URLs and file paths seamlessly. Extract and summarize content directly from your Raycast interface without switching applications.
372 
373### Quick Setup
374 
375**From Raycast Store** (coming soon):
3761. Open Raycast and search for "Content Core"
3772. Install the extension by `luis_novo`
3783. Configure API keys in preferences
379 
380**Manual Installation**:
3811. Download the extension from the repository
3822. Open Raycast → "Import Extension"
3833. Select the `raycast-content-core` folder
384 
385### Commands
386 
387**🔍 Extract Content** - Smart URL/file detection with full interface
388- Auto-detects URLs vs file paths in real-time
389- Multiple output formats (Text, JSON, XML)
390- Drag & drop support for files
391- Rich results view with metadata
392 
393**📝 Summarize Content** - AI-powered summaries with customizable styles  
394- 9 different summary styles (bullet points, executive summary, etc.)
395- Auto-detects source type with visual feedback
396- One-click snippet creation and quicklinks
397 
398**⚡ Quick Extract** - Instant extraction to clipboard
399- Type → Tab → Paste source → Enter
400- No UI, works directly from command bar
401- Perfect for quick workflows
402 
403### Features
404 
405- **Smart Auto-Detection**: Instantly recognizes URLs vs file paths
406- **Zero Installation**: Uses `uvx` for Content Core execution
407- **Rich Integration**: Keyboard shortcuts, clipboard actions, Raycast snippets
408- **All File Types**: Documents, videos, audio, images, archives
409- **Visual Feedback**: Real-time type detection with icons
410 
411For detailed setup, configuration, and usage examples, see [Raycast Extension Documentation](docs/raycast.md).
412 
## Using with LangChain
414 
For users integrating with the [LangChain](https://python.langchain.com/) framework, `content-core` exposes a set of compatible tools. These tools, located in the `src/content_core/tools` directory, let you use `content-core` extraction, cleaning, and summarization capabilities directly within your LangChain agents and chains.

You can import and use these tools like any other LangChain tool. For example:
418 
419```python
from content_core.tools import extract_content_tool, cleanup_content_tool, summarize_content_tool
from langchain.agents import initialize_agent, AgentType

# `llm` must be a LangChain LLM or chat model that you have already configured
tools = [extract_content_tool, cleanup_content_tool, summarize_content_tool]
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("Extract the content from https://example.com and then summarize it.")
426```
427 
428Refer to the source code in `src/content_core/tools` for specific tool implementations and usage details.
429 
430## Basic Usage
431 
The core functionality revolves around the `extract_content` function.
433 
434```python
435import asyncio
436from content_core.extraction import extract_content
437 
438async def main():
439    # Extract from raw text
440    text_data = await extract_content({"content": "This is my sample text content."})
441    print(text_data)
442 
443    # Extract from a URL (uses 'auto' engine by default)
444    url_data = await extract_content({"url": "https://www.example.com"})
445    print(url_data)
446 
447    # Extract from a local video file (gets transcript, engine='auto' by default)
448    video_data = await extract_content({"file_path": "path/to/your/video.mp4"})
449    print(video_data)
450 
451    # Extract from a local markdown file (engine='auto' by default)
452    md_data = await extract_content({"file_path": "path/to/your/document.md"})
453    print(md_data)
454 
    # Per-execution override with Docling for documents
    doc_data = await extract_content({
        "file_path": "path/to/your/document.pdf",
        "document_engine": "docling",
        "output_format": "html"
    })
    print(doc_data)

    # Per-execution override with Firecrawl for URLs
    firecrawl_data = await extract_content({
        "url": "https://www.example.com",
        "url_engine": "firecrawl"
    })
    print(firecrawl_data)
468 
469if __name__ == "__main__":
470    asyncio.run(main())
471```
472 
473(See `src/content_core/notebooks/run.ipynb` for more detailed examples.)
474 
475## Docling Integration
476 
477Content Core supports an optional Docling-based extraction engine for rich document formats (PDF, DOCX, PPTX, XLSX, Markdown, AsciiDoc, HTML, CSV, Images).
478 
479 
480### Enabling Docling
481 
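The default document engine is `auto`, which tries Docling first when it is installed and otherwise falls back to the PyMuPDF-based engines. To force Docling, set the document engine to `docling`. If you don't want Docling to be used at all, set the engine to `simple`.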
483 
484#### Via configuration file
485 
486In your `cc_config.yaml` or custom config, set:
487```yaml
488extraction:
489  document_engine: docling  # 'auto' (default), 'simple', or 'docling'
  url_engine: auto          # 'auto' (default), 'simple', 'firecrawl', 'jina', or 'crawl4ai'
491  firecrawl:
492    api_url: null           # Custom API URL for self-hosted Firecrawl
493  docling:
494    output_format: markdown  # markdown | html | json
495```
496 
497#### Programmatically in Python
498 
499```python
import content_core as cc
from content_core.config import set_document_engine, set_url_engine, set_docling_output_format

# switch document engine to Docling
set_document_engine("docling")

# switch URL engine to Firecrawl
set_url_engine("firecrawl")

# choose output format: 'markdown', 'html', or 'json'
set_docling_output_format("html")

# now call cc.extract (from within an async function or a notebook cell)
result = await cc.extract("document.pdf")
513```
514 
515## Configuration
516 
517Configuration settings (like API keys for external services, logging levels) can be managed through environment variables or `.env` files, loaded automatically via `python-dotenv`.
518 
519Example `.env`:
520 
521```plaintext
522OPENAI_API_KEY=your-key-here
523GOOGLE_API_KEY=your-key-here
524 
525# Engine Selection (optional)
526CCORE_DOCUMENT_ENGINE=auto  # auto, simple, docling
CCORE_URL_ENGINE=auto       # auto, simple, firecrawl, jina, crawl4ai
528 
529# Audio Processing (optional)
530CCORE_AUDIO_CONCURRENCY=3   # Number of concurrent audio transcriptions (1-10, default: 3)
531 
532# Esperanto Timeout Configuration (optional)
533ESPERANTO_LLM_TIMEOUT=300   # Language model timeout in seconds (default: 300, max: 3600)
534ESPERANTO_STT_TIMEOUT=3600  # Speech-to-text timeout in seconds (default: 3600, max: 3600)
535```
536 
537### Engine Selection via Environment Variables
538 
539For deployment scenarios like MCP servers or Raycast extensions, you can override the extraction engines using environment variables:
540 
541- **`CCORE_DOCUMENT_ENGINE`**: Force document engine (`auto`, `simple`, `docling`)
542- **`CCORE_URL_ENGINE`**: Force URL engine (`auto`, `simple`, `firecrawl`, `jina`, `crawl4ai`)
543- **`CCORE_AUDIO_CONCURRENCY`**: Number of concurrent audio transcriptions (1-10, default: 3)
544 
545These variables take precedence over config file settings and provide explicit control for different deployment scenarios.
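
The precedence described above (environment variable first, then config file, then the `auto` default) can be sketched as a small lookup. This illustrates the rule and is not Content Core's own code. The function name `resolve_document_engine` and the error raised for unknown values are assumptions.

```python
import os
from typing import Mapping, Optional

DOCUMENT_ENGINES = {"auto", "simple", "docling"}


def resolve_document_engine(config: Mapping, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the document engine: environment variable, then config file value, then 'auto'."""
    env = os.environ if env is None else env
    value = (
        env.get("CCORE_DOCUMENT_ENGINE")
        or config.get("extraction", {}).get("document_engine")
        or "auto"
    )
    if value not in DOCUMENT_ENGINES:
        # Assumption: reject unknown names early rather than silently ignoring them.
        raise ValueError(f"unknown document engine: {value}")
    return value


config = {"extraction": {"document_engine": "docling"}}
print(resolve_document_engine(config, env={}))
print(resolve_document_engine(config, env={"CCORE_DOCUMENT_ENGINE": "simple"}))
```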
546 
547### Audio Processing Configuration
548 
549Content Core processes long audio files by splitting them into segments and transcribing them in parallel for improved performance. You can control the concurrency level to balance speed with API rate limits:
550 
551- **Default**: 3 concurrent transcriptions
552- **Range**: 1-10 concurrent transcriptions
553- **Configuration**: Set via `CCORE_AUDIO_CONCURRENCY` environment variable or `extraction.audio.concurrency` in `cc_config.yaml`
554 
555Higher concurrency values can speed up processing of long audio/video files but may hit API rate limits. Lower values are more conservative and suitable for accounts with lower API quotas.
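
Bounded parallelism of this kind is commonly built on `asyncio.Semaphore`. The sketch below shows the general technique with a fake transcription coroutine. `transcribe_all` and `fake_transcribe` are hypothetical names, and Content Core's internal implementation may differ.

```python
import asyncio
from typing import Awaitable, Callable, List


async def transcribe_all(
    segments: List[str],
    transcribe: Callable[[str], Awaitable[str]],
    concurrency: int = 3,
) -> List[str]:
    """Transcribe segments in parallel, never running more than `concurrency` at once."""
    if not 1 <= concurrency <= 10:
        raise ValueError("concurrency must be between 1 and 10")
    semaphore = asyncio.Semaphore(concurrency)

    async def worker(segment: str) -> str:
        async with semaphore:  # wait for a free slot before calling the API
            return await transcribe(segment)

    # gather preserves input order, so the transcript pieces stay in sequence.
    return await asyncio.gather(*(worker(segment) for segment in segments))


async def fake_transcribe(segment: str) -> str:
    # Stand-in for a speech-to-text API call.
    await asyncio.sleep(0.01)
    return segment.upper()


print(asyncio.run(transcribe_all(["part one", "part two", "part three"], fake_transcribe)))
```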
556 
557### Retry Configuration
558 
559Content Core includes automatic retry logic for transient failures in external operations (network requests, API calls, transcription). Retries use exponential backoff with jitter to handle temporary issues gracefully.
560 
561**Supported operations:**
562- `youtube` - YouTube video title and transcript fetching (5 retries, 2-60s backoff)
563- `url_api` - URL extraction via Jina/Firecrawl APIs (3 retries, 1-30s backoff)
564- `url_network` - Network operations like HEAD requests, BeautifulSoup (3 retries, 0.5-10s backoff)
565- `audio` - Audio transcription API calls (3 retries, 2-30s backoff)
566- `llm` - LLM API calls for cleanup/summary (3 retries, 1-30s backoff)
567- `download` - Remote file downloads (3 retries, 1-15s backoff)
568 
569**Environment variable overrides:**
570```bash
571# Override retry settings per operation type
572CCORE_YOUTUBE_MAX_RETRIES=10     # Max retry attempts (1-20)
573CCORE_YOUTUBE_BASE_DELAY=3       # Base delay in seconds (0.1-60)
574CCORE_YOUTUBE_MAX_DELAY=120      # Max delay in seconds (1-300)
575 
576# Same pattern for other operations:
577CCORE_URL_API_MAX_RETRIES=5
578CCORE_AUDIO_MAX_RETRIES=5
579CCORE_LLM_MAX_RETRIES=5
580CCORE_DOWNLOAD_MAX_RETRIES=5
581```
582 
583For detailed configuration, see our [Usage Documentation](docs/usage.md#retry-configuration).
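
For readers unfamiliar with exponential backoff with jitter, the sketch below shows one common variant: the delay ceiling doubles on each attempt, is capped at the maximum delay, and the actual wait is drawn at random between the base delay and that ceiling. The function `backoff_delays` is hypothetical, and the exact formula Content Core uses may differ.

```python
import random
from typing import List, Optional


def backoff_delays(
    max_retries: int,
    base_delay: float,
    max_delay: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one randomized delay per retry, growing exponentially up to max_delay."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        ceiling = min(max_delay, base_delay * 2 ** attempt)
        # Jitter: pick a random point between the base delay and the current ceiling.
        delays.append(rng.uniform(base_delay, ceiling))
    return delays


# Using the documented `youtube` defaults: 5 retries, 2-60s backoff.
print(backoff_delays(max_retries=5, base_delay=2, max_delay=60))
```

The random component spreads out retries from many clients so they do not all hit a recovering service at the same instant.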
584 
585### Proxy Configuration
586 
587Content Core supports HTTP/HTTPS proxy configuration through standard environment variables, consistent with most HTTP clients.
588 
589**Quick Start:**
590 
591```bash
592# Set standard proxy environment variables
593export HTTP_PROXY=http://proxy.example.com:8080
594export HTTPS_PROXY=http://proxy.example.com:8080
595 
596# With authentication
597export HTTP_PROXY=http://user:password@proxy.example.com:8080
598 
599# Bypass proxy for specific hosts
600export NO_PROXY=localhost,127.0.0.1,internal.example.com
601```
602 
603All Content Core network requests automatically use these environment variables.
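
To check how these variables will be interpreted on a given machine, you can inspect them with Python's standard library. This sketch uses `urllib.request` only to illustrate the general convention. It does not exercise Content Core's own HTTP stack, and `proxy_for` is a hypothetical helper.

```python
import urllib.request
from typing import Optional


def proxy_for(scheme: str, host: str) -> Optional[str]:
    """Return the proxy URL the environment selects for this scheme and host, or None."""
    # NO_PROXY entries take priority: matching hosts bypass the proxy entirely.
    if urllib.request.proxy_bypass_environment(host):
        return None
    # Reads HTTP_PROXY / HTTPS_PROXY (either letter case) from the environment.
    return urllib.request.getproxies_environment().get(scheme)


print(proxy_for("https", "example.com"))
```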
604 
605**Supported Services:**
606- All aiohttp requests (URL extraction, downloads)
607- YouTube transcript/title fetching (pytubefix, youtube-transcript-api)
608- Crawl4AI browser automation
609- Esperanto AI models (LLM, speech-to-text)
610 
611**Note:** Firecrawl does not support client-side proxy configuration. Configure proxy on the Firecrawl server side instead.
612 
613For detailed configuration, see our [Usage Documentation](docs/usage.md#proxy-configuration).
614 
615### Timeout Configuration
616 
617Content Core uses the Esperanto library for AI model interactions and supports configurable timeouts for different operations. Timeouts prevent requests from hanging indefinitely and ensure reliable processing.
618 
619**Configuration Methods** (in priority order):
620 
6211. **Config Files** (highest priority): Set in `cc_config.yaml` or `models_config.yaml`
6222. **Environment Variables**: Provide global defaults via `ESPERANTO_LLM_TIMEOUT` and `ESPERANTO_STT_TIMEOUT` when a timeout isn't specified in configuration files
623 
624**Default Timeouts:**
625 
626- **Speech-to-Text**: 3600 seconds (1 hour) - for very long audio files
627- **Language Models**: 300-600 seconds - for content processing operations
628- **Cleanup Model**: 600 seconds (10 minutes) - handles large content with 8000 max tokens
629- **Summary Model**: 300 seconds (5 minutes) - for content summarization
630 
631**Environment Variable Overrides:**
632 
633```bash
634# Override language model timeout globally (used when config files omit a timeout)
635export ESPERANTO_LLM_TIMEOUT=300
636 
637# Override speech-to-text timeout globally (used when config files omit a timeout)
638export ESPERANTO_STT_TIMEOUT=3600
639```
640 
641**Valid Range:** 1 to 3600 seconds (1 hour maximum)
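
Note that the priority here is the reverse of engine selection: a timeout set in a config file wins, and the environment variable only fills in when the config omits one. A minimal sketch of that resolution follows. The function name `resolve_timeout` is hypothetical, and raising an error for out-of-range values is an assumption, since the text above does not say how they are handled.

```python
from typing import Mapping, Optional


def resolve_timeout(
    config_value: Optional[int],
    env: Mapping[str, str],
    env_var: str,
    default: int,
) -> int:
    """Resolve a timeout in seconds: config file value, then environment variable, then default."""
    if config_value is not None:
        timeout = config_value
    elif env.get(env_var):
        timeout = int(env[env_var])
    else:
        timeout = default
    if not 1 <= timeout <= 3600:
        # Assumption: values outside the documented range are rejected.
        raise ValueError(f"timeout must be between 1 and 3600 seconds, got {timeout}")
    return timeout


env = {"ESPERANTO_LLM_TIMEOUT": "450"}
print(resolve_timeout(600, env, "ESPERANTO_LLM_TIMEOUT", default=300))
print(resolve_timeout(None, env, "ESPERANTO_LLM_TIMEOUT", default=300))
```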
642 
643For more details on Esperanto timeout configuration, see the [Esperanto documentation](https://github.com/lfnovo/esperanto/blob/main/docs/advanced/timeout-configuration.md).
644 
645### Custom Prompt Templates
646 
647Content Core allows you to define custom prompt templates for content processing. By default, the library uses built-in prompts located in the `prompts` directory. However, you can create your own prompt templates and store them in a dedicated directory. To specify the location of your custom prompts, set the `PROMPT_PATH` environment variable in your `.env` file or system environment.
648 
649Example `.env` with custom prompt path:
650 
651```plaintext
652OPENAI_API_KEY=your-key-here
653GOOGLE_API_KEY=your-key-here
654PROMPT_PATH=/path/to/your/custom/prompts
655```
656 
657When a prompt template is requested, Content Core will first look in the custom directory specified by `PROMPT_PATH` (if set and exists). If the template is not found there, it will fall back to the default built-in prompts. This allows you to override specific prompts while still using the default ones for others.
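
The lookup order can be sketched as follows. This is only an illustration of the described behavior: the function `find_template`, the template file name, and the directory paths are hypothetical, and Content Core's real loader may organize templates differently.

```python
import os
from pathlib import Path
from typing import Optional


def find_template(name: str, default_dir: Path, custom_dir: Optional[str] = None) -> Path:
    """Return the custom template if PROMPT_PATH provides one, else the built-in template."""
    if custom_dir is None:
        custom_dir = os.environ.get("PROMPT_PATH")
    # The custom directory is only consulted when it is set and actually exists.
    if custom_dir and Path(custom_dir).is_dir():
        candidate = Path(custom_dir) / name
        if candidate.is_file():
            return candidate
    fallback = default_dir / name
    if fallback.is_file():
        return fallback
    raise FileNotFoundError(f"prompt template not found: {name}")


# Hypothetical usage; both directory paths and the file name are placeholders.
try:
    print(find_template("summarize.jinja", Path("prompts"), custom_dir="/path/to/your/custom/prompts"))
except FileNotFoundError as exc:
    print(exc)
```

Because the fallback happens per template, a custom directory only needs to contain the prompts you actually want to override.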
658 
659## Development
660 
661To set up a development environment:
662 
663```bash
664# Clone the repository
665git clone <repository-url>
666cd content-core
667 
668# Create virtual environment and install dependencies
669uv venv
670source .venv/bin/activate
671uv sync --group dev
672 
673# Run tests
674make test
675 
676# Lint code
677make lint
678 
679# See all commands
680make help
681```
682 
683## License
684 
685This project is licensed under the [MIT License](LICENSE). See the [LICENSE](LICENSE) file for details.
686 
687## Contributing
688 
689Contributions are welcome! Please see our [Contributing Guide](CONTRIBUTING.md) for more details on how to get started.
690