# AI Video Remix

> AI-driven video remix generator: semantic video search + LLM planning + Remotion rendering.
>
> Requires [ShotAI](https://www.shotai.io), a local video asset management and footage search app for Mac.
>
> Chinese documentation: [README.zh.md](README.zh.md)

Generate styled video compositions from your local video footage library using natural language. [ShotAI](https://www.shotai.io) handles shot-level indexing and semantic search; this tool handles the planning, music, and rendering.

---

## Demo

> Hong Kong Cyberpunk Night: generated from local video footage with ShotAI + Remotion, no manual editing.

[Watch the demo on YouTube](https://www.youtube.com/watch?v=mibbqDf6uQY)

---

## Use as a Claude Skill

This repo ships a ready-to-install [Claude Agent Skill](https://support.claude.com/en/articles/12512176-what-are-skills) in the [`skill/`](skill/) directory.

**Install in Claude Code:**
```bash
/plugin install ai-video-remix@abu-ShotAI/ai-video-remix#skill
```

Or point Claude Code settings to the local `skill/` folder.

Once installed, just describe what you want:
> *"Make a travel vlog from my library"*
> *"Create a cyberpunk city highlight reel"*
> *"Sports highlight from last weekend's footage"*

---

## Prerequisites

| Tool | Purpose | Install |
|------|---------|---------|
| [ShotAI](https://www.shotai.io) | AI video asset management + semantic footage search; provides the MCP server this tool queries | [Download for Mac](https://www.shotai.io) |
| ffmpeg | Clip extraction and keyframe analysis | `brew install ffmpeg` |
| yt-dlp | Auto background music from YouTube | `brew install yt-dlp` |
| Node.js 18+ | Runtime | `brew install node` |

### ShotAI Setup

1. Download and open [ShotAI](https://www.shotai.io), then add your video footage folders to a collection
2. Wait for indexing (shot detection + semantic embeddings; automatic, takes a few minutes for large libraries)
3. **Settings → MCP Server → Enable**
4. Note your **MCP URL** (default: `http://127.0.0.1:23817`) and **MCP Token**

---

## Quick Start

```bash
git clone https://github.com/abu-ShotAI/ai-video-remix.git
cd ai-video-remix
npm install
cp .env.example .env   # fill in SHOTAI_URL, SHOTAI_TOKEN, and optionally AGENT_PROVIDER
```

```bash
# Nature documentary (no LLM required); prompt: "nature wildlife documentary"
AGENT_PROVIDER=none npx tsx src/skill/cli.ts "自然野生动物纪录片" --composition NatureWild

# Sports highlight reel (generic, works with any sport footage); prompt: "World Cup football highlights remix"
npx tsx src/skill/cli.ts "世界杯足球精彩时刻混剪" --composition SportsHighlight

# Travel vlog with English captions
npx tsx src/skill/cli.ts "Japan and Paris travel highlights" --composition TravelVlog --lang en

# Cyberpunk city night cuts; prompt: "Hong Kong cyberpunk night scene remix"
npx tsx src/skill/cli.ts "香港赛博朋克夜景混剪" --composition CyberpunkCity

# With local music file
npx tsx src/skill/cli.ts "scenic alpine journey" --composition SwitzerlandScenic --bgm ./music/alpine.mp3
```

---

## Pipeline

How AI Video Remix turns a text prompt into a finished video:

```
User prompt
    │
    ▼
1. parseIntent     - LLM extracts theme, selects composition, optionally overrides music style
2. refineQueries   - LLM rewrites per-slot search terms to match library content
3. pickShots       - ShotAI semantic search across your video footage library; scored by similarity + duration + mood
4. resolveMusic    - yt-dlp YouTube search+download, or local --bgm file
5. extractClip     - ffmpeg trims each shot to an independent .mp4
6. annotateClips   - LLM assigns per-clip visual params (tone, kenBurns, dramatic, caption)
7. File Server     - HTTP server serves clips to the Remotion renderer
8. Remotion render - Final MP4 composed and rendered
```

---

## CLI Reference

```bash
npx tsx src/skill/cli.ts "<request>" [options]

Options:
  --composition <id>   Force a specific composition (skip LLM selection)
  --bgm <path>         Local MP3 path (skip YouTube search)
  --lang <zh|en>       Caption language: zh (default) or en
  --output <dir>       Output directory (default: ./output)
  --probe              Scan library first; LLM plans slots from actual content
```

---

## Compositions

| ID | Style | Best For |
|----|-------|----------|
| `CyberpunkCity` | Cyberpunk night | Neon city, night scenes, sci-fi |
| `TravelVlog` | Travel vlog | Multi-city travel with location cards |
| `MoodDriven` | Mood-driven cuts | Emotional fast/slow montage |
| `NatureWild` | BBC nature doc | Wildlife, landscapes, nature footage |
| `SwitzerlandScenic` | Alpine scenic | Mountain travel with elegant captions |
| `SportsHighlight` | ESPN sports | Goal/action highlights with captions |

---

## Modes

**Standard mode** (default): the LLM picks the composition and generates search queries from registry templates.

**Probe mode** (`--probe`): scans the library first (video names, shot samples, mood/scene tags), then the LLM builds custom slots tailored to what actually exists. Use this when:
- Library content is unknown or varied
- User wants "best of my library"
- Standard queries return low-quality shots

---

## Configuration

Edit `.env` (copy from `.env.example`):

```env
# ── LLM Agent ────────────────────────────────────────────────────────────────
AGENT_PROVIDER=claude          # claude | openai | openai-compat | none
ANTHROPIC_API_KEY=sk-ant-...   # required when AGENT_PROVIDER=claude
OPENAI_API_KEY=sk-...          # required when AGENT_PROVIDER=openai
OPENAI_COMPAT_BASE_URL=https://...   # required when AGENT_PROVIDER=openai-compat
OPENAI_COMPAT_API_KEY=sk-...
AGENT_MODEL=claude-sonnet-4-6  # override default model

# ── ShotAI ───────────────────────────────────────────────────────────────────
SHOTAI_URL=http://127.0.0.1:23817
SHOTAI_TOKEN=<your-token>

# ── Music ────────────────────────────────────────────────────────────────────
BGM_PATH=/path/to/music.mp3   # permanent local BGM default

# ── Quality ──────────────────────────────────────────────────────────────────
MIN_SCORE=0.5   # shot quality threshold 0–1 (recommended: 0.5)
```

### LLM Providers

<details>
<summary>Claude (Anthropic)</summary>

```env
AGENT_PROVIDER=claude
ANTHROPIC_API_KEY=sk-ant-...
AGENT_MODEL=claude-sonnet-4-6
```
</details>

<details>
<summary>OpenAI</summary>

```env
AGENT_PROVIDER=openai
OPENAI_API_KEY=sk-...
AGENT_MODEL=gpt-4o
```
</details>

<details>
<summary>OpenRouter (recommended for multi-provider access)</summary>

```env
AGENT_PROVIDER=openai-compat
OPENAI_COMPAT_BASE_URL=https://openrouter.ai/api/v1
OPENAI_COMPAT_API_KEY=sk-or-v1-...
AGENT_MODEL=deepseek/deepseek-chat-v3-0324
```
</details>

<details>
<summary>Ollama (local, no API key needed)</summary>

```env
AGENT_PROVIDER=openai-compat
OPENAI_COMPAT_BASE_URL=http://localhost:11434/v1
OPENAI_COMPAT_API_KEY=ollama
AGENT_MODEL=llama3.1
```
</details>

<details>
<summary>DeepSeek (direct)</summary>

```env
AGENT_PROVIDER=openai-compat
OPENAI_COMPAT_BASE_URL=https://api.deepseek.com/v1
OPENAI_COMPAT_API_KEY=sk-...
AGENT_MODEL=deepseek-chat
```
</details>

<details>
<summary>No LLM (heuristic fallback)</summary>

```env
AGENT_PROVIDER=none
```
Keyword-based composition selection + registry default queries. No API key required.
</details>

---

## Troubleshooting

### Clip boundary flicker (1–2 frame flash at cuts)

An 80ms head/tail trim is applied automatically (`TRIM = 0.08`). If the flicker persists, increase `TRIM` to `0.12` or `0.15` in `src/skill/orchestrator.ts`.

### Red flash in CyberpunkCity

`GlitchFlicker` triggers on very short clips. Set `MIN_SCORE=0.5` in `.env` to keep short clips out of the pipeline.

### Low-quality or off-topic shots

1. Raise `MIN_SCORE` (try `0.5` → `0.7`)
2. Use `--probe` mode so the LLM sees your actual library before picking queries
3. Force `--composition <id>` to a composition whose slots match your content

### Music download fails

```bash
pip install -U yt-dlp   # update yt-dlp (requires 2026.03.03+)
# or use a local file:
npx tsx src/skill/cli.ts "..." --bgm /path/to/music.mp3
```

If yt-dlp reports `n challenge solving failed`, it needs the EJS remote solver. This is handled automatically with `--remote-components ejs:github` (already set in the code).

---

## Performance

| Step | Typical time (M-series Mac) |
|------|-----------------------------|
| Remotion render (60s video) | 30–90s |
| ShotAI search per slot | 1–3s |
| ffmpeg clip extraction | ~0.5s per clip |

---

## Adding a New Composition

See [references/composition-guide.md](references/composition-guide.md) for step-by-step instructions on adding a new Remotion visual style + registry entry.
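
---

## Sketch: Head/Tail Trim Window

The Troubleshooting section notes that an 80 ms head/tail trim (`TRIM = 0.08`) is applied to each clip to avoid flicker at cuts. The TypeScript below is a minimal sketch of that idea, not the code in `src/skill/orchestrator.ts`. The `Shot` shape, the `trimmedClipArgs` helper, and the choice to re-encode with `libx264` and `aac` are assumptions made for illustration; only the `TRIM` value comes from this README.

```typescript
// Hypothetical helper for illustration; the real orchestrator may differ.
const TRIM = 0.08; // seconds removed from both the head and the tail of a shot

interface Shot {
  file: string;  // path to the source video
  start: number; // shot start, in seconds
  end: number;   // shot end, in seconds
}

// Builds ffmpeg arguments that cut the shot with `trim` seconds removed from
// each end. Returns null when the shot is too short to survive the trim.
function trimmedClipArgs(shot: Shot, outFile: string, trim: number = TRIM): string[] | null {
  const start = shot.start + trim;
  const duration = shot.end - shot.start - 2 * trim;
  if (duration <= 0) return null;
  return [
    "-y",                      // overwrite the output file if it exists
    "-ss", start.toFixed(3),   // seek to the trimmed start
    "-i", shot.file,
    "-t", duration.toFixed(3), // keep only the trimmed duration
    "-c:v", "libx264",         // assumption: re-encode for frame-accurate cuts
    "-c:a", "aac",
    outFile,
  ];
}

const args = trimmedClipArgs({ file: "in.mp4", start: 10, end: 14 }, "clip.mp4");
console.log(args?.join(" "));
```

Passing a larger `trim` (for example `0.12` or `0.15`) shortens every clip a little more at both ends, which mirrors the adjustment suggested for persistent boundary flicker.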
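
---

## Sketch: Checking `.env` Before a Run

The Configuration section ties each `AGENT_PROVIDER` value to the keys it needs. As a rough illustration, the sketch below reports which variables are still missing. The `missingEnvKeys` function is hypothetical and not part of this repo. It also makes two assumptions the README does not state: that an unset `AGENT_PROVIDER` behaves like `none`, and that `OPENAI_COMPAT_API_KEY` must be non-empty for `openai-compat` (the Ollama example uses a placeholder value).

```typescript
type Env = Record<string, string | undefined>;

// Keys per provider, following the .env template in the Configuration section.
const REQUIRED_KEYS = new Map<string, string[]>([
  ["claude", ["ANTHROPIC_API_KEY"]],
  ["openai", ["OPENAI_API_KEY"]],
  ["openai-compat", ["OPENAI_COMPAT_BASE_URL", "OPENAI_COMPAT_API_KEY"]], // API key: assumption
  ["none", []],
]);

// Hypothetical helper: lists the variables that are unset or empty.
function missingEnvKeys(env: Env): string[] {
  const provider = env.AGENT_PROVIDER ?? "none"; // assumption: unset means no LLM
  const providerKeys = REQUIRED_KEYS.get(provider);
  if (providerKeys === undefined) {
    throw new Error(`Unknown AGENT_PROVIDER: ${provider}`);
  }
  // The ShotAI connection settings are needed regardless of the LLM provider.
  return ["SHOTAI_URL", "SHOTAI_TOKEN", ...providerKeys].filter((key) => !env[key]);
}

console.log(missingEnvKeys({ AGENT_PROVIDER: "claude", SHOTAI_URL: "http://127.0.0.1:23817" }));
// logs [ 'SHOTAI_TOKEN', 'ANTHROPIC_API_KEY' ]
```

In a real script you would pass `process.env` instead of a literal object and stop early when the returned list is non-empty.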
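
---

## Sketch: Scoring Shots Against `MIN_SCORE`

The Pipeline section says `pickShots` scores candidates by similarity, duration, and mood, and `MIN_SCORE` sets a 0–1 quality threshold. The README does not give the formula, so the sketch below is only one plausible reading: a weighted sum with made-up weights. The `Candidate` and `Slot` shapes, the `WEIGHTS` values, and the `scoreShot` and `pickBest` helpers are all hypothetical.

```typescript
interface Candidate {
  id: string;
  similarity: number; // semantic similarity from the search, 0..1
  duration: number;   // shot length in seconds
  mood: string;       // mood tag attached to the shot
}

interface Slot {
  targetDuration: number; // desired clip length in seconds
  mood: string;           // mood the slot asks for
}

// Hypothetical weights; the real scoring may combine the signals differently.
const WEIGHTS = { similarity: 0.6, duration: 0.2, mood: 0.2 };

function scoreShot(c: Candidate, slot: Slot): number {
  // 1 when the shot length equals the target, falling linearly to 0.
  const durationFit = Math.max(0, 1 - Math.abs(c.duration - slot.targetDuration) / slot.targetDuration);
  const moodFit = c.mood === slot.mood ? 1 : 0;
  return WEIGHTS.similarity * c.similarity + WEIGHTS.duration * durationFit + WEIGHTS.mood * moodFit;
}

// Returns the highest-scoring candidate at or above minScore, or null if none qualifies.
function pickBest(candidates: Candidate[], slot: Slot, minScore: number = 0.5): Candidate | null {
  let best: Candidate | null = null;
  let bestScore = minScore;
  for (const c of candidates) {
    const score = scoreShot(c, slot);
    if (score >= bestScore) {
      best = c;
      bestScore = score;
    }
  }
  return best;
}

const slot: Slot = { targetDuration: 4, mood: "calm" };
const candidates: Candidate[] = [
  { id: "a", similarity: 0.9, duration: 1, mood: "tense" },
  { id: "b", similarity: 0.7, duration: 4, mood: "calm" },
  { id: "c", similarity: 0.3, duration: 10, mood: "tense" },
];
console.log(pickBest(candidates, slot)?.id); // logs "b"
```

Under these assumed weights, shot `b` wins despite lower similarity because it matches the slot's length and mood. Raising `minScore` narrows the pool in the same way the Troubleshooting section suggests raising `MIN_SCORE`.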