Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration
Add this skill
npx mdskills install sickn33/audio-transcriber
Comprehensive audio-to-text transcription with metadata extraction and LLM-powered summarization
Transform audio recordings into professional Markdown documentation with intelligent meeting minutes (atas) and summaries using LLM integration (Claude/Copilot CLI) and automatic prompt engineering.
transcript-YYYYMMDD-HHMMSS.md + ata-YYYYMMDD-HHMMSS.md
metadata.json and transcription.json
See CHANGELOG.md for complete v1.1.0 details.
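The output names above carry a YYYYMMDD-HHMMSS timestamp. A minimal sketch of how such names could be built, assuming the stamp is taken from the processing time; `output_names` is a hypothetical helper for illustration, not the skill's actual code.

```python
from datetime import datetime


def output_names(now: datetime) -> tuple[str, str]:
    """Return (transcript, minutes) file names stamped with `now`.

    Hypothetical helper: the skill's real naming code is not shown in this README.
    """
    stamp = now.strftime("%Y%m%d-%H%M%S")  # e.g. 20260202-143521
    return f"transcript-{stamp}.md", f"ata-{stamp}.md"


if __name__ == "__main__":
    transcript, minutes = output_names(datetime.now())
    print(transcript)
    print(minutes)
```

Passing the timestamp in as an argument keeps both names on the same stamp even if the two files are written seconds apart.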
npx cli-ai-skills@latest install audio-transcriber
This automatically:
Recommended (fastest):
pip install faster-whisper tqdm rich
Alternative (original Whisper):
pip install openai-whisper tqdm rich
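Since either package can be installed, a script has to decide which engine to use at run time. The sketch below shows one way to do that with the standard library, preferring Faster-Whisper and falling back to the original Whisper. The function name `pick_engine` and the fallback order are assumptions for illustration; `faster_whisper` and `whisper` are the import names of the two pip packages above.

```python
from importlib.util import find_spec

# Import names of the two packages above, in order of preference.
ENGINES = ("faster_whisper", "whisper")


def pick_engine(is_installed=lambda name: find_spec(name) is not None):
    """Return the import name of the first installed engine, or None.

    `is_installed` is injectable so the choice can be exercised without
    installing either package.
    """
    for name in ENGINES:
        if is_installed(name):
            return name
    return None


if __name__ == "__main__":
    engine = pick_engine()
    print(engine or "No transcription engine found; install faster-whisper or openai-whisper")
```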
For format conversion support:
# macOS
brew install ffmpeg
# Linux
apt install ffmpeg
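The README does not say which ffmpeg options the skill uses for conversion. As a hedged sketch, the code below converts an input file to 16 kHz mono WAV, the sample rate Whisper models operate on. `ffmpeg_command` and `convert_to_wav` are hypothetical helper names; the ffmpeg flags (`-y`, `-i`, `-ar`, `-ac`) are standard.

```python
import shutil
import subprocess
from pathlib import Path


def ffmpeg_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg call that writes `src` as 16 kHz mono audio to `dst`."""
    return ["ffmpeg", "-y", "-i", src, "-ar", "16000", "-ac", "1", dst]


def convert_to_wav(src: str) -> str:
    """Convert `src` to a WAV file next to it and return the resulting path."""
    if Path(src).suffix.lower() == ".wav":
        return src  # already WAV; avoid overwriting the input in place
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH; install it first")
    dst = str(Path(src).with_suffix(".wav"))
    subprocess.run(ffmpeg_command(src, dst), check=True)
    return dst
```

Building the argument list separately from running it makes the command easy to inspect or log before any file is touched.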
Claude CLI (recommended):
# Follow: https://docs.anthropic.com/en/docs/claude-cli
GitHub Copilot CLI (alternative):
gh extension install github/gh-copilot
Global installation (auto-updates with git pull):
cd /path/to/cli-ai-skills
./scripts/install-skills.sh $(pwd)
Repository only:
# Skill is already available if you cloned the repo
copilot> transcribe audio to markdown: meeting.mp3
Output:
meeting.md - Full Markdown report with metadata, transcription, minutes, summary
copilot> convert audio file to text with subtitles: interview.wav
Generates:
interview.md - Markdown report
interview.srt - Subtitle file
copilot> transcreva estes áudios: recordings/*.mp3 # Portuguese: "transcribe these audio files"
Processes all MP3 files in the directory.
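The interview.srt subtitle output listed above uses the SubRip format: a numbered cue, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, then the text. A minimal sketch of that rendering, assuming segments arrive as `(start_seconds, end_seconds, text)` tuples; the helper names are hypothetical and the skill's own implementation may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start, end, text) tuples as numbered SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    print(to_srt([(12, 45, "Good morning everyone."), (46, 83, "We completed the dashboard redesign.")]))
```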
Activate the skill with any of these phrases:
Record standups, planning sessions, or retrospectives and automatically generate:
Transcribe client conversations with:
Convert interviews to text with:
Document educational content with:
Analyze podcasts, videos, YouTube content:
# Audio Transcription Report
## 📊 Metadata
| Field | Value |
|-------|-------|
| **File Name** | team-standup.mp3 |
| **File Size** | 3.2 MB |
| **Duration** | 00:12:47 |
| **Language** | English (en) |
| **Processed Date** | 2026-02-02 14:35:21 |
| **Speakers Identified** | 5 |
| **Transcription Engine** | Faster-Whisper (model: base) |
---
## 🎙️ Full Transcription
**[00:00:12 → 00:00:45]** *Speaker 1*
Good morning everyone. Let's start with updates from the frontend team.
**[00:00:46 → 00:01:23]** *Speaker 2*
We completed the dashboard redesign and deployed to staging yesterday.
---
## 📋 Meeting Minutes
### Participants
- Speaker 1 (Meeting Lead)
- Speaker 2 (Frontend Developer)
- Speaker 3 (Backend Developer)
- Speaker 4 (Designer)
- Speaker 5 (Product Manager)
### Topics Discussed
1. **Dashboard Redesign** (00:00:46)
   - Completed and deployed to staging
   - Positive feedback from QA team
2. **API Performance Issues** (00:03:12)
   - Database query optimization needed
   - Target response time
copilot> transcribe audio: recordings/*.wav # Only WAV files
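A pattern such as `recordings/*.wav` selects only the matching files in a directory. The sketch below shows how that expansion could work with `pathlib`, assuming the wildcard appears only in the final path component; `expand_pattern` is a hypothetical helper, not part of the skill.

```python
from pathlib import Path


def expand_pattern(pattern: str) -> list[Path]:
    """Return the files matching a pattern like 'recordings/*.wav', sorted by name.

    Assumes the wildcard is in the last path component only.
    """
    path = Path(pattern)
    return sorted(p for p in path.parent.glob(path.name) if p.is_file())


if __name__ == "__main__":
    for audio in expand_pattern("recordings/*.wav"):
        print(f"Would transcribe: {audio}")
```

Sorting the matches gives a stable processing order across runs.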
Q: Does this work offline?
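A: Yes, for transcription: it runs 100% locally, with no internet required after the initial model download. The LLM-generated minutes and summaries depend on the Claude or Copilot CLI.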
Q: What's the difference between Whisper and Faster-Whisper?
A: Faster-Whisper is roughly 4-5x faster with the same transcription quality. Prefer it whenever it is installed.
Q: Can I transcribe YouTube videos?
A: Not directly. Use a YouTube downloader first, then transcribe the audio file. Or use the youtube-summarizer skill instead.
Q: How accurate is speaker identification?
A: Accuracy depends on audio quality. Clear recordings with distinct voices work best. Currently uses simple estimation; future versions will use advanced diarization.
Q: What languages are supported?
A: 99 languages including English, Portuguese, Spanish, French, German, Chinese, Japanese, Arabic, and more.
Q: Can I edit the meeting minutes format?
A: Yes! Edit the Markdown template in SKILL.md Step 3.
This skill is part of the cli-ai-skills repository.
MIT License - See repository LICENSE file.
Found a bug or have a feature request?
Open an issue in the cli-ai-skills repository.
Version: 1.0.0
Author: Eric Andrade
Created: 2026-02-02
Install Audio Transcriber with a single command:
npx mdskills install sickn33/audio-transcriber
This downloads the skill files into your project, and your AI agent picks them up automatically.
Audio Transcriber works with Claude Code, Claude Desktop, Cursor, VS Code Copilot, Windsurf, Continue Dev, Codex, Gemini CLI, Amp, Roo Code, Goose, OpenCode, Trae, Qodo, and Command Code. Skills use the open SKILL.md format, which is compatible with any AI coding agent that reads Markdown instructions.