Transcribe audio files to text with optional diarization and known-speaker hints. Use when a user asks to transcribe speech from audio/video, extract text from recordings, or label speakers in interviews or meetings.
Add this skill
```
npx mdskills install openai/transcribe
```

Clear workflow with sensible defaults, diarization support, and good CLI examples.
---
name: "transcribe"
description: "Transcribe audio files to text with optional diarization and known-speaker hints. Use when a user asks to transcribe speech from audio/video, extract text from recordings, or label speakers in interviews or meetings."
---

# Audio Transcribe

Transcribe audio using OpenAI, with optional speaker diarization when requested. Prefer the bundled CLI for deterministic, repeatable runs.

## Workflow
1. Collect inputs: audio file path(s), desired response format (text/json/diarized_json), optional language hint, and any known speaker references.
2. Verify `OPENAI_API_KEY` is set. If missing, ask the user to set it locally (do not ask them to paste the key).
3. Run the bundled `transcribe_diarize.py` CLI with sensible defaults (fast text transcription).
4. Validate the output: transcription quality, speaker labels, and segment boundaries; iterate with a single targeted change if needed.
5. Save outputs under `output/transcribe/` when working in this repo.

## Decision rules
- Default to `gpt-4o-mini-transcribe` with `--response-format text` for fast transcription.
- If the user wants speaker labels or diarization, use `--model gpt-4o-transcribe-diarize --response-format diarized_json`.
- If audio is longer than ~30 seconds, keep `--chunking-strategy auto`.
- Prompting is not supported for `gpt-4o-transcribe-diarize`.

## Output conventions
- Use `output/transcribe/<job-id>/` for evaluation runs.
- Use `--out-dir` for multiple files to avoid overwriting.

## Dependencies (install if missing)
Prefer `uv` for dependency management.

```
uv pip install openai
```

If `uv` is unavailable:

```
python3 -m pip install openai
```

## Environment
- `OPENAI_API_KEY` must be set for live API calls.
- If the key is missing, instruct the user to create one in the OpenAI platform UI and export it in their shell.
- Never ask the user to paste the full key in chat.

## Skill path (set once)

```bash
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
export TRANSCRIBE_CLI="$CODEX_HOME/skills/transcribe/scripts/transcribe_diarize.py"
```

User-scoped skills install under `$CODEX_HOME/skills` (default: `~/.codex/skills`).

## CLI quick start
Single file (fast text default):

```
python3 "$TRANSCRIBE_CLI" \
  path/to/audio.wav \
  --out transcript.txt
```

Diarization with known speakers (up to 4):

```
python3 "$TRANSCRIBE_CLI" \
  meeting.m4a \
  --model gpt-4o-transcribe-diarize \
  --known-speaker "Alice=refs/alice.wav" \
  --known-speaker "Bob=refs/bob.wav" \
  --response-format diarized_json \
  --out-dir output/transcribe/meeting
```

Plain text output (explicit):

```
python3 "$TRANSCRIBE_CLI" \
  interview.mp3 \
  --response-format text \
  --out interview.txt
```

## Reference map
- `references/api.md`: supported formats, limits, response formats, and known-speaker notes.
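The `diarized_json` output is often easier to read once flattened into speaker-labeled lines. A minimal post-processing sketch, assuming the payload carries a `segments` list whose items have `speaker` and `text` fields (those field names are an assumption for illustration, not taken from `references/api.md` — check that file for the actual schema):

```python
import json


def render_diarized(payload: dict) -> str:
    """Flatten an assumed diarized payload into 'Speaker: text' lines.

    Assumes payload["segments"] is a list of dicts with "speaker" and
    "text" keys; adjust the key names to match the real schema.
    """
    lines = []
    for seg in payload.get("segments", []):
        speaker = seg.get("speaker", "Unknown")
        text = seg.get("text", "").strip()
        lines.append(f"{speaker}: {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example payload shaped like the assumed schema above.
    sample = {
        "segments": [
            {"speaker": "Alice", "text": " Hi, Bob."},
            {"speaker": "Bob", "text": " Morning, Alice."},
        ]
    }
    print(render_diarized(sample))
```

In practice you would load the CLI's saved JSON with `json.load()` and feed it to `render_diarized`, writing the result next to the raw output under `output/transcribe/<job-id>/`.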
Full transparency: inspect the skill content before installing.