---
name: azure-ai-voicelive-ts
description: |
  Azure AI Voice Live SDK for JavaScript/TypeScript. Build real-time voice AI applications with bidirectional WebSocket communication. Use for voice assistants, conversational AI, real-time speech-to-speech, and voice-enabled chatbots in Node.js or browser environments. Triggers: "voice live", "real-time voice", "VoiceLiveClient", "VoiceLiveSession", "voice assistant TypeScript", "bidirectional audio", "speech-to-speech JavaScript".
package: "@azure/ai-voicelive"
---

# @azure/ai-voicelive (JavaScript/TypeScript)

Real-time voice AI SDK for building bidirectional voice assistants with Azure AI in Node.js and browser environments.

## Installation

```bash
npm install @azure/ai-voicelive @azure/identity
# TypeScript users
npm install --save-dev @types/node
```

**Current Version**: 1.0.0-beta.3

**Supported Environments**:
- Node.js LTS versions (20+)
- Modern browsers (Chrome, Firefox, Safari, Edge)

## Environment Variables

```bash
AZURE_VOICELIVE_ENDPOINT=https://<resource>.cognitiveservices.azure.com
# Optional: API key if not using Entra ID
AZURE_VOICELIVE_API_KEY=<your-api-key>
# Optional: Logging
AZURE_LOG_LEVEL=info
```

## Authentication

### Microsoft Entra ID (Recommended)

```typescript
import { DefaultAzureCredential } from "@azure/identity";
import { VoiceLiveClient } from "@azure/ai-voicelive";

const credential = new DefaultAzureCredential();
const endpoint = "https://your-resource.cognitiveservices.azure.com";

const client = new VoiceLiveClient(endpoint, credential);
```

### API Key

```typescript
import { AzureKeyCredential } from "@azure/core-auth";
import { VoiceLiveClient } from "@azure/ai-voicelive";

const endpoint = "https://your-resource.cognitiveservices.azure.com";
const credential = new AzureKeyCredential("your-api-key");

const client = new VoiceLiveClient(endpoint, credential);
```

## Client Hierarchy

```
VoiceLiveClient
└── VoiceLiveSession (WebSocket connection)
    ├── updateSession() → Configure session options
    ├── subscribe() → Event handlers (Azure SDK pattern)
    ├── sendAudio() → Stream audio input
    ├── addConversationItem() → Add messages/function outputs
    └── sendEvent() → Send raw protocol events
```

## Quick Start

```typescript
import { DefaultAzureCredential } from "@azure/identity";
import { VoiceLiveClient } from "@azure/ai-voicelive";

const credential = new DefaultAzureCredential();
const endpoint = process.env.AZURE_VOICELIVE_ENDPOINT!;

// Create client and start session
const client = new VoiceLiveClient(endpoint, credential);
const session = await client.startSession("gpt-4o-mini-realtime-preview");

// Configure session
await session.updateSession({
  modalities: ["text", "audio"],
  instructions: "You are a helpful AI assistant. Respond naturally.",
  voice: {
    type: "azure-standard",
    name: "en-US-AvaNeural",
  },
  turnDetection: {
    type: "server_vad",
    threshold: 0.5,
    prefixPaddingMs: 300,
    silenceDurationMs: 500,
  },
  inputAudioFormat: "pcm16",
  outputAudioFormat: "pcm16",
});

// Subscribe to events
const subscription = session.subscribe({
  onResponseAudioDelta: async (event, context) => {
    // Handle streaming audio output (playAudioChunk is your playback routine)
    const audioData = event.delta;
    playAudioChunk(audioData);
  },
  onResponseTextDelta: async (event, context) => {
    // Handle streaming text
    process.stdout.write(event.delta);
  },
  onInputAudioTranscriptionCompleted: async (event, context) => {
    console.log("User said:", event.transcript);
  },
});

// Send audio from microphone
function sendAudioChunk(audioBuffer: ArrayBuffer) {
  session.sendAudio(audioBuffer);
}
```

## Session Configuration

```typescript
await session.updateSession({
  // Modalities
  modalities: ["audio", "text"],

  // System instructions
  instructions: "You are a customer service representative.",

  // Voice selection
  voice: {
    type: "azure-standard", // or "azure-custom", "openai"
    name: "en-US-AvaNeural",
  },

  // Turn detection (VAD)
  turnDetection: {
    type: "server_vad", // or "azure_semantic_vad"
    threshold: 0.5,
    prefixPaddingMs: 300,
    silenceDurationMs: 500,
  },

  // Audio formats
  inputAudioFormat: "pcm16",
  outputAudioFormat: "pcm16",

  // Tools (function calling)
  tools: [
    {
      type: "function",
      name: "get_weather",
      description: "Get current weather",
      parameters: {
        type: "object",
        properties: {
          location: { type: "string" }
        },
        required: ["location"]
      }
    }
  ],
  toolChoice: "auto",
});
```

## Event Handling (Azure SDK Pattern)

The SDK uses a subscription-based event handling pattern:

```typescript
const subscription = session.subscribe({
  // Connection lifecycle
  onConnected: async (args, context) => {
    console.log("Connected:", args.connectionId);
  },
  onDisconnected: async (args, context) => {
    console.log("Disconnected:", args.code, args.reason);
  },
  onError: async (args, context) => {
    console.error("Error:", args.error.message);
  },

  // Session events
  onSessionCreated: async (event, context) => {
    console.log("Session created:", context.sessionId);
  },
  onSessionUpdated: async (event, context) => {
    console.log("Session updated");
  },

  // Audio input events (VAD)
  onInputAudioBufferSpeechStarted: async (event, context) => {
    console.log("Speech started at:", event.audioStartMs);
  },
  onInputAudioBufferSpeechStopped: async (event, context) => {
    console.log("Speech stopped at:", event.audioEndMs);
  },

  // Transcription events
  onConversationItemInputAudioTranscriptionCompleted: async (event, context) => {
    console.log("User said:", event.transcript);
  },
  onConversationItemInputAudioTranscriptionDelta: async (event, context) => {
    process.stdout.write(event.delta);
  },

  // Response events
  onResponseCreated: async (event, context) => {
    console.log("Response started");
  },
  onResponseDone: async (event, context) => {
    console.log("Response complete");
  },

  // Streaming text
  onResponseTextDelta: async (event, context) => {
    process.stdout.write(event.delta);
  },
  onResponseTextDone: async (event, context) => {
    console.log("\n--- Text complete ---");
  },

  // Streaming audio
  onResponseAudioDelta: async (event, context) => {
    const audioData = event.delta;
    playAudioChunk(audioData);
  },
  onResponseAudioDone: async (event, context) => {
    console.log("Audio complete");
  },

  // Audio transcript (what the assistant said)
  onResponseAudioTranscriptDelta: async (event, context) => {
    process.stdout.write(event.delta);
  },

  // Function calling
  onResponseFunctionCallArgumentsDone: async (event, context) => {
    if (event.name === "get_weather") {
      const args = JSON.parse(event.arguments);
      const result = await getWeather(args.location);

      await session.addConversationItem({
        type: "function_call_output",
        callId: event.callId,
        output: JSON.stringify(result),
      });

      await session.sendEvent({ type: "response.create" });
    }
  },

  // Catch-all for debugging
  onServerEvent: async (event, context) => {
    console.log("Event:", event.type);
  },
});

// Clean up when done
await subscription.close();
```

## Function Calling

```typescript
// Define tools in session config
await session.updateSession({
  modalities: ["audio", "text"],
  instructions: "Help users with weather information.",
  tools: [
    {
      type: "function",
      name: "get_weather",
      description: "Get current weather for a location",
      parameters: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "City and state or country",
          },
        },
        required: ["location"],
      },
    },
  ],
  toolChoice: "auto",
});

// Handle function calls
const subscription = session.subscribe({
  onResponseFunctionCallArgumentsDone: async (event, context) => {
    if (event.name === "get_weather") {
      const args = JSON.parse(event.arguments);
      const weatherData = await fetchWeather(args.location);

      // Send function result
      await session.addConversationItem({
        type: "function_call_output",
        callId: event.callId,
        output: JSON.stringify(weatherData),
      });

      // Trigger response generation
      await session.sendEvent({ type: "response.create" });
    }
  },
});
```

## Voice Options

| Voice Type | Config | Example |
|------------|--------|---------|
| Azure Standard | `{ type: "azure-standard", name: "..." }` | `"en-US-AvaNeural"` |
| Azure Custom | `{ type: "azure-custom", name: "...", endpointId: "..." }` | Custom voice endpoint |
| Azure Personal | `{ type: "azure-personal", speakerProfileId: "..." }` | Personal voice clone |
| OpenAI | `{ type: "openai", name: "..." }` | `"alloy"`, `"echo"`, `"shimmer"` |

## Supported Models

| Model | Description | Use Case |
|-------|-------------|----------|
| `gpt-4o-realtime-preview` | GPT-4o with real-time audio | High-quality conversational AI |
| `gpt-4o-mini-realtime-preview` | Lightweight GPT-4o | Fast, efficient interactions |
| `phi4-mm-realtime` | Phi multimodal | Cost-effective applications |

## Turn Detection Options

```typescript
// Server VAD (default)
turnDetection: {
  type: "server_vad",
  threshold: 0.5,
  prefixPaddingMs: 300,
  silenceDurationMs: 500,
}

// Azure Semantic VAD (smarter detection)
turnDetection: {
  type: "azure_semantic_vad",
}

// Azure Semantic VAD (English optimized)
turnDetection: {
  type: "azure_semantic_vad_en",
}

// Azure Semantic VAD (multilingual)
turnDetection: {
  type: "azure_semantic_vad_multilingual",
}
```

## Audio Formats

| Format | Sample Rate | Use Case |
|--------|-------------|----------|
| `pcm16` | 24kHz | Default, high quality |
| `pcm16-8000hz` | 8kHz | Telephony |
| `pcm16-16000hz` | 16kHz | Voice assistants |
| `g711_ulaw` | 8kHz | Telephony (US) |
| `g711_alaw` | 8kHz | Telephony (EU) |

## Key Types Reference

| Type | Purpose |
|------|---------|
| `VoiceLiveClient` | Main client for creating sessions |
| `VoiceLiveSession` | Active WebSocket session |
| `VoiceLiveSessionHandlers` | Event handler interface |
| `VoiceLiveSubscription` | Active event subscription |
| `ConnectionContext` | Context for connection events |
| `SessionContext` | Context for session events |
| `ServerEventUnion` | Union of all server events |

## Error Handling

```typescript
import {
  VoiceLiveError,
  VoiceLiveConnectionError,
  VoiceLiveAuthenticationError,
  VoiceLiveProtocolError,
} from "@azure/ai-voicelive";

const subscription = session.subscribe({
  onError: async (args, context) => {
    const { error } = args;

    if (error instanceof VoiceLiveConnectionError) {
      console.error("Connection error:", error.message);
    } else if (error instanceof VoiceLiveAuthenticationError) {
      console.error("Auth error:", error.message);
    } else if (error instanceof VoiceLiveProtocolError) {
      console.error("Protocol error:", error.message);
    }
  },

  onServerError: async (event, context) => {
    console.error("Server error:", event.error?.message);
  },
});
```

## Logging

```typescript
import { setLogLevel } from "@azure/logger";

// Enable verbose logging
setLogLevel("info");

// Or via environment variable
// AZURE_LOG_LEVEL=info
```

## Browser Usage

```typescript
// Browser usage requires a bundler (Vite, webpack, etc.)
import { VoiceLiveClient } from "@azure/ai-voicelive";
import { InteractiveBrowserCredential } from "@azure/identity";

// Use a browser-compatible credential
const credential = new InteractiveBrowserCredential({
  clientId: "your-client-id",
  tenantId: "your-tenant-id",
});

const client = new VoiceLiveClient(endpoint, credential);

// Request microphone access
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const audioContext = new AudioContext({ sampleRate: 24000 });

// Process audio and send to session
// ... (see samples for full implementation)
```

## Best Practices

1. **Always use `DefaultAzureCredential`** — Never hardcode API keys
2. **Set both modalities** — Include `["text", "audio"]` for voice assistants
3. **Use Azure Semantic VAD** — Better turn detection than basic server VAD
4. **Handle all error types** — Connection, auth, and protocol errors
5. **Clean up subscriptions** — Call `subscription.close()` when done
6. **Use an appropriate audio format** — PCM16 at 24kHz for best quality

## Reference Links

| Resource | URL |
|----------|-----|
| npm Package | https://www.npmjs.com/package/@azure/ai-voicelive |
| GitHub Source | https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/ai/ai-voicelive |
| Samples | https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/ai/ai-voicelive/samples |
| API Reference | https://learn.microsoft.com/javascript/api/@azure/ai-voicelive |