---
name: azure-ai-voicelive-java
description: |
  Azure AI VoiceLive SDK for Java. Real-time bidirectional voice conversations with AI assistants using WebSocket.
  Triggers: "VoiceLiveClient java", "voice assistant java", "real-time voice java", "audio streaming java", "voice activity detection java".
package: com.azure:azure-ai-voicelive
---

# Azure AI VoiceLive SDK for Java

Real-time, bidirectional voice conversations with AI assistants using WebSocket technology.

## Installation

```xml
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-ai-voicelive</artifactId>
    <version>1.0.0-beta.2</version>
</dependency>
```

## Environment Variables

```bash
AZURE_VOICELIVE_ENDPOINT=https://<resource>.openai.azure.com/
AZURE_VOICELIVE_API_KEY=<your-api-key>
```

## Authentication

### API Key

```java
import com.azure.ai.voicelive.VoiceLiveAsyncClient;
import com.azure.ai.voicelive.VoiceLiveClientBuilder;
import com.azure.core.credential.AzureKeyCredential;

VoiceLiveAsyncClient client = new VoiceLiveClientBuilder()
    .endpoint(System.getenv("AZURE_VOICELIVE_ENDPOINT"))
    .credential(new AzureKeyCredential(System.getenv("AZURE_VOICELIVE_API_KEY")))
    .buildAsyncClient();
```

### DefaultAzureCredential (Recommended)

```java
import com.azure.identity.DefaultAzureCredentialBuilder;

VoiceLiveAsyncClient client = new VoiceLiveClientBuilder()
    .endpoint(System.getenv("AZURE_VOICELIVE_ENDPOINT"))
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildAsyncClient();
```

## Key Concepts

| Concept | Description |
|---------|-------------|
| `VoiceLiveAsyncClient` | Main entry point for voice sessions |
| `VoiceLiveSessionAsyncClient` | Active WebSocket connection for streaming |
| `VoiceLiveSessionOptions` | Configuration for session behavior |

### Audio Requirements

- **Sample Rate**: 24 kHz (24000 Hz)
- **Bit Depth**: 16-bit PCM
- **Channels**: Mono (1 channel)
- **Format**: Signed PCM, little-endian

## Core Workflow

### 1. Start Session

```java
import reactor.core.publisher.Mono;

client.startSession("gpt-4o-realtime-preview")
    .flatMap(session -> {
        System.out.println("Session started");

        // Subscribe to events
        session.receiveEvents()
            .subscribe(
                event -> System.out.println("Event: " + event.getType()),
                error -> System.err.println("Error: " + error.getMessage())
            );

        return Mono.just(session);
    })
    .block();
```

### 2. Configure Session Options

```java
import com.azure.ai.voicelive.models.*;
import com.azure.core.util.BinaryData;
import java.util.Arrays;

ServerVadTurnDetection turnDetection = new ServerVadTurnDetection()
    .setThreshold(0.5)            // Sensitivity (0.0-1.0)
    .setPrefixPaddingMs(300)      // Audio before speech
    .setSilenceDurationMs(500)    // Silence to end turn
    .setInterruptResponse(true)   // Allow interruptions
    .setAutoTruncate(true)
    .setCreateResponse(true);

AudioInputTranscriptionOptions transcription = new AudioInputTranscriptionOptions(
    AudioInputTranscriptionOptionsModel.WHISPER_1);

VoiceLiveSessionOptions options = new VoiceLiveSessionOptions()
    .setInstructions("You are a helpful AI voice assistant.")
    .setVoice(BinaryData.fromObject(new OpenAIVoice(OpenAIVoiceName.ALLOY)))
    .setModalities(Arrays.asList(InteractionModality.TEXT, InteractionModality.AUDIO))
    .setInputAudioFormat(InputAudioFormat.PCM16)
    .setOutputAudioFormat(OutputAudioFormat.PCM16)
    .setInputAudioSamplingRate(24000)
    .setInputAudioNoiseReduction(new AudioNoiseReduction(AudioNoiseReductionType.NEAR_FIELD))
    .setInputAudioEchoCancellation(new AudioEchoCancellation())
    .setInputAudioTranscription(transcription)
    .setTurnDetection(turnDetection);

// Send configuration
ClientEventSessionUpdate updateEvent = new ClientEventSessionUpdate(options);
session.sendEvent(updateEvent).subscribe();
```

### 3. Send Audio Input

```java
byte[] audioData = readAudioChunk(); // Your PCM16 audio data
session.sendInputAudio(BinaryData.fromBytes(audioData)).subscribe();
```

### 4. Handle Events

```java
session.receiveEvents().subscribe(event -> {
    ServerEventType eventType = event.getType();

    if (ServerEventType.SESSION_CREATED.equals(eventType)) {
        System.out.println("Session created");
    } else if (ServerEventType.INPUT_AUDIO_BUFFER_SPEECH_STARTED.equals(eventType)) {
        System.out.println("User started speaking");
    } else if (ServerEventType.INPUT_AUDIO_BUFFER_SPEECH_STOPPED.equals(eventType)) {
        System.out.println("User stopped speaking");
    } else if (ServerEventType.RESPONSE_AUDIO_DELTA.equals(eventType)) {
        if (event instanceof SessionUpdateResponseAudioDelta) {
            SessionUpdateResponseAudioDelta audioEvent = (SessionUpdateResponseAudioDelta) event;
            playAudioChunk(audioEvent.getDelta());
        }
    } else if (ServerEventType.RESPONSE_DONE.equals(eventType)) {
        System.out.println("Response complete");
    } else if (ServerEventType.ERROR.equals(eventType)) {
        if (event instanceof SessionUpdateError) {
            SessionUpdateError errorEvent = (SessionUpdateError) event;
            System.err.println("Error: " + errorEvent.getError().getMessage());
        }
    }
});
```

## Voice Configuration

### OpenAI Voices

```java
// Available: ALLOY, ASH, BALLAD, CORAL, ECHO, SAGE, SHIMMER, VERSE
VoiceLiveSessionOptions options = new VoiceLiveSessionOptions()
    .setVoice(BinaryData.fromObject(new OpenAIVoice(OpenAIVoiceName.ALLOY)));
```

### Azure Voices

```java
// Azure Standard Voice
options.setVoice(BinaryData.fromObject(new AzureStandardVoice("en-US-JennyNeural")));

// Azure Custom Voice
options.setVoice(BinaryData.fromObject(new AzureCustomVoice("myVoice", "endpointId")));

// Azure Personal Voice
options.setVoice(BinaryData.fromObject(
    new AzurePersonalVoice("speakerProfileId", PersonalVoiceModels.PHOENIX_LATEST_NEURAL)));
```

## Function Calling

```java
// parametersSchema is a JSON-Schema-style description of the function's arguments
VoiceLiveFunctionDefinition weatherFunction = new VoiceLiveFunctionDefinition("get_weather")
    .setDescription("Get current weather for a location")
    .setParameters(BinaryData.fromObject(parametersSchema));

VoiceLiveSessionOptions options = new VoiceLiveSessionOptions()
    .setTools(Arrays.asList(weatherFunction))
    .setInstructions("You have access to weather information.");
```

## Best Practices

1. **Use the async client** — VoiceLive requires reactive patterns
2. **Configure turn detection** for natural conversation flow
3. **Enable noise reduction** for better speech recognition
4. **Handle interruptions** gracefully with `setInterruptResponse(true)`
5. **Use Whisper transcription** for input audio transcription
6. **Close sessions** properly when the conversation ends

## Error Handling

```java
import reactor.core.publisher.Flux;

session.receiveEvents()
    .doOnError(error -> System.err.println("Connection error: " + error.getMessage()))
    .onErrorResume(error -> {
        // Attempt reconnection or cleanup
        return Flux.empty();
    })
    .subscribe();
```

## Reference Links

| Resource | URL |
|----------|-----|
| GitHub Source | https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/ai/azure-ai-voicelive |
| Samples | https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/ai/azure-ai-voicelive/src/samples |
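## Appendix: Helper Sketches

The audio requirements listed above (24 kHz, 16-bit, mono, signed little-endian PCM) map directly onto the JDK's `javax.sound.sampled.AudioFormat`. The sketch below is illustrative — the class name and `openMicrophone` helper are not part of the SDK — but the format parameters match the VoiceLive requirements:

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

public class VoiceLiveAudioFormat {
    // 24 kHz sample rate, 16-bit samples, 1 channel, signed PCM, little-endian
    public static final AudioFormat VOICELIVE_FORMAT =
            new AudioFormat(24000f, 16, 1, true, false);

    // Opens the default microphone in the required format,
    // provided the audio hardware supports it.
    public static TargetDataLine openMicrophone() throws Exception {
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, VOICELIVE_FORMAT);
        TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
        line.open(VOICELIVE_FORMAT);
        return line;
    }
}
```

Bytes read from such a line can be passed to `session.sendInputAudio(...)` without conversion.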
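The "Send Audio Input" step typically streams audio in small fixed-size chunks rather than one large buffer. At 24 kHz, 16-bit, mono, 20 ms of audio is 24000 × 0.020 × 2 = 960 bytes. A minimal chunking helper (the class name and chunk duration are illustrative choices, not SDK requirements):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AudioChunker {
    // 20 ms of 24 kHz, 16-bit, mono PCM: 24000 samples/s * 0.020 s * 2 bytes = 960 bytes
    public static final int CHUNK_BYTES = 960;

    // Splits a PCM buffer into fixed-size chunks; the final chunk may be shorter.
    public static List<byte[]> chunk(byte[] pcm) {
        List<byte[]> chunks = new ArrayList<>();
        for (int off = 0; off < pcm.length; off += CHUNK_BYTES) {
            int end = Math.min(off + CHUNK_BYTES, pcm.length);
            chunks.add(Arrays.copyOfRange(pcm, off, end));
        }
        return chunks;
    }
}
```

Each chunk would then be sent with `session.sendInputAudio(BinaryData.fromBytes(chunk))`, letting server-side turn detection decide when the user has finished speaking.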
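The `parametersSchema` referenced in the function-calling example is a JSON-Schema-style description of the tool's arguments. One way to build it is with plain maps, as sketched below; the field names follow the common OpenAI tool-parameter convention, and the helper itself is hypothetical:

```java
import java.util.List;
import java.util.Map;

public class WeatherSchema {
    // JSON-Schema-style parameter description for the get_weather tool
    public static Map<String, Object> parametersSchema() {
        return Map.of(
            "type", "object",
            "properties", Map.of(
                "location", Map.of(
                    "type", "string",
                    "description", "City and state, e.g. Seattle, WA")),
            "required", List.of("location"));
    }
}
```

Passing the result through `BinaryData.fromObject(...)` serializes it to JSON for the `setParameters` call.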