Add this skill
npx mdskills install sickn33/azure-ai-contentunderstanding-py

Comprehensive SDK reference with clear examples for multimodal content extraction
---
name: azure-ai-contentunderstanding-py
description: |
  Azure AI Content Understanding SDK for Python. Use for multimodal content extraction from documents, images, audio, and video.
  Triggers: "azure-ai-contentunderstanding", "ContentUnderstandingClient", "multimodal analysis", "document extraction", "video analysis", "audio transcription".
package: azure-ai-contentunderstanding
---

# Azure AI Content Understanding SDK for Python

Multimodal AI service that extracts semantic content from documents, video, audio, and image files for RAG and automated workflows.

## Installation

```bash
pip install azure-ai-contentunderstanding
```

## Environment Variables

```bash
CONTENTUNDERSTANDING_ENDPOINT=https://<resource>.cognitiveservices.azure.com/
```

## Authentication

```python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)
```

## Core Workflow

Content Understanding operations are asynchronous long-running operations:

1. **Begin Analysis**: Start the analysis operation with `begin_analyze()` (returns a poller)
2. **Poll for Results**: Poll until analysis completes (SDK handles this with `.result()`)
3. **Process Results**: Extract structured results from `AnalyzeResult.contents`

## Prebuilt Analyzers

| Analyzer | Content Type | Purpose |
|----------|--------------|---------|
| `prebuilt-documentSearch` | Documents | Extract markdown for RAG applications |
| `prebuilt-imageSearch` | Images | Extract content from images |
| `prebuilt-audioSearch` | Audio | Transcribe audio with timing |
| `prebuilt-videoSearch` | Video | Extract frames, transcripts, summaries |
| `prebuilt-invoice` | Documents | Extract invoice fields |

## Analyze Document

```python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
client = ContentUnderstandingClient(
    endpoint=endpoint,
    credential=DefaultAzureCredential()
)

# Analyze document from URL
poller = client.begin_analyze(
    analyzer_id="prebuilt-documentSearch",
    inputs=[AnalyzeInput(url="https://example.com/document.pdf")]
)

result = poller.result()

# Access markdown content (contents is a list)
content = result.contents[0]
print(content.markdown)
```

## Access Document Content Details

```python
from azure.ai.contentunderstanding.models import MediaContentKind, DocumentContent

content = result.contents[0]
if content.kind == MediaContentKind.DOCUMENT:
    document_content: DocumentContent = content  # type: ignore
    print(document_content.start_page_number)
```

## Analyze Image

```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-imageSearch",
    inputs=[AnalyzeInput(url="https://example.com/image.jpg")]
)
result = poller.result()
content = result.contents[0]
print(content.markdown)
```

## Analyze Video

```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-videoSearch",
    inputs=[AnalyzeInput(url="https://example.com/video.mp4")]
)

result = poller.result()

# Access video content (AudioVisualContent)
content = result.contents[0]

# Get transcript phrases with timing
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time} - {phrase.end_time}]: {phrase.text}")

# Get key frames (for video)
for frame in content.key_frames:
    print(f"Frame at {frame.time}: {frame.description}")
```

## Analyze Audio

```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-audioSearch",
    inputs=[AnalyzeInput(url="https://example.com/audio.mp3")]
)

result = poller.result()

# Access audio transcript
content = result.contents[0]
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time}] {phrase.text}")
```

## Custom Analyzers

Create custom analyzers with field schemas for specialized extraction:

```python
# Create custom analyzer
analyzer = client.create_analyzer(
    analyzer_id="my-invoice-analyzer",
    analyzer={
        "description": "Custom invoice analyzer",
        "base_analyzer_id": "prebuilt-documentSearch",
        "field_schema": {
            "fields": {
                "vendor_name": {"type": "string"},
                "invoice_total": {"type": "number"},
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "amount": {"type": "number"}
                        }
                    }
                }
            }
        }
    }
)

# Use custom analyzer
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="my-invoice-analyzer",
    inputs=[AnalyzeInput(url="https://example.com/invoice.pdf")]
)

result = poller.result()

# Access extracted fields
print(result.fields["vendor_name"])
print(result.fields["invoice_total"])
```

## Analyzer Management

```python
# List all analyzers
analyzers = client.list_analyzers()
for analyzer in analyzers:
    print(f"{analyzer.analyzer_id}: {analyzer.description}")

# Get specific analyzer
analyzer = client.get_analyzer("prebuilt-documentSearch")

# Delete custom analyzer
client.delete_analyzer("my-custom-analyzer")
```

## Async Client

```python
import asyncio
import os
from azure.ai.contentunderstanding.aio import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity.aio import DefaultAzureCredential

async def analyze_document():
    endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
    credential = DefaultAzureCredential()

    async with ContentUnderstandingClient(
        endpoint=endpoint,
        credential=credential
    ) as client:
        poller = await client.begin_analyze(
            analyzer_id="prebuilt-documentSearch",
            inputs=[AnalyzeInput(url="https://example.com/doc.pdf")]
        )
        result = await poller.result()
        content = result.contents[0]
        return content.markdown

asyncio.run(analyze_document())
```

## Content Types

| Class | For | Provides |
|-------|-----|----------|
| `DocumentContent` | PDF, images, Office docs | Pages, tables, figures, paragraphs |
| `AudioVisualContent` | Audio, video files | Transcript phrases, timing, key frames |

Both derive from `MediaContent` which provides basic info and markdown representation.

## Model Imports

```python
from azure.ai.contentunderstanding.models import (
    AnalyzeInput,
    AnalyzeResult,
    MediaContentKind,
    DocumentContent,
    AudioVisualContent,
)
```

## Client Types

| Client | Purpose |
|--------|---------|
| `ContentUnderstandingClient` | Sync client for all operations |
| `ContentUnderstandingClient` (aio) | Async client for all operations |

## Best Practices

1. **Use `begin_analyze` with `AnalyzeInput`**: this is the correct method signature
2. **Access results via `result.contents[0]`**: results are returned as a list
3. **Use prebuilt analyzers** for common scenarios (document/image/audio/video search)
4. **Create custom analyzers** only for domain-specific field extraction
5. **Use async client** for high-throughput scenarios with `azure.identity.aio` credentials
6. **Handle long-running operations**: video/audio analysis can take minutes
7. **Use URL sources** when possible to avoid upload overhead
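
Because results come back as a list, indexing `[0]` can miss content when an analysis returns more than one entry. The following is a minimal sketch of walking every entry instead. It relies only on the `contents` and `markdown` attributes used in the examples above. The helper name `collect_markdown` and the `SimpleNamespace` stand-ins are hypothetical, used so the sketch runs without Azure access; they are not part of the SDK.

```python
from types import SimpleNamespace


def collect_markdown(result, separator="\n\n"):
    """Join the markdown of every entry in result.contents, skipping empty ones.

    Hypothetical helper: it assumes only the `contents` and `markdown`
    attributes and duck-types everything else.
    """
    parts = []
    for content in getattr(result, "contents", None) or []:
        markdown = getattr(content, "markdown", None)
        if markdown:
            parts.append(markdown)
    return separator.join(parts)


# Hand-built stand-in for an analysis result (not a real SDK object).
fake_result = SimpleNamespace(
    contents=[
        SimpleNamespace(markdown="# Page 1"),
        SimpleNamespace(markdown=None),  # entry without markdown is skipped
        SimpleNamespace(markdown="# Page 2"),
    ]
)

print(collect_markdown(fake_result))
```

In real use, the value passed to `collect_markdown` would be whatever `poller.result()` returns. Whether a given analyzer ever returns more than one entry is not stated here, so treat the loop as a defensive default.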