---
name: azure-ai-translation-document-py
description: |
  Azure AI Document Translation SDK for batch translation of documents with format preservation. Use for translating Word, PDF, Excel, PowerPoint, and other document formats at scale.
  Triggers: "document translation", "batch translation", "translate documents", "DocumentTranslationClient".
package: azure-ai-translation-document
---

# Azure AI Document Translation SDK for Python

Client library for the Azure AI Translator document translation service, used for batch document translation with format preservation.

## Installation

```bash
pip install azure-ai-translation-document
```

## Environment Variables

```bash
AZURE_DOCUMENT_TRANSLATION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
AZURE_DOCUMENT_TRANSLATION_KEY=<your-api-key>  # If using API key

# Storage for source and target documents
AZURE_SOURCE_CONTAINER_URL=https://<storage>.blob.core.windows.net/<container>?<sas>
AZURE_TARGET_CONTAINER_URL=https://<storage>.blob.core.windows.net/<container>?<sas>
```

## Authentication

### API Key

```python
import os
from azure.ai.translation.document import DocumentTranslationClient
from azure.core.credentials import AzureKeyCredential

endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]

client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))
```

### Entra ID (Recommended)

```python
import os
from azure.ai.translation.document import DocumentTranslationClient
from azure.identity import DefaultAzureCredential  # pip install azure-identity

client = DocumentTranslationClient(
    endpoint=os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"],
    credential=DefaultAzureCredential()
)
```

## Basic Document Translation

```python
from azure.ai.translation.document import DocumentTranslationInput, TranslationTarget

source_url = os.environ["AZURE_SOURCE_CONTAINER_URL"]
target_url = os.environ["AZURE_TARGET_CONTAINER_URL"]

# Start translation job
poller = client.begin_translation(
    inputs=[
        DocumentTranslationInput(
            source_url=source_url,
            targets=[
                TranslationTarget(
                    target_url=target_url,
                    language="es"  # Translate to Spanish
                )
            ]
        )
    ]
)

# Wait for completion
result = poller.result()

print(f"Status: {poller.status()}")
print(f"Documents translated: {poller.details.documents_succeeded_count}")
print(f"Documents failed: {poller.details.documents_failed_count}")
```

## Multiple Target Languages

```python
# target_url_es, target_url_fr, and target_url_de are SAS URLs for
# separate target containers, one per language
poller = client.begin_translation(
    inputs=[
        DocumentTranslationInput(
            source_url=source_url,
            targets=[
                TranslationTarget(target_url=target_url_es, language="es"),
                TranslationTarget(target_url=target_url_fr, language="fr"),
                TranslationTarget(target_url=target_url_de, language="de")
            ]
        )
    ]
)
```

## Translate Single Document

```python
from azure.ai.translation.document import SingleDocumentTranslationClient
from azure.ai.translation.document.models import DocumentTranslateContent

single_client = SingleDocumentTranslationClient(endpoint, AzureKeyCredential(key))

with open("document.docx", "rb") as f:
    document_content = f.read()

# The request body wraps a (file name, content, content type) tuple, following
# the SDK's single-document sample; confirm against your installed version
body = DocumentTranslateContent(
    document=(
        "document.docx",
        document_content,
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    )
)

result = single_client.translate(body=body, target_language="es")

# Save translated document
with open("document_es.docx", "wb") as f:
    f.write(result)
```

## Check Translation Status

```python
# Get all translation operations
operations = client.list_translation_statuses()

for op in operations:
    print(f"Operation ID: {op.id}")
    print(f"Status: {op.status}")
    print(f"Created: {op.created_on}")
    print(f"Total documents: {op.documents_total_count}")
    print(f"Succeeded: {op.documents_succeeded_count}")
    print(f"Failed: {op.documents_failed_count}")
```

## List Document Statuses

```python
# Get status of individual documents in a job
operation_id = poller.id
document_statuses = client.list_document_statuses(operation_id)

for doc in document_statuses:
    print(f"Document: {doc.source_document_url}")
    print(f"  Status: {doc.status}")
    print(f"  Translated to: {doc.translated_to}")
    if doc.error:
        print(f"  Error: {doc.error.message}")
```

## Cancel Translation

```python
# Cancel a running translation
client.cancel_translation(operation_id)
```

## Using Glossary

```python
from azure.ai.translation.document import TranslationGlossary

poller = client.begin_translation(
    inputs=[
        DocumentTranslationInput(
            source_url=source_url,
            targets=[
                TranslationTarget(
                    target_url=target_url,
                    language="es",
                    glossaries=[
                        TranslationGlossary(
                            glossary_url="https://<storage>.blob.core.windows.net/glossary/terms.csv?<sas>",
                            file_format="csv"
                        )
                    ]
                )
            ]
        )
    ]
)
```

## Supported Document Formats

```python
# Get supported formats
formats = client.get_supported_document_formats()

for fmt in formats:
    print(f"Format: {fmt.file_format}")
    print(f"  Extensions: {fmt.file_extensions}")
    print(f"  Content types: {fmt.content_types}")
```

## Supported Languages

```python
# Get supported languages
# Assumption: confirm that your installed SDK version exposes
# get_supported_languages() on this client. The documented format helpers are
# get_supported_document_formats() and get_supported_glossary_formats().
languages = client.get_supported_languages()

for lang in languages:
    print(f"Language: {lang.name} ({lang.code})")
```

## Async Client

```python
from azure.ai.translation.document.aio import DocumentTranslationClient
from azure.identity.aio import DefaultAzureCredential

async def translate_documents():
    # endpoint is defined as in the Authentication section
    async with DocumentTranslationClient(
        endpoint=endpoint,
        credential=DefaultAzureCredential()
    ) as client:
        poller = await client.begin_translation(inputs=[...])
        result = await poller.result()
```

## Supported Formats

| Category | Formats |
|----------|---------|
| Documents | DOCX, PDF, PPTX, XLSX, HTML, TXT, RTF |
| Structured | CSV, TSV, JSON, XML |
| Localization | XLIFF, XLF, MHTML |

## Storage Requirements

- Source and target containers must be Azure Blob Storage
- Use SAS tokens with appropriate permissions:
  - Source: Read, List
  - Target: Write, List

## Best Practices

1. **Use SAS tokens** with minimal required permissions
2. **Monitor long-running operations** with `poller.status()`
3. **Handle document-level errors** by iterating document statuses
4. **Use glossaries** for domain-specific terminology
5. **Separate target containers** for each language
6. **Use async client** for multiple concurrent jobs
7. **Check supported formats** before submitting documents
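
The practice of handling document-level errors by iterating document statuses can be sketched as a small helper. This is a minimal illustration, not part of the SDK: `summarize_documents` is a hypothetical name, the stand-in objects only mimic the `status`, `source_document_url`, and `error.message` attributes used in the examples above, and the status strings and URLs are assumed sample values.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_documents(document_statuses):
    """Count documents per status and collect (url, message) pairs for errors.

    Hypothetical helper: it only reads the attributes shown in the
    "List Document Statuses" example (status, source_document_url, error).
    """
    counts = Counter()
    failures = []
    for doc in document_statuses:
        counts[doc.status] += 1
        error = getattr(doc, "error", None)
        if error is not None:
            failures.append((doc.source_document_url, error.message))
    return counts, failures


# Stand-in objects so the sketch runs without an Azure resource.
# The URLs, status strings, and error text are made-up sample values.
sample_statuses = [
    SimpleNamespace(
        source_document_url="https://example.invalid/src/a.docx",
        status="Succeeded",
        error=None,
    ),
    SimpleNamespace(
        source_document_url="https://example.invalid/src/b.pdf",
        status="Failed",
        error=SimpleNamespace(message="Hypothetical error message"),
    ),
]

counts, failures = summarize_documents(sample_statuses)
print(dict(counts))
for url, message in failures:
    print(f"{url}: {message}")
```

With a real job, the same helper could be given the iterable returned by `client.list_document_statuses(poller.id)` once the poller has finished, so failed documents can be logged or resubmitted individually.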