Add this skill
`npx mdskills install sickn33/azure-ai-document-intelligence-dotnet`

Comprehensive Azure Document Intelligence SDK reference with clear code examples and multiple workflows.
---
name: azure-ai-document-intelligence-dotnet
description: |
  Azure AI Document Intelligence SDK for .NET. Extract text, tables, and structured data from documents using prebuilt and custom models. Use for invoice processing, receipt extraction, ID document analysis, and custom document models. Triggers: "Document Intelligence", "DocumentIntelligenceClient", "form recognizer", "invoice extraction", "receipt OCR", "document analysis .NET".
package: Azure.AI.DocumentIntelligence
---

# Azure.AI.DocumentIntelligence (.NET)

Extract text, tables, and structured data from documents using prebuilt and custom models.

## Installation

```bash
dotnet add package Azure.AI.DocumentIntelligence
dotnet add package Azure.Identity
```

**Current Version**: v1.0.0 (GA)

## Environment Variables

```bash
DOCUMENT_INTELLIGENCE_ENDPOINT=https://<resource-name>.cognitiveservices.azure.com/
DOCUMENT_INTELLIGENCE_API_KEY=<your-api-key>
BLOB_CONTAINER_SAS_URL=https://<storage>.blob.core.windows.net/<container>?<sas-token>
```

## Authentication

### Microsoft Entra ID (Recommended)

```csharp
using Azure.Identity;
using Azure.AI.DocumentIntelligence;

string endpoint = Environment.GetEnvironmentVariable("DOCUMENT_INTELLIGENCE_ENDPOINT");
var credential = new DefaultAzureCredential();
var client = new DocumentIntelligenceClient(new Uri(endpoint), credential);
```

> **Note**: Entra ID requires a **custom subdomain** (e.g., `https://<resource-name>.cognitiveservices.azure.com/`), not a regional endpoint.

### API Key

```csharp
using Azure; // AzureKeyCredential
using Azure.AI.DocumentIntelligence;

string endpoint = Environment.GetEnvironmentVariable("DOCUMENT_INTELLIGENCE_ENDPOINT");
string apiKey = Environment.GetEnvironmentVariable("DOCUMENT_INTELLIGENCE_API_KEY");
var client = new DocumentIntelligenceClient(new Uri(endpoint), new AzureKeyCredential(apiKey));
```

## Client Types

| Client | Purpose |
|--------|---------|
| `DocumentIntelligenceClient` | Analyze documents, classify documents |
| `DocumentIntelligenceAdministrationClient` | Build/manage custom models and classifiers |

## Prebuilt Models

| Model ID | Description |
|----------|-------------|
| `prebuilt-read` | Extract text, languages, handwriting |
| `prebuilt-layout` | Extract text, tables, selection marks, structure |
| `prebuilt-invoice` | Extract invoice fields (vendor, items, totals) |
| `prebuilt-receipt` | Extract receipt fields (merchant, items, total) |
| `prebuilt-idDocument` | Extract ID document fields (name, DOB, address) |
| `prebuilt-businessCard` | Extract business card fields |
| `prebuilt-tax.us.w2` | Extract W-2 tax form fields |
| `prebuilt-healthInsuranceCard.us` | Extract health insurance card fields |

## Core Workflows

### 1. Analyze Invoice

```csharp
using Azure; // Operation<T>, WaitUntil
using Azure.AI.DocumentIntelligence;

Uri invoiceUri = new Uri("https://example.com/invoice.pdf");

Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(
    WaitUntil.Completed,
    "prebuilt-invoice",
    invoiceUri);

AnalyzeResult result = operation.Value;

foreach (AnalyzedDocument document in result.Documents)
{
    if (document.Fields.TryGetValue("VendorName", out DocumentField vendorNameField)
        && vendorNameField.FieldType == DocumentFieldType.String)
    {
        string vendorName = vendorNameField.ValueString;
        Console.WriteLine($"Vendor Name: '{vendorName}', confidence: {vendorNameField.Confidence}");
    }

    if (document.Fields.TryGetValue("InvoiceTotal", out DocumentField invoiceTotalField)
        && invoiceTotalField.FieldType == DocumentFieldType.Currency)
    {
        CurrencyValue invoiceTotal = invoiceTotalField.ValueCurrency;
        Console.WriteLine($"Invoice Total: '{invoiceTotal.CurrencySymbol}{invoiceTotal.Amount}'");
    }

    // Extract line items
    if (document.Fields.TryGetValue("Items", out DocumentField itemsField)
        && itemsField.FieldType == DocumentFieldType.List)
    {
        foreach (DocumentField item in itemsField.ValueList)
        {
            var itemFields = item.ValueDictionary;
            if (itemFields.TryGetValue("Description", out DocumentField descField))
                Console.WriteLine($"  Item: {descField.ValueString}");
        }
    }
}
```

### 2. Extract Layout (Text, Tables, Structure)

```csharp
Uri fileUri = new Uri("https://example.com/document.pdf");

Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(
    WaitUntil.Completed,
    "prebuilt-layout",
    fileUri);

AnalyzeResult result = operation.Value;

// Extract text by page
foreach (DocumentPage page in result.Pages)
{
    Console.WriteLine($"Page {page.PageNumber}: {page.Lines.Count} lines, {page.Words.Count} words");

    foreach (DocumentLine line in page.Lines)
    {
        Console.WriteLine($"  Line: '{line.Content}'");
    }
}

// Extract tables
foreach (DocumentTable table in result.Tables)
{
    Console.WriteLine($"Table: {table.RowCount} rows x {table.ColumnCount} columns");
    foreach (DocumentTableCell cell in table.Cells)
    {
        Console.WriteLine($"  Cell ({cell.RowIndex}, {cell.ColumnIndex}): {cell.Content}");
    }
}
```

### 3. Analyze Receipt

```csharp
// Placeholder URL; point this at a receipt image or PDF you can access.
Uri receiptUri = new Uri("https://example.com/receipt.jpg");

Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(
    WaitUntil.Completed,
    "prebuilt-receipt",
    receiptUri);

AnalyzeResult result = operation.Value;

foreach (AnalyzedDocument document in result.Documents)
{
    if (document.Fields.TryGetValue("MerchantName", out DocumentField merchantField))
        Console.WriteLine($"Merchant: {merchantField.ValueString}");

    // Check the field type before reading the typed value to avoid a null dereference.
    if (document.Fields.TryGetValue("Total", out DocumentField totalField)
        && totalField.FieldType == DocumentFieldType.Currency)
        Console.WriteLine($"Total: {totalField.ValueCurrency.Amount}");

    if (document.Fields.TryGetValue("TransactionDate", out DocumentField dateField))
        Console.WriteLine($"Date: {dateField.ValueDate}");
}
```

### 4. Build Custom Model

```csharp
var adminClient = new DocumentIntelligenceAdministrationClient(
    new Uri(endpoint),
    new AzureKeyCredential(apiKey));

string modelId = "my-custom-model";
// SAS URL of the blob container holding the training data (see Environment Variables).
Uri blobContainerUri = new Uri(Environment.GetEnvironmentVariable("BLOB_CONTAINER_SAS_URL"));

var blobSource = new BlobContentSource(blobContainerUri);
var options = new BuildDocumentModelOptions(modelId, DocumentBuildMode.Template, blobSource);

Operation<DocumentModelDetails> operation = await adminClient.BuildDocumentModelAsync(
    WaitUntil.Completed,
    options);

DocumentModelDetails model = operation.Value;

Console.WriteLine($"Model ID: {model.ModelId}");
Console.WriteLine($"Created: {model.CreatedOn}");

foreach (var docType in model.DocumentTypes)
{
    Console.WriteLine($"Document type: {docType.Key}");
    foreach (var field in docType.Value.FieldSchema)
    {
        Console.WriteLine($"  Field: {field.Key}, Confidence: {docType.Value.FieldConfidence[field.Key]}");
    }
}
```

### 5. Build Document Classifier

```csharp
string classifierId = "my-classifier";
// SAS URL of the blob container holding the training data (see Environment Variables).
Uri blobContainerUri = new Uri(Environment.GetEnvironmentVariable("BLOB_CONTAINER_SAS_URL"));

// Each document type is trained from its own folder prefix inside the container.
var sourceA = new BlobContentSource(blobContainerUri) { Prefix = "TypeA/train" };
var sourceB = new BlobContentSource(blobContainerUri) { Prefix = "TypeB/train" };

var docTypes = new Dictionary<string, ClassifierDocumentTypeDetails>()
{
    { "TypeA", new ClassifierDocumentTypeDetails(sourceA) },
    { "TypeB", new ClassifierDocumentTypeDetails(sourceB) }
};

var options = new BuildClassifierOptions(classifierId, docTypes);

Operation<DocumentClassifierDetails> operation = await adminClient.BuildClassifierAsync(
    WaitUntil.Completed,
    options);

DocumentClassifierDetails classifier = operation.Value;
Console.WriteLine($"Classifier ID: {classifier.ClassifierId}");
```

### 6. Classify Document

```csharp
string classifierId = "my-classifier";
Uri documentUri = new Uri("https://example.com/document.pdf");

var options = new ClassifyDocumentOptions(classifierId, documentUri);

Operation<AnalyzeResult> operation = await client.ClassifyDocumentAsync(
    WaitUntil.Completed,
    options);

AnalyzeResult result = operation.Value;

foreach (AnalyzedDocument document in result.Documents)
{
    Console.WriteLine($"Document type: {document.DocumentType}, confidence: {document.Confidence}");
}
```

### 7. Manage Models

```csharp
// Get resource details
DocumentIntelligenceResourceDetails resourceDetails = await adminClient.GetResourceDetailsAsync();
Console.WriteLine($"Custom models: {resourceDetails.CustomDocumentModels.Count}/{resourceDetails.CustomDocumentModels.Limit}");

// Get specific model
DocumentModelDetails model = await adminClient.GetModelAsync("my-model-id");
Console.WriteLine($"Model: {model.ModelId}, Created: {model.CreatedOn}");

// List models
await foreach (DocumentModelDetails modelItem in adminClient.GetModelsAsync())
{
    Console.WriteLine($"Model: {modelItem.ModelId}");
}

// Delete model
await adminClient.DeleteModelAsync("my-model-id");
```

## Key Types Reference

| Type | Description |
|------|-------------|
| `DocumentIntelligenceClient` | Main client for analysis |
| `DocumentIntelligenceAdministrationClient` | Model management |
| `AnalyzeResult` | Result of document analysis |
| `AnalyzedDocument` | Single document within result |
| `DocumentField` | Extracted field with value and confidence |
| `DocumentFieldType` | String, Date, Number, Currency, etc. |
| `DocumentPage` | Page info (lines, words, selection marks) |
| `DocumentTable` | Extracted table with cells |
| `DocumentModelDetails` | Custom model metadata |
| `BlobContentSource` | Training data source |

## Build Modes

| Mode | Use Case |
|------|----------|
| `DocumentBuildMode.Template` | Fixed layout documents (forms) |
| `DocumentBuildMode.Neural` | Variable layout documents |

## Best Practices

1. **Use DefaultAzureCredential** for production
2. **Reuse client instances**: clients are thread-safe
3. **Handle long-running operations**: use `WaitUntil.Completed` for simplicity
4. **Check field confidence**: always verify the `Confidence` property
5. **Use appropriate model**: prebuilt for common docs, custom for specialized ones
6. **Use custom subdomain**: required for Entra ID authentication

## Error Handling

```csharp
using Azure;

Uri documentUri = new Uri("https://example.com/document.pdf");

try
{
    var operation = await client.AnalyzeDocumentAsync(
        WaitUntil.Completed,
        "prebuilt-invoice",
        documentUri);
}
catch (RequestFailedException ex)
{
    // Status is the HTTP status code returned by the service.
    Console.WriteLine($"Error: {ex.Status} - {ex.Message}");
}
```

## Related SDKs

| SDK | Purpose | Install |
|-----|---------|---------|
| `Azure.AI.DocumentIntelligence` | Document analysis (this SDK) | `dotnet add package Azure.AI.DocumentIntelligence` |
| `Azure.AI.FormRecognizer` | Legacy SDK (deprecated) | Use DocumentIntelligence instead |

## Reference Links

| Resource | URL |
|----------|-----|
| NuGet Package | https://www.nuget.org/packages/Azure.AI.DocumentIntelligence |
| API Reference | https://learn.microsoft.com/dotnet/api/azure.ai.documentintelligence |
| GitHub Samples | https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/documentintelligence/Azure.AI.DocumentIntelligence/samples |
| Document Intelligence Studio | https://documentintelligence.ai.azure.com/ |
| Prebuilt Models | https://aka.ms/azsdk/formrecognizer/models |
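
## Sketch: Gating Fields on Confidence

The best practice of checking field confidence can be wrapped in a small helper. The following is a minimal sketch, not part of the SDK: the `ConfidenceGate` class, its method names, and the idea of a caller-supplied threshold are all assumptions made for illustration. It relies only on members already used in the workflows above (`AnalyzedDocument.Fields`, `DocumentField.FieldType`, `DocumentField.Confidence`, `DocumentField.ValueString`).

```csharp
using Azure.AI.DocumentIntelligence;

// Hypothetical helper for illustration; this class is not provided by the Azure SDK.
public static class ConfidenceGate
{
    // True only when a confidence score is present and meets the threshold.
    public static bool MeetsThreshold(float? confidence, float minConfidence) =>
        confidence.HasValue && confidence.Value >= minConfidence;

    // Returns a string field's value only if the field exists, is typed as
    // String, and its confidence passes the caller-supplied threshold.
    public static bool TryGetConfidentString(
        AnalyzedDocument document,
        string fieldName,
        float minConfidence,
        out string value)
    {
        value = string.Empty;

        if (document.Fields.TryGetValue(fieldName, out DocumentField field)
            && field.FieldType == DocumentFieldType.String
            && MeetsThreshold(field.Confidence, minConfidence))
        {
            value = field.ValueString;
            return true;
        }

        return false;
    }
}
```

Inside the invoice loop this could be called as `ConfidenceGate.TryGetConfidentString(document, "VendorName", 0.8f, out string vendor)`, routing documents that return `false` to manual review. The `0.8f` threshold is an arbitrary example value; a suitable cutoff depends on the model and the cost of a wrong extraction.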