Extract text, tables, and structured data from documents using Azure Document Intelligence (@azure-rest/ai-document-intelligence). Use when processing invoices, receipts, IDs, forms, or building custom document models.
Add this skill
npx mdskills install sickn33/azure-ai-document-intelligence-ts
Comprehensive SDK documentation with extensive examples but lacks agent-specific trigger conditions and instructions
---
name: azure-ai-document-intelligence-ts
description: Extract text, tables, and structured data from documents using Azure Document Intelligence (@azure-rest/ai-document-intelligence). Use when processing invoices, receipts, IDs, forms, or building custom document models.
package: "@azure-rest/ai-document-intelligence"
---

# Azure Document Intelligence REST SDK for TypeScript

Extract text, tables, and structured data from documents using prebuilt and custom models.

## Installation

```bash
npm install @azure-rest/ai-document-intelligence @azure/identity
```

## Environment Variables

```bash
DOCUMENT_INTELLIGENCE_ENDPOINT=https://<resource>.cognitiveservices.azure.com
DOCUMENT_INTELLIGENCE_API_KEY=<api-key>
```

## Authentication

**Important**: This is a REST client. `DocumentIntelligence` is a **function**, not a class.

### DefaultAzureCredential

```typescript
import DocumentIntelligence from "@azure-rest/ai-document-intelligence";
import { DefaultAzureCredential } from "@azure/identity";

const client = DocumentIntelligence(
  process.env.DOCUMENT_INTELLIGENCE_ENDPOINT!,
  new DefaultAzureCredential()
);
```

### API Key

```typescript
import DocumentIntelligence from "@azure-rest/ai-document-intelligence";

const client = DocumentIntelligence(
  process.env.DOCUMENT_INTELLIGENCE_ENDPOINT!,
  { key: process.env.DOCUMENT_INTELLIGENCE_API_KEY! }
);
```

## Analyze Document (URL)

```typescript
import DocumentIntelligence, {
  isUnexpected,
  getLongRunningPoller,
  AnalyzeOperationOutput
} from "@azure-rest/ai-document-intelligence";

const initialResponse = await client
  .path("/documentModels/{modelId}:analyze", "prebuilt-layout")
  .post({
    contentType: "application/json",
    body: {
      urlSource: "https://example.com/document.pdf"
    },
    queryParameters: { locale: "en-US" }
  });

if (isUnexpected(initialResponse)) {
  throw initialResponse.body.error;
}

const poller = getLongRunningPoller(client, initialResponse);
const result = (await poller.pollUntilDone()).body as AnalyzeOperationOutput;

console.log("Pages:", result.analyzeResult?.pages?.length);
console.log("Tables:", result.analyzeResult?.tables?.length);
```

## Analyze Document (Local File)

```typescript
import { readFile } from "node:fs/promises";

// Read the file and send it inline as a base64 string
const fileBuffer = await readFile("./document.pdf");
const base64Source = fileBuffer.toString("base64");

const initialResponse = await client
  .path("/documentModels/{modelId}:analyze", "prebuilt-invoice")
  .post({
    contentType: "application/json",
    body: { base64Source }
  });

if (isUnexpected(initialResponse)) {
  throw initialResponse.body.error;
}

const poller = getLongRunningPoller(client, initialResponse);
const result = (await poller.pollUntilDone()).body as AnalyzeOperationOutput;
```

## Prebuilt Models

| Model ID | Description |
|----------|-------------|
| `prebuilt-read` | OCR - text and language extraction |
| `prebuilt-layout` | Text, tables, selection marks, structure |
| `prebuilt-invoice` | Invoice fields |
| `prebuilt-receipt` | Receipt fields |
| `prebuilt-idDocument` | ID document fields |
| `prebuilt-tax.us.w2` | W-2 tax form fields |
| `prebuilt-healthInsuranceCard.us` | Health insurance card fields |
| `prebuilt-contract` | Contract fields |
| `prebuilt-bankStatement.us` | Bank statement fields |

## Extract Invoice Fields

```typescript
const initialResponse = await client
  .path("/documentModels/{modelId}:analyze", "prebuilt-invoice")
  .post({
    contentType: "application/json",
    body: { urlSource: invoiceUrl }
  });

if (isUnexpected(initialResponse)) {
  throw initialResponse.body.error;
}

const poller = getLongRunningPoller(client, initialResponse);
const result = (await poller.pollUntilDone()).body as AnalyzeOperationOutput;

const invoice = result.analyzeResult?.documents?.[0];
if (invoice) {
  console.log("Vendor:", invoice.fields?.VendorName?.content);
  console.log("Total:", invoice.fields?.InvoiceTotal?.content);
  console.log("Due Date:", invoice.fields?.DueDate?.content);
}
```

## Extract Receipt Fields

```typescript
const initialResponse = await client
  .path("/documentModels/{modelId}:analyze", "prebuilt-receipt")
  .post({
    contentType: "application/json",
    body: { urlSource: receiptUrl }
  });

if (isUnexpected(initialResponse)) {
  throw initialResponse.body.error;
}

const poller = getLongRunningPoller(client, initialResponse);
const result = (await poller.pollUntilDone()).body as AnalyzeOperationOutput;

const receipt = result.analyzeResult?.documents?.[0];
if (receipt) {
  console.log("Merchant:", receipt.fields?.MerchantName?.content);
  console.log("Total:", receipt.fields?.Total?.content);

  // In the REST output, array fields expose `valueArray` and object fields expose `valueObject`
  for (const item of receipt.fields?.Items?.valueArray ?? []) {
    console.log("Item:", item.valueObject?.Description?.content);
    console.log("Price:", item.valueObject?.TotalPrice?.content);
  }
}
```

## List Document Models

```typescript
import DocumentIntelligence, { isUnexpected, paginate } from "@azure-rest/ai-document-intelligence";

const response = await client.path("/documentModels").get();

if (isUnexpected(response)) {
  throw response.body.error;
}

for await (const model of paginate(client, response)) {
  console.log(model.modelId);
}
```

## Build Custom Model

```typescript
const initialResponse = await client.path("/documentModels:build").post({
  body: {
    modelId: "my-custom-model",
    description: "Custom model for purchase orders",
    buildMode: "template", // or "neural"
    azureBlobSource: {
      containerUrl: process.env.TRAINING_CONTAINER_SAS_URL!,
      prefix: "training-data/"
    }
  }
});

if (isUnexpected(initialResponse)) {
  throw initialResponse.body.error;
}

const poller = getLongRunningPoller(client, initialResponse);
const result = await poller.pollUntilDone();
console.log("Model built:", result.body);
```

## Build Document Classifier

```typescript
import { DocumentClassifierBuildOperationDetailsOutput } from "@azure-rest/ai-document-intelligence";

const containerSasUrl = process.env.TRAINING_CONTAINER_SAS_URL!;

const initialResponse = await client.path("/documentClassifiers:build").post({
  body: {
    classifierId: "my-classifier",
    description: "Invoice vs Receipt classifier",
    docTypes: {
      invoices: {
        azureBlobSource: { containerUrl: containerSasUrl, prefix: "invoices/" }
      },
      receipts: {
        azureBlobSource: { containerUrl: containerSasUrl, prefix: "receipts/" }
      }
    }
  }
});

if (isUnexpected(initialResponse)) {
  throw initialResponse.body.error;
}

const poller = getLongRunningPoller(client, initialResponse);
const result = (await poller.pollUntilDone()).body as DocumentClassifierBuildOperationDetailsOutput;
console.log("Classifier:", result.result?.classifierId);
```

## Classify Document

```typescript
const initialResponse = await client
  .path("/documentClassifiers/{classifierId}:analyze", "my-classifier")
  .post({
    contentType: "application/json",
    body: { urlSource: documentUrl },
    queryParameters: { split: "auto" }
  });

if (isUnexpected(initialResponse)) {
  throw initialResponse.body.error;
}

const poller = getLongRunningPoller(client, initialResponse);
const result = (await poller.pollUntilDone()).body as AnalyzeOperationOutput;
console.log("Classification:", result.analyzeResult?.documents);
```

## Get Service Info

```typescript
const response = await client.path("/info").get();

if (isUnexpected(response)) {
  throw response.body.error;
}

console.log("Custom model limit:", response.body.customDocumentModels.limit);
console.log("Custom model count:", response.body.customDocumentModels.count);
```

## Polling Pattern

```typescript
import DocumentIntelligence, {
  isUnexpected,
  getLongRunningPoller,
  AnalyzeOperationOutput
} from "@azure-rest/ai-document-intelligence";

// 1. Start operation
const initialResponse = await client
  .path("/documentModels/{modelId}:analyze", "prebuilt-layout")
  .post({ contentType: "application/json", body: { urlSource } });

// 2. Check for errors
if (isUnexpected(initialResponse)) {
  throw initialResponse.body.error;
}

// 3. Create poller
const poller = getLongRunningPoller(client, initialResponse);

// 4. Optional: Monitor progress
poller.onProgress((state) => {
  console.log("Status:", state.status);
});

// 5. Wait for completion
const result = (await poller.pollUntilDone()).body as AnalyzeOperationOutput;
```

## Key Types

```typescript
import DocumentIntelligence, {
  isUnexpected,
  getLongRunningPoller,
  paginate,
  parseResultIdFromResponse,
  AnalyzeOperationOutput,
  DocumentClassifierBuildOperationDetailsOutput
} from "@azure-rest/ai-document-intelligence";
```

## Best Practices

1. **Use getLongRunningPoller()** - Document analysis is async, so always poll for results
2. **Check isUnexpected()** - Type guard for proper error handling
3. **Choose the right model** - Use prebuilt models when possible, custom models for specialized documents
4. **Handle confidence scores** - Fields have confidence values; set thresholds for your use case
5. **Use pagination** - Use the `paginate()` helper for listing models
6. **Prefer neural mode** - For custom models, neural handles more variation than template
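
The advice on confidence scores is not demonstrated elsewhere in this skill, so here is a minimal sketch of one way to apply a threshold. It assumes each extracted field carries optional `content` and `confidence` properties, as the invoice and receipt examples suggest. The `FieldLike` interface, the `partitionByConfidence` helper, the sample values, and the `0.8` threshold are all hypothetical and exist only for illustration; they are not part of the SDK.

```typescript
// Minimal structural stand-in for an extracted field. Only the two properties
// used here are declared (an assumption, so the sketch runs without the SDK).
interface FieldLike {
  content?: string;
  confidence?: number;
}

interface PartitionedFields {
  accepted: Record<string, string>; // field name -> extracted text
  needsReview: string[];            // field names below the threshold
}

// Hypothetical helper: split fields into accepted values and names to review.
function partitionByConfidence(
  fields: Record<string, FieldLike> | undefined,
  threshold: number
): PartitionedFields {
  const accepted: Record<string, string> = {};
  const needsReview: string[] = [];

  for (const [name, field] of Object.entries(fields ?? {})) {
    // A field with no content or no confidence value is treated as unreliable.
    if (field.content !== undefined && (field.confidence ?? 0) >= threshold) {
      accepted[name] = field.content;
    } else {
      needsReview.push(name);
    }
  }

  return { accepted, needsReview };
}

// Made-up values shaped like `result.analyzeResult?.documents?.[0]?.fields`.
const sample: Record<string, FieldLike> = {
  VendorName: { content: "Contoso", confidence: 0.97 },
  InvoiceTotal: { content: "$110.00", confidence: 0.55 },
  DueDate: { confidence: 0.9 }
};

const { accepted, needsReview } = partitionByConfidence(sample, 0.8);
console.log("Accepted:", accepted);
console.log("Needs review:", needsReview);
```

In practice the threshold would likely differ per field and per use case; routing the `needsReview` names to a human check is one option, not something the service prescribes.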