Build document analysis applications with Azure Document Intelligence (Form Recognizer) SDK for Java. Use when extracting text, tables, key-value pairs from documents, receipts, invoices, or building custom document models.
Add this skill
npx mdskills install sickn33/azure-ai-formrecognizer-java
Comprehensive Azure Document Intelligence SDK reference with clear client setup, model patterns, and code examples
---
name: azure-ai-formrecognizer-java
description: Build document analysis applications with Azure Document Intelligence (Form Recognizer) SDK for Java. Use when extracting text, tables, key-value pairs from documents, receipts, invoices, or building custom document models.
package: com.azure:azure-ai-formrecognizer
---

# Azure Document Intelligence (Form Recognizer) SDK for Java

Build document analysis applications using the Azure AI Document Intelligence SDK for Java.

## Installation

```xml
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-ai-formrecognizer</artifactId>
    <version>4.2.0-beta.1</version>
</dependency>
```

## Client Creation

### DocumentAnalysisClient

```java
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.core.credential.AzureKeyCredential;

DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
    .credential(new AzureKeyCredential("{key}"))
    .endpoint("{endpoint}")
    .buildClient();
```

### DocumentModelAdministrationClient

```java
import com.azure.ai.formrecognizer.documentanalysis.administration.DocumentModelAdministrationClient;
import com.azure.ai.formrecognizer.documentanalysis.administration.DocumentModelAdministrationClientBuilder;
import com.azure.core.credential.AzureKeyCredential;

DocumentModelAdministrationClient adminClient = new DocumentModelAdministrationClientBuilder()
    .credential(new AzureKeyCredential("{key}"))
    .endpoint("{endpoint}")
    .buildClient();
```

### With DefaultAzureCredential

```java
import com.azure.identity.DefaultAzureCredentialBuilder;

DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
    .endpoint("{endpoint}")
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildClient();
```

## Prebuilt Models

| Model ID | Purpose |
|----------|---------|
| `prebuilt-layout` | Extract text, tables, selection marks |
| `prebuilt-document` | General document with key-value pairs |
| `prebuilt-receipt` | Receipt data extraction |
| `prebuilt-invoice` | Invoice field extraction |
| `prebuilt-businessCard` | Business card parsing |
| `prebuilt-idDocument` | ID document (passport, license) |
| `prebuilt-tax.us.w2` | US W2 tax forms |

## Core Patterns

### Extract Layout

```java
import com.azure.ai.formrecognizer.documentanalysis.models.*;
import com.azure.core.util.BinaryData;
import com.azure.core.util.polling.SyncPoller;
import java.io.File;

File document = new File("document.pdf");
BinaryData documentData = BinaryData.fromFile(document.toPath());

SyncPoller<OperationResult, AnalyzeResult> poller =
    client.beginAnalyzeDocument("prebuilt-layout", documentData);

// Blocks until the long-running analysis completes
AnalyzeResult result = poller.getFinalResult();

// Process pages
for (DocumentPage page : result.getPages()) {
    System.out.printf("Page %d: %.2f x %.2f %s%n",
        page.getPageNumber(),
        page.getWidth(),
        page.getHeight(),
        page.getUnit());

    // Lines
    for (DocumentLine line : page.getLines()) {
        System.out.println("Line: " + line.getContent());
    }

    // Selection marks (checkboxes)
    for (DocumentSelectionMark mark : page.getSelectionMarks()) {
        System.out.printf("Checkbox: %s (confidence: %.2f)%n",
            mark.getSelectionMarkState(),
            mark.getConfidence());
    }
}

// Tables
for (DocumentTable table : result.getTables()) {
    System.out.printf("Table: %d rows x %d columns%n",
        table.getRowCount(),
        table.getColumnCount());

    for (DocumentTableCell cell : table.getCells()) {
        System.out.printf("Cell[%d,%d]: %s%n",
            cell.getRowIndex(),
            cell.getColumnIndex(),
            cell.getContent());
    }
}
```

### Analyze from URL

```java
String documentUrl = "https://example.com/invoice.pdf";

SyncPoller<OperationResult, AnalyzeResult> poller =
    client.beginAnalyzeDocumentFromUrl("prebuilt-invoice", documentUrl);

AnalyzeResult result = poller.getFinalResult();
```

### Analyze Receipt

```java
import java.util.Map;

// Placeholder URL; replace with a publicly reachable receipt image or PDF
String receiptUrl = "https://example.com/receipt.png";

SyncPoller<OperationResult, AnalyzeResult> poller =
    client.beginAnalyzeDocumentFromUrl("prebuilt-receipt", receiptUrl);

AnalyzeResult result = poller.getFinalResult();

for (AnalyzedDocument doc : result.getDocuments()) {
    Map<String, DocumentField> fields = doc.getFields();

    DocumentField merchantName = fields.get("MerchantName");
    if (merchantName != null && merchantName.getType() == DocumentFieldType.STRING) {
        System.out.printf("Merchant: %s (confidence: %.2f)%n",
            merchantName.getValueAsString(),
            merchantName.getConfidence());
    }

    DocumentField transactionDate = fields.get("TransactionDate");
    if (transactionDate != null && transactionDate.getType() == DocumentFieldType.DATE) {
        System.out.printf("Date: %s%n", transactionDate.getValueAsDate());
    }

    DocumentField items = fields.get("Items");
    if (items != null && items.getType() == DocumentFieldType.LIST) {
        for (DocumentField item : items.getValueAsList()) {
            if (item.getType() != DocumentFieldType.MAP) {
                continue;
            }
            Map<String, DocumentField> itemFields = item.getValueAsMap();

            // Sub-fields are optional, so check presence and type before reading
            DocumentField name = itemFields.get("Name");
            if (name != null && name.getType() == DocumentFieldType.STRING) {
                System.out.printf("Item: %s%n", name.getValueAsString());
            }
            DocumentField price = itemFields.get("Price");
            if (price != null && price.getType() == DocumentFieldType.DOUBLE) {
                System.out.printf("Price: %.2f%n", price.getValueAsDouble());
            }
        }
    }
}
```

### General Document Analysis

```java
SyncPoller<OperationResult, AnalyzeResult> poller =
    client.beginAnalyzeDocumentFromUrl("prebuilt-document", documentUrl);

AnalyzeResult result = poller.getFinalResult();

// Key-value pairs
for (DocumentKeyValuePair kvp : result.getKeyValuePairs()) {
    System.out.printf("Key: %s => Value: %s%n",
        kvp.getKey().getContent(),
        kvp.getValue() != null ? kvp.getValue().getContent() : "null");
}
```

## Custom Models

### Build Custom Model

```java
import com.azure.ai.formrecognizer.documentanalysis.administration.models.*;
import com.azure.core.util.Context;

String blobContainerUrl = "{SAS_URL_of_training_data}";
String prefix = "training-docs/";

SyncPoller<OperationResult, DocumentModelDetails> poller = adminClient.beginBuildDocumentModel(
    blobContainerUrl,
    DocumentModelBuildMode.TEMPLATE,
    prefix,
    new BuildDocumentModelOptions()
        .setModelId("my-custom-model")
        .setDescription("Custom invoice model"),
    Context.NONE);

DocumentModelDetails model = poller.getFinalResult();

System.out.println("Model ID: " + model.getModelId());
System.out.println("Created: " + model.getCreatedOn());

model.getDocumentTypes().forEach((docType, details) -> {
    System.out.println("Document type: " + docType);
    details.getFieldSchema().forEach((field, schema) -> {
        System.out.printf("  Field: %s (%s)%n", field, schema.getType());
    });
});
```

### Analyze with Custom Model

```java
SyncPoller<OperationResult, AnalyzeResult> poller =
    client.beginAnalyzeDocumentFromUrl("my-custom-model", documentUrl);

AnalyzeResult result = poller.getFinalResult();

for (AnalyzedDocument doc : result.getDocuments()) {
    System.out.printf("Document type: %s (confidence: %.2f)%n",
        doc.getDocType(),
        doc.getConfidence());

    doc.getFields().forEach((name, field) -> {
        System.out.printf("Field '%s': %s (confidence: %.2f)%n",
            name,
            field.getContent(),
            field.getConfidence());
    });
}
```

### Compose Models

```java
import com.azure.core.util.Context;
import java.util.Arrays;
import java.util.List;

List<String> modelIds = Arrays.asList("model-1", "model-2", "model-3");

// The overload that accepts options also takes a Context
SyncPoller<OperationResult, DocumentModelDetails> poller =
    adminClient.beginComposeDocumentModel(
        modelIds,
        new ComposeDocumentModelOptions()
            .setModelId("composed-model")
            .setDescription("Composed from multiple models"),
        Context.NONE);

DocumentModelDetails composedModel = poller.getFinalResult();
```

### Manage Models

```java
import com.azure.core.http.rest.PagedIterable;

// List models
PagedIterable<DocumentModelSummary> models = adminClient.listDocumentModels();
for (DocumentModelSummary summary : models) {
    System.out.printf("Model: %s, Created: %s%n",
        summary.getModelId(),
        summary.getCreatedOn());
}

// Get model details
DocumentModelDetails model = adminClient.getDocumentModel("model-id");

// Delete model
adminClient.deleteDocumentModel("model-id");

// Check resource limits
ResourceDetails resources = adminClient.getResourceDetails();
System.out.printf("Models: %d / %d%n",
    resources.getCustomDocumentModelCount(),
    resources.getCustomDocumentModelLimit());
```

## Document Classification

### Build Classifier

```java
import com.azure.core.util.Context;
import java.util.HashMap;
import java.util.Map;

// Placeholder; replace with the SAS URL of the container holding the training data
String containerUrl = "{SAS_URL_of_training_data}";

Map<String, ClassifierDocumentTypeDetails> docTypes = new HashMap<>();
docTypes.put("invoice", new ClassifierDocumentTypeDetails()
    .setAzureBlobSource(new AzureBlobContentSource(containerUrl).setPrefix("invoices/")));
docTypes.put("receipt", new ClassifierDocumentTypeDetails()
    .setAzureBlobSource(new AzureBlobContentSource(containerUrl).setPrefix("receipts/")));

// The overload that accepts options also takes a Context
SyncPoller<OperationResult, DocumentClassifierDetails> poller =
    adminClient.beginBuildDocumentClassifier(docTypes,
        new BuildDocumentClassifierOptions().setClassifierId("my-classifier"),
        Context.NONE);

DocumentClassifierDetails classifier = poller.getFinalResult();
```

### Classify Document

```java
SyncPoller<OperationResult, AnalyzeResult> poller =
    client.beginClassifyDocumentFromUrl("my-classifier", documentUrl, Context.NONE);

AnalyzeResult result = poller.getFinalResult();

for (AnalyzedDocument doc : result.getDocuments()) {
    System.out.printf("Classified as: %s (confidence: %.2f)%n",
        doc.getDocType(),
        doc.getConfidence());
}
```

## Error Handling

```java
import com.azure.core.exception.HttpResponseException;

try {
    client.beginAnalyzeDocumentFromUrl("prebuilt-receipt", "invalid-url");
} catch (HttpResponseException e) {
    // The service rejected the request; inspect the HTTP status and message
    System.out.println("Status: " + e.getResponse().getStatusCode());
    System.out.println("Error: " + e.getMessage());
}
```

## Environment Variables

```bash
FORM_RECOGNIZER_ENDPOINT=https://<resource>.cognitiveservices.azure.com/
FORM_RECOGNIZER_KEY=<your-api-key>
```

## Trigger Phrases

- "document intelligence Java"
- "form recognizer SDK"
- "extract text from PDF"
- "OCR document Java"
- "analyze invoice receipt"
- "custom document model"
- "document classification"
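
## Loading Configuration (Sketch)

The Environment Variables section lists `FORM_RECOGNIZER_ENDPOINT` and `FORM_RECOGNIZER_KEY` but does not show how they reach the client builders. The following is a minimal sketch of one way to load and validate them before building a client. `FormRecognizerConfig` is a hypothetical helper, not part of the Azure SDK, and the requirement that the endpoint be an absolute `https` URL is an assumption based on the endpoint format shown in that section.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of the Azure SDK) that loads and validates client settings. */
final class FormRecognizerConfig {
    static final String ENDPOINT_VAR = "FORM_RECOGNIZER_ENDPOINT";
    static final String KEY_VAR = "FORM_RECOGNIZER_KEY";

    private final String endpoint;
    private final String key;

    private FormRecognizerConfig(String endpoint, String key) {
        this.endpoint = endpoint;
        this.key = key;
    }

    String getEndpoint() {
        return endpoint;
    }

    String getKey() {
        return key;
    }

    /** Reads both settings from the given map (for example System.getenv()) and validates them. */
    static FormRecognizerConfig fromEnv(Map<String, String> env) {
        String endpoint = require(env, ENDPOINT_VAR);
        String key = require(env, KEY_VAR);

        URI uri;
        try {
            uri = new URI(endpoint);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(ENDPOINT_VAR + " is not a valid URI: " + endpoint, e);
        }
        // Assumption: the service endpoint is always an absolute https URL
        if (!"https".equalsIgnoreCase(uri.getScheme()) || uri.getHost() == null) {
            throw new IllegalArgumentException(ENDPOINT_VAR + " must be an absolute https URL");
        }
        return new FormRecognizerConfig(endpoint, key);
    }

    /** Returns the trimmed value, or fails fast when the variable is missing or blank. */
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // In-memory stand-in for System.getenv() so the sketch runs without any setup
        Map<String, String> env = new HashMap<>();
        env.put(ENDPOINT_VAR, "https://example-resource.cognitiveservices.azure.com/");
        env.put(KEY_VAR, "placeholder-key");

        FormRecognizerConfig config = fromEnv(env);
        System.out.println("Endpoint: " + config.getEndpoint());
    }
}
```

In an application, `FormRecognizerConfig.fromEnv(System.getenv())` would supply the values for the builders in Client Creation, for example `.endpoint(config.getEndpoint())` and `.credential(new AzureKeyCredential(config.getKey()))`. Failing at startup on a missing variable tends to be easier to diagnose than an authentication error on the first analysis call.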