Build image analysis applications with Azure AI Vision SDK for Java. Use when implementing image captioning, OCR text extraction, object detection, tagging, or smart cropping.
Add this skill
npx mdskills install sickn33/azure-ai-vision-imageanalysis-java

Comprehensive SDK reference with clear code examples for all major image analysis features.
---
name: azure-ai-vision-imageanalysis-java
description: Build image analysis applications with Azure AI Vision SDK for Java. Use when implementing image captioning, OCR text extraction, object detection, tagging, or smart cropping.
package: com.azure:azure-ai-vision-imageanalysis
---

# Azure AI Vision Image Analysis SDK for Java

Build image analysis applications using the Azure AI Vision Image Analysis SDK for Java.

## Installation

```xml
<dependency>
  <groupId>com.azure</groupId>
  <artifactId>azure-ai-vision-imageanalysis</artifactId>
  <version>1.1.0-beta.1</version>
</dependency>
```

## Client Creation

### With API Key

```java
import com.azure.ai.vision.imageanalysis.ImageAnalysisClient;
import com.azure.ai.vision.imageanalysis.ImageAnalysisClientBuilder;
import com.azure.core.credential.KeyCredential;

String endpoint = System.getenv("VISION_ENDPOINT");
String key = System.getenv("VISION_KEY");

ImageAnalysisClient client = new ImageAnalysisClientBuilder()
    .endpoint(endpoint)
    .credential(new KeyCredential(key))
    .buildClient();
```

### Async Client

```java
import com.azure.ai.vision.imageanalysis.ImageAnalysisAsyncClient;

ImageAnalysisAsyncClient asyncClient = new ImageAnalysisClientBuilder()
    .endpoint(endpoint)
    .credential(new KeyCredential(key))
    .buildAsyncClient();
```

### With DefaultAzureCredential

```java
import com.azure.identity.DefaultAzureCredentialBuilder;

ImageAnalysisClient client = new ImageAnalysisClientBuilder()
    .endpoint(endpoint)
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildClient();
```

## Visual Features

| Feature | Description |
|---------|-------------|
| `CAPTION` | Generate human-readable image description |
| `DENSE_CAPTIONS` | Captions for up to 10 regions |
| `READ` | OCR - Extract text from images |
| `TAGS` | Content tags for objects, scenes, actions |
| `OBJECTS` | Detect objects with bounding boxes |
| `SMART_CROPS` | Smart thumbnail regions |
| `PEOPLE` | Detect people with locations |

## Core Patterns

### Generate Caption

```java
import com.azure.ai.vision.imageanalysis.models.*;
import com.azure.core.util.BinaryData;
import java.io.File;
import java.util.Arrays;

// From file
BinaryData imageData = BinaryData.fromFile(new File("image.jpg").toPath());

ImageAnalysisResult result = client.analyze(
    imageData,
    Arrays.asList(VisualFeatures.CAPTION),
    new ImageAnalysisOptions().setGenderNeutralCaption(true));

System.out.printf("Caption: \"%s\" (confidence: %.4f)%n",
    result.getCaption().getText(),
    result.getCaption().getConfidence());
```

### Generate Caption from URL

```java
ImageAnalysisResult result = client.analyzeFromUrl(
    "https://example.com/image.jpg",
    Arrays.asList(VisualFeatures.CAPTION),
    new ImageAnalysisOptions().setGenderNeutralCaption(true));

System.out.printf("Caption: \"%s\"%n", result.getCaption().getText());
```

### Extract Text (OCR)

```java
ImageAnalysisResult result = client.analyze(
    BinaryData.fromFile(new File("document.jpg").toPath()),
    Arrays.asList(VisualFeatures.READ),
    null);

for (DetectedTextBlock block : result.getRead().getBlocks()) {
    for (DetectedTextLine line : block.getLines()) {
        System.out.printf("Line: '%s'%n", line.getText());
        System.out.printf("  Bounding polygon: %s%n", line.getBoundingPolygon());

        for (DetectedTextWord word : line.getWords()) {
            System.out.printf("    Word: '%s' (confidence: %.4f)%n",
                word.getText(),
                word.getConfidence());
        }
    }
}
```

### Detect Objects

```java
ImageAnalysisResult result = client.analyzeFromUrl(
    imageUrl,
    Arrays.asList(VisualFeatures.OBJECTS),
    null);

// getObjects() returns a result wrapper; the detections are in getValues()
for (DetectedObject obj : result.getObjects().getValues()) {
    System.out.printf("Object: %s (confidence: %.4f)%n",
        obj.getTags().get(0).getName(),
        obj.getTags().get(0).getConfidence());

    ImageBoundingBox box = obj.getBoundingBox();
    System.out.printf("  Location: x=%d, y=%d, w=%d, h=%d%n",
        box.getX(), box.getY(), box.getWidth(), box.getHeight());
}
```

### Get Tags

```java
ImageAnalysisResult result = client.analyzeFromUrl(
    imageUrl,
    Arrays.asList(VisualFeatures.TAGS),
    null);

for (DetectedTag tag : result.getTags().getValues()) {
    System.out.printf("Tag: %s (confidence: %.4f)%n",
        tag.getName(),
        tag.getConfidence());
}
```

### Detect People

```java
ImageAnalysisResult result = client.analyzeFromUrl(
    imageUrl,
    Arrays.asList(VisualFeatures.PEOPLE),
    null);

for (DetectedPerson person : result.getPeople().getValues()) {
    ImageBoundingBox box = person.getBoundingBox();
    System.out.printf("Person at x=%d, y=%d (confidence: %.4f)%n",
        box.getX(), box.getY(), person.getConfidence());
}
```

### Smart Cropping

```java
ImageAnalysisResult result = client.analyzeFromUrl(
    imageUrl,
    Arrays.asList(VisualFeatures.SMART_CROPS),
    new ImageAnalysisOptions().setSmartCropsAspectRatios(Arrays.asList(1.0, 1.5)));

for (CropRegion crop : result.getSmartCrops().getValues()) {
    System.out.printf("Crop region: aspect=%.2f, x=%d, y=%d, w=%d, h=%d%n",
        crop.getAspectRatio(),
        crop.getBoundingBox().getX(),
        crop.getBoundingBox().getY(),
        crop.getBoundingBox().getWidth(),
        crop.getBoundingBox().getHeight());
}
```

### Dense Captions

```java
ImageAnalysisResult result = client.analyzeFromUrl(
    imageUrl,
    Arrays.asList(VisualFeatures.DENSE_CAPTIONS),
    new ImageAnalysisOptions().setGenderNeutralCaption(true));

for (DenseCaption caption : result.getDenseCaptions().getValues()) {
    System.out.printf("Caption: \"%s\" (confidence: %.4f)%n",
        caption.getText(),
        caption.getConfidence());
    System.out.printf("  Region: x=%d, y=%d, w=%d, h=%d%n",
        caption.getBoundingBox().getX(),
        caption.getBoundingBox().getY(),
        caption.getBoundingBox().getWidth(),
        caption.getBoundingBox().getHeight());
}
```

### Multiple Features

```java
ImageAnalysisResult result = client.analyzeFromUrl(
    imageUrl,
    Arrays.asList(
        VisualFeatures.CAPTION,
        VisualFeatures.TAGS,
        VisualFeatures.OBJECTS,
        VisualFeatures.READ),
    new ImageAnalysisOptions()
        .setGenderNeutralCaption(true)
        .setLanguage("en"));

// Access all results
System.out.println("Caption: " + result.getCaption().getText());
System.out.println("Tags: " + result.getTags().getValues().size());
System.out.println("Objects: " + result.getObjects().getValues().size());
System.out.println("Text blocks: " + result.getRead().getBlocks().size());
```

### Async Analysis

```java
asyncClient.analyzeFromUrl(
    imageUrl,
    Arrays.asList(VisualFeatures.CAPTION),
    null)
    .subscribe(
        result -> System.out.println("Caption: " + result.getCaption().getText()),
        error -> System.err.println("Error: " + error.getMessage()),
        () -> System.out.println("Complete")
    );
```

## Error Handling

```java
import com.azure.core.exception.HttpResponseException;

try {
    client.analyzeFromUrl(imageUrl, Arrays.asList(VisualFeatures.CAPTION), null);
} catch (HttpResponseException e) {
    System.out.println("Status: " + e.getResponse().getStatusCode());
    System.out.println("Error: " + e.getMessage());
}
```

## Environment Variables

```bash
VISION_ENDPOINT=https://<resource>.cognitiveservices.azure.com/
VISION_KEY=<your-api-key>
```

## Image Requirements

- Formats: JPEG, PNG, GIF, BMP, WEBP, ICO, TIFF, MPO
- Size: < 20 MB
- Dimensions: 50x50 to 16000x16000 pixels

## Regional Availability

Caption and Dense Captions require GPU-supported regions. Check [supported regions](https://learn.microsoft.com/azure/ai-services/computer-vision/concept-describe-images-40) before deployment.

## Trigger Phrases

- "image analysis Java"
- "Azure Vision SDK"
- "image captioning"
- "OCR image text extraction"
- "object detection image"
- "smart crop thumbnail"
- "detect people image"
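
## Validating Images Locally

The limits listed under Image Requirements can be checked locally before an image is sent for analysis. The following is a minimal sketch and not part of the Azure SDK: the `ImagePreflight` class, its `check` method, the file-extension list, and the reading of "20 MB" as 20 * 1024 * 1024 bytes are all assumptions made for illustration. It only inspects metadata the caller supplies and makes no service call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not an SDK class): checks image metadata against the documented limits. */
final class ImagePreflight {
    // Assumption: these file extensions stand in for the listed formats.
    private static final Set<String> EXTENSIONS = Set.of(
        "jpg", "jpeg", "png", "gif", "bmp", "webp", "ico", "tif", "tiff", "mpo");
    // Assumption: "20 MB" is interpreted as 20 * 1024 * 1024 bytes.
    private static final long MAX_BYTES = 20L * 1024 * 1024;
    private static final int MIN_SIDE = 50;
    private static final int MAX_SIDE = 16000;

    private ImagePreflight() { }

    /** Returns a list of problems; an empty list means the metadata fits the limits. */
    static List<String> check(String fileName, long sizeBytes, int width, int height) {
        List<String> problems = new ArrayList<>();

        // Derive the extension from the text after the last dot, if any.
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        if (!EXTENSIONS.contains(ext)) {
            problems.add("unsupported format: '" + ext + "'");
        }
        // The documented limit is strictly less than 20 MB.
        if (sizeBytes >= MAX_BYTES) {
            problems.add("file too large: " + sizeBytes + " bytes");
        }
        if (width < MIN_SIDE || height < MIN_SIDE) {
            problems.add("image too small: " + width + "x" + height);
        }
        if (width > MAX_SIDE || height > MAX_SIDE) {
            problems.add("image too large: " + width + "x" + height);
        }
        return problems;
    }

    public static void main(String[] args) {
        // A typical photo within all limits: prints []
        System.out.println(check("photo.jpg", 1_000_000L, 1920, 1080));
        // Wrong extension, oversized file, one side too small and one too large.
        System.out.println(check("scan.pdf", 25L * 1024 * 1024, 40, 20000));
    }
}
```

An empty list means the supplied metadata fits the documented limits. The caller still has to obtain the width and height from the image itself, and the service remains the final authority on what it accepts.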