# mcp-alphabanana

[npm package](https://www.npmjs.com/package/@tasopen/mcp-alphabanana) | [License](LICENSE)

English | [日本語](README.ja.md)

mcp-alphabanana is a Model Context Protocol (MCP) server for generating image assets with Google Gemini. It is built for MCP-compatible clients and agent workflows that need fast image generation, transparent outputs, reference-image guidance, and flexible delivery formats.

Keywords: MCP server, Model Context Protocol, Gemini AI, image generation, FastMCP

Key capabilities:
- Ultra-fast Gemini image generation across Flash and Pro tiers
- Transparent PNG/WebP asset output for web and game pipelines
- Multi-image style guidance with local reference image files
- Flexible file, base64, or combined outputs for agent workflows

## Quick Start

Run the MCP server with npx:

```bash
npx -y @tasopen/mcp-alphabanana
```

Or add it to your MCP configuration:

```json
{
  "mcp": {
    "servers": {
      "alphabanana": {
        "command": "npx",
        "args": ["-y", "@tasopen/mcp-alphabanana"],
        "env": {
          "GEMINI_API_KEY": "${env:GEMINI_API_KEY}"
        }
      }
    }
  }
}
```

Set `GEMINI_API_KEY` before starting the server.

## MCP Server

This repository provides an MCP server that enables AI agents to generate images using Google Gemini.

It can be used with MCP-compatible clients such as:

- Claude Desktop
- VS Code MCP
- Cursor

Built with [FastMCP 3](https://www.npmjs.com/package/fastmcp) for a simplified codebase and flexible output options.

Glama MCP Server badge:
<a href="https://glama.ai/mcp/servers/tasopen/mcp-alphabanana">
  <img width="380" height="200" src="https://glama.ai/mcp/servers/tasopen/mcp-alphabanana/badge" />
</a>

## Available Tools

### generate_image

Generates images using Google Gemini with optional transparency, local reference images, grounding, and reasoning metadata.

Key parameters:

- `prompt` (string): description of the image to generate
- `model`: `Flash3.1`, `Flash2.5`, `Pro3`, `flash`, `pro`
- `outputWidth` and `outputHeight`: requested final image size in pixels
- `output_resolution`: `0.5K`, `1K`, `2K`, `4K`
- `output_format`: `png`, `jpg`, `webp`
- `outputType`: `file`, `base64`, `combine`
- `outputPath`: required when `outputType` is `file` or `combine`
- `transparent`: enable transparent PNG/WebP post-processing
- `referenceImages`: optional array of local reference image files
- `grounding_type` and `thinking_mode`: advanced Gemini 3.1 controls

### Model Selection

| Input Model ID | Internal Model ID | Description |
| --- | --- | --- |
| `Flash3.1` | `gemini-3.1-flash-image-preview` | Ultra-fast, supports Thinking/Grounding. |
| `Flash2.5` | `gemini-2.5-flash-image` | Legacy Flash. High stability. Low cost. |
| `Pro3` | `gemini-3.0-pro-image-preview` | High-fidelity Pro model. |
| `flash` | `gemini-3.1-flash-image-preview` | Alias for backward compatibility. |
| `pro` | `gemini-3.0-pro-image-preview` | Alias for backward compatibility. |

### Parameters

Full parameter reference for the `generate_image` tool.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `prompt` | string | *required* | Description of the image to generate |
| `outputFileName` | string | *required* | Output filename (extension auto-added if missing) |
| `outputType` | enum | `combine` | `file`, `base64`, or `combine` |
| `model` | enum | `Flash3.1` | Model: `Flash3.1`, `Flash2.5`, `Pro3`, `flash`, `pro` |
| `output_resolution` | enum | auto | `0.5K`, `1K`, `2K`, `4K` (0.5K/2K/4K: Flash3.1 only) |
| `outputWidth` | integer | *required* | Final output width in pixels |
| `outputHeight` | integer | *required* | Final output height in pixels |
| `output_format` | enum | `png` | `png`, `jpg`, `webp` |
| `outputPath` | string | required for `file` / `combine` | Absolute output directory path |
| `transparent` | boolean | `false` | Transparent background (PNG/WebP only) |
| `transparentColor` | string or null | `null` | Color key override for transparency extraction |
| `colorTolerance` | integer | `30` | Transparency color matching tolerance |
| `fringeMode` | enum | `auto` | `auto`, `crisp`, `hd` |
| `resizeMode` | enum | `crop` | `crop`, `stretch`, `letterbox`, `contain` |
| `grounding_type` | enum | `none` | `none`, `text`, `image`, `both` (Flash3.1 only) |
| `thinking_mode` | enum | `minimal` | `minimal`, `high` (Flash3.1 only) |
| `include_thoughts` | boolean | `false` | Return model reasoning fields when metadata is enabled |
| `include_metadata` | boolean | `false` | Include grounding and reasoning metadata in JSON output |
| `referenceImages` | array | `[]` | Up to 14 local reference files (Flash3.1/Pro3), 3 for Flash2.5 |
| `debug` | boolean | `false` | Save intermediate debug artifacts |

## Why alphabanana?

- **Zero Watermarks:** API-native clean images.
- **Thinking/Grounding Support:** Higher prompt adherence and search-backed accuracy.
- **Production Ready:** Supports transparent WebP and exact aspect ratios for web and game assets.

## Features

- **Ultra-fast image generation** (Gemini 3.1 Flash, 0.5K/1K/2K/4K)
- **Advanced multi-image reasoning** (up to 14 reference images)
- **Thinking/Grounding support** (Flash3.1 only)
- **Transparent PNG/WebP output** (color-key post-processing, despill)
- **Multiple output formats**: file, base64, or both
- **Flexible resize modes**: crop, stretch, letterbox, contain
- **Multiple model tiers**: Flash3.1, Flash2.5, Pro3, legacy aliases

## Example Outputs

These sample outputs were generated with mcp-alphabanana and stored in [examples](examples). The samples cover three cases: a pixel art asset, a reference-image game scene, and a photorealistic generation.

## Configuration

Configure the `GEMINI_API_KEY` in your MCP configuration (for example, `mcp.json`).

Examples:

- Reference an OS environment variable from `mcp.json`:

```json
{
  "env": {
    "GEMINI_API_KEY": "${env:GEMINI_API_KEY}"
  }
}
```

- Provide the key directly in `mcp.json`:

```json
{
  "env": {
    "GEMINI_API_KEY": "your_api_key_here"
  }
}
```

### VS Code Integration

Add to your VS Code settings (`.vscode/settings.json` or user settings), configuring the server `env` in `mcp.json` or via the VS Code MCP settings.

```json
{
  "mcp": {
    "servers": {
      "mcp-alphabanana": {
        "command": "npx",
        "args": ["-y", "@tasopen/mcp-alphabanana"],
        "env": {
          "GEMINI_API_KEY": "${env:GEMINI_API_KEY}"
        }
      }
    }
  }
}
```

**Optional:** Set a custom fallback directory for write failures by adding `MCP_FALLBACK_OUTPUT` to the `env` object.

## Usage Examples

#### Basic Generation

```json
{
  "prompt": "A pixel art treasure chest, golden trim, wooden texture",
  "model": "Flash3.1",
  "outputFileName": "chest",
  "outputType": "base64",
  "outputWidth": 64,
  "outputHeight": 64,
  "transparent": true
}
```

#### Advanced (Vertical poster and thinking)

```json
{
  "prompt": "A vertical, photorealistic travel poster advertising Magical Wings Day Tours. A joyful young couple flies high above a breathtaking European countryside at golden hour, holding hands as they soar through a partly cloudy sky. Below them are vineyards, villages, forests, a winding river, and a hilltop medieval castle. The poster uses large, elegant typography with the headline FLY THE COUNTRYSIDE at the top and Magical Wings Day Tours branding near the bottom.",
  "model": "Flash3.1",
  "output_resolution": "1K",
  "outputFileName": "photoreal-travel-poster",
  "outputType": "file",
  "outputPath": "/path/to/output",
  "outputWidth": 848,
  "outputHeight": 1264,
  "output_format": "jpg",
  "thinking_mode": "high",
  "include_metadata": true
}
```

#### Grounding Sample (Search-backed)

```json
{
  "prompt": "A modern travel poster featuring today's weather and skyline highlights in Kuala Lumpur",
  "model": "Flash3.1",
  "outputFileName": "kl_travel_poster",
  "outputType": "base64",
  "outputWidth": 1024,
  "outputHeight": 1024,
  "grounding_type": "text",
  "thinking_mode": "high",
  "include_metadata": true,
  "include_thoughts": true
}
```

This sample enables Google Search grounding and returns grounding and reasoning metadata in JSON.

#### With Reference Images

```json
{
  "prompt": "Use the reference image to create a game screen showing an opened treasure chest filled with coins and treasure, 8-bit dungeon crawler style, after-battle reward scene, dungeon corridor background, four-party status UI at the bottom",
  "model": "Flash3.1",
  "output_resolution": "0.5K",
  "outputFileName": "reference-image-dungeon-loot",
  "outputType": "file",
  "outputPath": "/path/to/output",
  "outputWidth": 600,
  "outputHeight": 448,
  "output_format": "webp",
  "transparent": false,
  "referenceImages": [
    {
      "description": "Treasure chest style reference",
      "filePath": "/path/to/references/pixel-art-treasure-chest.png"
    }
  ]
}
```

## Transparency & Output Formats

- **PNG**: Full alpha, color-key + despill
- **WebP**: Full alpha, better compression (Flash3.1+)
- **JPEG**: No transparency (falls back to solid background)

## Development

```bash
# Development mode with MCP CLI
npm run dev

# MCP Inspector (Web UI)
npm run inspect

# Build for production
npm run build
```

## License

MIT
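
## Argument Check Sketch

The constraints listed in the Parameters table (required `outputPath`, Flash3.1-only options, reference-image limits) can be checked on the client side before calling `generate_image`. The following is a minimal sketch under stated assumptions: `checkGenerateImageArgs`, `GenerateImageArgs`, and `CANONICAL` are hypothetical names introduced here for illustration, and the rules are read off this README rather than the server's source, so the server itself may behave differently (for example, the README says JPEG falls back to a solid background rather than failing).

```typescript
// Hypothetical pre-flight check; not part of mcp-alphabanana itself.
// Rules mirror the README's parameter table, not the server's source code.

type Model = "Flash3.1" | "Flash2.5" | "Pro3" | "flash" | "pro";
type Tier = "Flash3.1" | "Flash2.5" | "Pro3";

interface GenerateImageArgs {
  prompt: string;
  outputFileName: string;
  outputWidth: number;
  outputHeight: number;
  model?: Model;
  outputType?: "file" | "base64" | "combine";
  outputPath?: string;
  output_format?: "png" | "jpg" | "webp";
  output_resolution?: "0.5K" | "1K" | "2K" | "4K";
  transparent?: boolean;
  grounding_type?: "none" | "text" | "image" | "both";
  thinking_mode?: "minimal" | "high";
  referenceImages?: { filePath: string; description?: string }[];
}

// Legacy aliases resolve to the same tiers as in the Model Selection table.
const CANONICAL: Record<Model, Tier> = {
  "Flash3.1": "Flash3.1",
  "Flash2.5": "Flash2.5",
  Pro3: "Pro3",
  flash: "Flash3.1",
  pro: "Pro3",
};

// Returns a list of human-readable problems; an empty list means none found.
function checkGenerateImageArgs(args: GenerateImageArgs): string[] {
  const problems: string[] = [];
  // Defaults follow the README: Flash3.1, combine, png.
  const model = CANONICAL[args.model ?? "Flash3.1"];
  const outputType = args.outputType ?? "combine";
  const format = args.output_format ?? "png";

  if (outputType !== "base64" && !args.outputPath) {
    problems.push("outputPath is required when outputType is file or combine");
  }
  if (args.transparent && format === "jpg") {
    problems.push("transparent applies to png and webp only");
  }
  const res = args.output_resolution;
  if (res !== undefined && res !== "1K" && model !== "Flash3.1") {
    problems.push(`output_resolution ${res} is Flash3.1 only`);
  }
  if (model !== "Flash3.1") {
    if ((args.grounding_type ?? "none") !== "none") {
      problems.push("grounding_type is Flash3.1 only");
    }
    if ((args.thinking_mode ?? "minimal") !== "minimal") {
      problems.push("thinking_mode is Flash3.1 only");
    }
  }
  // Up to 14 reference files for Flash3.1/Pro3, 3 for Flash2.5.
  const maxRefs = model === "Flash2.5" ? 3 : 14;
  if ((args.referenceImages?.length ?? 0) > maxRefs) {
    problems.push(`at most ${maxRefs} reference images for ${model}`);
  }
  return problems;
}

// Usage: the "Basic Generation" arguments from this README.
const basicProblems = checkGenerateImageArgs({
  prompt: "A pixel art treasure chest, golden trim, wooden texture",
  model: "Flash3.1",
  outputFileName: "chest",
  outputType: "base64",
  outputWidth: 64,
  outputHeight: 64,
  transparent: true,
});
console.log(basicProblems);
```

Returning a list of problems instead of throwing on the first one lets an agent report every issue in a single pass before it spends an API call.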