English | 日本語
mcp-alphabanana is a Model Context Protocol (MCP) server for generating image assets with Google Gemini. It is built for MCP-compatible clients and agent workflows that need fast image generation, transparent outputs, reference-image guidance, and flexible delivery formats.
Keywords: MCP server, Model Context Protocol, Gemini AI, image generation, FastMCP
Key capabilities:
Run the MCP server with npx:
npx -y @tasopen/mcp-alphabanana
Or add it to your MCP configuration:
{
  "mcp": {
    "servers": {
      "alphabanana": {
        "command": "npx",
        "args": ["-y", "@tasopen/mcp-alphabanana"],
        "env": {
          "GEMINI_API_KEY": "${env:GEMINI_API_KEY}"
        }
      }
    }
  }
}
Set GEMINI_API_KEY before starting the server.
This repository provides an MCP server that enables AI agents to generate images using Google Gemini.
It can be used with MCP-compatible clients such as:
Built with FastMCP 3 for a simplified codebase and flexible output options.
The generate_image tool generates images using Google Gemini, with optional transparency, local reference images, grounding, and reasoning metadata.
Key parameters:
- `prompt` (string): description of the image to generate
- `model`: `Flash3.1`, `Flash2.5`, `Pro3`, `flash`, `pro`
- `outputWidth` and `outputHeight`: requested final image size in pixels
- `output_resolution`: `0.5K`, `1K`, `2K`, `4K`
- `output_format`: `png`, `jpg`, `webp`
- `outputType`: `file`, `base64`, `combine`
- `outputPath`: required when `outputType` is `file` or `combine`
- `transparent`: enable transparent PNG/WebP post-processing
- `referenceImages`: optional array of local reference image files
- `grounding_type` and `thinking_mode`: advanced Gemini 3.1 controls

| Input Model ID | Internal Model ID | Description |
|---|---|---|
| `Flash3.1` | `gemini-3.1-flash-image-preview` | Ultra-fast, supports Thinking/Grounding. |
| `Flash2.5` | `gemini-2.5-flash-image` | Legacy Flash. High stability. Low cost. |
| `Pro3` | `gemini-3.0-pro-image-preview` | High-fidelity Pro model. |
| `flash` | `gemini-3.1-flash-image-preview` | Alias for backward compatibility. |
| `pro` | `gemini-3.0-pro-image-preview` | Alias for backward compatibility. |
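
The mapping from input model IDs to internal model IDs can be pictured as a simple lookup. The following is a minimal sketch based only on the table above; the names `MODEL_IDS`, `isInputModelId`, and `resolveModel` are hypothetical and are not taken from the server's source code.

```typescript
// Hypothetical lookup table mirroring the model table in this README.
// Illustration only; the server's real implementation may differ.
const MODEL_IDS = {
  "Flash3.1": "gemini-3.1-flash-image-preview",
  "Flash2.5": "gemini-2.5-flash-image",
  "Pro3": "gemini-3.0-pro-image-preview",
  "flash": "gemini-3.1-flash-image-preview", // backward-compatible alias
  "pro": "gemini-3.0-pro-image-preview", // backward-compatible alias
} as const;

type InputModelId = keyof typeof MODEL_IDS;

// Type guard: true only for names that appear in the table.
function isInputModelId(value: string): value is InputModelId {
  return Object.prototype.hasOwnProperty.call(MODEL_IDS, value);
}

// Resolve a user-facing model name to the internal Gemini model ID.
function resolveModel(input: string): string {
  if (!isInputModelId(input)) {
    const known = Object.keys(MODEL_IDS).join(", ");
    throw new Error(`Unknown model "${input}". Expected one of: ${known}`);
  }
  return MODEL_IDS[input];
}
```

With this sketch, `resolveModel("flash")` and `resolveModel("Flash3.1")` resolve to the same internal ID, which is what the alias rows describe.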
Full parameter reference for the generate_image tool.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt` | string | required | Description of the image to generate |
| `outputFileName` | string | required | Output filename (extension auto-added if missing) |
| `outputType` | enum | `combine` | `file`, `base64`, or `combine` |
| `model` | enum | `Flash3.1` | Model: `Flash3.1`, `Flash2.5`, `Pro3`, `flash`, `pro` |
| `output_resolution` | enum | auto | `0.5K`, `1K`, `2K`, `4K` (`0.5K`/`2K`/`4K`: Flash3.1 only) |
| `outputWidth` | integer | required | Final output width in pixels |
| `outputHeight` | integer | required | Final output height in pixels |
| `output_format` | enum | `png` | `png`, `jpg`, `webp` |
| `outputPath` | string | required for `file` / `combine` | Absolute output directory path |
| `transparent` | boolean | `false` | Transparent background (PNG/WebP only) |
| `transparentColor` | string or null | `null` | Color key override for transparency extraction |
| `colorTolerance` | integer | `30` | Transparency color matching tolerance |
| `fringeMode` | enum | `auto` | `auto`, `crisp`, `hd` |
| `resizeMode` | enum | `crop` | `crop`, `stretch`, `letterbox`, `contain` |
| `grounding_type` | enum | `none` | `none`, `text`, `image`, `both` (Flash3.1 only) |
| `thinking_mode` | enum | `minimal` | `minimal`, `high` (Flash3.1 only) |
| `include_thoughts` | boolean | `false` | Return model reasoning fields when metadata is enabled |
| `include_metadata` | boolean | `false` | Include grounding and reasoning metadata in JSON output |
| `referenceImages` | array | `[]` | Up to 14 local reference files (Flash3.1/Pro3), 3 for Flash2.5 |
| `debug` | boolean | `false` | Save intermediate debug artifacts |
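
Several rows in the table interact: `outputPath` depends on `outputType`, transparency depends on the output format, and some options are listed as Flash3.1 only. The sketch below shows how a client might check those combinations before calling the tool. The type `GenerateImageArgs` and the function `validateArgs` are hypothetical, cover only a subset of the parameters, and assume that dimensions must be positive integers. How the server itself reacts to these cases (rejecting, ignoring, or adjusting them) is not stated in this README.

```typescript
// Hypothetical client-side argument type covering a subset of the table.
interface GenerateImageArgs {
  prompt: string;
  outputFileName: string;
  outputWidth: number;
  outputHeight: number;
  outputType?: "file" | "base64" | "combine"; // default: combine
  model?: "Flash3.1" | "Flash2.5" | "Pro3" | "flash" | "pro"; // default: Flash3.1
  output_format?: "png" | "jpg" | "webp"; // default: png
  output_resolution?: "0.5K" | "1K" | "2K" | "4K";
  outputPath?: string;
  transparent?: boolean; // default: false
  grounding_type?: "none" | "text" | "image" | "both"; // default: none
  thinking_mode?: "minimal" | "high"; // default: minimal
}

// Assumption: width and height must be positive whole numbers.
function isPositiveInteger(n: number): boolean {
  return n > 0 && Math.floor(n) === n;
}

// Returns a list of human-readable problems; an empty list means no issue found.
function validateArgs(args: GenerateImageArgs): string[] {
  const problems: string[] = [];
  const outputType = args.outputType ?? "combine";
  const model = args.model ?? "Flash3.1";
  const format = args.output_format ?? "png";
  // "flash" is listed as an alias of the Flash3.1 model.
  const isFlash31 = model === "Flash3.1" || model === "flash";

  if (!isPositiveInteger(args.outputWidth) || !isPositiveInteger(args.outputHeight)) {
    problems.push("outputWidth and outputHeight must be positive integers");
  }
  if (outputType !== "base64" && !args.outputPath) {
    problems.push("outputPath is required when outputType is file or combine");
  }
  if (args.transparent && format === "jpg") {
    problems.push("transparent is only available for png and webp");
  }
  if (!isFlash31) {
    if ((args.grounding_type ?? "none") !== "none") {
      problems.push("grounding_type is listed as Flash3.1 only");
    }
    if (args.thinking_mode === "high") {
      problems.push("thinking_mode high is listed as Flash3.1 only");
    }
    const flashOnly = ["0.5K", "2K", "4K"];
    if (args.output_resolution && flashOnly.indexOf(args.output_resolution) !== -1) {
      problems.push("output_resolution 0.5K/2K/4K is listed as Flash3.1 only");
    }
  }
  return problems;
}
```

A check like this is optional. Its value is catching a missing `outputPath` or an unsupported model option before a generation request is spent.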
These sample outputs were generated with mcp-alphabanana and stored in examples.
Sample images: pixel art asset, reference-image game scene, photorealistic generation.
Configure the GEMINI_API_KEY in your MCP configuration (for example, mcp.json).
Examples:
Reference an environment variable from mcp.json:
{
  "env": {
    "GEMINI_API_KEY": "${env:GEMINI_API_KEY}"
  }
}
Or store the key directly in mcp.json:
{
  "env": {
    "GEMINI_API_KEY": "your_api_key_here"
  }
}
Add the server to your VS Code settings (.vscode/settings.json or user settings). The server env can be configured in mcp.json or through the VS Code MCP settings.
{
  "mcp": {
    "servers": {
      "mcp-alphabanana": {
        "command": "npx",
        "args": ["-y", "@tasopen/mcp-alphabanana"],
        "env": {
          "GEMINI_API_KEY": "${env:GEMINI_API_KEY}"
        }
      }
    }
  }
}
Optional: Set a custom fallback directory for write failures by adding MCP_FALLBACK_OUTPUT to the env object.
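If the configuration is generated by a script rather than edited by hand, the optional variable can be added conditionally. This is a minimal sketch: `buildServerEntry` and `ServerEnvOptions` are hypothetical names, and `/path/to/fallback` is a placeholder directory, not a value from the project.

```typescript
// Hypothetical options for generating a server entry.
interface ServerEnvOptions {
  apiKey: string;
  fallbackOutput?: string; // optional directory passed as MCP_FALLBACK_OUTPUT
}

// Build one server entry in the shape used by the configuration above.
function buildServerEntry(options: ServerEnvOptions) {
  if (options.apiKey.trim() === "") {
    throw new Error("GEMINI_API_KEY must not be empty");
  }
  const env: Record<string, string> = { GEMINI_API_KEY: options.apiKey };
  if (options.fallbackOutput) {
    // Only add the variable when a fallback directory was requested.
    env.MCP_FALLBACK_OUTPUT = options.fallbackOutput;
  }
  return { command: "npx", args: ["-y", "@tasopen/mcp-alphabanana"], env };
}

// Example: print a full settings object with the fallback directory set.
const settings = {
  mcp: {
    servers: {
      "mcp-alphabanana": buildServerEntry({
        apiKey: "${env:GEMINI_API_KEY}",
        fallbackOutput: "/path/to/fallback", // placeholder path
      }),
    },
  },
};
console.log(JSON.stringify(settings, null, 2));
```

Leaving `fallbackOutput` out produces the same entry as the configuration shown earlier, with only `GEMINI_API_KEY` in `env`.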
{
  "prompt": "A pixel art treasure chest, golden trim, wooden texture",
  "model": "Flash3.1",
  "outputFileName": "chest",
  "outputType": "base64",
  "outputWidth": 64,
  "outputHeight": 64,
  "transparent": true
}
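MCP clients normally wrap arguments like the pixel-art example above in a JSON-RPC 2.0 `tools/call` request on your behalf. To show what travels over the wire, here is a minimal sketch that builds that request by hand. The helper `buildGenerateImageRequest` is a hypothetical name; a real client also performs the MCP initialization handshake before sending tool calls, which is omitted here.

```typescript
// Shape of an MCP tool call expressed as a JSON-RPC 2.0 request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wrap generate_image arguments in a tools/call request.
function buildGenerateImageRequest(
  id: number,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "generate_image", arguments: args },
  };
}

// The arguments are copied from the pixel-art example.
const request = buildGenerateImageRequest(1, {
  prompt: "A pixel art treasure chest, golden trim, wooden texture",
  model: "Flash3.1",
  outputFileName: "chest",
  outputType: "base64",
  outputWidth: 64,
  outputHeight: 64,
  transparent: true,
});

console.log(JSON.stringify(request, null, 2));
```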
{
  "prompt": "A vertical, photorealistic travel poster advertising Magical Wings Day Tours. A joyful young couple flies high above a breathtaking European countryside at golden hour, holding hands as they soar through a partly cloudy sky. Below them are vineyards, villages, forests, a winding river, and a hilltop medieval castle. The poster uses large, elegant typography with the headline FLY THE COUNTRYSIDE at the top and Magical Wings Day Tours branding near the bottom.",
  "model": "Flash3.1",
  "output_resolution": "1K",
  "outputFileName": "photoreal-travel-poster",
  "outputType": "file",
  "outputPath": "/path/to/output",
  "outputWidth": 848,
  "outputHeight": 1264,
  "output_format": "jpg",
  "thinking_mode": "high",
  "include_metadata": true
}
{
  "prompt": "A modern travel poster featuring today's weather and skyline highlights in Kuala Lumpur",
  "model": "Flash3.1",
  "outputFileName": "kl_travel_poster",
  "outputType": "base64",
  "outputWidth": 1024,
  "outputHeight": 1024,
  "grounding_type": "text",
  "thinking_mode": "high",
  "include_metadata": true,
  "include_thoughts": true
}
This sample enables Google Search grounding and returns grounding and reasoning metadata in JSON.
{
  "prompt": "Use the reference image to create a game screen showing an opened treasure chest filled with coins and treasure, 8-bit dungeon crawler style, after-battle reward scene, dungeon corridor background, four-party status UI at the bottom",
  "model": "Flash3.1",
  "output_resolution": "0.5K",
  "outputFileName": "reference-image-dungeon-loot",
  "outputType": "file",
  "outputPath": "/path/to/output",
  "outputWidth": 600,
  "outputHeight": 448,
  "output_format": "webp",
  "transparent": false,
  "referenceImages": [
    {
      "description": "Treasure chest style reference",
      "filePath": "/path/to/references/pixel-art-treasure-chest.png"
    }
  ]
}
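The parameter reference limits `referenceImages` to 14 files for Flash3.1 and Pro3 and to 3 files for Flash2.5. A client assembling the array could enforce that limit up front, as in the sketch below. The names `MAX_REFERENCE_IMAGES` and `checkReferenceImages` are hypothetical, and the limits for the `flash` and `pro` aliases are an assumption inferred from the models they point to.

```typescript
// Entry shape taken from the reference-image example above.
interface ReferenceImage {
  description: string;
  filePath: string;
}

// Limits per model. The alias entries are an assumption based on the
// models the aliases resolve to.
const MAX_REFERENCE_IMAGES: Record<string, number> = {
  "Flash3.1": 14,
  "Pro3": 14,
  "Flash2.5": 3,
  "flash": 14,
  "pro": 14,
};

// Return the images unchanged if they fit the model's limit, otherwise throw.
function checkReferenceImages(
  model: string,
  images: ReferenceImage[]
): ReferenceImage[] {
  const limit = MAX_REFERENCE_IMAGES[model];
  if (limit === undefined) {
    throw new Error(`Unknown model "${model}"`);
  }
  if (images.length > limit) {
    throw new Error(
      `${model} accepts at most ${limit} reference images, got ${images.length}`
    );
  }
  return images;
}

// Example: a single style reference passes for Flash3.1.
const references = checkReferenceImages("Flash3.1", [
  {
    description: "Treasure chest style reference",
    filePath: "/path/to/references/pixel-art-treasure-chest.png",
  },
]);
console.log(references.length);
```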
# Development mode with MCP CLI
npm run dev
# MCP Inspector (Web UI)
npm run inspect
# Build for production
npm run build
MIT
Install MCP Alphabanana with a single command:
npx mdskills install tasopen/mcp-alphabanana
This downloads the skill files into your project, and your AI agent picks them up automatically.
MCP Alphabanana works with Claude Code, Claude Desktop, Cursor, VS Code Copilot, Windsurf, Continue Dev, Gemini CLI, Amp, Roo Code, and Goose. Skills use the open SKILL.md format, which is compatible with any AI coding agent that reads markdown instructions.