This is an MCP server that analyzes the screen with OmniParser and automatically operates the GUI. Confirmed on Windows. This is MIT license, but Excluding submodules and sub packages. OmniParser's repository is CC-BY-4.0. Each OmniParser model has a different license (reference). 1. Please do the following: (Other than Windows, use export instead of set.) (If you want langchainexample.py to work,
Add this skill
npx mdskills install NON906/omniparser-autogui-mcpEnables vision-powered GUI automation via OmniParser with screen analysis and control capabilities
1# omniparser-autogui-mcp23([日本語版はこちら](README_ja.md))45This is an [MCP server](https://modelcontextprotocol.io/introduction) that analyzes the screen with [OmniParser](https://github.com/microsoft/OmniParser) and automatically operates the GUI.6Confirmed on Windows.78## License notes910This is MIT license, but Excluding submodules and sub packages.11OmniParser's repository is CC-BY-4.0.12Each OmniParser model has a different license ([reference](https://github.com/microsoft/OmniParser?tab=readme-ov-file#model-weights-license)).1314## Installation15161. Please do the following:1718```19git clone --recursive https://github.com/NON906/omniparser-autogui-mcp.git20cd omniparser-autogui-mcp21uv sync22set OCR_LANG=en23uv run download_models.py24```2526(Other than Windows, use ``export`` instead of ``set``.)27(If you want ``langchain_example.py`` to work, ``uv sync --extra langchain`` instead.)28292. Add this to your ``claude_desktop_config.json``:3031```claude_desktop_config.json32{33 "mcpServers": {34 "omniparser_autogui_mcp": {35 "command": "uv",36 "args": [37 "--directory",38 "D:\\CLONED_PATH\\omniparser-autogui-mcp",39 "run",40 "omniparser-autogui-mcp"41 ],42 "env": {43 "PYTHONIOENCODING": "utf-8",44 "OCR_LANG": "en"45 }46 }47 }48}49```5051(Replace ``D:\\CLONED_PATH\\omniparser-autogui-mcp`` with the directory you cloned.)5253``env`` allows for the following additional configurations:5455- ``OMNI_PARSER_BACKEND_LOAD``56If it does not work with other clients (such as [LibreChat](https://github.com/danny-avila/LibreChat)), specify ``1``.5758- ``TARGET_WINDOW_NAME``59If you want to specify the window to operate, please specify the window name.60If not specified, operates on the entire screen.6162- ``OMNI_PARSER_SERVER``63If you want OmniParser processing to be done on another device, specify the server's address and port, such as ``127.0.0.1:8000``.64The server can be started with ``uv run omniparserserver``.6566- ``SSE_HOST``, ``SSE_PORT``67If specified, communication will be done via SSE instead of stdio.6869- ``SOM_MODEL_PATH``, ``CAPTION_MODEL_NAME``, ``CAPTION_MODEL_PATH``, ``OMNI_PARSER_DEVICE``, ``BOX_TRESHOLD``70These are for OmniParser configuration.71Usually, they are not necessary.7273## Usage Examples7475- Search for "MCP server" in the on-screen browser.7677etc.
Full transparency — inspect the skill content before installing.