Desktop UI automation for AI agents: screenshots, window management, mouse, keyboard, UI Automation tree, OCR. Single Windows x64 binary, no dependencies.
Version: 3.10.1 | Tools: 19 | AI generated: 100%
MCP server for automated desktop UI testing. A single binary — no runtime, no dependencies, no installation.
Windows x64 only. macOS and Linux support is planned.
Gives AI agents eyes and hands: screenshots, window management, mouse, keyboard, UI Automation, OCR, file search.
AI agents can trigger actions in applications but can't see the screen. This server bridges that gap:
Agent triggers action → takes screenshot → sees the result →
switches window → clicks a button → verifies → writes report
Fully autonomous, no user involvement required.
10 tasks. One take. Watch on YouTube →
Claude Cowork now includes built-in Computer Use — Claude takes screenshots and clicks through interfaces visually. It works with zero setup. MCP Test Utils takes a different approach: instead of guessing where to click from a screenshot, it reads the actual UI structure through Windows APIs.
| | MCP Test Utils | Computer Use |
|---|---|---|
| Click precision | Exact — UI Automation API | Visual estimate from screenshot |
| Speed & token cost | Fast, low cost — text responses | Slower, costly — image on every step |
| UI structure | Full tree: roles, states, coordinates | Not available |
| OCR | Word-level coordinates, multi-language | Not available (model vision only) |
| Window management | API-based, window-relative coords | Visual navigation |
| File search | Ripgrep engine built-in | Not available |
| Session logging | JSONL + screenshots | Not available |
| Visual analysis | ✅ Same Claude model, full-res 1:1 | ✅ Same Claude model |
| Setup | Download binary, add to config | Built-in, one toggle |
| Mobile / Dispatch | — | ✅ Tasks from phone |
| Cross-platform | Windows (macOS/Linux planned) | macOS + Windows |
MCP Test Utils is faster, more precise, and cheaper per action. Computer Use is easier to start and works across platforms. They complement each other.
| Platform | Status |
|---|---|
| Windows x64 | ✅ Full support |
| macOS arm64 | ⏳ Planned |
| Linux x64 | ⏳ Planned |
| Tool | Description |
|---|---|
| `take_screenshot` | Screenshot of the entire desktop with configurable quality |
| `take_window_screenshot` | Screenshot of a specific window (screen or window capture mode) |
| `read_screen_text` | OCR the entire screen (Windows.Media.Ocr) |
| `read_region_text` | OCR a screen region with precise word coordinates |

| Tool | Description |
|---|---|
| `list_windows` | List windows with id, title, app, position, size, minimized, focused |
| `focus_window` | Bring a window to front, restore if minimized |

| Tool | Description |
|---|---|
| `mouse_click` | Click (left / right / middle) at screen or window-relative coordinates |
| `mouse_move` | Move cursor to a point |
| `mouse_drag` | Drag from point A to point B |
| `mouse_scroll` | Scroll the mouse wheel |
| `keyboard_type` | Type text (full Unicode: Latin, Cyrillic, CJK, emoji) |
| `keyboard_press` | Press a key (Enter, Tab, F1–F12, arrows, etc.) |
| `keyboard_shortcut` | Key combinations (Ctrl+S, Alt+F4, Ctrl+Shift+P, etc.) |

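The window-relative coordinates accepted by `mouse_click` can be pictured as an offset from the window's top-left corner. Below is a minimal sketch of that conversion, assuming a window record that carries its screen position under the keys `x` and `y`. Those field names are hypothetical and are not taken from the server's actual `list_windows` output.

```python
from typing import Mapping, Tuple


def window_to_screen(window: Mapping[str, int], rel_x: int, rel_y: int) -> Tuple[int, int]:
    """Convert a point relative to a window's top-left corner into screen coordinates.

    `window` is assumed to hold the window's screen position under the
    hypothetical keys "x" and "y".
    """
    return window["x"] + rel_x, window["y"] + rel_y


# Hypothetical window whose top-left corner sits at (200, 100) on the screen.
editor = {"x": 200, "y": 100, "width": 800, "height": 600}
print(window_to_screen(editor, 50, 30))  # (250, 130)
```

Working in window-relative coordinates means a click target stays valid when the window is moved, as long as its layout is unchanged.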
| Tool | Description |
|---|---|
| `list_ui_elements` | UI Automation tree: buttons, fields, menus with exact coordinates |

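To illustrate how an element tree with exact coordinates leads to a precise click, here is a small sketch that searches a tree for an element and returns the center of its bounding box. The tree shape is an assumption made for the example: the keys `role`, `name`, `bounds` (as `[x, y, width, height]`), and `children` are hypothetical and may differ from what `list_ui_elements` actually returns.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

Element = Dict[str, Any]


def walk(node: Element) -> Iterator[Element]:
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def find_center(root: Element, role: str, name: str) -> Optional[Tuple[int, int]]:
    """Return the center of the first element matching role and name, or None."""
    for element in walk(root):
        if element.get("role") == role and element.get("name") == name:
            x, y, width, height = element["bounds"]  # hypothetical layout
            return x + width // 2, y + height // 2
    return None


# Hypothetical tree with one window containing one button.
sample_tree = {
    "role": "window", "name": "Editor", "bounds": [0, 0, 800, 600],
    "children": [
        {"role": "button", "name": "Save", "bounds": [100, 50, 80, 24], "children": []},
    ],
}
print(find_center(sample_tree, "button", "Save"))  # (140, 62)
```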
| Tool | Description |
|---|---|
| `search_in_files` | Search text or regex in files within allowed directories (like VS Code Find in Files) |
| `find_files` | Find files and directories by name pattern (glob), like "Go to File" |

| Tool | Description |
|---|---|
| `get_usage_guide` | Compact workflow guide for LLM agents: precision clicking, coordinate metadata, quality tips |

| Tool | Description |
|---|---|
| `enable_logging` | Start recording tool calls to JSONL + screenshots (opt-in) |
| `disable_logging` | Stop recording, get session stats |

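Because sessions are recorded as JSONL (one JSON object per line), a log can be summarized with a few lines of code. The sketch below counts calls per tool. The record layout is not documented here, so the `tool` key is a hypothetical field name chosen for illustration.

```python
import json
from collections import Counter
from typing import Iterable


def count_tool_calls(lines: Iterable[str]) -> Counter:
    """Count records per tool name in a JSONL stream, skipping blank lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        counts[record["tool"]] += 1  # "tool" is an assumed field name
    return counts


# Hypothetical log content; a real session file would be opened and iterated instead.
sample_log = [
    '{"tool": "take_screenshot"}',
    '{"tool": "mouse_click"}',
    "",
    '{"tool": "mouse_click"}',
]
print(count_tool_calls(sample_log).most_common())
```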
Claude Desktop: %APPDATA%\Claude\claude_desktop_config.json

    {
      "mcpServers": {
        "test-utils": {
          "command": "D:\\path\\to\\mcp-test-utils.exe"
        }
      }
    }

With logging and file search enabled through environment variables:

    {
      "mcpServers": {
        "test-utils": {
          "command": "D:\\path\\to\\mcp-test-utils.exe",
          "env": {
            "MCP_LOG_DIR": "D:\\path\\to\\logs",
            "MCP_LOG_MAX_MB": "500",
            "MCP_LOG_RETAIN_DAYS": "30",
            "MCP_SEARCH_DIRS": "D:\\Projects\\app1;D:\\Projects\\app2"
          }
        }
      }
    }

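Editing the config by hand risks overwriting other servers already registered in the file. As a hedged sketch, the helper below merges one entry into `mcpServers` and leaves everything else untouched. The function name `register_server` is made up for this example, and the demo runs against a temporary file instead of the real config.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional


def register_server(config_path: Path, name: str, command: str,
                    env: Optional[Dict[str, str]] = None) -> dict:
    """Add or replace one entry under "mcpServers", keeping all other settings."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    entry = {"command": command}
    if env:
        entry["env"] = dict(env)
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Demo against a throwaway file. For real use, point config_path at
# the claude_desktop_config.json file under %APPDATA% and Claude.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    existing = {"mcpServers": {"other": {"command": "other.exe"}}}
    path.write_text(json.dumps(existing), encoding="utf-8")
    demo_config = register_server(
        path,
        "test-utils",
        r"D:\path\to\mcp-test-utils.exe",
        env={"MCP_LOG_DIR": r"D:\path\to\logs"},
    )
    print(json.dumps(demo_config, indent=2))
```

After changing the config, restart Claude Desktop so it picks up the new server entry.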
Screenshots support configurable quality to balance detail and token cost:
| Preset | Scale | Format | Use Case |
|---|---|---|---|
| `full` | 100% | JPEG q90 | Maximum detail |
| `standard` | 50% | JPEG q70 | Balanced (default) |
| `compact` | 50% | PNG | When PNG is needed |
| `minimal` | 25% | Grayscale | Lowest token cost |
| `custom` | 10–100% | JPEG / PNG / Grayscale | Full control |

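The scale column translates directly into image dimensions. The sketch below computes the size of a screenshot for each named preset and maps a point found in a scaled image back to full-resolution coordinates. The rounding rule is an assumption, and the server may already report coordinate metadata that makes the manual mapping unnecessary.

```python
from typing import Tuple

# Scale factors taken from the preset table above.
PRESET_SCALE = {"full": 1.0, "standard": 0.5, "compact": 0.5, "minimal": 0.25}


def scaled_size(width: int, height: int, preset: str = "standard") -> Tuple[int, int]:
    """Return the image size produced by a preset (rounding is an assumption)."""
    scale = PRESET_SCALE[preset]
    return max(1, round(width * scale)), max(1, round(height * scale))


def to_full_resolution(x: int, y: int, preset: str) -> Tuple[int, int]:
    """Map a point in a scaled screenshot back to full-resolution coordinates."""
    scale = PRESET_SCALE[preset]
    return round(x / scale), round(y / scale)


for preset in PRESET_SCALE:
    print(preset, scaled_size(1920, 1080, preset))
```

For a 1920x1080 desktop this gives 960x540 at `standard` and 480x270 at `minimal`, which is where the token savings come from.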
| Variable | Description | Default |
|---|---|---|
| `MCP_LOG_DIR` | Path for log sessions. Without it, logging tools are hidden | (none) |
| `MCP_LOG_MAX_MB` | Session size limit (warning on exceed) | 500 |
| `MCP_LOG_RETAIN_DAYS` | Auto-delete sessions older than N days. 0 to disable | 30 |
| `MCP_SEARCH_DIRS` | Allowed directories for `search_in_files` (`;` on Windows, `:` on macOS/Linux). Without it, the tool is hidden | (none) |

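The allow-list behavior of `MCP_SEARCH_DIRS` can be illustrated with a short sketch that splits the variable on `;` and checks whether a path falls inside one of the listed directories. This is not the server's implementation. It is a purely lexical check written for illustration, and the helper names are hypothetical.

```python
from pathlib import PureWindowsPath
from typing import List


def parse_search_dirs(value: str, sep: str = ";") -> List[PureWindowsPath]:
    """Split an MCP_SEARCH_DIRS-style value into paths, ignoring empty entries."""
    return [PureWindowsPath(part.strip()) for part in value.split(sep) if part.strip()]


def is_allowed(path: str, allowed: List[PureWindowsPath]) -> bool:
    """Return True if path is one of the allowed directories or lies below one.

    Lexical check only: paths containing ".." are rejected outright, and
    symlinks are not resolved, which a real implementation would have to do.
    """
    candidate = PureWindowsPath(path)
    if ".." in candidate.parts:
        return False
    return any(candidate == root or root in candidate.parents for root in allowed)


allowed_dirs = parse_search_dirs("D:\\Projects\\app1;D:\\Projects\\app2")
print(is_allowed("D:\\Projects\\app1\\src\\main.rs", allowed_dirs))  # True
print(is_allowed("D:\\Projects\\app3\\notes.txt", allowed_dirs))     # False
```

`PureWindowsPath` compares case-insensitively, which matches how Windows treats paths and lets the sketch run on any platform.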
MCP Test Utils is a JSON-RPC 2.0 server communicating over stdin/stdout. Any MCP-compatible client launches the binary, sends tool calls, and receives structured responses (text, base64 images). Tested with Claude Desktop.
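As a sketch of what a client puts on the server's stdin, the helper below builds a JSON-RPC 2.0 `tools/call` request as one newline-terminated JSON message and unpacks a response line. The `tools/call` method and the `name`/`arguments` parameters follow the MCP specification. The argument names passed to any particular tool (such as `quality` here) are assumptions, and a real client must complete the MCP `initialize` handshake before calling tools.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)


def tool_call_request(name: str, arguments: Dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 tools/call request as a single line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message) + "\n"


def parse_response(line: str) -> Dict[str, Any]:
    """Return the result of a JSON-RPC response, raising on an error reply."""
    message = json.loads(line)
    if "error" in message:
        error = message["error"]
        raise RuntimeError(f"JSON-RPC error {error.get('code')}: {error.get('message')}")
    return message["result"]


# "quality" is a hypothetical argument name used only for illustration.
print(tool_call_request("take_screenshot", {"quality": "standard"}), end="")

# A hand-written response in the shape MCP uses for text content.
reply = '{"jsonrpc": "2.0", "id": 1, "result": {"content": [{"type": "text", "text": "ok"}]}}'
print(parse_response(reply)["content"][0]["text"])
```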
The server uses native Windows APIs directly — Win32 GDI for screenshots, SendInput for mouse and keyboard, UI Automation COM API for element inspection, WinRT Windows.Media.Ocr for text recognition. File search uses the ripgrep engine (grep-regex, grep-searcher, ignore) — cross-platform, no external dependencies. No PowerShell, no external tools, no network access.
Only the directories listed in MCP_SEARCH_DIRS are accessible to file search.

Free and unrestricted. If you find it useful: jeenyjai.github.io
Copyright 2026 JeenyJAI. All rights reserved.
🚀 Created with Claude