Gives AI agents full keyboard, mouse, and screen access to a physical PC via a KVM server and OCR.
MCP (Model Context Protocol) server that gives AI agents full keyboard, mouse, and screen access to a physical PC. It is a thin client for serial-hid-kvm: all hardware control is delegated to that server over TCP.
Claude / AI Agent
↕ MCP (stdio)
mcp-serial-hid-kvm ← this package (thin client + OCR)
↕ TCP (localhost:9329)
serial-hid-kvm ← standalone KVM server (owns hardware)
↕ USB Serial + HDMI
Target PC
The KVM server (serial-hid-kvm) runs as a persistent process owning the serial port and capture device. This MCP server connects to it as a TCP client. Multiple MCP instances (multiple Claude sessions) can share a single KVM server without device conflicts.
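Because the MCP server is only a TCP client, a quick way to rule out connection problems is to check that something is listening on the KVM server's address. The following is a minimal sketch using only the Python standard library; the helper name `kvm_server_reachable` is hypothetical and not part of either package, and it only tests that the port accepts connections, not that the peer speaks the KVM API.

```python
import socket


def kvm_server_reachable(host: str = "127.0.0.1", port: int = 9329,
                         timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it does not exchange any KVM API messages,
    it only confirms that a listener accepts the connection.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("KVM server reachable:", kvm_server_reachable())
```

If this reports `False`, start `serial-hid-kvm --api` first or check the host and port settings.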
pip install -e /path/to/serial-hid-kvm
serial-hid-kvm --api # with preview window
serial-hid-kvm --api --headless # or headless
Tesseract OCR (required for get_screen_text / execute_and_read):
sudo apt install tesseract-ocr
pip install -e .
This automatically installs serial-hid-kvm as a dependency.
{
"mcpServers": {
"kvm": {
"command": "mcp-serial-hid-kvm"
}
}
}
Custom KVM server address:
{
"mcpServers": {
"kvm": {
"command": "mcp-serial-hid-kvm",
"env": {
"SHKVM_API_HOST": "127.0.0.1",
"SHKVM_API_PORT": "9329"
}
}
}
}
| Variable | Default | Description |
|---|---|---|
| `SHKVM_API_HOST` | `127.0.0.1` | KVM server address |
| `SHKVM_API_PORT` | `9329` | KVM server port |
| `SHKVM_OCR_CMD` | auto-detect | Path to tesseract executable |
| `SHKVM_CAPTURE_LOG_DIR` | platform default | Capture log directory (empty string to disable) |
Hardware settings (SHKVM_SERIAL_PORT, SHKVM_SCREEN_WIDTH, etc.) are configured on the KVM server side, not here. If the target PC uses a non-US keyboard, set --target-layout (or SHKVM_TARGET_LAYOUT) on the KVM server so that type_text and send_key produce correct characters.
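The client-side variables above could be resolved roughly as follows. This is a sketch under stated assumptions, not the contents of the package's `config.py`: the `ClientConfig` class and `load_config` function are hypothetical names, and using `shutil.which("tesseract")` for auto-detection is a guess at what "auto-detect" means.

```python
import os
import shutil
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ClientConfig:
    api_host: str
    api_port: int
    ocr_cmd: Optional[str]          # None if no tesseract executable was found
    capture_log_dir: Optional[str]  # None = platform default, "" = logging disabled


def load_config(env: Mapping[str, str] = os.environ) -> ClientConfig:
    """Build a config from SHKVM_* variables, falling back to the documented defaults."""
    # Assumption: "auto-detect" is approximated by searching PATH for tesseract.
    ocr_cmd = env.get("SHKVM_OCR_CMD") or shutil.which("tesseract")
    return ClientConfig(
        api_host=env.get("SHKVM_API_HOST", "127.0.0.1"),
        api_port=int(env.get("SHKVM_API_PORT", "9329")),
        ocr_cmd=ocr_cmd,
        # An unset variable and an empty string are kept distinct on purpose.
        capture_log_dir=env.get("SHKVM_CAPTURE_LOG_DIR"),
    )
```

Keeping "unset" and "empty string" distinct for `SHKVM_CAPTURE_LOG_DIR` mirrors the table: only an explicitly empty value disables capture logging.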
| Tool | Description |
|---|---|
| `type_text` | Type text with inline tags, e.g. `ls -la{enter}`, `{ctrl+c}`, `{alt+f4}`. Tags are whitelist-based: an unknown `{content}` is typed literally. Raw mode (`raw=true`) disables tag parsing, and actual line breaks become Enter. `char_delay_ms` sets the delay between keystrokes in ms (default: 20). Only printable ASCII characters, tab, and line breaks are supported; other characters (Unicode, CJK, etc.) cause an error, so use base64 encoding as a workaround |
| `send_key` | Single key press with modifiers |
| `send_key_sequence` | Multiple key steps with per-step delays. `default_delay_ms` sets the delay between steps in ms (default: 100); each step can override it with `delay_ms` |
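To illustrate the whitelist-based tag behaviour and the base64 workaround described for `type_text`, here is a small sketch. It is not the parser used by serial-hid-kvm: the key and modifier sets, the function names, and the `echo ... | base64 -d` command (which assumes a POSIX shell with `base64` on the target PC) are all assumptions made for illustration.

```python
import base64
import re
import string

# Hypothetical whitelist; the real set of recognized names lives in serial-hid-kvm.
KNOWN_KEYS = {"enter", "tab", "esc", "f4", "c"}
KNOWN_MODIFIERS = {"ctrl", "alt", "shift"}
TAG_RE = re.compile(r"\{([^{}]+)\}")

# Printable ASCII plus tab and line breaks (vertical tab and form feed excluded).
TYPABLE = set(string.printable) - {"\x0b", "\x0c"}


def is_known_tag(content: str) -> bool:
    """A tag is known if its last part is a known key and the rest are modifiers."""
    *mods, key = content.lower().split("+")
    return key in KNOWN_KEYS and all(m in KNOWN_MODIFIERS for m in mods)


def tokenize(text: str) -> list[tuple[str, str]]:
    """Split text into ("text", ...) and ("key", ...) tokens."""
    tokens: list[tuple[str, str]] = []
    pos = 0
    for match in TAG_RE.finditer(text):
        if not is_known_tag(match.group(1)):
            continue  # unknown {content} stays part of the literal text
        if match.start() > pos:
            tokens.append(("text", text[pos:match.start()]))
        tokens.append(("key", match.group(1).lower()))
        pos = match.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens


def is_typable(text: str) -> bool:
    """True if every character is printable ASCII, tab, or a line break."""
    return all(ch in TYPABLE for ch in text)


def base64_echo_command(text: str) -> str:
    """Wrap arbitrary Unicode text in an ASCII-only shell command (assumed workaround)."""
    payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return f"echo {payload} | base64 -d"
```

For example, `tokenize("echo {hello}{ctrl+c}")` keeps `{hello}` inside the literal text and emits a single key token for `ctrl+c`, while `base64_echo_command` turns non-ASCII text into a string that passes `is_typable`.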
| Tool | Description |
|---|---|
| `mouse_move` | Move cursor (absolute or relative) |
| `mouse_click` | Click at optional position |
| `mouse_drag` | Drag from one position to another (drag-and-drop, text selection, etc.) |
| `mouse_scroll` | Scroll wheel |
| Tool | Description |
|---|---|
| `capture_screen` | Capture screen as image (high token cost) |
| `get_screen_text` | Capture + OCR to text (preferred for text content) |
| `execute_and_read` | Type command, Enter, wait, capture + OCR |
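The `execute_and_read` sequence (type, Enter, wait, capture, OCR) can be pictured as the composition below. This is only a sketch of the flow: the three callables stand in for the real keyboard, capture, and OCR steps, and the parameter names, including `wait_seconds`, are hypothetical rather than the tool's actual arguments.

```python
import time
from typing import Callable


def execute_and_read(
    command: str,
    type_text: Callable[[str], None],    # stand-in for the type_text tool
    capture_frame: Callable[[], bytes],  # stand-in for fetching a frame from the KVM server
    ocr: Callable[[bytes], str],         # stand-in for the local OCR step
    wait_seconds: float = 1.0,           # hypothetical parameter name
) -> str:
    """Type a command followed by Enter, wait, then return the OCR text of the screen."""
    type_text(command + "{enter}")
    time.sleep(wait_seconds)  # give the target PC time to produce output
    return ocr(capture_frame())
```

Passing the steps in as callables keeps the sketch independent of the real TCP API and makes the flow easy to exercise with fakes.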
| Tool | Description |
|---|---|
| `get_device_info` | Serial port, capture device, config info |
| `list_capture_devices` | List available video devices |
| `set_capture_device` | Switch capture device |
| `set_capture_resolution` | Change capture resolution |
This package is intentionally minimal (~4 files):
mcp_serial_hid_kvm/
server.py MCP tool handlers → KvmClient TCP calls
config.py KVM host/port, tesseract, log settings
ocr.py Tesseract OCR (runs locally on fetched frames)
__init__.py
All keyboard/mouse/capture logic lives in serial-hid-kvm. This package only translates MCP tool calls to TCP API calls and runs OCR locally.
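As an illustration of running OCR locally on a fetched frame, the sketch below calls the Tesseract command-line tool on an image file. It is not the package's `ocr.py`: the function names are hypothetical, and it assumes the frame has already been saved to disk and that the `eng` language data is installed.

```python
import subprocess
from typing import List


def build_ocr_command(tesseract_cmd: str, image_path: str, lang: str = "eng") -> List[str]:
    """Build a Tesseract invocation that writes recognized text to standard output."""
    # Using "stdout" as the output base makes tesseract print the text instead of writing a file.
    return [tesseract_cmd, image_path, "stdout", "-l", lang]


def ocr_image(image_path: str, tesseract_cmd: str = "tesseract", lang: str = "eng") -> str:
    """Run Tesseract on an image file and return the recognized text."""
    result = subprocess.run(
        build_ocr_command(tesseract_cmd, image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract exits with an error
    )
    return result.stdout
```

The `tesseract_cmd` argument corresponds to the role of `SHKVM_OCR_CMD`: it lets a non-default executable path be supplied when Tesseract is not on `PATH`.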
serial-hid-kvm works without MCP (interactive preview, scripts, other AI frameworks)