Yet another CLI tool for testing MCP servers
A powerful, feature-rich command-line interface for interacting with Model Context Protocol servers. This client enables seamless communication with LLMs through integration with the CHUK Tool Processor and CHUK-LLM, providing tool usage, conversation management, and multiple operational modes.
Default Configuration: MCP CLI defaults to using Ollama with the gpt-oss reasoning model for local, privacy-focused operation without requiring API keys.
- --vm flag: Enable OS-style virtual memory for conversation context management, powered by chuk-ai-session-manager
- --vm-budget: Control the token budget for conversation events (the system prompt is uncapped on top), forcing earlier eviction and page creation
- --vm-mode: Choose the VM mode — passive (runtime-managed, default), relaxed (VM-aware conversation), or strict (model-driven paging with tools)
- /memory command: Visualize VM state during conversations — page table, working-set utilization, eviction metrics, TLB stats (aliases: /vm, /mem)
- /memory page <id> --download: Export page content to local files with modality-aware extensions (.txt, .json, .png)
- /plan command: Create, inspect, and execute reproducible tool call graphs — create, list, show, run, delete, resume
- --plan-tools: The LLM autonomously creates and executes plans during conversation — no /plan command needed. It calls plan_create_and_execute when multi-step orchestration is required, and uses regular tools for simple tasks. Each step renders with real-time progress in the terminal
- Plan variables: ${var} and ${var.field} nested access, plus template strings like "https://${api.host}/users" — type-preserving for single refs
- _meta.ui annotations automatically open in the browser when called
- --log-file flag enables rotating JSON log files (10MB, 3 backups) at DEBUG level
- tool_timeout and init_timeout overrides, resolved per-server → global → default
- /health command, health-check-on-failure diagnostics, and optional --health-interval background polling
- /usage command (aliases: /tokens, /cost)
- /sessions and /export commands
- --dashboard flag: Launch a real-time browser dashboard alongside chat mode
- /attach command: Stage files for the next message — images, text/code, and audio (aliases: /file, /image)
- --attach CLI flag: Attach files to the first message (repeatable: --attach img.png --attach code.py)
- @file: references: Mention @file:path/to/file anywhere in a message to attach it

The MCP CLI is built on a modular architecture with clean separation of concerns:
MCP CLI supports all providers and models from CHUK-LLM, including cutting-edge reasoning models:
| Provider | Key Models | Special Features |
|---|---|---|
| Ollama (Default) | 🧠 gpt-oss, llama3.3, llama3.2, qwen3, qwen2.5-coder, deepseek-coder, granite3.3, mistral, gemma3, phi3, codellama | Local reasoning models, privacy-focused, no API key required |
| OpenAI | 🚀 GPT-5 family (gpt-5, gpt-5-mini, gpt-5-nano), GPT-4o family, O3 series (o3, o3-mini) | Advanced reasoning, function calling, vision |
| Anthropic | 🧠 Claude 4.5 family (claude-4-5-opus, claude-4-5-sonnet), Claude 3.5 Sonnet | Enhanced reasoning, long context |
| Azure OpenAI 🏢 | Enterprise GPT-5, GPT-4 models | Private endpoints, compliance, audit logs |
| Google Gemini | Gemini 2.0 Flash, Gemini 1.5 Pro | Multimodal, fast inference |
| Groq ⚡ | Llama 3.1 models, Mixtral | Ultra-fast inference (500+ tokens/sec) |
| Perplexity 🌐 | Sonar models | Real-time web search with citations |
| IBM watsonx 🏢 | Granite, Llama models | Enterprise compliance |
| Mistral AI 🇪🇺 | Mistral Large, Medium | European, efficient models |
- _meta.ui annotations automatically launch browser apps on tool call
- With --plan-tools, the LLM autonomously decides when to plan — it calls plan_create_and_execute for complex multi-step tasks and uses regular tools for simple ones
- Plan steps can save results (result_variable), referenced by later steps as ${var} or ${var.field}
- Plans are stored in ~/.mcp-cli/plans/

Comprehensive documentation is available in the docs/ directory.
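To make the ${var} / ${var.field} templating concrete, here is a minimal sketch of how such substitution could behave; it is an illustrative model only, not mcp-cli's actual implementation, and the `resolve` helper and sample variables are hypothetical:

```python
import re

REF = re.compile(r"\$\{([\w.]+)\}")

def resolve(template, variables):
    """Sketch of plan-variable substitution: a lone ${ref} keeps its
    original type, while refs embedded in a longer string are
    interpolated as text (hypothetical helper for illustration)."""
    def lookup(path):
        value = variables
        for part in path.split("."):   # ${var.field} nested access
            value = value[part]
        return value

    whole = REF.fullmatch(template)
    if whole:                          # single ref: type-preserving
        return lookup(whole.group(1))
    return REF.sub(lambda m: str(lookup(m.group(1))), template)

plan_vars = {"api": {"host": "example.org"}, "count": 3}
url = resolve("https://${api.host}/users", plan_vars)   # interpolated string
count = resolve("${count}", plan_vars)                  # stays an int
```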
Prerequisites:

- Ollama (default): ollama pull gpt-oss
- OPENAI_API_KEY environment variable (for GPT-5, GPT-4, O3 models)
- ANTHROPIC_API_KEY environment variable (for Claude 4.5, Claude 3.5)
- AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT (for enterprise GPT-5)
- GEMINI_API_KEY (for Gemini models)
- GROQ_API_KEY (for fast Llama models)
- MCP servers configured (server_config.json)

# macOS/Linux
curl -fsSL https://ollama.ai/install.sh | sh
# Or visit https://ollama.ai for other installation methods
ollama pull gpt-oss # Open-source reasoning model with thinking visibility
# Using uvx (recommended)
uvx mcp-cli --help
# Or install from source
git clone https://github.com/chrishayuk/mcp-cli
cd mcp-cli
pip install -e "."
mcp-cli --help
# Optional: Enable MCP Apps (interactive browser UIs)
pip install -e ".[apps]"
# === LOCAL MODELS (No API Key Required) ===
# Use default reasoning model (gpt-oss)
mcp-cli --server sqlite
# Use other Ollama models
mcp-cli --model llama3.3 # Latest Llama
mcp-cli --model qwen2.5-coder # Coding-focused
mcp-cli --model deepseek-coder # Another coding model
mcp-cli --model granite3.3 # IBM Granite
# === CLOUD PROVIDERS (API Keys Required) ===
# GPT-5 Family (requires OpenAI API key)
mcp-cli --provider openai --model gpt-5 # Full GPT-5 with reasoning
mcp-cli --provider openai --model gpt-5-mini # Efficient GPT-5 variant
mcp-cli --provider openai --model gpt-5-nano # Ultra-lightweight GPT-5
# GPT-4 Family
mcp-cli --provider openai --model gpt-4o # GPT-4 Optimized
mcp-cli --provider openai --model gpt-4o-mini # Smaller GPT-4
# O3 Reasoning Models
mcp-cli --provider openai --model o3 # O3 reasoning
mcp-cli --provider openai --model o3-mini # Efficient O3
# Claude 4.5 Family (requires Anthropic API key)
mcp-cli --provider anthropic --model claude-4-5-opus # Most advanced Claude
mcp-cli --provider anthropic --model claude-4-5-sonnet # Balanced Claude 4.5
mcp-cli --provider anthropic --model claude-3-5-sonnet # Claude 3.5
# Enterprise Azure (requires Azure configuration)
mcp-cli --provider azure_openai --model gpt-5 # Enterprise GPT-5
# Other Providers
mcp-cli --provider gemini --model gemini-2.0-flash # Google Gemini
mcp-cli --provider groq --model llama-3.1-70b # Fast Llama via Groq
MCP CLI defaults to:
- Provider: ollama (local, no API key required)
- Model: gpt-oss (open-source reasoning model with thinking visibility)

Global options available for all modes and commands:
- --server: Specify server(s) to connect to (comma-separated)
- --config-file: Path to server configuration file (default: server_config.json)
- --provider: LLM provider (default: ollama)
- --model: Specific model to use (default: gpt-oss for Ollama)
- --disable-filesystem: Disable filesystem access (default: enabled)
- --api-base: Override API endpoint URL
- --api-key: Override API key (not needed for Ollama)
- --token-backend: Override token storage backend (auto, keychain, windows, secretservice, encrypted, vault)
- --verbose: Enable detailed logging
- --quiet: Suppress non-essential output
- --log-file: Write debug logs to a rotating file (secrets auto-redacted)
- --vm: [Experimental] Enable AI virtual memory for context management
- --vm-budget: Token budget for conversation events in VM mode (default: 128000, on top of the system prompt)
- --vm-mode: VM mode — passive (default), relaxed, or strict
- --dashboard: Launch a real-time browser dashboard UI alongside chat mode
- --attach: Attach files to the first message (repeatable: --attach img.png --attach code.py)
- --plan-tools: Enable model-driven planning — the LLM autonomously creates and executes multi-step plans
- --no-tools: Disable MCP tool calling entirely — chat directly with the LLM without connecting to any MCP servers

# Override defaults
export LLM_PROVIDER=ollama # Default provider (already the default)
export LLM_MODEL=gpt-oss # Default model (already the default)
# For cloud providers (optional)
export OPENAI_API_KEY=sk-... # For GPT-5, GPT-4, O3 models
export ANTHROPIC_API_KEY=sk-ant-... # For Claude 4.5, Claude 3.5
export AZURE_OPENAI_API_KEY=sk-... # For enterprise GPT-5
export AZURE_OPENAI_ENDPOINT=https://...
export GEMINI_API_KEY=... # For Gemini models
export GROQ_API_KEY=... # For Groq fast inference
# Tool configuration
export MCP_TOOL_TIMEOUT=120 # Tool execution timeout (seconds)
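The configuration precedence implied here (CLI flag over environment variable over built-in default) can be sketched as follows; this is an illustration of the usual pattern, not mcp-cli's actual resolution code:

```python
import os

def effective_provider(cli_flag=None, env=os.environ):
    """Pick the provider: the --provider flag wins, then the
    LLM_PROVIDER environment variable, then the default (ollama)."""
    return cli_flag or env.get("LLM_PROVIDER") or "ollama"

# Examples with explicit environment dicts for clarity:
assert effective_provider("openai", env={}) == "openai"
assert effective_provider(env={"LLM_PROVIDER": "groq"}) == "groq"
assert effective_provider(env={}) == "ollama"
```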
Provides a natural language interface with streaming responses and automatic tool usage:
# Default mode with Ollama/gpt-oss reasoning model (no API key needed)
mcp-cli --server sqlite
# See the AI's thinking process with reasoning models
mcp-cli --server sqlite --model gpt-oss # Open-source reasoning
mcp-cli --server sqlite --provider openai --model gpt-5 # GPT-5 reasoning
mcp-cli --server sqlite --provider anthropic --model claude-4-5-opus # Claude 4.5 reasoning
# Use different local models
mcp-cli --server sqlite --model llama3.3
mcp-cli --server sqlite --model qwen2.5-coder
# Switch to cloud providers (requires API keys)
mcp-cli chat --server sqlite --provider openai --model gpt-5
mcp-cli chat --server sqlite --provider anthropic --model claude-4-5-sonnet
# Launch with real-time browser dashboard
mcp-cli --server sqlite --dashboard
# Attach files to the first message
mcp-cli --server sqlite --attach image.png --attach data.csv
Command-driven shell interface for direct server operations:
mcp-cli interactive --server sqlite
# With specific models
mcp-cli interactive --server sqlite --model gpt-oss # Local reasoning
mcp-cli interactive --server sqlite --provider openai --model gpt-5 # Cloud GPT-5
Unix-friendly interface for automation and scripting:
# Process text with reasoning models
mcp-cli cmd --server sqlite --model gpt-oss --prompt "Think through this step by step" --input data.txt
# Use GPT-5 for complex reasoning
mcp-cli cmd --server sqlite --provider openai --model gpt-5 --prompt "Analyze this data" --input data.txt
# Execute tools directly
mcp-cli cmd --server sqlite --tool list_tables --output tables.json
# Pipeline-friendly processing
echo "SELECT * FROM users LIMIT 5" | mcp-cli cmd --server sqlite --tool read_query --input -
Execute individual commands without entering interactive mode:
# List available tools
mcp-cli tools --server sqlite
# Show provider configuration
mcp-cli provider list
# Show available models for current provider
mcp-cli models
# Show models for specific provider
mcp-cli models openai # Shows GPT-5, GPT-4, O3 models
mcp-cli models anthropic # Shows Claude 4.5, Claude 3.5 models
mcp-cli models ollama # Shows gpt-oss, llama3.3, etc.
# Ping servers
mcp-cli ping --server sqlite
# List resources
mcp-cli resources --server sqlite
# UI Theme Management
mcp-cli theme # Show current theme and list available
mcp-cli theme dark # Switch to dark theme
mcp-cli theme --select # Interactive theme selector
mcp-cli theme --list # List all available themes
# Token Storage Management
mcp-cli token backends # Show available storage backends
mcp-cli --token-backend encrypted token list # Use specific backend
MCP Apps allow tool servers to provide interactive HTML UIs that render in your browser. When a tool has a _meta.ui annotation pointing to a UI resource, mcp-cli automatically launches a local web server and opens the app in your browser.
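Conceptually, a client can detect app-enabled tools by checking for that annotation. A minimal sketch, where the exact metadata shape is an assumption for illustration:

```python
def has_app_ui(tool: dict) -> bool:
    """True if a tool definition carries a _meta.ui annotation
    (field layout assumed for illustration, not the MCP spec verbatim)."""
    return bool(tool.get("_meta", {}).get("ui"))

chart_tool = {"name": "show_chart", "_meta": {"ui": "ui://view_demo/chart.html"}}
plain_tool = {"name": "read_query"}
```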
# Install the apps extra (adds websockets dependency)
pip install "mcp-cli[apps]"
App-enabled tools are identified by _meta.ui metadata (e.g., show_chart, show_table).

# Connect to a server with app-enabled tools
mcp-cli --server view_demo
# In chat, ask for something visual:
> Show me the sales data as a chart
# Browser opens automatically with an interactive chart
# The /tools command shows which tools have app UIs (APP column)
> /tools
See MCP Apps Documentation for the full guide.
Chat mode provides the most advanced interface with streaming responses and intelligent tool usage.
# Simple startup with default reasoning model (gpt-oss)
mcp-cli --server sqlite
# Multiple servers
mcp-cli --server sqlite,filesystem
# With advanced reasoning models
mcp-cli --server sqlite --provider openai --model gpt-5
mcp-cli --server sqlite --provider anthropic --model claude-4-5-opus
/provider # Show current configuration (default: ollama)
/provider list # List all providers
/provider config # Show detailed configuration
/provider diagnostic # Test provider connectivity
/provider set ollama api_base http://localhost:11434 # Configure Ollama endpoint
/provider openai # Switch to OpenAI (requires API key)
/provider anthropic # Switch to Anthropic (requires API key)
/provider openai gpt-5 # Switch to OpenAI GPT-5
# Custom Provider Management
/provider custom # List custom providers
/provider add localai http://localhost:8080/v1 gpt-4 # Add custom provider
/provider remove localai # Remove custom provider
/model # Show current model (default: gpt-oss)
/model llama3.3 # Switch to different Ollama model
/model gpt-5 # Switch to GPT-5 (if using OpenAI)
/model claude-4-5-opus # Switch to Claude 4.5 (if using Anthropic)
/models # List available models for current provider
/tools # List available tools
/tools --all # Show detailed tool information
/tools --raw # Show raw JSON definitions
/tools call # Interactive tool execution
/toolhistory # Show tool execution history
/th -n 5 # Last 5 tool calls
/th 3 # Details for call #3
/th --json # Full history as JSON
/server # List all configured servers
/server list # List servers (alias)
/server list all # Include disabled servers
# Add servers at runtime (persists in ~/.mcp-cli/preferences.json)
/server add <name> stdio <command> [args...]
/server add sqlite stdio uvx mcp-server-sqlite --db-path test.db
/server add playwright stdio npx @playwright/mcp@latest
/server add time stdio uvx mcp-server-time
/server add fs stdio npx @modelcontextprotocol/server-filesystem /path/to/dir
# HTTP/SSE server examples with authentication
/server add github --transport http --header "Authorization: Bearer ghp_token" -- https://api.github.com/mcp
/server add myapi --transport http --env API_KEY=secret -- https://api.example.com/mcp
/server add events --transport sse -- https://events.example.com/sse
# Manage server state
/server enable <name> # Enable a disabled server
/server disable <name> # Disable without removing
/server remove <name> # Remove user-added server
/server ping <name> # Test server connectivity
# Server details
/server <name> # Show server configuration details
Note: Servers added via /server add are stored in ~/.mcp-cli/preferences.json and persist across sessions. Project servers remain in server_config.json.
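For illustration, a user-added STDIO server might be recorded in ~/.mcp-cli/preferences.json roughly like this; the exact schema is an assumption, not documented here:

```json
{
  "servers": {
    "sqlite": {
      "transport": "stdio",
      "command": "uvx",
      "args": ["mcp-server-sqlite", "--db-path", "test.db"],
      "disabled": false
    }
  }
}
```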
/attach image.png # Stage an image for the next message
/attach code.py # Stage a text file
/attach list # Show currently staged files
/attach clear # Clear staged files
/file data.csv # Alias for /attach
/image screenshot.heic # Alias for /attach
# Inline file references (in any message)
@file:screenshot.png describe what you see
@file:data.csv summarize this data
# Image URLs are auto-detected
https://example.com/photo.jpg what is in this image?
/conversation # Show conversation history
/ch -n 10 # Last 10 messages
/ch 5 # Details for message #5
/ch --json # Full history as JSON
/save conversation.json # Save conversation to file
/compact # Summarize conversation
/clear # Clear conversation history
/cls # Clear screen only
/theme # Interactive theme selector with preview
/theme dark # Switch to dark theme
/theme monokai # Switch to monokai theme
# Available themes: default, dark, light, minimal, terminal, monokai, dracula, solarized
# Themes are persisted across sessions
/token # List all stored tokens
/token list # List all tokens explicitly
/token set <name> # Store a bearer token
/token get <name> # Get token details
/token delete <name> # Delete a token
/token clear # Clear all tokens (with confirmation)
/token backends # Show available storage backends
# Examples
/token set my-api # Prompts for token value (secure)
/token get notion --oauth # Get OAuth token for Notion server
/token list --api-keys # List only provider API keys
Token Storage Backends: MCP CLI supports multiple secure token storage backends:
Override the default backend with --token-backend:
# Use encrypted file storage instead of keychain
mcp-cli --token-backend encrypted token list
# Use vault for enterprise environments
mcp-cli --token-backend vault token list
See Token Management Guide for comprehensive documentation.
/verbose # Toggle verbose/compact display (Default: Enabled)
/confirm # Toggle tool call confirmation (Default: Enabled)
/interrupt # Stop running operations
/server # Manage MCP servers (see Server Management above)
/help # Show all commands
/help tools # Help for specific command
/exit # Exit chat mode
For complete command documentation, see Commands System Guide.
- /attach command with staging, list, and clear (aliases: /file, /image)
- @file:path references in any message
- --attach CLI flag for first-message attachments
- Real-time browser dashboard (--dashboard)

Interactive mode provides a command shell for direct server interaction.
mcp-cli interactive --server sqlite
help # Show available commands
exit # Exit interactive mode
clear # Clear terminal
# Provider management
provider # Show current provider
provider list # List providers
provider anthropic # Switch provider
provider openai gpt-5 # Switch to GPT-5
# Model management
model # Show current model
model gpt-oss # Switch to reasoning model
model claude-4-5-opus # Switch to Claude 4.5
models # List available models
# Tool operations
tools # List tools
tools --all # Detailed tool info
tools call # Interactive tool execution
# Server operations
servers # List servers
ping # Ping all servers
resources # List resources
prompts # List prompts
Command mode provides Unix-friendly automation capabilities.
--input FILE # Input file (- for stdin)
--output FILE # Output file (- for stdout)
--prompt TEXT # Prompt template
--tool TOOL # Execute specific tool
--tool-args JSON # Tool arguments as JSON
--system-prompt TEXT # Custom system prompt
--raw # Raw output without formatting
--single-turn # Disable multi-turn conversation
--max-turns N # Maximum conversation turns
# Text processing with reasoning models
echo "Analyze this data" | mcp-cli cmd --server sqlite --model gpt-oss --input - --output analysis.txt
# Use GPT-5 for complex analysis
mcp-cli cmd --server sqlite --provider openai --model gpt-5 --prompt "Provide strategic analysis" --input report.txt
# Tool execution
mcp-cli cmd --server sqlite --tool list_tables --raw
# Complex queries
mcp-cli cmd --server sqlite --tool read_query --tool-args '{"query": "SELECT COUNT(*) FROM users"}'
# Batch processing with GNU Parallel
ls *.txt | parallel mcp-cli cmd --server sqlite --input {} --output {}.summary --prompt "Summarize: {{input}}"
Ollama runs locally by default on http://localhost:11434. MCP CLI v0.11.1+ with CHUK-LLM v0.16+ includes llama.cpp integration that automatically discovers and reuses Ollama's downloaded models for 1.53x faster inference (311 vs 204 tokens/sec) without re-downloading.
To use reasoning and other models:
# Pull reasoning and other models for Ollama
ollama pull gpt-oss # Default reasoning model
ollama pull llama3.3 # Latest Llama
ollama pull llama3.2 # Llama 3.2
ollama pull qwen3 # Qwen 3
ollama pull qwen2.5-coder # Coding-focused
ollama pull deepseek-coder # DeepSeek coder
ollama pull granite3.3 # IBM Granite
ollama pull mistral # Mistral
ollama pull gemma3 # Google Gemma
ollama pull phi3 # Microsoft Phi
ollama pull codellama # Code Llama
# List available Ollama models
ollama list
# Configure remote Ollama server
mcp-cli provider set ollama api_base http://remote-server:11434
To use cloud providers with advanced models, configure API keys:
# Configure OpenAI (for GPT-5, GPT-4, O3 models)
mcp-cli provider set openai api_key sk-your-key-here
# Configure Anthropic (for Claude 4.5, Claude 3.5)
mcp-cli provider set anthropic api_key sk-ant-your-key-here
# Configure Azure OpenAI (for enterprise GPT-5)
mcp-cli provider set azure_openai api_key sk-your-key-here
mcp-cli provider set azure_openai api_base https://your-resource.openai.azure.com
# Configure other providers
mcp-cli provider set gemini api_key your-gemini-key
mcp-cli provider set groq api_key your-groq-key
# Test configuration
mcp-cli provider diagnostic openai
mcp-cli provider diagnostic anthropic
MCP CLI supports adding custom OpenAI-compatible providers (LocalAI, custom proxies, etc.):
# Add a custom provider (persisted across sessions)
mcp-cli provider add localai http://localhost:8080/v1 gpt-4 gpt-3.5-turbo
mcp-cli provider add myproxy https://proxy.example.com/v1 custom-model-1 custom-model-2
# Set API key via environment variable (never stored in config)
export LOCALAI_API_KEY=your-api-key
export MYPROXY_API_KEY=your-api-key
# List custom providers
mcp-cli provider custom
# Use custom provider
mcp-cli --provider localai --server sqlite
mcp-cli --provider myproxy --model custom-model-1 --server sqlite
# Remove custom provider
mcp-cli provider remove localai
# Runtime provider (session-only, not persisted)
mcp-cli --provider temp-ai --api-base https://api.temp.com/v1 --api-key test-key --server sqlite
Security Note: API keys can be stored securely in OS-native keychains (macOS Keychain, Windows Credential Manager, Linux Secret Service) or HashiCorp Vault using the token management system. Alternatively, use environment variables following the pattern {PROVIDER_NAME}_API_KEY or pass via --api-key for session-only use. See Token Management for details.
The chuk_llm library configuration in ~/.chuk_llm/config.yaml:
ollama:
api_base: http://localhost:11434
default_model: gpt-oss
openai:
api_base: https://api.openai.com/v1
default_model: gpt-5
anthropic:
api_base: https://api.anthropic.com
default_model: claude-4-5-opus
azure_openai:
api_base: https://your-resource.openai.azure.com
default_model: gpt-5
gemini:
api_base: https://generativelanguage.googleapis.com
default_model: gemini-2.0-flash
groq:
api_base: https://api.groq.com
default_model: llama-3.1-70b
API keys can be provided via:
- The ~/.chuk_llm/.env file:

OPENAI_API_KEY=sk-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here
AZURE_OPENAI_API_KEY=sk-your-azure-key-here
GEMINI_API_KEY=your-gemini-key
GROQ_API_KEY=your-groq-key
- --api-key for session-only use (not persisted)

MCP CLI supports two types of server configurations:
- Project configuration (server_config.json): Shared project-level configurations
- User preferences (~/.mcp-cli/preferences.json): Personal runtime-added servers that persist across sessions

MCP CLI searches for server_config.json in the following priority order:
Explicit path via --config-file option:
mcp-cli --config-file /path/to/custom-config.json
Current directory - Automatically detected when running from a project directory:
cd /path/to/my-project
mcp-cli --server sqlite # Uses ./server_config.json if it exists
Bundled default - When running via uvx or from any directory without a local config:
uvx mcp-cli --server cloudflare_workers # Uses packaged server_config.json
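The lookup order above can be sketched as follows; this is illustrative only, not mcp-cli's actual code, and the bundled path is a stand-in:

```python
from pathlib import Path

BUNDLED = Path("bundled/server_config.json")  # stand-in for the packaged copy

def find_server_config(explicit=None):
    """Sketch of the documented search order: --config-file wins,
    then a project-local server_config.json, then the bundled default."""
    if explicit:                              # 1. explicit --config-file
        return Path(explicit)
    local = Path.cwd() / "server_config.json"
    return local if local.exists() else BUNDLED  # 2. local, 3. bundled
```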
This means you can:
- Keep a server_config.json in your project directory with project-specific server configurations
- Run uvx mcp-cli from anywhere and get the bundled default servers
- Use --config-file to specify any configuration file location

MCP CLI v0.11.1+ comes with an expanded set of pre-configured servers in the bundled server_config.json:
| Server | Type | Description | Configuration |
|---|---|---|---|
| sqlite | STDIO | SQLite database operations | uvx mcp-server-sqlite --db-path test.db |
| echo | STDIO | Echo server for testing | uvx chuk-mcp-echo stdio |
| math | STDIO | Mathematical computations | uvx chuk-mcp-math-server |
| playwright | STDIO | Browser automation | npx @playwright/mcp@latest |
| brave_search | STDIO | Web search via Brave API | Requires BRAVE_API_KEY token |
| notion | HTTP | Notion workspace integration | https://mcp.notion.com/mcp (OAuth) |
| cloudflare_workers | HTTP | Cloudflare Workers bindings | https://bindings.mcp.cloudflare.com/mcp (OAuth) |
| monday | HTTP | Monday.com integration | https://mcp.monday.com/mcp (OAuth) |
| linkedin | HTTP | LinkedIn integration | https://linkedin.chukai.io/mcp |
| weather | HTTP | Weather data service | https://weather.chukai.io/mcp |
Note: HTTP servers and API-based servers require authentication. Use the Token Management system to configure access tokens.
To use these servers:
# Use bundled servers from anywhere
uvx mcp-cli --server sqlite
uvx mcp-cli --server echo
uvx mcp-cli --server math
uvx mcp-cli --server playwright
# API-based servers require tokens
mcp-cli token set brave_search --type bearer
uvx mcp-cli --server brave_search
# HTTP/OAuth servers require OAuth authentication
uvx mcp-cli token set notion --oauth
uvx mcp-cli --server notion
# Use multiple servers simultaneously
uvx mcp-cli --server sqlite,math,playwright
Create a server_config.json file with your MCP server configurations:
{
"mcpServers": {
"sqlite": {
"command": "python",
"args": ["-m", "mcp_server.sqlite_server"],
"env": {
"DATABASE_PATH": "database.db"
}
},
"filesystem": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/files"],
"env": {}
},
"brave-search": {
"command": "npx",
"args": ["-y", "@brave/brave-search-mcp-server"],
"env": {
"BRAVE_API_KEY": "${TOKEN:bearer:brave_search}"
}
},
"notion": {
"url": "https://mcp.notion.com/mcp",
"headers": {
"Authorization": "Bearer ${TOKEN:bearer:notion}"
}
}
}
}
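As the examples suggest, the entry shape determines the transport: STDIO servers declare a command, while remote servers declare a url. A small sketch of that inference (an assumption drawn from the examples, not mcp-cli source):

```python
import json

CONFIG = """
{
  "mcpServers": {
    "sqlite": {"command": "uvx", "args": ["mcp-server-sqlite"]},
    "notion": {"url": "https://mcp.notion.com/mcp"}
  }
}
"""

def transport_of(entry):
    # STDIO entries carry "command"; remote (HTTP/SSE) entries carry "url".
    return "stdio" if "command" in entry else "http"

kinds = {name: transport_of(entry)
         for name, entry in json.loads(CONFIG)["mcpServers"].items()}
```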
MCP CLI supports automatic token replacement from secure storage using the ${TOKEN:namespace:name} syntax:
Syntax: ${TOKEN:<namespace>:<token-name>}
Examples:
{
"mcpServers": {
"brave-search": {
"command": "npx",
"args": ["-y", "@brave/brave-search-mcp-server"],
"env": {
"BRAVE_API_KEY": "${TOKEN:bearer:brave_search}"
}
},
"api-server": {
"url": "https://api.example.com/mcp",
"headers": {
"Authorization": "Bearer ${TOKEN:bearer:my_api}",
"X-API-Key": "${TOKEN:api-key:my_service}"
}
}
}
}
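Conceptually, the substitution works like a pattern replace against the secure store. A minimal sketch, where the lookup store is a hypothetical stand-in for the OS keychain:

```python
import re

# Matches ${TOKEN:<namespace>:<token-name>} placeholders.
TOKEN_RE = re.compile(r"\$\{TOKEN:([a-z-]+):([\w-]+)\}")

def resolve_tokens(value, lookup):
    """Replace each ${TOKEN:ns:name} with the secret from lookup(ns, name)."""
    return TOKEN_RE.sub(lambda m: lookup(m.group(1), m.group(2)), value)

secrets = {("bearer", "brave_search"): "bsk-example-123"}  # stand-in store
header = resolve_tokens("Bearer ${TOKEN:bearer:brave_search}",
                        lambda ns, name: secrets[(ns, name)])
```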
Token Storage:
# Store tokens securely (never in config files!)
mcp-cli token set brave_search --type bearer
# Enter token value when prompted (hidden input)
mcp-cli token set my_api --type bearer --value "your-token-here"
# Tokens are stored in OS-native secure storage:
# - macOS: Keychain
# - Windows: Credential Manager
# - Linux: Secret Service (GNOME Keyring/KWallet)
Supported Locations:
- env: Environment variables for STDIO servers
- headers: HTTP headers for HTTP/SSE servers

Namespaces:
- bearer: Bearer tokens (default for --type bearer)
- api-key: API keys (default for --type api-key)
- oauth: OAuth tokens (automatic)
- generic: Custom tokens

Benefits:
- Share server_config.json safely (no secrets)

See Token Management Guide for complete documentation.
Add servers dynamically during runtime without editing configuration files:
# Add STDIO servers (most common)
mcp-cli
> /server add sqlite stdio uvx mcp-server-sqlite --db-path mydata.db
> /server add playwright stdio npx @playwright/mcp@latest
> /server add time stdio uvx mcp-server-time
# Add HTTP servers with authentication
> /server add github --transport http --header "Authorization: Bearer ghp_token" -- https://api.github.com/mcp
> /server add myapi --transport http --env API_KEY=secret -- https://api.example.com/mcp
# Add SSE (Server-Sent Events) servers
> /server add events --transport sse -- https://events.example.com/sse
# Manage servers
> /server list # Show all servers
> /server disable sqlite # Temporarily disable
> /server enable sqlite # Re-enable
> /server remove myapi # Remove user-added server
Key Points:
- User-added servers persist in ~/.mcp-cli/preferences.json

# Compare reasoning across different models
> /provider ollama
> /model gpt-oss
> Think through this problem step by step: If a train leaves New York at 3 PM...
[See the complete thinking process with gpt-oss]
> /provider openai
> /model gpt-5
> Think through this problem step by step: If a train leaves New York at 3 PM...
[See GPT-5's reasoning approach]
> /provider anthropic
> /model claude-4-5-opus
> Think through this problem step by step: If a train leaves New York at 3 PM...
[See Claude 4.5's analytical process]
# Start with default Ollama/gpt-oss (no API key needed)
mcp-cli chat --server sqlite
# Use reasoning model for complex problems
> Think through this database optimization problem step by step
[gpt-oss shows its complete thinking process before answering]
# Try different local models for different tasks
> /model llama3.3 # General purpose
> /model qwen2.5-coder # For coding tasks
> /model deepseek-coder # Alternative coding model
> /model granite3.3 # IBM's model
> /model gpt-oss # Back to reasoning model
# Switch to cloud when needed (requires API keys)
> /provider openai
> /model gpt-5
> Complex enterprise architecture design...
> /provider anthropic
> /model claude-4-5-opus
> Detailed strategic analysis...
> /provider ollama
> /model gpt-oss
> Continue with local processing...
# Start with local reasoning (default, no API key)
mcp-cli chat --server sqlite
# Compare responses across providers
> /provider ollama
> What's the best way to optimize this SQL query?
> /provider openai gpt-5 # Requires API key
> What's the best way to optimize this SQL query?
> /provider anthropic claude-4-5-sonnet # Requires API key
> What's the best way to optimize this SQL query?
# Use each provider's strengths
> /provider ollama gpt-oss # Local reasoning, privacy
> /provider openai gpt-5 # Advanced reasoning
> /provider anthropic claude-4-5-opus # Deep analysis
> /provider groq llama-3.1-70b # Ultra-fast responses
# Use reasoning model for complex database tasks
> /model gpt-oss
> I need to analyze our database performance. Think through what we should check first.
[gpt-oss shows thinking: "First, I should check the table structure, then indexes, then query patterns..."]
[Tool: list_tables] → products, customers, orders
> Now analyze the indexes and suggest optimizations
[gpt-oss thinks through index analysis]
[Tool: describe_table] → Shows current indexes
[Tool: read_query] → Analyzes query patterns
> Create an optimization plan based on your analysis
[Complete reasoning process followed by specific recommendations]
# Batch processing with different models
for file in data/*.csv; do
# Use reasoning model for analysis
mcp-cli cmd --server sqlite \
--model gpt-oss \
--prompt "Analyze this data and think through patterns" \
--input "$file" \
--output "analysis/$(basename "$file" .csv)_reasoning.txt"
# Use coding model for generating scripts
mcp-cli cmd --server sqlite \
--model qwen2.5-coder \
--prompt "Generate Python code to process this data" \
--input "$file" \
--output "scripts/$(basename "$file" .csv)_script.py"
done
# Pipeline with reasoning
cat complex_problem.txt | \
mcp-cli cmd --model gpt-oss --prompt "Think through this step by step" --input - | \
mcp-cli cmd --model llama3.3 --prompt "Summarize the key points" --input - > solution.txt
# Check provider and model performance
> /provider diagnostic
Provider Diagnostics
Provider | Status | Response Time | Features | Models
ollama | ✅ Ready | 56ms | 📡🔧 | gpt-oss, llama3.3, qwen3, ...
openai | ✅ Ready | 234ms | 📡🔧👁️ | gpt-5, gpt-4o, o3, ...
anthropic | ✅ Ready | 187ms | 📡🔧 | claude-4-5-opus, claude-4-5-sonnet, ...
azure_openai | ✅ Ready | 198ms | 📡🔧👁️ | gpt-5, gpt-4o, ...
gemini | ✅ Ready | 156ms | 📡🔧👁️ | gemini-2.0-flash, ...
groq | ✅ Ready | 45ms | 📡🔧 | llama-3.1-70b, ...
# Check available models
> /models
Models for ollama (Current Provider)
Model | Status
gpt-oss | Current & Default (Reasoning)
llama3.3 | Available
llama3.2 | Available
qwen2.5-coder | Available
deepseek-coder | Available
granite3.3 | Available
... and 6 more
# Monitor tool execution with reasoning
> /verbose
> /model gpt-oss
> Analyze the database and optimize the slowest queries
[Shows complete thinking process]
[Tool execution with timing]
Ollama not running (default provider):
# Start Ollama service
ollama serve
# Or check if it's running
curl http://localhost:11434/api/tags
Model not found:
# For Ollama (default), pull the model first
ollama pull gpt-oss # Reasoning model
ollama pull llama3.3 # Latest Llama
ollama pull qwen2.5-coder # Coding model
# List available models
ollama list
# For cloud providers, check supported models
mcp-cli models openai # Shows GPT-5, GPT-4, O3 models
mcp-cli models anthropic # Shows Claude 4.5, Claude 3.5 models
Provider not found or API key missing:
# Check available providers
mcp-cli provider list
# For cloud providers, set API keys
mcp-cli provider set openai api_key sk-your-key
mcp-cli provider set anthropic api_key sk-ant-your-key
# Test connection
mcp-cli provider diagnostic openai
Connection issues with Ollama:
# Check Ollama is running
ollama list
# Test connection
mcp-cli provider diagnostic ollama
# Configure custom endpoint if needed
mcp-cli provider set ollama api_base http://localhost:11434
Enable verbose logging for troubleshooting:
mcp-cli --verbose chat --server sqlite
mcp-cli --log-level DEBUG interactive --server sqlite
# Write debug logs to a rotating file (secrets are automatically redacted)
mcp-cli --log-file ~/.mcp-cli/logs/debug.log --server sqlite
- --log-file with JSON format, 10MB rotation, and 3 backup files
- Filesystem access can be disabled with --disable-filesystem
- ui/open-link only allows http:// and https:// schemes

Core dependencies are organized into feature groups:
Install with specific features:
pip install "mcp-cli[cli]" # Basic CLI features
pip install "mcp-cli[cli,dev]" # CLI with development tools
pip install "mcp-cli[apps]" # MCP Apps (interactive browser UIs)
We welcome contributions! Please see our Contributing Guide for details.
git clone https://github.com/chrishayuk/mcp-cli
cd mcp-cli
pip install -e ".[cli,dev]"
pre-commit install
Explore the capabilities of MCP CLI:
# Command Mode Demos
# General cmd mode features (bash)
bash examples/cmd_mode_demo.sh
# LLM integration with cmd mode (bash)
bash examples/cmd_mode_llm_demo.sh
# Python integration example
uv run examples/cmd_mode_python_demo.py
# Custom Provider Management Demos
# Interactive walkthrough demo (educational)
uv run examples/custom_provider_demo.py
# Working demo with actual inference (requires OPENAI_API_KEY)
uv run examples/custom_provider_working_demo.py
# Simple shell script demo (requires OPENAI_API_KEY)
bash examples/custom_provider_simple_demo.sh
# Terminal management features (chuk-term)
uv run examples/ui_terminal_demo.py
# Output system with themes (chuk-term)
uv run examples/ui_output_demo.py
# Streaming UI capabilities (chuk-term)
uv run examples/ui_streaming_demo.py
pytest
pytest --cov=mcp_cli --cov-report=html
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.