A Model Context Protocol (MCP) server that connects Claude to your local Ollama models, allowing you to offload simpler tasks and save your Claude tokens for complex work.
First, install Ollama on your system:
# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.ai/install.sh | sh
# Windows
# Download from https://ollama.ai/download
Start Ollama:
ollama serve
Download useful models for different tasks:
# General purpose models
ollama pull gpt-oss # OpenAI's open-weight model
ollama pull llama3.2 # Fast, capable model
ollama pull qwen2.5 # High-quality text generation
# Specialised models
ollama pull deepseek-coder # Code generation
ollama pull nomic-embed-text # Text embeddings
ollama pull llama3.2:1b # Lightweight for simple tasks
Create a new directory and install dependencies:
mkdir mcp-ollama-server
cd mcp-ollama-server
# Copy the files (index.ts, package.json, tsconfig.json)
# Then install dependencies:
npm install
# Create src directory and move index.ts there
mkdir src
mv index.ts src/
# Build the project
npm run build
Add the server to your Claude Desktop configuration:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"ollama": {
"command": "node",
"args": ["/absolute/path/to/your/mcp-ollama-server/dist/index.js"],
"env": {}
}
}
}
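If the config file already lists other MCP servers, the entry above needs to be merged in rather than pasted over them. A minimal sketch of that merge, assuming the file has the shape shown above; `addOllamaServer` and the `other` server are hypothetical names used only for illustration:

```typescript
// Shape of the relevant part of claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
}

// Hypothetical helper: adds the "ollama" entry and leaves other servers untouched.
function addOllamaServer(config: ClaudeDesktopConfig, distIndexPath: string): ClaudeDesktopConfig {
  // Accept POSIX ("/...") and Windows drive ("C:\...") absolute paths only
  if (!/^(\/|[A-Za-z]:[\\/])/.test(distIndexPath)) {
    throw new Error("Expected an absolute path, got: " + distIndexPath);
  }
  return {
    ...config, // any other top-level keys are carried over unchanged
    mcpServers: {
      ...(config.mcpServers || {}),
      ollama: { command: "node", args: [distIndexPath], env: {} },
    },
  };
}

// Usage: merge into a config that already has another (made-up) server.
const merged = addOllamaServer(
  { mcpServers: { other: { command: "node", args: ["/opt/other/index.js"], env: {} } } },
  "/absolute/path/to/your/mcp-ollama-server/dist/index.js",
);
console.log(JSON.stringify(merged, null, 2));
```

The path check is there because the `args` entry must be absolute; a relative path is rejected before anything is written.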
Restart Claude Desktop to load the new MCP server. You should see the Ollama tools available in Claude.
ollama_generate_text: Generate text for simple writing tasks, basic summaries, or straightforward content creation.
Best for: Simple writing, basic explanations, content generation
ollama_chat: Have conversations with local models for Q&A, explanations, or dialogue-based tasks.
Best for: Q&A sessions, explanations, interactive tasks
ollama_embed_text: Generate text embeddings for semantic similarity, clustering, or search.
Best for: Document similarity, semantic search, clustering
ollama_code_generation: Generate code using specialised coding models.
Best for: Simple scripts, boilerplate code, basic programming tasks
ollama_summarise: Summarise text content with different length options.
Best for: Document summaries, article condensation
ollama_list_models: List all available models on your Ollama installation.
ollama_pull_model: Download new models to Ollama.
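Each tool presumably ends up as an HTTP request to the local Ollama API. The sketch below shows one plausible mapping for three of the tools. The endpoints (`/api/generate`, `/api/chat`, `/api/embed`) are part of Ollama's REST API, but the mapping, `buildRequest`, and `callOllama` are assumptions for illustration and not this server's actual code:

```typescript
// Assumed mapping from tool names to Ollama REST endpoints.
type ToolName = "ollama_generate_text" | "ollama_chat" | "ollama_embed_text";

interface OllamaRequest {
  path: string;
  body: Record<string, unknown>;
}

function buildRequest(tool: ToolName, model: string, text: string): OllamaRequest {
  switch (tool) {
    case "ollama_generate_text":
      // stream: false asks Ollama for one JSON object instead of streamed chunks
      return { path: "/api/generate", body: { model, prompt: text, stream: false } };
    case "ollama_chat":
      return {
        path: "/api/chat",
        body: { model, messages: [{ role: "user", content: text }], stream: false },
      };
    case "ollama_embed_text":
      return { path: "/api/embed", body: { model, input: text } };
  }
}

// Sends a built request to a running Ollama instance (not invoked in this sketch).
async function callOllama(baseUrl: string, req: OllamaRequest): Promise<unknown> {
  const res = await fetch(baseUrl + req.path, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req.body),
  });
  if (!res.ok) {
    throw new Error("Ollama returned HTTP " + res.status);
  }
  return res.json();
}

const exampleRequest = buildRequest("ollama_generate_text", "llama3.2", "Write a haiku about tokens");
console.log(exampleRequest.path, JSON.stringify(exampleRequest.body));
```

Keeping the request builder separate from the network call makes the mapping easy to test without Ollama running.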
Once configured, Claude can use these tools like this:
Text Generation:
"Use Ollama to generate a simple email template for customer onboarding"
Code Generation:
"Have DeepSeek Coder create a Python script to parse CSV files"
Embeddings:
"Generate embeddings for these document titles using Nomic"
Summarisation:
"Use Llama to create a brief summary of this article"
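For the embeddings example, the vectors only become useful once they are compared. A minimal sketch of cosine similarity and ranking follows; the three-dimensional vectors and titles are made-up stand-ins for real embeddings, and `rankBySimilarity` is a hypothetical helper:

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as dissimilar to everything
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Sort documents from most to least similar to the query vector.
function rankBySimilarity(query: number[], docs: { title: string; vector: number[] }[]) {
  return docs
    .map((d) => ({ title: d.title, score: cosineSimilarity(query, d.vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors (made-up values) in place of real nomic-embed-text output
const ranked = rankBySimilarity([1, 0, 0], [
  { title: "Quarterly sales report", vector: [0, 1, 0] },
  { title: "Annual sales summary", vector: [0.9, 0.1, 0] },
]);
console.log(ranked[0].title); // prints "Annual sales summary"
```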
Edit src/index.ts to modify the server's defaults, such as the Ollama base URL (default: http://localhost:11434).
You can override settings with environment variables:
export OLLAMA_BASE_URL=http://localhost:11434
export OLLAMA_TIMEOUT=300000
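A server could read these two variables roughly as follows. This is a sketch under assumptions: `loadSettings` is a hypothetical helper, the timeout is assumed to be in milliseconds, and the fallback values simply mirror the exports above rather than anything confirmed in src/index.ts:

```typescript
interface OllamaSettings {
  baseUrl: string;
  timeoutMs: number;
}

// Fallbacks used when a variable is unset or invalid (assumed, see lead-in).
const DEFAULT_SETTINGS: OllamaSettings = {
  baseUrl: "http://localhost:11434",
  timeoutMs: 300000,
};

function loadSettings(env: Record<string, string | undefined>): OllamaSettings {
  const rawTimeout = env.OLLAMA_TIMEOUT;
  const parsed = rawTimeout === undefined ? NaN : Number(rawTimeout);
  return {
    // Strip trailing slashes so paths like "/api/tags" can be appended directly
    baseUrl: (env.OLLAMA_BASE_URL || DEFAULT_SETTINGS.baseUrl).replace(/\/+$/, ""),
    // Ignore values that are not positive, finite numbers
    timeoutMs: isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_SETTINGS.timeoutMs,
  };
}

// In a Node process you would pass process.env instead of a literal object.
console.log(loadSettings({ OLLAMA_BASE_URL: "http://192.168.1.20:11434/", OLLAMA_TIMEOUT: "60000" }));
```

Taking the environment as a parameter keeps the function testable without touching the real process environment.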
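# Make sure Ollama is running
ollama serve
# Check that the API responds
curl http://localhost:11434/api/tags
# List installed models
ollama list
# Download a missing model
ollama pull model-name
Use smaller models for simple tasks:
llama3.2:1b for basic text generation
qwen2.5:0.5b for very simple tasks
Keep frequently used models warm: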
# Pre-load models to keep them in memory
ollama run llama3.2 "hello"
ollama run deepseek-coder "print hello"
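The same pre-loading can be done over HTTP. Ollama's `/api/generate` endpoint loads a model when it receives a request without a prompt, and its `keep_alive` field controls how long the model stays in memory. The sketch below assumes that behaviour; `warmUpBody`, `warmUp`, and the "30m" duration are illustrative choices, not part of this server:

```typescript
interface WarmUpBody {
  model: string;
  keep_alive: string;
}

// Request body with no prompt: Ollama loads the model without generating text.
function warmUpBody(model: string, keepAlive: string = "30m"): WarmUpBody {
  return { model, keep_alive: keepAlive };
}

// Loads each model in turn on a running Ollama instance (not invoked in this sketch).
async function warmUp(baseUrl: string, models: string[]): Promise<void> {
  for (const model of models) {
    const res = await fetch(baseUrl + "/api/generate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(warmUpBody(model)),
    });
    if (!res.ok) {
      throw new Error("Failed to load " + model + ": HTTP " + res.status);
    }
  }
}

console.log(JSON.stringify(warmUpBody("llama3.2")));
// Against a running instance:
// warmUp("http://localhost:11434", ["llama3.2", "deepseek-coder"]);
```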
Adjust temperature based on task:
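One way to do this, sketched under assumptions: Ollama's generate endpoint accepts a `temperature` value inside an `options` object, so a request can carry a different value per task type. The task names and numbers below are illustrative only and are not settings taken from this server:

```typescript
type TaskKind = "code" | "summary" | "creative";

// Illustrative temperatures: lower for deterministic output, higher for variety.
const TEMPERATURE_BY_TASK: Record<TaskKind, number> = {
  code: 0.1,
  summary: 0.3,
  creative: 0.9,
};

// Builds a body for POST /api/generate with a task-specific temperature.
function generateBody(model: string, prompt: string, task: TaskKind) {
  return {
    model,
    prompt,
    stream: false,
    // Ollama reads sampling parameters from the "options" object
    options: { temperature: TEMPERATURE_BY_TASK[task] },
  };
}

console.log(JSON.stringify(generateBody("deepseek-coder", "Parse a CSV file in Python", "code")));
```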
npm run dev # Run with hot reload
npm run watch # Watch mode
npm run type-check # Check TypeScript types
npm run lint # Lint code
To add new capabilities, extend setupHandlers().
To support new model types:
Here's how you might use this in practice:
Initial analysis with Claude: "I need to analyse this dataset and create a comprehensive report"
Delegate simple tasks: "Use Ollama to generate basic descriptions for each data column"
Complex analysis with Claude: Claude does the sophisticated statistical analysis and insights
Offload summarisation: "Use Llama to summarise each section of findings"
Final review with Claude: Claude assembles everything into a polished report
This approach maximises your Claude token efficiency while still getting comprehensive results.
Feel free to extend this server with additional capabilities.