A foundational implementation of a Model Context Protocol (MCP) server designed for educational purposes. It demonstrates the complete interaction between an LLM, an inference engine, and a client during an agentic call.
You'll need ports 8000 and 8080 free. Open 3 generously sized terminals on your screen. In the first one, build and start llama.cpp:

```bash
git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp
cmake -B build && cmake --build build --config Release -j 6
./build/bin/llama-server -m ~/Downloads/Qwen3.5-4B-Q8_0.gguf --ctx-size 4096 --temp 1.0 --top-p 0.95 --top-k 20 --min-p 0.00 --verbose --webui-mcp-proxy
```
In the second terminal, clone the example repo and start the MCP server:

```bash
git clone https://github.com/behavioral-ds/mcp-example && cd mcp-example
poetry install && poetry shell
python mcp_serve.py
```

In the third terminal, run the client:

```bash
python call.py
```

The full chain is: LLM <-> Inference engine <-> MCP <-> Client.
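mcp_serve.py is where the interesting bits live. As a rough sketch (illustrative names, not necessarily the repo's actual code), a minimal FastMCP server exposing one tool and one prompt looks like this:

```python
# Illustrative sketch of an mcp_serve.py-style server; names are hypothetical.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("mcp-101-example")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two integers; the LLM can invoke this during an agentic turn."""
    return a + b

@mcp.prompt()
def summarize(text: str) -> str:
    """A prompt template; clients render its arguments as a fillable form."""
    return f"Summarize the following text:\n\n{text}"

if __name__ == "__main__":
    # The SSE transport listens on http://localhost:8000 by default,
    # which matches the port requirement above.
    mcp.run(transport="sse")
```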
Select "MCP prompt" when drafting a new message:
That's your @mcp.prompt() parsed into a UI element (compare the sketch above); click it:
...and supply some meaningful content:
Then click "Use prompt" and rejoice:
Add this to claude_desktop_config.json and restart Claude Desktop. Note that Claude Desktop launches stdio servers, so an HTTP server like this one needs a bridge; with empty "args" the npx entry below does nothing. Assuming the SSE endpoint from the sketches above, one option is the mcp-remote bridge:
```json
{
  "mcpServers": {
    "mcp-101-example": {
      "command": "npx",
      "args": ["mcp-remote", "http://localhost:8000/sse"]
    }
  }
}
```