MCP server for local semantic retrieval over a library of ebooks, using pplx-embed-context-v1 embeddings stored in Qdrant, enabling an LLM agent like Gemma 4 to search and retrieve relevant passages for answering questions.
A local MCP server that indexes .epub files and exposes semantic search tools
backed by pplx-embed-context-v1 (late chunking) and Qdrant.
```
MCP client (Claude Desktop, Claude Code, …)
        │
        ▼  tool calls via MCP
    server.py
        │
        ▼
pplx-embed-context-v1-0.6b + Qdrant (local)
```
| File | Role |
|---|---|
| `server.py` | MCP server — epub ingestion, embedding, search, Qdrant storage |
| `code_librarian.py` | Separate MCP server for code (AST-based chunking) |
Uses late chunking: all chunks from a chapter go through a single forward pass, so each chunk embedding captures full document context without needing a doc-prefix at inference time. Scores 81.96 nDCG@10 on ConTEB.
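The idea can be shown in a minimal sketch. This is not the project's actual code: it assumes one forward pass over a chapter has already produced token-level embeddings, and the hypothetical `late_chunk` helper then mean-pools each chunk's token span, so every chunk vector carries context from the whole chapter.

```python
import numpy as np

def late_chunk(token_embeddings: np.ndarray,
               spans: list[tuple[int, int]]) -> np.ndarray:
    """Pool contextualized token embeddings into one vector per chunk span.

    token_embeddings: (num_tokens, dim) output of a single forward pass
    spans: [start, end) token ranges, one per chunk
    """
    return np.stack([token_embeddings[start:end].mean(axis=0)
                     for start, end in spans])

# Toy example: 10 "tokens" with 4-dim embeddings, split into two chunks.
tokens = np.arange(40, dtype=float).reshape(10, 4)
chunks = late_chunk(tokens, [(0, 5), (5, 10)])
print(chunks.shape)  # (2, 4)
```

Because pooling happens after the full-context forward pass, no document prefix has to be prepended to each chunk at query time.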
```shell
# 1. install
pip install -e .

# 2. ingest your library (runs embedding, then exits)
HF_HUB_OFFLINE=0 python server.py --index ./library

# 3. start the MCP server
python server.py

# optional: preload the model at startup
python server.py --preload
```
| Tool | Description |
|---|---|
| `search(query, top_k)` | Semantic search, returns top-k passages with scores |
| `search_groups(query, group_by, limit, group_size)` | Search grouped by "book" or "chapter" — best for cross-volume research |
| `get_passage(book, chapter, max_chars)` | Retrieve full text for a book/chapter |
| `list_books()` | List all indexed books with chapter counts |
| `collection_stats()` | Qdrant collection info (point count, vector size) |
| `library_stats()` | Full content breakdown: books, chapters, chunks per book, avg chunk length |
| `get_device()` | Show which device (CPU/GPU) is used for embeddings |
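An MCP client invokes these tools over JSON-RPC 2.0 with a `tools/call` request. As a sketch, a `search_groups` call might look like the following on the wire (the query and argument values here are illustrative, not from the project):

```python
import json

# Illustrative JSON-RPC 2.0 request an MCP client might send to invoke
# search_groups; the argument values are made up for this example.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_groups",
        "arguments": {
            "query": "stoic views on grief",
            "group_by": "book",
            "limit": 5,
            "group_size": 3,
        },
    },
}
print(json.dumps(request, indent=2))
```

Clients like Claude Desktop construct these requests for you; the sketch is only meant to show which parameters each tool accepts.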
```json
{
  "mcpServers": {
    "little-librarian": {
      "command": "python",
      "args": ["/path/to/server.py"]
    }
  }
}
```
| Setup | Min VRAM |
|---|---|
| CPU only | 0 GB |
| GPU (pplx-embed-0.6b) | ~2 GB |
Add this to `claude_desktop_config.json` and restart Claude Desktop.