A local, modular Retrieval-Augmented Generation (RAG) system that lets AI agents query and retrieve information from document collections using ChromaDB vector search and Ollama LLMs through a FastAPI interface. It uses the Model Context Protocol (MCP) to connect the LLM to external tools such as vector databases and document loaders.

This project implements an agentic RAG system built from the following components:
| Component | Tool/Library | Details |
|---|---|---|
| Language Model | Ollama | Local LLM inference (mistral, llama3, etc.) |
| Agent Framework | mcp + FastAPI | API server with tool registration |
| RAG Pipeline | LangChain + Custom | Context retrieval and prompt engineering |
| Vector Store | ChromaDB | Local, persistent vector database |
| Embeddings | SentenceTransformers | all-MiniLM-L6-v2 model |
| File Handling | pypdf, python-docx | PDF and document loading |
| Frontend (Optional) | Streamlit | Interactive web UI |
| Environment | Python 3.10+ | virtualenv or Conda |
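
As a rough illustration of how these components fit together, the sketch below embeds a query, retrieves the closest documents from ChromaDB, and asks a local Ollama model to answer with that context. This is a minimal sketch only; the collection name, prompt format, and function name are assumptions, not the project's actual code.

```python
# Minimal, illustrative RAG flow (not the project's actual implementation).
import chromadb
import requests
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")          # embedding model listed above
client = chromadb.PersistentClient(path="./vector_store")   # persisted local vector store
collection = client.get_or_create_collection("docs")        # collection name is an assumption

def answer(query: str, n_results: int = 3) -> str:
    # Embed the query and pull the closest documents from ChromaDB.
    embedding = embedder.encode([query]).tolist()
    hits = collection.query(query_embeddings=embedding, n_results=n_results)
    context = "\n".join(hits["documents"][0])
    # Ask the local Ollama model to answer using the retrieved context.
    prompt = f"Answer the question using this context:\n{context}\n\nQuestion: {query}"
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "mistral", "prompt": prompt, "stream": False},
    )
    return resp.json()["response"]

print(answer("What is MCP?"))
```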
The repository is organized as follows:

```text
agentic-rag-mcp/
├── main.py              # FastAPI MCP server
├── rag_agent.py         # Agent query logic and RAG orchestration
├── mcp_config.yaml      # Configuration file
├── requirements.txt     # Python dependencies
├── vector_store/        # Persisted ChromaDB vector store
├── data/
│   └── sample_docs/     # Sample documents for ingestion
└── tools/
    └── chromadb_tool.py # Vector search tool implementation
```
Set up a virtual environment and install the dependencies:

```bash
cd agentic-rag-mcp
python -m venv .venv

# On Windows
.venv\Scripts\activate
# On macOS/Linux
source .venv/bin/activate

pip install -U pip
pip install -r requirements.txt
```
Download and install Ollama from the official website.
Start the Ollama server:
```bash
# On the system terminal (not in the virtual environment)
ollama serve
```
In another terminal, pull a model:
```bash
ollama pull mistral   # Recommended for RAG
# or
ollama pull llama3
```
Verify the server is running:
```bash
curl http://localhost:11434/api/tags
```
Run the interactive chat loop:
```bash
python rag_agent.py
```
This starts an interactive session in which the agent retrieves relevant documents from the vector store and uses them as context to answer each question.
Example interaction:
```text
You: What is MCP?
Agent: The Model Context Protocol (MCP) enables modular tool use for AI agents by providing a standardized way to connect language models to external services...
[Used 2 retrieved documents as context]
```
Start the FastAPI MCP server:
```bash
python main.py
```
The server will be available at: http://localhost:8000
Available endpoints:

```
GET /health
```

```
POST /query
Content-Type: application/json

{
  "query": "What is artificial intelligence?",
  "use_context": true,
  "n_results": 3
}
```
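
For example, the same request can be sent from Python. This is a sketch that assumes the server is running locally on the default port and returns a JSON body with `response` and `retrieved_documents` fields (the fields used by the Streamlit example further below):

```python
import requests

# Assumes the FastAPI MCP server (main.py) is running on localhost:8000.
resp = requests.post(
    "http://localhost:8000/query",
    json={"query": "What is artificial intelligence?", "use_context": True, "n_results": 3},
)
resp.raise_for_status()
result = resp.json()
print(result["response"])
print(f"Retrieved {len(result['retrieved_documents'])} documents")
```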
```
POST /search
Content-Type: application/json

{
  "query": "MCP protocol",
  "n_results": 5
}
```
```
POST /documents
Content-Type: application/json

{
  "documents": [
    "Document text 1",
    "Document text 2"
  ],
  "ids": ["doc1", "doc2"],
  "metadata": [
    {"source": "file1.txt"},
    {"source": "file2.txt"}
  ]
}
```
```
GET /stats
```
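
The two GET endpoints can be checked the same way (a minimal sketch; the exact response payloads depend on main.py and are not documented here):

```python
import requests

BASE = "http://localhost:8000"  # assumed default host/port

print(requests.get(f"{BASE}/health").json())  # liveness check
print(requests.get(f"{BASE}/stats").json())   # vector store statistics
```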
The agent can also be used directly from Python:

```python
from rag_agent import RAGAgent

# Initialize agent
agent = RAGAgent(
    ollama_url="http://localhost:11434",
    model="mistral"
)

# Get a response
result = agent.get_response("What is RAG?")
print(result["response"])
print(f"Retrieved {len(result['retrieved_documents'])} documents")
```
Edit mcp_config.yaml to customize the agent's settings (for example, the vector store persist_dir).
To add your own documents programmatically:

```python
from tools.chromadb_tool import ChromaTool

tool = ChromaTool()
documents = [
    "Your document text 1",
    "Your document text 2"
]
tool.add_documents(documents, ids=["id1", "id2"])
```
Or add documents through the API:

```bash
curl -X POST http://localhost:8000/documents \
  -H "Content-Type: application/json" \
  -d '{
    "documents": ["Document 1", "Document 2"],
    "ids": ["doc1", "doc2"]
  }'
```
Create streamlit_app.py:
```python
import streamlit as st
import requests

st.set_page_config(page_title="RAG Agent", layout="wide")
st.title("MCP-Powered Agentic RAG")

query = st.text_input("Ask a question:")

if query:
    response = requests.post(
        "http://localhost:8000/query",
        json={"query": query}
    )
    result = response.json()

    st.subheader("Response")
    st.write(result["response"])

    st.subheader("Retrieved Context")
    for i, doc in enumerate(result["retrieved_documents"], 1):
        st.write(f"**Doc {i}**: {doc[:200]}...")
```
Run Streamlit:
```bash
streamlit run streamlit_app.py
```
Troubleshooting checklist:

- Make sure the Ollama server is running (`ollama serve`) and reachable: `curl http://localhost:11434/api/tags`
- If embeddings fail, install the embedding dependency: `pip install sentence-transformers`
- Check that the `./vector_store/` directory exists and is writable
- Check that `persist_dir` in the configuration matches the actual path

MIT License - See LICENSE file for details.
Contributions are welcome!
To use the server with Claude Desktop, add the following to claude_desktop_config.json and restart Claude Desktop:
```json
{
  "mcpServers": {
    "agentic-rag-mcp-server": {
      "command": "npx",
      "args": []
    }
  }
}
```