Enables LLMs to read and extract content from PDF files with high-fidelity LaTeX recognition and layout awareness using a Python-based extraction engine. It includes a robust Node.js fallback and supports page range filtering for efficient processing of large documents.
An MCP server that enables reading PDF file contents, allowing PDF documents to be used as a knowledge base for LLMs.
Node.js fallback: the server falls back to pdf-parse if the Python environment is unavailable or fails, so extraction still succeeds (albeit with lower formatting quality).
pip is needed for the Python backend (for high-quality extraction).
Install Node.js dependencies:
npm install
Install Python dependencies (Recommended): To enable high-quality extraction (especially for scientific papers with math), install the Python dependencies.
# Create or activate a virtual environment if desired
python3 -m pip install -r python/requirements.txt
Note: The first time you run the tool with the Python backend, it will download necessary AI models (OCR, layout analysis, etc.) to a local cache. This download is approximately 3.3GB. Ensure you have a stable internet connection.
Build the server:
npm run build
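Because the first Python-backed run downloads roughly 3.3GB of models into the project, it can help to check free disk space beforehand. The following is a minimal sketch, not part of the project: `has_room_for_models` and `MODEL_DOWNLOAD_BYTES` are hypothetical names, and the size is only the approximate figure quoted above.

```python
import shutil
from pathlib import Path

# Approximate model download size quoted in the note above (about 3.3 GB).
# This constant is an assumption for illustration, not a value read from the project.
MODEL_DOWNLOAD_BYTES = int(3.3 * 1024**3)


def has_room_for_models(project_dir, required_bytes=MODEL_DOWNLOAD_BYTES):
    """Return True if the filesystem holding project_dir has enough free space."""
    usage = shutil.disk_usage(Path(project_dir))
    return usage.free >= required_bytes


if __name__ == "__main__":
    if has_room_for_models("."):
        print("Enough free space for the model cache.")
    else:
        print("Not enough free space for the model cache.")
```

Run it from the project directory, since the models are cached in a `.cache` directory inside the project.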
Add this to your MCP settings configuration:
{
  "mcpServers": {
    "pdf-reader": {
      "command": "node",
      "args": ["/absolute/path/to/mcpPdf/dist/index.js"],
      "env": {
        // Optional: Override where python is found if not in venv or path
        // "PYTHON_PATH": "/path/to/python"
      }
    }
  }
}
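If your MCP client expects strict JSON, the commented-out `PYTHON_PATH` line above may not parse. As a hedged sketch, the script below adds the same `pdf-reader` entry to a settings file programmatically and writes `PYTHON_PATH` as a real key only when you pass one. The helper name `add_pdf_reader` and the settings file location are assumptions; the script also assumes any existing file already contains valid JSON.

```python
import json
from pathlib import Path


def add_pdf_reader(settings_path, index_js, python_path=None):
    """Add or replace the "pdf-reader" entry in an MCP settings file (strict JSON)."""
    path = Path(settings_path)
    # Start from the existing settings if the file is present, else from an empty object.
    settings = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    # Only emit PYTHON_PATH when the caller supplies an interpreter path.
    env = {"PYTHON_PATH": python_path} if python_path else {}
    settings.setdefault("mcpServers", {})["pdf-reader"] = {
        "command": "node",
        "args": [str(index_js)],
        "env": env,
    }
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings


if __name__ == "__main__":
    # Both paths are placeholders; substitute your own.
    add_pdf_reader("mcp_settings.json", "/absolute/path/to/mcpPdf/dist/index.js")
```

Other servers already listed under `mcpServers` are left untouched; only the `pdf-reader` key is written.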
read_pdf: Reads and extracts text content from a PDF file.
Inputs:
path (string): Absolute path to the PDF file.
start_page (number, optional): Starting page number (1-based).
end_page (number, optional): Ending page number (1-based).
How it works:
The server first runs the Python convert.py script.
The script loads the marker models from the local cache (.cache directory in the project).
If the Python backend is unavailable or fails, the server falls back to pdf-parse (a native Node.js library).

The server keeps models in a local .cache directory to avoid system permission issues. If you encounter errors, ensure the project directory is writable.
For large documents, use the start_page and end_page arguments to extract only what you need.

Add this to claude_desktop_config.json and restart Claude Desktop.
{
  "mcpServers": {
    "pdf-mcp-server": {
      "command": "npx",
      "args": []
    }
  }
}
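The read_pdf behaviour described earlier (optional 1-based page arguments, Python backend first, pdf-parse as fallback) can be illustrated with a small sketch. This is not the server's actual code: `resolve_page_range` and `extract_with_fallback` are hypothetical helpers, the end page is assumed to be inclusive, and rejecting out-of-range values (rather than clamping them) is a design choice made for this example only.

```python
from typing import Callable, Optional, Tuple


def resolve_page_range(total_pages: int,
                       start_page: Optional[int] = None,
                       end_page: Optional[int] = None) -> Tuple[int, int]:
    """Turn optional 1-based page arguments into a concrete (first, last) pair.

    Assumes end_page is inclusive; missing values default to the whole document.
    """
    first = 1 if start_page is None else start_page
    last = total_pages if end_page is None else end_page
    if first < 1 or last > total_pages or first > last:
        raise ValueError(
            f"invalid page range {first}-{last} for a {total_pages}-page PDF"
        )
    return first, last


def extract_with_fallback(path: str,
                          primary: Callable[[str], str],
                          fallback: Callable[[str], str]) -> Tuple[str, str]:
    """Try the high-quality backend first; on any error, use the fallback.

    Returns (backend_name, text) so the caller can tell which path was taken.
    """
    try:
        return "primary", primary(path)
    except Exception:
        return "fallback", fallback(path)


if __name__ == "__main__":
    # Stand-in backends for demonstration; the real ones would call
    # convert.py and pdf-parse respectively.
    def failing_python_backend(path: str) -> str:
        raise RuntimeError("python environment unavailable")

    def plain_text_backend(path: str) -> str:
        return f"plain text of {path}"

    print(resolve_page_range(120, start_page=10, end_page=25))
    print(extract_with_fallback("/absolute/path/to/paper.pdf",
                                failing_python_backend, plain_text_backend))
```

Returning the backend name alongside the text makes it easy to warn the user when the lower-fidelity fallback was used.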