Enables processing of Excel, Word, and PDF documents through reading, writing, and metadata extraction. It leverages a secure Pyodide WebAssembly environment to handle document parsing with support for paginated results.
A Python-powered document-processing MCP server with MCP Apps. Process Excel, Word, PDF, and PowerPoint documents using Python, and view them through an interactive MCP App.
- .xlsx files with sheet support and pagination
- .docx files with paragraph and table support
- .pdf files with text extraction and pagination
- .pptx files with slide content extraction
- .txt, .csv, .md, .json, .yaml, .yml with pagination support

Add to your MCP client configuration (e.g., Claude Desktop or Cline):
Via npx (recommended):
{
"mcpServers": {
"docsmith": {
"command": "npx",
"args": ["-y", "docsmith-mcp"],
"env": {
"DOC_PAGE_SIZE": "100"
}
}
}
}
Via global installation:
npm install -g docsmith-mcp
{
"mcpServers": {
"docsmith": {
"command": "docsmith-mcp",
"env": {
"DOC_PAGE_SIZE": "100"
}
}
}
}
Via local path:
{
"mcpServers": {
"docsmith": {
"command": "node",
"args": ["/path/to/docsmith-mcp/dist/index.js"]
}
}
}
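If the client configuration already lists other servers, the docsmith entry has to be merged into the existing `mcpServers` object rather than pasted over the file. The following is a minimal Python sketch of that merge, assuming the npx variant shown above. The helper name `add_docsmith_server` is hypothetical and not part of docsmith-mcp.

```python
import json


def add_docsmith_server(config: dict, page_size: int = 100) -> dict:
    """Return a copy of `config` with a "docsmith" entry under "mcpServers".

    Hypothetical helper for illustration; it only edits a dict in memory.
    """
    merged = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    servers = merged.setdefault("mcpServers", {})
    servers["docsmith"] = {
        "command": "npx",
        "args": ["-y", "docsmith-mcp"],
        # Environment variables are passed as strings.
        "env": {"DOC_PAGE_SIZE": str(page_size)},
    }
    return merged


existing = {"mcpServers": {"other": {"command": "other-server"}}}
print(json.dumps(add_docsmith_server(existing, page_size=50), indent=2))
```

Existing entries are left untouched, and the input dict is not modified.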
Then use the read_document tool:
{
"file_path": "/path/to/document.xlsx",
"mode": "paginated",
"page": 1,
"page_size": 50
}
The MCP App will automatically open to display the document content beautifully.
| Format | Extensions | Read | Write | Notes |
|---|---|---|---|---|
| Excel | .xlsx | ✅ | ✅ | Multi-sheet support, pagination |
| Word | .docx | ✅ | ✅ | Paragraphs and tables |
| PDF | .pdf | ✅ | ❌ | Text extraction with pagination |
| PowerPoint | .pptx | ✅ | ❌ | Slide content extraction |
| CSV | .csv | ✅ | ✅ | - |
| Text | .txt, .md | ✅ | ✅ | Pagination support |
| JSON | .json | ✅ | ✅ | - |
| YAML | .yaml, .yml | ✅ | ✅ | - |
Read document content with automatic format detection.
Parameters:
- file_path (string, required): Path to the document
- mode (string, optional): "paginated" or "raw" (default: "paginated")
- page (number, optional): Page number for paginated mode (default: 1)
- page_size (number, optional): Items per page (default: 100)
- sheet_name (string, optional): Sheet name for Excel files

Example:
{
"file_path": "/path/to/document.xlsx",
"mode": "paginated",
"page": 1,
"page_size": 50,
"sheet_name": "Sheet1"
}
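To make the relationship between `page` and `page_size` concrete, here is a small Python sketch of paginated slicing. It assumes pages are 1-indexed (suggested by the default of 1) and that each page is a contiguous slice of items. The function name and the shape of the returned dict are hypothetical, not the server's actual response format.

```python
import math


def paginate(items, page=1, page_size=100):
    """Return one 1-indexed page of `items` plus basic paging info.

    Illustrative only: the real server's response fields may differ.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return {
        "page": page,
        "page_size": page_size,
        "total_pages": math.ceil(len(items) / page_size),
        "items": items[start:start + page_size],
    }


rows = list(range(120))          # stand-in for 120 spreadsheet rows
second = paginate(rows, page=2, page_size=50)
print(second["total_pages"], second["items"][0], second["items"][-1])
```

Under these assumptions, a 120-row sheet read with a page size of 50 yields three pages, the last one holding the remaining 20 rows.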
Write document content.
Parameters:
- file_path (string, required): Output path
- format (string, required): "excel", "word", "csv", "txt", "json", "yaml"
- data (array/object, required): Document content

Example:
{
"file_path": "/path/to/output.xlsx",
"format": "excel",
"data": [
["Product", "Q1", "Q2"],
["Laptop", 100, 150],
["Mouse", 500, 600]
]
}
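The example above passes Excel data as an array of rows with the header row first. When the data starts out as a list of records, it needs to be flattened into that shape. The sketch below shows one way to do so in Python; both helper names are hypothetical, and it assumes the header-first row layout from the example is what the tool expects.

```python
def records_to_rows(records):
    """Convert a list of dicts into [header_row, *value_rows]."""
    headers = []
    for record in records:
        for key in record:
            if key not in headers:      # keep first-seen column order
                headers.append(key)
    rows = [headers]
    for record in records:
        # Missing keys become None so every row has the same width.
        rows.append([record.get(h) for h in headers])
    return rows


def build_write_request(file_path, records):
    """Assemble a request body in the shape of the example above."""
    return {
        "file_path": file_path,
        "format": "excel",
        "data": records_to_rows(records),
    }


records = [
    {"Product": "Laptop", "Q1": 100, "Q2": 150},
    {"Product": "Mouse", "Q1": 500, "Q2": 600},
]
print(build_write_request("/path/to/output.xlsx", records))
```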
Get document metadata without reading full content.
Parameters:
- file_path (string, required): Path to the document

Example:
{
"file_path": "/path/to/document.pdf"
}
Execute Python code for flexible file operations, data processing, and custom tasks. Supports any file format and Python libraries.
Parameters:
- code (string, required): Python code to execute
- packages (object, optional): Package mappings (import_name -> pypi_name) for required dependencies
- file_paths (array, optional): File paths that the code needs to access

Examples:
Read and process any file:
{
"code": "import json\nwith open('/path/to/file.json') as f:\n data = json.load(f)\n result = len(data)\n print(json.dumps({'count': result}))",
"file_paths": ["/path/to/file.json"]
}
Batch rename files with regex:
{
"code": "import os, re\nfolder = '/path/to/files'\nfor name in os.listdir(folder):\n new_name = re.sub(r'old_', 'new_', name)\n os.rename(os.path.join(folder, name), os.path.join(folder, new_name))\nprint(json.dumps({'success': True}))",
"file_paths": ["/path/to/files"]
}
Process data with pandas:
{
"code": "import pandas as pd\ndf = pd.read_csv('/path/to/data.csv')\nsummary = df.describe().to_dict()\nprint(json.dumps(summary))",
"packages": {"pandas": "pandas"},
"file_paths": ["/path/to/data.csv"]
}
Extract archive files:
{
"code": "import zipfile, os\nwith zipfile.ZipFile('/path/to/archive.zip', 'r') as z:\n z.extractall('/path/to/output')\nfiles = os.listdir('/path/to/output')\nprint(json.dumps({'extracted_files': files}))",
"file_paths": ["/path/to/archive.zip", "/path/to/output"]
}
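Writing multi-line Python inside a JSON string by hand is error-prone, since every newline and quote must be escaped. A possible approach is to write the code as a normal Python string and let `json.dumps` handle the escaping. The helper below is a hypothetical sketch that only builds the request body using the three parameters listed above; it does not call the server.

```python
import json
import textwrap


def build_run_python_request(source, file_paths=None, packages=None):
    """Build a request body with "code" and the optional fields."""
    # dedent + strip lets the source be written as an indented block.
    request = {"code": textwrap.dedent(source).strip()}
    if packages:
        request["packages"] = dict(packages)
    if file_paths:
        request["file_paths"] = list(file_paths)
    return request


source = """
    import json
    with open('/path/to/file.json') as f:
        data = json.load(f)
    print(json.dumps({'count': len(data)}))
"""

payload = build_run_python_request(source, file_paths=["/path/to/file.json"])
print(json.dumps(payload, indent=2))  # newlines in "code" are escaped as \n
```

Because the code is kept as ordinary source until serialization, it can be checked with `compile()` before it is sent, which catches problems such as a missing `import json`.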
The built-in MCP App provides an interactive interface for viewing documents. It is built with React 19, Tailwind CSS v4, and Lucide icons.
Environment variables for customizing behavior:
| Variable | Description | Default |
|---|---|---|
| DOC_RAW_FULL_READ | Enable full raw read mode | false |
| DOC_PAGE_SIZE | Default items per page | 100 |
| DOC_MAX_FILE_SIZE | Max file size in MB | 50 |
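As an illustration of how these variables and their defaults combine, the following Python sketch resolves them from an environment mapping. It is not the server's actual parsing logic: the function name is hypothetical, and treating only the case-insensitive string "true" as enabling raw full read is an assumption.

```python
import os


def load_settings(env=None):
    """Resolve the three documented variables, falling back to defaults.

    Assumption: booleans are written as "true"/"false"; numbers are integers.
    """
    env = os.environ if env is None else env
    return {
        "raw_full_read": env.get("DOC_RAW_FULL_READ", "false").strip().lower() == "true",
        "page_size": int(env.get("DOC_PAGE_SIZE", "100")),
        "max_file_size_mb": int(env.get("DOC_MAX_FILE_SIZE", "50")),
    }


print(load_settings({}))                          # all defaults
print(load_settings({"DOC_PAGE_SIZE": "25"}))     # one override
```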
See CONTRIBUTING.md for development setup and contribution guidelines.
License: MIT
Add this to claude_desktop_config.json and restart Claude Desktop.
{
"mcpServers": {
"docsmith-mcp": {
"command": "npx",
"args": []
}
}
}