Docsmith

Free

Author: mcpc-tech

GitHub

Description

Enables processing of Excel, Word, and PDF documents through reading, writing, and metadata extraction. It leverages a secure Pyodide WebAssembly environment to handle document parsing with support for paginated results.

README

Python-powered document processing MCP server with MCP Apps: process Excel, Word, PDF, and PowerPoint documents using Python, and view them through an interactive MCP App.

Features

Excel: Read/write .xlsx files with sheet support and pagination
Word: Read/write .docx files with paragraph and table support
PDF: Read .pdf files with text extraction and pagination
PowerPoint: Read .pptx files with slide content extraction
Text Files: Read/write .txt, .csv, .md, .json, .yaml, .yml with pagination support
Run Python: Execute Python code for flexible file operations and data processing
MCP App: Beautiful React + Tailwind CSS app for viewing all document types
Flexible Reading Modes: Raw full read or paginated for large files
Powered by Pyodide: Runs in secure WebAssembly sandbox via code-runner-mcp

Quick Start

MCP Configuration

Add to your MCP client configuration (e.g., Claude Desktop, Cline, etc.):

Via npx (recommended):

{
  "mcpServers": {
    "docsmith": {
      "command": "npx",
      "args": ["-y", "docsmith-mcp"],
      "env": {
        "DOC_PAGE_SIZE": "100"
      }
    }
  }
}

Via global installation:

npm install -g docsmith-mcp

{
  "mcpServers": {
    "docsmith": {
      "command": "docsmith-mcp",
      "env": {
        "DOC_PAGE_SIZE": "100"
      }
    }
  }
}

Via local path:

{
  "mcpServers": {
    "docsmith": {
      "command": "node",
      "args": ["/path/to/docsmith-mcp/dist/index.js"]
    }
  }
}
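Client configuration files often already contain other servers, so the entry has to be merged in rather than pasted over the whole file. The following is a minimal sketch of that merge, assuming the config is a JSON object with a top-level `mcpServers` key as in the samples above. The helper name `add_mcp_server` and the `other-server` entry are hypothetical and exist only for illustration.

```python
import copy
import json


def add_mcp_server(config, name, command, args=None, env=None):
    """Return a copy of `config` with one entry added under "mcpServers"."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    entry = {"command": command}
    if args:
        entry["args"] = list(args)
    if env:
        entry["env"] = dict(env)
    updated.setdefault("mcpServers", {})[name] = entry
    return updated


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}

merged = add_mcp_server(
    existing,
    "docsmith",
    "npx",
    args=["-y", "docsmith-mcp"],
    env={"DOC_PAGE_SIZE": "100"},
)
print(json.dumps(merged, indent=2))
```

Writing the result back to disk is left out on purpose, since the config file location depends on the client.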

Then use the read_document tool:

{
  "file_path": "/path/to/document.xlsx",
  "mode": "paginated",
  "page": 1,
  "page_size": 50
}

The MCP App will automatically open to display the document content beautifully.
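MCP clients normally send tool arguments for you, but it can help to see what travels over the wire. MCP uses JSON-RPC 2.0, and a tool invocation is a `tools/call` request whose `params` carry the tool name and its arguments. The sketch below builds such a request for the `read_document` arguments shown above. The helper name `tool_call_request` and the request id are illustrative assumptions.

```python
import json


def tool_call_request(request_id, tool_name, arguments):
    """Wrap tool arguments in a JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = tool_call_request(
    1,
    "read_document",
    {
        "file_path": "/path/to/document.xlsx",
        "mode": "paginated",
        "page": 1,
        "page_size": 50,
    },
)
print(json.dumps(request, indent=2))
```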

Supported Formats

| Format | Extensions | Read | Write | Notes |
| --- | --- | --- | --- | --- |
| Excel | `.xlsx` | ✅ | ✅ | Multi-sheet support, pagination |
| Word | `.docx` | ✅ | ✅ | Paragraphs and tables |
| PDF | `.pdf` | ✅ | ❌ | Text extraction with pagination |
| PowerPoint | `.pptx` | ✅ | ❌ | Slide content extraction |
| CSV | `.csv` | ✅ | ✅ | - |
| Text | `.txt`, `.md` | ✅ | ✅ | Pagination support |
| JSON | `.json` | ✅ | ✅ | - |
| YAML | `.yaml`, `.yml` | ✅ | ✅ | - |

Tools

read_document

Read document content with automatic format detection.

Parameters:

file_path (string, required): Path to the document
mode (string, optional): "paginated" or "raw" (default: "paginated")
page (number, optional): Page number for paginated mode (default: 1)
page_size (number, optional): Items per page (default: 100)
sheet_name (string, optional): Sheet name for Excel files

Example:

{
  "file_path": "/path/to/document.xlsx",
  "mode": "paginated",
  "page": 1,
  "page_size": 50,
  "sheet_name": "Sheet1"
}
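To make the `page` and `page_size` parameters concrete, here is a minimal sketch of 1-based pagination over a list of rows. It illustrates the general technique only. This README does not document the server's internal logic or its response shape, so the `paginate` function and the returned field names are assumptions.

```python
import math


def paginate(items, page=1, page_size=100):
    """Return one 1-based page of `items` plus basic paging metadata."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    total_pages = max(1, math.ceil(len(items) / page_size))
    start = (page - 1) * page_size
    return {
        "page": page,
        "page_size": page_size,
        "total_pages": total_pages,
        "items": items[start:start + page_size],
    }


# 120 sample rows split into pages of 50: two full pages and one partial page.
rows = [[f"row {i}"] for i in range(1, 121)]
first = paginate(rows, page=1, page_size=50)
last = paginate(rows, page=3, page_size=50)
print(len(first["items"]), len(last["items"]), first["total_pages"])
```

A page number past the end yields an empty `items` list in this sketch, which is one common convention among several.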

write_document

Write document content.

Parameters:

file_path (string, required): Output path
format (string, required): "excel", "word", "csv", "txt", "json", "yaml"
data (array/object, required): Document content

Example:

{
  "file_path": "/path/to/output.xlsx",
  "format": "excel",
  "data": [
    ["Product", "Q1", "Q2"],
    ["Laptop", 100, 150],
    ["Mouse", 500, 600]
  ]
}
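The Excel example above passes `data` as a list of rows with the header row first. Data often starts out as a list of records instead, so a small conversion step may be useful. The sketch below assumes that row layout is what the tool expects for tabular formats. The helper name `records_to_rows` is hypothetical.

```python
def records_to_rows(records, columns):
    """Convert a list of dicts into a header row followed by value rows."""
    rows = [list(columns)]
    for record in records:
        # Missing keys become None so every row has the same width.
        rows.append([record.get(column) for column in columns])
    return rows


records = [
    {"Product": "Laptop", "Q1": 100, "Q2": 150},
    {"Product": "Mouse", "Q1": 500, "Q2": 600},
]

arguments = {
    "file_path": "/path/to/output.xlsx",
    "format": "excel",
    "data": records_to_rows(records, ["Product", "Q1", "Q2"]),
}
print(arguments["data"])
```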

get_document_info

Get document metadata without reading full content.

Parameters:

file_path (string, required): Path to the document

Example:

{
  "file_path": "/path/to/document.pdf"
}

run_python

Execute Python code for flexible file operations, data processing, and custom tasks. Supports any file format and Python libraries.

Parameters:

code (string, required): Python code to execute
packages (object, optional): Package mappings (import_name -> pypi_name) for required dependencies
file_paths (array, optional): File paths that the code needs to access

Examples:

Read and process any file:

{
  "code": "import json\nwith open('/path/to/file.json') as f:\n    data = json.load(f)\n    result = len(data)\n    print(json.dumps({'count': result}))",
  "file_paths": ["/path/to/file.json"]
}

Batch rename files with regex:

{
  "code": "import os, re, json\nfolder = '/path/to/files'\nfor name in os.listdir(folder):\n    new_name = re.sub(r'old_', 'new_', name)\n    os.rename(os.path.join(folder, name), os.path.join(folder, new_name))\nprint(json.dumps({'success': True}))",
  "file_paths": ["/path/to/files"]
}

Process data with pandas:

{
  "code": "import json\nimport pandas as pd\ndf = pd.read_csv('/path/to/data.csv')\nsummary = df.describe().to_dict()\nprint(json.dumps(summary))",
  "packages": {"pandas": "pandas"},
  "file_paths": ["/path/to/data.csv"]
}

Extract archive files:

{
  "code": "import json, zipfile, os\nwith zipfile.ZipFile('/path/to/archive.zip', 'r') as z:\n    z.extractall('/path/to/output')\nfiles = os.listdir('/path/to/output')\nprint(json.dumps({'extracted_files': files}))",
  "file_paths": ["/path/to/archive.zip", "/path/to/output"]
}
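Hand-escaping multi-line Python inside a JSON string is error-prone, as the `\n` sequences in the examples above suggest. One way around this is to write the code as a normal multi-line string and let `json.dumps` do the escaping. The sketch below uses the three parameter names from this section. The helper name `build_run_python_args` is hypothetical.

```python
import json
import textwrap


def build_run_python_args(source, file_paths=None, packages=None):
    """Build the argument object for run_python from readable source code."""
    args = {"code": textwrap.dedent(source).strip()}
    if packages:
        args["packages"] = dict(packages)
    if file_paths:
        args["file_paths"] = list(file_paths)
    return args


source = """
    import json
    with open('/path/to/file.json') as f:
        data = json.load(f)
    print(json.dumps({'count': len(data)}))
"""

args = build_run_python_args(source, file_paths=["/path/to/file.json"])
payload = json.dumps(args, indent=2)  # newlines in `code` become \n escapes
print(payload)
```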

MCP App

The built-in MCP App provides a beautiful, interactive interface for viewing documents:

Excel: Interactive tables with sticky headers
PDF: Page-by-page text viewing
Word: Paragraph and table rendering
PowerPoint: Slide navigation

Built with React 19, Tailwind CSS v4, and Lucide icons.

Configuration

Environment variables for customizing behavior:

| Variable | Description | Default |
| --- | --- | --- |
| `DOC_RAW_FULL_READ` | Enable full raw read mode | `false` |
| `DOC_PAGE_SIZE` | Default items per page | `100` |
| `DOC_MAX_FILE_SIZE` | Max file size in MB | `50` |
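As an illustration of how these variables and their defaults fit together, the sketch below resolves them from an environment mapping. It is not the server's own parsing code. The function name `load_settings`, the returned keys, and the set of accepted truthy spellings are assumptions.

```python
import os


def load_settings(env=None):
    """Resolve the documented variables, falling back to the listed defaults."""
    env = os.environ if env is None else env
    raw_flag = env.get("DOC_RAW_FULL_READ", "false").strip().lower()
    return {
        # Assumption: these spellings count as "enabled".
        "raw_full_read": raw_flag in ("1", "true", "yes"),
        "page_size": int(env.get("DOC_PAGE_SIZE", "100")),
        "max_file_size_mb": int(env.get("DOC_MAX_FILE_SIZE", "50")),
    }


defaults = load_settings({})
custom = load_settings({"DOC_PAGE_SIZE": "50", "DOC_RAW_FULL_READ": "true"})
print(defaults)
print(custom)
```

Note that values in the `env` block of an MCP client configuration are strings, which is why `"DOC_PAGE_SIZE": "100"` is quoted in the samples above.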

Contributing

See CONTRIBUTING.md for development setup and contribution guidelines.

License

MIT

from github.com/mcpc-tech/docsmith-mcp

How to install

Add this to claude_desktop_config.json and restart Claude Desktop.

{
  "mcpServers": {
    "docsmith-mcp": {
      "command": "npx",
      "args": ["-y", "docsmith-mcp"]
    }
  }
}
