Comprehensive MCP server for academic research workflows, enabling paper searching across multiple sources, manuscript processing with citation placeholders, search caching, and citation export.
Fork Notice: This is a fork of openags/paper-toolkit-mcp, originally created by P.S Zhang. This fork extends the project with manuscript processing, search caching, and citation export features. Both versions are licensed under MIT.
A comprehensive MCP toolkit for paper searching, manuscript processing, and academic research workflows. The project follows a free-first strategy: prioritize open and public data sources, support optional API keys when they improve stability or coverage, and keep source-specific connectors extensible for advanced users.
New Features (v0.2.0): Manuscript processing with citation placeholders, search caching, BibTeX/RIS export, and one-click Word document generation.
- paper-toolkit manuscript command
- refs.ris to Zotero (optional)
- draft_final.docx

[@doi:10.1038/s41591-020-0001-2]
[@pmid:32145678]
[@arxiv:2106.12345]
[@title:Attention Is All You Need]
# Basic usage (generates formatted markdown + BibTeX + RIS)
paper-toolkit manuscript draft.md
# With Word document generation (requires pandoc)
paper-toolkit manuscript draft.md --docx
# Specify citation style
paper-toolkit manuscript draft.md -s apa
paper-toolkit manuscript draft.md -s ieee
paper-toolkit manuscript draft.md -s gb7714
# Custom output directory
paper-toolkit manuscript draft.md -o ./output
# Disable specific outputs
paper-toolkit manuscript draft.md --no-bib --no-ris
| Style | Code | Description |
|---|---|---|
| GB/T 7714-2015 | gb7714 | Chinese national standard (numeric) |
| APA 7th | apa | American Psychological Association |
| IEEE | ieee | Institute of Electrical and Electronics Engineers |
| Vancouver | vancouver | International Committee of Medical Journal Editors |
| Harvard | harvard | Author-date format |
After processing, you get:
- draft_formatted.md - Markdown with numbered citations [1], [2], ...
- draft_final.docx - Word document (if --docx is used and pandoc is installed)
- refs.bib - BibTeX file (can be imported into Zotero/JabRef)
- refs.ris - RIS file (Zotero/EndNote/Mendeley compatible)
- draft_references.txt - Plain text reference list

.paper_cache/

your_project/
├── draft.md
├── refs.bib
└── .paper_cache/ ← Cache is here
    ├── abc123.json ← Cached search results
    └── def456.json
# List cached items
paper-toolkit cache list
# Clear all cache
paper-toolkit cache clear
Or via MCP tools: cache_list(), cache_clear()
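How a search ends up as a file such as abc123.json is not spelled out in this README. The sketch below shows one plausible scheme, assuming the file name is a hash of the query plus the sorted source list. `cache_path` and `cached_search` are hypothetical names for illustration, not the project's API.

```python
import hashlib
import json
from pathlib import Path


def cache_path(query, sources, cache_dir):
    """Derive a stable file name from the query and the sorted source list."""
    payload = json.dumps({"q": query, "s": sorted(sources)}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return Path(cache_dir) / (digest + ".json")


def cached_search(query, sources, search_fn, cache_dir=".paper_cache"):
    """Return cached results when present; otherwise call search_fn and store them."""
    path = cache_path(query, sources, cache_dir)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    results = search_fn(query, sources)  # assumed to return JSON-serializable data
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(results), encoding="utf-8")
    return results
```

Sorting the source list before hashing means `-s arxiv,semantic` and `-s semantic,arxiv` would share one cache entry under this assumed scheme.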
# Search papers
paper-toolkit search "machine learning" -s arxiv,semantic -n 10
# Download PDF
paper-toolkit download arxiv 2106.12345
# Read paper (extract text)
paper-toolkit read arxiv 2106.12345
# Get paper metadata
paper-toolkit search "attention is all you need" -s crossref -n 1
# Process manuscript
paper-toolkit manuscript draft.md -s gb7714 --docx
# Cache management
paper-toolkit cache list
paper-toolkit cache clear
# List available sources
paper-toolkit sources
Write your paper in Markdown with citation placeholders, then generate a formatted Word document with references automatically:
# Introduction
Deep learning has made significant progress in medical imaging[@doi:10.1038/s41591-020-0001-2].
Transformer architecture revolutionized NLP[@title:Attention Is All You Need].
Process it:
paper-toolkit manuscript draft.md -s gb7714 --docx
Output:
- draft_formatted.md - Text with numbered citations [1], [2], ...
- refs.bib - BibTeX file (for Zotero/EndNote import)
- refs.ris - RIS file (Zotero compatible)
- draft_final.docx - Word document with formatted references

Supported placeholders: [@doi:...], [@pmid:...], [@arxiv:...], [@title:...]
Supported citation styles: GB/T 7714-2015, APA 7th, IEEE, Vancouver, Harvard
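To make the placeholder step concrete, here is a minimal sketch of how the four placeholder forms could be detected and swapped for numbered citations. It assumes numbers are assigned in order of first appearance and that a repeated reference reuses its number; the regex and `number_citations` are illustrative, not the tool's actual implementation.

```python
import re
from typing import Dict, List, Tuple

# Matches [@doi:...], [@pmid:...], [@arxiv:...] and [@title:...].
PLACEHOLDER = re.compile(r"\[@(doi|pmid|arxiv|title):([^\]]+)\]")


def number_citations(text: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Replace placeholders with [n] and return the references in citation order."""
    order: Dict[Tuple[str, str], int] = {}

    def substitute(match):
        key = (match.group(1), match.group(2).strip())
        # Reuse the existing number when the same reference is cited again.
        number = order.setdefault(key, len(order) + 1)
        return "[%d]" % number

    formatted = PLACEHOLDER.sub(substitute, text)
    return formatted, list(order)


draft = "Imaging[@doi:10.1038/s41591-020-0001-2] and NLP[@title:Attention Is All You Need]."
formatted, references = number_citations(draft)
print(formatted)
print(references)
```

Each `(kind, identifier)` pair in the returned list would then be resolved to metadata and rendered in the chosen citation style.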
Search results are automatically cached in .paper_cache/ (relative to current working directory):
- .paper_cache/ to clear

- process_manuscript - Process manuscript with citations
- get_paper_metadata - Get paper metadata by identifier
- export_references - Export references in BibTeX/RIS/text format
- cache_list / cache_clear - Manage search cache

paper-toolkit-mcp is a Python-based tool for searching and downloading academic papers from various platforms. It provides tools for searching papers, downloading PDFs, and extracting text, making it ideal for researchers and AI-driven workflows. It can be used as an MCP server (for Claude Desktop and other MCP clients) or as a Claude Code skill with a CLI interface.
- search_papers for multi-source concurrent search & deduplication, and download_with_fallback relying on publisher open access links with sequential fallbacks.
- Paper class.
- download_with_fallback now follows source-native download → OpenAIRE/CORE/Europe PMC/PMC discovery → Unpaywall DOI resolution → optional Sci-Hub.
- academic_platforms module.

The long-term goal is not to depend on a single search engine, but to combine multiple free and public sources with clear roles:
Recommended free-first roadmap:
This matrix reflects verified live-integration results from functional and end-to-end regression tests in this repository. Columns show the highest capability level observed under normal conditions.
| Platform | Search | Download | Read | Notes |
|---|---|---|---|---|
| arXiv | ✅ | ✅ | ✅ | Open API; reliable |
| PubMed | ✅ | ❌ | ⚠️ info-only | Open API; reliable |
| bioRxiv | ✅ | ✅ | ✅ | Open API; reliable |
| medRxiv | ✅ | ✅ | ✅ | Open API; reliable |
| Google Scholar | ⚠️ | ❌ | ❌ | Bot-detection active; set paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL |
| IACR | ✅ | ✅ | ✅ | Open API; reliable |
| Semantic Scholar | ✅ | ✅ (OA) | ✅ (OA) | Works without key (rate-limited); key improves limits; key rejection (403) retried automatically without key |
| Crossref | ✅ | ❌ | ⚠️ info-only | Open API; reliable |
| OpenAlex | ✅ | ❌ | ⚠️ info-only | Open API; reliable |
| PMC | ✅ | ✅ (OA only) | ✅ (OA only) | OA PDFs only; direct download may be blocked by some proxy environments |
| CORE | ✅ | ✅ (record-dependent) | ✅ (record-dependent) | Free key recommended; connector retries with backoff and falls back to key-less on 401/403 |
| Europe PMC | ✅ | ✅ (OA) | ✅ (OA) | OA PDFs only; direct download may be blocked by some proxy environments |
| dblp | ✅ | ❌ | ⚠️ info-only | Open API; reliable |
| OpenAIRE | ✅ | ❌ | ❌ | Open API; retries 3× with escalating request profiles on transient 403 |
| CiteSeerX | ⚠️ | ✅ (record-dependent) | ⚠️ | API endpoint intermittently unavailable / redirects to web archive |
| DOAJ | ✅ | ⚠️ (URL-dependent) | ⚠️ (URL-dependent) | PDF availability varies by article; free key raises rate limits |
| BASE | ⚠️ | ✅ (record-dependent) | ✅ (record-dependent) | OAI-PMH endpoint requires institutional IP registration; returns empty gracefully otherwise |
| Zenodo | ✅ | ✅ (record-dependent) | ✅ (record-dependent) | Open API; reliable |
| HAL | ✅ | ✅ (record-dependent) | ✅ (record-dependent) | Open API; reliable |
| SSRN | ⚠️ | ⚠️ best-effort | ⚠️ best-effort | 403 bot-detection active; public PDF only |
| Unpaywall | ✅ (DOI lookup) | ❌ | ❌ | Requires paper_toolkit_mcp_UNPAYWALL_EMAIL |
| Sci-Hub (optional) | ⚠️ fallback-only | ✅ | ❌ | Optional; unstable mirrors; user responsibility |
| IEEE Xplore 🔑 | 🚧 skeleton | 🚧 skeleton | 🚧 skeleton | Requires paper_toolkit_mcp_IEEE_API_KEY to activate |
| ACM DL 🔑 | 🚧 skeleton | 🚧 skeleton | 🚧 skeleton | Requires paper_toolkit_mcp_ACM_API_KEY to activate |
✅ = reliable in live tests. ⚠️ = works but subject to upstream instability or access restrictions. ❌ = not supported. 🔑 = key required. 🚧 = skeleton only.
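With this many sources, a multi-source search such as search_papers has to tolerate individual failures and merge overlapping records. The sketch below illustrates one way to do that, assuming each connector is a callable returning a list of dicts with optional `doi` and `title` keys; `search_all` and `dedup_key` are hypothetical names, and the project's real deduplication rules may differ.

```python
from concurrent.futures import ThreadPoolExecutor


def dedup_key(paper):
    """Prefer the DOI; otherwise fall back to a normalized title."""
    doi = (paper.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    title = "".join(ch for ch in paper.get("title", "").lower() if ch.isalnum())
    return "title:" + title


def search_all(query, connectors):
    """Run every connector in parallel and keep the first record seen per key."""
    with ThreadPoolExecutor(max_workers=max(1, len(connectors))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in connectors.items()}
    merged = {}
    for name, future in futures.items():
        try:
            papers = future.result()
        except Exception:
            continue  # one blocked or failing source should not sink the search
        for paper in papers:
            merged.setdefault(dedup_key(paper), paper)
    return list(merged.values())
```

Iterating over the futures in connector order keeps the merged result deterministic even though the requests run concurrently.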
All keys are optional unless noted. Configure them in .env (preferred) or as shell exports.
| Environment Variable | Provider | Required? | How to obtain |
|---|---|---|---|
| paper_toolkit_mcp_UNPAYWALL_EMAIL | Unpaywall | Yes (Unpaywall disabled without it) | Any valid email; register at unpaywall.org |
| paper_toolkit_mcp_CORE_API_KEY | CORE | Recommended | Free at core.ac.uk/services/api |
| paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY | Semantic Scholar | Optional | Free at semanticscholar.org; improves rate limits |
| paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL | Google Scholar | Optional | Your HTTP/HTTPS proxy URL; bypasses bot-detection |
| paper_toolkit_mcp_DOAJ_API_KEY | DOAJ | Optional | Free at doaj.org; raises hourly rate limit |
| paper_toolkit_mcp_ZENODO_ACCESS_TOKEN | Zenodo | Optional | Free at zenodo.org; required for private records |
| paper_toolkit_mcp_IEEE_API_KEY | IEEE Xplore | Required to activate | Free at developer.ieee.org |
| paper_toolkit_mcp_ACM_API_KEY | ACM DL | Required to activate | See libraries.acm.org/digital-library/acm-open |
All variables follow the paper_toolkit_mcp_<NAME> prefix scheme. Legacy names without the prefix (e.g. CORE_API_KEY, UNPAYWALL_EMAIL) are still supported for backward compatibility.
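A minimal sketch of the lookup this implies, assuming the prefixed name wins over the legacy name and that an empty string counts as unset (both are assumptions, as is the helper name `get_setting`):

```python
import os
from typing import Mapping, Optional

PREFIX = "paper_toolkit_mcp_"


def get_setting(name: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Look up PREFIX + name first, then the legacy un-prefixed name."""
    for key in (PREFIX + name, name):
        value = env.get(key, "").strip()
        if value:  # assumption: blank values are treated as "not configured"
            return value
    return None


print(get_setting("CORE_API_KEY", {"CORE_API_KEY": "legacy-key"}))
```

Treating blanks as unset matters here because the sample configs in this README pass empty strings for keys that have not been filled in.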
Some search failures are caused by external provider instability, not by bugs in this project:
| Source | Symptom | Cause | Workaround |
|---|---|---|---|
| Google Scholar | Returns 0 results / empty HTML | Bot-detection (CAPTCHA) | Set paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL to a proxy |
| Semantic Scholar | 429 rate-limited responses | Anonymous access rate limit | Set paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY; if key is rejected (403) connector automatically retries without key |
| CORE | 500 / timeout errors | Unauthenticated rate limiting | Set paper_toolkit_mcp_CORE_API_KEY (free); connector retries with exponential backoff and falls back to key-less on 401/403 |
| OpenAIRE | Transient 403 responses | IP-based session rate limiting | Connector retries 3× per profile, escalating: plain session → XML Accept header → raw requests.get with Mozilla UA |
| CiteSeerX | 404 via web archive redirect | PSU endpoint intermittently redirects to archive | No workaround; connector returns empty gracefully |
| BASE | Search returns 0 results | OAI-PMH endpoint requires institutional IP registration | Register at base-search.net for API access; connector returns empty gracefully otherwise |
| SSRN | HTTP 403 | Bot-detection (Cloudflare) | No workaround; connector tries two endpoints and returns a clear message on failure |
| PMC / Europe PMC | PDF download ProxyError | Local proxy blocking direct HTTPS PDF download | Disable proxy or use download_with_fallback instead |
| Unpaywall | Skipped entirely | UNPAYWALL_EMAIL env var not set | Set paper_toolkit_mcp_UNPAYWALL_EMAIL in .env |
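Several rows above describe the same pattern: retry transient errors with exponential backoff, and drop a rejected API key instead of failing. The sketch below shows that pattern in isolation. It assumes a `fetch(api_key)` callable returning a `(status, body)` pair; the function name, the status list, and the delays are illustrative rather than taken from the connectors.

```python
import time

TRANSIENT = (429, 500, 502, 503, 504)


def fetch_with_retry(fetch, api_key, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(key) until it returns 200, backing off on transient errors.

    A 401/403 while a key is in use switches to key-less requests.
    Returns the response body, or None when every attempt fails.
    """
    key = api_key
    for attempt in range(max_attempts):
        status, body = fetch(key)
        if status == 200:
            return body
        if status in (401, 403) and key is not None:
            key = None  # key rejected: retry anonymously without waiting
            continue
        if status in TRANSIENT and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... by default
            continue
        break
    return None
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.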
IEEE Xplore and ACM Digital Library connectors are included as opt-in skeletons. They are disabled by default — no API calls are made unless you explicitly configure the corresponding keys.
| Platform | Env Var | Status |
|---|---|---|
| IEEE Xplore | paper_toolkit_mcp_IEEE_API_KEY | 🚧 skeleton: search registered, download/read raise NotImplementedError |
| ACM Digital Library | paper_toolkit_mcp_ACM_API_KEY | 🚧 skeleton: search registered, download/read raise NotImplementedError |
How to enable:
export paper_toolkit_mcp_IEEE_API_KEY=<your_ieee_key> # free key at https://developer.ieee.org/
export paper_toolkit_mcp_ACM_API_KEY=<your_acm_key> # see https://libraries.acm.org/digital-library
Once a key is set, the corresponding source is automatically added to ALL_SOURCES and its MCP tools (search_ieee / search_acm, download_ieee / download_acm, read_ieee_paper / read_acm_paper) are registered at server startup.
Without a key the connectors log a startup warning only — the rest of the server is unaffected.
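The opt-in behaviour can be pictured as a small gate at startup. This is only a sketch under stated assumptions: `DEFAULT_SOURCES` is a shortened stand-in for the real ALL_SOURCES list, and `active_sources` is a hypothetical helper.

```python
import os

# Shortened stand-in; the project's ALL_SOURCES covers many more platforms.
DEFAULT_SOURCES = ["arxiv", "pubmed", "crossref"]

# Sources that are only activated when their key is configured.
OPT_IN_SOURCES = {
    "ieee": "paper_toolkit_mcp_IEEE_API_KEY",
    "acm": "paper_toolkit_mcp_ACM_API_KEY",
}


def active_sources(env=os.environ):
    """Return the default sources plus any opt-in source whose key is set."""
    sources = list(DEFAULT_SOURCES)
    for source, variable in OPT_IN_SOURCES.items():
        if env.get(variable, "").strip():
            sources.append(source)
    return sources


print(active_sources({"paper_toolkit_mcp_IEEE_API_KEY": "example-key"}))
```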
Four additional free-source connectors are now integrated into the MCP server:
- zenodo: Official Zenodo REST API connector (search + record-dependent PDF/read support).
- hal: HAL public API connector (search + record-dependent PDF/read support).
- ssrn: Discovery-first connector with hardened parser and best-effort download/read when a direct public PDF link is available.
- unpaywall: DOI-centric OA metadata source for standalone lookup (search_unpaywall) and fallback URL resolution.

SSRN integration remains compliance-first: it only attempts direct public PDF links exposed by SSRN pages. If login/restricted delivery is required, the connector returns a clear message instead of bypassing access controls.
Sci-Hub support can remain available as an optional connector for users who explicitly choose to enable it, but it should not be treated as the default or recommended full-text path.
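The ordering behind download_with_fallback (source-native download, then repository discovery, then Unpaywall, with Sci-Hub last and optional) is a sequential fallback chain. A minimal sketch of that control flow follows; it assumes each step is a callable that returns a PDF URL or None, and the step names and `resolve_with_fallback` itself are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Each step takes a paper identifier and returns a PDF URL, or None if it has nothing.
Resolver = Callable[[str], Optional[str]]


def resolve_with_fallback(
    paper_id: str, steps: List[Tuple[str, Resolver]]
) -> Tuple[Optional[str], List[str]]:
    """Try each step in order; return the first URL found and the steps attempted."""
    attempted: List[str] = []
    for name, resolver in steps:
        attempted.append(name)
        try:
            url = resolver(paper_id)
        except Exception:
            continue  # a failing upstream should not abort the chain
        if url:
            return url, attempted
    return None, attempted
```

Because the chain stops at the first hit, an optional last-resort step is only reached when every open-access route has come back empty, and it can be left out of `steps` entirely.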
Choose the method that best fits your workflow. All methods support the same optional API keys.
MCP Server Config file locations (for methods below)
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
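If you script your setup, the three locations above can be resolved programmatically. The helpers below are a sketch, not part of the project: `claude_config_path` picks the path for the current platform, and `add_server` merges one server entry into an already loaded config dict without touching other entries.

```python
import os
import sys
from pathlib import Path

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(platform=sys.platform, env=os.environ):
    """Return the Claude Desktop config path for the given platform string."""
    if platform == "darwin":
        return Path.home() / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    if platform.startswith("win"):
        return Path(env["APPDATA"]) / "Claude" / CONFIG_NAME
    return Path.home() / ".config" / "Claude" / CONFIG_NAME


def add_server(config, name, command, args, env):
    """Return a copy of config with one mcpServers entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args), "env": dict(env)}
    return {**config, "mcpServers": servers}


print(claude_config_path())
```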
This is the most reliable method — you have full control and can customize the installation.
# 1. Clone your forked repo
git clone https://github.com/YOUR_USERNAME/paper-toolkit-mcp.git
cd paper-toolkit-mcp
# 2. Install dependencies (using uv, recommended)
# Install uv if you don't have it: https://docs.astral.sh/uv/getting-started/installation/
uv venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"
# 3. Verify it works
uv run -m paper_toolkit_mcp.server
# or
paper-toolkit search "machine learning" -s arxiv,semantic
Claude Desktop / Trae IDE config (replace the path with your actual clone location):
{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "uv",
      "args": [
        "run",
        "--directory", "D:/Codes/paper-toolkit-mcp",
        "-m", "paper_toolkit_mcp.server"
      ],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "you@example.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": ""
      }
    }
  }
}
For Trae IDE on Windows, edit the MCP settings file at the location shown in Trae's settings UI, or use the python -m method:
{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "python",
      "args": ["-m", "paper_toolkit_mcp.server"]
    }
  }
}
Make sure to run this from your project directory, or set the cwd appropriately.
npx -y @smithery/cli install @openags/paper-toolkit-mcp --client claude
Smithery automatically writes the correct config block for you. No manual JSON editing needed.
uvx (no install, always latest)

uvx runs the package directly from PyPI without a permanent install. Requires uv.
# Install uv (skip if already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
⚠️ macOS note: uvx-generated wrapper scripts rely on realpath, which is not included in macOS by default. If you see a "realpath: command not found" error, either install GNU coreutils (brew install coreutils) or use Method 3 (uv run) instead; it does not have this limitation.
Claude Desktop config:
{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "uvx",
      "args": ["paper-toolkit-mcp"],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "you@example.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": "",
        "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN": "",
        "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL": "",
        "paper_toolkit_mcp_IEEE_API_KEY": "",
        "paper_toolkit_mcp_ACM_API_KEY": ""
      }
    }
  }
}
uv (persistent install)

uv tool install paper-toolkit-mcp
Claude Desktop config:
{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "uv",
      "args": ["tool", "run", "paper-toolkit-mcp"],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "you@example.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": "",
        "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN": "",
        "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL": "",
        "paper_toolkit_mcp_IEEE_API_KEY": "",
        "paper_toolkit_mcp_ACM_API_KEY": ""
      }
    }
  }
}
pip (standard Python install)

pip install paper-toolkit-mcp
Claude Desktop config:
{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "python",
      "args": ["-m", "paper_toolkit_mcp.server"],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "you@example.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": "",
        "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN": "",
        "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL": "",
        "paper_toolkit_mcp_IEEE_API_KEY": "",
        "paper_toolkit_mcp_ACM_API_KEY": ""
      }
    }
  }
}
If python is not on your PATH, replace it with the full path (e.g. /usr/bin/python3 or C:\Python311\python.exe). Run which python3 / where python to find it.
npx (via Smithery CLI, no local Python needed)

npx -y @smithery/cli run @openags/paper-toolkit-mcp
Claude Desktop config:
{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "npx",
      "args": ["-y", "@smithery/cli", "run", "@openags/paper-toolkit-mcp"],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "you@example.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": ""
      }
    }
  }
}
docker build -t paper-toolkit-mcp .
docker run --rm -i \
  -e paper_toolkit_mcp_UNPAYWALL_EMAIL=you@example.com \
  -e paper_toolkit_mcp_CORE_API_KEY=your_core_key \
  paper-toolkit-mcp
Claude Desktop config:
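{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "paper_toolkit_mcp_UNPAYWALL_EMAIL",
        "-e", "paper_toolkit_mcp_CORE_API_KEY",
        "-e", "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY",
        "-e", "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN",
        "-e", "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL",
        "-e", "paper_toolkit_mcp_IEEE_API_KEY",
        "-e", "paper_toolkit_mcp_ACM_API_KEY",
        "paper-toolkit-mcp"
      ],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "you@example.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": "",
        "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN": "",
        "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL": "",
        "paper_toolkit_mcp_IEEE_API_KEY": "",
        "paper_toolkit_mcp_ACM_API_KEY": ""
      }
    }
  }
}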
This is the most reliable method on macOS — no wrapper scripts, no realpath issues.
# 1. Install uv (skip if already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# 2. Clone repo
git clone https://github.com/openags/paper-toolkit-mcp.git
cd paper-toolkit-mcp
# 3. Verify it runs (uv auto-resolves dependencies, no manual install needed)
uv run -m paper_toolkit_mcp.server
Claude Desktop config (replace the directory path with your actual clone location):
{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "uv",
      "args": [
        "run",
        "--directory", "/path/to/paper-toolkit-mcp",
        "-m", "paper_toolkit_mcp.server"
      ],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "you@example.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": "",
        "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN": "",
        "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL": "",
        "paper_toolkit_mcp_IEEE_API_KEY": "",
        "paper_toolkit_mcp_ACM_API_KEY": ""
      }
    }
  }
}
For example, if you cloned to /Users/mac/Pengsong/paper-toolkit-mcp:
"args": ["run", "--directory", "/Users/mac/Pengsong/paper-toolkit-mcp", "-m", "paper_toolkit_mcp.server"]
uv run automatically installs dependencies into an isolated environment on first run, so no pip install or venv is needed.
For active development, optionally install an editable copy:
uv venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"
.env file

Instead of putting keys directly in the JSON config, you can store them in a .env file in the project root (auto-loaded on startup):
cp .env.example .env # if running from source
# or create ~/.paper-toolkit-mcp.env for global use
paper_toolkit_mcp_UNPAYWALL_EMAIL=you@example.com
paper_toolkit_mcp_CORE_API_KEY=
paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY=
paper_toolkit_mcp_ZENODO_ACCESS_TOKEN=
paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL=
paper_toolkit_mcp_IEEE_API_KEY=
paper_toolkit_mcp_ACM_API_KEY=
To use a custom path: export paper_toolkit_mcp_ENV_FILE=/absolute/path/to/.env
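The lookup described here can be sketched as follows. The precedence is an assumption, not documented behaviour: an explicit paper_toolkit_mcp_ENV_FILE is used alone, otherwise the project .env is tried before the global ~/.paper-toolkit-mcp.env. Both function names are hypothetical, and the parser handles only plain KEY=VALUE lines.

```python
from pathlib import Path


def env_file_candidates(env, cwd, home):
    """List the .env files to try, in the assumed order of precedence."""
    custom = env.get("paper_toolkit_mcp_ENV_FILE")
    if custom:
        return [Path(custom)]
    return [Path(cwd) / ".env", Path(home) / ".paper-toolkit-mcp.env"]


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blanks, comments and malformed lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = "# keys\npaper_toolkit_mcp_CORE_API_KEY=abc\npaper_toolkit_mcp_IEEE_API_KEY=\n"
print(parse_env_text(sample))
```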
Legacy variable names without the paper_toolkit_mcp_ prefix (e.g. CORE_API_KEY, UNPAYWALL_EMAIL) are still supported for backward compatibility.
We welcome contributions! Here's how to get started:
Fork the Repository: Click "Fork" on GitHub.
Clone and Set Up:
git clone https://github.com/yourusername/paper-toolkit-mcp.git
cd paper-toolkit-mcp
uv venv && source .venv/bin/activate
uv pip install -e ".[dev]"
Make Changes:
- academic_platforms/.
- tests/.

Submit a Pull Request: Push changes and create a PR on GitHub.

- search_papers)
- IEEE_API_KEY)
- ACM_API_KEY)

This project is licensed under the MIT License. See the LICENSE file for details.
Happy researching with paper-toolkit-mcp! If you encounter issues, open a GitHub issue.