Provides Latin NLP tools for tokenization, lemmatization, POS tagging, reported speech detection, and LiLa Knowledge Base querying via MCP.
A Model Context Protocol (MCP) server for Latin Natural Language Processing (NLP), reported speech detection, and LiLa Knowledge Base querying.
The server provides a pipeline of interoperable MCP tools for Latin NLP and Digital Humanities workflows.
The tools are designed to be used sequentially, but may also be used independently.
| Tool | Description |
|------|-------------|
| tokenize_latin_text | Tokenize Latin text with sentence splitting and enclitic handling (e.g. -que splitting) |
| parser | Morphological analysis and preprocessing using UDPipe |
| detect_reported_speech_from_text | Transformer-based reported speech detection |
| get_lila_lemma_info | Query the LiLa Knowledge Base for lemma information |
| get_lila_lemma_tokens_dataframe | Retrieve LiLa corpus token occurrences and count attestations per work |
| export_lila_lemma_tokens_csv | Export LiLa corpus token occurrences as a CSV file |
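The tokenizer's enclitic handling can be illustrated with a minimal sketch. This is not the server's actual implementation; the regex, the exception list, and the output convention for the detached enclitic are all illustrative assumptions:

```python
import re

# Words ending in -que that are NOT enclitic compounds
# (illustrative and deliberately incomplete).
QUE_EXCEPTIONS = {"quoque", "neque", "atque", "itaque", "quisque"}

def tokenize(text: str) -> list[str]:
    """Split text into word/punctuation tokens and detach the -que enclitic."""
    tokens = []
    for tok in re.findall(r"\w+|[^\w\s]", text):
        low = tok.lower()
        if low.endswith("que") and low not in QUE_EXCEPTIONS and len(tok) > 3:
            tokens.extend([tok[:-3], "-que"])  # detach the enclitic
        else:
            tokens.append(tok)
    return tokens

print(tokenize("Senatus populusque Romanus."))
# → ['Senatus', 'populus', '-que', 'Romanus', '.']
```

Real Latin tokenizers also handle -ne and -ve and use curated exception lists, which this sketch omits.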
The parser tool performs morphological analysis and preprocessing. It calls the UDPipe API with the model:
latin-evalatin24-240520
The model was evaluated in the EvaLatin 2024 campaign and trained on Latin dependency treebanks.
EvaLatin 2024 overview: https://aclanthology.org/2024.lt4hala-1.21/
UDPipe model repository: https://github.com/ufal/evalatin2024-latinpipe
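UDPipe returns its analyses in the standard CoNLL-U format (ten tab-separated columns per token line). A minimal sketch of reading such output, with a small hand-written sample analysis for illustration:

```python
from dataclasses import dataclass

@dataclass
class Token:
    """One token line of a CoNLL-U document (the format's ten standard columns)."""
    id: str
    form: str
    lemma: str
    upos: str
    xpos: str
    feats: str
    head: str
    deprel: str
    deps: str
    misc: str

def parse_conllu(text: str) -> list[Token]:
    """Parse token lines, skipping comment lines and blank lines."""
    tokens = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        tokens.append(Token(*line.split("\t")))
    return tokens

sample = (
    "# text = Gallia est\n"
    "1\tGallia\tGallia\tPROPN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\test\tsum\tAUX\t_\t_\t0\troot\t_\t_\n"
)
toks = parse_conllu(sample)
print([(t.form, t.lemma, t.upos) for t in toks])
# → [('Gallia', 'Gallia', 'PROPN'), ('est', 'sum', 'AUX')]
```

The column names follow the CoNLL-U specification; a full reader would also handle multiword-token ranges (lines like `1-2`), which this sketch ignores.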
The parser tool also prepares the linguistic input required by the reported speech detection model.
The detect_reported_speech_from_text tool performs token-level reported speech prediction. It takes the output of the parsing/preparation stage as input and returns token-level predictions. The model is available at:
Hugging Face model repository: https://huggingface.co/agudei/latin-reported-speech-laberta
Paper describing the experiment: https://aclanthology.org/2026.latechclfl-1.24/
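Token-level predictions can be turned into readable spans by grouping consecutive positively-labeled tokens. This is a sketch assuming a simple binary inside/outside labeling; the model's actual label set and these example labels are assumptions, not its real output:

```python
def spans_from_labels(tokens: list[str], labels: list[int]) -> list[list[str]]:
    """Group consecutive tokens labeled 1 (reported speech) into spans."""
    spans, current = [], []
    for tok, lab in zip(tokens, labels):
        if lab == 1:
            current.append(tok)
        elif current:
            spans.append(current)
            current = []
    if current:
        spans.append(current)
    return spans

# Hypothetical labeling of the example sentence from the usage section.
tokens = ["Non", "potui", ",", "inquit", ",",
          "sustinere", "illud", "durum", "spectaculum", "."]
labels = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]
print(spans_from_labels(tokens, labels))
# → [['Non', 'potui'], ['sustinere', 'illud', 'durum', 'spectaculum']]
```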
The get_lila_lemma_info tool provides simplified access to the LiLa Knowledge Base, letting users interact with LiLa without manually writing SPARQL queries.
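The kind of SPARQL query the tool hides from the user might look like the sketch below. The endpoint URL, prefixes, and property names here are illustrative assumptions based on LiLa's use of the Ontolex model, not the tool's actual query:

```python
# Assumed endpoint; the real tool may query a different URL.
LILA_ENDPOINT = "https://lila-erc.eu/sparql/lila_knowledge_base/sparql"

def lemma_query(written_rep: str) -> str:
    """Build a SPARQL query looking up lemmas by written representation.

    The lila:hasPOS property is a hypothetical placeholder.
    """
    return f"""
PREFIX lila: <http://lila-erc.eu/ontologies/lila/>
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
SELECT ?lemma ?pos WHERE {{
  ?lemma ontolex:writtenRep "{written_rep}" ;
         lila:hasPOS ?pos .
}}
"""

q = lemma_query("probabilis")
print(q)
```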
The get_lila_lemma_tokens_dataframe tool retrieves corpus attestations linked to a Latin lemma in the LiLa Knowledge Base.
This enables corpus-based lexical exploration and quantitative analysis of lemma attestations across Latin works.
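The per-work counting can be sketched as a simple aggregation over (form, work) rows. The rows and column layout here are hypothetical, not the tool's real dataframe schema:

```python
from collections import Counter

# Hypothetical attestation rows as (token form, work title) pairs.
rows = [
    ("probabilis", "De Inventione"),
    ("probabile", "De Inventione"),
    ("probabilis", "Institutio Oratoria"),
]

def count_per_work(rows: list[tuple[str, str]]) -> dict[str, int]:
    """Count how many token attestations each work contributes."""
    return dict(Counter(work for _form, work in rows))

print(count_per_work(rows))
# → {'De Inventione': 2, 'Institutio Oratoria': 1}
```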
The export_lila_lemma_tokens_csv tool exports LiLa corpus attestation results as a CSV file for downstream analysis.
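The export step amounts to serializing attestation rows with a header line. A minimal sketch using the standard csv module; the column names here are hypothetical, and the real tool's CSV schema may differ:

```python
import csv
import io

# Hypothetical attestation rows; the real tool's column set may differ.
rows = [
    {"token": "probabilis", "lemma": "probabilis", "work": "De Inventione"},
    {"token": "probabile", "lemma": "probabilis", "work": "De Inventione"},
]

def to_csv(rows: list[dict]) -> str:
    """Serialize attestation rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

print(to_csv(rows))
```

Writing to an in-memory buffer keeps the sketch testable; the tool presumably writes to a file path instead.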
Install dependencies with:

```shell
uv sync
```

Run the server with:

```shell
uv run mcp-latin
```

or, with verbose logging:

```shell
uv run python -m mcp_latin -vv
```
The MCP server will run at `http://localhost:8001/mcp`.
You can test the server locally with:

```shell
npx @modelcontextprotocol/inspector
```

Then connect Inspector to `http://localhost:8001/mcp`.
Use the MCP tool tokenize_latin_text on:
"Senatus populusque Romanus."
Use the MCP tool parser on:
"Non potui, inquit, sustinere illud durum spectaculum."
Use the Latin MCP tools only.
1. Parse:
"HISPO ROMANIUS alio colore dixit illam non amore adulescentis sed odio patris sui secutam"
2. Detect reported speech.
Use the MCP tool get_lila_lemma_info on the lemma "probabilis".
Use the MCP tool get_lila_lemma_tokens_dataframe on the lemma "probabilis" and export the results with export_lila_lemma_tokens_csv.
A reproducible VS Code devcontainer is included in .devcontainer/. See .devcontainer/README.md for details.
Run in your terminal:

```shell
claude mcp add mcp-latin-tools-server -- npx
```