ContextCrumb compresses long text, local files, and MCP catalog descriptions into denser context for LLM agents. It helps agents load more useful information into the context window and reduce token usage without turning the input into a summary.
Shake the crumbs out of bloated context.
LLM context gets messy fast: notes, logs, issue threads, docs, research dumps, and tool descriptions all pile up until the useful signal is buried under filler.
ContextCrumb is a token-level compressor for LLM and agent workflows. It looks at text word by word and removes low-signal tokens while keeping the surviving text in the original order.
That is the idea behind the name: the context is still there, but the loose crumbs are shaken off before they reach your model. Less bloat in the prompt. More room for the parts that matter. Less wasted usage when Codex, Claude Code, or another agent processes long files repeatedly.
The hosted playground needs no install. Paste text, compare the kept context, and see what gets shaken off.
ContextCrumb is not a summarizer. It does not rewrite your document into a new explanation. It keeps the source sequence and deletes expendable words. This example uses target_keep_ratio=0.72.
Original
Agents spend context on notes, logs, tickets, docs, and tool descriptions. Those files contain useful facts, but they also carry filler phrases and repeated wording. ContextCrumb compresses the text before it reaches the model. It keeps the original order, removes low-value tokens, and leaves a shorter version with the names, actions, constraints, and sequence still intact.
Compressed
Agents spend context notes, logs, tickets, docs tool descriptions. Those files useful facts, carry filler phrases repeated wording. ContextCrumb compresses text before reaches model. keeps original order, removes low-value tokens, leaves shorter version names, actions, constraints sequence intact.
Same order. Less padding. More room for the next file. On prose-heavy agent inputs, ContextCrumb often saves around 30-70% of the context depending on how aggressively you compress and how much filler is in the source.
| Metric | Original | Compressed | Saved |
|---|---|---|---|
| Model tokens | 72 | 52 | 20 tokens |
| Token budget | 100% | 72% | 28% fewer input tokens |
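The keep-the-order, delete-the-filler behaviour can be illustrated with a toy sketch. This is not ContextCrumb's model: the fixed `FILLER` word set and the `toy_compress` and `saved_percent` helpers are hypothetical stand-ins, used only to show how a keep ratio bounds deletions while survivors stay in source order.

```python
import math

# Hypothetical filler list for illustration only. The real compressor scores
# tokens with a model; this toy version just matches a fixed word set.
FILLER = {"a", "an", "the", "and", "but", "also", "it", "on", "of", "to", "with"}


def toy_compress(text: str, target_keep_ratio: float = 0.72) -> str:
    """Delete filler words left to right, never dropping more than the ratio allows."""
    words = text.split()
    keep_count = math.ceil(len(words) * target_keep_ratio)
    drop_budget = len(words) - keep_count
    kept = []
    for word in words:
        is_filler = word.strip(".,;:").lower() in FILLER
        if is_filler and drop_budget > 0:
            drop_budget -= 1  # drop this word and spend one unit of budget
            continue
        kept.append(word)  # survivors stay in their original order
    return " ".join(kept)


def saved_percent(original_tokens: int, compressed_tokens: int) -> float:
    """Share of input tokens removed, as a percentage."""
    return 100 * (original_tokens - compressed_tokens) / original_tokens


sample = "The agent reads the notes and the logs on the server"
print(toy_compress(sample))
print(round(saved_percent(72, 52)))  # token counts from the table above
```

Nothing is reordered or rewritten here. The output is always a subsequence of the input, which is the property that separates this kind of compression from summarization.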
What that feels like over a month
Assume your agent reads notes, logs, tickets, research dumps, or docs of about 8k tokens each before answering, and that compression saves 30-70% of those tokens over a 30-day month. This helps with API token bills, but also with subscription-based coding agents, where heavy context reads can burn through usage faster.
| Workflow | Files read / day | Context saved / month | API cost avoided at $5 / 1M input tokens | Subscription usage feel |
|---|---|---|---|---|
| Solo agent helper | 20 | ~1.4M-3.4M tokens | ~$7-$17 | Fewer bulky reads in Codex or Claude Code |
| Busy project workspace | 200 | ~14M-34M tokens | ~$72-$168 | More room for actual reasoning and edits |
| Agent-heavy team or eval loop | 2,000 | ~144M-336M tokens | ~$720-$1,680 | Less usage spent processing padded files |
The bill is usually not the biggest win. The bigger win is keeping long-running agents from filling their context, turns, and subscription usage with words they never needed to carry.
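The monthly figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes a 30-day month; the constant and function names are illustrative and not part of ContextCrumb.

```python
TOKENS_PER_FILE = 8_000        # from the text: 8k-token files
DAYS_PER_MONTH = 30            # assumption: the table's figures match a 30-day month
SAVINGS_PERCENT = (30, 70)     # from the text: roughly 30-70% saved
USD_PER_MILLION_TOKENS = 5.0   # from the table header


def monthly_savings(files_per_day: int) -> dict:
    """Return the low and high estimates of tokens saved and API cost avoided."""
    tokens_read = files_per_day * TOKENS_PER_FILE * DAYS_PER_MONTH
    tokens_saved = tuple(tokens_read * pct // 100 for pct in SAVINGS_PERCENT)
    usd_avoided = tuple(t / 1_000_000 * USD_PER_MILLION_TOKENS for t in tokens_saved)
    return {"tokens_saved": tokens_saved, "usd_avoided": usd_avoided}


for files_per_day in (20, 200, 2_000):
    print(files_per_day, monthly_savings(files_per_day))
```

Swap in your own file size, price, and savings range to get an estimate for your workload.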
Teach your agent a small habit: compress the bloat before it enters context. ContextCrumb is meant to sit in the background as a skill, stepping in whenever a long note, doc, issue thread, research dump, or log would otherwise flood the context window and eat into your Codex or Claude Code usage.
npx skills add Yuchen20/Context-Crumb
The skill tells your agent when to compress text, how to preserve the useful sequence, and when exact raw text is required for things like code, configs, or direct quotes.
Then ask your agent, for example: "Use ContextCrumb to compress this long project note before you work from it."
| Use case | What changes |
|---|---|
| Agent file loading | Compress long notes, docs, research dumps, and logs before they hit the context window. |
| Prompt pipelines | Shrink natural-language inputs without hand-writing summarizers. |
| MCP catalogs | Compress verbose tool/resource descriptions while preserving names and schemas. |
| Local workflows | Run ONNX inference by default, with cached model files after first download. |
| Subscription-aware agents | Spend less Codex or Claude Code usage on repeatedly loading padded prose. |
| Inspection and tuning | Use diff and inspect to see what was kept, deleted, and saved. |
Best fit: docs, notes, issue threads, logs, research context, and other natural-language files. For source code where exact syntax matters, prefer raw file loading or use a conservative keep ratio.
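One way to apply that guidance in a script is to route files by type before loading them. The sketch below is a hypothetical helper, not part of ContextCrumb: the suffix list and the `loading_mode` name are assumptions chosen for illustration.

```python
from pathlib import Path

# Hypothetical suffix list for illustration; adjust it to your own workspace.
PROSE_SUFFIXES = {".md", ".txt", ".rst", ".log"}


def loading_mode(path: str) -> str:
    """Return "compress" for prose-like files and "raw" for everything else."""
    return "compress" if Path(path).suffix.lower() in PROSE_SUFFIXES else "raw"


for name in ("notes.txt", "docs/design.md", "src/main.py", "config.yaml"):
    print(name, loading_mode(name))
```

Defaulting unknown file types to raw loading errs on the side of exact text, which matters for code, configs, and direct quotes.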
pip install contextcrumb
Optional extras:
pip install "contextcrumb[mcp]"
pip install "contextcrumb[serve]"
pip install "contextcrumb[torch]"
ContextCrumb uses the ONNX backend by default, so normal users do not need PyTorch or Transformers installed. Model files are cached locally after the first download.
The main agent-friendly command is load:
contextcrumb load notes.txt
It prints only compressed text by default, which makes it easy for agents, hooks, shell scripts, and prompt pipelines to capture stdout and move on. For subscription tools like Codex or Claude Code, that means fewer bulky file reads before the agent gets to the useful part.
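Because the command writes compressed text to stdout, a pipeline can capture it with a thin wrapper. The following is a minimal sketch that uses only the `contextcrumb load <file>` form shown above; the `load_compressed` name and its `run` parameter (a seam for testing) are assumptions, and it also assumes the CLI exits with a non-zero status on failure.

```python
import subprocess
from typing import Callable


def load_compressed(path: str, run: Callable = subprocess.run) -> str:
    """Run `contextcrumb load <path>` and return whatever it prints to stdout."""
    result = run(
        ["contextcrumb", "load", path],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    print(load_compressed("notes.txt"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.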
Useful commands:
contextcrumb load notes.txt --json
contextcrumb diff notes.txt
contextcrumb inspect notes.txt
contextcrumb stats
diff marks deleted tokens like this:
kept words [-deleted words-] kept words
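If you want to post-process that output, the markers are easy to split apart. The sketch below assumes the format is exactly as shown, with `[-` and `-]` delimiting each deleted span and no nesting; `split_diff` is a hypothetical helper, not a ContextCrumb API.

```python
import re

# Matches one deleted span, e.g. "[-deleted words-]" (non-greedy, may span lines).
DELETED = re.compile(r"\[-(.*?)-\]", re.DOTALL)


def split_diff(diff_text: str) -> tuple[str, list[str]]:
    """Return the kept text and the list of deleted spans from diff output."""
    deleted = DELETED.findall(diff_text)
    # Replace each deleted span with a space, then normalize whitespace.
    kept = " ".join(DELETED.sub(" ", diff_text).split())
    return kept, deleted


kept, deleted = split_diff("kept words [-deleted words-] kept words")
print(kept)
print(deleted)
```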
ContextCrumb includes an optional MCP stdio adapter for agent clients that can run Python tools through uvx.
pip install "contextcrumb[mcp]"
Published-package MCP config:
{
  "mcpServers": {
    "contextcrumb": {
      "command": "uvx",
      "args": [
        "--from",
        "contextcrumb[mcp]",
        "contextcrumb-mcp"
      ]
    }
  }
}
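If your client already has an MCP config with other servers, the entry can be merged in rather than pasted over the file. This is a sketch under assumptions: where the config file lives depends on your client, so the example works on an in-memory dict, and `add_contextcrumb` is a hypothetical helper name.

```python
import json

# The server entry from the config above.
CONTEXTCRUMB_SERVER = {
    "command": "uvx",
    "args": ["--from", "contextcrumb[mcp]", "contextcrumb-mcp"],
}


def add_contextcrumb(config: dict) -> dict:
    """Return a copy of config with the contextcrumb server added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers["contextcrumb"] = CONTEXTCRUMB_SERVER
    merged["mcpServers"] = servers
    return merged


existing = {"mcpServers": {"other-server": {"command": "other-command"}}}
print(json.dumps(add_contextcrumb(existing), indent=2))
```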
The MCP server exposes:
compress_text
compress_file
ContextCrumb also ships contextcrumb-shrink, an MCP proxy that compresses verbose catalog descriptions before an agent sees them while forwarding tool names, schemas, calls, results, and resource contents unchanged. This is useful when an agent client repeatedly spends context and subscription usage just looking at long tool descriptions.
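The core of that proxy behaviour can be sketched as a transform over a tool catalog: only the `description` field is touched, and everything else passes through. This is an illustration of the idea, not the actual `contextcrumb-shrink` code. The `shrink_catalog` name is hypothetical, the `compress` callable is a stand-in for the real compressor, and the `name`, `description`, and `inputSchema` keys follow the MCP tool listing shape.

```python
from typing import Callable


def shrink_catalog(tools: list[dict], compress: Callable[[str], str]) -> list[dict]:
    """Compress each tool description; leave names and schemas unchanged."""
    shrunk = []
    for tool in tools:
        entry = dict(tool)  # shallow copy so the input catalog is not mutated
        if isinstance(entry.get("description"), str):
            entry["description"] = compress(entry["description"])
        shrunk.append(entry)
    return shrunk


catalog = [
    {
        "name": "search_index",
        "description": "This tool can be used to search the index.",
        "inputSchema": {"type": "object", "properties": {"query": {"type": "string"}}},
    }
]


# Stand-in compressor for the demo: strips one boilerplate phrase.
def strip_boilerplate(text: str) -> str:
    return text.replace("This tool can be used to ", "")


print(shrink_catalog(catalog, strip_boilerplate))
```

Leaving names and schemas byte-identical matters because the agent's tool calls are validated against them.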
Model weights and a hosted demo are public on Hugging Face.
Planned for later:
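For local development (Windows paths shown), install the package in editable mode with the dev and mcp extras, then run the tests and build:
uv pip install --python .\.venv\Scripts\python.exe -e ".[dev,mcp]"
.\.venv\Scripts\python.exe -m pytest
.\.venv\Scripts\python.exe -m build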
Release notes are tracked in CHANGELOG.md.
MIT. See LICENSE.