loading…
Search for a command to run...
loading…
MCP server for TNL (Typed Natural Language): per-feature English contracts for AI coding agents. 6 tools — getimpactedtnls, retrievetnl, trace, proposetnldiff,
MCP server for TNL (Typed Natural Language): per-feature English contracts for AI coding agents. 6 tools — getimpactedtnls, retrievetnl, trace, proposetnldiff, approvetnldiff, verify — let agents look up relevant contracts, propose contract edits, and verify implementations against them. Drop-in via npx typed-nl init for Claude Code, Codex, Gemini.
AI coding agents make design decisions silently, drift from what was planned, and lose context at session end. TNL (Typed Natural Language) is the fix: a per-feature English contract with a fixed schema — proposed by the agent, approved by you, implemented against, saved on disk, and read by every future session. If you've used plan mode in Claude Code, this is the same discipline made compact, persistent, and machine-checkable.
The schema is seven fields:
id / title — what this feature isscope — feature or repo-widepaths — which files the change is allowed to touchsurfaces — named external surfaces (CLI commands, routes, MCP tools)behaviors — numbered MUST / SHOULD / MAY clauses; the contract propernon-goals — what's explicitly out of scoperationale — the why, for future readersYou approve this once, before any code runs. The agent implements against each MUST clause and self-attests at the end — for every MUST, naming the file or test that satisfies it.
No new tool, no new agent, no new workflow. TNL slots into whatever agent you already use. Claude Code, Codex, Gemini, and Cursor get first-class tnl init with stanza + hooks + MCP. Any agent that reads a Markdown instruction file adopts with a two-step manual copy — the minimum product is a stanza in your instruction file plus a tnl/ directory. tnl verify, the PreToolUse hook that re-surfaces the contract mid-edit, and the MCP server are optional layers on top.
id: user-rate-limiter
title: Per-user API rate limiter
scope: feature
owners: [@jana]
paths: [src/middleware/rate-limit.ts]
surfaces: [POST /api/*]
intent:
Cap requests per user at 60/min; exceeding users get 429 with
Retry-After. Prevents abuse on the public write endpoints.
behaviors:
- The middleware MUST track request counts per user in a sliding window of 60 seconds.
- When a user exceeds 60 requests in the window, the middleware MUST return HTTP 429.
- [test: tests/rate_limit.test.ts::returns_429_on_exceeded] The 429 response MUST include a `Retry-After` header.
- [semantic] The middleware SHOULD log the user ID and request path on every 429, without leaking the request body.
non-goals:
- Per-IP rate limiting — authenticated surface only.
- Distributed rate state — in-memory per instance for now.
Three zones: machine (id, paths, surfaces), contract (behaviors with RFC 2119 keywords and [semantic] / [test: …] prefixes), human (intent, non-goals, rationale).
Every feature request — new or modification — runs through 7 steps:
tnl/ for files whose paths: overlap the request. If one exists, the output is an edit; if not, a new TNL.tnl/<slug>.tnl.paths: bounds the change.For follow-up work, step 1 returns "edit the existing TNL" — and the next session reads the already-approved contract as context, rather than rediscovering design decisions from the code.
tnl verify checks paths exist and test bindings resolve. The PreToolUse hook re-injects the contract on every Edit/Write so the agent can't drift silently.We ran a controlled A/B. Baseline condition: four working principles — think before coding, simplicity first, surgical edits, goal-driven — written as prose in the project's CLAUDE.md / AGENTS.md. TNL condition: the same four plus two more (match existing conventions; exhaustive end-of-task self-attestation) encoded as tnl/workflow.tnl, plus a per-feature TNL. Same agent, same project context; only the contract step differs.
Headline task: add event-driven triggers to a 16KLOC Python codebase. 35 behavioural scenarios covering config, cycle prevention, cron coexistence, CLI surfaces.
| Agent | Run | TNL | Baseline | Gap |
|---|---|---|---|---|
| Claude Code Opus 4.7 | 1 | 35/35 | 29/35 | +6 |
| Claude Code Opus 4.7 | 2 | 31/35 | 27/35 | +4 |
| Claude Code Opus 4.7 | 3 | 30/35 | 25/35 | +5 |
| Codex GPT-5.4 high | 1 | 32/35 | 26/35 | +6 |
| Codex GPT-5.4 high | 2 | 31/35 | 26/35 | +5 |
| Codex GPT-5.5 | 1 | 32/35 | 29/35 | +3 |
| Codex GPT-5.5 | 2 | — ¹ | 30/35 | — |
¹ Second GPT-5.5 baseline has no paired TNL run; averaged baseline across the two GPT-5.5 runs is 29.5/35.
TNL was ahead of baseline in every paired cell across models. Gap ranges +3 to +6 scenarios.
Model scaling. The gap narrows as the underlying agent gets stronger: baseline on the same task jumped from 26/35 (Codex GPT-5.4, avg of 2 runs) to 29.5/35 (Codex GPT-5.5, avg of 2 runs) with no change to the codebase or prompt. The contract step still helps, but it helps less when the agent already writes disciplined code on its own.
Other signals:
Caveats up front. Small sample (2–3 per cell), LLM sessions are noisy, and we built the tool. Every script, prompt, raw JSON, and session transcript is committed so you can rerun anything.
We built this tool using its own workflow — the minimal form (CLAUDE.md stanza + tnl/, no hooks or MCP). The baseline rules live in tnl/workflow.tnl and every feature has its own TNL in tnl/. In practice: faster turnaround, few rework cycles, each next change edits the spec instead of re-analysing code. One project's worth of evidence, but the meta-test isn't nothing.
Two of Andrej Karpathy's observations framed the problem we're solving:
TNL is our answer: a concrete contract format with a fixed schema, a review workflow, and enforcement plumbing that turns both observations into a daily practice.
# One-off, no install
npx -y typed-nl <command>
# Or install globally
npm install -g typed-nl
tnl <command>
Requires Node 20 or later.
If your agent reads a markdown instruction file but isn't in tnl init's native list, the minimum adoption is two copies:
tnl/workflow.tnl in your project.<!-- tnl:workflow-stanza --> that tnl init --agent claude would emit into CLAUDE.md) into your agent's instruction file.That's it — the workflow fires from the stanza, contracts live in tnl/. The hook, MCP server, and CI action are all optional and agent-specific; they can be added later if your stack supports them.
Begin with just the baseline TNL scaffold — no MCP, no hooks, no CI. The agent follows the workflow from the appended CLAUDE.md stanza alone.
cd /path/to/your/repo
npx -y typed-nl init --agent claude --minimal
This writes only:
tnl/ — where your TNL contracts will livetnl/workflow.tnl — baseline session principlesCLAUDE.md — TNL workflow stanza appended (or file created if missing)For Codex: --agent codex (writes AGENTS.md). For Gemini: --agent gemini (writes GEMINI.md). For Cursor: --agent cursor (writes AGENTS.md + .cursor/mcp.json).
Start a Claude Code (or Codex / Gemini / Cursor) session and ask for any feature. The agent, guided by the workflow stanza, will:
tnl/<slug>.tnl.paths:).npx -y typed-nl verify
Runs tier 1 (paths and dependencies exist) and tier 2 (test-binding integrity — each [test:] annotation names a test that still exists). Exits 2 on any failure; CI uses this gate.
You can always re-run tnl init to layer on more. Each step is independent and safe to re-run (idempotent):
# Full install: MCP server + PreToolUse hook + CI workflow
npx -y typed-nl init --agent claude
# Everything except CI
npx -y typed-nl init --agent claude --no-ci
# Everything except the PreToolUse hook
npx -y typed-nl init --agent claude --no-hook
# Claude only: add the /tnl-feature slash command
npx -y typed-nl init --agent claude --with-skill
What each capability gives you:
| Capability | Added by default (omit --minimal) |
What it does |
|---|---|---|
| MCP server | yes | Registers tnl in .mcp.json / .codex/config.toml / .gemini/settings.json / .cursor/mcp.json. Agent gains 6 tools: retrieve, propose, approve, verify, impacted, trace. |
| PreToolUse hook | yes (Claude) | .claude/settings.json hook auto-injects impacted TNLs as context on every Edit / Write. |
| CI workflow | yes | .github/workflows/tnl-verify.yml runs tnl verify on push + PR. |
/tnl-feature skill |
no (opt-in via --with-skill) |
Claude Code slash command for explicit invocation. |
tnl init [flags] # scaffold TNL in a project
tnl verify [paths...] # check structural + test-binding integrity
tnl resolve [id...] # regenerate sidecar meta (hashes, classification)
tnl impacted <paths...> # list TNLs whose paths: overlap with given code paths
tnl diff <file> # show clause-level diff of a TNL vs HEAD
tnl test-plan <id> # list test-backed clauses for a unit
tnl init flags| Flag | Default | Behavior |
|---|---|---|
--agent claude|codex|gemini|cursor |
auto-detect | Target one agent; overrides detection |
--minimal |
off | Scaffold only tnl/ + instruction-file stanza; skip everything below |
--no-ci |
off | Skip .github/workflows/tnl-verify.yml |
--no-mcp |
off | Skip MCP server registration |
--no-hook |
off | Skip Claude PreToolUse hook |
--with-skill |
off | (Claude only) Install /tnl-feature slash command |
--local-install |
off | (Dev-only) Rewrite configs to absolute local node dist/... paths |
Without --agent, init auto-detects targets (.claude/ → Claude; AGENTS.md → Codex; GEMINI.md → Gemini; .cursor/ → Cursor). Re-running tnl init is safe — existing files are detected and upgraded when the bundled template evolves.
tnl init auto-registers the TNL MCP server with supported agents. Once registered, the agent gains six tools:
| Tool | Purpose |
|---|---|
get_impacted_tnls |
Return TNLs whose paths: overlap with given code paths |
retrieve_tnl |
Return the verbatim contents of one or more TNLs by id |
propose_tnl_diff |
Validate and stage a batch of create/update diffs |
approve_tnl_diff |
Commit a staged diff to disk, regenerate sidecars |
verify |
Run the verifier over given paths, return structured JSON |
trace |
Record / retrieve session-scoped agent-initiated events |
Running MCP manually:
npx -y -p typed-nl tnl-mcp-server # stdio JSON-RPC server
Machine zone fields (see tnl/workflow.tnl for a working example):
| Field | Required | Notes |
|---|---|---|
id |
yes | Kebab-case slug matching the filename |
title |
yes | Short human label |
scope |
yes | repo-wide or feature |
owners |
yes | List of @handles |
paths |
scope=feature only | Files this TNL governs |
surfaces |
optional | Named external surfaces (CLI commands, routes, tools) |
dependencies |
optional | Other TNL ids this couples with |
intent |
yes | One-paragraph plain English |
behaviors |
yes | Numbered clauses using MUST / SHOULD / MAY |
non-goals |
yes | Explicit scope fences |
rationale |
optional | Tradeoffs, gotchas, why-behind-choices |
RFC 2119 keywords:
Clause prefixes:
[semantic] — judgment needed to verify (not structural)[test: <file>::<name>] — binds the clause to a named test; tnl verify checks the test still existsgit clone https://github.com/janaraj/tnl.git
cd tnl
npm install
npm run build
npm test # parser, resolver, verifier, CLI, MCP suites
npm run typecheck
Every new feature follows the TNL workflow this tool enforces. See CLAUDE.md for session guidance and tnl/ for the contracts governing this repository's own development.
MIT. See LICENSE.
Добавь это в claude_desktop_config.json и перезапусти Claude Desktop.
{
"mcpServers": {
"janaraj-tnl": {
"command": "npx",
"args": []
}
}
}