Kimiflare — a terminal coding agent powered by Kimi-K2.6 on Cloudflare Workers AI. Image understanding, plan mode, MCP server integration, 262k context.
A terminal coding agent powered by Kimi-K2.6 on Cloudflare Workers AI.
Moonshot's 1T-parameter open-source model, running directly on your Cloudflare account.
💸 Heads up — this runs on your Cloudflare account. We recommend setting a budget cap on Workers AI and checking your Cloudflare billing regularly while using KimiFlare.
🚀 Stay up to date. Newer versions are significantly more token-efficient and cheaper to run. Run `/update` inside KimiFlare or `npm update -g kimiflare` to get the latest release.
npm install -g kimiflare
kimiflare
On first run, an interactive onboarding wizard asks for your Cloudflare Account ID and API Token. That's it — you're ready.
Or run without installing:
npx kimiflare
Requires Node.js ≥ 20.
| Feature | What it does |
|---|---|
| Plan / Edit / Auto modes | plan blocks all mutating tools for safe research. edit (default) prompts per mutating call. auto approves everything for trusted tasks. |
| Live task panel | For multi-step work, the agent publishes a task list with progress icons (■ active, ☐ pending, ✓ done), elapsed time, and token deltas. |
| 14 terminal themes | dark, light, high-contrast, dracula, nord, one-dark, monokai, solarized-dark/light, tokyo-night, gruvbox-dark/light, catppuccin-mocha, rose-pine. Interactive picker with live preview (Ctrl+T). |
| Paste collapse | Large pastes (≥200 chars or ≥2 newlines) collapse to [pasted N lines #id]. Full content still goes to the model — scrollback stays clean. |
| Type-ahead queue | Type your next prompt while the model is still working. Queued prompts show as ⏳ … and fire in order. Ctrl-C aborts current + clears queue. |
| Auto-compaction | At ~80% context usage, kimiflare nudges you to run /compact. It summarizes older turns into a dense summary, keeping the last 4 turns intact. |
| Streaming reasoning | Toggle the model's chain-of-thought with /reasoning or Ctrl-R. See how it thinks in real time. |
| Image understanding | Drop image paths (PNG, JPG, WebP, GIF, BMP up to 5 MB) into any prompt. The model sees them inline — perfect for UI reviews, diagrams, and screenshots. |
| Live cost tracking | Status bar shows real-time cost based on Cloudflare pricing: $0.95/M input, $0.16/M cached, $4.00/M output. |
| Optional AI Gateway | Route Workers AI traffic through your own Cloudflare AI Gateway for request logs, cache status, and analytics while keeping your API token local. |
| Session persistence | Every turn is auto-saved. /resume lists past sessions (with message counts) in a paginated picker. |
| Smart permissions | Bash session-allow is keyed by the first token (e.g., allow all git commands). Write/edit show a unified diff before you approve. |
| Project context (`/init`) | Scans your repo and writes a concise KIMI.md — build commands, layout, conventions. Auto-loaded on every launch. |
| MCP server integration | Plug in external tools via the Model Context Protocol — local stdio servers or remote SSE endpoints. GitHub, Sentry, docs search, databases, etc. |
| Co-author auto-append | Detects git commit commands and auto-injects Co-authored-by: kimiflare <[email protected]>. |
| Local structured memory | SQLite + embeddings cross-session memory. Extracts facts, instructions, and preferences at compaction time; recalls them via hybrid search (FTS5 + vector + exact) in future sessions. Team-shareable via .kimiflare/memory.db. |
| Resilient transport | Retries Cloudflare capacity errors (code 3040) and 5xx with exponential backoff up to 5 attempts. |
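The retry behavior in the last row can be sketched as follows. This is a minimal illustration, not kimiflare's actual source: the base delay and the error object's `status`/`cfCode` fields are assumptions.

```javascript
const MAX_ATTEMPTS = 5;

// Retry on Cloudflare capacity errors (code 3040) and any 5xx status.
function isRetryable(status, cfCode) {
  return status >= 500 || cfCode === 3040;
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ...
function backoffDelayMs(attempt, baseMs = 500) {
  return baseMs * 2 ** (attempt - 1);
}

async function withRetries(fn, baseMs = 500) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= MAX_ATTEMPTS || !isRetryable(err.status ?? 0, err.cfCode ?? 0)) throw err;
      await new Promise((r) => setTimeout(r, backoffDelayMs(attempt, baseMs)));
    }
  }
}
```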
Get credentials from Cloudflare:
Then either export them in each shell:
export CLOUDFLARE_ACCOUNT_ID=...
export CLOUDFLARE_API_TOKEN=...
# Optional: route through a Cloudflare AI Gateway you own
export KIMIFLARE_AI_GATEWAY_ID=...
# Optional: enable local structured memory
export KIMIFLARE_MEMORY_ENABLED=1
export KIMIFLARE_MEMORY_DB_PATH=.kimiflare/memory.db
export KIMIFLARE_MEMORY_MAX_AGE_DAYS=90
export KIMIFLARE_MEMORY_MAX_ENTRIES=1000
or save them once (chmod 600 automatically):
mkdir -p ~/.config/kimiflare
cat > ~/.config/kimiflare/config.json <<'EOF'
{
"accountId": "YOUR_ACCOUNT_ID",
"apiToken": "YOUR_API_TOKEN",
"model": "@cf/moonshotai/kimi-k2.6",
"aiGatewayId": "YOUR_GATEWAY_NAME"
}
EOF
chmod 600 ~/.config/kimiflare/config.json
kimiflare talks directly to Workers AI unless aiGatewayId is configured. When set, chat completions are sent to Cloudflare's native Workers AI Gateway endpoint:
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/{model_id}
Create a gateway in the Cloudflare dashboard under AI > AI Gateway, then set aiGatewayId in ~/.config/kimiflare/config.json or export KIMIFLARE_AI_GATEWAY_ID. The same Workers AI API token stays on your machine and is sent only to Cloudflare.
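As a sketch, the gateway URL from the template above can be assembled like this (the helper name is illustrative, not part of kimiflare):

```javascript
// Build the AI Gateway endpoint for a Workers AI model, per the template above.
function gatewayUrl(accountId, gatewayId, modelId) {
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/workers-ai/${modelId}`;
}

// e.g. gatewayUrl("abc123", "my-gateway", "@cf/moonshotai/kimi-k2.6")
```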
Optional per-request controls:
{
"aiGatewayCacheTtl": 3600,
"aiGatewaySkipCache": false,
"aiGatewayCollectLogPayload": false,
"aiGatewayMetadata": { "tool": "kimiflare" }
}
cf-aig-cache-status from AI Gateway is shown separately from Workers AI prompt-token caching (cached_tokens). If you enable gateway logs, kimiflare records metadata such as log id, cache hit/miss, tokens, duration, and status when Cloudflare returns it; prompt and response bodies are not stored by kimiflare.
kimiflare supports external tools via MCP. Add servers to your ~/.config/kimiflare/config.json:
{
"accountId": "YOUR_ACCOUNT_ID",
"apiToken": "YOUR_API_TOKEN",
"mcpServers": {
"github": {
"type": "local",
"command": ["npx", "-y", "@modelcontextprotocol/server-github"],
"env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_xxx" }
},
"fetch": {
"type": "local",
"command": ["uvx", "mcp-server-fetch"]
},
"my-remote": {
"type": "remote",
"url": "https://example.com/mcp",
"headers": { "Authorization": "Bearer token123" }
}
}
}
- `type`: `"local"` (stdio subprocess) or `"remote"` (SSE/HTTP endpoint)
- `command`: array with executable and args (local only)
- `url`: endpoint URL (remote only)
- `env`: environment variables for local servers
- `headers`: HTTP headers for remote servers
- `enabled`: set to `false` to skip a server

MCP tools appear prefixed as `mcp_<server>_<tool>` alongside built-in tools.
Commands:
- `/mcp list` — show connected servers and tool counts
- `/mcp reload` — disconnect and reconnect all configured servers

kimiflare can remember facts, instructions, and preferences across sessions using a local SQLite database with vector search.
How it works:
- At compaction time, extracted memories are embedded (default model `@cf/baai/bge-base-en-v1.5`) and stored in a local SQLite database.
- The database lives at `.kimiflare/memory.db` in your repo root (add it to `.gitignore`).

Enable:
export KIMIFLARE_MEMORY_ENABLED=1
Or in ~/.config/kimiflare/config.json:
{
"memoryEnabled": true,
"memoryDbPath": ".kimiflare/memory.db",
"memoryMaxAgeDays": 90,
"memoryMaxEntries": 1000,
"memoryEmbeddingModel": "@cf/baai/bge-base-en-v1.5"
}
Commands:
- `/memory` — show memory stats (total count, DB size, by category)
- `/memory search <query>` — manual hybrid search over stored memories
- `/memory clear` — wipe all memories for the current repo

Storage & cleanup: retention is controlled by `memoryMaxAgeDays` and `memoryMaxEntries` (see the config options above).
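Conceptually, hybrid recall merges three signals — exact match, keyword overlap, and vector similarity — into one ranking. A toy sketch (not kimiflare's actual implementation, which uses SQLite FTS5 and stored embeddings; the weights here are invented):

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Rank entries by a weighted blend of exact, keyword, and vector scores.
function hybridRank(query, queryVec, entries) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return entries
    .map((e) => {
      const text = e.text.toLowerCase();
      const exact = text.includes(query.toLowerCase()) ? 1 : 0;
      const keyword = terms.length ? terms.filter((t) => text.includes(t)).length / terms.length : 0;
      const vector = cosine(queryVec, e.embedding);
      return { ...e, score: exact + 0.5 * keyword + 0.8 * vector };
    })
    .sort((a, b) => b.score - a.score);
}
```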
kimiflare # launch TUI
kimiflare --model @cf/moonshotai/kimi-k2.6 # override model
kimiflare -p "summarize PLAN.md" # stream answer to stdout
kimiflare -p "..." --dangerously-allow-all # auto-approve mutating tools (for scripts)
kimiflare -p "..." --reasoning # include chain-of-thought in stderr
Reference image files directly in your prompt — the model sees them inline:
kimiflare
› fix the layout bug in this screenshot docs/bug.png
› convert this mockup design.png to Tailwind HTML
› explain this architecture diagram.png
Supported formats: PNG, JPG, JPEG, WebP, GIF, BMP (up to 5 MB each, 10 per message).
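The limits above can be expressed as a small validation helper. This is a hypothetical sketch — the function name and error messages are illustrative, not kimiflare's code:

```javascript
const IMAGE_EXTS = new Set(["png", "jpg", "jpeg", "webp", "gif", "bmp"]);
const MAX_BYTES = 5 * 1024 * 1024; // 5 MB per image
const MAX_PER_MESSAGE = 10;

// files: [{ path, bytes }] — throws on any violation, returns true otherwise.
function validateImages(files) {
  if (files.length > MAX_PER_MESSAGE) throw new Error(`at most ${MAX_PER_MESSAGE} images per message`);
  for (const f of files) {
    const ext = f.path.split(".").pop().toLowerCase();
    if (!IMAGE_EXTS.has(ext)) throw new Error(`unsupported format: ${f.path}`);
    if (f.bytes > MAX_BYTES) throw new Error(`too large (>5 MB): ${f.path}`);
  }
  return true;
}
```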
| Flag | Short | Description |
|---|---|---|
| `--print <prompt>` | `-p` | One-shot mode: send prompt, stream reply, exit |
| `--model <id>` | `-m` | Model ID (default: `@cf/moonshotai/kimi-k2.6`) |
| `--dangerously-allow-all` | — | Auto-approve every permission prompt (print mode only) |
| `--reasoning` | — | Stream chain-of-thought to stderr (print mode only) |
| `--version` | `-V` | Show version |
| `--help` | `-h` | Show help |
| Command | Effect |
|---|---|
| `/mode edit\|plan\|auto` | Switch mode. `edit` prompts for permission (default), `plan` is read-only research, `auto` auto-approves every tool call. |
| `/plan` `/auto` `/edit` | Shortcuts for the three modes. |
| `/thinking low\|medium\|high` | Reasoning effort. `low` = fastest, shallow; `medium` = balanced (default); `high` = deepest, slowest. Saved to config. |
| `/theme` | Interactive theme picker with live preview (Ctrl+T). Saved to config. |
| `/theme NAME` | Set theme by name directly. |
| `/resume` | Pick a past conversation to restore. |
| `/compact` | Summarize older turns to free context. Suggested automatically at ~80% full. Extracts memories if memory is enabled. |
| `/init` | Scan the repo and write a KIMI.md so future agents have project context. |
| `/memory` | Show memory stats (total count, DB size, by category). |
| `/memory search <query>` | Search stored memories manually. |
| `/memory clear` | Wipe all memories for the current repo. |
| `/mcp list` | List connected MCP servers and their tools. |
| `/mcp reload` | Disconnect and reconnect all configured MCP servers. |
| `/reasoning` | Toggle chain-of-thought display. |
| `/clear` | Reset the current conversation. |
| `/cost` | Show token usage for the current turn. |
| `/model` | Show current model. |
| `/update` | Check for updates manually. |
| `/logout` | Clear saved credentials. |
| `/help` | List all commands. |
| `/exit` | Quit. |
| Shortcut | Action |
|---|---|
| `Ctrl+C` | Interrupt current turn (press again to exit) |
| `Ctrl+R` | Toggle reasoning display |
| `Ctrl+O` | Toggle verbose tool output |
| `Ctrl+T` | Open theme picker |
| `Shift+Tab` | Cycle mode (edit → plan → auto) |
| `↑` / `↓` | Walk prompt history |
| Shortcut | Action |
|---|---|
| `⌥←` / `⌥→` | Jump word left/right |
| `⌘←` / `⌘→` | Jump to start / end of line |
| `⌥⌫` | Delete word backward |
| `⌘⌫` | Delete to start of line |
| `⌥⌦` | Delete word forward |
| `Ctrl+A` / `Ctrl+E` | Start / end of line |
| `Ctrl+W` / `Ctrl+U` / `Ctrl+K` | Delete word backward / to start / to end of line |
Mutating tools (`write`, `edit`, `bash`) pause for your approval.

Kimi-K2.6 always reasons, but you can cap the effort:
Set with /thinking medium (persists), or per-launch via KIMI_REASONING_EFFORT=high.
All tool calls show inline; mutating ones require per-call approval the first time, with an option to allow for the rest of the session.
| Tool | Permission | What it does |
|---|---|---|
| `read` | auto | Read a text file (≤ 2 MB) with optional line range. |
| `write` | prompt | Create or overwrite a file. Shows a unified diff before you approve. |
| `edit` | prompt | Replace an exact substring. Fails unless `old_string` is unique (or `replace_all=true`). |
| `bash` | prompt | Run a shell command via `bash -lc`. Session-allow is keyed by the first token of the command. |
| `glob` | auto | Match files by pattern (`**/*.ts`), sorted by mtime. |
| `grep` | auto | Regex search. Uses `rg` if installed; falls back to a JS walk. |
| `web_fetch` | auto | Fetch a URL, convert HTML → markdown (≤ 100 KB). |
| `tasks_set` | auto | Publish a live task list for multi-step work. |
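The `edit` tool's uniqueness rule can be sketched as follows — a minimal illustration of the semantics described above, not the actual implementation:

```javascript
// Replace old_string in content. Fails if it is absent, or ambiguous
// (appears more than once) unless replaceAll is set.
function applyEdit(content, oldString, newString, replaceAll = false) {
  const count = content.split(oldString).length - 1;
  if (count === 0) throw new Error("old_string not found");
  if (count > 1 && !replaceAll) throw new Error("old_string is not unique; pass replace_all=true");
  return replaceAll ? content.split(oldString).join(newString) : content.replace(oldString, newString);
}
```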
┌───────────────────────────────────────────────────────────┐
│ kimiflare (Node.js TUI) │
user ─▶ │ │
│ user msg ─▶ agent loop ─▶ runKimi() ──[POST SSE]──▶ │
│ ▲ │
│ │ │
│ tool result ◀──tool executor──◀ tool_calls │
│ (permission modal for write / edit / bash) │
└───────────────────────────────────────────────────────────┘
│
▼
api.cloudflare.com/client/v4
/accounts/{ID}/ai/run/
@cf/moonshotai/kimi-k2.6
Direct fetch to Workers AI by default, or the native AI Gateway endpoint for Workers AI when aiGatewayId is configured. The payload remains OpenAI-compatible messages + tools, with an SSE stream containing reasoning, content, and tool-call deltas accumulated by index.
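Accumulating tool-call deltas by index works roughly like this — a sketch of the standard OpenAI-style streaming shape (field names follow that convention; this is not kimiflare's source):

```javascript
// Merge a stream of choice deltas into complete tool calls.
// Each delta may carry partial id, function name, or argument fragments,
// keyed by the tool call's index within the turn.
function accumulateToolCalls(deltas) {
  const calls = [];
  for (const d of deltas) {
    for (const tc of d.tool_calls ?? []) {
      const slot = (calls[tc.index] ??= { id: "", name: "", arguments: "" });
      if (tc.id) slot.id = tc.id;
      if (tc.function?.name) slot.name += tc.function.name;
      if (tc.function?.arguments) slot.arguments += tc.function.arguments;
    }
  }
  return calls;
}
```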
git clone https://github.com/sinameraji/kimiflare
cd kimiflare
npm install
npm run build
npm link # or: ln -s "$PWD/bin/kimiflare.mjs" ~/.local/bin/kimiflare
Scripts:
- `npm run build` — bundle with tsup (`dist/` + `bin/kimiflare.mjs`)
- `npm run dev` — run via tsx (`tsx src/index.tsx`)
- `npm run typecheck` — `tsc --noEmit`
- `npm start` — run compiled bin

Contributions are welcome!
1. Create a branch: `git checkout -b feat/your-feature`
2. Run `npm run typecheck` and `npm run build`
3. Commit: `git commit -m "feat: description"`
4. Push: `git push origin feat/your-feature`

You don't need a real MCP server to test the integration. Here's a minimal test server you can save as `test-mcp-server.js`:
// test-mcp-server.js — a minimal MCP server for testing
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { ListToolsRequestSchema, CallToolRequestSchema } from "@modelcontextprotocol/sdk/types.js";
const server = new Server({ name: "test-server", version: "1.0.0" }, { capabilities: { tools: {} } });
server.setRequestHandler(ListToolsRequestSchema, async () => ({
tools: [
{
name: "greet",
description: "Greet someone by name",
inputSchema: {
type: "object",
properties: { name: { type: "string" } },
required: ["name"],
},
},
{
name: "add",
description: "Add two numbers",
inputSchema: {
type: "object",
properties: { a: { type: "number" }, b: { type: "number" } },
required: ["a", "b"],
},
},
],
}));
server.setRequestHandler(CallToolRequestSchema, async (req) => {
if (req.params.name === "greet") {
return { content: [{ type: "text", text: `Hello, ${req.params.arguments.name}!` }] };
}
if (req.params.name === "add") {
const sum = req.params.arguments.a + req.params.arguments.b;
return { content: [{ type: "text", text: String(sum) }] };
}
throw new Error("Unknown tool");
});
const transport = new StdioServerTransport();
await server.connect(transport);
Then add it to your config:
{
"mcpServers": {
"test": {
"type": "local",
"command": ["node", "/path/to/test-mcp-server.js"]
}
}
}
Launch kimiflare and try:
- `/mcp list` — should show `test (local) — 2 tools`
- `use mcp_test_greet with name "kimiflare"` — should return `Hello, kimiflare!`
- `use mcp_test_add with a 3 and b 5` — should return `8`

For a real-world test, try the official GitHub MCP server:
{
"mcpServers": {
"github": {
"type": "local",
"command": ["npx", "-y", "@modelcontextprotocol/server-github"],
"env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_xxx" }
}
}
}
Then ask: search for issues labeled bug in sinameraji/kimiflare
MIT © Sina Meraji
Add this to `claude_desktop_config.json` and restart Claude Desktop.
{
"mcpServers": {
"kimiflare": {
"command": "npx",
"args": [
"-y",
"kimiflare"
]
}
}
}