Self-hosted semantic memory layer for Claude and MCP-compatible AI clients. Store notes, search by meaning not keywords, and recall relevant context automatically across sessions. Runs free on Cloudflare Workers, D1, Vectorize, and Workers AI

One memory layer, every AI tool. Store anything once and recall it across Claude, ChatGPT, Cursor, and any MCP-compatible client — self-hosted on Cloudflare's free tier in minutes.
License: MIT · Built with Cloudflare Workers · MCP Compatible
Every AI tool you use has the same problem: it forgets everything between sessions. Your projects, decisions, preferences — gone the moment you close the chat. Second Brain fixes that with a single self-hosted memory layer that works across all your AI tools. Store something in Claude Desktop, recall it in Cursor, reference it in claude.ai on your phone. One brain, everywhere.
It's a lightweight Cloudflare Worker that gives any MCP-compatible AI client (Claude Desktop, Claude Code, claude.ai, etc.) a persistent memory store — with semantic search powered by vector embeddings. You can capture notes from your browser, phone, or scripts, then have your AI automatically recall relevant context at the start of every session.
Five tools. One memory layer. Every AI client.
| Tool | Parameters | Description |
|---|---|---|
| `remember` | `content` (string), `tags?` (string[]), `source?` (string) | Store a note. Runs a duplicate check first — blocked if a near-exact match exists, flagged if similar. |
| `append` | `id` (string), `addition` (string) | Append new information to an existing entry. Preserves the original and adds the update with a timestamp. Use when something has changed rather than storing a duplicate. |
| `recall` | `query` (string), `topK?` (1–20, default 5), `tag?` (string) | Semantic vector search with chunk deduplication, optionally filtered by tag. |
| `list_recent` | `n?` (1–50, default 10), `tag?` (string) | Chronological listing, optionally filtered by tag. |
| `forget` | `id` (string) | Delete an entry and all its chunks from both D1 and Vectorize. |
A built-in web UI ships with every deploy — no extra setup required. Access it at your Worker URL: https://<your-worker-url>/
Three views:

```mermaid
flowchart TB
    subgraph Clients["Capture Sources"]
        UI[Web Dashboard]
        B[Browser Bookmarklet]
        I[iOS Shortcuts]
        C[Claude / MCP Clients]
        S[Scripts / curl]
    end
    subgraph Worker["Cloudflare Worker"]
        CAP[POST /capture]
        LIST[GET /list]
        MCP[GET+POST /mcp]
        DUP[Duplicate Detection]
        CHUNK[Chunking]
    end
    subgraph Storage["Cloudflare Storage"]
        D1[(D1 SQLite)]
        VEC[(Vectorize Index)]
        AI["Workers AI<br/>bge-small-en-v1.5"]
    end
    UI --> CAP
    B --> CAP
    I --> CAP
    S --> CAP
    C --> MCP
    CAP --> DUP
    DUP -->|blocked| CAP
    DUP -->|flagged| CHUNK
    DUP -->|unique| CHUNK
    CHUNK --> D1
    CHUNK --> AI
    AI --> VEC
    MCP --> AI
    MCP --> VEC
    MCP --> D1
    LIST --> D1
```
Every note is embedded as a 384-dimensional vector using bge-small-en-v1.5 on Workers AI. Semantic search queries the Vectorize index using cosine similarity — so "users drop off at the payment step" matches "onboarding problems" even though no keywords overlap.
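The metric itself is simple; a toy sketch of cosine similarity (4-dimensional vectors stand in for the real 384-dimensional embeddings, and Vectorize computes this server-side, so this is purely illustrative):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Vectorize computes this server-side; this toy version only
// illustrates why meaning-adjacent texts score high.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 4-d "embeddings" (real ones are 384-d vectors from bge-small-en-v1.5)
const paymentDropoff = [0.9, 0.1, 0.3, 0.0];
const onboardingIssues = [0.8, 0.2, 0.4, 0.1];
const unrelatedNote = [0.0, 0.9, 0.0, 0.8];

console.log(cosineSimilarity(paymentDropoff, onboardingIssues).toFixed(2)); // 0.98
console.log(cosineSimilarity(paymentDropoff, unrelatedNote).toFixed(2));    // 0.08
```

Vectors pointing in similar directions score near 1 regardless of magnitude, which is why semantically related notes match even with zero keyword overlap.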
Long notes are automatically split into overlapping chunks before embedding so each segment gets a clean vector. Near-duplicate content is detected and blocked or flagged before storing. Updates to existing entries are appended with a timestamp rather than stored as duplicates.
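A character-based sliding window captures the idea (a sketch only; the chunk size and overlap here are illustrative, and the real implementation may chunk by tokens rather than characters):

```typescript
// Split long text into overlapping windows so each segment fits the
// embedding model's context window. Sizes are illustrative constants,
// not the project's actual values.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (text.length <= chunkSize) return [text];
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance less than chunkSize so windows overlap
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, so its meaning is never lost to the split.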
The fastest path to a running second brain is the one-click deploy:
Click Deploy → Cloudflare forks the repo, provisions D1 + Vectorize, and deploys the Worker automatically.
Run the schema in Cloudflare Dashboard → D1 → second-brain-db → Console:
```sql
CREATE TABLE IF NOT EXISTS entries (
  id TEXT PRIMARY KEY,
  content TEXT NOT NULL,
  tags TEXT NOT NULL DEFAULT '[]',
  source TEXT NOT NULL DEFAULT 'api',
  created_at INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_entries_created_at ON entries(created_at DESC);
CREATE INDEX IF NOT EXISTS idx_entries_source ON entries(source);
```
Set your auth token:
```sh
openssl rand -base64 32   # generate a secure token
wrangler secret put AUTH_TOKEN
```
Test it:
```sh
curl -X POST https://<your-worker-url>/capture \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"content": "second brain is working", "source": "test"}'
# → {"ok":true,"id":"..."}
```
Open your dashboard at https://<your-worker-url>/
Connect to Claude → see Connect to AI Clients.
Your Worker URL is in Cloudflare Dashboard → Workers & Pages → second-brain. It looks like: `https://second-brain.<your-subdomain>.workers.dev`
If you prefer to deploy manually from a clone:
Prerequisite: the wrangler CLI (installed automatically via `npm install`).

```sh
# 1. Clone and install
git clone https://github.com/rahilp/second-brain-cloudflare.git
cd second-brain-cloudflare
npm install

# 2. Authenticate with Cloudflare
npx wrangler login

# 3. Create the D1 database
npm run db:create
# Copy the database_id output and paste it into wrangler.toml → [[d1_databases]] → database_id

# 4. Create the Vectorize index
npm run vectors:create

# 5. Run the schema migration
npm run db:migrate:remote

# 6. Set your auth token
openssl rand -base64 32
npx wrangler secret put AUTH_TOKEN

# 7. Deploy
npm run deploy
```
```sh
curl -X POST https://<your-worker-url>/capture \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Decided to use Cloudflare Workers for the API instead of Vercel — better cold start times and the free D1 DB is perfect for this scale.",
    "tags": ["architecture", "decision"],
    "source": "notes"
  }'
```

```json
{ "ok": true, "id": "f47ac10b-58cc-4372-a567-0e02b2c3d479" }
```
```sh
curl "https://<your-worker-url>/list?n=5" \
  -H "Authorization: Bearer YOUR_TOKEN"
```
Once connected, ask Claude naturally:
You: "What do I know about infrastructure decisions?"
Claude (calling `recall`): "1. [May 10 · notes] (94%) Decided to use Cloudflare Workers for the API instead of Vercel — better cold start times..."
Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "second-brain": {
      "command": "npx",
      "args": ["mcp-remote", "https://<your-worker-url>/mcp"]
    }
  }
}
```
Restart Claude Desktop, then add this to Settings → Custom Instructions:
```
You have access to a personal second brain via MCP tools: remember, recall, list_recent, forget, append.
MANDATORY RULES — no exceptions:
1. At the start of EVERY conversation, call recall with the main topic before responding to anything. Do not skip this even if the topic seems simple.
2. Store EVERYTHING important automatically — call remember whenever the user mentions:
   - Anything personal (goals, preferences, habits, relationships, health)
   - Work context (projects, decisions, strategies, problems, progress)
   - Ideas or plans (even half-formed ones)
   - Tasks or commitments ("I need to", "I'm going to", "we decided")
   - Opinions or reactions ("I think", "I like", "I don't want")
3. When something has changed or evolved, use append instead of remember — call append whenever the user says things like "actually", "we changed our minds", "update that", "I've decided", "that's no longer true", or "we reversed that decision". Never store a duplicate when an update to an existing entry is more appropriate.
4. Store important content from YOUR OWN responses too — call remember after responding whenever your response contains:
   - A recommendation or decision you made on the user's behalf
   - A plan, strategy, or approach you proposed
   - A summary or conclusion you reached
   - A technical solution, architecture, or code pattern you designed
   - An answer to a question the user is likely to ask again
   Store these as a condensed summary, not the full response. Tag with "claude-response" plus any relevant project tags.
5. Never ask permission to store — store silently and keep going.
6. NEVER use Claude's built-in memory system. If you would normally save a memory, call remember instead. Always.
7. Auto-detect the current topic or project and include it as a tag (e.g. if discussing a website, tag it "website"; if discussing a specific company or product, use that name as a tag). Always combine specific tags with generic ones.
Tags to use:
- personal — life, preferences, habits
- work — projects, decisions, strategy
- idea — concepts, plans, brainstorms
- task — things to do or follow up on
- context — background info about ongoing situations
- claude-response — summaries of important responses Claude gave
- [auto-detected project/topic tag]
Always set source to "claude-desktop" when storing.
If the second brain MCP tools are unavailable, tell me immediately. Do not fall back to built-in memory silently.
```
```sh
claude mcp add second-brain "npx" "mcp-remote" "https://<your-worker-url>/mcp"
```
Create `~/.claude/CLAUDE.md`:

```markdown
# Second Brain — mandatory rules
You have access to a personal second brain via MCP tools: remember, recall, list_recent, forget, append.
MANDATORY RULES — no exceptions:
1. At the start of EVERY conversation, call recall with the main topic before responding to anything. Do not skip this even if the topic seems simple.
2. Store EVERYTHING important automatically — call remember whenever the user mentions:
   - Anything personal (goals, preferences, habits, relationships, health)
   - Work context (projects, decisions, strategies, problems, progress)
   - Ideas or plans (even half-formed ones)
   - Tasks or commitments ("I need to", "I'm going to", "we decided")
   - Opinions or reactions ("I think", "I like", "I don't want")
3. When something has changed or evolved, use append instead of remember — call append whenever the user says things like "actually", "we changed our minds", "update that", "I've decided", "that's no longer true", or "we reversed that decision". Never store a duplicate when an update to an existing entry is more appropriate.
4. Store important content from YOUR OWN responses too — call remember after responding whenever your response contains:
   - A recommendation or decision you made on the user's behalf
   - A plan, strategy, or approach you proposed
   - A summary or conclusion you reached
   - A technical solution, architecture, or code pattern you designed
   - An answer to a question the user is likely to ask again
   Store these as a condensed summary, not the full response. Tag with "claude-response" plus any relevant project tags.
5. Never ask permission to store — store silently and keep going.
6. NEVER use Claude's built-in memory system. If you would normally save a memory, call remember instead. Always.
7. Auto-detect the current topic or project and include it as a tag (e.g. if discussing a website, tag it "website"; if discussing a specific company or product, use that name as a tag). Always combine specific tags with generic ones.
Tags to use:
- personal — life, preferences, habits
- work — projects, decisions, strategy
- idea — concepts, plans, brainstorms
- task — things to do or follow up on
- context — background info about ongoing situations
- claude-response — summaries of important responses Claude gave
- [auto-detected project/topic tag]
Always set source to "claude-code" when storing.
If the second brain MCP tools are unavailable, tell me immediately. Do not fall back to built-in memory silently.
```
In claude.ai → Settings → Integrations → Add custom connector:
| Field | Value |
|---|---|
| Name | second-brain |
| Remote MCP server URL | https://<your-worker-url>/mcp |
This makes your second brain available in both the web app and the Claude iOS app automatically.
Create a new browser bookmark and paste the following as the URL — replacing YOUR_WORKER_URL and YOUR_TOKEN:
```js
javascript:(function(){
  const WORKER='https://YOUR_WORKER_URL/capture';
  const TOKEN='YOUR_TOKEN';
  const text=window.getSelection().toString().trim();
  const content=text?`${text}\n\n${document.title}\n${location.href}`:`${document.title}\n${location.href}`;
  fetch(WORKER,{method:'POST',headers:{'Authorization':`Bearer ${TOKEN}`,'Content-Type':'application/json'},body:JSON.stringify({content,source:'browser',tags:['reading']})})
    .then(r=>r.json())
    .then(()=>{
      const b=document.createElement('div');
      b.textContent='✓ Saved to brain';
      Object.assign(b.style,{position:'fixed',top:'20px',right:'20px',zIndex:'99999',background:'#1a1a1a',color:'#fff',padding:'10px 16px',borderRadius:'8px',fontSize:'14px'});
      document.body.appendChild(b);
      setTimeout(()=>b.remove(),2000)
    })
    .catch(()=>alert('Capture failed — check your token and Worker URL'));
})();
```
Usage: optionally select text on the page, then click the bookmark. The selection (or just the page title and URL) is sent to `/capture` with `source: 'browser'` and the `reading` tag.

The full source with comments is in `bookmarklet.js`.
Configure the shortcut to call `https://YOUR_WORKER_URL/capture` with:

- Method: POST
- Header: `Authorization = Bearer YOUR_TOKEN`
- Body: `content` = Ask for Input result, `source` = `phone`

Download Shortcut — after installing, open the shortcut and update YOUR_WORKER_URL and YOUR_TOKEN with your values.

For voice capture, set `source` = `voice`. Name it something Siri-friendly like "Brain dump" to trigger hands-free: "Hey Siri, Brain dump."

Download Shortcut — after installing, open the shortcut and update YOUR_WORKER_URL and YOUR_TOKEN with your values.

Save any link directly from Safari or any app: `source` = `browser`, `tags` = `["reading"]`.

All endpoints require an `Authorization: Bearer YOUR_TOKEN` header (except CORS preflight).
`POST /capture`: Store an entry. Duplicate detection runs synchronously; embedding happens in the background, so the response returns as soon as the duplicate check completes.
Request body:
```jsonc
{
  "content": "your note here",   // required
  "tags": ["work", "idea"],      // optional
  "source": "api"                // optional, defaults to "api"
}
```
Responses:

Stored:

```json
{ "ok": true, "id": "uuid-v4" }
```

Stored, but a similar entry exists:

```json
{
  "ok": true,
  "id": "uuid-v4",
  "warning": "similar",
  "matchId": "existing-uuid",
  "score": 88.5,
  "message": "Stored but similar entry exists — tagged as duplicate-candidate"
}
```

Blocked as a near-exact duplicate:

```json
{
  "ok": false,
  "duplicate": true,
  "matchId": "existing-uuid",
  "score": 97.2,
  "message": "Near-exact duplicate detected — not stored"
}
```
| Status | Meaning |
|---|---|
| `200` `ok:true` | Entry stored successfully |
| `200` `ok:false` `duplicate:true` | Blocked — near-exact duplicate |
| `400` | Missing/invalid `content` or malformed JSON |
| `401` | Missing or invalid auth token |
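Because a blocked duplicate still returns HTTP 200, clients should branch on the body rather than the status code. A hypothetical helper (the function and type names are mine, not from the project):

```typescript
type CaptureOutcome = "stored" | "flagged" | "blocked" | "error";

// Interpret a /capture response. A blocked duplicate is still HTTP 200;
// the signal is in the body ({ ok: false, duplicate: true }).
function interpretCapture(
  status: number,
  body: { ok?: boolean; duplicate?: boolean; warning?: string },
): CaptureOutcome {
  if (status !== 200) return "error";                       // 400 / 401 etc.
  if (body.ok === false && body.duplicate) return "blocked"; // near-exact duplicate
  if (body.ok && body.warning === "similar") return "flagged";
  return body.ok ? "stored" : "error";
}
```

A capture client might surface "flagged" as a soft warning to the user while treating "blocked" as a no-op.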
`GET /list?n=20`: List recent entries in reverse chronological order.
| Query param | Default | Max | Description |
|---|---|---|---|
| `n` | 20 | 100 | Number of entries to return |
`GET`/`POST` `/mcp`: MCP server endpoint using the Streamable HTTP transport. Connect any MCP-compatible client here.
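Under the hood, each tool invocation arriving at `/mcp` is a JSON-RPC 2.0 message. A sketch of the payload an MCP client sends to call `recall` (the envelope follows the MCP spec; the initialization handshake and session headers are omitted here):

```typescript
// A JSON-RPC 2.0 "tools/call" request, as an MCP client would POST it
// to /mcp. Envelope shape per the MCP spec; initialize handshake and
// any session headers are omitted from this sketch.
const recallRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "recall",
    arguments: { query: "infrastructure decisions", topK: 5 },
  },
};

console.log(JSON.stringify(recallRequest, null, 2));
```

The MCP SDK builds and parses these envelopes for you; this is only what travels over the wire.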
Every entry is embedded using bge-small-en-v1.5 via Workers AI, converting text into a 384-dimensional vector that represents its meaning. When you call recall, your query is embedded the same way and Cloudflare Vectorize finds the closest stored vectors by cosine similarity.
Example: Store "users drop off at the payment step" and later recall it with "onboarding problems." The keyword "payment" never appears in the query — but the meaning matches.
This is what separates Second Brain from a simple keyword search or a tag system.
Long notes are automatically split into overlapping segments before embedding. This solves two problems:

- The embedding model (bge-small-en-v1.5) has a ~512 token limit. Content beyond that is truncated — chunking ensures the full note is searchable.
- A single vector for a long, multi-topic note blurs its meaning; chunking gives each segment a clean vector.

How it works:

- `recall` fetches extra results and deduplicates by parent ID, returning only the best-matching chunk per entry
- `forget` deletes the parent entry and all its chunks

Before storing, every entry is checked against existing vectors for similarity. Three outcomes:
| Similarity score | Outcome | Response |
|---|---|---|
| >= 95% | Blocked | Returns existing entry ID, nothing stored |
| 85–95% | Flagged | Stored with duplicate-candidate tag, match info included in response |
| < 85% | Unique | Stored normally |
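The thresholds reduce to a small decision function; a sketch mirroring the table (the function name is illustrative, and the exact boundary handling at 95 and 85 is an assumption since the table gives ranges only):

```typescript
type DedupOutcome = "blocked" | "flagged" | "unique";

// Map a similarity score (in percent) to the outcome in the table above.
// Boundary handling (>= vs >) is assumed, not taken from the source.
function classifySimilarity(score: number): DedupOutcome {
  if (score >= 95) return "blocked"; // near-exact: refuse to store
  if (score >= 85) return "flagged"; // store, tag as duplicate-candidate
  return "unique";                   // store normally
}

console.log(classifySimilarity(97.2)); // "blocked" — matches the example response above
console.log(classifySimilarity(88.5)); // "flagged"
console.log(classifySimilarity(42));   // "unique"
```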
This prevents the brain from accumulating near-identical entries from clicking the bookmarklet twice on the same article, or Claude storing the same context multiple times across sessions.
The duplicate check requires one embed call before inserting, adding ~300ms to each capture. This runs synchronously so the response always reflects what actually happened.
| Service | Role |
|---|---|
| Cloudflare Workers | Serverless runtime — globally distributed, ~0ms cold start |
| Cloudflare D1 | SQLite-compatible relational database for structured storage |
| Cloudflare Vectorize | Vector index for semantic (cosine) similarity search |
| Cloudflare Workers AI | Runs bge-small-en-v1.5 for text embeddings |
| MCP TypeScript SDK | Implements the Model Context Protocol server |
All free tier at personal scale — no credit card required for typical usage.
```sh
npm install
npm run dev   # starts wrangler dev with local D1 + Vectorize stubs
```
Note: Vectorize and Workers AI are only available remotely. For local development, embedding calls will gracefully fail and entries will still be stored in D1 without vectors.
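That graceful failure can be sketched as a wrapper that turns embedding errors into a null vector so the D1 write still proceeds (illustrative only; the project's actual error handling may differ):

```typescript
// Try to embed; on failure (e.g. no Workers AI binding in local dev),
// return null so the entry is still written to D1 without a vector.
async function embedOrNull(
  embed: (text: string) => Promise<number[]>,
  text: string,
): Promise<number[] | null> {
  try {
    return await embed(text);
  } catch {
    return null; // caller skips the Vectorize insert, keeps the D1 insert
  }
}
```

Entries stored this way are listable via `/list` but invisible to `recall` until re-embedded against the remote index.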
To run against remote resources during development:
```sh
npx wrangler dev --remote
```
| Script | Description |
|---|---|
| `npm run dev` | Start local dev server |
| `npm run deploy` | Deploy to Cloudflare Workers |
| `npm run db:create` | Create the D1 database |
| `npm run db:migrate` | Run schema against local D1 |
| `npm run db:migrate:remote` | Run schema against remote D1 |
| `npm run vectors:create` | Create the Vectorize index |
MIT — use it, fork it, make it your own.