MCP server for analyzing Clay tables — schema, rows, errors, credit cost, and enrichment debugging, via Clay's public v3 API.
Connects Claude (Desktop, Code, or any MCP client) to Clay. Share any app.clay.com URL and Claude can read the schema, pull rows, trace enrichment failures end-to-end through Clay Functions (subroutines), and see exactly how many credits each cell consumed. Writing formulas and Claygent prompts lives in the companion clay-gtm-architect project — slab reads and debugs, the architect builds.
slab authenticates with a Clay API key — cross-platform (macOS / Linux / Windows), no Chrome dependency.
clay-formulas and clay-prompt-eng skills, which walk the workflow: gather inputs, pick the mode, apply the section structure and casing conventions, validate against the production-bug checklist.
Most of the interesting state in a Clay table (schema, formula text, run conditions, cell statuses, provider responses, credit cost per cell, subroutine pointers) comes back from the /v3/ API. slab wraps that API, adds session caches, and exposes the whole thing as MCP tools. This branch uses a Clay API key for auth so the server runs anywhere Node runs, with no browser or Keychain dependency.
A core design choice: slab returns structured JSON, not pre-digested prose. Earlier versions rendered markdown summaries, classified identifier values by shape, and flagged columns as "likely broken" inside the script. That's all gone. The current shape is: fetch → project to a token-cheap form → return raw JSON. Interpretation, classification, and judgment happen in the LLM. The script's job is fetching, projecting, and caching — not deciding.
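The fetch → project → return shape can be illustrated with a minimal sketch. Only typeSettings is named in this README; the other raw keys (uiStrings, outputSchema), the KEEP_KEYS list, and the function names are hypothetical, and the real projection in src/clay-api.js may look quite different.

```javascript
// Hypothetical sketch of "project to a token-cheap form".
// Every key except typeSettings is an assumption made for illustration.
const KEEP_KEYS = ["id", "name", "type", "typeSettings"];

// Copy only the whitelisted keys from one raw field definition.
function projectField(rawField) {
  const projected = {};
  for (const key of KEEP_KEYS) {
    if (rawField[key] !== undefined) projected[key] = rawField[key];
  }
  return projected;
}

// Project a whole schema: keep the table id and the slimmed-down fields.
function projectSchema(rawSchema) {
  return {
    tableId: rawSchema.tableId,
    fields: (rawSchema.fields ?? []).map(projectField),
  };
}

const rawExample = {
  tableId: "t_example",
  fields: [
    {
      id: "f_1",
      name: "Domain",
      type: "formula",
      typeSettings: { formulaText: "{{Website}}" },
      uiStrings: { tooltip: "..." },    // dropped by the projection
      outputSchema: { properties: {} }, // dropped by the projection
    },
  ],
};

console.log(JSON.stringify(projectSchema(rawExample), null, 2));
```

The projection makes no judgment about the field; it only drops keys, which matches the "fetching, projecting, and caching, not deciding" split described above.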
Before starting, make sure you have these installed:
1. git
Run git --version in Terminal. If you see command not found or a dialog asking to install developer tools, run:
xcode-select --install
Click Install in the dialog that appears and wait for it to finish (~5–10 min) before continuing.
2. Node.js ≥ 18
Run node -v in Terminal. If you see command not found or a version below 18:
- .pkg installer from nodejs.org: click the LTS button, double-click the downloaded file, and follow the prompts. This is standalone and doesn't require Homebrew or Xcode.
- brew install node: only use this if Homebrew is already installed and your Xcode Command Line Tools are fully set up (i.e. step 1 above is complete).
After installing, close and reopen Terminal, then confirm with node -v.
3. A Clay API key
Go to Settings → Account → API Key in Clay and copy the key. You'll need it in Step 2 below.
Run each command separately in Terminal — don't paste them all at once:
git clone https://github.com/gunnerpark-alt/slab-mcp.git
cd slab-mcp
npm install
No build step — npm install is everything. The server runs directly as node index.js. Note the directory path — you'll need it in Step 3.
In Clay, go to Settings → Account → API Key and copy the key. It works across every workspace you belong to — you don't need a separate key per workspace.
First, find your exact path by running this in Terminal (from inside the slab-mcp folder):
pwd
It will print something like /Users/yourname/slab-mcp. Your config path will be that output with /index.js appended.
If the config file doesn't exist yet (e.g. you just installed Claude Desktop and never opened it, or open ~/Library/Application\ Support/Claude/claude_desktop_config.json says the file doesn't exist), create it from scratch. Replace /Users/yourname/slab-mcp with your actual pwd output:
mkdir -p ~/Library/Application\ Support/Claude
cat > ~/Library/Application\ Support/Claude/claude_desktop_config.json << 'EOF'
{
"mcpServers": {
"slab": {
"command": "node",
"args": ["/Users/yourname/slab-mcp/index.js"],
"env": {
"CLAY_API_KEY": "paste-your-key-here"
}
}
}
}
EOF
If the file already exists, open it in a text editor and add the "slab" block inside the existing "mcpServers" object — don't replace the whole file:
{
"mcpServers": {
"slab": {
"command": "node",
"args": ["/Users/yourname/slab-mcp/index.js"],
"env": {
"CLAY_API_KEY": "paste-your-key-here"
}
}
}
}
Either way, replace paste-your-key-here with your Clay API key and /Users/yourname/slab-mcp/index.js with your actual path.
Don't put the key in args; use env so it stays out of shell history and logs.
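For an existing config file, the "add, don't replace" rule can be sketched as a pure merge over the parsed JSON. The function name addSlabServer is hypothetical; the path and key are the same placeholders used above.

```javascript
// Sketch: add a "slab" entry to a parsed Claude Desktop config object
// while preserving every other key and every other server.
function addSlabServer(config, indexPath, apiKey) {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      slab: {
        command: "node",
        args: [indexPath],
        env: { CLAY_API_KEY: apiKey }, // key goes in env, never in args
      },
    },
  };
}

// Example: a config that already has another server registered.
const existingConfig = {
  mcpServers: { other: { command: "node", args: ["/tmp/other.js"] } },
};

const mergedConfig = addSlabServer(
  existingConfig,
  "/Users/yourname/slab-mcp/index.js",
  "paste-your-key-here"
);

console.log(JSON.stringify(mergedConfig, null, 2));
```

To apply it to the real file you would read and JSON.parse the config, pass it through the function, and write the JSON.stringify result back; that file I/O is left out here to keep the sketch side-effect free.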
Fully quit the app (Cmd+Q on Mac), then reopen it. A simple window close isn't enough — the MCP server only starts on launch.
In Claude Desktop go to Settings → Developer. You should see slab listed with a green connected indicator.
If it shows an error:
Run ls /path/to/slab-mcp/index.js in Terminal to confirm the file exists at the path you put in the config.
Add to ~/.claude.json (user-level) or a project .mcp.json:
{
"mcpServers": {
"slab": {
"command": "node",
"args": ["/Users/yourname/slab-mcp/index.js"],
"env": {
"CLAY_API_KEY": "paste-your-key-here"
}
}
}
}
Then run /mcp in Claude Code and reconnect slab.
slab is a standard stdio MCP server — anything that speaks MCP can run it:
CLAY_API_KEY=<your-key> node /Users/yourname/slab-mcp/index.js
slab exposes eight data tools — what a table IS and what's IN it. Builder workflows (writing formulas, writing Claygent prompts) live in the companion clay-gtm-architect project instead.
| Tool | Use when | Returns |
|---|---|---|
| sync_table | URL contains /tables/ (not /workbooks/) | { rootSchema, subroutines }. Schema includes every field's full typeSettings (formula text, prompts, run conditions, full inputsBinding) and pricing on action fields (basic credits, actionExecution, pre/post-2026 pricing). Recursively syncs invoked functions to depth 3, max 20 tables. |
| sync_workbook | URL contains /workbooks/ | { workbookId, tables, externalSubroutines, errors }. Every table in the workbook plus any function it calls that lives elsewhere. Cross-table connections are not pre-computed; derive them from typeSettings. |
| get_rows | Show data, check fill rates, find a row's _rowId by query, look up an entity by name / domain / email, switch to a saved view without re-syncing | { totalRows, returnedCount, view, rows }. Every row carries _rowId. Accepts either tableId (from a prior sync_table) or url (auto-syncs). With query, rows also have matchedColumns (every column whose cell matched); pass identifier_column to scope the search to one column. Pass view (viewId gv_* or view name like "All rows" / "Errored rows") to query a different saved view than the one picked at sync time. |
| export_csv | You need all (or most) rows for analysis and display values are enough: full-table scans, coverage analysis, summarising patterns across the dataset | { totalRows, returnedCount, truncated, view, columns, rows }. Kicks off a Clay export job, polls until done, downloads once: one round-trip regardless of table size, no row cap. Pass columns to project down to just the columns you need (the only way wide/large tables fit the ~900KB response budget); truncated + a warning flag when rows were cut. Accepts view like get_rows. Not for membership checks; that's find_rows. |
| find_rows | You have a list of IDs / emails / domains and need to know which ones exist in a column: membership checks that would blow the response budget as a full export | { column, searched, uniqueSearched, matchedCount, notFoundCount, totalRowsScanned, matches: [{ value, _rowId, ...return_columns }], not_found }. Fetches rows inside the server and intersects there; raw table data never crosses the MCP transport, so table size can't blow the budget. Matching is exact + case-sensitive against one column; return_columns adds context fields to each match; every match carries _rowId for follow-up get_record. Scans up to the records-API cap (20000 rows). |
| get_record | You have a _rowId and need raw provider JSON, credit cost, or you're following a subroutine origin pointer | { _rowId, _credits, <columnName>: { value, status, fullContent, credits? } }. |
| get_credits | "How much did row X cost" / "average credit cost per row" / "which column is most expensive" | One tool, three modes. With rowId: that row's per-column breakdown, including the cost of any function (execute-subroutine) calls it triggered; it recursively follows origin pointers to function rows since parent cells don't carry subroutine cost. Without rowId: samples N rows (default 50), aggregates, splits direct columns from subroutine columns, and extrapolates a total table cost. Pass full: true to scan every row, or subroutine_depth: 0 to skip recursion (parent-only cost, will undercount). |
| get_errors | Broad "what's failing" / health check | { rowsAnalyzed, view, columns: [{ success, error, hasNotRun, queued, total, fillPct, topErrors }] }. Counts only; Claude derives "broken" vs "gated by run condition" from the schema. Pass view="Errored rows" (or any saved view) to scope the count; on a table with that view, this is much faster than counting across the whole table. |
The MCP server's instructions field carries the full decision tree (which tool when, the cost rule of thumb between cheap surface reads and expensive nested fetches, how to follow subroutine pointers, how to interpret credit fields) — Claude sees it on every conversation that uses any slab tool.
slab's job is to bridge Clay's internal API to an LLM. The architecture follows from one constraint and one bet:
The constraint: Clay's /v3/ API responses are huge. A single 19-column table's schema is hundreds of KB of action definitions, output schemas, and UI strings. Returning them raw would multiply token cost 5–10x before Claude reads anything.
The bet: every other piece of judgment — what counts as "broken," whether a value is an email or a domain, how to format an explanation — is better done by the LLM than by JavaScript heuristics. Scripts are fast and deterministic but rigid; the moment Clay reworks an error string or adds a new identifier shape, a heuristic silently misclassifies. The LLM weighs context.
Together those two ideas produce slab's shape: scripts handle what the LLM can't do cheaply (auth, polling, pagination, caching, projecting away noise); the LLM handles everything else.
index.js MCP server, tool definitions, decision-tree instructions
src/clay-api.js Clay v3 API client — endpoint helpers + schema projection
src/auth.js Credential resolver (CLAY_API_KEY env → ~/.slab/config.json fallback)
src/row-utils.js Status counting, record projection (token-cheap shape)
Four things are worth understanding deeper because they're what makes slab actually useful for "explain this table" / "why did X fail" questions, beyond just listing endpoints:
In Clay, a "function" is another table invoked from a parent column with actionKey: "execute-subroutine". The parent's schema tells you WHICH function runs; the function's own schema tells you HOW. Without the function's schema, any explanation of the parent is incomplete.
sync_table follows every execute-subroutine reference and syncs the target table too, recursively, up to depth 3 with a 20-table cap. sync_workbook does the same and additionally pulls in any function referenced by a workbook table that lives outside the workbook. Every fetched schema goes into the in-memory cache, so subsequent get_rows / get_record / get_errors calls on those tableIds work without a fresh sync.
The result: one call brings back the entire call graph the parent table participates in.
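A bounded walk like the one described above might look roughly like this. It is a sketch under assumptions: fetchSchema is a stand-in for the real API client, and reading the target from typeSettings.tableId (and actionKey from typeSettings) is a guess at where the pointer lives, not something this README specifies.

```javascript
// Sketch of a depth- and count-limited walk over execute-subroutine references.
const MAX_DEPTH = 3;   // root is depth 0; functions are followed to depth 3
const MAX_TABLES = 20; // hard cap on schemas fetched in one sync

// Hypothetical: pull target table ids out of a projected schema.
function subroutineTargets(schema) {
  return (schema.fields ?? [])
    .filter((f) => f.typeSettings?.actionKey === "execute-subroutine")
    .map((f) => f.typeSettings.tableId)
    .filter(Boolean);
}

// Breadth-first: fetch one level at a time, skipping tables already cached
// (which also protects against functions that call each other in a cycle).
async function syncWithSubroutines(rootId, fetchSchema) {
  const cache = new Map(); // tableId -> schema
  let frontier = [rootId];
  for (let depth = 0; depth <= MAX_DEPTH && frontier.length > 0; depth++) {
    const next = [];
    for (const tableId of frontier) {
      if (cache.has(tableId) || cache.size >= MAX_TABLES) continue;
      const schema = await fetchSchema(tableId);
      cache.set(tableId, schema);
      next.push(...subroutineTargets(schema));
    }
    frontier = next;
  }
  return cache;
}
```

Returning the populated map mirrors the behaviour described above, where every fetched schema lands in the in-memory cache so later calls on those tableIds need no fresh sync.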
get_rows has two paths, both backed by Clay's internal records API — no CSV export jobs involved (that's export_csv, below), no session caching.
With a query — server-side search. Slab posts to Clay's internal POST /tables/{tableId}/views/{viewId}/search endpoint with { searchTerm }. Clay runs the same case-insensitive substring match its own UI search box uses, returning { results: [{ fieldId, recordId }] } — one entry per matching cell, so the same record can repeat across columns. Slab dedupes by recordId, builds a matchedColumns list per match, then fetches each unique record in parallel (concurrency 5) to populate display values. One round-trip plus N record fetches, no row-count ceiling. Server caps at 1000 matching cells per response — broad substrings like "@" or a common domain may saturate it; slab surfaces a hitCapWarning when that happens.
This replaced an earlier paginate-the-whole-CSV approach that on a 100k-row table scanned forever and could fail mid-way. Server-side search returns in seconds for any table size.
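The dedupe-then-fetch step of the query path can be sketched as two small helpers. The input shape { fieldId, recordId } is the one described above; fieldNames (a fieldId to column-name lookup) and both function names are hypothetical.

```javascript
// Collapse per-cell search hits into one entry per record,
// collecting every column that matched for that record.
function groupMatches(results, fieldNames = {}) {
  const byRecord = new Map(); // recordId -> Set of column names
  for (const { fieldId, recordId } of results) {
    if (!byRecord.has(recordId)) byRecord.set(recordId, new Set());
    byRecord.get(recordId).add(fieldNames[fieldId] ?? fieldId);
  }
  return [...byRecord].map(([recordId, columns]) => ({
    _rowId: recordId,
    matchedColumns: [...columns],
  }));
}

// Run an async worker over items with at most `limit` calls in flight.
// Results keep the order of the input array.
async function mapWithConcurrency(items, limit, worker) {
  const out = new Array(items.length);
  let nextIndex = 0;
  async function run() {
    while (nextIndex < items.length) {
      const i = nextIndex++;
      out[i] = await worker(items[i], i);
    }
  }
  const lanes = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: lanes }, () => run()));
  return out;
}

const hits = [
  { fieldId: "f_name", recordId: "r1" },
  { fieldId: "f_url", recordId: "r1" },
  { fieldId: "f_name", recordId: "r2" },
];
console.log(groupMatches(hits, { f_name: "Name", f_url: "URL" }));
```

With these two pieces, fetching display values for each unique record would be something like mapWithConcurrency(groupMatches(results), 5, fetchRecord), where fetchRecord is whatever client call retrieves a single record.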
Without a query — direct read. GET /records?limit=N returns the first N rows, each with its own id and cells. Slab projects cells.value to a display dict, attaches _rowId, and returns. One API call, no caching needed.
The records endpoint is also what powers get_errors and get_credits aggregate mode via listRows, which fetches up to RECORDS_API_CAP (20000) rows in one shot — the records API silently ignores every pagination param we tested (offset, cursor, after, page, ...). get_errors flags truncated in its response when the view exceeds the cap so partial counts aren't read as full coverage.
A search like "acme" will often hit multiple columns simultaneously — Name, URL, parent company. Each match's matchedColumns list is what Claude uses to pick the right hit. There's no script-side priority hierarchy.
Bulk reads get two dedicated tools on top of these primitives. export_csv bypasses the records API entirely: it kicks off Clay's async export job, polls until the file is ready, and downloads the result once — one round-trip regardless of table size, no row cap. The parsed result is truncated to fit the MCP response budget (~900KB) with a truncated flag and warning when rows were dropped, which is why its columns projection parameter matters on wide tables. find_rows goes the other way: it pulls rows via the records API (same 20k cap) but does the set intersection inside the server process, so only matches and misses cross the MCP transport — a membership check against a list of values costs the same whether the table has 50 rows or 20,000.
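The in-server intersection behind find_rows can be sketched as below. The result keys follow the tool's documented return shape; the function itself is illustrative, and counting matchedCount as the number of matching rows (rather than matched values) is an assumption.

```javascript
// Sketch: exact, case-sensitive membership check against one column.
// `rows` stands in for rows already fetched from the records API,
// each a flat object of display values plus _rowId.
function findRows(rows, column, values, returnColumns = []) {
  const wanted = new Set(values);
  const seen = new Set();
  const matches = [];
  for (const row of rows) {
    const value = row[column];
    if (!wanted.has(value)) continue;
    seen.add(value);
    const match = { value, _rowId: row._rowId };
    for (const col of returnColumns) match[col] = row[col];
    matches.push(match);
  }
  const notFound = [...wanted].filter((v) => !seen.has(v));
  return {
    column,
    searched: values.length,
    uniqueSearched: wanted.size,
    matchedCount: matches.length, // assumption: counts rows, not values
    notFoundCount: notFound.length,
    totalRowsScanned: rows.length,
    matches,
    not_found: notFound,
  };
}

const sampleRows = [
  { _rowId: "r1", "Account ID": "A1", Name: "Acme" },
  { _rowId: "r2", "Account ID": "a2", Name: "Beta" },
];
console.log(findRows(sampleRows, "Account ID", ["A1", "A2", "A1"], ["Name"]));
```

Only the summary object would cross the MCP transport; the scanned rows stay inside the server process, which is why the response size depends on the length of the value list rather than on the table.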
When a record's cell contains fullContent.origin = { tableId, recordId }, that cell is the OUTPUT of a function execution and origin is a POINTER to the row that actually ran. The parent row tells you the function "succeeded" with a display value; the child row reveals what happened inside — which provider in the waterfall ran, which inputs the parent passed in, which run conditions gated.
The MCP server's instructions require Claude to follow every origin pointer with get_record(origin.tableId, origin.recordId) before calling an execution trace complete, recursing up to 3 levels deep. Without that, "why did X fail for Y?" answers are surface-level guesses. With it, you get the full execution graph.
This is the highest-leverage thing slab does. It's also the easiest part to get wrong if you skip the follow-up calls — which is why it's spelled out in both the server instructions and the get_record tool description.
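The follow-up pattern can be sketched as a small recursive trace. Here getRecord is a stand-in for the get_record tool call, the record shape follows the tool's documented return value, and the function names and the shape of the returned tree are hypothetical.

```javascript
// Sketch: follow fullContent.origin pointers from a record, up to 3 levels.
const MAX_TRACE_DEPTH = 3;

// List every cell in a record that points at a function row.
function originPointers(record) {
  const pointers = [];
  for (const [column, cell] of Object.entries(record)) {
    const origin = cell?.fullContent?.origin;
    if (origin?.tableId && origin?.recordId) {
      pointers.push({ column, tableId: origin.tableId, recordId: origin.recordId });
    }
  }
  return pointers;
}

// Build a tree: each node holds its record plus the traced function rows
// that its origin pointers lead to. Children are fetched in parallel.
async function traceExecution(getRecord, tableId, recordId, depth = 0) {
  const record = await getRecord(tableId, recordId);
  const node = { tableId, recordId, record, children: [] };
  if (depth >= MAX_TRACE_DEPTH) return node; // stop following pointers
  node.children = await Promise.all(
    originPointers(record).map(async (p) => ({
      column: p.column,
      ...(await traceExecution(getRecord, p.tableId, p.recordId, depth + 1)),
    }))
  );
  return node;
}
```

The depth limit also bounds the walk if two function rows ever point at each other, so no separate visited set is needed in this sketch.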
Every action cell in a Clay record carries credit data in externalContent:
- upfrontCreditUsage.totalCost: basic credits charged for the run.
- additionalCreditUsage.totalCost: credits charged after the run completed (e.g., long-running providers).
- hiddenValue.costDetails.totalCostToAIProvider: for AI columns, the underlying OpenAI / Anthropic dollar cost.
The schema-level pricing (field.actionDefinition.pricing.credits) tells you what a column COSTS per run; the record-level usage tells you what a row ACTUALLY COST. Both are exposed:
- sync_table returns each action field's pricing (current credits + post-2026 pricing).
- get_record returns per-cell credits.{ total, upfront, additional, aiProviderCost } and a row-level _credits.{ total, billedCellCount } roll-up.
- get_credits rolls credit data up across rows in one call: pass a rowId for one row's breakdown, or omit it to sample N rows (default 50) and extrapolate a total table cost. Pass full: true to scan every row.
Subroutine cost is billed separately, and get_credits follows it. When a parent row has an execute-subroutine cell (a column that invokes another table as a function), that cell's credits is null. The actual function-call credits are billed on the function row that ran, reachable via fullContent.origin.{tableId, recordId}. get_record alone shows only the parent's direct cost, which can drastically undercount the true per-row spend on tables that use functions heavily. get_credits recursively follows the origin pointers and returns { direct, viaSubroutines, total } plus per-subroutine-column detail. Default recursion depth is 2 (parent → function → one nested level); pass subroutine_depth: 0 to disable. Aggregate mode splits the rollup into byColumn (direct cells) and bySubroutineColumn (function calls), so it's clear which Clay column triggers the most function-driven spend.
This makes "how much does this row cost" / "what's our average credit spend per row" / "which column is most expensive" / "did this AI call burn $ I didn't expect" answerable without the user opening the Clay UI.
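The row-level roll-up and the sample extrapolation can be sketched as follows. The cell shape follows the get_record description; the function names, the unresolvedSubroutines list, and the handling of an empty sample are assumptions made for the example.

```javascript
// Sketch: sum per-cell credits for one record and list the subroutine
// cells whose cost lives on a function row instead (credits is null there).
function rollUpRow(record) {
  let total = 0;
  let billedCellCount = 0;
  const unresolvedSubroutines = [];
  for (const [column, cell] of Object.entries(record)) {
    if (column === "_rowId" || column === "_credits") continue;
    if (cell === null || typeof cell !== "object") continue;
    if (cell.credits && typeof cell.credits.total === "number") {
      total += cell.credits.total;
      billedCellCount += 1;
    } else if (cell.fullContent?.origin) {
      unresolvedSubroutines.push({ column, ...cell.fullContent.origin });
    }
  }
  return { total, billedCellCount, unresolvedSubroutines };
}

// Sketch: turn sampled per-row totals into an estimate for the whole table.
function extrapolate(sampleTotals, rowCount) {
  if (sampleTotals.length === 0) {
    return { perRow: { avg: 0, min: 0, max: 0 }, extrapolatedTotalCredits: 0 };
  }
  const sum = sampleTotals.reduce((a, b) => a + b, 0);
  const avg = sum / sampleTotals.length;
  return {
    perRow: { avg, min: Math.min(...sampleTotals), max: Math.max(...sampleTotals) },
    extrapolatedTotalCredits: avg * rowCount,
  };
}
```

A parent-only total from rollUpRow undercounts whenever unresolvedSubroutines is non-empty; adding the totals of the function rows those pointers reference is what turns a direct figure into a direct plus viaSubroutines one.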
Clay tables expose multiple saved views — typical examples: "Default view," "All rows," "Errored rows," and any custom filtered view the table builder created. Each view returns a different row set (server-side filtered) and potentially a different ordering.
sync_table lists every available view in rootSchema.views: [{ id, name }] and picks one as the active default. The picker prefers an "All rows" view if it exists, otherwise the URL's view, otherwise the table's first view. That choice is the default for get_rows and get_errors after sync.
To query a different view without re-syncing, pass the optional view parameter to either tool. It accepts a viewId (gv_*) or a view name (case- and whitespace-tolerant — "Errored rows", "errored_rows", "ERRORED ROWS" all match the same view). The most common power move: get_errors with view="Errored rows" is much faster and more focused than scanning the whole table — it only counts statuses across already-failing rows. Same pattern works for any custom filter view ("Fully enriched rows", "Tier 1 accounts only", etc.).
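The view picker and the tolerant name matching can be sketched like this. Treating underscores the same as whitespace is inferred from the "errored_rows" example above; the function names are hypothetical and the real matching rules may differ.

```javascript
// Normalise a view name: lowercase, with whitespace and underscores removed.
function normalizeViewName(name) {
  return String(name).toLowerCase().replace(/[\s_]+/g, "");
}

// Resolve a `view` parameter that may be a viewId (gv_*) or a view name.
function resolveView(views, viewParam) {
  const byId = views.find((v) => v.id === viewParam);
  if (byId) return byId;
  const wanted = normalizeViewName(viewParam);
  return views.find((v) => normalizeViewName(v.name) === wanted) ?? null;
}

// Default picker: "All rows" if present, else the URL's view, else the first.
function pickDefaultView(views, urlViewId) {
  return (
    resolveView(views, "All rows") ??
    views.find((v) => v.id === urlViewId) ??
    views[0] ??
    null
  );
}

const exampleViews = [
  { id: "gv_1", name: "Default view" },
  { id: "gv_2", name: "Errored rows" },
];
console.log(resolveView(exampleViews, "ERRORED ROWS"));
console.log(pickDefaultView(exampleViews, "gv_2"));
```

Checking the id first means a view that happens to be named like another view's id is never picked by accident.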
The schema cache lives in-process and vanishes when the MCP server restarts.
- schemaCache: tableId → schema. Populated by sync_table / sync_workbook, and lazily by get_rows when called with a url instead of a tableId. There is no row cache; both get_rows paths hit Clay directly each call.
The builder skills that used to live here have moved to the companion clay-gtm-architect project, where they're now clay-formulas (write / fix / review Clay formulas) and clay-prompt-eng (write / fix / review Claygent and Use AI prompts), alongside the clay-architect skill. Install them from that repo's plugin marketplace. It's the single source of truth, so there's nothing to copy or symlink out of slab anymore.
Reference docs let Claude look things up after it's already chosen what to do. Skills shape the order of operations before any choice is made — a workflow with an embedded constraint set, not a lookup-after-the-fact cookbook. For "write a Clay formula" or "write a Claygent prompt," that ordering is the leverage. Those skills now live in clay-gtm-architect, but the rationale is unchanged.
slab is the sensor: it reads, traces, and costs live Clay tables through the v3 API. clay-gtm-architect is the memory + actuator: a curated Clay Knowledge Base plus the clay-architect, clay-prompt-eng, and clay-formulas skills that turn briefs into buildable specs and answer best-practice questions. The two close the loop — clay-gtm-architect's clay-kb-curate skill reads tables via this MCP and proposes Knowledge Base additions from what it finds. Both run together inside the Slack-fronted managed agent.
"Sync this workbook and tell me what it does"
sync_workbook → Claude reads every table's schema + cross-table connections → explains
"Summarize the domains / analyze patterns across the whole table"
export_csv(url, columns=["Company Name", "Domain"])
→ every row's display values arrive in one call (async export job, no row cap)
→ Claude reasons over the full dataset — no per-row fetching
"Which of these 200 account IDs are already in the table?"
find_rows(url, column="Account ID", values=[...])
→ { matches: [{ value, _rowId, ... }], not_found: [...] }
→ intersection happens in the server — table size can't blow the response budget
→ optional: get_record on interesting matches via their _rowId
"Why did this enrichment fail for Acme Corp?"
get_rows(url, query="Acme Corp")
→ Claude scans matches, picks the right hit by matchedColumns
get_record(tableId, rowId)
→ record arrives with all subroutine origin pointers
get_record per origin (in parallel)
→ Claude reconstructs the full execution graph
"How much did row X cost in credits?"
get_rows(url, query="X", limit=1)
→ match comes back with _rowId
get_record(tableId, rowId)
→ record._credits.total = 10.9 across 6 billed cells
→ per-cell breakdown shows which column was most expensive
→ AI cells additionally disclose the OpenAI dollar cost
"How much does this whole table cost to run, on average per row?"
get_credits(tableId) # samples 50 rows, extrapolates
→ perRow.avg, perRow.min, perRow.max
→ byColumn ranked by avgCreditsPerRow (which enrichments dominate)
→ extrapolatedTotalCredits = avg × schema.rowCount
"Help me rewrite the Claygent prompt in column X"
sync_table (schema shows current prompt text in full)
→ clay-prompt-eng skill (clay-gtm-architect) handles the rewrite
→ skill picks mode (web research vs content manipulation)
→ skill walks the 12- or 10-section workflow
→ returns a rewritten prompt with proper casing, null policy, examples
"Fix this formula"
sync_table (schema shows current formula text)
→ clay-formulas skill (clay-gtm-architect) handles the fix
→ skill checks sandbox traps, optional-chaining, lookup column wrapping
→ returns a corrected formula with the 10 syntax rules satisfied
"Which columns are broken across this table?"
get_errors → per-column status counts
sync_table → cross-reference with each column's run condition
→ Claude separates "broken" from "intentionally gated"
npm start # run the server directly (stdio)
node --check index.js # syntax check
# Manual probes against a real Clay session — not real tests, just scratch scripts:
node test.js
node test-export.js
node test-workbook.js
There's no test suite yet. Contributions welcome.
test*.js files are manual probes, not assertions.
MIT.