Overview
Version: v1.0 (Stable)
Status: Stable, 2026-06-06
License: MIT
Maintained by: Fasad Salatov (Unyly)
Quark is a streaming-first protocol for connecting AI agents (LLM clients) to tools (tool servers). It replaces Model Context Protocol (MCP) where MCP cracks under production load: composition, streaming, subscriptions, backpressure, capability security, and multi-agent — all built into the protocol, not bolted on.
▸ v1.0 — stable (2026-06-06)
- ✓ Stability guarantee — no breaking changes through v1.x
- ✓ Federation — server-to-server routing via mesh discovery
- ✓ MessagePack binary frames — opt-in via subprotocol
- ✓ Extended filter — matches/in/notIn, parens, !, arithmetic, nested.field
- ✓ Schema registry — $ref to https://schemas.quark.dev/
- ✓ Python SDK — [email protected]
- ✓ Conformance test suite — 78 tests (Go 35 + TS 21 + Python 22)
- ✓ IETF draft alignment (quark-protocol-00)
▸ v0.2 — production-grade (2026-06-05)
- ✓ Cryptographically signed capability tokens (QCT, HMAC-SHA256)
- ✓ Bearer authentication in handshake
- ✓ Session resume after disconnect (RSM frame)
- ✓ Heartbeat (HBT/HBA) — detect dead connections
- ✓ Tool input validation via JSON Schema (server-side)
- ✓ Cost tracking in RES/END (cost_used per call)
- ✓ Distributed tracing (W3C trace_id/span_id)
- ✓ Tool versioning (name@version syntax)
- ✓ Protocol version field bumped 1 → 2
▸ v0.1 (2026-04-15)
Initial draft: streaming, composition, subscriptions, backpressure, capabilities.
Anthropic shipped MCP in 2024 and it became the de-facto standard for AI tooling. Technically it is JSON-RPC over stdio/SSE, designed for a desktop-app world. Production workloads expose architectural cracks:
✗ No native streaming
Long-running tool calls (LLM streams, file uploads, log tails) are SSE workarounds.
✗ Stateless per call
Every tool invocation rebuilds context. Burns tokens, multiplies latency.
✗ No composition
"Get repos → filter → summarize → post to Slack" — N round-trips, each with full payload.
✗ No subscriptions
Reactive flows (notify on new PR) need external poll loops.
✗ No backpressure
A flood of requests can DoS a tool server with no graceful degradation.
✗ No multi-agent
Agent-to-agent comms (Claude → Gemini) require bespoke bridges.
✗ No capability model
Tools can do anything. Clients can't restrict.
These aren't bugs — they're consequences of choosing JSON-RPC as the substrate. Fixing them requires a new protocol. Quark is that protocol.
✓ Streaming-native
Bidirectional streams are first-class. Every call can stream.
✓ Composable
Pipe syntax (tool₁ | filter | tool₂) — server-side execution.
✓ Stateful sessions
Open a channel, keep context, save tokens and latency.
✓ Subscriptions
Subscribe to events. Server pushes. No polling.
✓ Backpressure
Built-in flow control. Servers throttle gracefully.
✓ Typed
JSON Schema everywhere. Composition is type-checked.
✓ Multi-agent
Agent IDs in the protocol. Direct agent-to-agent calls.
✓ Capability-based
Calls require capability tokens. Agents can't do what they're not allowed.
✓ MCP-compatible
Adapter layer wraps existing MCP servers. Zero migration cost.
Quark uses WebSocket as the primary transport. WebSocket gives bidirectional streaming and runs in every browser and server; when a mobile network handover drops the connection, session resume (RSM) restores the channel.
ws://server:port/quark
wss://server:port/quark (TLS, recommended)
For server-to-server federation, plain TCP with the same framing is allowed for lower overhead. QUIC support is planned for a future version (v1.1 on the roadmap).
Every message on the wire is a frame: a 4-byte header followed by a JSON payload.
┌─────────┬─────────┬─────────────────────────────┐
│ version │ kind │ payload (JSON) │
│ 1 byte │ 3 bytes │ variable length │
└─────────┴─────────┴─────────────────────────────┘
- version: protocol version, currently 0x01
- kind: 3-letter ASCII opcode (see below)
- payload: UTF-8 JSON, may be empty
WebSocket text frames are used by default. Binary frames carry the MessagePack encoding, which is opt-in via subprotocol as of v1.0.
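The frame layout above can be sketched as a small codec. This is a minimal illustration, not the reference implementation: the names `encodeFrame`, `decodeFrame`, and `Frame` are hypothetical, and accepting any three ASCII letters as a kind is an assumption.

```typescript
// Hypothetical helper names; only the layout (1-byte version, 3-byte kind, JSON payload) comes from the spec.
const PROTOCOL_VERSION = 0x01;

interface Frame {
  version: number;
  kind: string;
  payload: unknown; // undefined when the payload is empty
}

function encodeFrame(kind: string, payload?: unknown): Uint8Array {
  // Assumption: a kind is exactly three ASCII letters.
  if (!/^[A-Za-z]{3}$/.test(kind)) {
    throw new Error(`kind must be 3 ASCII letters, got "${kind}"`);
  }
  const body =
    payload === undefined
      ? new Uint8Array(0)
      : new TextEncoder().encode(JSON.stringify(payload));
  const out = new Uint8Array(4 + body.length);
  out[0] = PROTOCOL_VERSION;
  for (let i = 0; i < 3; i++) out[1 + i] = kind.charCodeAt(i);
  out.set(body, 4);
  return out;
}

function decodeFrame(bytes: Uint8Array): Frame {
  if (bytes.length < 4) throw new Error("frame shorter than the 4-byte header");
  const version = bytes[0];
  const kind = String.fromCharCode(bytes[1], bytes[2], bytes[3]);
  const text = new TextDecoder().decode(bytes.subarray(4));
  return { version, kind, payload: text.length === 0 ? undefined : JSON.parse(text) };
}
```

A heartbeat would be written as `encodeFrame("HBT", { ts: 1700000000 })`, and a frame with no payload is exactly four bytes long.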
Channels
A channel is a persistent, stateful connection between a client (AI agent) and a Quark server. It is opened on connect and closed on disconnect.
Within a channel, state is preserved:
- Capability grants (valid for channel lifetime or until revoked)
- Subscriptions (active until explicitly unsubscribed or channel closed)
- Open tool streams (alive until completed or cancelled)
A single channel can carry multiple simultaneous calls, streams, and subscriptions, distinguished by seq.
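Multiplexing by seq can be sketched as a small router on the receiving side. This is an illustrative sketch only: `SeqRouter` and `Inbound` are hypothetical names, and letting each handler decide when its call is finished is an assumption (a subscription, for instance, stays open after its first reply).

```typescript
// Hypothetical types: any inbound frame that carries a seq.
type Inbound = { kind: string; seq: number; [key: string]: unknown };

// A handler returns true when its call is finished and the seq can be released.
type Handler = (frame: Inbound) => boolean;

class SeqRouter {
  private handlers = new Map<number, Handler>();

  // Register interest in every frame carrying this seq.
  open(seq: number, handler: Handler): void {
    if (this.handlers.has(seq)) throw new Error(`seq ${seq} is already in use`);
    this.handlers.set(seq, handler);
  }

  // Route one frame. Returns false when nothing is registered for its seq.
  dispatch(frame: Inbound): boolean {
    const handler = this.handlers.get(frame.seq);
    if (!handler) return false;
    const done = handler(frame);
    if (done) this.handlers.delete(frame.seq);
    return true;
  }
}
```

With this shape, a one-shot call and a stream can interleave on the same channel without either seeing the other's frames.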
Quark v0.2 introduces cryptographically signed capability tokens — Quark Capability Tokens (QCT). JWT-like, but spec-defined for Quark.
▸ Token format
qct.v1.<base64url(payload)>.<base64url(signature)>
signature = HMAC-SHA256(secret, "v1." + base64url(payload))
▸ Payload
{
"iss": "https://issuer.example.com",
"sub": "[email protected]",
"iat": 1690000000,
"nbf": 1690000000,
"exp": 1700000000,
"scope": ["github:read:*", "slack:notify:#dev"],
"client_id": "claude-desktop",
"max_cost_usd": 5.00
}
- iss: issuer URL (required)
- sub: subject (user/principal) (required)
- iat: issued at, Unix seconds
- nbf: not before, optional
- exp: expires, Unix seconds (required)
- scope: array of capability strings (required)
- client_id: restricts which client may use the token (optional)
- max_cost_usd: ceiling on total cost (optional)
▸ Capability strings
github:read:* # read anything on GitHub
github:write:repo:owner/name # write to specific repo
slack:notify:#general # notify a specific Slack channel
slack:notify:* # notify any Slack channel
*:read # read anything anywhere
A grant written as "a:b:c:*" covers the exact capability "a:b:c" and all of its descendants; a grant written as "a:b:c" covers only the exact match.
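One way to read the matching rule is segment by segment. The sketch below is an interpretation, not normative behavior: `capabilityMatches` and `isAllowed` are hypothetical names, and it assumes that a trailing `*` covers the prefix itself plus all descendants, while a `*` in any other position matches exactly one segment.

```typescript
// Assumed semantics (not stated in the spec in this detail):
//  - a trailing "*" covers the prefix itself and every descendant
//  - a "*" elsewhere matches exactly one segment
//  - all other segments must be equal
function capabilityMatches(grant: string, required: string): boolean {
  const g = grant.split(":");
  const r = required.split(":");
  for (let i = 0; i < g.length; i++) {
    const isLast = i === g.length - 1;
    if (g[i] === "*" && isLast && i > 0) return true; // trailing wildcard
    if (i >= r.length) return false; // grant is more specific than the request
    if (g[i] !== "*" && g[i] !== r[i]) return false;
  }
  return g.length === r.length;
}

// A call is allowed when any granted capability covers the required one.
function isAllowed(grants: string[], required: string): boolean {
  return grants.some((grant) => capabilityMatches(grant, required));
}
```

Under these assumptions, `github:read:*` covers both `github:read` and `github:read:repo:owner/name`, while `slack:notify:#general` does not cover `slack:notify:#dev`.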
▸ Usage in handshake
{
"kind": "HEY",
"v": 2,
"auth": { "type": "bearer", "token": "qct.v1.eyJ..." },
"agent": { "id": "claude-desktop", "kind": "llm", "name": "Claude" }
}
▸ Server verification
- Parse QCT (split by ".")
- Verify HMAC signature
- Check iat <= now, nbf <= now, exp > now
- If client_id set, verify matches agent.id
- Store granted scope as channel capabilities
On failure — ERR { code: "AUTH_INVALID" } + close.
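The token format and the verification steps above can be sketched with Node's built-in `crypto` module. This is a minimal sketch under stated assumptions: `mintQct`, `verifyQct`, and `QctPayload` are hypothetical names, the secret is treated as a UTF-8 string, and every failure is collapsed into a thrown `AUTH_INVALID` error.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Field names follow the payload described above; the interface name is hypothetical.
interface QctPayload {
  iss: string;
  sub: string;
  iat?: number;
  nbf?: number;
  exp: number;
  scope: string[];
  client_id?: string;
  max_cost_usd?: number;
}

// signature = HMAC-SHA256(secret, "v1." + base64url(payload))
function signQct(secret: string, payloadB64: string): Buffer {
  return createHmac("sha256", secret).update("v1." + payloadB64).digest();
}

function mintQct(secret: string, payload: QctPayload): string {
  const payloadB64 = Buffer.from(JSON.stringify(payload), "utf8").toString("base64url");
  const signatureB64 = signQct(secret, payloadB64).toString("base64url");
  return `qct.v1.${payloadB64}.${signatureB64}`;
}

// Returns the payload when every check passes; throws AUTH_INVALID otherwise.
function verifyQct(secret: string, token: string, agentId: string, nowSeconds: number): QctPayload {
  const fail = (): never => {
    throw new Error("AUTH_INVALID");
  };
  // 1. Parse: qct.v1.<payload>.<signature>
  const parts = token.split(".");
  if (parts.length !== 4 || parts[0] !== "qct" || parts[1] !== "v1") fail();
  const payloadB64 = parts[2];
  const signatureB64 = parts[3];
  // 2. Verify the HMAC in constant time.
  const expected = signQct(secret, payloadB64);
  const actual = Buffer.from(signatureB64, "base64url");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) fail();
  // 3. Check the time window: iat <= now, nbf <= now, exp > now.
  const payload = JSON.parse(Buffer.from(payloadB64, "base64url").toString("utf8")) as QctPayload;
  if (payload.iat !== undefined && payload.iat > nowSeconds) fail();
  if (payload.nbf !== undefined && payload.nbf > nowSeconds) fail();
  if (!(payload.exp > nowSeconds)) fail();
  // 4. If client_id is set, it must match agent.id from the HEY frame.
  if (payload.client_id !== undefined && payload.client_id !== agentId) fail();
  // 5. The caller stores payload.scope as the channel capabilities.
  return payload;
}
```

The signature is checked before the payload is parsed, so untrusted JSON is never decoded unless it was minted with the shared secret.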
▸ why it matters
Without signed tokens, an AI agent could claim any capabilities and the server would trust them. QCT means only legitimate issuers can mint tokens. This is the foundation for enterprise/compliance use cases (audit trails, SOC2, GDPR).
Every channel starts with a HEY.
▸ client → server
{
  "kind": "HEY",
  "v": 1,
  "agent": {
    "id": "claude-desktop-3.7-mac",
    "kind": "llm",
    "name": "Claude Desktop"
  },
  "supports": ["streaming", "subscribe", "compose", "capabilities"]
}
▸ server → client
{
  "kind": "HEY",
  "v": 1,
  "server": {
    "id": "github-tools-v2",
    "name": "GitHub Tools",
    "version": "2.1.0"
  },
  "supports": ["streaming", "subscribe", "compose", "capabilities"],
  "tools": 12,
  "topics": 4
}
If versions don't match, the server replies with ERR and closes the channel.
Both sides exchange heartbeats to detect dead connections (firewall timeouts, mobile NAT drops).
// Client (every 30s)
{ "kind": "HBT", "ts": 1700000000 }
// Server
{ "kind": "HBA", "ts": 1700000000 }
▸ client
No HBA in 60s → reconnect
▸ server
No HBT in 90s → close (state held for TTL)
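The timing rules above reduce to two small decision functions. This is a sketch for illustration: the function and constant names are hypothetical, timestamps are assumed to be in milliseconds, and real clients would drive `clientTick` from a timer.

```typescript
// Timings from the rules above; names are hypothetical.
const HBT_INTERVAL_MS = 30000; // client sends HBT every 30s
const CLIENT_HBA_TIMEOUT_MS = 60000; // client reconnects after 60s without HBA
const SERVER_HBT_TIMEOUT_MS = 90000; // server closes after 90s without HBT

type ClientAction = "send_hbt" | "reconnect" | "idle";

// Decide what the client should do at time nowMs.
function clientTick(nowMs: number, lastHbtSentMs: number, lastHbaSeenMs: number): ClientAction {
  if (nowMs - lastHbaSeenMs >= CLIENT_HBA_TIMEOUT_MS) return "reconnect";
  if (nowMs - lastHbtSentMs >= HBT_INTERVAL_MS) return "send_hbt";
  return "idle";
}

// Decide whether the server should close the channel at time nowMs.
function serverShouldClose(nowMs: number, lastHbtSeenMs: number): boolean {
  return nowMs - lastHbtSeenMs >= SERVER_HBT_TIMEOUT_MS;
}
```

Keeping the decisions free of timers and sockets makes the timeouts easy to test in isolation.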
After disconnect (drop, mobile sleep), client reconnects and sends RSM:
{
"kind": "RSM",
"v": 2,
"session_id": "ses_a7b3c9d1",
"last_seq_received": 42
}
Server:
- If session valid — replays buffered frames with seq > 42, then resumes
- If expired — ERR { code: "SESSION_EXPIRED" }, client falls back to fresh HEY
Subscriptions and capability grants survive resume. Open tool streams are cancelled (clients should re-INV if needed).
Servers MUST buffer the last 64 outgoing frames per session.
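The buffering requirement can be sketched as a bounded per-session queue. This is an illustration under assumptions: `ResumeBuffer` and `SeqFrame` are hypothetical names, and the sketch assumes every outgoing frame carries a seq that increases over the session, so that "seq > last_seq_received" selects exactly the missed frames.

```typescript
// Hypothetical shape of an outgoing frame; assumes seq increases per session.
interface SeqFrame {
  seq: number;
  kind: string;
  [key: string]: unknown;
}

class ResumeBuffer {
  private frames: SeqFrame[] = [];
  private capacity: number;

  constructor(capacity: number = 64) {
    this.capacity = capacity;
  }

  // Remember an outgoing frame, dropping the oldest once the buffer is full.
  record(frame: SeqFrame): void {
    this.frames.push(frame);
    if (this.frames.length > this.capacity) this.frames.shift();
  }

  // Frames to replay for an RSM carrying last_seq_received.
  replayAfter(lastSeqReceived: number): SeqFrame[] {
    return this.frames.filter((frame) => frame.seq > lastSeqReceived);
  }
}
```

If the client missed more frames than the buffer holds, the oldest ones are gone, which is why a server may prefer to answer SESSION_EXPIRED in that case.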
▸ mobile-friendly
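iPhone in pocket → 4G handover → WebSocket drops → user opens app → client auto-RSMs → the last 30s of missed push events arrive at once. No data is lost as long as the session has not expired and the missed frames fit in the server's 64-frame buffer.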
Tools are registered server-side at startup. Clients discover via LST:
{
"kind": "LST",
"seq": 1,
"tools": [
{
"name": "github.list_repos",
"description": "List repos for a user/org",
"input": { "type": "object", "properties": { "owner": { "type": "string" } } },
"output": { "type": "array", "items": { "$ref": "#/types/Repo" } },
"effects": ["network", "read"],
"cost": { "estimate": 0.0001, "currency": "USD" },
"streaming": true,
"requires_capability": "github:read"
}
]
}
Tool schemas use JSON Schema Draft 2020-12 with three extensions:
- effects: array of pure | read | write | network | money | irreversible | cost
- cost: estimated cost of a call (helps the AI budget)
- requires_capability: the capability required to invoke the tool
▸ One-shot
{
"kind": "INV",
"seq": 2,
"tool": "github.list_repos",
"input": { "owner": "anthropic" }
}
{
"kind": "RES",
"seq": 2,
"output": [{ "name": "claude-code", "stars": 12000, "owner": "anthropic" }]
}
▸ Streaming
When a tool spec advertises streaming: true, results arrive as STR frames, terminated by END.
// → INV { "seq": 3, "tool": "logs.tail", "input": { "file": "app.log" } }
// ← STR { "seq": 3, "data": { "line": "GET /api 200" } }
// ← STR { "seq": 3, "data": { "line": "GET /api 200" } }
// ← STR { "seq": 3, "data": { "line": "GET /api 500" } }
// ← END { "seq": 3 }
This is Quark's killer feature vs MCP.
A single INV can describe a pipeline. The server executes the whole pipeline and sends back only the final result, with no round-trips between steps.
{
"kind": "INV",
"seq": 4,
"pipeline": [
{ "tool": "github.list_repos", "input": { "owner": "anthropic" } },
{ "filter": "stars > 100" },
{ "map": ["name"] },
{ "tool": "slack.notify", "input_bind": { "items": "$prev", "channel": "#dev" } }
]
}
Stages:
- tool: invoke a tool, output flows to the next stage
- filter: CEL/SQL-like expression filtering items
- map: project fields
- reduce: aggregate
- input_bind: bind previous stage output into the next tool's input
▸ effect
Collapses N client-to-server round-trips into one, with roughly 10× lower latency claimed for real workloads.
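Server-side pipeline execution can be sketched as a loop that threads each stage's output into the next. This is a simplified illustration, not the reference executor: `runPipeline`, `compileFilter`, and the `Tool` type are hypothetical, tools are synchronous for brevity, `reduce` is omitted, and the filter handles only `field <op> number` comparisons rather than the full expression grammar.

```typescript
type Item = Record<string, unknown>;
type Tool = (input: Record<string, unknown>) => unknown; // synchronous for brevity

type Stage =
  | { tool: string; input?: Record<string, unknown>; input_bind?: Record<string, unknown> }
  | { filter: string }
  | { map: string[] };

// Simplification: supports only "<field> <op> <number>", e.g. "stars > 100".
function compileFilter(expr: string): (item: Item) => boolean {
  const m = /^\s*(\w+)\s*(>=|<=|==|!=|>|<)\s*(-?\d+(?:\.\d+)?)\s*$/.exec(expr);
  if (!m) throw new Error(`unsupported filter: ${expr}`);
  const field = m[1];
  const op = m[2];
  const rhs = Number(m[3]);
  return (item) => {
    const lhs = Number(item[field]);
    switch (op) {
      case ">": return lhs > rhs;
      case "<": return lhs < rhs;
      case ">=": return lhs >= rhs;
      case "<=": return lhs <= rhs;
      case "==": return lhs === rhs;
      default: return lhs !== rhs; // "!="
    }
  };
}

function runPipeline(tools: Record<string, Tool>, stages: Stage[]): unknown {
  let prev: unknown = undefined;
  for (const stage of stages) {
    if ("tool" in stage) {
      const tool = tools[stage.tool];
      if (!tool) throw new Error(`unknown tool: ${stage.tool}`);
      const input: Record<string, unknown> = { ...(stage.input ?? {}) };
      // input_bind: the literal "$prev" is replaced by the previous stage's output.
      for (const [key, value] of Object.entries(stage.input_bind ?? {})) {
        input[key] = value === "$prev" ? prev : value;
      }
      prev = tool(input);
    } else if ("filter" in stage) {
      prev = (prev as Item[]).filter(compileFilter(stage.filter));
    } else {
      const fields = stage.map;
      prev = (prev as Item[]).map((item) => {
        const projected: Item = {};
        for (const f of fields) projected[f] = item[f];
        return projected;
      });
    }
  }
  return prev;
}
```

Only the value left in `prev` after the last stage would be sent back in the RES frame; intermediate results never leave the server.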
{
"kind": "SUB",
"seq": 5,
"topic": "github.pr_opened",
"filter": { "repo": "anthropic/claude-code" }
}
The server replies with RES (subscription id), then streams EVT:
{ "kind": "EVT", "seq": 5, "data": { "pr": 123, "title": "Fix typo" } }
Events continue until the client sends UNS with the same seq, or the channel closes.
When the server is overloaded, it sends WIN with a smaller window. The client MUST NOT have more than window requests outstanding.
{ "kind": "WIN", "window": 3 }
Default window: 64. The server can shrink it at any time, and the client must respect the new value.
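On the client side, the window rule can be sketched as a small gate in front of the send path. This is an illustrative sketch: `SendWindow` is a hypothetical name, and queueing excess requests locally (rather than rejecting them) is a design assumption.

```typescript
class SendWindow {
  private window = 64; // default window
  private outstanding = 0;
  private queue: Array<() => void> = [];

  // Called when a WIN frame arrives from the server.
  onWin(window: number): void {
    this.window = window;
    this.drain();
  }

  // Queue a request; it is sent as soon as a slot is free.
  submit(send: () => void): void {
    this.queue.push(send);
    this.drain();
  }

  // Called when an outstanding request finishes.
  onComplete(): void {
    this.outstanding = Math.max(0, this.outstanding - 1);
    this.drain();
  }

  get inFlight(): number {
    return this.outstanding;
  }

  get queued(): number {
    return this.queue.length;
  }

  // Send queued requests while slots remain in the current window.
  private drain(): void {
    while (this.outstanding < this.window && this.queue.length > 0) {
      const send = this.queue.shift()!;
      this.outstanding++;
      send();
    }
  }
}
```

Shrinking the window does not cancel requests already in flight; it only stops new ones from being sent until enough of them complete.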
Capabilities
Quark v0.1 shipped a minimal capability model. Capabilities are strings declared by tools and granted by users.
Tool declares: requires_capability: "github:write:repo:foo/bar"
Client (after user consent) in HEY:
"capabilities": [
"github:read:*",
"github:write:repo:foo/bar",
"slack:notify:#dev"
]
The server validates the required capability against the granted set on each INV. If it is missing, the server returns ERR with code MISSING_CAPABILITY. v0.2 added cryptographically signed grants (QCT, described above) for audit/compliance.
Errors
{
"kind": "ERR",
"seq": 4,
"code": "MISSING_CAPABILITY",
"message": "Tool github.write_issue requires github:write",
"stage": 1
}
Standard error codes used elsewhere in this document include AUTH_INVALID, SESSION_EXPIRED, and MISSING_CAPABILITY.
Tracing
Every frame may include W3C Trace Context metadata:
{
"kind": "INV",
"seq": 5,
"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
"span_id": "00f067aa0ba902b7",
"parent_span_id": "00f067aa0ba902b1",
"tool": "github.list_repos"
}
- trace_id: 32 hex chars, identifies the distributed trace
- span_id: 16 hex chars, identifies a span within the trace
- parent_span_id: links the span to its parent
Server propagates trace_id to child operations (pipeline stages, federation). OpenTelemetry collectors can ingest Quark traces via a sidecar that reads frames and emits spans.
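Generating and propagating these identifiers can be sketched with Node's `crypto` module. This is a minimal sketch: `TraceContext`, `newTrace`, and `childSpan` are hypothetical names, and random IDs are one common choice rather than something the spec mandates.

```typescript
import { randomBytes } from "node:crypto";

// Field names follow the frame example above; the interface name is hypothetical.
interface TraceContext {
  trace_id: string; // 32 hex chars
  span_id: string; // 16 hex chars
  parent_span_id?: string;
}

// Start a new trace with a root span.
function newTrace(): TraceContext {
  return {
    trace_id: randomBytes(16).toString("hex"),
    span_id: randomBytes(8).toString("hex"),
  };
}

// Derive a child span (for a pipeline stage, say) that stays in the same trace.
function childSpan(parent: TraceContext): TraceContext {
  return {
    trace_id: parent.trace_id,
    span_id: randomBytes(8).toString("hex"),
    parent_span_id: parent.span_id,
  };
}
```

Spreading the context into an outgoing frame, as in `{ kind: "INV", seq: 5, ...childSpan(root), tool: "github.list_repos" }`, yields the shape shown in the example above.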
▸ use cases
- Debugging: see exactly what calls an AI agent made in a session
- Latency analysis: spot bottlenecks in a pipeline
- Compliance: full audit trail of who-what-when
- Cost attribution: which feature spent what
Quark ships with an MCP-Quark adapter. Any existing MCP server can be wrapped:
[AI agent] ──Quark──> [Quark adapter] ──MCP──> [legacy MCP server]
The adapter:
- Converts Quark INV → MCP tools/call
- Converts MCP responses → Quark RES
- Loses streaming, composition, and subscriptions (MCP does not support them)
- Logs a warning when an advanced feature is requested
This means zero migration cost: clients start using Quark, existing MCP servers keep working through the adapter, and authors migrate to native Quark when they want the new features.
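The two conversions the adapter performs can be sketched as pure functions. This is an illustration, not the shipped adapter: the function and interface names are hypothetical, reusing seq as the JSON-RPC id is an assumption, and the `TOOL_ERROR` code is a placeholder because the standard code list is not reproduced here. The `tools/call` request and `content` result shapes follow MCP's JSON-RPC format.

```typescript
interface QuarkInv {
  kind: "INV";
  seq: number;
  tool?: string;
  input?: Record<string, unknown>;
  pipeline?: unknown[];
}

interface McpToolCall {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

interface McpResponse {
  jsonrpc: "2.0";
  id: number;
  result?: { content: Array<{ type: string; text?: string }>; isError?: boolean };
  error?: { code: number; message: string };
}

interface QuarkReply {
  kind: "RES" | "ERR";
  seq: number;
  output?: unknown;
  code?: string;
  message?: string;
}

// Quark INV → MCP tools/call. Pipelines cannot be expressed in MCP, so they are rejected.
function invToMcp(inv: QuarkInv): McpToolCall {
  if (inv.pipeline !== undefined || inv.tool === undefined) {
    throw new Error("adapter supports single-tool INV only");
  }
  return {
    jsonrpc: "2.0",
    id: inv.seq, // assumption: seq doubles as the JSON-RPC id
    method: "tools/call",
    params: { name: inv.tool, arguments: inv.input ?? {} },
  };
}

// MCP response → Quark RES (or ERR). Text content parts are joined into one string.
function mcpToQuark(res: McpResponse): QuarkReply {
  if (res.error) {
    return { kind: "ERR", seq: res.id, code: "TOOL_ERROR", message: res.error.message }; // placeholder code
  }
  const parts = res.result?.content ?? [];
  const text = parts
    .filter((part) => part.type === "text")
    .map((part) => part.text ?? "")
    .join("");
  if (res.result?.isError) {
    return { kind: "ERR", seq: res.id, code: "TOOL_ERROR", message: text };
  }
  return { kind: "RES", seq: res.id, output: text };
}
```

Because MCP returns a complete result per call, a streaming INV routed through such an adapter would produce a single RES rather than a sequence of STR frames.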
SDKs
Provided in this repository:
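▸ Go server
// Needs "context", "fmt", "net/http", "strings", and the quark package from this repository.
srv := quark.NewServer()
srv.RegisterTool(quark.Tool{
    Name:        "echo.upper",
    Description: "Echo text in uppercase",
    Handler: func(ctx context.Context, in map[string]any) (any, error) {
        // Checked assertion: a missing or non-string "text" returns an error instead of panicking.
        text, ok := in["text"].(string)
        if !ok {
            return nil, fmt.Errorf("echo.upper: input.text must be a string")
        }
        return strings.ToUpper(text), nil
    },
})
http.Handle("/quark/ws", srv)
▸ TypeScript client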
import { Quark } from '@fasad_salatov/quark-client'

// Open a channel and identify this agent
const ch = await Quark.connect('wss://server/quark/ws', {
  agent: { id: 'my-bot', kind: 'llm', name: 'My Bot' },
})

// One-shot call
const repos = await ch.invoke('github.list_repos', { owner: 'anthropic' })

// Streaming call: iterate over results as they arrive
for await (const log of ch.stream('logs.tail', { file: 'app.log' })) {
  console.log(log.line)
}

// Server-side pipeline: only the final result comes back
const filtered = await ch.pipeline([
  { tool: 'github.list_repos', input: { owner: 'anthropic' } },
  { filter: 'stars > 100' },
  { map: ['name'] },
])
Roadmap
- v0.1 (Apr 2026): initial draft, reference impls, MCP adapter
- v0.2 (Jun 5, 2026): QCT auth, session resume, heartbeat, validation, cost tracking, tracing
- v1.0 (Jun 6, 2026): stable. Federation, MessagePack, extended filter, schema registry, Python SDK. ▸ now
- v1.1 (Q3 2026): QUIC transport, mesh routing improvements
- v1.2 (Q4 2026): WebRTC P2P for browser-to-browser AI agents
- v1.3 (Q1 2027): WASM pipeline stages (sandboxed user code)
- v2.0 (Q3 2027): Asymmetric QCT signing (RSA/ECDSA), full CEL adoption, capability delegation chains
▸ discussion
Spec is open, MIT. Contribute via GitHub issues/PRs. Questions and integration requests — email or Telegram.