TES SDK — LLM observability and lifecycle tracking via Pentatonic Thing Event System. Track token usage, tool calls, and conversations. Manage things through event-sourced lifecycle stages with AI enrichment and vector search.
Memory and observability for AI agents.
Two products on one platform (TES): one account, one install line, one dashboard. JavaScript & Python.
| Product | What it does | When you want it |
|---|---|---|
| Memory | Persistent, searchable memory for your AI agent — 7-layer hybrid retrieval (BM25 + vector + KG + reranker), repo onboarding via references. Runs locally (Docker) or hosted (TES). | You want your agent to remember conversations, preferences, and codebase context across sessions. |
| Observability | Wrap your LLM client and capture every call — tokens, tool calls, latency, content. Events flow to TES for the dashboard, analytics, and search attribution. | You want to know what your agent is actually doing in production. |
Both products are sold separately, and you can use either or both. Plugins for Claude Code and OpenClaw install everything at once if you'd rather skip the SDK glue.
TES (Thing Event System) is Pentatonic's account-and-events backbone. Both products in this SDK run on it: memory writes/queries land in TES, observability events stream to it, and the dashboard reads from it.
You only need a TES account if you're using hosted memory or observability (observability always sends events to TES). Local memory runs entirely on your machine and needs no TES account.
```bash
# One-time: open browser, sign in or sign up, get API keys
npx @pentatonic-ai/ai-agent-sdk login
```
login opens your browser at the hosted sign-in page. New users click "Sign up" to create a tenant (clientId + region + email + password). After verification the CLI writes credentials to ~/.config/tes/credentials.json (mode 0600). The Claude Code plugin, OpenClaw plugin, hooks, and corpus CLI all auto-discover this file — no manual paste step.
```
✓ Connected as [email protected] on tenant `your-clientid`
✓ Credentials written to ~/.config/tes/credentials.json
```
To check connection state later: npx @pentatonic-ai/ai-agent-sdk whoami. To point at a local TES dev instance: npx @pentatonic-ai/ai-agent-sdk login --endpoint http://localhost:8788.
(init still works as a deprecated alias for login; it will be removed in the next major release.)
Persistent, searchable memory for AI agents. Backed by a 7-layer hybrid retrieval engine — BM25 keyword (L0), core files (L1), HybridRAG orchestrator (L2), Knowledge Graph entities (L3), vector index (L4), comms-namespace vectors (L5), and a document store with cross-encoder reranker (L6). Reciprocal Rank Fusion stitches them at query time.
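For intuition, here is a minimal sketch of Reciprocal Rank Fusion, not the engine's implementation (k = 60 is the conventional constant; the engine's actual fusion parameters aren't documented here):

```ts
// Reciprocal Rank Fusion: each layer contributes 1/(k + rank) per document;
// documents ranked well by several layers float to the top of the fused list.
function rrf(rankedLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((docId, rank) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}

// e.g. rrf([bm25Results, vectorResults, kgResults]) → fused ranking
```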
Same engine, same wire format (/store, /search, /forget, /store-batch, /health), two deployment modes:
Run the full engine stack on your own machine via Docker. No API keys, no cloud, fully offline. Embeddings come from your local Ollama; quality depends on the model you pull (768d nomic-embed-text is the default and works fine on a laptop).
Prerequisites
```bash
ollama pull nomic-embed-text
```

If you'll run Claude Code (or anything else) inside a Docker container that needs to reach the engine, make Ollama listen on all interfaces so containers can reach it via host.docker.internal:
```bash
sudo mkdir -p /etc/systemd/system/ollama.service.d
echo -e '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0:11434"' \
  | sudo tee /etc/systemd/system/ollama.service.d/override.conf
sudo systemctl daemon-reload
sudo systemctl restart ollama
```
Bring up the engine
```bash
git clone https://github.com/Pentatonic-Ltd/ai-agent-sdk.git
cd ai-agent-sdk/packages/memory-engine

# Default .env points at Ollama on the host. Edit if your Ollama is
# elsewhere or you want to use a higher-quality model (e.g. mxbai-embed-large
# at 1024d → set EMBED_DIM=1024 and EMBED_MODEL_NAME=mxbai-embed-large).
cat > .env <<'EOF'
PME_NV_EMBED_ENABLED=false
NV_EMBED_URL=http://host.docker.internal:11434/v1/embeddings
EMBED_MODEL_NAME=nomic-embed-text
EMBED_DIM=768
OLLAMA_DIM=768
PME_OLLAMA_URL=http://host.docker.internal:11434/api/embeddings
PME_EMBED_MODEL=nomic-embed-text
L5_OLLAMA_EMBED_URL=http://host.docker.internal:11434/api/embed
L5_OLLAMA_EMBED_MODEL=nomic-embed-text
PME_HYDE_ENABLED=false
PME_RERANK_ENABLED=true
PME_PORT=8099
CLIENT_ID=local
NEO4J_AUTH=neo4j/local-dev-pw
NEO4J_PASSWORD=local-dev-pw
EOF

docker compose up -d --scale nv-embed=0
```
First run pulls images and builds engine containers — expect 10–15 minutes. Subsequent restarts take seconds.
Verify
```bash
curl -s http://localhost:8099/health | jq
# Status should be "ok" or "degraded" with most layers reporting ok.

curl -sX POST http://localhost:8099/store \
  -H "content-type: application/json" \
  -d '{"content":"hello memory","metadata":{"arena":"local"}}' | jq

curl -sX POST http://localhost:8099/search \
  -H "content-type: application/json" \
  -d '{"query":"hello","limit":3,"min_score":0.001}' | jq
```
If /search returns the row from /store, the engine is live.
Connect Claude Code
The tes-memory plugin's hooks already speak the engine's wire format. Three steps:
In Claude Code:

```
/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai
```

Then from the shell:

```bash
npx @pentatonic-ai/ai-agent-sdk config local
```
This writes ~/.claude-pentatonic/tes-memory.local.md with mode: local and memory_url: http://localhost:8099. If you want a different URL, pass --engine-url <url>. To switch back to hosted later, run tes config hosted (delegates to login).

Finally, run /reload-plugins (or restart Claude Code if status reports stale state — MCP server processes need a full restart to pick up plugin updates).

Inspect what's currently configured at any time:
```bash
npx @pentatonic-ai/ai-agent-sdk config show
```
Verify:
```
/tes-memory:tes-status
```
Should report ✓ Connected to local memory engine. Now every prompt auto-searches engine memory and every turn auto-stores. The footer 🧠 Matched N memories from Pentatonic Memory shows hits.
Seed memory from your codebase or docs (optional)
Avoid the cold-start problem on day one by pre-populating the engine with references to your code/docs:
```bash
MEMORY_ENGINE_URL=http://localhost:8099 \
  npx @pentatonic-ai/ai-agent-sdk ingest ~/code/my-project
```
References mode is the default — it stores path + signature pointers, not full file contents. See Repository Onboarding for details.
Tuning
Change embedding model: pull a different one, edit EMBED_MODEL_NAME + EMBED_DIM in .env, then docker compose down -v && docker compose up -d --scale nv-embed=0 (the -v is required because Milvus collections are dim-locked at creation; switching dims means recreating).
| Model | Dim | Notes |
|---|---|---|
| `nomic-embed-text` (default) | 768 | Smallest; works on any laptop |
| `mxbai-embed-large` | 1024 | Better recall; ~600 MB download |
| `nv-embed-v2` (via gateway) | 4096 | Production-grade; needs a hosted endpoint or GPU |
Run on Pentatonic's infrastructure: NV-Embed-v2 (4096d) embeddings via the AI gateway, managed Postgres/Neo4j/Qdrant/Milvus, and the dashboard. The engine still ships in this repo — hosted just deploys it for you.
```bash
# 1. Get a TES account
npx @pentatonic-ai/ai-agent-sdk login

# 2. Install the SDK
npm install @pentatonic-ai/ai-agent-sdk
# or: pip install pentatonic-ai-agent-sdk
```
Memory operations route through TES → engine. No client-side change between local and hosted.
```js
import { engineAdapter } from '@pentatonic-ai/ai-agent-sdk/memory/corpus';

const adapter = engineAdapter({
  engineUrl: 'http://localhost:8099',
  arena: 'my-app',
});
await adapter.init();
await adapter.ingestChunk('User prefers dark mode', { kind: 'note' });
```
For raw /search and /store, just fetch() against ${engineUrl}/search etc. The wire format is documented in packages/memory-engine/docs/MIGRATION.md.
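For example, a raw /search call from Node, using the same request body as the curl example in the local-mode section:

```ts
const engineUrl = "http://localhost:8099";

// Same wire format as the curl verification step above.
const res = await fetch(`${engineUrl}/search`, {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({ query: "hello", limit: 3, min_score: 0.001 }),
});
const results = await res.json();
```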
Wrap your LLM client and every call automatically emits a CHAT_TURN event to TES — input/output tokens, tool calls, model, latency, content. Events flow into the TES dashboard, where you get session metrics, search attribution, dead-end detection, and full-text + semantic search across conversations.
Observability requires a TES account (hosted or self-hosted Pentatonic platform). Events have nowhere to go without one.
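As a mental model only (an illustrative shape inferred from the description above, not the wire schema), a CHAT_TURN event carries roughly:

```ts
// Illustrative only; field names are assumptions, not the documented schema.
interface ChatTurnEvent {
  sessionId: string;    // links events from one conversation
  model: string;        // e.g. "gpt-4o"
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
  toolCalls: unknown[]; // tool/function calls made during the turn
  content?: string;     // captured when captureContent is on, truncated at maxContentLength
}
```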
JavaScript
```js
import OpenAI from "openai";
import { TESClient } from "@pentatonic-ai/ai-agent-sdk";

const tes = new TESClient({
  clientId: process.env.TES_CLIENT_ID,
  apiKey: process.env.TES_API_KEY,
  endpoint: process.env.TES_ENDPOINT,
});

const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123" });
const result = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
```
Python
```python
import os

from openai import OpenAI
from pentatonic_agent_events import TESClient

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
)

ai = tes.wrap(OpenAI(), session_id="conv-123")
result = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
| Provider | Detection | Intercepted Method |
|---|---|---|
| OpenAI | `client.chat.completions.create` | `chat.completions.create()` |
| Anthropic | `client.messages.create` | `messages.create()` |
| Workers AI | `client.run` (JS only) | `run()` |
All other methods pass through unchanged.
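For instance, an Anthropic client wraps the same way (a sketch; substitute any Claude model id):

```ts
import Anthropic from "@anthropic-ai/sdk";

// `tes` is the TESClient from the quickstart above.
// The proxy intercepts messages.create() for Anthropic clients.
const claude = tes.wrap(new Anthropic(), { sessionId: "conv-456" });
const msg = await claude.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});
```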
If you use Claude Code or OpenClaw, the plugin gives you both products at once — every conversation turn is captured (observability) AND searched/stored as memory. No SDK glue to write.
Works with both local and hosted memory. Install once, switch modes via config.
```
/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai
```
Local engine — bring up the engine first (Memory > Local), then write the plugin config:
```bash
npx @pentatonic-ai/ai-agent-sdk config local
```
Hosted TES — run login once; the plugin auto-discovers ~/.config/tes/credentials.json:
```bash
npx @pentatonic-ai/ai-agent-sdk login
# equivalent: npx @pentatonic-ai/ai-agent-sdk config hosted
```
Either way, verify with /tes-memory:tes-status in Claude Code, or from the shell:
```bash
npx @pentatonic-ai/ai-agent-sdk config show
```
The plugin's MCP server, hooks, and tools all read the same config — switching modes is a single CLI call away.
What it tracks (automatically, every turn): each conversation turn is captured as an observability event and searched/stored as memory.
For OpenClaw, install the plugin:

```bash
openclaw plugins install @pentatonic-ai/openclaw-memory-plugin
```
Then tell OpenClaw:
```
Set up pentatonic memory
```
The agent will ask whether you want local (private, Docker-based) or hosted (Pentatonic TES cloud), then walk you through the rest. For hosted mode, it handles account creation, email verification, and API key generation conversationally.
Or use the CLI directly:
```bash
openclaw pentatonic-memory local
```
What it does: OpenClaw's context engine hooks fire on every lifecycle event:

- ingest stores user/assistant messages via the engine's /store endpoint (BM25 + vector + KG indexing in parallel)
- assemble calls /search to inject relevant memories as system-prompt context
- compact and after-turn are managed by the engine's own decay/consolidation

Plus agent-callable tools: memory_search, memory_store, memory_layers.
After setup, config lives in ~/.openclaw/pentatonic-memory.json. To switch modes, run setup again or edit directly.
You can also configure via openclaw.json:
```json
{
  "plugins": {
    "slots": { "contextEngine": "pentatonic-memory" },
    "entries": {
      "pentatonic-memory": {
        "enabled": true,
        "config": {
          "memory_url": "http://localhost:8099"
        }
      }
    }
  }
}
```
For hosted mode, replace the config block with:
```json
{
  "tes_endpoint": "https://your-company.api.pentatonic.com",
  "tes_client_id": "your-company",
  "tes_api_key": "tes_your-company_xxxxx"
}
```
The memory layer starts empty. To avoid the cold-start problem where retrieval has nothing useful to return for the first days of use, you can ingest your repos (or any folder of docs) on day one:
```bash
# Interactive — picks paths, shows a cost preview, ingests, offers
# to install a git post-commit hook so memory stays current
npx @pentatonic-ai/ai-agent-sdk onboard

# One-shot ingest of a single path
npx @pentatonic-ai/ai-agent-sdk ingest ~/code/my-app
npx @pentatonic-ai/ai-agent-sdk ingest ~/Documents/design-notes  # any folder works

# See what's tracked and how big the corpus is
npx @pentatonic-ai/ai-agent-sdk status

# Delta-resync everything that's tracked (or one path)
npx @pentatonic-ai/ai-agent-sdk resync

# Manage the tracked-paths list
npx @pentatonic-ai/ai-agent-sdk corpus list
npx @pentatonic-ai/ai-agent-sdk corpus remove ~/code/old-project
npx @pentatonic-ai/ai-agent-sdk corpus reset
```
Tenant credentials come from env vars (TES_ENDPOINT, TES_CLIENT_ID, TES_API_KEY) or ~/.config/tes/credentials.json if you used npx @pentatonic-ai/ai-agent-sdk login. To point at a TES instance running on localhost, set TES_ENDPOINT=http://localhost:8788.
By default, ingest stores pointers to source content (path + line range + a short signature/summary), not full chunk content. Per-language strategies:

- JavaScript/TypeScript: `function` / `class` / `const` / `export` declarations
- Python: `def` / `class` definitions

Why pointers? Code mutates between ingests. Embedded chunks of old source rot silently — the LLM keeps confidently citing functions you've since rewritten, with retrieval evidence to back it up. Pointers rot loudly: when a file moves or changes, Read fails or returns different content, and the agent observes and adjusts. Stale-but-confident is the worst class of memory bug; loud-and-self-correcting is qualitatively better for source code.
It also means proprietary source never leaves your machine — only the index (path + summary) is sent to the hosted TES, and the agent reads actual file contents at query time on its own.
If you need a self-contained index (e.g. for air-gapped retrieval where the source isn't available at query time), opt into legacy chunk-content storage by passing mode: "content" to ingestCorpus when using the SDK as a library.
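A sketch of that library call, assuming an options object; the exact ingestCorpus signature isn't documented here, so field names other than mode are placeholders to verify against the package types:

```ts
import { engineAdapter, ingestCorpus } from "@pentatonic-ai/ai-agent-sdk/memory/corpus";

const adapter = engineAdapter({ engineUrl: "http://localhost:8099", arena: "my-app" });
await adapter.init();

// mode: "content" opts into legacy chunk-content storage (a self-contained index).
// `adapter` and `root` are assumed option names; check the package types.
await ingestCorpus({
  adapter,
  root: "/home/me/code/my-app",
  mode: "content",
});
```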
Any folder works — git is not required. The walker honors .gitignore and .tesignore if present, plus a hard-exclude list for secrets and credentials that cannot be overridden even with !pattern rules:
- `.env*` (any environment file)
- `*.pem`, `*.key`, `*.crt`, `*.p12`, `*.pfx`, `*.jks`
- `id_rsa`, `id_ed25519`, `id_ecdsa`, `id_dsa` (SSH private keys)
- `.ssh/`, `.aws/`, `.gcp/`, `.azure/` (whole directories)
- `.npmrc`, `.pypirc`, `.netrc`
- `secrets/`, `credentials/`, `service-account.*`
- `*_secret*`, `*_token*`, `*_password*`

Plus directory-level skips (`.git`, `node_modules`, `dist`, `build`, `.next`, `venv`, `__pycache__`, `target`, `.terraform`, etc.) and extension skips for binaries, lockfiles, and minified output. Files larger than 512 KB are skipped by default (override with adapter options if you need to).
For git repos, accepting the prompt during onboard installs a post-commit hook at .git/hooks/post-commit that re-ingests files changed in each commit. The hook is non-fatal — it never blocks a commit. Install manually any time with:
```bash
npx @pentatonic-ai/ai-agent-sdk install-git-hook
```
For non-git folders, re-run ingest or resync whenever the source changes. Re-ingest is cheap: the SDK keeps a content-hash per file and skips anything that hasn't changed since the last run.
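The skip logic is conceptually simple. A sketch of the idea, not the SDK's code (`manifest` stands in for its persisted per-file hash store):

```ts
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Delta-resync sketch: hash file contents, skip anything unchanged.
function needsReingest(path: string, manifest: Map<string, string>): boolean {
  const hash = createHash("sha256").update(readFileSync(path)).digest("hex");
  if (manifest.get(path) === hash) return false; // unchanged since last run
  manifest.set(path, hash);
  return true;
}
```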
TESClient(config) — Observability

| Param | Type | Default | Description |
|---|---|---|---|
| `clientId` | `string` | required | Your tenant identifier |
| `apiKey` | `string` | required | TES API key |
| `endpoint` | `string` | required | TES instance URL |
| `userId` | `string` | `null` | User identifier for attribution |
| `captureContent` | `boolean` | `true` | Include message content in events |
| `maxContentLength` | `number` | `4096` | Truncate content beyond this length |
tes.wrap(client, opts?)

Returns an instrumented proxy. Every intercepted call emits a CHAT_TURN event.

| Option | Type | Default | Description |
|---|---|---|---|
| `sessionId` | `string` | auto-generated UUID | Links events from the same conversation |
| `metadata` | `object` | `{}` | Custom fields on every event |
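A quick usage sketch (the metadata values are arbitrary examples; tes and OpenAI come from the quickstart above):

```ts
const ai = tes.wrap(new OpenAI(), {
  sessionId: "conv-123",            // groups this conversation's events
  metadata: { app: "support-bot" }, // stamped onto every event
});
```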
tes.session(opts?)

Returns a Session for manual event emission.

session.emitChatTurn({ userMessage, assistantResponse, turnNumber? })

Emits a CHAT_TURN event with accumulated data, then resets.
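Put together, manual emission looks roughly like this (a sketch built from the two signatures above):

```ts
const session = tes.session({ sessionId: "conv-123" });

await session.emitChatTurn({
  userMessage: "Hello!",
  assistantResponse: "Hi, how can I help?",
  turnNumber: 1,
});
```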
normalizeResponse(raw)

Standalone utility to normalize any LLM response:
```js
import { normalizeResponse } from "@pentatonic-ai/ai-agent-sdk";

const { content, model, usage, toolCalls } = normalizeResponse(openaiResponse);
```
engineAdapter(config) — Memory

Thin HTTP client for the memory engine. config = { engineUrl, arena, apiKey? }. Returns { ingestChunk(content, metadata), deleteByCorpusFile(repoAbs, relPath), init() }. See Use as a library.
For raw /store / /search calls, just fetch() against ${engineUrl} directly — the wire format is documented in packages/memory-engine/docs/MIGRATION.md.
Health check (doctor)

Run a full health check of your SDK install at any time:
```bash
npx @pentatonic-ai/ai-agent-sdk doctor
```
doctor auto-detects which install path you're on (Local Memory, Hosted TES, or self-hosted Pentatonic platform) and runs only the checks that apply. Exit code is 0 for all-clear, 1 for warnings, 2 for critical.
Common flags:
```bash
npx @pentatonic-ai/ai-agent-sdk doctor --json   # machine-readable
npx @pentatonic-ai/ai-agent-sdk doctor --alert  # silent unless issues
npx @pentatonic-ai/ai-agent-sdk doctor --no-plugins
npx @pentatonic-ai/ai-agent-sdk doctor --path local
```
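A sketch of driving doctor from a script: only the documented exit codes are relied on, since the JSON report's shape isn't shown here.

```ts
import { execFile } from "node:child_process";

// Exit codes: 0 all-clear, 1 warnings, 2 critical.
execFile("npx", ["@pentatonic-ai/ai-agent-sdk", "doctor", "--json"], (err, stdout) => {
  const exitCode = typeof err?.code === "number" ? err.code : 0;
  if (exitCode === 2) console.error("critical issues:\n" + stdout);
  else if (exitCode === 1) console.warn("warnings:\n" + stdout);
  else console.log("all clear");
});
```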
What gets checked:

- Engine /health, per-layer health (L0–L6), embedding endpoint reachability
- tes-memory.local.md parses, memory_url reachable

Drop a .mjs file into ~/.config/pentatonic-ai/doctor-plugins/ to add your own checks. Useful for app-specific things — internal APIs, ingest freshness, custom infrastructure — without forking the SDK.
```js
// ~/.config/pentatonic-ai/doctor-plugins/my-app.mjs
export default {
  name: "my-app",
  checks: [
    {
      name: "internal API",
      severity: "warning",
      run: async () => {
        const res = await fetch("https://internal/health");
        return res.ok
          ? { ok: true, msg: "200 OK" }
          : { ok: false, msg: `HTTP ${res.status}` };
      },
    },
  ],
};
```
See packages/doctor/README.md for the full plugin contract and programmatic API.
```text
     Your code / Claude Code plugin / OpenClaw plugin
                          |
      +-------------------+--------------------+
      |                                        |
 Memory product                     Observability product
 (engine HTTP API)                   (TESClient.wrap)
      |                                        |
      | POST /store /search /forget            | CHAT_TURN events
      ▼                                        ▼
+----------------+                   +-----------------+
| memory engine  |                   |      TES        |
| (compat shim)  |                   |  (Cloudflare)   |
+----------------+                   |  Workers, R2,   |
      |                              |  Queues, Pages  |
+----------+----------+              +--------+--------+
|                     |                       |
Local               Hosted  ------------------+
(your machine)      (Pentatonic-managed)
|                     |
docker compose      AWS/GCP container cluster
+ host Ollama       + AI gateway (NV-Embed-v2)
```
Plugins (Claude Code, OpenClaw) are lightweight integrations on top of both products — they call into memory and emit observability events on the user's behalf.
License: MIT
Add this to claude_desktop_config.json and restart Claude Desktop.
```json
{
  "mcpServers": {
    "ai-agent-sdk": {
      "command": "npx",
      "args": [
        "-y",
        "@pentatonic-ai/ai-agent-sdk"
      ]
    }
  }
}
```