nexus-agents makes your AI coding tools work together intelligently. It coordinates Claude, Codex, Gemini, and OpenCode — routing each task to the best model using data-driven algorithms, validating outputs through multi-model consensus voting, and continuously improving through outcome-driven learning. Connect it to any MCP-compatible editor (Claude Code, Cursor, VS Code) and it handles the rest.
Governance substrate for your AI coding agents — adversarial review, drift-detected rules, immutable audit, closed-loop telemetry
Nexus-agents is a governance layer that sits above your AI coding agents — Claude Code, Codex, Gemini, and OpenCode. The agents do the engineering; nexus-agents enforces the rules they have to follow, reviews their work adversarially before it ships, audits everything they touch, and routes the next task based on what actually worked.
What it gives you:
- pr_review runs 5 voter roles (architect, security, devex, catfish, scope_steward) with a 4-point verification gate. On the v5 evaluation set (10 PRs): 100% bug-catch and 50% raw false-positive rate; manual triage reclassified most "FPs" as legitimate findings the dataset had mislabeled. Full numbers: docs/research/pr-review-experiment-results-v5.md
- CLAUDE.md + governance:check + blocking CI gates fail the build when documented rules drift from registered behavior (model registry, MCP tools, expert types, skills)
- AuditTrail with structured logging and hash-chained append-only storage; integrity is verifiable via the verify_audit_chain MCP tool
- OutcomeStore feeds production telemetry back into LinUCB + TOPSIS scoring so the system actually learns from what shipped vs what regressed. A second, bounded loop runs by default: a signal.swarm_unhealthy (adapter circuit-breaker / swarm-health) applies a small, capped, auto-decaying routing demotion via TuneAdjustmentStore. It is demotion-only, never zeroes a CLI, audits every adjustment, and can be disabled with NEXUS_TUNE_ENFORCE=false
- consensus_vote runs a default 7-role panel (architect, security, devex, ai_ml, pm, catfish, scope_steward; --quick uses 3). Six strategies: simple/super-majority, unanimous, higher-order Bayesian, opinion-wise, proof-of-learning

You: "Review this PR / orchestrate this task / vote on this proposal"
↓
nexus-agents: enforce rules → route → adversarial review → audit → learn from outcome
↓
Engineering agents: Claude Code · Codex · Gemini · OpenCode
↓
Code: actual edits, tests, PRs, issues
What this is NOT:
Human / IDE / CLI
(Claude Code, Cursor, VS Code, terminal)
│ MCP Protocol
▼
┌─────────────────────────────────────────────────────┐
│ GOVERNANCE SUBSTRATE — what nexus-agents provides │
│ │
│ Charter (drift-checked) Adversarial PR review │
│ Role registry Multi-voter consensus │
│ Immutable audit trail Closed-loop telemetry │
│ │
│ 46 MCP tools · multi-stage CompositeRouter │
└────────────────────────┬────────────────────────────┘
│
▼ delegates execution to
┌─────────────────────────────────────────────────────┐
│ ENGINEERING AGENTS — what does the actual work │
│ │
│ Claude Code · Codex · Gemini · OpenCode │
└────────────────────────┬────────────────────────────┘
│
▼ produces
Code, tests, PRs, issues
The governance substrate is the layer that catches the mistakes engineering agents would otherwise make — bad code shipped, rules drifting from intent, audit gaps, telemetry-free routing — and routes the next task based on what actually worked the last time.
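As an illustration of the bounded self-tuning loop described earlier (a small, capped, auto-decaying, demotion-only adjustment that never zeroes a CLI), here is a minimal sketch. It is not the TuneAdjustmentStore implementation: the step size, cap, half-life, and every function name below are assumptions made for the example.

```typescript
// All constants and names are hypothetical, chosen for illustration only.
const STEP = 0.05;                    // penalty added per unhealthy signal
const MAX_PENALTY = 0.3;              // cap: a CLI never loses more than 30% of its score
const HALF_LIFE_MS = 60 * 60 * 1000;  // the penalty halves every hour

interface Demotion {
  penalty: number;   // penalty recorded at appliedAt, in [0, MAX_PENALTY]
  appliedAt: number; // epoch milliseconds
}

// Penalty still in force at time `now`, after exponential decay.
function currentPenalty(d: Demotion | undefined, now: number): number {
  if (!d) return 0;
  const elapsed = Math.max(0, now - d.appliedAt);
  return d.penalty * Math.pow(0.5, elapsed / HALF_LIFE_MS);
}

// Apply one more demotion on top of whatever has not decayed yet, up to the cap.
function demote(d: Demotion | undefined, now: number): Demotion {
  const carried = currentPenalty(d, now);
  return { penalty: Math.min(MAX_PENALTY, carried + STEP), appliedAt: now };
}

// Demotion-only: the score is scaled down by at most MAX_PENALTY, so a
// positive base score stays positive.
function adjustedScore(base: number, d: Demotion | undefined, now: number): number {
  return base * (1 - currentPenalty(d, now));
}
```

Because the penalty is multiplicative and capped below 1, repeated unhealthy signals can lower a CLI's routing score but cannot remove it from consideration, and the penalty fades on its own once the signals stop.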
npm install -g nexus-agents
Or as a Claude Code plugin (single-command install from the official marketplace):
/plugin install nexus-agents
See docs/getting-started/PLUGIN_INSTALL.md for plugin-specific setup, or llms-install.md for the short install guide an AI agent can follow.
nexus-agents doctor
Prints a health table — Node version, configured CLIs (claude / codex / gemini / opencode), API keys missing vs present. Read-only; safe to run any time.
nexus-agents vote --quick --proposal "Use SQLite over JSON files for the outcome store"
You should see:
Nexus Agents Consensus Vote
============================
Collecting votes from 3 agents (timeout: 60s each)...
Proposal: Use SQLite over JSON files for the outcome store
Votes
✓ Software Architect: APPROVE (86%)
✓ Security Engineer: APPROVE (74%)
✓ Scope Steward: APPROVE (91%)
Summary
Approve: 3
Reject: 0
Abstain: 0
Approval: 100.0%
Threshold: simple_majority
Result: APPROVED
Completed in ~30s
Three voter roles deliberate via whichever local CLIs you have (Claude, Codex, Gemini) — no API keys required. Per-voter reasoning is recorded; the terminal prints the verdict. Mixed outcomes (some approve / some reject) and graceful error handling are demonstrated on the project site hero with a real 7-voter run.
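To make the Summary block above concrete, the following sketch tallies votes under three of the listed strategies (simple_majority, supermajority, unanimous). It is an illustration only: the 2/3 supermajority cut-off, the choice to exclude abstentions from the approval percentage, and all type and function names are assumptions, not the tool's documented behavior.

```typescript
type Vote = "approve" | "reject" | "abstain";
type Strategy = "simple_majority" | "supermajority" | "unanimous";

interface Tally {
  approve: number;
  reject: number;
  abstain: number;
  approvalPct: number; // approvals as a percentage of non-abstaining votes
  approved: boolean;
}

// Hypothetical decision rules; the 2/3 cut-off is an assumption.
const rules: Record<Strategy, (approve: number, reject: number) => boolean> = {
  simple_majority: (a, r) => a > r,
  supermajority: (a, r) => a + r > 0 && a / (a + r) >= 2 / 3,
  unanimous: (a, r) => a > 0 && r === 0,
};

function tally(votes: Vote[], strategy: Strategy): Tally {
  const approve = votes.filter((v) => v === "approve").length;
  const reject = votes.filter((v) => v === "reject").length;
  const abstain = votes.length - approve - reject;
  const decided = approve + reject;
  return {
    approve,
    reject,
    abstain,
    approvalPct: decided === 0 ? 0 : (approve / decided) * 100,
    approved: rules[strategy](approve, reject),
  };
}

// Three approvals under simple_majority, as in the sample run above.
const result = tally(["approve", "approve", "approve"], "simple_majority");
console.log(result.approved ? "APPROVED" : "REJECTED", result.approvalPct.toFixed(1) + "%");
```

The remaining strategies (higher_order, opinion_wise, proof_of_learning) weigh votes in ways the README does not spell out, so they are left out of the sketch.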
nexus-agents setup # Auto-configures MCP server in Claude Code, Cursor, etc.
Restart your editor. The 46 MCP tools (orchestrate, consensus_vote, research_synthesize, verify_audit_chain, …) become available to whatever agent you're already using.
What setup configures:
By default, setup writes/updates up to seven things in your environment. Each can be skipped with the corresponding --skip-* flag if you don't want it.
| Configured | Where written | Opt-out flag |
|---|---|---|
| MCP server registration (Claude) | ~/.claude/mcp.json / Claude Desktop config | --skip-mcp |
| Project rules | .cursor/rules/ and/or .claude/rules/ | --skip-rules |
| Session hooks | ~/.claude/hooks/ (session-start / pre-tool / etc.) | --skip-hooks |
| OpenCode MCP config | ~/.config/opencode/opencode.json | --skip-opencode |
| Gemini MCP config | ~/.gemini/mcp.json | --skip-gemini |
| Codex MCP config | ~/.codex/config.toml | --skip-codex |
| Project config file | ./nexus-agents.yaml | --skip-config |
Run with --interactive (the default) for a per-step confirm flow, or --no-interactive to accept all defaults.
export ANTHROPIC_API_KEY=your-key
nexus-agents orchestrate "Explain the architecture of this codebase"
Security: In default MCP mode, the server communicates only via stdio with the parent process (no network exposure). The REST API (opt-in) auto-generates an API key on first start. For network-exposed deployments, set NEXUS_AUTH_ENABLED=true. See SECURITY.md.
| Category | Details |
|---|---|
| Adversarial PR Review | pr_review MCP tool: 5 voter roles (architect, security, devex, catfish, scope_steward) with 4-point gate. v5 evaluation (n=10): 100% bug-catch, 50% raw FP rate; manual triage reclassified most FPs as legitimate findings (details) |
| Consensus Voting | 6 strategies: simple_majority, supermajority, unanimous, higher_order (Bayesian correlation-aware), opinion_wise, proof_of_learning |
| Drift-Detected Charter | CLAUDE.md + inject-governance.ts check enforces single-source registries (model registry, MCP tools, expert types). Blocking CI gate fails build on drift |
| Audit Trail | Structured logging for every tool call, voter decision, and routing choice. Hash-chained immutable storage; integrity verifiable via verify_audit_chain MCP tool (#2281, #2289) |
| Closed-Loop Telemetry | OutcomeStore feeds LinUCB + TOPSIS scoring; a second bounded, audited self-tuning loop demotes unhealthy CLIs (capped, auto-decaying, on by default, opt-out NEXUS_TUNE_ENFORCE=false) |
| Security Pipeline | Sandboxing (Docker/policy), trust-tiered input handling, SARIF parsing, red-team patterns, ClawGuard access policies (audit/enforce) |
| Multi-Expert Orchestration | 12 built-in expert types coordinated by Orchestrator. Roles bind prompt + tools + memory |
| Development Pipeline | Research → Plan → Vote → Decompose → Implement → QA → Security. Three modes: autonomous, harness (caller implements), dry-run |
| Memory & Learning | 5 user-facing backends (session, belief, agentic, adaptive, typed). Cross-session persistence feeds routing decisions |
| Research System | 9 discovery sources (arXiv, GitHub, Semantic Scholar, etc). Auto-catalog, quality scoring, synthesis into topic clusters |
| Graph Workflows | DAG-based workflow execution with checkpoint/resume, state reduction, and event hooks |
| 46 MCP Tools | Agent management, workflow execution, research, memory, codebase intelligence, repo analysis, consensus, operations |
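
The Audit Trail row above mentions hash-chained storage whose integrity can be verified. The sketch below shows the general technique with an in-memory chain: each entry stores the hash of its predecessor, so editing any entry breaks every later link. The entry layout, the SHA-256 choice, and the function names are assumptions for illustration; the on-disk format used by FileAuditStorage is not described in this README.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; the real audit record format is not shown here.
interface AuditEntry {
  seq: number;
  event: string;
  prevHash: string; // hash of the previous entry, or GENESIS for the first
  hash: string;     // hash over seq, prevHash, and event
}

const GENESIS = "0".repeat(64);

function hashEntry(seq: number, event: string, prevHash: string): string {
  return createHash("sha256").update(`${seq}\n${prevHash}\n${event}`).digest("hex");
}

// Append-only: returns a new chain with one more entry linked to the last.
function append(chain: AuditEntry[], event: string): AuditEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const seq = chain.length;
  return [...chain, { seq, event, prevHash, hash: hashEntry(seq, event, prevHash) }];
}

// Returns -1 when the chain is intact, otherwise the index of the first bad entry.
function verifyChain(chain: AuditEntry[]): number {
  let expectedPrev = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const entry = chain[i];
    if (entry.prevHash !== expectedPrev) return i;
    if (hashEntry(entry.seq, entry.event, entry.prevHash) !== entry.hash) return i;
    expectedPrev = entry.hash;
  }
  return -1;
}
```

A verifier of this kind only detects tampering after the fact; it does not prevent writes, which is why the source pairs it with append-only storage.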
| Expert | Specialization |
|---|---|
| Code | Implementation, debugging, optimization |
| Architecture | System design, patterns, scalability |
| Security | Vulnerability analysis, secure coding |
| Testing | Test strategies, coverage, test generation |
| QA | Acceptance criteria, regression checks |
| Documentation | Technical writing, API docs |
| DevOps | CI/CD, deployment, infrastructure |
| Research | Literature review, state-of-the-art analysis |
| PM | Product management, requirements, priorities |
| UX | User experience, usability, accessibility |
| Infrastructure | Server management, bare metal, networking |
| Data Viz | Charts, dashboards, visual data presentation |
Nexus-agents routes tasks through 5 CLI adapters, each connecting to major AI providers:
| CLI | Provider | Best For |
|---|---|---|
| claude | Anthropic (Claude) | Complex reasoning, analysis |
| gemini | Google (Gemini) | Long context, multimodal |
| codex | OpenAI (Codex CLI) | Code generation, reasoning |
| codex-mcp | OpenAI (Codex MCP) | MCP-native Codex integration |
| opencode | Custom OpenAI-compat | Custom endpoints, local models |
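
The feature list names TOPSIS as one of the scoring methods used when routing between these adapters. As a rough illustration of that technique only, the sketch below ranks alternatives by their closeness to an ideal solution. The criteria (success rate, latency, cost), their weights, and the sample numbers are invented for the example, and the LinUCB component is not shown.

```typescript
interface Criterion {
  weight: number;   // relative importance
  benefit: boolean; // true: higher is better; false: lower is better
}

// Standard TOPSIS: vector-normalize each column, apply weights, then score each
// row by its relative distance to the ideal-best and ideal-worst points.
// Assumes every row has exactly one value per criterion.
function topsis(matrix: number[][], criteria: Criterion[]): number[] {
  const norms = criteria.map(
    (_, j) => Math.sqrt(matrix.reduce((sum, row) => sum + row[j] ** 2, 0)) || 1,
  );
  const weighted = matrix.map((row) =>
    row.map((x, j) => (x / norms[j]) * criteria[j].weight),
  );
  const column = (j: number) => weighted.map((row) => row[j]);
  const best = criteria.map((c, j) =>
    c.benefit ? Math.max(...column(j)) : Math.min(...column(j)),
  );
  const worst = criteria.map((c, j) =>
    c.benefit ? Math.min(...column(j)) : Math.max(...column(j)),
  );
  const distance = (row: number[], ref: number[]) =>
    Math.sqrt(row.reduce((sum, x, j) => sum + (x - ref[j]) ** 2, 0));
  return weighted.map((row) => {
    const dBest = distance(row, best);
    const dWorst = distance(row, worst);
    const total = dBest + dWorst;
    return total === 0 ? 0.5 : dWorst / total; // closeness in [0, 1]
  });
}

// Hypothetical telemetry: [success rate, latency in seconds, cost units].
const criteria: Criterion[] = [
  { weight: 0.5, benefit: true },
  { weight: 0.3, benefit: false },
  { weight: 0.2, benefit: false },
];
const scores = topsis(
  [
    [0.9, 2, 1],
    [0.8, 4, 2],
    [0.6, 8, 4],
  ],
  criteria,
);
console.log(scores.map((s) => s.toFixed(3)));
```

A higher closeness score means the alternative sits nearer the best observed value on every criterion; a router built this way would prefer the highest-scoring adapter.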
nexus-agents # Start MCP server (default)
nexus-agents doctor # Check installation health
nexus-agents setup # Configure Claude CLI integration
nexus-agents orchestrate "..." # Run task with experts
nexus-agents vote "proposal" # Multi-agent consensus voting
nexus-agents review <pr-url> # Review a GitHub PR
nexus-agents expert list # List available experts
nexus-agents workflow list # List workflow templates
nexus-agents config init # Generate config file
nexus-agents init --portable # Create workspace-local .nexus-agents/ for sandboxes
nexus-agents init --portable --mcp-config # Also emit .mcp.json wiring Claude Code to it
nexus-agents init --portable --install --mcp-config # …and install the binary into the workspace
nexus-agents fitness-audit # Run fitness score audit
nexus-agents research query # Query research registry
nexus-agents --help # Full command list
See docs/ENTRYPOINTS.md for the complete CLI reference (28+ commands).
When running as an MCP server, the following tools are available. Start with run — the default entry point: give it a goal and the MetaOrchestrator picks (and, with execute: true, runs) the right strategy. The other pipeline tools are advanced force-strategy paths for pinning a specific one.
| Tool | Description |
|---|---|
| orchestrate | Task orchestration with Orchestrator coordination |
| create_expert | Create a specialized expert agent |
| execute_expert | Run a task through a previously-created expert (by expertId) |
| run_workflow | Run a linear workflow template (use run_graph_workflow for DAGs) |
| delegate_to_model | Pick the best-fit existing model for a task (no registry change) |
| list_experts | Inventory of expert ROLES for create_expert |
| list_workflows | Inventory of multi-step TEMPLATES for run_workflow |
| consensus_vote | Multi-model consensus voting on proposals |
| research_query | Query research registry (status, overlap, stats, search) |
| research_add | Add an arXiv PAPER to the registry (for non-paper sources use research_add_source) |
| research_add_source | Add a NON-PAPER source (repo/tool/blog); for arXiv papers use research_add |
| research_discover | Discover papers/repos from external sources |
| research_analyze | Analyze registry for gaps, trends, coverage |
| research_catalog_review | Review auto-cataloged research references |
| research_synthesize | Synthesize registry into topic clusters with themes |
| survey_oss_landscape | Transient OSS project search (license, stars, last-commit) via GitHub |
| vendor_publishing_audit | Look up a vendor's signing infrastructure (GPG keys, URL patterns, signature shape) |
| compare_data_feeds | Diff two YAML/JSON feeds: coverage + per-field axes |
| memory_query | Query across all memory backends |
| memory_stats | Memory system statistics dashboard |
| memory_write | Write to typed memory backends |
| weather_report | Multi-CLI performance weather report |
| issue_triage | Triage GitHub issues with trust classification |
| run_graph_workflow | Run a DAG workflow with per-node checkpoints + audit trail (linear → run_workflow) |
| execute_spec | Execute AI software factory spec pipeline |
| registry_import | Draft YAML for a NEW model entry (for picking existing models use delegate_to_model) |
| query_trace | Query execution traces for observability |
| query_task_state | Query the structured task-state log for a task ID |
| get_job_result | Read result of an async-mode dispatch by jobId (#3042 / #2631) |
| list_jobs | List async-mode jobs across all tools; cross-session discovery (#3046 / #2631) |
| cancel_job | Mark an async-mode job as cancelled; idempotent (#3042 Stage 1b) |
| ci_health_check | CI infrastructure health: composes GitHub status + recent-runs activity (#3076) |
| verify_audit_chain | Verify hash chain of a FileAuditStorage audit log directory |
| repo_analyze | Analyze GitHub repository structure |
| repo_security_plan | Generate security scanning pipeline for a repo |
| extract_symbols | Tree-sitter AST symbols from a SINGLE file (functions/classes/types) |
| search_codebase | Cross-file ripgrep search for patterns or text (not an AST parser) |
| run_dev_pipeline | Full dev pipeline: research, plan, vote, implement, QA |
| run_pipeline | Execute a pipeline plugin by name with typed input |
| pr_review | Multi-voter PR review with verification gate (experimental) |
| supply_chain_tradeoff_panel | Per-axis tradeoff vote for build-vs-buy / supply-chain decisions |
| improvement_review | Threshold-gated observability loop: surfaces routing/tech-debt/bug/security signals from outcome+fitness data; files candidate issues |
| run_quality_gate | Run the QA quality gate (typecheck/lint/tests/build/security) over a project dir; returns structured pass/fail verdict + feedback |
| suggest_research_tasks | SUGGEST-ONLY: candidate pipeline tasks from research_discover findings for review; files/executes nothing (#1715) |
| list_available_models | Probe all model-discovery transports (OpenRouter API + opencode/claude/codex/gemini CLIs) and report per-transport health; validates the CLIs/APIs are reachable (#3406) |
| run | Default entry point: give a goal, MetaOrchestrator picks the strategy; returns the routing decision (execute:false, read-only) or runs it inline (execute:true; dev-pipeline+pipeline+research+consensus wired) (#3548) |
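
For readers wiring up their own MCP client, the sketch below builds the JSON-RPC message that invokes the run tool. The tools/call method and the newline-delimited stdio framing come from the MCP specification; the argument name goal is an assumption (the README only says to "give it a goal"), while execute is the flag described above. Editors such as Claude Code send these messages for you, so this is only needed for custom integrations.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Build a request for the `run` tool. `goal` is a hypothetical argument name;
// `execute` defaults to false, the read-only mode that only returns the routing decision.
function buildRunRequest(id: number, goal: string, execute = false): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "run", arguments: { goal, execute } },
  };
}

// The stdio transport sends one JSON message per line.
function serialize(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

const request = buildRunRequest(1, "Review the open PR for security issues");
console.log(serialize(request).trim());
```

Starting with execute left at false lets you inspect which strategy would be chosen before anything runs.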
Environment Variables:
| Variable | Description |
|---|---|
| ANTHROPIC_API_KEY | Claude API key |
| OPENAI_API_KEY | OpenAI API key |
| GOOGLE_AI_API_KEY | Gemini API key |
| NEXUS_LOG_LEVEL | Log level (debug/info/warn/error) |
Generate config file:
nexus-agents config init # Creates nexus-agents.yaml
| Topic | Link |
|---|---|
| Full CLI Reference | docs/ENTRYPOINTS.md |
| Architecture | docs/architecture/README.md |
| Contributing | CONTRIBUTING.md |
| Coding Standards | CODING_STANDARDS.md |
| Quick Start Guide | QUICK_START.md |
git clone https://github.com/nexus-substrate/nexus-agents.git
cd nexus-agents
pnpm install
pnpm build
pnpm test
Requirements: Node.js 22.x LTS, pnpm 9.x
Create a feature branch (git checkout -b feat/amazing-feature) and commit with a conventional-commit message (feat(scope): add feature). See CONTRIBUTING.md for details.
MIT - See LICENSE
Built with Claude Code