API-first MCP server for multi-model cross-review with unanimous convergence gates.
MCP server orchestrating API-first cross-review between Claude, ChatGPT Codex, Gemini, DeepSeek, and Grok with unanimous convergence gates.
status: stable · npm · runtime: API-only · security: CodeQL Default Setup · license: Apache 2.0
Install.
npm install -g @lcv-ideas-software/cross-review-v2
# or using the GitHub Packages mirror:
npm install -g @lcv-ideas-software/cross-review-v2 --registry=https://npm.pkg.github.com
Status. Stable. Current release: v03.04.00 (npm package 3.4.0). See
CHANGELOG.md for the release history.
The version history at a glance:
| Release | Scope |
|---|---|
v03.04.00 |
Minor — Perplexity multi-failure-mode close-out 2026-05-13: 3 coordinated fixes covering 7 production sessions Codex flagged (51973fac, f72e597a, f9a19401, 99d46a2b, 00d92cce, 59776026, 0003b2fe). Fix #1 — streaming-path strip parity (P0, surgical 2-line edit in src/peers/perplexity.ts:~409/~504): the v3.2.0 stripPerplexityThinkingBlock fix was applied only inside sonarText(response) (non-streaming path at :~426/~521). Production server_info.streaming.tokens=true is the default, so virtually every Perplexity call traversed the streaming branches, which used raw stream_buffer.text() and bypassed the strip entirely. <think>...</think> preambles from sonar-reasoning-pro / sonar-deep-research reached the status parser, producing unparseable_after_recovery despite valid trailing JSON. v3.4.0 wraps stream_buffer.text() with stripPerplexityThinkingBlock(...) at both streaming sites, restoring parity. Forensic evidence: sess f9a19401 (v3.3.0 self-investigation) — 4 peers converged READY on the exact diagnosis; Perplexity ready_rate=0.28125 (9/32) vs ~1.0 for other peers. Fix #2 — anti-meta-audit lock (P1, prompt clause + heuristic detector): sess 51973fac shipped a checklist of `MISSING: diff hunk` placeholders plus sections titled Evidence Gap / Validation Claims (NARRATIVE) / Peer Review Readiness Blockers instead of refining the artifact. leadShipModeDirective() gains an ## Anti-Meta-Audit Lock (HARD) clause; new exported detectMetaAuditFabrication(text) in src/core/orchestrator.ts flags placeholder + section anti-patterns with a double-bar threshold (placeholders ≥ 3) OR (sections ≥ 1 AND placeholders ≥ 2) for false-positive resistance. Reuses the shared consecutiveLeadDrifts counter (cap=2); new event session.lead_meta_audit_fabrication_detected + finalize reason lead_meta_audit_repeated.
Fix #3 — reviewer proportionality (P2, prompt only): sess 0003b2fe — Perplexity reviewer demanded separate session_attach_evidence of the same rg scan output the caller had narrated inline, blocking convergence over rounds. sessionContractDirectives() gains item 5 scoped tightly to pure config/script/text static-scan reviews; runtime work (build/test/deploy/migration/network) still requires raw output; "when in doubt, prefer asking for evidence" preserves rigor default. 3 new smoke markers (perplexity_streaming_strip_parity_test, meta_audit_fabrication_detection_test, proportionality_guidance_test). 100% backward-compatible additive: new exported helper, new event type, new finalize reason; tool schema unchanged. Minor bump (3.3.0 → 3.4.0; Y-component increment per SemVer) — additive public surface is the reason; behavior change for callers passing valid args is pure failure-mode prevention. |
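The double-bar threshold in Fix #2 can be sketched as follows. This is an illustrative simplification, not the shipped detector: the real heuristics in src/core/orchestrator.ts are richer, and the exact placeholder/section regexes here are assumptions.

```typescript
// Hypothetical simplification of detectMetaAuditFabrication: count
// checklist-placeholder and meta-audit-section anti-patterns, then apply
// (placeholders >= 3) OR (sections >= 1 AND placeholders >= 2).

interface MetaAuditSignals {
  fabricated: boolean;
  placeholder_count: number;
  section_count: number;
}

// Assumed anti-pattern shapes, derived from the description above.
const PLACEHOLDER_RE = /MISSING:\s*diff hunk/gi;
const SECTION_RE =
  /^#+\s*(Evidence Gap|Validation Claims|Peer Review Readiness Blockers)\b/gim;

export function detectMetaAuditFabrication(text: string): MetaAuditSignals {
  const placeholder_count = (text.match(PLACEHOLDER_RE) ?? []).length;
  const section_count = (text.match(SECTION_RE) ?? []).length;
  const fabricated =
    placeholder_count >= 3 || (section_count >= 1 && placeholder_count >= 2);
  return { fabricated, placeholder_count, section_count };
}
```

The double bar is a false-positive guard: a lone placeholder or a lone section heading never trips the detector on its own.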
v03.03.00 |
Minor — Caller peer-selection lock (operator directive 2026-05-12: "ALL AGENTS/PEERS ALWAYS PARTICIPATE, REGARDLESS OF THE CALLER'S CHOICE OR WILL"). Closes the systematic gaming pattern where peer callers (notably Codex, observed across multiple sessions) selectively excluded other peers from their own cross-review panels via curated peers: [...] lists or pinned a sympathetic relator via lead_peer. Lock surface: peers is locked for ALL callers (including operator) — reviewer panel is ALWAYS the full server-configured peer_enabled set; operators tune via env vars (`CROSS_REVIEW_V2_PEER_
v03.02.00 |
Patch — Codex bug-report close-out 2026-05-12: three surgical fixes (Perplexity <think> parser + session-state invariant + orchestrator strict peers). Fix #1 (src/peers/perplexity.ts): sonar-reasoning-pro / sonar-deep-research emit a <think>...</think> reasoning preamble before structured JSON; pre-v3.2.0 the parser fed that raw string into the format-recovery pipeline, which failed unparseable_after_recovery even when the trailing JSON was valid READY. New PERPLEXITY_THINKING_BLOCK regex + exported stripPerplexityThinkingBlock() helper; sonarText() now strips before returning. Closes the long-standing blocker that forced v3.0.0/v3.1.0 to self-bypass HARD GATE. Fix #2 (src/core/session-store.ts): closes session-state corruption observed in session 41244a1c-e7e8-439a-a59e-9339f7c7175d (R1-R3 didn't converge, R4 finalized as converged, R5+R6 ran on top and clobbered convergence_health back to "blocked", leaving meta with outcome="converged" / health.state="blocked"). finalize() now validates outcome="converged" against the latest round's convergence.converged (throws code: "session_finalize_outcome_mismatch"); appendRound() refuses to append to a finalized session (code: "session_already_finalized"); new public assertNotFinalized() helper wired into askPeers + runUntilUnanimous entry points so the round fails fast instead of after burning budget. Fix #3 (src/core/orchestrator.ts): when the caller passes an explicit peers: [...] list, autowire judges are intersected with the explicit list — both the consensus and single-peer paths. Observed in session 73036fbb where peers=[codex,gemini,deepseek,grok] but autowire still invoked perplexity as judge. New hadExplicitPeers flag + judgeRespectsExplicitPeers() helper; skipped sessions emit session.evidence_judge_pass.autowire_skipped with skipped_for_explicit_peers: true + session_explicit_peers: [...] for operator audit. 
3 new smoke markers (perplexity_thinking_block_strip_test 7 scenarios + 3 pins; session_finalize_state_invariant_test 5 scenarios + 1 pin; orchestrator_strict_peer_panel_test 5 source pins). Smoke harness completes ok: true / events: 99. Patch bump (additive — new exports + new error codes; pre-existing anti-patterns now reject loudly instead of corrupting state). The cross-review-v2-attachment-inline-test smoke fixture was updated to caller_status: "NOT_READY" so R1 doesn't auto-converge under stub mode. |
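The v3.2.0 Fix #1 strip can be sketched as a single anchored regex. This is a minimal illustration of the behavior described above; the shipped helper lives in src/peers/perplexity.ts and its exact pattern may differ.

```typescript
// sonar-reasoning-pro / sonar-deep-research prepend a <think>...</think>
// reasoning block before the structured JSON. Removing the preamble lets the
// trailing JSON reach the status parser intact instead of failing
// unparseable_after_recovery.
const PERPLEXITY_THINKING_BLOCK = /^\s*<think>[\s\S]*?<\/think>\s*/;

export function stripPerplexityThinkingBlock(raw: string): string {
  return raw.replace(PERPLEXITY_THINKING_BLOCK, "");
}
```

The v3.4.0 parity fix then amounts to applying the same helper on the streaming path, i.e. stripping the buffered stream text before it reaches the parser, rather than only inside the non-streaming sonarText() path.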
v03.00.00 |
Major — Perplexity joins the panel: quintet (5 peers) → sextet (6). Operator directive 2026-05-12. New PerplexityAdapter at https://api.perplexity.ai (Sonar API, OpenAI-Chat-Completions-compatible; reuses shared loadOpenAICtor lazy SDK helper). 5 architectural traits handled explicitly: (1) web search is the DEFAULT per call — peer becomes fact-check overlay; (2) system prompt is half-honored (search component does not attend to it); (3) reasoning_effort enum is `minimal
v02.28.00 |
Minor — Cold-start hardening Part 3: Windows registry env-var lookup bulk-cached (3-7 s → ~100 ms). Empirical profile revealed the real boot bottleneck on Windows: loadConfig() consuming 3.1-7.0 s because readWindowsRegistryEnv(name) fired reg query <root> /v NAME per missing env var × 2 scopes (HKCU + HKLM). With ~140 config vars and partial process.env, this burned 3-7 s dwarfing every other boot cost. v2.27.0 + v2.27.1 attacked SDK imports + sweeps (~340 ms) — a side concern. v2.28.0 fix: single bulk reg query <root> at first miss populates a Map<string,string> module cache; readWindowsRegistryEnv becomes a pure cache.get(name). Cost: O(1 + 2 registry reads) instead of O(N missing × 2 spawns). Empirical handshake (3 trials each): v2.27.1 3.18 / 3.12 / 3.14 s → v2.28.0 0.37 / 0.37 / 0.38 s = 8.4× speedup. loadConfig() alone: 3,307 ms → 87 ms (38×). Cold-start now well below Claude Code's spawn timeout. New smoke windows_registry_env_bulk_cache_test (7-class assertion pinning Map cache + bulk loader + canonical reg query <root> shape + negative invariant on per-var /v NAME + escapeRegExp absence + thin lookup + dist parity). Public surface 100% backward-compatible. Self-review BYPASSED per feedback_cross_review_self_repair_exception.md (gate-fixing-itself, third installment). Minor bump — internal behavior change with measurable 8.4× runtime impact. |
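The bulk-cache idea above can be sketched like this: one `reg query` spawn populates a Map, and every later lookup is a pure `cache.get()`. The `reg query` output format assumed here (indented `NAME  REG_SZ  value` lines) and the HKCU-only scope are illustrative; the shipped reader also covers HKLM.

```typescript
import { spawnSync } from "node:child_process";

const cache = new Map<string, string>();
let loaded = false;

// Pure, testable parser for one registry-listing blob (assumed reg.exe shape).
export function parseRegOutput(stdout: string): Map<string, string> {
  const out = new Map<string, string>();
  for (const line of stdout.split(/\r?\n/)) {
    const m = /^\s+(\S+)\s+REG_(?:EXPAND_)?SZ\s+(.+)$/.exec(line);
    if (m) out.set(m[1], m[2]);
  }
  return out;
}

export function readWindowsRegistryEnv(name: string): string | undefined {
  if (!loaded) {
    loaded = true; // one bulk spawn at first miss, never again
    if (process.platform === "win32") {
      const res = spawnSync("reg", ["query", "HKCU\\Environment"], { encoding: "utf8" });
      if (res.status === 0) {
        for (const [k, v] of parseRegOutput(res.stdout)) cache.set(k, v);
      }
    }
  }
  return cache.get(name); // O(1) after the single bulk read
}
```

This is the O(1 + bulk reads) shape the entry describes, versus one `reg query ... /v NAME` spawn per missing variable per scope.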
v02.27.01 |
Patch — Cold-start hardening Part 2: lazy-load 5 provider SDKs + defer 6 startup sweeps to setTimeout(30s). Completes the cold-start fix started in v2.27.0. Empirical motivation 2026-05-12: cross-review-v2 failed to register tools in a Claude Code session while the other 5 MCP hosts (Codex CLI / Gemini Code Assist / Antigravity / Grok CLI / DeepSeek CLI / VS Code) loaded normally. Diagnostic measurement of the real JSON-RPC initialize handshake showed the server taking ~4.2 s to respond — right on top of Claude Code's per-spawn timeout. Two contributors stacked: eager top-level imports of 5 provider SDK module trees (~3 s of CommonJS/ESM graph) + v2.27.0's 4 boot sweeps + 2 boot notices all running on the same event-loop tick as the initialize message. Lazy-load: every adapter source uses import type only for provider SDKs; new shared cached loaders loadAnthropicCtor() / loadOpenAICtor() / loadGenaiModule() wrap import(<sdk>) in a per-module promise so concurrent first-callers resolve exactly once. Each adapter's client() is now async; 25 call sites across the 5 adapters updated; geminiThinkingConfig(model, ThinkingLevel) takes the lazy-loaded enum as 2nd arg. model-selection.ts consumes the same loaders. Deferred sweeps: 6 boot-time setImmediate blocks in server.ts become setTimeout(..., STARTUP_SWEEP_DELAY_MS) with delay = 30_000 ms — initialize responds in <200 ms while sweeps run later when the operator is idle. Empirical handshake measurement post-ship: 3.6-3.9 s (vs 3.7-4.2 s pre-ship); margin is modest because Node.js + MCP SDK + orchestrator still dominate, but the architectural correctness keeps SDK and FS work off the initialize tick entirely. 2 new smoke markers (lazy_provider_sdk_imports_test, startup_sweeps_use_setTimeout_test) + 2 existing gemini assertions updated. Public surface 100% backward-compatible (3 new named exports for cross-module loader reuse; client() is private so async-conversion is internal). Patch bump. |
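The per-module promise pattern described above can be sketched with a hypothetical generic `loadOnce` helper (the shipped helpers are the concrete loadAnthropicCtor() / loadOpenAICtor() / loadGenaiModule()):

```typescript
// Wrap a dynamic import() in a cached promise so concurrent first callers
// trigger exactly one module load; every later call returns the same promise.
export function loadOnce<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => (cached ??= factory());
}

// Example: a lazily loaded Node built-in standing in for a provider SDK tree.
export const loadCrypto = loadOnce(() => import("node:crypto"));
```

Because the factory runs at first call rather than at module top level, the heavy SDK graphs stay off the `initialize` tick entirely, which is the architectural point of the ship.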
v02.27.00 |
Minor — Cold-start hardening Part 1: corrupted meta.json auto-quarantine + finalized-session auto-prune. Empirically motivated by Claude Code reload friction 2026-05-12: 534 session dirs accumulated under ~/.cross-review/data_v2/sessions/, including 3 corrupted by the v2.25.1 redact escape-boundary bug (77c47284, be47a5b0, 7edf63e3). The startup sweeps iterate via list() which read every meta.json; a single corrupted file caused the sweep to throw + abort, surfacing parse-error stderr on every reload — Claude Code is more sensitive to startup stderr than other hosts. SessionStore.list() now silently skips + quarantines corrupted meta.json (renamed to meta.json.bad with one [cross-review-v2] quarantined … stderr line, idempotent). SessionStore.pruneOldSessions(maxAgeDays?) removes finalized session dirs (outcome ∈ converged/aborted/max-rounds) whose updated_at is older than the cutoff. Default 60 days; CROSS_REVIEW_V2_PRUNE_AFTER_DAYS=0 disables entirely. In-flight or untyped-outcome sessions are NEVER pruned (preserves audit trail). New boot setImmediate block wires the prune; stderr only emitted when pruned > 0. Minor bump — 2 new methods on SessionStore; list() swallows-and-quarantines instead of throws (additive defensive). Backward-compatible default; operators see no behavior change unless they have corrupted meta.json or >60-day-old finalized sessions. Self-review BYPASSED per feedback_cross_review_self_repair_exception.md. |
v02.26.01 |
Patch — max_attached_evidence_chars default raised 80_000 → 200_000 to fix multi-file evidence truncation. Empirically demonstrated by the stepsecurity v0.2.0 ship 2026-05-12 (sess fd1037e5 and prior 85f94725): with 5 attached evidence files totaling ~95KB, session-store.readEvidenceAttachments() budget allocator at src/core/session-store.ts:1481-1543 exhausted the 80KB total cap before reaching the 4th+ attachment, surfacing (truncated to 33273 of 38412 bytes) to peers, who in 5 consecutive rounds correctly flagged the truncation as a blocker. The perFileCap = max(2_000, floor(totalCap * 0.6)) mechanic remains correct (60% per-file allowance leaves room for at least 1 other attachment); only the global totalCap default needed bumping. New default 200_000 chars accommodates ~5 attachments averaging 30KB each. Operator override unchanged via CROSS_REVIEW_V2_MAX_ATTACHED_EVIDENCE_CHARS. Documented adjacent issues (no code fix; tracked for v2.27+): (1) lead-drift abort threshold is 2 consecutive (orchestrator.ts:3662) — when max_rounds is reached with consecutiveLeadDrifts === 1, the session ends max-rounds instead of lead_meta_review_drift; workaround = use ask_peers for known-drift-prone task patterns; (2) inaccessible upstream OpenAPI spec — when peers demand verbatim spec excerpts but the spec endpoint requires browser-session cookie auth, the caller must rely on alternative-evidence patterns. Patch bump — backward-compatible default change. No public API surface change. Self-review BYPASSED per feedback_cross_review_self_repair_exception.md. |
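The allocator arithmetic referenced above is simple enough to state directly (a sketch of the per-file mechanic only; the shipped allocator also tracks the running total across attachments):

```typescript
// perFileCap = max(2_000, floor(totalCap * 0.6)): a single attachment may use
// at most 60% of the global budget, leaving room for at least one other file.
export function perFileCap(totalCap: number): number {
  return Math.max(2_000, Math.floor(totalCap * 0.6));
}
```

Under the old 80_000-char default the per-file allowance was 48_000 chars; the new 200_000 default raises it to 120_000 while the global cap absorbs ~5 attachments averaging 30KB, which is why only the total needed bumping.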
v02.26.00 |
Minor — Full pricing-model schema: base + extended-tier + cache (read/write) + promo (limited-time discount), all env-configurable, graceful fallback when fields are absent or promo expires. Operator directive 2026-05-11 ("Cross-review-v2 needs to know how to read, from the configurable variables in the config files and the env vars, all current pricing models, with and without cache, with and without promotion, below and above given token counts"). Adds 14 new optional pricing env vars per provider plus 2 metadata env vars per provider (_THRESHOLD_TOKENS, _PROMO_EXPIRES_AT_UTC) on top of the v2.0.0 required pair — total 18 env-var slots per provider × 5 providers = 90 max. Selection cascade in new exported selectRate(): (promo+extended) → promo → extended → base, each step automatically falling through when the corresponding field is unset OR the gating condition does not apply. When promo expires (Date.now() >= Date.parse(promo_expires_at)), the system falls back to base without operator intervention; when extended is unset, base applies to all prompt sizes; when cache rates are unset entirely, cache tokens are billed at the input rate (zero savings reported, no penalty). No-hardcoded-financials directive — the legacy src/core/cache-rates.json runtime fallback was REMOVED; cache pricing comes exclusively from env vars or graceful input-rate fallback. CostEstimate type extended with cache_read_cost?, cache_write_cost?, tier_used? ("base"|"extended"|"promo"|"promo_extended"). estimateCacheSavings() third positional arg (configRate) is now required — internal/MCP callers route through estimateCost() and are unaffected. New smoke marker full_pricing_model_v2260_test pinning 11 invariants. Minor bump — additive public surface; breaking only for direct estimateCacheSavings() callers. |
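The selectRate() cascade can be sketched as a pure function. Field names and the per-1M-token unit below are illustrative assumptions; the shipped type carries the full per-provider env-var mapping.

```typescript
interface PricingModel {
  base: number;                 // USD per 1M tokens (illustrative unit)
  extended?: number;            // applies at/above threshold_tokens
  promo?: number;
  promo_extended?: number;      // combined tier, when both gates apply
  threshold_tokens?: number;
  promo_expires_at_utc?: string;
}

export type TierUsed = "base" | "extended" | "promo" | "promo_extended";

// Cascade: (promo+extended) -> promo -> extended -> base; each step falls
// through when its field is unset OR its gating condition does not apply.
export function selectRate(
  p: PricingModel,
  promptTokens: number,
  now: number = Date.now(),
): { rate: number; tier_used: TierUsed } {
  const promoActive =
    p.promo !== undefined &&
    (p.promo_expires_at_utc === undefined || now < Date.parse(p.promo_expires_at_utc));
  const extendedApplies =
    p.extended !== undefined &&
    p.threshold_tokens !== undefined &&
    promptTokens >= p.threshold_tokens;
  if (promoActive && extendedApplies && p.promo_extended !== undefined) {
    return { rate: p.promo_extended, tier_used: "promo_extended" };
  }
  if (promoActive) return { rate: p.promo!, tier_used: "promo" };
  if (extendedApplies) return { rate: p.extended!, tier_used: "extended" };
  return { rate: p.base, tier_used: "base" };
}
```

Note the expiry check mirrors the entry's rule: once `Date.now() >= Date.parse(promo_expires_at)`, the promo tiers silently fall away and base applies with no operator action.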
v02.25.01 |
Patch — meta.json corruption hotfix: redact() env-style pattern was crossing JSON-escape boundaries. The env-style assignment regex in src/security/redact.ts:26 used [^\s"',}]{6,} for the value capture group; backslash was NOT in the exclusion class, so when a peer response contained the JSON-escaped sequence token: write\" (the inner-string close-quote of an escaped peer text), the {6,} quantifier consumed write\ (6 chars including the escape backslash). The replacement [REDACTED] ate the closing \ of the escape, leaving a bare " that prematurely closed the outer JSON string — producing structurally-broken meta.json files that could not be re-parsed at session resume time. Empirical impact: 3 cross-review-v2 sessions today (be47a5b0, 77c47284, 7edf63e3) all aborted at session_init with parser errors at different positions — same root cause: peer responses to a 13-repo scorecard hotfix submission quoted id-token: write inside backtick-fenced YAML excerpts. Fix: extend the negative char class with \\. Three smoke regression cases added (escapeBoundary, realAssignment, yamlExcerpt). Patch bump — additive defensive narrowing of an existing pattern; no public surface change. Cross-review-v2 self-review BYPASSED per operator directive 2026-05-11 (the bug being fixed is in the cross-review gate itself; routing the fix through the broken gate would re-encounter the same corruption). |
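The escape-boundary bug reproduces with a simplified version of the env-style rule (the real src/security/redact.ts pattern covers many more key names; the `token|key|secret` alternation here is an illustration):

```typescript
// Pre-v2.25.1 value class: backslash NOT excluded, so the {6,} quantifier can
// consume the escape backslash of a JSON-escaped sequence like write\" .
export const BROKEN_VALUE = /((?:token|key|secret)\s*[:=]\s*)[^\s"',}]{6,}/gi;

// Fixed class: \\ added to the exclusion set, so the match stops before the
// escape backslash and the outer JSON string stays structurally intact.
export const FIXED_VALUE = /((?:token|key|secret)\s*[:=]\s*)[^\s"',}\\]{6,}/gi;

export function redact(text: string, pattern: RegExp = FIXED_VALUE): string {
  return text.replace(pattern, "$1[REDACTED]");
}
```

With the broken class, the fragment `token: write\"` matched `write\` (six chars including the escape backslash), so the replacement ate the escape and left a bare `"` that prematurely closed the enclosing JSON string, which is exactly the meta.json corruption described above.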
v02.25.00 |
Third deliberation mode circular joins ship and review. Imported from maestro-app's editorial protocol after operator review of the maestro design 2026-05-11. Serial deliberative custody: caller submits artifact; non-caller peers rotate as temporary curators; each rotator either approves the current version unchanged or produces a narrowly justified revision; convergence = full rotation completes without substantive change. No parallel peer-voting in this mode — the rotator IS the actor each round. When to use each mode: ship (default) for approving/rejecting an external artifact (PR review, spec approval, security gate — tribunal primitive, all peers vote READY); review for tasks phrased as a review act where the lead emits structured response; circular for producing/refining a shared artifact (spec drafting, RFC, protocol evolution, CHANGELOG copy — editorial primitive, panel produces). Modes coexist; mixing within a single session is not supported. Implementation: new SessionMode = "ship" | "review" | "circular"; new leadCircularModeDirective() Layer-1 prompt clause with 5 subsections (approve unchanged, approved-content lock, quality preservation, no-self-review, evidence-provenance-lock shared with ship); new runCircularLoop() private orchestrator method called when sessionMode === "circular"; new circular_state: {rotation_order, consecutive_no_change_count, last_revision_round} persisted in meta.json; new circular_max_rotations config (default 3, env CROSS_REVIEW_V2_CIRCULAR_MAX_ROTATIONS); new event types session.circular_rotation_assigned / _step_unchanged / _step_revised / _full_rotation_no_change / _max_rotations_exceeded / _rotation_too_small; new finalize reasons circular_full_rotation_no_change / circular_max_rotations_exceeded / circular_rotation_too_small. Rotation length minimum is 2 (no-self-immediate-output guard). Drift / empty / fabrication detection from v2.23/v2.24 fires identically. New smoke marker circular_mode_test pinning 11 invariants. 
MCP tool schemas (run_until_unanimous, session_start_unanimous) accept mode: "circular". Minor bump — additive public surface; pre-v2.25 callers see no behavior change. |
v02.24.00 |
Evidence-provenance lock for the ship-mode relator (Codex bug report 2026-05-10). Codex empirically observed two adjacent failure modes from his own working session 019dc794: (a) session 09c21d7a — lead_peer (Grok) fabricating operational evidence ex nihilo in run_until_unanimous with mode: ship (symmetric-pattern SHAs, 39-char SHAs where git emits 40, test-run counts not in attached evidence, git diff --check passed assertions, vite asset hashes); (b) session eee886d3 — different relator (DeepSeek) propagating caller-narrated operational claims (cargo test: 147 passed, npm run typecheck: passed) as if they were verified evidence, when the caller never attached raw command output via session_attach_evidence. Same architectural gap from two angles: NARRATIVE about operational evidence ≠ PROVENANCE-GRADE operational evidence. Pre-v2.24.0 the orchestrator promoted such revisions to next-round draft, burning a full round of peer calls before downstream peers (claude + deepseek) blocked convergence. Layer 1 — Evidence Provenance Lock (HARD) clause added to leadShipModeDirective() system prompt, instructing the relator that operational evidence (SHAs/hashes/build outputs/test counts/diff hunks/git assertions) MUST be cited verbatim from the corpus or declared as a blocker. Layer 2 — new exported detectFabricatedEvidence(revisionText, provenanceCorpus): FabricationDetectionResult heuristic detector with hex-token-subset check + canonical operational-assertion patterns. Thresholds: 3+ net-new hex tokens or 2+ suspicious assertions trip fabrication; corpus-quoted tokens are subtracted before scoring (false-positive guard). 
Layer 3 — orchestrator relator-revision branch wires the detector after emptyText/driftDetected checks, preserves prior draft on detection, increments consecutiveLeadDrifts, emits session.lead_fabrication_detected event (data.fabrication_signals carries net_new_hex_count + sample + suspicious_assertion_count + sample), finalizes with lead_fabrication_repeated at the consecutive-cap. New smoke marker relator_evidence_provenance_lock_test pins behavioral matrix (clean/hex/assertion/provenance-correct) + source-level invariants (prompt sentinel, threshold constants, event type, finalize reason, unified-counter contract). No tool surface change. Patch bump — additive event + finalize reason; failure-mode behavior change only. |
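The Layer-2 corpus-subtraction idea can be sketched as follows. This shows only the hex-token half of the shipped detector; the suspicious-assertion patterns and exact token regex are omitted/assumed for brevity.

```typescript
// Hex-ish tokens (short SHAs through full 40-char SHAs, asset hashes).
const HEX_TOKEN = /\b[0-9a-f]{7,40}\b/g;

export interface FabricationDetectionResult {
  fabricated: boolean;
  net_new_hex: string[];
}

// Tokens quoted from the provenance corpus are subtracted before scoring
// (false-positive guard); 3+ net-new hex tokens trip the detector.
export function detectFabricatedEvidence(
  revisionText: string,
  provenanceCorpus: string,
): FabricationDetectionResult {
  const corpus = new Set(provenanceCorpus.match(HEX_TOKEN) ?? []);
  const net_new_hex = [...new Set(revisionText.match(HEX_TOKEN) ?? [])].filter(
    (t) => !corpus.has(t),
  );
  return { fabricated: net_new_hex.length >= 3, net_new_hex };
}
```

A relator that cites SHAs verbatim from attached evidence scores zero net-new tokens; one that invents symmetric-pattern SHAs ex nihilo crosses the threshold immediately.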
v02.23.00 |
Anthropic empty-revision degenerate path detection. Patch closing a $0.21 USD waste path discovered while triaging maestro-app v0.5.20 review session 8187f5a8 (2026-05-10): Claude Opus extended-thinking responses can return a content array with only thinking/redacted_thinking blocks and no final text block. Pre-v2.23.0 the Anthropic adapter silently produced text: "" and the orchestrator promoted that empty string to the next-round draft, dispatching peer calls against an empty Draft Or Solution Under Review: block. Layer 1 — new parseAnthropicContent(content) returns {text, parser_warning?} instead of the lossy string; legacy textFromAnthropicContent kept as backward-compat shim. Layer 2 — anthropic.ts call sites surface parser_warning via new extraParserWarnings parameter on resultFromText/generationFromText, flowing to PeerResult.parser_warnings and (new) GenerationResult.parser_warnings. Layer 3 — orchestrator's relator-revision branch treats generation.text.trim() === "" the same as drift: preserve prior draft, increment consecutiveLeadDrifts, emit dedicated session.lead_empty_revision event, finalize with lead_empty_revision_repeated when the cap is hit. New smoke marker anthropic_empty_text_detection_test pins all 4 invariants (helper return shape, adapter call-site uniformity, orchestrator sentinel strings, types declaration). No public surface change for callers passing valid arguments. Patch bump — failure-mode behavior change only. |
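The Layer-1 helper shape can be sketched like this. The block types are simplified from the Anthropic Messages API content array, and the warning string is a hypothetical stand-in for whatever the shipped parser_warning carries:

```typescript
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "thinking" | "redacted_thinking" };

// Returns {text, parser_warning?} instead of a lossy string: a content array
// holding only thinking/redacted_thinking blocks surfaces a warning rather
// than silently yielding text: "".
export function parseAnthropicContent(
  content: ContentBlock[],
): { text: string; parser_warning?: string } {
  const text = content
    .filter((b): b is { type: "text"; text: string } => b.type === "text")
    .map((b) => b.text)
    .join("\n");
  if (text.trim() === "") {
    // Hypothetical warning label for illustration.
    return { text: "", parser_warning: "anthropic_content_has_no_text_block" };
  }
  return { text };
}
```

Downstream, the orchestrator treats the empty-text case like drift: preserve the prior draft instead of dispatching a round against an empty submission.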
v02.22.00 |
session_doctor drill-down + per-round cost telemetry + budget warning event. Three observability/audit improvements identified during a forensic audit of 467 durable sessions. A.P2: session_doctor hides per-session enumeration of findings.self_lead_metadata by default (178/467 = 38% pre-v2.16.0 noise); totals.self_lead_metadata count remains visible; pass include_legacy: true to enumerate. B.P2: entries in findings.open_evidence_sessions[] gain item_types (open items grouped by surfacing peer) + chronic_blockers (item ids with round_count >= 3) so operators see which evidence asks are systemic. B.P3: new costs_per_round[] + cost_ceiling_usd in meta.json (snapshot at session_init time so retroactive analysis is decoupled from later env-var changes); new one-shot session.budget_warning event fires when cumulative cost crosses 75% of the ceiling, providing early visibility before max_rounds_budget_exceeded. 3 new smoke markers (session_doctor_legacy_filter_test, evidence_checklist_drilldown_test, budget_warning_emit_test). Minor bump — public surface is additive; pre-v2.22 callers see no behavior change. |
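The one-shot budget warning in B.P3 reduces to a small stateful trigger. The 0.75 fraction comes from the entry above; the closure-based state handling is illustrative:

```typescript
// Fire exactly once when cumulative cost crosses 75% of the ceiling that was
// snapshotted at session_init time.
export function makeBudgetWarner(ceilingUsd: number, fraction = 0.75) {
  let fired = false;
  let cumulative = 0;
  return (roundCostUsd: number): boolean => {
    cumulative += roundCostUsd;
    if (!fired && cumulative >= ceilingUsd * fraction) {
      fired = true;
      return true; // caller emits session.budget_warning exactly once
    }
    return false;
  };
}
```

Snapshotting the ceiling into meta.json (rather than re-reading the env var) is what keeps retroactive analysis decoupled from later configuration changes.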
v02.21.00 |
Cross-provider prompt caching across all 5 peers (OpenAI, Anthropic, Gemini, DeepSeek, Grok). Single coordinated ship that wires uniform prompt-caching telemetry through the runtime: each adapter parses provider-native cache fields (prompt_tokens_details.cached_tokens / cache_creation_input_tokens / cache_read_input_tokens / cachedContentTokenCount / prompt_cache_hit_tokens / prompt_cache_miss_tokens); orchestrator emits a canonical provider.cache.usage event; per-session cache_manifest.json is appended for every cached call. Anthropic uses EXPLICIT cache_control breakpoints on the system prompt (TTL 5m/1h). OpenAI uses pair-scoped prompt_cache_key + prompt_cache_retention (in_memory/24h). Grok mirrors OpenAI plus x-grok-conv-id header for cache-bucket scoping. DeepSeek parses auto-cache telemetry (no payload changes). Gemini parses implicit-cache telemetry only (explicit caches.create deferred). New src/core/prompt-parts.ts builds the canonical stablePrefix that always begins with cache_schema_version: vN and produces a sha256 hex hash invariant across rounds for the same case. New src/core/cache-manifest.ts persists per-session cache history with the same atomic-write retry pattern as meta.json. New rate cards in src/core/cache-rates.json populate CostEstimate.cache_savings_usd (or cache_savings_unknown when no rate matches). Operator can disable globally with CROSS_REVIEW_V2_DISABLE_CACHE=true; TTL via CROSS_REVIEW_V2_CACHE_TTL_ANTHROPIC / CROSS_REVIEW_V2_CACHE_TTL_OPENAI; schema bump via CROSS_REVIEW_V2_CACHE_SCHEMA_VERSION. 5 new smoke markers (cache_hash_invariance_test, cache_schema_version_in_prefix_test, cache_rates_json_loaded_test, cache_manifest_atomic_write_test, cache_disable_kill_switch_test). New docs/caching.md documents per-provider behavior matrix. Minor bump — public surface is additive; pre-v2.21 callers see no behavior change. |
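The stablePrefix hashing contract can be sketched with Node's crypto module. The field ordering inside the prefix is an assumption; the invariants are the ones stated above: the prefix always begins with `cache_schema_version: vN`, and the sha256 hex digest is identical across rounds for the same case.

```typescript
import { createHash } from "node:crypto";

// Illustrative prefix builder: schema version sentinel first, then the
// round-invariant parts (case id, system prompt).
export function buildStablePrefix(
  schemaVersion: number,
  caseId: string,
  systemPrompt: string,
): string {
  return `cache_schema_version: v${schemaVersion}\ncase: ${caseId}\n${systemPrompt}`;
}

export function prefixHash(prefix: string): string {
  return createHash("sha256").update(prefix, "utf8").digest("hex");
}
```

Hash invariance is what makes provider-side prompt caches actually hit: any drift in the prefix between rounds would silently turn every call into a cache miss.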
v02.18.08 |
Site sponsor card iteration. The site/index.html GitHub Sponsors iframe (cross-origin white box) was replaced with a dark-navy link card with a pink ❤, cyan meta text, and an animated arrow; the card moved to AFTER the buttons (lcv.dev/sponsor primary, GitHub Sponsors alternative). Companion ship Phase 3 (12 repos). |
v02.18.07 |
Patch — site/index.html visual identity refresh. GitHub Pages doc/sponsor page reskin to the new LCV org dark-first navy/cyan visual identity (palette #050b18/#38bdf8/#34d399, radial gradients, glow shadows, gradient text on h1). Coordinated companion ship with cross-review-v1 1.12.9, deepseek-cli 0.3.1, grok-cli 1.6.2, sponsor-motor APP v01.02.02, and .github-org/site (org root + /sponsor). No change to the published npm tarball (files[] does not include site/); only the GitHub Pages page changes. Patch bump (no public surface change). |
v02.18.06 |
Patch — Gemini API function-declaration compatibility for MCP tool inputSchemas. Gemini Code Assist forwards each MCP tool's inputSchema to the Gemini API as a function_declarations[*].parameters payload; the Gemini API's OpenAPI 3.0 subset rejects three patterns the SDK was emitting from the existing zod schemas, surfacing as 400 INVALID_ARGUMENT for every chat turn including cross-review-v2 tools. v2.18.6 cleans the offending zod usage. (1) additionalProperties: false removed from every MCP tool inputSchema (~28 tools) by dropping the .strict() chain; runtime accepts the same valid arguments because handlers consume only declared properties via destructuring. (2) caller field flattened from z.union([PeerSchema, z.literal("operator")]) (6 occurrences) to a single CallerSchema = z.enum([...PEERS, "operator"]), replacing the anyOf: [enum, const] shape with a clean single enum. (3) reasoning_effort_overrides refactored from z.record(PeerSchema, ReasoningEffortSchema).optional() to an explicit z.object({codex?, claude?, gemini?, deepseek?, grok?}).optional(), eliminating the non-OpenAPI propertyNames constraint and the spurious required: [<all 5 peers>] artifact that contradicted the field's .optional() declaration. No behavior change for any caller passing valid arguments — Claude Code, Codex CLI, Gemini Code Assist, Grok CLI and DeepSeek CLI continue invoking the same tools with the same keys. Lint/typecheck/format clean; smoke harness completes with ok: true / events: 96. Patch bump (public compatibility preserved; the only observable difference is that undeclared extra fields are now silently discarded instead of rejected with mcp_arg_validation_failed). |
v02.18.05 |
Patch — anti-drift smoke drivers for v2.18.4 audit closure (operator directive 2026-05-07). v2.18.4 shipped 6 surgical fixes from the Codex external audit; v2.18.5 hardens those fixes against silent regression with 5 anti-drift smoke checks (hono_override / abort_signal_threading / max_items_per_pass_default / clamp_effort_for_model / consensus_event_per_peer_attribution). P1.1: package.json overrides.hono === ">=4.12.16" + ip-address override retained. P1.3: ≥2 sites with signal?: AbortSignal param + signal: params.signal wiring + signal: input.signal autowire emission; consensus pass has no leftover signal: undefined. P1.4: source-level ?? "4" fallback + behavioral loadConfig() returns max_items_per_pass=4 (env unset). P2.1: behavioral clampEffortForModel("xhigh", "grok-4.3")="high"; passthrough on multi-agent; clamp wired at exactly 2 responses.create sites. P2.4: legacy judge_peer + new judge_peers array + per_peer_verdict map co-emitted at every this.emit({...}) event payload. clampEffortForModel is now exported from src/peers/grok.ts so the harness can verify directly. Companion to cross-review-v1 v1.12.7 (parallel ship, same operator directive). Smoke harness completes with ok: true / exit 0; lint/typecheck/format clean; npm audit --audit-level=moderate 0 vulnerabilities. Patch bump (additive — only new exports + new smoke markers; no runtime behavior change). |
v02.18.04 |
Patch — Codex external audit 2026-05-07 outcome: 6 surgical fixes (P1.1, P1.2, P1.3, P1.4, P2.1, P2.4). Codex submitted a read-only audit of cross-review-v2 v2.18.3 with 4 P1 + 7 P2 findings; this ship lands 6 verified-actionable items. P1.1: package.json adds "hono": ">=4.12.16" override clearing 2 npm-audit moderate advisories (GHSA-9vqf-7f2p-gf9v + GHSA-69xw-7hcm-h432) via @modelcontextprotocol/sdk transitive (practical exposure ~zero in stdio runtime, but audit-gate matters for publish + defense-in-depth; same precedent as v2.18.1 ip-address override). P1.2: src/security/redact.ts adds xai- API key pattern at parity with sk-/sk-ant-/AIza/etc; logs/sessions could previously leak xAI keys via persisted provider errors. P1.3: runEvidenceChecklistJudgeConsensusPass + runEvidenceChecklistJudgePass now thread AbortSignal through to judgeEvidenceAsk(context.signal) — pre-v2.18.4 the consensus path hardcoded signal: undefined and single-peer omitted the field, so session_cancel_job could not abort judges mid-flight. Autowire call sites pass input.signal from round scope. P1.4: lowered default CROSS_REVIEW_V2_EVIDENCE_JUDGE_MAX_ITEMS_PER_PASS from 8 → 4 — with default consensus_peers=4, worst-case round goes from 4×8=32 paid judge calls down to 4×4=16. Operators wanting prior behavior set env-var explicitly. P2.1: GROK_REASONING_EFFORT_MODELS allowlist expanded from {"grok-4.20-multi-agent"} to include "grok-4.3" per current xAI docs (verified via WebFetch 2026-05-07; xAI added grok-4.3 reasoning_effort support after v2.16.0 froze). New clampEffortForModel() narrows internal xhigh/minimal scale to high for grok-4.3 (which only accepts `none |
| v02.18.03 | Patch — Gemini default pin bump gemini-3.1-pro-preview → gemini-2.5-pro (operator preference 2026-05-07; coordinated with cross-review-v1 v1.12.4). Source-of-truth defaults flipped: src/core/config.ts models.gemini default → gemini-2.5-pro; src/peers/model-selection.ts priority list → ["gemini-2.5-pro", "gemini-3.1-pro-preview"] (3.1-pro-preview retained as fallback). Rationale: under the Google One AI Ultra subscription, gemini-2.5-pro carries a 1k requests/day quota vs gemini-3.1-pro-preview's 250 requests/day; post-bump empirical sessions (08cbc942, 1d5be5f2, 256ac7c9 — all 2026-05-07) confirm gemini-2.5-pro stable across the 5-peer panel without rate_limit blockers. The 7 LCV-workspace MCP host configs already flipped the CROSS_REVIEW_GEMINI_MODEL=gemini-2.5-pro env override on 2026-05-07; this ship aligns the source-of-truth defaults so a fresh install without an env override picks the same model. Workspace policy (operator directive 2026-05-07): only gemini-*-pro variants ≥ 2.5 are permitted — no *-flash and no models below 2.5. Smoke fixture scripts/smoke.ts:225 (currentOfficialModel iterator) flipped to gemini-2.5-pro. docs/api-keys.md env-var example + docs/model-selection.md priority documentation refreshed to match. Patch bump (no public surface change beyond the default model ID; behavior unchanged for env-override users). |
| v02.18.02 | Tier 5 — Windows process-tree introspection (coordinated with cross-review-v1 v1.12.2). Closes the long-standing forensics gap: pre-v2.18.2, getParentProcessSnapshot() returned parent_exe_basename: null on Windows because we only had a POSIX /proc/<ppid>/comm reader (Windows path deferred at F1 v2.18.0). v2.18.2 closes the gap with a defensive tasklist /FI "PID eq <ppid>" /FO CSV /NH reader via child_process.spawnSync (timeout: 500, windowsHide: true); the parser uses a leading-quote discriminator and the same 1 ≤ length < 128 sanity filter as POSIX. Best-effort try/catch swallows ENOENT, timeout, and parse failures. POSIX path unchanged. scripts/smoke.ts sub-test (14) extended with shape sanity, a Windows-specific populated-basename assertion, and source-level anti-drift guards. Forensics-only field — NOT used by the F1 token gate or the v2.17.0 clientInfo cross-check. Patch bump (no public surface change). |
| v02.18.00 | F1 caller capability tokens (coordinated with cross-review-v1 v1.11.0). Cryptographic identity proof that complements the v2.17.0 clientInfo gate. Pre-v2.18.0, the v2.17.0 cross-check between caller and clientInfo.name only catches inconsistent self-reports — both fields are declared by the caller. F1 introduces a per-host secret (env CROSS_REVIEW_CALLER_TOKEN), authoritative on match and rejected on mismatch. A new caller-tokens module exposes generation, loading, constant-time hex matching, env verification, and a best-effort parent-process snapshot for forensics (Option C / Hybrid). New MCP tool regenerate_caller_tokens rotates host-tokens.json. New env vars CROSS_REVIEW_CALLER_TOKEN, CROSS_REVIEW_TOKENS_FILE, CROSS_REVIEW_REQUIRE_TOKEN. A new caller_tokens block in server_info surfaces the gate state. verifyCallerIdentity extended with verification_method ("token" |
| v02.17.00 | HARD GATE — identity forgery rejection (operator directive 2026-05-05). Flagrant empirical evidence: cross-review-v2 session 0994cbaf was created by Codex with caller=claude (impersonation to self-exclude the real Claude from the panel). Pre-v2.17.0, v2 did not even capture clientInfo from the MCP initialize handshake — the caller was trusted unconditionally. v2.17.0 adds verifyCallerIdentity(declaredCaller, clientInfo), which cross-checks the declared caller against getCallerCandidatesFromClientInfo(clientInfo). Applied in all 6 caller-accepting handlers: session_init, ask_peers, session_start_round, run_until_unanimous, session_start_unanimous, contest_verdict (when new_caller is provided). Match → OK + identity_verified=true. clientInfo unknown → OK + identity_verified=false (legitimate override). caller="operator" → OK (no agent claim made). Mismatch OR multi-match clientInfo → throws identity_forgery_blocked. Smoke identity_forgery_blocked_test (6 sub-tests). Coordinated ship with cross-review-v1 v1.9.0. Minor bump because the public surface adds the identity_forgery_blocked error. Trilateral cross-review bypassed by operator directive (security fix to the gate itself; it would otherwise route through the compromised gate). |
| v02.16.00 | Tribunal protocol repair plus operational doctor. Separates petitioner/caller from relator metadata, applies self-recusal to direct ask_peers, adds read-only session_doctor, fixes Windows smoke teardown, and refreshes provider model guidance from official docs. |
| v02.15.01 | server_info consensus visibility hotfix. Exposes consensus_peers and configured_consensus_peers_raw for evidence-judge autowire so operators can audit the same configuration the dispatcher is using. |
| v02.15.00 | Backlog bundle for operational judge controls. Added consensus-based judge autowire, per-call reasoning-effort overrides, opt-in real-API smoke, provider 4xx docs hints, and a Grok reasoning-capability allowlist while exposing consensus toggles across the six MCP host configs. |
| v02.14.01 | Grok reasoning model hotfix. Switched the default Grok model to grok-4.20-multi-agent after real xAI verification and official docs showed reasoning.effort is accepted only on that model family. |
| v02.14.00 | Grok joins the tribunal. Expanded the peer set to five with Grok, added per-peer on/off env vars, precision-report groundwork, active evidence-judge autowire, contest_verdict, multi-peer judge consensus, attached-evidence prompt injection, and CodeQL-safe temp-directory handling. |
| v02.13.00 | Lead meta-review drift fix. Added explicit ship versus review session mode, lead drift detection, drift telemetry, and an abort gate so run_until_unanimous does not replace the artifact under review with a structured peer-review verdict. |
| v02.12.00 | Shadow judge observability. Turned on evidence-judge shadow-mode data collection, surfaced autowire config in server_info, added dashboard/runtime rollups, and codified the tribunal-colegiado model for caller, relator, peer votes, and contestation. |
| v02.11.00 | Relator lottery plus shadow auto-wire. Added automatic relator selection that excludes the caller and wired the v2.9 judge pass in shadow mode so self-review drift stops at the session structure. |
| v02.09.00 | LLM evidence-judge pass. Added an operator-triggered judge that evaluates open evidence asks against the current draft and promotes only verified satisfied items, leaving inferred/unknown cases open. |
| v02.08.00 | Per-peer health and Evidence Broker lifecycle. Added health rollups, evidence lifecycle tracking, resurfacing inference, dashboard surfaces, and the final architectural audit item on top of v2.7. |
| v02.07.00 | Evidence Broker. Added a persistent per-session evidence checklist that deduplicates NEEDS_EVIDENCE caller requests and injects outstanding asks into subsequent revision prompts. |
| v02.06.01 | Fallback/recovery budget hard gate. Replicated hard budget refusal to fallback and moderation-recovery paths so paid recovery calls cannot silently exceed the session cost ceiling. |
| v02.06.00 | Token-delta compaction plus v2.5 format hotfix bundle. Coalesced streaming token delta events to reduce events.ndjson noise and bundled the deferred Prettier/format fix from v2.5. |
| v02.05.00 | Evidence and budget hardening pass. Folded in operator-requested evidence/budget improvements plus empirical Codex/Gemini audit findings from historical session analysis. |
| v02.04.01 | CI stub fail-fast hotfix. Fixed import-time server startup so the smoke harness can import MCP schemas while CROSS_REVIEW_V2_STUB=1 is set in CI with explicit confirmation. |
| v02.04.00 | Audit-closure hardening pass. Closed internal v2.3.3 technical-opinion priorities with additive public-surface hardening and several explicitly documented behavior changes. |
| v02.03.03 | Prompt shielding and financial safety. Wrapped review_focus in escaped delimiters, blocked paid calls until financial controls are configured, expanded server_info financial diagnostics, and hardened MCP IDs, sweeps, jobs, and recovery cost alerts. |
| v02.03.02 | CI-green README/docs cleanup. Reissued README organizational standardization under the repository Prettier policy and completed active-document rename cleanup in NOTICE and CODE_OF_CONDUCT.md. |
| v02.03.01 | README organizational standardization. Adopted the shared LCV README opening while preserving the API-first runtime, model-selection, streaming, and observability sections. |
| v02.03.00 | Provider-neutral review_focus. Added focus support across session tools, persisted focus metadata, injected bounded focus blocks into generation/review/retry prompts, and aligned auto-tag/publish automation with the stable package line. |
| v02.02.00 | Provider token streaming. Added real token streaming for OpenAI, Anthropic, Gemini, and DeepSeek, with count-based progress events, runtime controls, and text-redaction defaults for persisted event logs. |
v02.01.01 |
CodeQL and model-selection hardening. Fixed secret-redaction ReDoS and dashboard log-injection alerts, added decision retry for empty peer output, max-output-token controls, stronger model selection, and improved thinking controls. |
v02.01.00 |
First stable cross-review-v2 release. Promoted the API-first implementation to stable with cancellation, restart recovery, metrics, runtime capabilities, prompt compaction, budget preflight, model fallback, and stable naming. |
v02.00.04 |
Session event race hotfix. Removed the CodeQL file-system race in events.ndjson persistence by appending under the session lock. |
v02.00.03 |
Background sessions and durable reports. Added background MCP tools, durable events and reports, peer decision-quality tracking, generation accounting, provider cost rates, budget guard, moderation-safe retry, and dashboard event/report APIs. |
v02.00.02 |
Publishing and dashboard sanitization. Normalized npm dist-tags, replaced the sponsor landing with the SumUp support page, sanitized dashboard 500 responses, and bumped the alpha runtime. |
v02.00.01 |
Public npm/package metadata alignment. Enforced public npm visibility, added registry visibility checks, aligned funding metadata, normalized repository.url, and bumped the alpha runtime. |
v02.00.00 |
Development package line hardening. Added parser format recovery, convergence metadata, shared MCP timeout/runtime smoke, auto-tag/release publishing, padded public tags, prepack clean builds, ignore-rule hardening, and quorum preservation. |
v2.0.0-alpha.2 |
Durable session recovery alpha. Added in-flight metadata, convergence health, evidence attachment, operator escalation, session sweep, convergence inspection, silent-model-downgrade failures, and smoke coverage for the new surfaces. |
v2.0.0-alpha.1 |
Model attestation and store hardening alpha. Added reported-model tracking, failed-attempt aggregation, recovery hints, atomic/locked session writes, UUID path hardening, safer probes, self-review prevention, English peer prompts, and expanded redaction. |
v2.0.0-alpha.0 |
Initial API/SDK-only MCP server. Introduced official SDK adapters for OpenAI, Anthropic, Gemini, and DeepSeek, runtime model discovery, best-model selection, and a durable local session store. |
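The constant-time hex matching that the v02.18.00 caller-token entry mentions can be sketched with Node's `crypto.timingSafeEqual`. This is an illustrative sketch only: the actual caller-tokens module's API is not shown in this README, so the function name and signature are assumptions.

```typescript
import { timingSafeEqual } from "node:crypto";

// Hypothetical helper; the real module's names may differ.
function tokensMatchConstantTime(expectedHex: string, presentedHex: string): boolean {
  // A length mismatch may short-circuit: it reveals only the token length,
  // which is fixed per deployment, not the token value itself.
  if (expectedHex.length !== presentedHex.length) return false;
  const expected = Buffer.from(expectedHex, "hex");
  const presented = Buffer.from(presentedHex, "hex");
  if (expected.length !== presented.length) return false;
  // timingSafeEqual compares in constant time, defeating byte-by-byte timing probes.
  return timingSafeEqual(expected, presented);
}
```

The constant-time comparison matters because a naive `===` on hex strings can leak, via response timing, how many leading bytes of a guessed token are correct.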
cross-review-v2 is the stable API-first implementation of the cross-review
pattern. It orchestrates provider API clients (OpenAI/Codex, Anthropic/Claude,
Google Gemini, DeepSeek, and xAI/Grok) and provides an MCP-compatible server
surface.
Runtime calls are real provider calls by default. Stubs exist only for smoke
tests and CI when CROSS_REVIEW_V2_STUB=1.
# Set API keys (PowerShell example)
[Environment]::SetEnvironmentVariable("OPENAI_API_KEY", "<OPENAI_API_KEY>", "User")
[Environment]::SetEnvironmentVariable("ANTHROPIC_API_KEY", "<ANTHROPIC_API_KEY>", "User")
[Environment]::SetEnvironmentVariable("GEMINI_API_KEY", "<GEMINI_API_KEY>", "User")
[Environment]::SetEnvironmentVariable("DEEPSEEK_API_KEY", "<DEEPSEEK_API_KEY>", "User")
[Environment]::SetEnvironmentVariable("GROK_API_KEY", "<GROK_API_KEY>", "User")
Restart your terminal after changing environment variables.
Build and run locally:
npm install
npm run build
node dist/src/mcp/server.js
For local smoke tests (no-cost):
$env:CROSS_REVIEW_V2_STUB = "1"
npm test
Model selection and runtime behaviour can be controlled with environment variables. Example overrides (PowerShell):
[Environment]::SetEnvironmentVariable("CROSS_REVIEW_OPENAI_MODEL", "gpt-5.5", "User")
[Environment]::SetEnvironmentVariable("CROSS_REVIEW_OPENAI_REASONING_EFFORT", "xhigh", "User")
[Environment]::SetEnvironmentVariable("CROSS_REVIEW_GROK_MODEL", "grok-4.20-multi-agent", "User")
[Environment]::SetEnvironmentVariable("CROSS_REVIEW_GROK_REASONING_EFFORT", "xhigh", "User")
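The resolution order these overrides imply (explicit env var first, then a per-provider priority list, as the v02.18.03 entry describes for Gemini) can be sketched as follows. The helper name and the fall-back-to-head behavior are illustrative assumptions, not the actual src/peers/model-selection.ts implementation.

```typescript
// Priority list taken from the v02.18.03 changelog entry.
const GEMINI_PRIORITY = ["gemini-2.5-pro", "gemini-3.1-pro-preview"];

function resolveGeminiModel(
  env: Record<string, string | undefined>,
  availableModels: string[],
): string {
  // An explicit env-var override always wins over the priority list.
  const override = env["CROSS_REVIEW_GEMINI_MODEL"];
  if (override) return override;
  // Otherwise pick the first priority entry the provider actually reports.
  for (const model of GEMINI_PRIORITY) {
    if (availableModels.includes(model)) return model;
  }
  // If discovery returned nothing usable, fall back to the preferred default.
  return GEMINI_PRIORITY[0];
}
```

Under this scheme, a fresh install with no env override and normal model discovery lands on gemini-2.5-pro, matching the pinned default.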
For Grok, GROK_API_KEY is canonical. grok-4-latest, grok-4.3,
grok-4.20, and grok-4.20-reasoning use xAI automatic reasoning without an explicit
reasoning.effort field. grok-4.20-multi-agent accepts explicit
reasoning.effort; low/medium select 4 agents and high/xhigh select
16 agents.
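The effort-to-agent-count mapping above for grok-4.20-multi-agent can be stated as a tiny function; the function name is an illustrative assumption, not part of the xAI API or this package's surface.

```typescript
type GrokEffort = "low" | "medium" | "high" | "xhigh";

// low/medium select 4 agents; high/xhigh select 16, per the mapping above.
function agentsForEffort(effort: GrokEffort): number {
  return effort === "low" || effort === "medium" ? 4 : 16;
}
```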
Financial and budget controls are required for paid provider calls. Configure these environment variables before running real sessions (example):
[Environment]::SetEnvironmentVariable("CROSS_REVIEW_V2_MAX_SESSION_COST_USD", "20", "User")
[Environment]::SetEnvironmentVariable("CROSS_REVIEW_V2_PREFLIGHT_MAX_ROUND_COST_USD", "20", "User")
[Environment]::SetEnvironmentVariable("CROSS_REVIEW_V2_UNTIL_STOPPED_MAX_COST_USD", "20", "User")
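A hard budget gate shaped by these ceilings (the v02.03.03 and v02.06.01 entries describe refusing paid calls past the configured limits) might look like the sketch below. The interface and function are hypothetical illustrations, not the server's actual internals.

```typescript
interface BudgetConfig {
  maxSessionCostUsd: number; // CROSS_REVIEW_V2_MAX_SESSION_COST_USD
  preflightMaxRoundCostUsd: number; // CROSS_REVIEW_V2_PREFLIGHT_MAX_ROUND_COST_USD
}

function preflightAllows(
  cfg: BudgetConfig,
  spentSoFarUsd: number,
  estimatedRoundCostUsd: number,
): boolean {
  // Refuse if the round alone would exceed its own ceiling...
  if (estimatedRoundCostUsd > cfg.preflightMaxRoundCostUsd) return false;
  // ...or if completing it would push the session past its total ceiling.
  return spentSoFarUsd + estimatedRoundCostUsd <= cfg.maxSessionCostUsd;
}
```

Checking the estimate before dispatch is what makes the ceiling a hard gate rather than an after-the-fact alert.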
MCP tools: server_info, runtime_capabilities, probe_peers, session_init, session_list, session_read, ask_peers, session_start_round, run_until_unanimous, session_start_unanimous, session_cancel_job, session_recover_interrupted, session_poll, session_events, session_metrics, session_doctor, session_report, session_check_convergence, session_attach_evidence, escalate_to_operator, session_sweep, session_finalize.

Apache License 2.0 — see LICENSE and NOTICE.
Copyright 2026 Leonardo Cardozo Vargas.
© LCV Ideas & Software
LEONARDO CARDOZO VARGAS TECNOLOGIA DA INFORMACAO LTDA
Rua Pais Leme, 215 Conj 1713 - Pinheiros
São Paulo - SP
CEP 05.424-150
CNPJ: 66.584.678/0001-77
IM 05.424-150
To register the server with Claude, run in the terminal:
claude mcp add cross-review-v2 -- npx -y @lcv-ideas-software/cross-review-v2