vaara-governance

Runtime tool-call governance for Claude Code. PreToolUse runs a two-layer check on Bash, WebFetch, WebSearch, and MCP calls (regex deny patterns on shell/web; conformal risk scorer on MCP). PostToolUse writes a hash-chained SQLite audit record. SessionStart validates the install and prints version, mode, and audit DB path.

Vaara

Vaara is the runtime evidence layer for AI Act compliance. Open source, no SaaS, no telemetry.

Vaara intercepts agent tool calls, scores each one with a conformal risk interval, and writes a hash-chained audit record. Online learning across five expert signals via Multiplicative Weight Update. Distribution-free conformal coverage on the score. An external auditor can verify these properties without trusting your stack. Orchestration toolkits and identity layers (Microsoft Agent Governance Toolkit, others) sit on top.

Numbers

Held-out TEST recall 84.7% (95% Wilson [82.4, 86.7]) at FPR 4.1% [2.9, 5.7]. Phase 1 PAIR scale-up to n=300 per attacker family lands at 88.1% [85.8, 90.1]. Under BIPIA-pressure context, false-positive rate on benign tool calls 1.2% [0.4, 3.6] across four agent backends (Claude Haiku 4.5, Llama-3.1-8B, Mistral-7B, Qwen-2.5-7B). Multi-attacker PAIR ASR 0/25 across three different attacker models with identical seeds. 140 µs mean / 210 µs p99 inference latency on commodity CPU (excluding one-time embedding model load). Every number reproducible end-to-end via make bench.

12,155-entry adversarial corpus (250 hand-curated + 11,905 LLM-generated), 70/15/15 split stratified by (category, source)
Classifier v9 with 236 hand-features + 384-dim MiniLM embeddings at calibrated threshold 0.9150 on held-out TEST n=1,827: recall 84.7% [82.4, 86.7] at FPR 4.1% [2.9, 5.7]
Multi-attacker PAIR robustness: 0/25 successes per attacker across Qwen2.5-32B, Qwen2.5-72B, Llama-3.3-70B hitting identical seed indices, Wilson upper 13.3%
BIPIA-pressure FPR on benign tool calls 1.2% [0.4, 3.6] across four agent backends, n=244 benign tool calls under context.source=injected_via_bipia_<class>
Chain of custody: corpus manifest SHA → split manifest SHA → training commit → bundle SHA, all locked and printed by every script
140 µs mean / 210 µs p99 inference latency, commodity CPU
Distribution-free conformal coverage on the score
MWU regret bound O(sqrt(T log N))
vaara-bench-v0.39: current methodology, chain of custody, ship-gate record. v9 retrain on BIPIA-augmented corpus with follows upweighted (--follow-weight 8.0), calibrated to T=0.9150 at a 5% FPR target on v035 VAL. BIPIA-pressure FPR collapses from 35.2% on v8 to 1.2% on v9. In-distribution recall flat within Wilson intervals. Found-and-fixed in tree: auto-labeller example.com placeholder false-positive rule (42 → 14 true follows across four backends). Historical bench docs live under bench/ for chain-of-custody continuity.
vaara-bench-v1: 77-trace synthetic-corpus regression baseline with frozen methodology, 100% soft TPR, 0% hard FPR

Each figure is reproducible from the public corpus or the bench pipeline in bench/.

Install

pip install vaara

Python 3.10+. Zero runtime deps. Optional XGBoost classifier: pip install vaara[ml].

Releases ship with SLSA Build Level 3 provenance. Verify with slsa-verifier verify-artifact.

Quick start

from vaara.pipeline import InterceptionPipeline

pipeline = InterceptionPipeline()
result = pipeline.intercept(
    agent_id="agent-007",
    tool_name="fs.write_file",
    parameters={"path": "/etc/service.yaml", "content": "..."},
    agent_confidence=0.8,
)
if result.allowed:
    pipeline.report_outcome(result.action_id, outcome_severity=0.0)
else:
    print(result.reason)

report_outcome closes the loop. MWU reweights signals based on which ones predicted the outcome.

vaara-governance

Popularity

What's Inside

README

Numbers

Install

Quick start

What evidence looks like

Confidence

Similar Plugins

context-mode

pro-workflow

claude-token-reducer