Reasoning agent for mathematics and physics with Generate-Verify-Revise loop, multi-verifier consensus, proof auditing, plus scientific figure generation
npx claudepluginhub hyperion-git/alethic
Reasoning agent for mathematics and physics with Generate-Verify-Revise loop (inspired by DeepMind's Aletheia), textbook-style proof/derivation converter, plus publication-quality scientific figure generation
A reasoning agent for mathematics and physics inspired by Google DeepMind's Aletheia, built on Claude (Opus 4.6). Alethic implements a Generate-Verify-Revise loop with decoupled verification — a key architectural insight from DeepMind's design — to produce rigorous mathematical proofs and physics derivations with high confidence.
Available as Claude Code skills (/alethic-solve for math, /alethic-derive for physics, /alethic-scientific-figure for scientific figures) or as a standalone Python library with CLI.
In February 2026, Google DeepMind introduced Aletheia, a multi-agent system that achieved 95% accuracy on IMO-ProofBench Advanced and autonomously resolved open Erdős conjectures. The system's central innovation lies in its separation of solution generation from solution verification: by preventing the verifier from observing the generator's intermediate reasoning traces, Aletheia avoids the confidence inflation that arises when a model evaluates its own chain of thought.
When a verifier has access to the generator's internal reasoning, it tends to follow the same logical path and confirm flawed steps with unwarranted certainty. Decoupling forces the verifier to reconstruct and independently assess each argument from the final output alone.
Alethic translates this decoupled verification approach to Claude's API. The project implements the same three-subagent loop — Generator, Verifier, and Reviser — with each role instantiated as an independent API call (in the Python library) or a separate Task sub-agent with a fresh context window (in the Claude Code skill). The orchestrator logic is domain-neutral; only the prompt templates differ between math (MathAgent, /alethic-solve) and physics (PhysicsAgent, /alethic-derive). The result is a system that can solve mathematical problems and derive physics results with verified confidence, or honestly admit failure when it cannot.
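As a rough illustration of what "domain-neutral orchestrator, domain-specific prompts" could look like, here is a minimal sketch. The names `PromptSet`, `MATH_PROMPTS`, `PHYSICS_PROMPTS`, and `build_prompt`, as well as the template wording, are hypothetical and not taken from the Alethic codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptSet:
    """Hypothetical container holding the three role prompts for one domain."""
    generator: str
    verifier: str
    reviser: str


# Illustrative template text only; the project's real prompts are not shown here.
MATH_PROMPTS = PromptSet(
    generator="Prove the following statement rigorously:\n{problem}",
    verifier="Check this proof.\nProblem: {problem}\nProof: {solution}",
    reviser=(
        "Revise the proof.\nProblem: {problem}\n"
        "Proof: {solution}\nCritique: {critique}"
    ),
)

PHYSICS_PROMPTS = PromptSet(
    generator="Derive the following result step by step:\n{problem}",
    verifier="Check this derivation.\nProblem: {problem}\nDerivation: {solution}",
    reviser=(
        "Revise the derivation.\nProblem: {problem}\n"
        "Derivation: {solution}\nCritique: {critique}"
    ),
)


def build_prompt(prompts: PromptSet, role: str, **fields: str) -> str:
    """Fill the template for one role; the same code path serves every domain."""
    template = getattr(prompts, role)
    return template.format(**fields)
```

Under this assumption, switching from math to physics means passing a different `PromptSet`; the orchestration code itself does not change.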
Alethic's reasoning loop proceeds through three distinct phases that repeat until the solution is verified or the iteration budget is exhausted.
┌─────────────────────────────────────────────────────────┐
│                    Orchestrator Loop                    │
│                                                         │
│  ┌───────────┐    ┌──────────┐    ┌──────────┐          │
│  │ Generator │───▶│ Verifier │───▶│ Reviser  │──┐       │
│  │  (T=1.0)  │    │ (T=0.2)  │    │ (T=0.7)  │  │       │
│  └───────────┘    └──────────┘    └──────────┘  │       │
│      ▲                                          │       │
│      └──────────────────────────────────────────┘       │
│                                                         │
│  Terminates when: CORRECT (≥ threshold) OR max iters    │
└─────────────────────────────────────────────────────────┘
The Generator produces a candidate solution at high temperature (T=1.0) to encourage creative exploration of proof strategies. The Verifier then evaluates that solution at low temperature (T=0.2) for strict, deterministic assessment. Critically, the Verifier receives only the problem statement and the final written solution — never the Generator's thinking traces, tool outputs, or intermediate reasoning. If the Verifier identifies issues, the Reviser receives both the solution and the Verifier's structured critique, producing an improved version at moderate temperature (T=0.7) that balances faithfulness to the original with the flexibility to restructure flawed arguments.
The loop terminates under one of three conditions: the Verifier issues a CORRECT verdict with confidence at or above the configured threshold (default 90%); the maximum number of iterations is reached, in which case Alethic admits failure instead of claiming a verified result; or the Verifier detects that the problem's premise is false and halts early with an explanation.
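The control flow described above can be sketched as follows. This is a simplified sketch under stated assumptions, not Alethic's actual implementation: the `Verdict` and `Result` types, the verdict labels other than CORRECT, the `solve` signature, and the default of five iterations are all hypothetical. The three role functions are passed in as plain callables so the loop can run without any API access.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Temperatures as stated in the text above.
TEMPERATURE = {"generator": 1.0, "verifier": 0.2, "reviser": 0.7}


@dataclass
class Verdict:
    label: str         # "CORRECT", "INCORRECT", or "FALSE_PREMISE" (hypothetical labels)
    confidence: float  # 0.0 to 1.0
    critique: str = ""


@dataclass
class Result:
    status: str              # "verified", "false_premise", or "failed"
    solution: Optional[str]  # None unless verified (an assumption of this sketch)
    iterations: int
    detail: str = ""


def solve(
    problem: str,
    generate: Callable[[str, float], str],
    verify: Callable[[str, str, float], Verdict],
    revise: Callable[[str, str, str, float], str],
    threshold: float = 0.90,  # default threshold from the text
    max_iters: int = 5,       # assumed budget; the text does not give a default
) -> Result:
    solution = generate(problem, TEMPERATURE["generator"])
    for i in range(1, max_iters + 1):
        # The verifier is given only the problem and the written solution.
        verdict = verify(problem, solution, TEMPERATURE["verifier"])
        if verdict.label == "FALSE_PREMISE":
            return Result("false_premise", None, i, verdict.critique)
        if verdict.label == "CORRECT" and verdict.confidence >= threshold:
            return Result("verified", solution, i)
        if i < max_iters:
            # The reviser sees the solution plus the verifier's critique.
            solution = revise(
                problem, solution, verdict.critique, TEMPERATURE["reviser"]
            )
    return Result("failed", None, max_iters, "iteration budget exhausted")
```

Note that a CORRECT verdict below the threshold is treated like any other unverified result and triggers another revision round.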
The following sequence diagram illustrates the critical decoupling boundary. The Generator's internal reasoning — thinking traces, tool call results, intermediate drafts — never crosses to the Verifier. Only the final solution text is passed, forcing the Verifier to evaluate the argument on its own merits.
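In code terms, the same boundary can be expressed as a narrowing of types. The sketch below is only an illustration: `GeneratorOutput`, `VerifierInput`, and `cross_boundary` are hypothetical names, and the real project may carry this data differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneratorOutput:
    """Everything the generator produced, including private material."""
    solution: str
    thinking: str = ""
    tool_outputs: List[str] = field(default_factory=list)
    drafts: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class VerifierInput:
    """The only fields allowed across the decoupling boundary."""
    problem: str
    solution: str


def cross_boundary(problem: str, out: GeneratorOutput) -> VerifierInput:
    # Deliberately drop thinking, tool outputs, and drafts: the verifier
    # must assess the argument from the final text alone.
    return VerifierInput(problem=problem, solution=out.solution)
```

Because `VerifierInput` has no field for reasoning traces, a verifier prompt built from it cannot leak them by accident.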