Alethic

A reasoning agent for mathematics and physics inspired by Google DeepMind's Aletheia, built on Claude (Opus 4.6). Alethic implements a Generate-Verify-Revise loop with decoupled verification — a key architectural insight from DeepMind's design — to produce rigorous mathematical proofs and physics derivations with high confidence.

Available as Claude Code skills (/alethic-solve for math, /alethic-derive for physics, /alethic-scientific-figure for scientific figures) or as a standalone Python library with CLI.

Background and Motivation

In February 2026, Google DeepMind introduced Aletheia, a multi-agent system that achieved 95% accuracy on IMO-ProofBench Advanced and autonomously resolved open Erdős conjectures. The system's central innovation lies in its separation of solution generation from solution verification: by preventing the verifier from observing the generator's intermediate reasoning traces, Aletheia avoids the confidence inflation that arises when a model evaluates its own chain of thought.

When a verifier has access to the generator's internal reasoning, it tends to follow the same logical path and confirm flawed steps with unwarranted certainty. Decoupling forces the verifier to reconstruct and independently assess each argument from the final output alone.

Alethic translates this decoupled verification approach to Claude's API. The project implements the same three-subagent loop — Generator, Verifier, and Reviser — with each role instantiated as an independent API call (in the Python library) or a separate Task sub-agent with a fresh context window (in the Claude Code skill). The orchestrator logic is domain-neutral; only the prompt templates differ between math (MathAgent, /alethic-solve) and physics (PhysicsAgent, /alethic-derive). The result is a system that can solve mathematical problems and derive physics results with verified confidence, or honestly admit failure when it cannot.

Key References

T. Shoeybi et al., "Towards Autonomous Mathematics Research," arXiv:2602.10177 (Feb 2026). The primary Aletheia paper describing the Generate-Verify-Revise architecture.

J. Huang et al., "Accelerating Scientific Research with Gemini," arXiv:2602.03837 (Feb 2026). Companion paper on Gemini's scientific reasoning capabilities.

R. Anil et al., "Semi-Autonomous Mathematics Discovery," arXiv:2601.22401 (Jan 2026). The Erdős conjecture study demonstrating autonomous mathematical discovery.

Architecture

Alethic's reasoning loop proceeds through three distinct phases that repeat until the solution is verified or the iteration budget is exhausted.

┌─────────────────────────────────────────────────────────┐
│                    Orchestrator Loop                    │
│                                                         │
│   ┌───────────┐    ┌──────────┐    ┌──────────┐         │
│   │ Generator │───▶│ Verifier │───▶│ Reviser  │──┐      │
│   │  (T=1.0)  │    │ (T=0.2)  │    │ (T=0.7)  │  │      │
│   └───────────┘    └──────────┘    └──────────┘  │      │
│         ▲                                        │      │
│         └────────────────────────────────────────┘      │
│                                                         │
│   Terminates when: CORRECT (≥ threshold) OR max iters   │
└─────────────────────────────────────────────────────────┘

The Generator produces a candidate solution at high temperature (T=1.0) to encourage creative exploration of proof strategies. The Verifier then evaluates that solution at low temperature (T=0.2), which keeps its assessment strict and more repeatable across runs. Critically, the Verifier receives only the problem statement and the final written solution, never the Generator's thinking traces, tool outputs, or intermediate reasoning. If the Verifier identifies issues, the Reviser receives both the solution and the Verifier's structured critique and produces an improved version at moderate temperature (T=0.7), balancing faithfulness to the original with the flexibility to restructure flawed arguments.
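The role split described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Alethic's actual code: the `Role` dataclass, the `ModelFn` callable, the placeholder prompts, and `revision_cycle` are all hypothetical names, and passing the problem statement to the Reviser is an assumption. Only the three temperatures and the data each role receives come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical model interface: (system_prompt, user_message, temperature) -> reply.
# In practice this would wrap one independent API call per role.
ModelFn = Callable[[str, str, float], str]


@dataclass(frozen=True)
class Role:
    name: str
    temperature: float
    system_prompt: str  # placeholder text, not Alethic's real prompt templates


# Temperatures are taken from the architecture description.
GENERATOR = Role("generator", 1.0, "Propose a complete solution.")
VERIFIER = Role("verifier", 0.2, "Assess the solution strictly.")
REVISER = Role("reviser", 0.7, "Repair the solution using the critique.")


def run_role(role: Role, message: str, model: ModelFn) -> str:
    """Invoke the model once with the role's own prompt and temperature."""
    return model(role.system_prompt, message, role.temperature)


def revision_cycle(problem: str, model: ModelFn) -> Tuple[str, str, str]:
    """One generate -> verify -> revise pass, assuming the Verifier found issues."""
    solution = run_role(GENERATOR, problem, model)
    # The Verifier sees the problem and the final solution text only.
    critique = run_role(
        VERIFIER, f"Problem:\n{problem}\n\nSolution:\n{solution}", model
    )
    # The Reviser sees the solution together with the Verifier's critique.
    revised = run_role(
        REVISER,
        f"Problem:\n{problem}\n\nSolution:\n{solution}\n\nCritique:\n{critique}",
        model,
    )
    return solution, critique, revised
```

Because the model is injected as a plain callable, the same cycle can be exercised with a stub function in tests and with a real client in production.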

The loop terminates under one of three conditions: the Verifier issues a CORRECT verdict with confidence at or above the configured threshold (default 90%); the maximum number of iterations is reached, in which case Alethic admits failure rather than returning an unverified solution; or the Verifier detects that the problem's premise is false and halts early with an explanation.
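The three stopping conditions can be written as a single decision function. The sketch below is illustrative only: `Verdict`, `Stop`, `Review`, and `stop_reason` are hypothetical names, confidence is assumed to be a fraction in [0, 1], and `iteration` is assumed to count completed iterations starting from 1. The 0.90 default mirrors the 90% threshold stated above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    CORRECT = "correct"
    INCORRECT = "incorrect"
    FALSE_PREMISE = "false_premise"


class Stop(Enum):
    VERIFIED = "verified"
    BUDGET_EXHAUSTED = "budget_exhausted"
    FALSE_PREMISE = "false_premise"


@dataclass(frozen=True)
class Review:
    verdict: Verdict
    confidence: float  # assumed to lie in [0, 1]


def stop_reason(
    review: Review,
    iteration: int,
    max_iterations: int,
    threshold: float = 0.90,
) -> Optional[Stop]:
    """Return why the loop should stop, or None if it should keep revising."""
    # A false premise halts immediately, regardless of remaining budget.
    if review.verdict is Verdict.FALSE_PREMISE:
        return Stop.FALSE_PREMISE
    # Success requires both a CORRECT verdict and sufficient confidence.
    if review.verdict is Verdict.CORRECT and review.confidence >= threshold:
        return Stop.VERIFIED
    # Otherwise stop only once the iteration budget is used up.
    if iteration >= max_iterations:
        return Stop.BUDGET_EXHAUSTED
    return None
```

Checking verification before the budget means a solution accepted on the final iteration still counts as verified rather than as a failure.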

Information Flow and Decoupling

The following sequence diagram illustrates the critical decoupling boundary. The Generator's internal reasoning — thinking traces, tool call results, intermediate drafts — never crosses to the Verifier. Only the final solution text is passed, forcing the Verifier to evaluate the argument on its own merits.
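In code, the boundary amounts to building the Verifier's input from a deliberately narrow view of the Generator's output. The following is a minimal sketch, assuming a hypothetical `GeneratorOutput` container and `verifier_input` helper; neither name is taken from the Alethic codebase.

```python
from dataclasses import dataclass, field


@dataclass
class GeneratorOutput:
    """Everything the Generator produced (hypothetical container)."""
    final_solution: str
    thinking: str = ""                                      # reasoning traces
    tool_outputs: list[str] = field(default_factory=list)   # tool call results
    drafts: list[str] = field(default_factory=list)         # intermediate drafts


def verifier_input(problem: str, output: GeneratorOutput) -> str:
    """Build the Verifier's message from the problem and final solution only.

    The thinking, tool_outputs, and drafts fields are intentionally never
    read here, so they cannot leak across the decoupling boundary.
    """
    return f"Problem:\n{problem}\n\nProposed solution:\n{output.final_solution}"
```

Keeping this projection in one small function makes the boundary easy to audit: any change that exposes more of the Generator's state to the Verifier has to pass through it.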
