FEP/World-Models scaffolding for stable LLM coding agents: predict-before-act, inspectable belief state, and surprise-triggered reflection.
belief-reviser: Perceptual-update role. Given a surprise report (prediction vs. actual observation) and the current beliefs, revises the belief entries for the task. Has NO filesystem or shell access: it can only read predictions and write beliefs via the belief-store MCP server.
cartographer: Builds the initial task card (z_0) for a coding task by mapping the repo, identifying in-scope files, and enumerating relevant tests. Invoke at task start or after a major scope change. Read-only; does not modify files.
dreamer: Counterfactual prediction for a proposed action. Given a task card and a candidate action (diff, command, or tool call), predicts the next observation: which tests will pass or fail, what errors appear, and the blast radius. Read-only; does NOT execute actions.
hierarchy-arbiter: Decides whether a string of unresolved prediction errors should stay at the edit level (keep trying), escalate to re-planning, or escalate to asking the user. Pure judgment: no file, shell, or belief-write access.
policy-selector: Scores a set of candidate next actions by Expected Free Energy (EFE), the sum of pragmatic value (goal progress) and epistemic value (information gain), using the current belief state and past similar episodes. Returns a ranked list with rationale. Has NO filesystem access; only calls belief-store MCP tools.
/checkpoint: Snapshot the current belief state (and optionally git-stash the working tree) under a label, so a risky action can be rolled back if it produces high surprise or breaks acceptance tests.
/plan-efe: Given a goal and a set of candidate next actions, rank them by Expected Free Energy (pragmatic + epistemic value) using past similar episodes, and pick the winner. Delegates scoring to the policy-selector subagent.
/predict: Record an explicit predicted observation before a side-effecting tool call. Use whenever you are about to run a command, edit a file, or invoke any tool whose outcome you have not already observed. Upholds the predict-before-act invariant.
/probe: Run a cheap, reversible real probe to ground a belief instead of hallucinating. Use when dreamer confidence is low or when you catch yourself guessing about repository state.
/reflect: Surprise-triggered reflection. When a PostToolUse hook reports high prediction error (surprise >= 0.6), invoke /reflect to update beliefs via the belief-reviser and decide whether to continue, replan, reclarify, or ask the user.
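The EFE-ranking descriptions above (pragmatic plus epistemic value, scored against past similar episodes) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Episode` shape, the word-overlap similarity, the proxies for pragmatic and epistemic value, and the `epistemicWeight` parameter are hypothetical and are not taken from the plugin. The sketch also treats a higher score as better, which is a simplification and not a free-energy computation.

```typescript
// Hypothetical record of a past, already-reconciled prediction.
interface Episode {
  action: string;      // free-text description of what was tried
  succeeded: boolean;  // did it advance the goal?
  surprise: number;    // prediction error in [0, 1]
}

interface ScoredAction {
  action: string;
  pragmatic: number;   // assumed proxy: success rate of similar episodes
  epistemic: number;   // assumed proxy: mean surprise of similar episodes
  score: number;
}

function wordSet(text: string): string[] {
  const words = text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  return words.filter((w, i) => words.indexOf(w) === i); // drop duplicates
}

// Jaccard similarity over word sets: shared words divided by all distinct words.
function similarity(a: string, b: string): number {
  const wa = wordSet(a);
  const wb = wordSet(b);
  const shared = wa.filter((w) => wb.indexOf(w) >= 0).length;
  const union = wa.length + wb.length - shared;
  return union === 0 ? 0 : shared / union;
}

// k nearest past episodes, ignoring episodes with no overlap at all.
function recallSimilar(action: string, past: Episode[], k: number): Episode[] {
  return past
    .map((episode) => ({ episode, sim: similarity(action, episode.action) }))
    .filter((entry) => entry.sim > 0)
    .sort((x, y) => y.sim - x.sim)
    .slice(0, k)
    .map((entry) => entry.episode);
}

function scoreAction(action: string, past: Episode[], k: number, epistemicWeight: number): ScoredAction {
  const near = recallSimilar(action, past, k);
  // With no similar history, assume an even chance of success and maximal information gain.
  let pragmatic = 0.5;
  let epistemic = 1;
  if (near.length > 0) {
    pragmatic = near.filter((e) => e.succeeded).length / near.length;
    epistemic = near.reduce((sum, e) => sum + e.surprise, 0) / near.length;
  }
  return { action, pragmatic, epistemic, score: pragmatic + epistemicWeight * epistemic };
}

// Highest score first; the first element is the winner.
function rankActions(candidates: string[], past: Episode[], k = 3, epistemicWeight = 0.5): ScoredAction[] {
  return candidates
    .map((c) => scoreAction(c, past, k, epistemicWeight))
    .sort((x, y) => y.score - x.score);
}

const pastEpisodes: Episode[] = [
  { action: "run unit tests", succeeded: true, surprise: 0.1 },
  { action: "run unit tests verbose", succeeded: true, surprise: 0.2 },
  { action: "rewrite parser module", succeeded: false, surprise: 0.9 },
];
const ranked = rankActions(["rewrite parser module", "run unit tests"], pastEpisodes, 2);
// ranked[0].action is "run unit tests": both of its similar episodes succeeded.
```

Raising `epistemicWeight` in this sketch shifts the ranking toward actions whose past outcomes were poorly predicted, which is one simple way to trade goal progress against information gain.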
Matches all tools
Hooks run on every tool call, not just specific ones
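Because the hook fires on every tool call, each call can be paired with a prediction recorded beforehand. The following is a minimal sketch of that predict, observe, compare loop. Only the 0.6 threshold comes from the descriptions above; the `PredictionLog` class, the `Prediction` and `Reconciled` shapes, and the use of Jaccard distance as the surprise measure are assumptions for illustration, not the plugin's actual implementation.

```typescript
// Hypothetical shapes; the real belief-store schema may differ.
interface Prediction {
  id: number;
  action: string;       // the command or edit about to be run
  expected: string[];   // predicted observation items, e.g. test outcomes
}

interface Reconciled extends Prediction {
  actual: string[];
  surprise: number;     // 0 = fully predicted, 1 = nothing matched
}

// Threshold quoted in the /reflect description.
const SURPRISE_THRESHOLD = 0.6;

function dedupe(items: string[]): string[] {
  return items.filter((item, i) => items.indexOf(item) === i);
}

// Assumed surprise measure: Jaccard distance between expected and actual items.
function surpriseOf(expected: string[], actual: string[]): number {
  const e = dedupe(expected);
  const a = dedupe(actual);
  const shared = a.filter((item) => e.indexOf(item) >= 0).length;
  const union = e.length + a.length - shared;
  return union === 0 ? 0 : 1 - shared / union;
}

class PredictionLog {
  private pending: Prediction[] = [];
  private nextId = 1;

  // Step 1: record the expectation before the side-effecting call.
  predict(action: string, expected: string[]): number {
    const id = this.nextId;
    this.nextId += 1;
    this.pending.push({ id, action, expected });
    return id;
  }

  // Step 2: reconcile after the call. Throwing when no prediction exists
  // is how this sketch enforces predict-before-act.
  observe(id: number, actual: string[]): Reconciled {
    let found: Prediction | null = null;
    for (const p of this.pending) {
      if (p.id === id) found = p;
    }
    if (found === null) {
      throw new Error("no prediction recorded for observation " + id);
    }
    this.pending = this.pending.filter((p) => p.id !== id);
    return {
      id: found.id,
      action: found.action,
      expected: found.expected,
      actual,
      surprise: surpriseOf(found.expected, actual),
    };
  }
}

// Step 3: high surprise routes to reflection instead of the next action.
function nextStep(r: Reconciled): "continue" | "reflect" {
  return r.surprise >= SURPRISE_THRESHOLD ? "reflect" : "continue";
}

const predictionLog = new PredictionLog();
const predictionId = predictionLog.predict("npm test", ["auth:pass", "parser:pass"]);
const outcome = predictionLog.observe(predictionId, ["auth:pass", "parser:fail"]);
const step = nextStep(outcome);
// One of three distinct items matched, so surprise is 2/3 and step is "reflect".
```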
A Claude Code plugin that scaffolds LLM coding agents with structure inspired by the Free Energy Principle (Friston) and World Models (Ha & Schmidhuber).
The goal: make agents more stable at completing tasks by enforcing four procedural invariants:
See docs/ARCHITECTURE.md for the full design.
Phases 1–3 are in place.

1. belief-store MCP server, PostToolUse hook, cartographer + dreamer subagents, /predict + /probe skills.
2. policy-selector + /plan-efe, non-parametric recall_similar_episodes.
3. belief-reviser + hierarchy-arbiter subagents, /reflect + /checkpoint skills, belief snapshots, precision-weighted skill library (log_skill_outcome, score_skill_reliability).

belief-store

| Tool | Purpose |
|---|---|
| store_prediction | record expected observation before a tool call |
| record_observation | write actual + surprise (normally by hook) |
| retrieve_beliefs | list predictions for a task (optionally only surprising) |
| list_beliefs | list current belief entries (key/value/precision) |
| update_belief | upsert a belief entry |
| snapshot_beliefs / rollback_beliefs | labelled checkpoint |
| log_skill_outcome / score_skill_reliability | skill library stats |
| recall_similar_episodes | k-NN over past reconciled predictions |
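
The upsert and labelled-checkpoint behaviour listed in the table can be pictured with an in-memory sketch. This is illustrative only: the real server persists to SQLite and is reached through MCP tool calls, and the `BeliefStoreSketch` class, its method names, and the meaning given to `precision` are assumptions, not the plugin's API.

```typescript
interface Belief {
  key: string;
  value: string;
  precision: number; // assumed: larger means more trusted
}

class BeliefStoreSketch {
  private beliefs: Belief[] = [];
  private snapshots: { label: string; beliefs: Belief[] }[] = [];

  // Copy entries so snapshots are not affected by later updates.
  private static copy(beliefs: Belief[]): Belief[] {
    return beliefs.map((b) => ({ key: b.key, value: b.value, precision: b.precision }));
  }

  // Analogue of update_belief: replace the entry with the same key, or add it.
  updateBelief(key: string, value: string, precision: number): void {
    const kept = this.beliefs.filter((b) => b.key !== key);
    kept.push({ key, value, precision });
    this.beliefs = kept;
  }

  // Analogue of list_beliefs.
  listBeliefs(): Belief[] {
    return BeliefStoreSketch.copy(this.beliefs);
  }

  // Analogue of snapshot_beliefs: store a labelled copy of the current state.
  snapshotBeliefs(label: string): void {
    this.snapshots.push({ label, beliefs: BeliefStoreSketch.copy(this.beliefs) });
  }

  // Analogue of rollback_beliefs: restore the most recent snapshot with this label.
  // Returns false when no such snapshot exists.
  rollbackBeliefs(label: string): boolean {
    for (const snap of this.snapshots.slice().reverse()) {
      if (snap.label === label) {
        this.beliefs = BeliefStoreSketch.copy(snap.beliefs);
        return true;
      }
    }
    return false;
  }
}

const beliefs = new BeliefStoreSketch();
beliefs.updateBelief("tests.auth", "passing", 0.9);
beliefs.snapshotBeliefs("before-refactor");
beliefs.updateBelief("tests.auth", "failing", 0.4);
const restored = beliefs.rollbackBeliefs("before-refactor");
// restored is true and tests.auth reads "passing" again.
```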
The belief-store MCP server is TypeScript. Build the compiled bundle
(checked into the repo so users don't need tsc at runtime):
cd mcp/belief-store
npm install
npm run build
When the plugin is loaded by Claude Code, a SessionStart hook
(scripts/bootstrap.sh) syncs dist/ into ${CLAUDE_PLUGIN_DATA} and
runs npm install there — only when package.json changes. Per the
Claude Code plugin guidelines, runtime state (node_modules, the SQLite
DB) lives in ${CLAUDE_PLUGIN_DATA} so it survives plugin updates.
Point Claude Code at this directory as a plugin:
claude --plugin-dir .
This plugin does not implement variational inference or compute true free
energy. "Confidence" is ordinal, not probabilistic. FEP and World Models
are used as a design vocabulary that organizes four specific practices
that existing agent patterns handle only partially. See
docs/ARCHITECTURE.md §1 for what is explicitly NOT claimed.
npx claudepluginhub jason-hchsieh/predictive-mind --plugin predictive-mind