Codex Plugins

Most Claude Code plugins give you a set of slash commands and some domain knowledge. These plugins do something different: they learn.

Each plugin in this repo is a domain-specialized engineering intelligence that accumulates knowledge across sessions, grounds itself in real library source code (not training data), and coordinates with a companion chat skill on Claude.ai. The plugin implements. The chat skill plans. Over time, the plugin gets better at its job because it tracks what works, what doesn't, and what it's still uncertain about.

This is the two-surface architecture: one surface for thinking, one for building.

What's Inside Each Plugin

A typical plugin contains four layers:

Specialist agents and slash commands. Each plugin ships with 3 to 7 agents that handle specific subtasks. UI-Design-Pro has a design critic, a component builder, an accessibility auditor, an animation engineer, and a visual architect. Django-Engine-Pro has agents for model design, ORM optimization, migration planning, and MCP server exposure. Agents compose in defined sequences: the stack detector always runs before the component builder, and the design critic always runs after it.
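The ordering rule above can be sketched as a small check on a planned agent sequence. This is only an illustration: the agent identifiers and the `check_sequence` helper are hypothetical names, not the plugins' actual interface.

```python
# Hypothetical agent identifiers; the real plugins may name them differently.
STACK_DETECTOR = "stack-detector"
COMPONENT_BUILDER = "component-builder"
DESIGN_CRITIC = "design-critic"


def check_sequence(sequence):
    """Return a list of ordering problems found in a planned agent sequence."""
    problems = []
    for i, agent in enumerate(sequence):
        if agent != COMPONENT_BUILDER:
            continue
        # The stack detector must appear somewhere before the component builder.
        if STACK_DETECTOR not in sequence[:i]:
            problems.append(f"step {i}: {STACK_DETECTOR} must run before {COMPONENT_BUILDER}")
        # The design critic must appear somewhere after the component builder.
        if DESIGN_CRITIC not in sequence[i + 1:]:
            problems.append(f"step {i}: {DESIGN_CRITIC} must run after {COMPONENT_BUILDER}")
    return problems
```

An empty list means the sequence respects both constraints.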

Source-code references. Plugins include install.sh scripts that shallow-clone real library repos into a local refs/ directory. When UI-Design-Pro needs to know how Radix handles focus restoration, it greps the actual Radix source, not its training data. When D3-Pro needs to verify a scale constructor's API, it reads the Observable source directly. This matters because training data goes stale, while a fresh clone reflects the library as it exists today.
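A minimal sketch of the clone-then-grep idea, assuming a refs/ directory like the one described above. The function names, the file-suffix filter, and the Python form are assumptions for illustration; the plugins themselves use install.sh scripts.

```python
import subprocess
from pathlib import Path


def shallow_clone(repo_url: str, dest: Path) -> None:
    """Clone only the latest commit of repo_url into dest (skipped if dest exists)."""
    if dest.exists():
        return
    dest.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(["git", "clone", "--depth", "1", repo_url, str(dest)], check=True)


def grep_refs(refs_dir: Path, needle: str, suffixes=(".ts", ".tsx", ".js", ".py")):
    """Return (path, line number, stripped line) for each source line containing needle."""
    hits = []
    for path in sorted(refs_dir.rglob("*")):
        # Only search regular files with a source-code suffix (assumed filter).
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path, lineno, line.strip()))
    return hits
```

In use, `shallow_clone` would run once per library with that library's repository URL, and `grep_refs(Path("refs"), "someSymbol")` would then answer API questions from the cloned source.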

Skills and decision frameworks. Static knowledge: inheritance decision tables, ORM anti-pattern catalogs, polymorphic rendering rules, animation physics constants. These encode the expert judgment that doesn't change between sessions.

An epistemic knowledge layer. This is the part that learns. Each plugin maintains a knowledge/ directory containing typed claims in JSONL, confidence scores, session logs, and (for some plugins) SBERT embeddings. Claims start as drafts. After review, they become active. Active claims carry Bayesian confidence that updates based on session outcomes: when a suggestion informed by a claim gets accepted, confidence rises; when it gets rejected, confidence drops. Over time, each plugin develops its own body of verified, weighted knowledge about its domain.
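One common way to get confidence that rises on acceptance and falls on rejection is a Beta-Bernoulli update. The sketch below assumes that model and a uniform prior. The source does not specify which Bayesian update the plugins use, so treat the class and its field names as hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClaimConfidence:
    """Beta(accepted, rejected) belief about how often a claim leads to accepted suggestions."""

    accepted: float = 1.0  # prior pseudo-count of acceptances (assumed uniform prior)
    rejected: float = 1.0  # prior pseudo-count of rejections

    @property
    def confidence(self) -> float:
        # Posterior mean of the Beta distribution.
        return self.accepted / (self.accepted + self.rejected)

    def update(self, was_accepted: bool) -> float:
        """Record one session outcome and return the new confidence."""
        if was_accepted:
            self.accepted += 1.0
        else:
            self.rejected += 1.0
        return self.confidence
```

Under this model a new claim starts at 0.5, and each additional outcome moves the score less than the one before it, so long-standing claims are harder to overturn than fresh ones.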

The Two-Surface Architecture

Each plugin here has a counterpart: a chat skill that runs on Claude.ai (or Claude Desktop). The division of labor is deliberate.

The chat skill handles planning, reasoning, and decision-making. When you're deciding between DRF and Ninja for an API, or choosing an inheritance strategy for a model hierarchy, or evaluating whether a component needs polymorphic rendering, the chat skill walks you through the tradeoffs and produces a structured handoff document.

The Claude Code plugin handles implementation and learning. It takes the handoff document, builds the thing, greps real source code when it needs to verify an API, logs what it tried, and updates its knowledge base with what it learned.

The chat skill never sees knowledge/claims.jsonl. The plugin never produces planning documents. Each surface does what it's good at.

Chat Skill (Claude.ai)	Claude Code Plugin
Decision frameworks	Slash commands and agents
Tradeoff analysis	Source-code grepping
Structured handoff docs	Implementation and testing
Domain reasoning	Session logging and learning
Static (expert knowledge)	Dynamic (knowledge that evolves)

The Epistemic Layer

Every plugin with a knowledge/ directory runs the same protocol:

Session start: Read manifest.json for current state. Load active claims sorted by confidence. Check tensions.jsonl for unresolved conflicts in the task's domain. Surface tensions before making decisions, not after.

During work: Track which claims informed each suggestion. Note when the user accepts, modifies, or rejects a recommendation.

Session end: Write observations to session_log/. Flag contradictions as tension signals. Note recurring patterns the knowledge base doesn't yet cover.
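The session-start and session-end steps could look roughly like this. The file names manifest.json, claims.jsonl, tensions.jsonl, and session_log/ come from the description above. The record fields (`status`, `confidence`, `domain`, `resolved`) and the function names are assumptions made for the sketch.

```python
import json
from pathlib import Path


def read_jsonl(path: Path) -> list:
    """Read one JSON object per non-empty line; a missing file yields an empty list."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def start_session(knowledge_dir: Path, task_domain: str) -> dict:
    """Load the manifest, active claims (highest confidence first), and open tensions."""
    manifest = json.loads((knowledge_dir / "manifest.json").read_text(encoding="utf-8"))
    claims = [c for c in read_jsonl(knowledge_dir / "claims.jsonl")
              if c.get("status") == "active"]
    claims.sort(key=lambda c: c.get("confidence", 0.0), reverse=True)
    # Only unresolved tensions in the task's domain are surfaced up front.
    tensions = [t for t in read_jsonl(knowledge_dir / "tensions.jsonl")
                if t.get("domain") == task_domain and not t.get("resolved", False)]
    return {"manifest": manifest, "claims": claims, "tensions": tensions}


def end_session(knowledge_dir: Path, session_id: str, observations: dict) -> Path:
    """Write the session's observations to session_log/<session_id>.json."""
    log_dir = knowledge_dir / "session_log"
    log_dir.mkdir(parents=True, exist_ok=True)
    out = log_dir / f"{session_id}.json"
    out.write_text(json.dumps(observations, indent=2), encoding="utf-8")
    return out
```

Returning the tensions alongside the claims makes it natural to show conflicts before any recommendation is made, which is the ordering the protocol asks for.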

The knowledge types are borrowed from Theseus (a separate epistemic engine project):

Claims: factual assertions with confidence scores and evidence links
Tensions: unresolved conflicts between claims or approaches
Questions: open research threads the plugin hasn't resolved
Methods: process knowledge (how to do X effectively)
Preferences: user-specific defaults that override generic best practices
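As a rough illustration, the five types could share one record shape and be stored one per line in JSONL. The field names and the `KnowledgeItem` class are hypothetical. The actual schema used by the plugins or by Theseus is not shown in this README.

```python
import json
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Kind(str, Enum):
    CLAIM = "claim"
    TENSION = "tension"
    QUESTION = "question"
    METHOD = "method"
    PREFERENCE = "preference"


@dataclass
class KnowledgeItem:
    kind: Kind
    text: str
    confidence: Optional[float] = None  # assumed to be meaningful mainly for claims
    evidence: List[str] = field(default_factory=list)  # links or file references

    def to_jsonl(self) -> str:
        """Serialize to a single JSON line."""
        record = {"kind": self.kind.value, "text": self.text, "evidence": self.evidence}
        if self.confidence is not None:
            record["confidence"] = self.confidence
        return json.dumps(record)

    @classmethod
    def from_jsonl(cls, line: str) -> "KnowledgeItem":
        """Parse a single JSON line back into an item."""
        record = json.loads(line)
        return cls(
            kind=Kind(record["kind"]),
            text=record["text"],
            confidence=record.get("confidence"),
            evidence=record.get("evidence", []),
        )
```

A single shape with a `kind` discriminator keeps every file greppable with the same tooling, at the cost of some fields being unused for some kinds.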

Current knowledge stats across the fleet:

Plugin	Total Claims	Active	Avg Confidence
UI-Design-Pro	140	135	0.667
Django-Engine-Pro	111	29	0.75
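A row like the ones above could be derived from a plugin's claim records along these lines. The sketch assumes each claim is a dict with `status` and `confidence` fields and that the average is taken over active claims only. Neither assumption is confirmed by the source.

```python
def fleet_row(plugin: str, claims: list) -> dict:
    """Summarize one plugin's claims as total, active count, and mean active confidence."""
    active = [c for c in claims if c.get("status") == "active"]
    scores = [c["confidence"] for c in active if "confidence" in c]
    # No active scored claims means there is no average to report.
    avg = round(sum(scores) / len(scores), 3) if scores else None
    return {
        "plugin": plugin,
        "total": len(claims),
        "active": len(active),
        "avg_confidence": avg,
    }
```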

Available Plugins
