ml-pro

Name: ml-pro
Author: travis-gilbert


ML engineering plugin: build, train, debug, and deploy machine learning systems with verified API knowledge. 5 domain-expert agents (model-architect, training-engineer, ml-debugger, graph-engineer, systems-optimizer), 6 reference files, 5 runnable templates, and 4 workflow commands.

npx claudepluginhub travis-gilbert/claude-marketplace --plugin ml-pro

Popularity

Stars

Med: 0·Avg: 285

Installs

Med: 0·Avg: 1

What's Inside

Slash Commands (5)

/learn
End-of-session learning. Saves what happened, updates knowledge confidence, and surfaces items for review. Run this when your work session is complete.

/ml-build
Build an ML system from a handoff document or problem description.

/ml-debug
Diagnose and fix ML training failures using the 5-step protocol.

/ml-deploy
Export, optimize, and deploy a trained model.

/ml-train
Generate a complete training pipeline for a model.

Agents (5)

graph-engineer

/graph-engineer

Specializes in graph ML: GNN architecture, PyG data handling, message passing, knowledge graph embeddings, link prediction, node classification.

Route here for: any GNN task, "graph neural network," "knowledge graph," "link prediction," "node classification," "message passing," "PyG," "PyKEEN," or any graph-structured ML problem.

<example>
Context: User wants to build a KG completion model
user: "Build an R-GCN model for link prediction on my knowledge graph"
assistant: "I'll use graph-engineer to design the R-GCN encoder with DistMult decoder and the full PyG training pipeline."
</example>

ml-debugger

/ml-debugger

Diagnoses and fixes ML training failures. Follows a systematic protocol: overfit-one-batch, loss curve analysis, gradient inspection, data pipeline verification, simplification.

Route here for: "training isn't working," "loss is stuck," "loss is NaN," "model isn't learning," "overfitting," or any training failure.

<example>
Context: User's GNN training loss is flat
user: "My GNN loss hasn't moved in 20 epochs"
assistant: "I'll use ml-debugger to run the systematic diagnostic protocol."
</example>

model-architect

/model-architect

Designs ML model architectures. Selects layers, dimensions, activations, normalization, and skip connections. Produces complete nn.Module code with parameter counts and shape annotations.

Route here for: "build a model," "design the architecture," "what layers should I use," or any request to create a new model from a spec or handoff document.

<example>
Context: User has a handoff doc specifying a GNN for link prediction
user: "Implement the model architecture from this handoff"
assistant: "I'll use model-architect to build the R-GCN encoder with DistMult decoder specified in the handoff."
</example>

<example>
Context: User wants a custom transformer for sequence classification
user: "Build me a 4-layer transformer classifier for 512-token sequences"
assistant: "I'll use model-architect to design the encoder with the specified depth and produce the nn.Module."
</example>

systems-optimizer

/systems-optimizer

Optimizes ML systems for production. Mixed precision, torch.compile, distributed training, quantization, profiling, memory optimization, inference serving.

Route here for: "make this faster," "reduce memory," "deploy this model," "optimize inference," "distributed training," "quantize," "profile," or any performance/deployment task.

<example>
Context: User's training is running out of GPU memory
user: "My 7B model fine-tuning OOMs on a 48GB A6000"
assistant: "I'll use systems-optimizer to apply the memory optimization ladder: QLoRA + gradient checkpointing + bf16."
</example>

<example>
Context: User wants to serve a model in production
user: "How do I deploy this classifier as an API?"
assistant: "I'll use systems-optimizer to set up ONNX export, quantization, and a FastAPI serving layer."
</example>

training-engineer

/training-engineer

Builds complete training pipelines. Data loading, loss functions, optimizers, schedulers, training loops, validation, checkpointing, and experiment tracking.

Route here for: "write the training loop," "set up training," "train this model," or any request to create or modify training infrastructure.

<example>
Context: User has a model and needs a training pipeline
user: "Write the training loop for this GNN classifier"
assistant: "I'll use training-engineer to build the complete pipeline with PyG DataLoader, cross-entropy loss, AdamW, and wandb tracking."
</example>

Stats

Version: 1.0.0
Language: TypeScript
Stars: 0
Maintenance: Excellent
Last Commit: Mar 31, 2026
Added: Apr 28, 2026


Available In

codex-marketplace

Safety Signals

Caution

Uses power tools

Uses Bash, Write, or Edit tools

README

Codex Plugins

Most Claude Code plugins give you a set of slash commands and some domain knowledge. These plugins do something different: they learn.

Each plugin in this repo is a domain-specialized engineering intelligence that accumulates knowledge across sessions, grounds itself in real library source code (not training data), and coordinates with a companion chat skill on Claude.ai. The plugin implements. The chat skill plans. Over time, the plugin gets better at its job because it tracks what works, what doesn't, and what it's still uncertain about.

This is the two-surface architecture: one surface for thinking, one for building.

What's Inside Each Plugin

A typical plugin contains four layers:

Specialist agents and slash commands. Each plugin ships with 3 to 7 agents that handle specific subtasks. UI-Design-Pro has a design critic, a component builder, an accessibility auditor, an animation engineer, and a visual architect. Django-Engine-Pro has agents for model design, ORM optimization, migration planning, and MCP server exposure. Agents compose in defined sequences: you always run the stack detector before the component builder, always run the design critic after.
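The fixed ordering described above can be thought of as a small dependency graph. The following is only a sketch under stated assumptions: the agent identifiers (`stack-detector`, `component-builder`, `design-critic`) and the `agent_order` helper are hypothetical names chosen for illustration, and the plugins may encode their sequences quite differently.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def agent_order(runs_after: Dict[str, Set[str]]) -> List[str]:
    """Return one valid run order given 'agent -> agents that must run first'.

    Raises graphlib.CycleError if the constraints contradict each other.
    """
    return list(TopologicalSorter(runs_after).static_order())


# Hypothetical constraints mirroring the prose: detector first, critic last.
constraints = {
    "component-builder": {"stack-detector"},
    "design-critic": {"component-builder"},
}

if __name__ == "__main__":
    print(" -> ".join(agent_order(constraints)))
```

Expressing the sequence as constraints rather than a hard-coded list means a contradictory rule set fails loudly with `CycleError` instead of silently running agents in the wrong order.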

Source-code references. Plugins include install.sh scripts that shallow-clone real library repos into a local refs/ directory. When UI-Design-Pro needs to know how Radix handles focus restoration, it greps the actual Radix source, not its training data. When D3-Pro needs to verify a scale constructor's API, it reads the Observable source directly. This matters because training data goes stale. Source code doesn't.
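As a rough illustration of this pattern, the sketch below builds a shallow-clone command and searches a local refs/ directory for a symbol. It is not the plugins' install.sh or their actual lookup code: the function names, the file-suffix filter, and the example pattern are all assumptions made for the example.

```python
import re
from pathlib import Path
from typing import Iterator, List, Tuple


def shallow_clone_command(repo_url: str, dest: Path) -> List[str]:
    """Build (but do not run) a git command that fetches only the latest commit."""
    return ["git", "clone", "--depth", "1", repo_url, str(dest)]


def grep_refs(
    refs_dir: Path,
    pattern: str,
    suffixes: Tuple[str, ...] = (".ts", ".tsx", ".js", ".py"),  # assumed filter
) -> Iterator[Tuple[Path, int, str]]:
    """Yield (file, line number, stripped line) for each source line matching pattern."""
    regex = re.compile(pattern)
    for path in sorted(refs_dir.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield path, lineno, line.strip()


if __name__ == "__main__":
    # Hypothetical lookup: where does a cloned library mention focus handling?
    for path, lineno, line in grep_refs(Path("refs"), r"focus"):
        print(f"{path}:{lineno}: {line}")
```

A shallow clone (`--depth 1`) keeps the checkout small, at the cost of having no history to search.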

Skills and decision frameworks. Static knowledge: inheritance decision tables, ORM anti-pattern catalogs, polymorphic rendering rules, animation physics constants. These encode the expert judgment that doesn't change between sessions.

An epistemic knowledge layer. This is the part that learns. Each plugin maintains a knowledge/ directory containing typed claims in JSONL, confidence scores, session logs, and (for some plugins) SBERT embeddings. Claims start as drafts. After review, they become active. Active claims carry Bayesian confidence that updates based on session outcomes: when a suggestion informed by a claim gets accepted, confidence rises; when it gets rejected, confidence drops. Over time, each plugin develops its own body of verified, weighted knowledge about its domain.
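The README does not say which Bayesian model the plugins use. One simple way to get the described behavior (confidence rises on acceptance, drops on rejection) is a Beta-Bernoulli update, sketched below. The `Claim` class, its field names, the uniform prior, and the rule that only active claims update are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """Hypothetical claim record; field names are illustrative only."""

    text: str
    status: str = "draft"  # "draft" until reviewed, then "active"
    alpha: float = 1.0     # prior pseudo-count of accepted suggestions
    beta: float = 1.0      # prior pseudo-count of rejected suggestions

    @property
    def confidence(self) -> float:
        """Posterior mean of a Beta(alpha, beta) distribution."""
        return self.alpha / (self.alpha + self.beta)

    def activate(self) -> None:
        """Promote a reviewed draft to an active claim."""
        self.status = "active"

    def record_outcome(self, accepted: bool) -> None:
        """Update confidence from one session outcome (assumed: active claims only)."""
        if self.status != "active":
            raise ValueError("only active claims update from session outcomes")
        if accepted:
            self.alpha += 1.0
        else:
            self.beta += 1.0


if __name__ == "__main__":
    claim = Claim("Prefer select_related for single-valued foreign keys")
    claim.activate()
    claim.record_outcome(accepted=True)
    print(round(claim.confidence, 3))
```

With pseudo-counts, a single rejection moves a well-supported claim only slightly, while a new claim with little evidence swings more.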

The Two-Surface Architecture

Each plugin here has a counterpart: a chat skill that runs on Claude.ai (or Claude Desktop). The division of labor is deliberate.

The chat skill handles planning, reasoning, and decision-making. When you're deciding between DRF and Ninja for an API, or choosing an inheritance strategy for a model hierarchy, or evaluating whether a component needs polymorphic rendering, the chat skill walks you through the tradeoffs and produces a structured handoff document.

The Claude Code plugin handles implementation and learning. It takes the handoff document, builds the thing, greps real source code when it needs to verify an API, logs what it tried, and updates its knowledge base with what it learned.

The chat skill never sees knowledge/claims.jsonl. The plugin never produces planning documents. Each surface does what it's good at.

Chat Skill (Claude.ai)       | Claude Code Plugin
Decision frameworks          | Slash commands and agents
Tradeoff analysis            | Source-code grepping
Structured handoff docs      | Implementation and testing
Domain reasoning             | Session logging and learning
Static (expert knowledge)    | Dynamic (knowledge that evolves)

The Epistemic Layer

Every plugin with a knowledge/ directory runs the same protocol:

Session start: Read manifest.json for current state. Load active claims sorted by confidence. Check tensions.jsonl for unresolved conflicts in the task's domain. Surface tensions before making decisions, not after.

During work: Track which claims informed each suggestion. Note when the user accepts, modifies, or rejects a recommendation.

Session end: Write observations to session_log/. Flag contradictions as tension signals. Note recurring patterns the knowledge base doesn't yet cover.
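The session-start and session-end steps above might look roughly like the following. The file names (manifest.json, claims.jsonl, tensions.jsonl, session_log/) come from the README; everything else is an assumption, including the record fields (`status`, `confidence`, `domain`, `resolved`), the one-file-per-session log layout, and the function names.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def read_jsonl(path: Path) -> List[Dict[str, Any]]:
    """Read one JSON object per non-empty line; a missing file yields an empty list."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def start_session(knowledge_dir: Path, domain: str) -> Dict[str, Any]:
    """Load the manifest, active claims (highest confidence first), and open tensions."""
    manifest_path = knowledge_dir / "manifest.json"
    manifest = (
        json.loads(manifest_path.read_text(encoding="utf-8"))
        if manifest_path.exists()
        else {}
    )
    claims = [
        c for c in read_jsonl(knowledge_dir / "claims.jsonl")
        if c.get("status") == "active"  # assumed field name
    ]
    claims.sort(key=lambda c: c.get("confidence", 0.0), reverse=True)
    tensions = [
        t for t in read_jsonl(knowledge_dir / "tensions.jsonl")
        if not t.get("resolved", False) and t.get("domain") == domain  # assumed fields
    ]
    return {"manifest": manifest, "claims": claims, "tensions": tensions}


def end_session(
    knowledge_dir: Path, session_id: str, observations: List[Dict[str, Any]]
) -> Path:
    """Write the session's observations to session_log/<session_id>.jsonl."""
    log_dir = knowledge_dir / "session_log"
    log_dir.mkdir(parents=True, exist_ok=True)
    log_path = log_dir / f"{session_id}.jsonl"
    with log_path.open("w", encoding="utf-8") as fh:
        for obs in observations:
            fh.write(json.dumps(obs) + "\n")
    return log_path
```

Returning the unresolved tensions from `start_session` keeps them in front of the caller before any recommendation is made, which matches the "before, not after" rule in the protocol.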

The knowledge types are borrowed from Theseus (a separate epistemic engine project):

Claims: factual assertions with confidence scores and evidence links
Tensions: unresolved conflicts between claims or approaches
Questions: open research threads the plugin hasn't resolved
Methods: process knowledge (how to do X effectively)
Preferences: user-specific defaults that override generic best practices
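To make the five types concrete, here is a minimal sketch of how typed JSONL records could be validated and tallied. The `type` field name, the lowercase type strings, and the `count_by_type` helper are hypothetical; the source names the knowledge types but does not describe the on-disk schema.

```python
import json
from collections import Counter
from enum import Enum
from typing import Iterable


class KnowledgeType(str, Enum):
    """The five knowledge types listed above (string values are assumed)."""

    CLAIM = "claim"
    TENSION = "tension"
    QUESTION = "question"
    METHOD = "method"
    PREFERENCE = "preference"


def count_by_type(jsonl_lines: Iterable[str]) -> Counter:
    """Tally records per knowledge type; an unknown type raises ValueError."""
    counts: Counter = Counter()
    for line in jsonl_lines:
        if not line.strip():
            continue
        record = json.loads(line)
        counts[KnowledgeType(record["type"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '{"type": "claim", "text": "hypothetical claim"}',
        '{"type": "preference", "text": "hypothetical user default"}',
    ]
    for kind, n in count_by_type(sample).items():
        print(kind.value, n)
```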

Current knowledge stats across the fleet:

Plugin            | Total Claims | Active | Avg Confidence
UI-Design-Pro     | 140          | 135    | 0.667
Django-Engine-Pro | 111          | 29     | 0.75



Similar Plugins

feature-dev

ecc

pr-review-toolkit

context7-plugin

c4-architecture

fullstack-dev-skills

More by travis-gilbert

app-pro

app-forge

cosmos-pro

shipit

d3-pro
