rl-experiment-assistant

Name: rl-experiment-assistant
Author: junhyekh

Evidence-driven RL experiment assistant for reward tuning, reward engineering, curriculum design, and domain randomization design.

npx claudepluginhub junhyekh/rlxp --plugin rl-experiment-assistant

Popularity

Stars

Med: 0·Avg: 285

Installs

Med: 0·Avg: 1

What's Inside

Agents (3)

experiment-director

/experiment-director

Coordinates RL experiment planning and execution, ensuring user-confirmed metrics, scope, budget, and evidence-gated decisions.

metrics-analyst

/metrics-analyst

Analyzes RL training/evaluation metrics, reward components, failure modes, confidence, and guardrails before deciding whether an experiment improved.

reward-reviewer

/reward-reviewer

Reviews reward/curriculum/domain-randomization changes for correctness, reward hacking risk, confounding, and reproducibility.

Skills (4)

rl-experiment-loop

/rl-experiment-loop

Execute the evidence-gated RL experiment loop: launch approved training/eval jobs, parse results, update the report, decide accept/reject/inconclusive, and propose the next experiment within the approved budget.

rl-experiment-plan

/rl-experiment-plan

Turn an audited RL task into a user-confirmed experiment plan with metrics, tuning scope, GPU budget, launcher commands, report skeleton, and first baseline/ablation proposal. Use before launching RL training jobs.

rl-reward-curriculum-design

/rl-reward-curriculum-design

Design evidence-backed RL reward parameter changes, reward code changes, curriculum schedules, adaptive sampling, or domain-randomization distributions. Use after baseline/result analysis identifies a concrete failure mode.

rl-task-audit

/rl-task-audit

Audit an RL training codebase to infer task definition, training/eval commands, rewards, terminations, curriculum, domain randomization, logs, and metrics before planning experiments. Use for robotics/RL codebases, reward tuning, curriculum design, domain-randomization design, or ambiguous task requests.

Stats

Version: 0.1.2
Language: Python
Stars: 0
Maintenance: Good
License: MIT
Last Commit: May 14, 2026
Added: May 14, 2026

Available In

local-rl-experiment-assistant

Safety Signals

Caution

Uses power tools

Uses Bash, Write, or Edit tools

README

RL Experiment Assistant Plugin

Dual-host Codex and Claude Code plugin for evidence-gated reinforcement-learning experiment work. It gives LLM agents the workflow and deterministic helper tools they need to audit an RL repository, create a task/metric/scope/budget contract, initialize repo-local .rlxp/ state, score results, and propose bounded next experiments without treating training reward as the only objective.

No training by default

Setup, audit, planning, package validation, and offline validation do not run GPU jobs, simulators, W&B/network calls, or training entry points. Training remains blocked unless the target repository's .rlxp/contract.yaml records explicit approval for:

task definition
primary metric and guardrails
tuning scope
wall-clock/GPU budget
hardware target
baseline command
evaluation protocol
an approved experiment inside the confirmed scope and budget
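
The approval gate above can be sketched as a small check over a parsed contract. This is only an illustration: the key names under `approvals` are hypothetical, and the real .rlxp/contract.yaml schema may differ. The contract is passed as a plain dict so the sketch does not depend on a YAML parser.

```python
# Hypothetical approval keys mirroring the list above; the plugin's actual
# contract.yaml field names are not documented here and may differ.
REQUIRED_APPROVALS = (
    "task_definition",
    "primary_metric_and_guardrails",
    "tuning_scope",
    "budget",
    "hardware_target",
    "baseline_command",
    "evaluation_protocol",
    "approved_experiment",
)


def missing_approvals(contract: dict) -> list[str]:
    """Return the required items that are not explicitly approved."""
    approvals = contract.get("approvals", {})
    # Only a literal True counts; absent keys, None, or strings stay blocked.
    return [key for key in REQUIRED_APPROVALS if approvals.get(key) is not True]


def training_allowed(contract: dict) -> bool:
    """Training stays blocked until every required item is approved."""
    return not missing_approvals(contract)


if __name__ == "__main__":
    draft = {"approvals": {"task_definition": True, "tuning_scope": True}}
    print("training allowed:", training_allowed(draft))
    print("still missing:", missing_approvals(draft))
```

Treating anything other than an explicit `True` as unapproved keeps the default on the safe side, which matches the "no training by default" stance.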

Generated .rlxp/ state belongs in the target RL repository, not in this plugin package, except for temporary validation fixtures.

Layout

.
├── .agents/plugins/marketplace.json
├── .claude-plugin/marketplace.json
└── plugins/rl-experiment-assistant/
    ├── .codex-plugin/plugin.json
    ├── .claude-plugin/plugin.json
    ├── skills/
    ├── agents/
    ├── scripts/
    ├── templates/
    └── examples/

Install In Codex

From this package root:

codex plugin marketplace add .

Then open /plugins, install local-rl-experiment-assistant / rl-experiment-assistant, and start a fresh Codex session.

Typical request from a target RL repository:

Use RL Experiment Assistant to audit this RL codebase. Prepare the task card, metric candidates, tuning scope, budget questions, and .rlxp report skeleton. Do not launch training.

To initialize state through the plugin, ask the agent directly:

Use RL Experiment Assistant to initialize .rlxp for this repository, then audit the RL task and prepare the first experiment plan. Use the current working directory as the target repository. Do not launch training.

Install In Claude Code

From this package root:

claude plugin marketplace add .

Then in Claude Code:

/plugin install rl-experiment-assistant@local-rl-experiment-assistant
/reload-plugins

Development fallback without marketplace installation:

claude --plugin-dir ./plugins/rl-experiment-assistant

Claude-specific agents are additive wrappers around the same contract-gated skills:

experiment-director: coordinates audit/plan/loop and owns .rlxp/contract.yaml checks.
metrics-analyst: validates run evidence against held-out metrics and guardrails.
reward-reviewer: reviews reward/curriculum/domain-randomization changes before deployment.

Target Repository Setup

Normal setup is agent-driven. The user should not need to run the bundled Python helpers by hand. Ask Codex or Claude Code to initialize the target repository, and the plugin skill should resolve the installed plugin root, run the helper internally, and write .rlxp/ into the target RL repository.

Good prompt:

Use RL Experiment Assistant to initialize .rlxp for /path/to/target-rl-repo with project name <project-name>. Audit the codebase and leave training blocked until the contract is confirmed.

Expected target-repository state:

.rlxp/
├── adapter.yaml
├── contract.yaml
├── report.md
├── experiments.yaml
├── ledger.jsonl
└── runs/
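
For orientation, the layout above could be produced by something like the following. This is a hedged sketch, not the bundled helper: `init_state` is a hypothetical function, the files are created empty, and the real initializer presumably fills them from the plugin's templates.

```python
from pathlib import Path

# File names come from the expected state listed above; contents are left
# empty here, whereas the real helper likely writes templated content.
STATE_FILES = (
    "adapter.yaml",
    "contract.yaml",
    "report.md",
    "experiments.yaml",
    "ledger.jsonl",
)


def init_state(root: Path) -> list[Path]:
    """Create .rlxp/ under root and return the files that were newly created."""
    state_dir = root / ".rlxp"
    (state_dir / "runs").mkdir(parents=True, exist_ok=True)
    created = []
    for name in STATE_FILES:
        path = state_dir / name
        # Never overwrite existing state such as a confirmed contract.
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

Making the function idempotent means a second initialization request cannot wipe a contract the user has already confirmed.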

Initializing From Codex

When this plugin is used inside a Codex session, initialization still writes to the target repository the agent passes as --root. The plugin root, Codex plugin cache, and Codex home are not the experiment target.

Preferred pattern:

Use RL Experiment Assistant to initialize .rlxp for this repository. Use the current working directory as --root. Do not launch training.

If Codex is not running from the target repository, name the absolute target path:

Use RL Experiment Assistant to initialize .rlxp for /path/to/target-rl-repo. Treat that path as --root even if the Codex session is currently elsewhere. Do not launch training.

The agent may run scripts/rlxp_init.py internally as an implementation detail. It should not ask the user to paste Python commands unless the agent cannot execute local commands.
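
As an illustration of the --root rule, a helper could resolve the target like this. Only the `--root` flag is taken from the text above; `resolve_root`, its `cwd` and `plugin_root` parameters, and the rejection of paths inside the plugin root are assumptions made for this sketch and do not describe the actual behavior of scripts/rlxp_init.py.

```python
import argparse
from pathlib import Path


def resolve_root(argv: list[str], cwd: Path, plugin_root: Path) -> Path:
    """Pick the target repository: --root if given, otherwise cwd."""
    parser = argparse.ArgumentParser(description="hypothetical init helper")
    parser.add_argument("--root", type=Path, default=None)
    args = parser.parse_args(argv)

    root = args.root if args.root is not None else cwd
    if not root.is_absolute():
        # Interpret relative paths against the session's working directory.
        root = cwd / root

    # Assumed guard: the plugin package itself is not an experiment target.
    # This is a purely lexical check; symlinks are not followed.
    if root == plugin_root or root.is_relative_to(plugin_root):
        raise ValueError(f"{root} is inside the plugin root, not a target repository")
    return root
```

For example, `resolve_root(["--root", "/path/to/target-rl-repo"], Path.cwd(), plugin_root)` returns the named path even when the session is running elsewhere.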

Not Holosoma Only

Holosoma support is a bundled profile and validation example, not a package-level assumption. The core workflow is framework-agnostic:

audit the target RL repository
create .rlxp/adapter.yaml for commands, config, metrics, logs, and hardware
create .rlxp/contract.yaml for task, metric, scope, budget, and launch approval
score structured metrics from JSON, JSONL, CSV, W&B-exported history, TensorBoard-exported history, or repo-specific summaries
propose bounded reward, curriculum, domain-randomization, launcher, or evaluation changes
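
The scoring step can be pictured with a small framework-agnostic sketch that reads a JSONL metric history and gates the decision on a primary metric plus guardrails. Everything here is illustrative: the function names, the "higher is better" convention, the guardrail floors, and the `min_gain` threshold are assumptions, not the plugin's scoring rules.

```python
import json


def last_values(jsonl_text: str) -> dict[str, float]:
    """Keep the last logged value of each numeric metric in a JSONL history."""
    latest: dict[str, float] = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        for key, value in record.items():
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                latest[key] = float(value)
    return latest


def decide(baseline: dict, candidate: dict, primary: str,
           guardrails: dict, min_gain: float) -> str:
    """Return 'accept', 'reject', or 'inconclusive' (higher is better).

    guardrails maps a metric name to its minimum acceptable value.
    """
    # A violated guardrail rejects the run regardless of the primary metric.
    if any(name in candidate and candidate[name] < floor
           for name, floor in guardrails.items()):
        return "reject"
    # Missing evidence is never treated as success.
    needed = [primary, *guardrails]
    if primary not in baseline or any(name not in candidate for name in needed):
        return "inconclusive"
    gain = candidate[primary] - baseline[primary]
    if gain >= min_gain:
        return "accept"
    if gain <= -min_gain:
        return "reject"
    return "inconclusive"
```

Keeping the guardrail check separate from the primary-metric comparison reflects the idea that a higher headline number alone should not be enough to accept a change.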
