Design evidence-backed RL reward parameter changes, reward code changes, curriculum schedules, adaptive sampling, or domain-randomization distributions. Use after baseline/result analysis identifies a concrete failure mode.
How this skill is triggered — by the user, by Claude, or both
Slash command
/rl-experiment-assistant:rl-reward-curriculum-designThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Design interventions for reward tuning, reward engineering, curriculum learning, adaptive sampling, and domain randomization. Do not invent changes without evidence from code inspection, baseline metrics, reward component statistics, videos, termination causes, per-scene/per-motion performance, or user-provided diagnostics.
Design interventions for reward tuning, reward engineering, curriculum learning, adaptive sampling, and domain randomization. Do not invent changes without evidence from code inspection, baseline metrics, reward component statistics, videos, termination causes, per-scene/per-motion performance, or user-provided diagnostics.
Before proposing deployment, compare the intervention against .rlxp/contract.yaml. Block if the contract is missing, not approved_for_launch, lacks explicit tuning-scope approval, or does not allow the intervention class.
Prefer this order unless evidence justifies skipping:
Before editing reward code, produce:
Return an experiment proposal conforming to templates/experiment-proposal.schema.json, plus a short rationale:
# Intervention Proposal
## Evidence
## Hypothesis
## Exact change
## Command/config patch
## Expected metric movement
## Risks and guardrails
## Decision rule
Also provide the machine-readable proposal JSON, or explicitly state which schema field is still unknown.
npx claudepluginhub junhyekh/rlxp --plugin rl-experiment-assistantGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.