Monitor running experiments, check progress, collect results. Use when user says "check results", "is it done", "monitor", or wants experiment output.
Slash command
/auto-research-with-eyes:monitor-experiment
Monitor: $ARGUMENTS
List the screen sessions running on the server:
ssh <server> "screen -ls"
For each screen session, capture the last N lines:
ssh <server> "screen -S <name> -X hardcopy /tmp/screen_<name>.txt && tail -50 /tmp/screen_<name>.txt"
If hardcopy fails, check for log files or tee output.
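The per-session capture above can be looped over every session that `screen -ls` reports. The following is a minimal sketch, not part of the skill itself: the helper names `list_sessions` and `capture_all` are hypothetical, and it assumes the usual `screen -ls` layout in which each session line is indented and begins with `<pid>.<name>`.

```shell
# Hypothetical helper: read `screen -ls` output on stdin and print one
# session ID ("pid.name") per line. Assumes session lines are indented
# and start with a numeric PID followed by a dot.
list_sessions() {
    awk '/^[ \t]+[0-9]+\./ { print $1 }'
}

# Hypothetical helper: hardcopy every session on the server and print
# the last N lines of each dump (N defaults to 50, as in the step above).
capture_all() {
    server=$1
    lines=${2:-50}
    ssh "$server" "screen -ls" | list_sessions | while read -r session; do
        echo "== $session =="
        # -n keeps ssh from consuming the loop's stdin.
        ssh -n "$server" "screen -S '$session' -X hardcopy '/tmp/screen_$session.txt' && tail -n $lines '/tmp/screen_$session.txt'"
    done
}
```

Calling `capture_all <server> 50` would print a short header and the tail of each session in turn. Sessions whose hardcopy fails produce no tail output, which is the cue to fall back to log files or tee output.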
Check for result files, newest first:
ssh <server> "ls -lt <results_dir>/*.json 2>/dev/null | head -20"
If JSON results exist, fetch and parse them:
ssh <server> "cat <results_dir>/<latest>.json"
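Picking `<latest>` by hand can be avoided by letting the shell choose the newest file. This is a rough sketch under the assumption that modification time is a good proxy for "latest" and that file names contain no newlines; `latest_json` and `fetch_latest` are hypothetical names, not part of the skill.

```shell
# Hypothetical helper: print the most recently modified *.json file in
# the given directory, or nothing if no such file exists.
latest_json() {
    ls -t "$1"/*.json 2>/dev/null | head -n 1
}

# Hypothetical helper: run the same selection on the remote host and
# print the contents of the newest result file.
fetch_latest() {
    server=$1
    dir=$2
    ssh "$server" "f=\$(ls -t '$dir'/*.json 2>/dev/null | head -n 1); [ -n \"\$f\" ] && cat \"\$f\""
}
```

`fetch_latest <server> <results_dir>` prints the JSON to stdout, so it can be piped into whatever parser is available locally. It returns a non-zero status when the directory has no JSON files yet.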
Present results in a comparison table:
| Experiment | Metric | Delta vs Baseline | Status |
|-----------|--------|-------------------|--------|
| Baseline | X.XX | — | done |
| Method A | X.XX | +Y.Y | done |
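The table can be generated rather than typed by hand once the metrics are extracted. The sketch below assumes a hypothetical intermediate format, one tab-separated `name`, `metric`, `status` line per experiment with the baseline first; how those values are pulled out of the JSON depends on the result schema, which is not specified here. The helper name `results_table` is likewise an assumption.

```shell
# Hypothetical helper: read "name<TAB>metric<TAB>status" lines on stdin
# (first line is the baseline) and print a Markdown comparison table.
results_table() {
    awk -F '\t' '
        BEGIN {
            print "| Experiment | Metric | Delta vs Baseline | Status |"
            print "|-----------|--------|-------------------|--------|"
        }
        # The first row is the baseline, so it has no delta.
        NR == 1 {
            base = $2
            printf "| %s | %.2f | - | %s |\n", $1, $2, $3
            next
        }
        # Every later row reports a signed difference from the baseline.
        {
            printf "| %s | %.2f | %+.1f | %s |\n", $1, $2, $2 - base, $3
        }
    '
}
```

For example, `printf 'Baseline\t70.00\tdone\nMethod A\t72.50\tdone\n' | results_table` yields a baseline row followed by a `Method A` row with a delta of `+2.5`.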
npx claudepluginhub llv22/autoresearchwitheyes