harness-evals-export

Harness and Eval Export

Use this skill when the user wants benchmarking, dataset runs, or eval handoff artifacts.

AutoForge supports:

Preferred commands:

Run a dataset:
- autoforgeai harness run <dataset.jsonl>
Prewarm referenced images:
- autoforgeai harness prewarm <dataset.jsonl>
Export a dataset or run to an eval bundle:
- autoforgeai harness openai-export <dataset-or-run-path>
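
The three commands above share one shape, `autoforgeai harness <action> <path>`, so a wrapper script can build them from a single helper. The following is a minimal sketch, not part of AutoForge: the helper name `harness_command` and the placeholder path `evals/sample.jsonl` are hypothetical, and only the subcommands listed above are assumed to exist. It builds and prints the argument lists without executing anything.

```python
import shlex
from typing import List

# Subcommands copied from the command list above; no other flags or
# options are assumed about the CLI.
HARNESS_ACTIONS = ("run", "prewarm", "openai-export")


def harness_command(action: str, path: str) -> List[str]:
    """Return the argv list for `autoforgeai harness <action> <path>`.

    Hypothetical helper for illustration; it only assembles arguments.
    """
    if action not in HARNESS_ACTIONS:
        raise ValueError(f"unsupported harness action: {action!r}")
    if not path:
        raise ValueError("a dataset or run path is required")
    return ["autoforgeai", "harness", action, path]


if __name__ == "__main__":
    # "evals/sample.jsonl" is a placeholder path, not a real dataset.
    for action in HARNESS_ACTIONS:
        print(shlex.join(harness_command(action, "evals/sample.jsonl")))
```

Returning a list rather than a single string lets the result be passed directly to `subprocess.run` without shell quoting issues, if the caller chooses to execute it.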

Expected exported artifacts:

When a harness run has already completed, expect AutoForge to emit:

Always surface:
