From steel-skill-creator
Use this skill when the user wants to turn a recurring browser workflow into a reusable, parameterized agent skill, especially when the task has concrete inputs and a clear output such as scheduled scrapes, form submissions, data extraction, monitoring flows, price probes, or login-gated reports. For one-off web tasks, use steel-browser instead.
Slash command
/steel-skill-creator:steel-skill-creator
Turn a recurring web task into an agent skill. The user describes the task and one set of example inputs. You drive the task end-to-end twice in a real Steel browser (once with the example inputs, once with mutated inputs), capture both traces through the Steel CLI, author a parameterized SKILL.md, install it, and verify it on a third input set.
- README.md
- evals/evals.json
- references/authoring-workflow.md
- references/eval-writing.md
- references/llm-judge-rubric.md
- references/parameterization.md
- references/skill-template-guide.md
- references/skill-template.md
- references/steel-primitives.md
- references/trace-anatomy.md
- references/verification.md
- scripts/compare-traces.mjs
- scripts/fetch_trace.mjs
- scripts/install_skill.mjs
- scripts/scaffold-skill.mjs
- scripts/validate-skill.mjs
- templates/browser-task-skill/README.md
- templates/browser-task-skill/evals/evals.json
- templates/browser-task-skill/references/examples.md
- templates/browser-task-skill/references/troubleshooting.md
The user does not provide a session ID. You generate both sessions yourself.
Steel agent traces are already 80% of a SKILL.md. They contain stable selectors prioritized by quality (testId → id → aria → name → CSS), accessibleNames for every clicked element, page boundaries, idle gaps that mark wait points, and parameter values inline as URL query strings and form inputs. Your job is mostly reading two traces side by side and writing the parts that aren't there yet: the goal, the parameter names, the success criteria, and the right Steel session configuration (stealth, proxy, credentials, profiles) for production replay.
The reason you generate both traces yourself: parameter extraction is reliable when you can diff two runs of the same flow with different inputs. Anything that differs at corresponding positions is a parameter; anything identical is an invariant. You can't get that from a single trace without guessing, and asking the user to record twice doubles their effort. Better that you do it.
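The positional diff described above can be sketched as follows. This is a minimal illustration, not the skill's own tooling: the event shape (`accessibleName`, `value`) is a simplified assumption, the sample events are made up, and `diffAlignedEvents` is a hypothetical helper name.

```javascript
// Assumed, simplified event shape: { accessibleName: string, value?: string }.
// Real Steel trace events may use different field names.
function diffAlignedEvents(traceA, traceB) {
  if (traceA.length !== traceB.length) {
    throw new Error("Traces are not aligned: different event counts");
  }
  const parameters = [];
  const invariants = [];
  traceA.forEach((eventA, index) => {
    const eventB = traceB[index];
    const entry = {
      index,
      accessibleName: eventA.accessibleName,
      values: [eventA.value, eventB.value],
    };
    // Same position, different value: a parameter candidate.
    // Same position, same value: an invariant.
    if (eventA.value !== eventB.value) {
      parameters.push(entry);
    } else {
      invariants.push(entry);
    }
  });
  return { parameters, invariants };
}

// Made-up sample events from two runs of the same flow.
const exampleRun = [
  { accessibleName: "From", value: "SFO" },
  { accessibleName: "To", value: "JFK" },
  { accessibleName: "Direct only", value: "on" },
];
const mutatedRun = [
  { accessibleName: "From", value: "LAX" },
  { accessibleName: "To", value: "BOS" },
  { accessibleName: "Direct only", value: "on" },
];

const aligned = diffAlignedEvents(exampleRun, mutatedRun);
console.log(aligned.parameters.map((p) => p.accessibleName));
```

The sketch refuses to compare traces of different lengths rather than guessing an alignment; in practice the alignment step is a judgment call made while reading the two traces.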
Do not try to handle every edge case programmatically. Use your judgment. The references/ directory tells you what to look for; the rest is reading code and writing code.
- steel CLI is installed and authenticated for browser driving.
- steel-browser skill is available: you'll use the same primitives (steel browser start, navigate, snapshot, click, fill, wait, etc.) to drive both recording sessions.

You need three things from the user:
- flight-price-probe, weekly-traffic-report). If the user has not given one, propose one based on what the flow does and confirm.

If anything is missing or ambiguous, ask once, concisely. Don't drag the user through a long Q&A.
Before running the task even once, think about side effects:
Never silently run a side-effect flow twice. The user's trust is more important than the skill.
Use the steel-browser primitives to perform the task end-to-end with the example inputs from step 1. Start a session (with the right configuration per references/steel-primitives.md), navigate, click, wait, extract — exactly the same patterns you'd use if you were running /steel-browser for the user.
Take screenshots and snapshots along the way to verify each step worked. If a step fails, debug it like you would in any normal Steel session — don't push through a broken state.
When the task reaches its success state (the data is extracted, the form is submitted, the confirmation page renders), stop the session. Save the session ID — you'll need it in step 5.
Pick mutated inputs. For each example value from step 1, choose a sensible alternative that exercises the same flow:
Aim for different inputs, same intent. The goal is a trace that takes the same conceptual path so the diff isolates parameters cleanly.
Run the task again in a fresh Steel session using the mutated inputs. Same configuration, same starting URL, same logical sequence of actions.
If trace #2 diverges from trace #1 — hits a login wall trace #1 didn't, encounters a CAPTCHA, lands on a structurally different page — stop and investigate. Consult references/steel-primitives.md to decide whether to add Steel session options (stealth, proxy, captcha solving) or whether to escalate to the user. Do not paper over the divergence.
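One rough way to spot a structural divergence is to compare the sequence of pages each run visited, ignoring query strings. The sketch below assumes each trace has already been reduced to an ordered list of page URLs; `pagePath` and `findDivergence` are hypothetical helpers, and the URLs are made up.

```javascript
// Reduce a URL to origin + path. Query string and hash are dropped because
// they usually carry parameter values rather than page structure.
function pagePath(url) {
  const { origin, pathname } = new URL(url);
  return origin + pathname;
}

// Return the first position where the two runs are on different pages,
// or null when both runs follow the same page sequence.
function findDivergence(urlsA, urlsB) {
  const length = Math.max(urlsA.length, urlsB.length);
  for (let i = 0; i < length; i++) {
    const a = urlsA[i] === undefined ? null : pagePath(urlsA[i]);
    const b = urlsB[i] === undefined ? null : pagePath(urlsB[i]);
    if (a !== b) {
      return { index: i, first: a, second: b };
    }
  }
  return null;
}

// Made-up example: the second run is redirected to a login wall.
const pagesRun1 = [
  "https://example.com/search?from=SFO",
  "https://example.com/results?from=SFO",
];
const pagesRun2 = [
  "https://example.com/search?from=LAX",
  "https://example.com/login?next=%2Fresults",
];
console.log(findDivergence(pagesRun1, pagesRun2));
```

Sites that embed inputs in the path itself would show up here as a false divergence, so a hit is a prompt to stop and read the traces, not a verdict.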
Save the second session ID.
node scripts/fetch_trace.mjs <session-id-1>
node scripts/fetch_trace.mjs <session-id-2>
The script calls steel --json sessions traces <session-id>, writes the normalized trace JSON to a temp file, and prints the path. The trace has events with accessibleName, selectors, page URLs, value fields, and timestamps — see references/trace-anatomy.md for what to look at.
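Since the events carry timestamps, idle gaps between consecutive events (which mark wait points) can be listed with a few lines of code. This is a sketch only: it assumes each event has a numeric `timestamp` field in milliseconds, and both the 2000 ms threshold and the `findIdleGaps` name are hypothetical choices, not values from the Steel CLI.

```javascript
// Assumption: events are in chronological order and each has a numeric
// `timestamp` in milliseconds. The real trace format may differ.
function findIdleGaps(events, minGapMs = 2000) {
  const gaps = [];
  for (let i = 1; i < events.length; i++) {
    const gapMs = events[i].timestamp - events[i - 1].timestamp;
    if (gapMs >= minGapMs) {
      // The generated skill probably needs a wait after event i - 1.
      gaps.push({ afterIndex: i - 1, gapMs });
    }
  }
  return gaps;
}

// Made-up timestamps: a long pause follows the second event.
const timedEvents = [
  { accessibleName: "From", timestamp: 0 },
  { accessibleName: "Search", timestamp: 300 },
  { accessibleName: "Direct only", timestamp: 4300 },
  { accessibleName: "Sort by price", timestamp: 4500 },
];
console.log(findIdleGaps(timedEvents));
```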
You now have two traces. Read them together. The differences at corresponding positions are the parameters; the similarities are the invariants.
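For page URLs specifically, the same idea applies to query strings: keys whose values differ between the two runs are parameter candidates, and keys with identical values are invariants. The sketch below uses the standard `URL` API; the URLs are made up and `diffQueryParams` is a hypothetical helper.

```javascript
// Compare the query strings of two URLs recorded at the same step.
// Only the first value of a repeated key is compared in this sketch.
function diffQueryParams(urlA, urlB) {
  const a = new URL(urlA).searchParams;
  const b = new URL(urlB).searchParams;
  const keys = new Set([...a.keys(), ...b.keys()]);
  const changed = {};
  const unchanged = {};
  for (const key of keys) {
    const valueA = a.get(key); // null when the key is absent
    const valueB = b.get(key);
    if (valueA !== valueB) {
      changed[key] = [valueA, valueB];
    } else {
      unchanged[key] = valueA;
    }
  }
  return { changed, unchanged };
}

// Made-up URLs from the example run and the mutated run.
const queryDiff = diffQueryParams(
  "https://example.com/search?from=SFO&to=JFK&direct=1",
  "https://example.com/search?from=LAX&to=BOS&direct=1"
);
console.log(Object.keys(queryDiff.changed));
```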
Consult these references in order:
- references/trace-anatomy.md: how to read each event and how to align two traces.
- references/steel-primitives.md: the decision tree for credentials, profiles, stealth, proxy, and CAPTCHA. This determines how the generated skill creates its Steel session.
- references/skill-template.md: the scaffold to fill in.

The generated skill depends on the steel-browser skill at runtime; it does not embed CLI commands inline. Every generated skill must include a ## Prerequisites section (the template shows the exact wording) that points the user at curl -fsS https://setup.steel.dev | sh, which installs the steel CLI and the steel-browser skill. The body steps then describe intent in plain prose, with element names in italics; the executor translates that into snapshot → click → fill → wait sequences using steel-browser's primitives.
Selector choice for the generated skill, in priority order:
- data-testid if present.
- id if present and looks intentional.
- aria-label or name.
- :nth-of-type chains unless nothing else exists.

Use target.selector as evidence for available selectors, but prefer a readable semantic step when the accessible name is specific enough. For example, write "Click the Direct only checkbox" instead of exposing a raw selector.
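The selector priority (test ID, then ID, then aria-label or name, then CSS as a last resort) can be sketched as a small fallback chain. Everything here is an assumption for illustration: the candidate object shape, the `pickSelector` and `looksIntentional` names, and especially the heuristic for what counts as an intentional ID.

```javascript
// Heuristic (assumption): IDs with long digit runs or a leading ":" or "_"
// are treated as auto-generated and skipped.
function looksIntentional(id) {
  return !/\d{4,}/.test(id) && !/^[:_]/.test(id);
}

// Hypothetical candidate shape: { testId, id, ariaLabel, name, css }.
// Values are interpolated without CSS escaping to keep the sketch short.
function pickSelector(candidates) {
  if (candidates.testId) {
    return `[data-testid="${candidates.testId}"]`;
  }
  if (candidates.id && looksIntentional(candidates.id)) {
    return `#${candidates.id}`;
  }
  if (candidates.ariaLabel) {
    return `[aria-label="${candidates.ariaLabel}"]`;
  }
  if (candidates.name) {
    return `[name="${candidates.name}"]`;
  }
  // Raw CSS is the last resort; null means nothing usable was recorded.
  return candidates.css ?? null;
}

console.log(pickSelector({ id: "ember12345", ariaLabel: "Direct only" }));
```

A selector chosen this way is still only evidence; when the accessible name is specific enough, the generated step should stay in prose.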
Name parameters by what they mean, not what they are. depart_date, not param_2. The model that will later use the generated skill is reading these names cold; clarity matters.
Write the goal in the imperative ("Find the cheapest direct round-trip price for {origin} → {destination} between {depart} and {return}"). Write success criteria in terms of what gets returned, not what gets clicked.
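To show how named placeholders in a goal line up with parameter names, here is a minimal rendering sketch. `renderGoal` is a hypothetical helper and the input values are made up; the generated skill itself states the goal in prose and does not need this code.

```javascript
// Replace each {name} placeholder with the matching parameter value.
// A missing parameter is an error rather than a silent blank.
function renderGoal(template, params) {
  return template.replace(/\{(\w+)\}/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(params, name)) {
      throw new Error(`Missing parameter: ${name}`);
    }
    return String(params[name]);
  });
}

const goalTemplate =
  "Find the cheapest direct round-trip price for {origin} → {destination} between {depart} and {return}";

// Made-up input set.
console.log(
  renderGoal(goalTemplate, {
    origin: "SFO",
    destination: "JFK",
    depart: "2025-03-01",
    return: "2025-03-08",
  })
);
```

Descriptive names such as `depart` make a missing-parameter error readable on its own, which is the same property that helps a model reading the skill cold.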
node scripts/install_skill.mjs <skill-name> --skill-md <path-to-generated-SKILL.md>
The script writes to ~/.claude/skills/<skill-name>/ and tells you the final path. Generated skills go system-wide by default so the user can invoke them from any project.
Pick a third set of inputs, distinct from both step 1 (example inputs) and step 4 (mutated inputs). Invoke the freshly installed skill against those inputs in a fresh Steel session. Capture the resulting session ID and fetch trace #3.
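A quick sanity check that the third input set really is distinct can be sketched as below. It applies a stricter per-field reading of "distinct" than the text requires, which is an assumption; `sharedValues` is a hypothetical helper and the inputs are made up.

```javascript
// Return the parameter names whose value in `candidate` also appears,
// under the same name, in any earlier input set.
function sharedValues(candidate, previousSets) {
  const shared = [];
  for (const [key, value] of Object.entries(candidate)) {
    if (previousSets.some((set) => set[key] === value)) {
      shared.push(key);
    }
  }
  return shared;
}

// Made-up input sets: example (step 1), mutated (step 4), and a third candidate.
const exampleInputs = { origin: "SFO", destination: "JFK" };
const mutatedInputs = { origin: "LAX", destination: "BOS" };
const thirdInputs = { origin: "SEA", destination: "JFK" };

// A non-empty result means the candidate reuses an earlier value.
console.log(sharedValues(thirdInputs, [exampleInputs, mutatedInputs]));
```

Some parameters have few valid values (a boolean flag, for instance), so a reused value is sometimes unavoidable; the check only flags it for a second look.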
Read references/llm-judge-rubric.md. With the three traces, the generated skill source, and the inputs/outputs you've gathered, form a verdict:
Do not run a structural diff. Form a judgment.
If the verdict is good: tell the user the skill is installed, where, and show one example invocation with concrete inputs.
If the verdict is mixed: show the user the diagnosis in plain English and offer to revise — usually the fix is one of: a wait point in the wrong place, a selector that was too specific, a parameter that should have been split into two, or a Steel session option that was missed.
Ask the user when:
Proceed without asking when:
- references/steel-primitives.md is unambiguous.

A successful run leaves the user with:
- ~/.claude/skills/<skill-name>/.
- steel credentials create --origin https://foo.com once before first use").
npx claudepluginhub steel-dev/skills --plugin steel-skill-creator