Skill

test-skill

Scaffolds pytest smoke tests and runs behavioral tests for Claude Code skills in Docker harness. Generates golden files, runs pytest, reports LLM verdicts and costs.

Python

Pytest

Docker Compose

testing

Popularity

Parent stars

Parent forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/dstoic:test-skill

User invocable

Model invocable

Inline context

Default effort

Configuration

Modelsonnet

Tool Access

This skill is limited to the following tools:

BashReadWriteEditAskUserQuestionGlob

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Scaffold + run behavioral tests for a skill in the Docker test harness.

Supporting Files

reference.md

SKILL.md

47 lines · ~549 tokens

Stats

LanguagePython

Parent stars12

Parent forks3

MaintenanceExcellent

Last CommitApr 6, 2026

Actions

View Source View Plugin View on GitHub View README

Test Skill

Scaffold + run behavioral tests for a skill in the Docker test harness.

⚠️ AskUserQuestion Guard

CRITICAL: After EVERY AskUserQuestion call, check if answers are empty/blank. Known Claude Code bug: outside Plan Mode, AskUserQuestion silently returns empty answers without showing UI.

If answers are empty: DO NOT proceed with assumptions. Instead:

Output: "⚠️ Questions didn't display (known Claude Code bug outside Plan Mode)."
Present the options as a numbered text list and ask user to reply with their choice number.
WAIT for user reply before continuing.

Arguments

From $ARGUMENTS: skill_name (positional, kebab-case), --run-only, --scaffold-only.

Derive: snake_name = - → _, test_file = test/tests/test_{snake_name}.py, golden_file = test/fixtures/golden/{skill_name}-smoke.md.

Steps

1. Validate — dstoic/skills/{skill_name}/SKILL.md must exist. Error if not.

2. Scaffold test (skip if --run-only) — If test_file exists → skip to 4. Otherwise AskUserQuestion:

"What scenario should the smoke test cover?" (header: Scenario)
"What YES/NO question should the LLM judge answer?" (header: Judge Q)

Generate test_file from template in reference.md. Derive prompt from scenario.

3. Scaffold golden (skip if --run-only) — If golden_file exists → skip. Read source SKILL.md → generate simplified version: frontmatter (name+description) + minimal body. Under 250 tokens.

4. Run (skip if --scaffold-only)

docker compose -f test/docker-compose.test.yml run --rm skill-tester pytest tests/test_{snake_name}.py -v -s

5. Report — Parse test/output/{snake_name}_smoke.yaml: status, judge verdict/reason, cost USD. Also show latest trace file test/output/{snake_name}_smoke_trace_*.yaml.

test-skill

Popularity

Invocation

Configuration

Tool Access

Context Preview

Supporting Files

SKILL.md

test-skill

Popularity

Invocation

Configuration

Tool Access

Context Preview

Supporting Files

SKILL.md

Test Skill

⚠️ AskUserQuestion Guard

Arguments

Steps

Similar Skills

Test Skill

⚠️ AskUserQuestion Guard

Arguments

Steps

Similar Skills