Spec-driven feature development for any codebase or tech stack. Guides the developer through interactive discovery, generates a formal-ish spec with EARS requirements and state contracts, derives an implementation plan, and produces a test strategy. Use this skill whenever the developer wants to start a new feature, modify complex behavior, scope a bug fix, plan an implementation, or write any kind of feature spec — even if they don't say "spec" explicitly. Also trigger when the developer says things like "let's think through this before coding", "scope this out", "what should the plan be", or asks for requirements, acceptance criteria, or a contract before implementation.
Slash command: /think-discover-define:think-discover-define
A Claude Code skill that replaces vibe-coding with structured, contract-based feature development. Works with any language, framework, or architecture.
Spec is the contract. Not a waterfall document. Not a rigid pipeline. A lightweight, living reference that both you and Claude Code point at when things drift.
Discovery before implementation. The biggest source of wasted cycles is starting to code before requirements are clear. This skill front-loads the thinking through guided questions.
Formal-methods-flavored, not formal-methods-heavy. We borrow pre/post conditions, state invariants, and EARS notation from the formal world without requiring mathematical notation or verification tooling.
Micro-spec, not mega-spec. One spec per feature or change. Small, focused, scoped to one shippable unit. Not a living document for the whole app.
Specs are persistent, not disposable. Specs remain valuable after the feature ships. They serve as compressed context for future LLM sessions, onboarding references, and regression investigation aids. To keep them trustworthy, every spec carries a file manifest — a list of the source files and sections it references. Before any phase begins, the skill checks whether manifested files have changed since the spec's last update. If drift is detected, the skill flags the stale sections and prompts the developer to review and update before proceeding.
Stack-agnostic with automatic context. The spec describes behavior, not implementation. Tech stack details surface in the Plan phase, not the Spec phase. The same spec format works whether you're building in Swift, TypeScript, Kotlin, Rust, or anything else. However, the skill automatically discovers and synthesizes project documents (CLAUDE.md, .cursorrules, architecture docs, etc.) so it understands the project's conventions without the developer having to explain them.
| Change Type | What to Do |
|---|---|
| Trivial (rename, typo, config tweak) | Skip this skill entirely. Just do it. |
| Everything else | Run the full four-phase pipeline. |
Seemingly small bugs can have deep root causes. A single-function change can ripple across modules. Always run the full pipeline — the overhead is low and the documentation pays for itself later (specs are persistent, not throwaway). The developer controls what to execute and when, but the skill always produces the complete set of outputs.
If a task feels too large for a single spec, it should have been broken down at the ticket/story stage. This skill specs one shippable unit at a time.
Before each phase, scan the project for relevant documents and synthesize them into a context summary. This is NOT a one-time step — it runs before every phase because different phases need different documents (e.g., testing docs are most relevant before Phase 4, architecture docs before Phase 3).
Priority: reuse before re-scanning. Before running a fresh scan, ALWAYS check the documents already generated by this skill (ORIENT, DISCOVER, SPEC, PLAN, TEST). These contain synthesized context from prior phases and should be the first source of truth. Only scan the project for additional documents not already captured. This saves time and tokens — avoid re-discovering what was already synthesized.
Search for project documents using a broad scan — do NOT assume a fixed folder
name like specs/. Treat every .md file in the project as a potential document
worth checking. Specifically look for:
- LLM context files: CLAUDE.md, .cursorrules, .github/copilot-instructions.md, AGENTS.md, .ai/, .prompts/, or similar
- Architecture docs: ARCHITECTURE.md, docs/architecture*, design/, ADR/, decisions/, or any architectural decision records
- API docs: API.md, docs/api*
- Testing docs: TESTING.md, docs/testing*, test/README*, test strategy files
- Documentation folders: docs/, specs/, design/, .specs/, documentation/, and root-level markdown files
- Project config: package.json, Cargo.toml, pyproject.toml, build.gradle, etc. for tech stack signals
- Any other .md files: every markdown file in the project is a potential document. Scan broadly and assess relevance based on content, not just name.
For each phase, produce a short Project Context Summary that includes only the documents relevant to that phase. Each excerpt MUST include the full file path so the LLM knows where to go deeper:
## Project Context (synthesized for Phase N)
### Tech Stack (from `./package.json`, `./CLAUDE.md`)
- TypeScript / Next.js 14 / App Router
- Zustand for state, Tailwind for styling
- Jest + React Testing Library for tests
### Conventions (from `./CLAUDE.md` lines 12-45, `./docs/ARCHITECTURE.md`)
- Feature modules in `src/features/<name>/`
- Co-located tests in `__tests__/` subdirectories
- API clients in `src/api/` with typed responses
### Testing Strategy (from `./docs/TESTING.md`, only included for Phase 4)
- Unit tests for all business logic, integration tests for API layer
- No snapshot tests, prefer behavioral assertions
- Mock API calls with msw
If no project documents are found, note this and proceed with information gathered from code analysis alone.
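The broad scan described above can be sketched roughly as follows. This is an illustration only, not the skill's actual mechanism: the category names, the `SKIP_DIRS` exclusions, and the exact-filename matching are assumptions, and only a subset of the listed filenames is covered.

```python
from pathlib import Path

# Hypothetical category rules. The filenames come from the list above;
# the category keys and the matching scheme are assumptions.
CATEGORIES = {
    "llm_context": ("CLAUDE.md", ".cursorrules", "AGENTS.md", "copilot-instructions.md"),
    "architecture": ("ARCHITECTURE.md",),
    "testing": ("TESTING.md",),
    "project_config": ("package.json", "Cargo.toml", "pyproject.toml", "build.gradle"),
}
SKIP_DIRS = {".git", "node_modules"}  # assumed exclusions, not part of the skill


def categorize(path: Path) -> str:
    """Return a category for a known filename, 'other_markdown' for any other .md file, else ''."""
    for category, names in CATEGORIES.items():
        if path.name in names:
            return category
    return "other_markdown" if path.suffix.lower() == ".md" else ""


def scan_project(root: Path) -> dict[str, list[str]]:
    """Walk the project and group candidate documents by category, keeping full relative paths."""
    found: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if not path.is_file() or SKIP_DIRS & set(relative.parts):
            continue
        category = categorize(path)
        if category:
            found.setdefault(category, []).append("./" + relative.as_posix())
    return found
```

Every result keeps its path relative to the project root, which matches the requirement that each excerpt in the summary carries the full file path. Relevance still has to be judged from content, which a filename scan cannot do.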
┌──────────────────────────────────────────────────────────────────────────────────┐
│ /spec │
│ │
│ Phase 0: ORIENT ► Phase 1: DISCOVER ► Phase 2: SPECIFY ► Phase 3: PLAN │
│ (silent recon) (interactive Q&A) (generate spec) (derive tasks) │
│ │
│ ► Phase 4: TEST │
│ (derive tests) │
│ │
│ Outputs: │
│ Phase 0: <slug>-ORIENT.md Phase 1: <slug>-DISCOVER.md │
│ Phase 2: <slug>-SPEC.md Phase 3: <slug>-PLAN.md │
│ Phase 4: <slug>-TEST.md │
│ │
│ Smart Document Discovery runs before EVERY phase. │
│ Each phase has a CHECKPOINT where the developer reviews │
│ and approves before proceeding. Looping back is expected. │
└──────────────────────────────────────────────────────────────────────────────────┘
Critical rule: Never auto-advance without developer approval. After each phase, stop and ask: "Does this look right? Want to adjust anything?"
Before asking the developer a single question, scan the project to build context. This prevents asking questions the code or docs already answer.
Run the Smart Document Discovery scan (see section above). For Phase 0, focus on LLM context files, architecture docs, and project config. Synthesize findings into the Project Context Summary with full file paths.
Based on the developer's initial message, locate the relevant files, modules, or directories. Read key files (entry points, models, existing tests).
Note the project's conventions: naming, architecture style, test framework, existing state management, error handling patterns. Cross-reference what you find in code with what the project documents say the patterns should be. This catches drift between documented conventions and actual practice — valuable context for later phases.
Search broadly for prior specs or design documents — do NOT assume a specs/
folder. Scan docs/, design/, documentation/, .specs/, ADR/, and
root-level markdown for anything with EARS requirements, acceptance criteria,
or spec-like structure. If found, read them to understand precedent and check
their file manifests for staleness.
Record the current state (last-modified timestamps or git hashes) of the key files identified in Steps 2-4. This becomes the baseline for the spec's file manifest, used later to detect staleness.
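One way to picture the manifest baseline and the later staleness check is the sketch below. It assumes a content hash (SHA-256) stands in for the "git hashes" option; the function names and the dictionary layout are hypothetical, not a format the skill prescribes.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file's bytes; a stand-in for a git hash or timestamp."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(root: Path, files: list[str]) -> dict[str, str]:
    """Record the current digest of every manifested file (the baseline)."""
    return {name: file_digest(root / name) for name in files}


def detect_drift(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return manifested files that changed or disappeared since the baseline."""
    drifted = []
    for name, recorded in manifest.items():
        path = root / name
        if not path.is_file() or file_digest(path) != recorded:
            drifted.append(name)
    return drifted
```

An empty list from `detect_drift` means the spec's referenced files are unchanged. Any returned path marks sections of the spec that should be reviewed before the next phase starts.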
Save the Phase 0 findings as <feature-slug>-ORIENT.md in the spec folder (see File Naming & Output Location below). Include: Project Context Summary, identified area, detected patterns, existing specs found, and the file manifest baseline.
Use Phase 0 findings to skip obvious discovery questions ("What tech stack?" when you can see it's a Next.js app) and to make Layer 2+ questions more specific ("I see you use Zustand for state — do you want the comment state in the existing task store or a new one?").
Surface requirements, edge cases, and constraints through adaptive questioning. The developer should never need to write a perfect prompt from scratch — the questionnaire pulls out what's needed.
Run the discovery as an interactive conversation. Ask questions in layers. Do not dump all questions at once. Ask Layer 1, wait for answers, then adapt Layer 2 based on what was said.
Iteration rule: Do NOT accept the first answers at face value. After the developer answers Layer 2, review the responses and probe for gaps:
Loop at least once. Present your improved understanding and ask the developer to confirm or correct. The goal is to get from a rough sketch to a precise behavioral description before moving to Layer 3.
Based on the developer's Layer 1 and Layer 2 answers, proactively generate a list of likely edge cases and error states before asking open-ended questions. Present them for the developer to review:
"Based on what you've described, here are edge cases and error states I'd expect. For each, tell me: in scope, out of scope, or 'hadn't thought of that — let's discuss'."
Then ask the open-ended questions for anything not already covered:
Proactive discovery rule: The developer should not have to invent edge cases from scratch. The LLM proposes based on the feature type, the codebase patterns found in Phase 0, and common failure modes for similar features. The developer's job is to review, confirm, and add anything domain-specific that the LLM couldn't anticipate.
Ask these only if relevant based on earlier answers:
Before asking questions, run Smart Document Discovery for testing-related documents (TESTING.md, test strategy files, test READMEs). If testing docs exist, synthesize their guidance and present it:
"I found your project's testing docs at ./docs/TESTING.md. Here's what they prescribe: [summary]. I'll use these as the baseline for the test strategy. Do you want to override anything for this feature?"
Then ask, adapting questions based on what the testing docs already cover:
If NO testing docs exist, use these questions to establish a default testing strategy for this feature, and note the absence — the developer may want to create project-level testing docs based on the patterns established here.
The developer might arrive with a half-formed spec, a Jira ticket, a Slack thread, or a detailed prompt that already answers many discovery questions. Always run the full pipeline regardless — but use the provided input to pre-fill answers and accelerate each step.
If the developer says "skip discovery" or "I know what I want," present the Discovery Summary with what you have and flag the gaps. Let them decide whether to fill them or proceed with the gaps noted as Open Questions. But recommend completing the full pipeline — the spec persistence makes it worth the investment.
After the Q&A, summarize the raw answers in a structured block at the top of the spec:
## Discovery Summary
- **Project / Stack:** [project name, language, framework]
- **Area:** [module/screen/service]
- **Type:** [new feature | modification | bugfix]
- **User goal:** [one sentence]
- **API surface:** [endpoints/mutations/services involved, or "none"]
- **Entry:** [how user/caller gets here]
- **Happy path:** [numbered steps]
- **Exit:** [where user/caller goes after, or resulting system state]
- **Error states:** [list]
- **Anti-requirements:** [what this does NOT do]
- **Test focus:** [what must be tested]
Present the Discovery Summary. Ask:
"Here's what I captured. Does this cover everything? Anything missing or wrong?"
Do NOT proceed to Phase 2 until the developer confirms.
Once confirmed, save the full discovery output as <feature-slug>-DISCOVER.md
in the spec folder. Include all layer answers, the structured Discovery Summary,
and any notes from the iteration loops.
Run Smart Document Discovery focused on architecture docs, existing specs, and domain documentation. Synthesize any relevant context that should inform the requirements (e.g., existing API contracts, domain terminology, architectural constraints).
Transform discovery answers into a precise, testable specification using EARS notation and formal-methods-flavored contracts. The spec describes BEHAVIOR, not implementation. No tech-stack-specific code or patterns belong here.
Read the template at SPEC_TEMPLATE.md (in the same directory as this skill)
and the completed example at EXAMPLE_SPEC.md for reference. Fill each
section based on the discovery answers.
Every behavioral requirement MUST be written in EARS notation:
| Pattern | Keyword | Template | Use When |
|---|---|---|---|
| Ubiquitous | (none) | The <system> shall <response> | Always-true constraints |
| Event-driven | When | When <trigger>, the <system> shall <response> | User actions, API calls, events |
| State-driven | While | While <precondition>, the <system> shall <response> | Mode-dependent behavior |
| Unwanted | If/Then | If <bad thing>, then the <system> shall <response> | Error handling |
| Complex | Combined | While <pre>, when <trigger>, the <system> shall <response> | Multi-condition behavior |
Rules for writing EARS requirements:
For any feature with more than two meaningful states, define:
Format as a simple table. Only produce a full diagram if the developer asks.
For each significant operation (API call, database write, state transition, command, etc.), define:
### Operation: `OperationName`
**Preconditions** (must be true before calling):
- [condition]
**Postconditions (success):**
- [what must be true after successful completion]
**Postconditions (failure):**
- [what must be true after failure — typically: previous state preserved]
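To show how such a contract can map onto code, here is a minimal sketch for a hypothetical `add_comment` operation. The `CommentThread` model and `PreconditionError` are invented for illustration; in the skill itself, contracts stay in the spec as prose and any code like this belongs to the Plan phase.

```python
from dataclasses import dataclass, field


class PreconditionError(Exception):
    """Raised when an operation is called in a state the contract forbids."""


@dataclass
class CommentThread:
    # Hypothetical model used only to illustrate the contract shape.
    locked: bool = False
    comments: list[str] = field(default_factory=list)

    def add_comment(self, text: str) -> None:
        # Preconditions: checked before any state is touched, so a failure
        # leaves the previous state preserved (the failure postcondition).
        if self.locked:
            raise PreconditionError("thread is locked")
        if not text.strip():
            raise PreconditionError("comment text is empty")

        count_before = len(self.comments)
        self.comments.append(text.strip())

        # Postconditions (success): exactly one comment added, and it is last.
        assert len(self.comments) == count_before + 1
        assert self.comments[-1] == text.strip()
```

Checking every precondition before the first mutation is what makes the "previous state preserved" postcondition hold without any rollback logic.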
Explicit list of what this feature does NOT do. Just as important as the requirements — they prevent scope creep and give Claude Code clear boundaries.
List what must exist before this feature can be built: APIs that need to be available, data migrations, feature flags, other features that must ship first, third-party services. Use checkboxes so the developer can track them.
Capture anything unresolved after discovery that must be answered before
implementation begins. These are not blockers for the spec — they're
blockers for coding. Mark them resolved with [x] as answers come in.
Present the full spec. Ask:
"Here's the spec. Review each EARS requirement — are they accurate? Any missing states or error cases? Anti-requirements to add? Dependencies or open questions I missed?"
This is the most important checkpoint. Loop until the developer confirms.
Run Smart Document Discovery focused on architecture docs, project structure conventions, and any implementation guidelines. This is the phase where tech-stack details matter most — synthesize everything relevant to making implementation decisions.
Derive a concrete implementation plan from the spec. THIS is where tech-stack details appear — not in the spec. The plan maps spec requirements to specific files, components, and patterns in the codebase.
Read EXAMPLE_PLAN.md for a completed reference. Analyze the spec and the
relevant parts of the codebase, then produce:
List which files/modules need changes. For each:
Break work into small, independently verifiable tasks. Each task:
Format:
- [ ] **T1: [short title]** — [what to do]. Fulfills: R1, R3. Tests: TC-1, TC-4.
Done when: [specific verifiable condition]
Suggest an implementation order based on dependencies. Explicitly note which tasks are independent and can be done in any order:
T1 → T2 → T3 (sequential, each depends on previous)
↘ T4 → T5 (parallel track, can start after T1)
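The ordering in the diagram above is a topological sort of the task dependency graph. As a sketch, Python's standard `graphlib` module can derive it and expose which tasks are independent at each step; the `dependencies` mapping mirrors the diagram, and `execution_waves` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on, mirroring the diagram above.
dependencies = {
    "T1": set(),
    "T2": {"T1"},
    "T3": {"T2"},
    "T4": {"T1"},
    "T5": {"T4"},
}


def execution_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; tasks in the same wave do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the tasks depend on each other in a loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the mapping above this yields `[["T1"], ["T2", "T4"], ["T3", "T5"]]`: T2 and T4 can start in either order once T1 is done, which is the parallel track the diagram shows.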
Flag implementation decisions NOT covered by the spec that need developer input during coding. These are typically tech-stack-specific choices.
Present the plan. Ask:
"Here's the implementation plan. Want to reorder anything? Any tasks missing or too large? Any decisions you want to make upfront?"
Run Smart Document Discovery focused on testing-related documents. If testing docs were already surfaced in Layer 5, reuse that synthesis. If new docs are found, update the context. The test strategy MUST conform to the project's testing conventions if they exist.
Derive a test strategy directly from the EARS requirements. Every test traces back to a spec requirement. Every test case that connects to a requirement is necessary — do not prioritize or tier tests. No orphan tests. No tests for trivial things unless the developer explicitly wants them.
Read EXAMPLE_TEST.md for a completed reference.
For each EARS requirement, derive test cases using this mapping:
| EARS Pattern | Test Structure |
|---|---|
| When <trigger>, shall <response> | Setup: trigger condition. Assert: response. |
| While <pre>, shall <response> | Setup: establish precondition. Assert: response holds. |
| If <bad>, then shall <response> | Setup: cause bad condition. Assert: response. |
| Complex | One test per condition combination. |
Default format (adapt casing to the project's convention):
test_[subject]_[condition]_[expectedOutcome]
Rules:
- No testThat, should, or verify prefixes: state just the behavior.
If the project has an existing test naming convention, defer to it.
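As a small illustration of the default format, the sketch below assembles a name from three plain-language phrases. The helper names are hypothetical, and camelCase within each segment is an assumption read off the `expectedOutcome` placeholder; a project convention, where one exists, takes precedence.

```python
import re


def _camel(phrase: str) -> str:
    """Turn a phrase like 'empty text' into camelCase ('emptyText')."""
    words = re.findall(r"[A-Za-z0-9]+", phrase)
    return "".join(w.lower() if i == 0 else w.capitalize() for i, w in enumerate(words))


def build_test_name(subject: str, condition: str, expected_outcome: str) -> str:
    """Assemble test_[subject]_[condition]_[expectedOutcome] from three phrases."""
    return f"test_{_camel(subject)}_{_camel(condition)}_{_camel(expected_outcome)}"
```

For example, `build_test_name("add comment", "empty text", "shows inline error")` gives `test_addComment_emptyText_showsInlineError`.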
### TC-1: `test_functionName` → R1
**Setup:** [what state to establish]
**Action:** [what to trigger]
**Assert:** [what must be true after]
Every test case that traces to an EARS requirement or invariant is necessary. Do NOT assign priority tiers (P0/P1/P2) or suggest that some tests are optional. If a requirement is in the spec, its test cases are required.
The developer can choose to descope requirements from the spec — but that's a spec decision, not a testing decision. If a requirement exists, it gets tested.
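The two rules above (no orphan tests, and every requirement in the spec gets tested) amount to a traceability check in both directions. A minimal sketch, assuming requirement IDs like `R1` and test case IDs like `TC-1` as used in this document; the function name and the returned keys are hypothetical.

```python
def check_traceability(
    requirements: set[str], test_cases: dict[str, set[str]]
) -> dict[str, list[str]]:
    """Report orphan tests and untested requirements.

    test_cases maps a test case ID to the requirement IDs it traces to.
    """
    covered = set().union(*test_cases.values())
    return {
        # Tests that trace to no requirement currently in the spec.
        "orphan_tests": sorted(
            tc for tc, reqs in test_cases.items() if not (reqs & requirements)
        ),
        # Requirements in the spec that no test case traces to.
        "untested_requirements": sorted(requirements - covered),
    }
```

Both lists should be empty before the test strategy is presented. An entry under `untested_requirements` is resolved either by adding a test case or by descoping the requirement in the spec, never by skipping the test.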
If the project has testing docs that define a testing strategy (found during Smart Document Discovery), follow their conventions for test organization, coverage expectations, and tooling.
Explicitly list exclusions, referencing Phase 1 answers. Common exclusions:
Present the test strategy. Ask:
"Here are the derived test cases. Are the names clear? Any cases missing? Anything here that's not worth testing?"
All generated files go into a dedicated folder named with the feature or ticket number. The folder name format is:
specs/<ticket-number>--<feature-slug>/
Examples:
- specs/PROJ-1234--task-comments/
- specs/GH-42--user-auth-flow/
- specs/add-comment/ (if no ticket number exists)
Use kebab-case for the feature slug. If no ticket/issue number exists, use just the feature slug. If a specs/ directory doesn't exist yet, create it at the project root.
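The folder convention can be sketched as two small helpers. The function names are hypothetical, and `output_path` applies the slug-prefixed file naming that this section goes on to describe.

```python
import re
from pathlib import PurePosixPath
from typing import Optional


def kebab_case(text: str) -> str:
    """Lowercase the text and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))


def spec_folder(feature: str, ticket: Optional[str] = None) -> PurePosixPath:
    """specs/<ticket-number>--<feature-slug>/, or specs/<feature-slug>/ without a ticket."""
    slug = kebab_case(feature)
    name = f"{ticket}--{slug}" if ticket else slug
    return PurePosixPath("specs") / name


def output_path(feature: str, phase: str, ticket: Optional[str] = None) -> PurePosixPath:
    """Path of one phase output, prefixed with the feature slug."""
    return spec_folder(feature, ticket) / f"{kebab_case(feature)}-{phase}.md"
```

For instance, `output_path("Task Comments", "SPEC", "PROJ-1234")` resolves to `specs/PROJ-1234--task-comments/task-comments-SPEC.md`.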
Every output file is prefixed with the feature slug so the task is identifiable from the filename alone — even if the file is moved, copied, or shows up in a search result outside its folder.
Save outputs incrementally as each phase completes (do not wait until all phases are done):
- <feature-slug>-ORIENT.md: Phase 0 findings (project context, detected patterns, file manifest)
- <feature-slug>-DISCOVER.md: Phase 1 output (full discovery summary, all layer answers)
- <feature-slug>-SPEC.md: Phase 2 output (EARS requirements, state model, contracts)
- <feature-slug>-PLAN.md: Phase 3 output (implementation plan with task checklist)
- <feature-slug>-TEST.md: Phase 4 output (test strategy with all test cases)
Examples for a feature slug task-comments:
- specs/PROJ-1234--task-comments/task-comments-ORIENT.md
- specs/PROJ-1234--task-comments/task-comments-SPEC.md
Once all phases are complete, present the developer with the implementation starting point:
"Spec is locked. All outputs saved to specs/<folder>/. Ready to start implementing. Want me to begin with task T1, or a different one?"
During implementation, always reference the spec when making decisions. If the developer asks for something that contradicts the spec, flag it:
"This would change the behavior from R3. Want to update the spec first, or proceed with the change?"
Specs change. That's fine — the point is to change them explicitly, not silently. Because specs are persistent, keeping them accurate is important. Drift happens in three ways:
Developer-initiated drift: the developer asks for something that contradicts the spec during implementation. Record the change in a `## Change Log` section at the bottom of SPEC.md.
Discovery drift: implementation reveals something the spec didn't anticipate (new edge case, unexpected API behavior, missing state).
File manifest drift: the source files referenced by the spec have changed since the spec was written. Flag the affected requirements to the developer, for example: "src/api/tasks.ts has changed since this spec was written. Requirements R5-R7 reference this file's error handling."
Change Log format (append to bottom of SPEC.md when changes occur):
## Change Log
- **[date]** R3 updated: changed from "display toast" to "display inline
error" per developer feedback during T4 implementation.
- **[date]** R12 added: discovered that API returns 429 on rate limit,
added retry-with-backoff requirement.
Never silently deviate from the spec. If you catch yourself implementing something the spec doesn't describe, stop and spec it first.
UBIQUITOUS: The <system> shall <response>
EVENT: When <trigger>, the <system> shall <response>
STATE: While <precondition>, the <system> shall <response>
UNWANTED: If <bad thing>, then the <system> shall <response>
COMPLEX: While <pre>, when <trigger>, the <system> shall <response>
PRECONDITION: What must be true BEFORE an operation
POSTCONDITION: What must be true AFTER an operation succeeds
INVARIANT: What must ALWAYS be true regardless of state
IMPOSSIBLE: Transitions or states that must NEVER occur
npx claudepluginhub benedekvarga/bens-claude-code-toolkit --plugin think-discover-define