Skill

grillme

Socratic plan validation. Grills your plan one question at a time against the project's domain language (CONTEXT.md) and prior decisions (ADRs in docs/adr/), updating both inline as terms resolve. Catches terminology drift, hidden assumptions, and undocumented decisions BEFORE implementation. WRITES to CONTEXT.md and docs/adr/ with per-write confirmation — use --no-write for dry-run. Triggers: "grill my plan", "stress-test this plan", "validate this approach", "is this consistent with our domain", "are there decisions missing", "before I implement", "challenge my assumptions", "/grillme". Example: "/squad:grillme --deep store user notifications in a separate table". For exploring options BEFORE a plan exists, use /brainstorm. For implementation, use /squad:implement.

Popularity

Stars

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/squad:grillme

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Take the user's plan (a paragraph, a bullet list, a PRD-ish blurb — whatever) and run a Socratic interview that surfaces contradictions, sloppy terminology, and unstated decisions, **one question at a time**, grounded in the project's domain artefacts. When a term gets resolved or a decision gets made, capture it inline in `CONTEXT.md` and (sparingly) `docs/adr/`. The point is not to write code...

Supporting Files

ADR-FORMAT.mdCONTEXT-FORMAT.md

SKILL.md

235 lines · ~3.9k tokens

Stats

LanguageTypeScript

Stars2

Forks3

MaintenanceExcellent

Last CommitMay 29, 2026

Actions

View Source View Plugin View on GitHub View README

Stats

Actions

Skill: Grillme

Objective

Take the user's plan (a paragraph, a bullet list, a PRD-ish blurb — whatever) and run a Socratic interview that surfaces contradictions, sloppy terminology, and unstated decisions, one question at a time, grounded in the project's domain artefacts. When a term gets resolved or a decision gets made, capture it inline in CONTEXT.md and (sparingly) docs/adr/. The point is not to write code; it is to harden the plan before code gets written.

Position in the workflow:

/squad:brainstorm — explores what to build (outward research)
/squad:grillme — interrogates a chosen plan against MY codebase (this skill)
/squad:implement — writes the change

Inviolable Rules

This skill writes user files. Unlike brainstorm / debug / question / stats (read-only by rule), grillme may mutate CONTEXT.md, CONTEXT-MAP.md, and files under docs/adr/. Every write is preceded by an inline confirmation ("update CONTEXT.md to record this resolution?"). No silent writes. No writes to anything outside those three paths. Never edits source code.
No git commit, no git push. Read-only git is fine for context (git log, git status, git diff). The user owns the commit.
One question at a time. No question lists, no batching. Ask, wait for the answer, integrate it, then ask the next. The Socratic value comes from forcing depth on each branch before moving on.
Recommend your answer. For every question, propose your best current answer so the user can correct rather than invent from scratch. Phrase: "I'd say X — does that match your model?"
No AI attribution in any artifact produced. CONTEXT.md and ADR files must not carry Co-Authored-By: Claude / Anthropic / Generated with [...] lines. Consistent with the squad-wide commit-authorship rule.
ADRs are offered sparingly. Only when all three are true: (1) hard to reverse, (2) surprising without context, (3) the result of a real trade-off with rejected alternatives. If any one is missing, skip the ADR.

Inputs

/squad:grillme [--quick | --normal | --deep] [--no-write] [--domain <name>] <plan text>

Flag	Default	Description
`--quick`	off	Cap at ~3 questions; only flag the most load-bearing inconsistencies. Sub-2-minute session for low-stakes plans.
`--normal`	default	~5–8 questions; full glossary + ADR offer flow. Use when the plan touches a non-trivial slice.
`--deep`	off	~10+ questions; cross-references code paths; walks every branch of the design tree. Use for architectural plans where misunderstanding is expensive.
`--no-write`	off	Dry-run: emit proposed `CONTEXT.md` / ADR patches inline but apply nothing. Useful in PR review or read-only environments.
`--domain <name>`	auto	When the repo has multiple bounded contexts (`CONTEXT-MAP.md` present), pin the interview to one. Auto-detect from the plan text if omitted.

The user's free-form text after the flags is the plan. Capped at 16 KB; truncate and surface note: plan truncated at 16 KB if exceeded.

Phase 0 — Setup

Generate a fresh run id following the spec from skills/squad/SKILL.md: Date.now().toString(36) + "-" + 6 chars from [a-z0-9]. Hold it for the in_flight / terminal telemetry pair.

Phase 1 — Detect domain artefacts

Read-only filesystem probe. Do not write anything in this phase.

Check for CONTEXT-MAP.md at the repo root.
- If present → multi-context mode. Parse to enumerate contexts and their CONTEXT.md paths. If --domain was passed, pin to that one; otherwise infer from the plan text (match nouns against context names; if ambiguous, ask one clarifying question before continuing).
- If absent → single-context mode.
Check for root CONTEXT.md.
Check for docs/adr/ directory and enumerate existing ADRs (collect highest number for sequential ID assignment later).

Three possible states:

State	What you'll find	Entry behaviour
Established	`CONTEXT.md` (or map) + at least one ADR present	Read both. Use them as the source of truth to grill against.
Partial	One of the two present	Read what's there; treat the missing artefact as something to create lazily during the session if the need arises.
Greenfield	Neither present	Open with the 3 seed questions below before starting normal grilling. Create files lazily as resolutions emerge.

Greenfield seed questions (only on greenfield state)

Ask these three, one at a time, before grilling the plan itself:

"In one sentence, what problem does this plan solve, in the user's voice?"
"Name the 3 nouns that matter most in this plan. For each, what does it refer to — and is there another word the team has been using for it?"
"Is there any decision in this plan that would be expensive to undo six months from now?"

These three answers seed CONTEXT.md (Q2) and pre-stage potential ADRs (Q3). If the user balks at greenfield setup ("just grill the plan, skip the docs"), respect that and switch to read-only ad-hoc mode for this session — no file creation.

Phase 1.5 — Write `in_flight` telemetry row

Before launching Phase 2's questioning loop, append the in_flight row. Same pattern as skills/brainstorm/SKILL.md Step 1.5 and skills/debug/SKILL.md Phase A:

record_run({
  workspace_root: <cwd>,
  record: {
    schema_version: 2,
    id: <runId>,
    status: "in_flight",
    started_at: <ISO 8601 now>,
    invocation: "grillme",
    mode: <"quick" | "normal" | "deep" from flag, default "normal">,
    mode_source: <"user" if a depth flag was explicit, "auto" otherwise>,
    git_ref: null,
    files_count: <count of detected CONTEXT*.md + ADR files>,
    agents: [],                              // grillme runs no subagents by default
    est_tokens_method: "chars-div-3.5",
    mode_warning: null,
  },
});

Non-blocking try/catch per shared/_Telemetry-Contract.md: I/O errors log silently; SquadError surfaces code + message verbatim. If this write fails, set a flag to skip the Phase 4 finalisation.

Phase 2 — Question loop

The core mechanic. For each iteration:

2.1 Pick the next question

Prioritise in this order:

Terminology conflicts: the plan uses a term that contradicts CONTEXT.md. Example: "Your glossary defines Cancellation as a customer-initiated refund. Your plan uses cancellation for an internal admin reversal. Which is it — are these the same concept, or do you mean a different operation?"
Fuzzy nouns: the plan uses an overloaded word ("account", "user", "session"). Ask the user to disambiguate against the existing glossary. Propose the precise canonical term.
Unspoken decisions: the plan implies a choice that has alternatives. Surface it: "You're storing this in Postgres — was Redis considered for this hot path? Why Postgres?"
Code contradictions: read a small slice of the relevant code (Glob + Read, never Edit) and check whether the plan's claim matches reality. If it doesn't, surface: "Your plan says we already debounce this — I see no debounce in src/foo.ts:42. Which is right?"
Scenario probes: invent a concrete scenario that stresses the plan's boundaries. "Customer places an order, then changes their email mid-shipping. Where does the new email live — on the Order, the Customer, or both?"

2.2 Ask the question

Format: one sentence question, one sentence proposed answer, optional one-line reasoning.

Q: Does `Order.cancel()` produce a refund automatically, or is that a separate step the user triggers?
My read: separate step — refunds are policy-laden, and the glossary says "Cancellation" is the state change, not the money movement.

Wait for the user's answer before continuing. Do not stack multiple questions.

2.3 Integrate the answer

Three possible outcomes per resolution:

Term resolved → propose a CONTEXT.md update inline. Show the exact diff. Ask: "update CONTEXT.md with this entry?" Apply only on confirmation. Use the format from CONTEXT-FORMAT.md (sibling file).
Decision crystallised → check the ADR triad (hard to reverse / surprising / real trade-off). If all three are true, propose an ADR. Show the exact file path (docs/adr/NNNN-short-slug.md, NNNN = highest existing + 1) and content. Ask before writing. Use the format from ADR-FORMAT.md (sibling file).
Plan revision → record a note for the final summary; do not touch any file beyond the glossary/ADR.

If --no-write is set, replace "apply on confirmation" with "emit the patch as a code block for the user to apply themselves." Telemetry still tracks proposed_writes count.

2.4 Stopping condition

End the loop when any of:

The user says "enough", "stop", "done", or equivalent.
The mode budget is hit (--quick: 3 Q; --normal: 8 Q; --deep: 12 Q hard cap).
You've run out of high-value questions — say so explicitly: "I don't have a sharp question left. Want to keep going, or wrap up?"

Phase 3 — Resolution summary

After the loop ends, emit a closing report:

# Grill summary — {short topic}

## Terms resolved
- **Cancellation**: state change initiated by customer; does NOT trigger a refund. (Added to CONTEXT.md)
- **Account**: ambiguous — confirmed to mean **Customer** in the ordering context. (Added to CONTEXT.md "Flagged ambiguities")

## Decisions recorded
- ADR-0007: Refunds are a separate flow, not a side-effect of cancellation. (docs/adr/0007-refund-separate-from-cancel.md)

## Plan revisions you mentioned
- Stripe webhook handler will treat `payment_intent.canceled` as a no-op when the order is already in `Cancelled` state.

## Open questions (still unresolved)
- Does timeout-based auto-cancellation count as "customer-initiated"? (parking lot)

## Next step
- `/squad:implement <refined plan summary>` — when you're ready to build.
- `/squad:grillme --deep <subtopic>` — if a single thread above needs deeper grilling.

If --no-write was set, replace "Added to CONTEXT.md" with "Proposed patch (not applied)" and include the patches inline.

Phase 4 — Finalise telemetry

Use the SAME id from Phase 1.5:

record_run({
  workspace_root: <cwd>,
  record: {
    schema_version: 2,
    id: <same runId>,
    status: "completed",                       // or "aborted" on early stop
    started_at: <same>,
    completed_at: <ISO 8601 now>,
    duration_ms: <completed_at - started_at>,
    invocation: "grillme",
    mode: <same>,
    mode_source: <same>,
    git_ref: null,
    files_count: <CONTEXT/ADR files touched or detected>,
    agents: [],
    verdict: null,                             // grillme runs don't carry a verdict
    weighted_score: null,                      // no rubric
    est_tokens_method: "chars-div-3.5",
    mode_warning: null,
  },
});

Same non-blocking try/catch; on SquadError write the fallback row per shared/_Telemetry-Contract.md.

Edge Cases

User says "grill it" but provides no plan text → ask one clarifying question: "what's the plan? paste it or describe it in 1–2 paragraphs."
CONTEXT.md already exists but means something unrelated (e.g., onboarding doc) → detect by looking for a ## Language section. If absent, do not overwrite. Ask: "your CONTEXT.md looks like an onboarding doc, not a glossary. Want me to use a different file (docs/CONTEXT.md?) or rename the existing one?"
Plan is purely tactical (one-line bug fix, naming change) → most plans don't need a glossary update. Say so: "this is a tactical change; CONTEXT.md isn't load-bearing here. Want a one-question sanity check instead, or skip?"
User confirms a term update but the glossary entry already matches → no-op write; report "no diff, glossary already says this."
ADR triad not met → skip the ADR offer entirely. Do not weaken the criteria.
Multi-context plan touches two domains → ask the user up front which one is in scope. Don't grill across boundaries in one session — that's two sessions.
Repo has CONTEXT-MAP.md but listed paths don't exist → flag the stale map as an open question; offer to update it once the session resolves which context applies.

Boundaries

This skill never edits source code.
This skill never runs state-mutating git commands.
This skill never writes outside CONTEXT.md, CONTEXT-MAP.md, and docs/adr/.
This skill never invents glossary entries — every entry must come from a resolved exchange with the user.
This skill never carries AI attribution into the artefacts it writes.

Considerations

Cost vs depth

Question counts per mode are in the Inputs table. Token ballpark: --quick ~5-10K, --normal ~20-40K, --deep ~60-100K. Same --quick / --normal / --deep vocabulary as the other squad skills.

When to use vs alternatives

Use /squad:grillme when: you have a plan and want to stress-test it against your project's own language and decisions before building.
Use /squad:brainstorm when: you're upstream of a plan — comparing approaches, surveying industry, deciding what to build.
Use /squad:question when: you need to look up how the codebase currently works (no plan validation, no Socratic interview).
Use /squad:implement when: the plan is hardened and you're ready to build.

Origin

The interview mechanic, the CONTEXT.md / ADR file structure, and the ADR triad criteria are adapted from Matt Pocock's grill-with-docs skill (MIT, github.com/mattpocock/skills). The squad-mcp adaptation adds: telemetry integration, the --no-write dry-run mode, the greenfield seed questions, and explicit write-authority gating. See NOTICE for attribution.

grillme

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

grillme

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

Skill: Grillme

Objective

Inviolable Rules

Inputs

Phase 0 — Setup

Phase 1 — Detect domain artefacts

Greenfield seed questions (only on greenfield state)

Phase 1.5 — Write in_flight telemetry row

Phase 2 — Question loop

2.1 Pick the next question

2.2 Ask the question

2.3 Integrate the answer

2.4 Stopping condition

Phase 3 — Resolution summary

Phase 4 — Finalise telemetry

Edge Cases

Boundaries

Considerations

Cost vs depth

When to use vs alternatives

Origin

Similar Skills

Skill: Grillme

Objective

Inviolable Rules

Inputs

Phase 0 — Setup

Phase 1 — Detect domain artefacts

Greenfield seed questions (only on greenfield state)

Phase 1.5 — Write in_flight telemetry row

Phase 2 — Question loop

2.1 Pick the next question

2.2 Ask the question

2.3 Integrate the answer

2.4 Stopping condition

Phase 3 — Resolution summary

Phase 4 — Finalise telemetry

Edge Cases

Boundaries

Considerations

Cost vs depth

When to use vs alternatives

Origin

Similar Skills

Phase 1.5 — Write `in_flight` telemetry row

Phase 1.5 — Write `in_flight` telemetry row