From research-workflow
Use when generating NEW research direction candidates: brainstorm research directions, diverge (发散) / no ideas (没 idea) / find a new direction (找新方向) / help me think of directions (帮我想方向) / ideation, or when an existing scout arc needs fresh candidates. Produces candidate cards, or anomaly-probe plans when no private signal exists. NOT for: evaluating an existing candidate (use research-workflow:research Phase 0 Converge or your evaluation flow), systematic literature review (research-workflow:research), writing related work (writing phases of research-workflow:research), novelty/occupancy checking or validating a known idea (your evaluation/scout flow), unless the user explicitly asks to re-diverge.
Slash command
/research-workflow:ideation
Why this skill exists (worldview, 3 lines). Predictable ideas are pre-occupied ideas: everyone (and everyone's LLM) samples from roughly the same public distribution, and "hot-topic × gap" reasoning is objective search on a shared landscape — the whole field converges to the same local optima, so collisions are an algorithmic property, not bad luck (empirically: 4,000 LLM-generated ideas dedupe to ~5%; idea-level novelty scores collapse after execution). The core move of this skill is therefore: generate from private signals; when no private signal exists, generate probes that manufacture private observations — do not force out paper directions.
research-workflow:research · write related work · "has anyone done X?" (novelty checking ≠ generation) · "validate this idea", unless the user explicitly asks to re-diverge.

| Entry | Source | Hard fields (all required) |
|---|---|---|
| E1 Anomaly-driven | surprises you personally produced: tool/experiment output that contradicted expectation · deployed systems doing badly at something · incident reports / audit-contest escalations / maintainer pain / standards ambiguities (incident mining) · two communities asserting contradictory things · spec-vs-implementation divergence | expected_vs_observed · artifact_pointer (command or file) · why_expert_surprised — no hindsight dressing: the surprise must predate the idea |
| E2 Private-asset residue | unexplained leftovers of assets only you hold (tools you built, data you collected, salvaged assets of dead directions) | the full chain asset → unexplained_observation → why_public_corpus_cannot_infer_it — all three links, else you are laundering a tool into a moat |
| E3 Assumption flip / contrarian | implicit assumptions in prior work; accepted trade-offs (TRIZ-style contradiction pump: "what would dissolve this trade-off?"); goal-driven ("make X work for the first time", then derive the questions nobody asks) | consensus_anchor (who/which paper states the consensus) · who_disagrees · falsifiable_reversal — else it is a slogan, not a flip |
| E4 Cross-domain structural transfer | a mechanism (invariant, oracle structure, adversary model) imported from a distant field; exogenous random anchors (low-co-occurrence concept pairs from your corpus, or a random far-field paper × your domain problem) | mechanistic_bridge (mechanism-level, not surface analogy) · first_falsifiable_transfer (the first concrete claim the transfer makes testable) |
| E6 Failure archaeology | abandoned repos · dead PRs · withdrawn standards proposals · failed benchmarks | why_failed_then · why_now (what changed that disarms the old cause of death) |
| E7 Time window | a fresh change (new spec revision, new protocol, new regulation) not yet digested by the literature | changed_artifact + exact date · why_literature_has_not_absorbed_it (freshness proof — else it decays into a hot topic) |
| E5 Pure hot-topic combination | "hot A × hot B, nobody did it" | deprecated by default. Must be CONVERTED into one of the entries above (find the private signal, the contradiction, or the failure-archaeology angle). It may NOT pass by merely surviving the filters below — self-laundered hot combos are the canonical recurring failure mode. |
For each E1 anomaly: write ≥3 candidate explanations, then order probes by economy — test the cheapest-to-eliminate explanation first.
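
The E1 rule above (at least three candidate explanations, cheapest-to-eliminate first) can be sketched as a small sort. This is a minimal illustration, not part of the skill: the `Explanation` class, the `elimination_cost` unit (hours), and the sample claims are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Explanation:
    """One candidate explanation for an E1 anomaly (hypothetical structure)."""
    claim: str
    elimination_cost: float  # assumed unit: hours needed to rule this explanation out


def order_probes(explanations: list[Explanation]) -> list[Explanation]:
    """Return explanations so that the cheapest-to-eliminate one is probed first."""
    if len(explanations) < 3:
        raise ValueError("an E1 anomaly needs at least 3 candidate explanations")
    return sorted(explanations, key=lambda e: e.elimination_cost)


# Hypothetical explanations and costs, for illustration only.
explanations = [
    Explanation("the spec wording is ambiguous", 4.0),
    Explanation("the replay harness is buggy", 0.5),
    Explanation("both implementations share a misreading", 8.0),
]
ordered = order_probes(explanations)
```

Sorting on elimination cost alone keeps the ordering free of any "most likely" or "most interesting" judgment, which matches the economy-first wording of the rule.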
Hard-field artifacts must be verified at card time — E1/E2: open the file / re-run the command; E3/E4/E6/E7: actually read the anchoring source. If a private signal is later corrected or reversed, every candidate built on it re-enters the filters (downgrade and re-derive; do not patch in place).
This is NOT a generation-time checklist. During overgeneration, E4 candidates may stay rough one-liners — do not run this gate on every raw candidate. Run it ONLY when promoting an E4 raw candidate into a Route-a candidate card (after overgeneration, warm-up discard, dedupe, and the reverse filters). It adds NO new card fields; the Route-a card keeps its existing 8-field schema.
Before promoting an E4 raw candidate to a candidate card, verify:
- source_anchor_read: the far-field paper/artifact was actually read (pointer required; unread = does not count as E4).
- source_mechanism_core: state the mechanism in source-domain terms in one sentence, before translating into target-domain vocabulary (forces real mechanism extraction, not buzzword relabeling).
- transfer_structure: the E4 hard field mechanistic_bridge names a mechanism-level structure (invariant / oracle structure / adversary model / feedback loop / evaluation defect), not a surface "X has not been applied to Y".
- private_or_demand_join: the card's private_signal names the private premise / anomaly / asset-residue that public brainstorming lacks; if none exists, emit a Route-b anomaly-probe instead of a card. Add a one-line demand stub (named actor/system + current cost); if it can only name "researchers / reviewers / the community", do not promote it to a card; emit a Route-b anomaly-probe plan whose target_anomaly is demand evidence. This stub is not a keep/kill decision and does not satisfy the downstream demand gate; it only blocks no-demand E4 mechanisms from becoming cards.
- falsifiable_transfer: the E4 hard field first_falsifiable_transfer is concrete enough to justify the card's cheapest_killer, not merely an occupancy search.

Red flag: if you find yourself attaching a rank / novelty / fit score to an E4 candidate, stop: that is score smuggling from a scoring-tool worldview, and this skill forbids LLM novelty scores (see Ranking).
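
The promotion gate above can be read as a routing function with three outcomes. The sketch below is an assumption-laden illustration: `E4RawCandidate`, its attribute names, the outcome strings, and the sample values are hypothetical, and it is a gate input rather than a card (the card keeps its 8 fields). It only checks that each item is present; whether a bridge is truly mechanism-level still needs a human read. It deliberately returns a route, never a score.

```python
from dataclasses import dataclass, replace

# Actors the gate text treats as too generic to count as a demand stub.
GENERIC_ACTORS = {"researchers", "reviewers", "the community"}


@dataclass(frozen=True)
class E4RawCandidate:
    """Gate input for one raw E4 candidate (hypothetical structure, not a card)."""
    source_anchor_pointer: str       # where the far-field source was read; empty = unread
    source_mechanism_core: str       # one sentence, in source-domain terms
    mechanistic_bridge: str          # E4 hard field
    private_signal: str              # private premise / anomaly / asset residue
    demand_actor: str                # named actor or system for the demand stub
    demand_current_cost: str         # what that actor currently pays
    first_falsifiable_transfer: str  # E4 hard field


def e4_promotion_gate(c: E4RawCandidate) -> str:
    """Route a raw E4 candidate: 'not_e4', 'stay_raw', 'route_b_probe', or 'promote_to_card'."""
    if not c.source_anchor_pointer.strip():
        return "not_e4"  # unread source does not count as E4
    mechanism_fields = (c.source_mechanism_core, c.mechanistic_bridge,
                        c.first_falsifiable_transfer)
    if not all(f.strip() for f in mechanism_fields):
        return "stay_raw"
    actor = c.demand_actor.strip().lower()
    has_demand = bool(actor) and actor not in GENERIC_ACTORS and bool(c.demand_current_cost.strip())
    if not c.private_signal.strip() or not has_demand:
        return "route_b_probe"  # manufacture the missing observation or demand evidence
    return "promote_to_card"


# Hypothetical values, for illustration only.
raw = E4RawCandidate(
    source_anchor_pointer="notes/far_field_source.md",
    source_mechanism_core="a feedback loop damps oscillation in the source system",
    mechanistic_bridge="feedback loop",
    private_signal="replay corpus showing the oscillation",
    demand_actor="protocol maintainers",
    demand_current_cost="manual triage of each incident",
    first_falsifiable_transfer="damping term removes the oscillation on the replay corpus",
)
decision = e4_promotion_gate(raw)
generic_decision = e4_promotion_gate(replace(raw, demand_actor="researchers"))
```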
| Filter | Question | Required artifact |
|---|---|---|
| F1 Tarpit | Could anyone produce this by just sitting down to brainstorm? Is it attractive to many peers? Who tried it before and was the cause of death structural or timing? Can you answer "why now"? | nearest-neighbor sample (the closest existing attempts) |
| F2 Repeated-convergence alarm | Does independent resampling keep converging on this idea? Convergence ⇒ it lies in the population-reachable zone ⇒ assume occupied. | resampling record with independence noted (different prompt, model/persona — else you only proved prompt anchoring) |
| F3 Naked-model probe | Can a bare LLM, given only the public background, derive the core conclusion? If yes ⇒ incremental. | the naked-model transcript |
| F4 Merton premise check | Are ALL premises public and widely diffused? Then treat the idea as already-a-multiple. A singleton requires ≥1 private premise. | pointer to the private premise (which one, where it lives) |
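
As one illustration of the filters above, the F4 Merton premise check reduces to a rule over premises: no private premise means the idea is treated as already-a-multiple, and a private premise only counts when it comes with a pointer. The `Premise` class, the verdict strings, and the sample premises are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Premise:
    """One premise behind a candidate idea (hypothetical structure)."""
    statement: str
    private: bool
    pointer: str = ""  # where the private premise lives (file, dataset, run)


def merton_check(premises: list[Premise]) -> str:
    """Return 'already-a-multiple', 'missing-pointer', or 'possible-singleton'."""
    private = [p for p in premises if p.private]
    if not private:
        return "already-a-multiple"  # all premises public and widely diffused
    if any(not p.pointer.strip() for p in private):
        return "missing-pointer"     # F4 requires a pointer as its artifact
    return "possible-singleton"      # necessary condition met; not proof of novelty


# Hypothetical premises, for illustration only.
public_only = [
    Premise("topic A is popular", private=False),
    Premise("topic B is popular", private=False),
]
with_private = public_only + [
    Premise("both parsers keep the old frame", private=True, pointer="oracle_run.log"),
]
```

The third verdict is worded "possible" on purpose: one private premise is the stated minimum for a singleton, and the candidate still has to pass the other filters.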
no value → drop · incentives scared people off → conditionally edible (must answer why-now: what changed in the incentive structure, or why it does not apply to you) · missing tool / missing data / new standard / social taboo → edible only with an attack.
Route a: candidate exists. Emit a candidate card (all 8 fields required):
generation_path: E1-E7 (which entry; E5 only if converted — name the conversion)
private_signal: the observation/asset/premise others lack
nearest_canon: closest existing work you already know of
density_flag: your honest read: how reachable is this for the population?
cheapest_killer: the cheapest experiment/search that could kill it (same field name as your claim contract)
decisive_artifact: the table/figure/witness that would carry the paper
contribution_hypothesis: method / finding / tool / benchmark / measurement — one
why_not_predictable: one sentence, MUST be bound to evidence (one of: naked-model transcript / nearest_canon distance / private_signal)
→ hand to Phase 0 Converge (gap-disproof, disposition filters, F5 card) or your evaluation flow. Candidate cards are inputs to evaluation, not conclusions.
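
A completeness check for the 8-field card might look like the sketch below. The field names and the five contribution kinds come from the card above; the function name, the assumption that `generation_path` starts with the entry id, and the convention that a converted E5 is written like `E5 -> E1` are hypothetical. It cannot judge whether `why_not_predictable` is truly bound to evidence; that still needs a human read.

```python
REQUIRED_FIELDS = (
    "generation_path", "private_signal", "nearest_canon", "density_flag",
    "cheapest_killer", "decisive_artifact", "contribution_hypothesis",
    "why_not_predictable",
)
ENTRIES = {"E1", "E2", "E3", "E4", "E5", "E6", "E7"}
CONTRIBUTION_KINDS = {"method", "finding", "tool", "benchmark", "measurement"}


def card_problems(card: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the card is structurally complete."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if not card.get(name, "").strip()]
    tokens = card.get("generation_path", "").split()
    if tokens and tokens[0] not in ENTRIES:
        problems.append(f"unknown entry: {tokens[0]}")
    # Assumed convention: a converted E5 names its target, e.g. "E5 -> E1".
    if tokens and tokens[0] == "E5" and "->" not in tokens:
        problems.append("E5 must name its conversion")
    kind = card.get("contribution_hypothesis", "").strip()
    if kind and kind not in CONTRIBUTION_KINDS:
        problems.append("contribution_hypothesis must be exactly one known kind")
    return problems


# Placeholder values, for illustration only.
card = {name: "placeholder" for name in REQUIRED_FIELDS}
card["generation_path"] = "E1"
card["contribution_hypothesis"] = "finding"
problems = card_problems(card)
```

An empty result only means the card is well-formed; per the hand-off rule above, it is still an input to evaluation, not a conclusion.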
Route b — no private signal found. Do NOT force paper directions. Emit an anomaly-probe plan (≤3 probes, each 4 fields):
target_anomaly: what kind of surprise this probe could surface
command_or_data_source: the concrete command / dataset / system to poke
budget_cap + expected_signal: cost ceiling and what a positive signal looks like
stop_condition: when to stop (no stop = procrastination exit)
Anomaly-probes (manufacture private observations) are distinct from killer-probes (kill an existing candidate — that is evaluation's G2). Authorization default: $0-first; ask before docker/spend.
Ranking: order by probe priority — cheapest-fastest signal first — never "best idea" (pairwise LLM judging rewards polished familiar ideas); never let an LLM emit a novelty score.
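
The Route b plan and the ranking rule can be combined in a short sketch: at most three probes, each with a stop condition, ordered by cost so that $0 probes run first. The `AnomalyProbe` class, the dollar unit, and the sample probes are assumptions; "fastest" is not modeled because the plan has no duration field, and docker use is not tracked here, so only spend triggers the ask-first flag.

```python
from dataclasses import dataclass

MAX_PROBES = 3  # the plan allows at most three probes


@dataclass(frozen=True)
class AnomalyProbe:
    """One probe in a Route b plan (hypothetical structure)."""
    target_anomaly: str
    command_or_data_source: str
    budget_cap_usd: float  # together with expected_signal, the plan's third field
    expected_signal: str
    stop_condition: str


def build_plan(probes: list[AnomalyProbe]) -> list[AnomalyProbe]:
    """Validate the probes and return them ordered cheapest first."""
    if len(probes) > MAX_PROBES:
        raise ValueError(f"a plan holds at most {MAX_PROBES} probes")
    for probe in probes:
        if not probe.stop_condition.strip():
            raise ValueError(f"probe without stop_condition: {probe.target_anomaly}")
    return sorted(probes, key=lambda p: p.budget_cap_usd)


def needs_authorization(probe: AnomalyProbe) -> bool:
    """$0-first default: any spend requires asking before running."""
    return probe.budget_cap_usd > 0


# Hypothetical probes, for illustration only.
plan = build_plan([
    AnomalyProbe("rate-limit surprise", "hosted API replay", 20.0,
                 "responses diverge from docs", "stop after 500 requests"),
    AnomalyProbe("spec divergence", "local oracle replay", 0.0,
                 "both parsers accept a rejected frame", "stop after one full corpus pass"),
])
```

Ordering on cost rather than on perceived idea quality keeps the plan consistent with the rule that an LLM never emits a novelty score.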
Do not commit in the same session just to have an answer. Apply the island question ("if I were the only person on earth, which would I work on?") and the revisit test (ideas that keep coming back for months beat ideas from last week).
"hot A × hot B, nobody did it" → tarpit suspicion ❙ "searched, found nothing = opportunity" → run the six-way blank diagnosis first ❙ "the novelty score is high" → paper-stage scores systematically inflate; meaningless before a pilot ❙ all candidates from one entry → collapsed, switch entries ❙ a probe without stop_condition ❙ "I feel others wouldn't think of this" without bound evidence ❙ "write related work / check occupancy / validate this idea" → not ideation, route to evaluation
While replaying production traces through a self-built conformance oracle for a protocol with two independent implementations, both implementations accepted a frame sequence the prose spec says to reject (expected_vs_observed: spec rule R says keep-NEW on duplicate-first-frame; both implementations keep-OLD). artifact_pointer: oracle run + the two parser source files. why_expert_surprised: the implementations are independently maintained, yet share the same deviation: a common-mode misreading, invisible to differential testing. → candidate card: private_signal = the replay corpus + deviation; cheapest_killer = check whether the spec's ambiguity (not the implementations) explains it; why_not_predictable = naked-model probe could not name this rule as divergent (transcript attached). The candidate came from running a tool, not from brainstorming; that is the point of E1.
cheapest_killer is input to evaluation, not the evaluation process.
research-workflow:research Phase 0 Converge (disposition filters, gap-disproof, deep novelty check): this skill generates; Phase 0 and your kill-gates evaluate.
npx claudepluginhub dianxiang-sun/research-workflow --plugin research-workflow