Skill

ideation

Use when generating NEW research direction candidates: brainstorm research directions, diverge / no ideas left / find a new direction / help me think of directions / ideation, or when an existing scout arc needs fresh candidates. Produces candidate cards, or anomaly-probe plans when no private signal exists. NOT for: evaluating an existing candidate (use research-workflow:research Phase 0 Converge or your evaluation flow), systematic literature review (research-workflow:research), writing related work (writing phases of research-workflow:research), novelty/occupancy checking or validating a known idea (your evaluation/scout flow), unless the user explicitly asks to re-diverge.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/research-workflow:ideation

User invocable

Model invocable

Inline context

Default effort

SKILL.md

117 lines · ~3.5k tokens

Stats

Language: Python

Stars: 0

Maintenance: Excellent

Last Commit: Jun 15, 2026

Ideation — Low-Density Research Direction Generation

Why this skill exists (worldview, 3 lines). Predictable ideas are pre-occupied ideas: everyone (and everyone's LLM) samples from roughly the same public distribution, and "hot-topic × gap" reasoning is objective search on a shared landscape — the whole field converges to the same local optima, so collisions are an algorithmic property, not bad luck (empirically: 4,000 LLM-generated ideas dedupe to ~5%; idea-level novelty scores collapse after execution). The core move of this skill is therefore: generate from private signals; when no private signal exists, generate probes that manufacture private observations — do not force out paper directions.

Trigger boundary

Positive: diverge / no ideas left / find a new direction / help me think of directions / brainstorm research directions / re-diverge after a kill.
Negative (route elsewhere): evaluate this candidate → Phase 0 Converge or your evaluation flow · systematic literature survey → research-workflow:research · write related work · "has anyone done X?" (novelty checking ≠ generation) · "validate this idea" — unless the user explicitly asks to re-diverge.

Entry points (descending privacy of signal; each has HARD FIELDS — missing fields = does not count as that entry)

| Entry | Source | Hard fields (all required) |
| --- | --- | --- |
| E1 Anomaly-driven | surprises you personally produced: tool/experiment output that contradicted expectation · deployed systems doing badly at something · incident reports / audit-contest escalations / maintainer pain / standards ambiguities (incident mining) · two communities asserting contradictory things · spec-vs-implementation divergence | `expected_vs_observed` · `artifact_pointer` (command or file) · `why_expert_surprised`; no hindsight dressing: the surprise must predate the idea |
| E2 Private-asset residue | unexplained leftovers of assets only you hold (tools you built, data you collected, salvaged assets of dead directions) | the full chain `asset → unexplained_observation → why_public_corpus_cannot_infer_it` (all three links, else you are laundering a tool into a moat) |
| E3 Assumption flip / contrarian | implicit assumptions in prior work; accepted trade-offs (TRIZ-style contradiction pump: "what would dissolve this trade-off?"); goal-driven ("make X work for the first time", then derive the questions nobody asks) | `consensus_anchor` (who/which paper states the consensus) · `who_disagrees` · `falsifiable_reversal`; otherwise it is a slogan, not a flip |
| E4 Cross-domain structural transfer | a mechanism (invariant, oracle structure, adversary model) imported from a distant field; exogenous random anchors (low-co-occurrence concept pairs from your corpus, or a random far-field paper × your domain problem) | `mechanistic_bridge` (mechanism-level, not surface analogy) · `first_falsifiable_transfer` (the first concrete claim the transfer makes testable) |
| E6 Failure archaeology | abandoned repos · dead PRs · withdrawn standards proposals · failed benchmarks | `why_failed_then` · `why_now` (what changed that disarms the old cause of death) |
| E7 Time window | a fresh change (new spec revision, new protocol, new regulation) not yet digested by the literature | `changed_artifact` + exact date · `why_literature_has_not_absorbed_it` (freshness proof; otherwise it decays into a hot topic) |
| E5 Pure hot-topic combination | "hot A × hot B, nobody did it" | Deprecated by default. Must be CONVERTED into one of the entries above (find the private signal, the contradiction, or the failure-archaeology angle). It may NOT pass by merely surviving the filters below: self-laundered hot combos are the canonical recurring failure mode. |

For each E1 anomaly: write ≥3 candidate explanations, then order probes by economy — test the cheapest-to-eliminate explanation first.

Hard-field artifacts must be verified at card time — E1/E2: open the file / re-run the command; E3/E4/E6/E7: actually read the anchoring source. If a private signal is later corrected or reversed, every candidate built on it re-enters the filters (downgrade and re-derive; do not patch in place).
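The hard-field rule lends itself to a mechanical check. The sketch below is one possible encoding, not part of the skill itself: the field names are copied from the entry table, while `HARD_FIELDS`, `missing_hard_fields`, `counts_as_entry`, the dict-shaped candidate, and the name `changed_artifact_date` (standing in for E7's "exact date") are hypothetical. It only checks that fields are present; verifying the artifacts behind them is still manual.

```python
from typing import Dict, List

# Field names copied from the entry table. "changed_artifact_date" is a
# hypothetical name for E7's "exact date" requirement.
HARD_FIELDS: Dict[str, List[str]] = {
    "E1": ["expected_vs_observed", "artifact_pointer", "why_expert_surprised"],
    "E2": ["asset", "unexplained_observation",
           "why_public_corpus_cannot_infer_it"],
    "E3": ["consensus_anchor", "who_disagrees", "falsifiable_reversal"],
    "E4": ["mechanistic_bridge", "first_falsifiable_transfer"],
    "E6": ["why_failed_then", "why_now"],
    "E7": ["changed_artifact", "changed_artifact_date",
           "why_literature_has_not_absorbed_it"],
}


def missing_hard_fields(entry: str, candidate: Dict[str, str]) -> List[str]:
    """Return the hard fields that are absent or blank for the claimed entry."""
    if entry == "E5":
        # E5 is deprecated by default and has no hard fields of its own.
        raise ValueError("E5 must be converted into another entry first")
    return [f for f in HARD_FIELDS[entry] if not candidate.get(f, "").strip()]


def counts_as_entry(entry: str, candidate: Dict[str, str]) -> bool:
    """A candidate counts as an entry only when no hard field is missing."""
    return not missing_hard_fields(entry, candidate)
```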

E4 Promotion Mini-Gate (card-time only)

This is NOT a generation-time checklist. During overgeneration, E4 candidates may stay rough one-liners — do not run this gate on every raw candidate. Run it ONLY when promoting an E4 raw candidate into a Route-a candidate card (after overgeneration, warm-up discard, dedupe, and the reverse filters). It adds NO new card fields; the Route-a card keeps its existing 8-field schema.

Before promoting an E4 raw candidate to a candidate card, verify:

source_anchor_read — the far-field paper/artifact was actually read (pointer required; unread = does not count as E4).
source_mechanism_core — state the mechanism in source-domain terms in one sentence, before translating into target-domain vocabulary (forces real mechanism extraction, not buzzword relabeling).
transfer_structure — the E4 hard field mechanistic_bridge names a mechanism-level structure (invariant / oracle structure / adversary model / feedback loop / evaluation defect), not a surface "X has not been applied to Y".
private_or_demand_join — the card's private_signal names the private premise / anomaly / asset-residue that public brainstorming lacks; if none exists, emit a Route-b anomaly-probe instead of a card. Add a one-line demand stub (named actor/system + current cost); if it can only name "researchers / reviewers / the community", do not promote it to a card; emit a Route-b anomaly-probe plan whose target_anomaly is demand evidence. This stub is not a keep/kill decision and does not satisfy the downstream demand gate; it only blocks no-demand E4 mechanisms from becoming cards.
falsifiable_transfer — the E4 hard field first_falsifiable_transfer is concrete enough to justify the card's cheapest_killer, not merely an occupancy search.

Red flag: if you find yourself attaching a rank / novelty / fit score to an E4 candidate, stop — that is score smuggling from a scoring-tool worldview, and this skill forbids LLM novelty scores (see Ranking).

Generation discipline

Each round must draw from ≥2 different entry points.
Overgenerate (≥15 raw candidates) → embedding-dedupe → watch the unique-survivor curve: when it plateaus, the current entry/prompt distribution is exhausted — switch entry points, do not keep sampling.
Discard the first 3 candidates as warm-up (serial-order effect: later candidates are measurably more original).
Vary personas/models across rounds where available; identical convergence across runs is itself a signal (see F2).
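The overgenerate, discard, dedupe, and plateau steps can be sketched as follows. This is a minimal illustration under stated assumptions: token Jaccard overlap stands in for real embedding similarity, and `SIM_THRESHOLD`, `PLATEAU_WINDOW`, and all function names are hypothetical choices rather than values the skill prescribes. Only the warm-up count of 3 and the minimum of 15 raw candidates come from the text.

```python
from typing import List, Tuple

WARMUP_DISCARD = 3    # from the text: drop the first 3 candidates
MIN_RAW = 15          # from the text: overgenerate at least 15
SIM_THRESHOLD = 0.6   # assumption: duplicate cutoff, tune for your embedder
PLATEAU_WINDOW = 5    # assumption: 5 candidates with no new survivor


def similarity(a: str, b: str) -> float:
    """Token Jaccard overlap; a stand-in for embedding cosine similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def survivor_curve(raw: List[str]) -> Tuple[List[str], List[int]]:
    """Discard warm-up, dedupe greedily, and track the unique-survivor count."""
    if len(raw) < MIN_RAW:
        raise ValueError(f"need at least {MIN_RAW} raw candidates")
    survivors: List[str] = []
    curve: List[int] = []
    for cand in raw[WARMUP_DISCARD:]:
        if all(similarity(cand, s) < SIM_THRESHOLD for s in survivors):
            survivors.append(cand)
        curve.append(len(survivors))
    return survivors, curve


def plateaued(curve: List[int], window: int = PLATEAU_WINDOW) -> bool:
    """True when the last `window` candidates added no new survivor."""
    if len(curve) <= window:
        return False
    return curve[-1] == curve[-1 - window]
```

When `plateaued` returns true, the text's advice applies: switch entry points rather than drawing more samples from the same prompt distribution.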

Reverse filters (run immediately after generation, BEFORE any occupancy search; artifact-mandatory — self-report does not count as passing)

| Filter | Question | Required artifact |
| --- | --- | --- |
| F1 Tarpit | Could anyone produce this by just sitting down to brainstorm? Is it attractive to many peers? Who tried it before and was the cause of death structural or timing? Can you answer "why now"? | nearest-neighbor sample (the closest existing attempts) |
| F2 Repeated-convergence alarm | Does independent resampling keep converging on this idea? Convergence ⇒ it lies in the population-reachable zone ⇒ assume occupied. | resampling record with independence noted (different prompt, model/persona; otherwise you only proved prompt anchoring) |
| F3 Naked-model probe | Can a bare LLM, given only the public background, derive the core conclusion? If yes ⇒ incremental. | the naked-model transcript |
| F4 Merton premise check | Are ALL premises public and widely diffused? Then treat the idea as already-a-multiple. A singleton requires ≥1 private premise. | pointer to the private premise (which one, where it lives) |
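The artifact-mandatory rule can be expressed as a small check: a filter counts as passed only when a verdict and an artifact pointer are both recorded. The sketch below assumes a hypothetical `FilterResult` record and `unverified_filters` helper; the filter names and artifact descriptions are taken from the table. It cannot judge the artifacts themselves, only refuse a bare self-report.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Required artifact per filter, as listed in the table above.
REQUIRED_ARTIFACT: Dict[str, str] = {
    "F1": "nearest-neighbor sample",
    "F2": "resampling record with independence noted",
    "F3": "naked-model transcript",
    "F4": "pointer to the private premise",
}


@dataclass
class FilterResult:
    """Hypothetical record of one filter run."""
    verdict: str                    # "pass" or "fail", judged by whoever ran it
    artifact: Optional[str] = None  # path or pointer to the required artifact


def unverified_filters(results: Dict[str, FilterResult]) -> List[str]:
    """List filters that cannot count as passed.

    A filter is unverified if it was never run, did not pass, or passed
    without an attached artifact (self-report does not count).
    """
    unverified = []
    for name in REQUIRED_ARTIFACT:
        result = results.get(name)
        has_artifact = bool(result and (result.artifact or "").strip())
        if result is None or result.verdict != "pass" or not has_artifact:
            unverified.append(name)
    return unverified
```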

Safety recipe (for survivors)

Conventional core + single atypical insertion — do not go fully weird; ground the method/evaluation in solid convention and inject exactly one rare combination (empirically ~2× hit rate).
Attack check (Hamming): contrarian without a reasonable attack is a dead idea — only file candidates you have an entry wedge into.
Blank-area six-way diagnosis — why is this low-density area empty? no value → drop · incentives scared people off → conditionally edible (must answer why-now: what changed in the incentive structure, or why it does not apply to you) · missing tool / missing data / new standard / social taboo → edible only with an attack.
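The six-way diagnosis reads as a small decision table. The following sketch is one way to write it down; the cause labels and the `drop` / `hold` / `file` verdicts are hypothetical names, and the function covers only the blank-area diagnosis, not the separate attack check.

```python
# Causes that are "edible only with an attack" according to the text.
ATTACK_GATED = {"missing_tool", "missing_data", "new_standard", "social_taboo"}


def blank_area_verdict(cause: str, has_attack: bool = False,
                       why_now: str = "") -> str:
    """Return 'drop', 'hold', or 'file' for an empty low-density area."""
    if cause == "no_value":
        return "drop"
    if cause == "incentives":
        # Conditionally edible: only with an answered why-now.
        return "file" if why_now.strip() else "hold"
    if cause in ATTACK_GATED:
        return "file" if has_attack else "hold"
    raise ValueError(f"unknown cause: {cause!r}")
```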

Exit (two routes)

Route a — candidate exists. Emit a candidate card (all 8 fields required):

generation_path:      E1-E7 (which entry; E5 only if converted — name the conversion)
private_signal:       the observation/asset/premise others lack
nearest_canon:        closest existing work you already know of
density_flag:         your honest read: how reachable is this for the population?
cheapest_killer:      the cheapest experiment/search that could kill it (same field name as your claim contract)
decisive_artifact:    the table/figure/witness that would carry the paper
contribution_hypothesis: method / finding / tool / benchmark / measurement — one
why_not_predictable:  one sentence, MUST be bound to evidence (one of: naked-model transcript / nearest_canon distance / private_signal)

→ hand to Phase 0 Converge (gap-disproof, disposition filters, F5 card) or your evaluation flow. Candidate cards are inputs to evaluation, not conclusions.
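A candidate card can be validated before hand-off. The sketch below keeps the eight field names from the schema above; everything else is an assumption made for illustration: `why_not_predictable` is modeled as a `(sentence, evidence_kind)` pair so the evidence binding is explicit, a converted E5 is written as `"E5->E6"`, and `card_problems` is a hypothetical helper.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple

ENTRIES = {"E1", "E2", "E3", "E4", "E6", "E7"}
CONTRIBUTIONS = {"method", "finding", "tool", "benchmark", "measurement"}
# Evidence kinds named in the schema; the identifiers are hypothetical.
EVIDENCE_KINDS = {"naked_model_transcript", "nearest_canon_distance",
                  "private_signal"}


@dataclass
class CandidateCard:
    generation_path: str          # "E1".."E7", or "E5->E6" for a converted E5
    private_signal: str
    nearest_canon: str
    density_flag: str
    cheapest_killer: str
    decisive_artifact: str
    contribution_hypothesis: str
    why_not_predictable: Tuple[str, str]  # (one sentence, evidence kind)


def card_problems(card: CandidateCard) -> List[str]:
    """Return human-readable problems; an empty list means the card is valid."""
    problems: List[str] = []
    sentence, evidence = card.why_not_predictable
    for f in fields(card):
        value = sentence if f.name == "why_not_predictable" else getattr(card, f.name)
        if not value.strip():
            problems.append(f"{f.name} is empty")
    source, _, target = card.generation_path.partition("->")
    if source == "E5":
        if target not in ENTRIES:
            problems.append("E5 must name its conversion, e.g. 'E5->E6'")
    elif source not in ENTRIES or target:
        problems.append("generation_path must be one of E1-E4, E6, E7")
    if card.contribution_hypothesis not in CONTRIBUTIONS:
        problems.append("contribution_hypothesis must be exactly one known type")
    if evidence not in EVIDENCE_KINDS:
        problems.append("why_not_predictable is not bound to evidence")
    return problems
```

Passing this check says nothing about whether the idea is good. It only confirms the card is complete enough to be an input to evaluation.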

Route b — no private signal found. Do NOT force paper directions. Emit an anomaly-probe plan (≤3 probes, each 4 fields):

target_anomaly:          what kind of surprise this probe could surface
command_or_data_source:  the concrete command / dataset / system to poke
budget_cap + expected_signal: cost ceiling and what a positive signal looks like
stop_condition:          when to stop (no stop = procrastination exit)

Anomaly-probes (manufacture private observations) are distinct from killer-probes (kill an existing candidate — that is evaluation's G2). Authorization default: $0-first; ask before docker/spend.

Ranking: order by probe priority — cheapest-fastest signal first — never "best idea" (pairwise LLM judging rewards polished familiar ideas); never let an LLM emit a novelty score.
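The Route b plan and the ranking rule can be sketched together. This is an illustration only: the field names follow the probe schema above, with `budget_cap + expected_signal` split into two attributes, while `budget_cap_usd`, `est_hours` (used purely for ordering), and `build_plan` are hypothetical. Note that the sort key contains cost and speed and nothing resembling a novelty score.

```python
from dataclasses import dataclass
from typing import List

MAX_PROBES = 3  # from the text: at most 3 probes per plan


@dataclass
class AnomalyProbe:
    target_anomaly: str
    command_or_data_source: str
    budget_cap_usd: float   # hypothetical unit for the budget cap
    expected_signal: str
    stop_condition: str
    est_hours: float        # assumption: rough time to first signal


def build_plan(probes: List[AnomalyProbe]) -> List[AnomalyProbe]:
    """Validate the probes and order them cheapest-fastest first."""
    if len(probes) > MAX_PROBES:
        raise ValueError(f"a plan holds at most {MAX_PROBES} probes")
    for probe in probes:
        if not probe.stop_condition.strip():
            raise ValueError("a probe without stop_condition is a red flag")
    # $0 probes sort first; ties are broken by the faster signal.
    return sorted(probes, key=lambda p: (p.budget_cap_usd, p.est_hours))
```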

Slow-selection hygiene

Do not commit in the same session just to have an answer. Apply the island question ("if I were the only person on earth, which would I work on?") and the revisit test (ideas that keep coming back for months beat ideas from last week).

Red flags (stop on sight)

"hot A × hot B, nobody did it" → tarpit suspicion
"searched, found nothing = opportunity" → run the six-way blank diagnosis first
"the novelty score is high" → paper-stage scores systematically inflate; meaningless before a pilot
all candidates from one entry → collapsed, switch entries
a probe without stop_condition
"I feel others wouldn't think of this" without bound evidence
"write related work / check occupancy / validate this idea" → not ideation, route to evaluation

Worked example (anonymized, E1)

While replaying production traces through a self-built conformance oracle for a protocol with two independent implementations, both implementations accepted a frame sequence the prose spec says to reject (expected_vs_observed: spec rule R says keep-NEW on duplicate-first-frame; both implementations keep-OLD). artifact_pointer: oracle run + the two parser source files. why_expert_surprised: the implementations are independently maintained, yet share the same deviation — a common-mode misreading, invisible to differential testing. → candidate card: private_signal = the replay corpus + deviation; cheapest_killer = check whether the spec's ambiguity (not the implementations) explains it; why_not_predictable = naked-model probe could not name this rule as divergent (transcript attached). The candidate came from running a tool, not from brainstorming — that is the point of E1.

Integration

Keep an ideation ledger — a persistent file accumulating E1-grade anomalies as they happen (fields: date / expected-vs-observed / artifact_pointer / why-surprising / source project / status raw·used·promoted·stale / reuse_pointer / next_probe). Location is yours; check it at the start of every ideation session. Ideas grow from the ledger, not from the news feed.
If you maintain a deaths/graveyard ledger for killed directions, read it before generating (same-shape deaths first; salvaged assets of dead directions are E2 seeds).
If you maintain a tagged corpus/knowledge base, use it for E4 random anchors (low-co-occurrence cells) and F1 nearest-neighbor samples.
Exit handshake: once a candidate card exists, switch to your evaluation flow and open its ledger BEFORE any evaluation effort (occupancy deep-reads, gates, probes) — evaluating inside the ideation transcript loses the audit trail. The card's cheapest_killer is input to evaluation, not the evaluation process.
Write outcomes back: when a candidate card is killed/parked downstream, update its ledger row status (promoted→killed/parked) in the same session.
Downstream: candidate cards → research-workflow:research Phase 0 Converge (disposition filters, gap-disproof, deep novelty check) — this skill generates; Phase 0 and your kill-gates evaluate.
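An ideation ledger with the fields listed above could be kept as JSON lines. The sketch below is a minimal in-memory version under stated assumptions: the snake_case field names, the helper names, and the JSON-lines format are hypothetical, and the rule that `killed` / `parked` may only follow `promoted` is one reading of the write-back step, not something the skill states as a constraint.

```python
import json
from datetime import date
from typing import Dict, List, Optional

LEDGER_FIELDS = ["date", "expected_vs_observed", "artifact_pointer",
                 "why_surprising", "source_project", "status",
                 "reuse_pointer", "next_probe"]
STATUSES = {"raw", "used", "promoted", "stale", "killed", "parked"}


def new_row(expected_vs_observed: str, artifact_pointer: str,
            why_surprising: str, source_project: str,
            next_probe: str = "", today: Optional[date] = None) -> Dict[str, str]:
    """Create a ledger row for a fresh anomaly; new rows start as 'raw'."""
    values = {
        "date": (today or date.today()).isoformat(),
        "expected_vs_observed": expected_vs_observed,
        "artifact_pointer": artifact_pointer,
        "why_surprising": why_surprising,
        "source_project": source_project,
        "status": "raw",
        "reuse_pointer": "",
        "next_probe": next_probe,
    }
    return {name: values[name] for name in LEDGER_FIELDS}


def set_status(row: Dict[str, str], status: str) -> None:
    """Update a row's status in place."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    # Assumption: downstream outcomes are only written onto promoted rows.
    if status in {"killed", "parked"} and row["status"] != "promoted":
        raise ValueError("only promoted rows can be killed or parked")
    row["status"] = status


def to_jsonl(rows: List[Dict[str, str]]) -> str:
    """Serialize rows as JSON lines, one anomaly per line."""
    return "\n".join(json.dumps(row, ensure_ascii=False) for row in rows)
```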
