From the-pipeline
Pressure-test a product idea or PR/FAQ with real research and sharpen it into a crisp problem statement and a defensible, falsifiable bet. Takes the future press release + FAQ produced by the product-idea skill (or any idea) and runs autonomous deep research to produce three artifacts — a research synthesis, an opportunity map, and assumption tests — then applies an exit gate: the problem and the bet must be sharp. Use this skill whenever the user wants to validate, de-risk, or pressure-test an idea or PR/FAQ; run product discovery; figure out whether a real problem and a winnable bet underpin a concept; build an opportunity map or opportunity-solution tree; identify and test the riskiest assumptions; or decide whether an idea is worth pursuing. Trigger on phrases like "is this idea real," "validate this," "pressure-test the PR/FAQ," "do discovery on this," "what are the riskiest assumptions," "map the opportunity," "should we build this," or when handed a PR/FAQ to scrutinize.
Slash command: /the-pipeline:discovery
Writing the PR/FAQ (the `product-idea` skill) is the easy part — it's a vision, and a
vision can be wrong. Discovery is where the vision meets reality. It takes the aspirational
press release and its internal FAQ — full of confident claims about the customer, the
problem, the TAM, the unit economics, and the risks — and asks the uncomfortable question:
is any of this actually true, and is it the sharpest version of the bet we could make?
Discovery's job is not to produce more documents. Its job is to produce a decision backed by evidence: proceed, refine, pivot, or kill. The output is only "done" when two things are sharp: the problem statement and the bet.
The discipline that makes this work: separate evidence from inference, prioritize the problem before the solution, and test the riskiest assumption first with the cheapest experiment. A bet without a falsification condition is a hope, not a bet.
Read references/discovery-methods.md for the full frameworks, formulas, and templates
before your first run, and whenever you need the exact opportunity-score math, the
assumptions-map quadrants, or the experiment-card format.
product-idea (PR/FAQ) → discovery (this skill) → a sharp problem + bet that either
sends you back to revise the PR/FAQ or forward to build. Discovery normally runs on a
PR/FAQ, but it works on any raw idea — if there's no PR/FAQ, treat the user's description
as the vision and proceed.
If the input is a PR/FAQ, read it in full (a file path, a pasted doc, or the most recent
product-idea output). Extract every load-bearing claim into a short claims ledger —
this is what discovery will test:
If the input is a bare idea, build the same ledger from the user's description and your own framing, marking everything as unverified.
State the ledger briefly so the user can see what you're about to scrutinize.
Run autonomous deep research to confirm, challenge, or refine the claims ledger. See the Research tooling section below for what to invoke. Scope the research to the highest-stakes claims first (problem reality, market size, competitive alternatives, "why now") rather than boiling the ocean.
Structure the synthesis (full section list in the reference):
The single most important discipline here: cite every factual claim, and never let an inference masquerade as a fact. Triangulate — when multiple independent sources point the same way, confidence rises. Where the PR/FAQ's invented numbers turn out wrong, say so plainly; that's discovery working.
Build an opportunity-solution tree, then score it. Purpose: confirm the PR/FAQ is chasing the sharpest opportunity, or surface a better-adjacent one.
Importance + max(Importance − Satisfaction, 0) (each rated 1–10; ≥15 highly attractive
/ under-served, 10–15 attractive, <10 over-served). Deliberately exclude effort at
this stage: any opportunity has both easy and hard solutions. Also weigh sizing,
market/competitive position, and strategic fit.
Output the tree (as a nested list) plus a scored opportunity table and a one-line verdict: is the PR/FAQ's bet on the best-scoring opportunity, or should it pivot?
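The opportunity-score arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the reference's own implementation: the function names `opportunity_score` and `attractiveness` are hypothetical, and the band boundaries follow the thresholds quoted above, with a score of exactly 15 treated as highly attractive.

```python
def opportunity_score(importance: float, satisfaction: float) -> float:
    """Importance + max(Importance - Satisfaction, 0), each rated 1-10."""
    for name, value in (("importance", importance), ("satisfaction", satisfaction)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return importance + max(importance - satisfaction, 0)


def attractiveness(score: float) -> str:
    """Map a score onto the bands quoted in the text."""
    if score >= 15:
        return "highly attractive / under-served"
    if score >= 10:
        return "attractive"
    return "over-served"


# Important but poorly served: 9 + (9 - 3) = 15
print(attractiveness(opportunity_score(9, 3)))
# Already well served: 6 + max(6 - 8, 0) = 6
print(attractiveness(opportunity_score(6, 8)))
```

Because satisfaction can only reduce the bonus to zero, never below it, an over-served need still scores at least its raw importance.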
Harvest every assumption the bet depends on — from the claims ledger, the synthesis's open questions, and the internal FAQ. Classify each across the risk types: desirability / value, usability, feasibility, business viability, ethical.
Map them on the assumptions 2×2 (Y = importance: if wrong, does the idea fail?; X = evidence: do we have observable evidence, not opinion?). The top-right (important + no evidence) assumptions are the leap-of-faith ones — these get tested first.
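As a rough sketch of that 2×2, the leap-of-faith quadrant is simply the set of assumptions that are important and lack observable evidence. The `Assumption` class, its field names, and the sample entries below are illustrative assumptions, not part of the skill or its reference.

```python
from dataclasses import dataclass


@dataclass
class Assumption:
    statement: str
    risk_type: str      # desirability, usability, feasibility, viability, or ethical
    important: bool     # if wrong, does the idea fail?
    has_evidence: bool  # observable evidence, not opinion


def leap_of_faith(assumptions: list[Assumption]) -> list[Assumption]:
    """Return the important + no-evidence assumptions, which get tested first."""
    return [a for a in assumptions if a.important and not a.has_evidence]


# Hypothetical sample ledger, for illustration only.
ledger = [
    Assumption("Customers will pay monthly", "viability", True, False),
    Assumption("The API can handle the load", "feasibility", True, True),
    Assumption("Users want a dark theme", "desirability", False, False),
]

for a in leap_of_faith(ledger):
    print(a.statement)
```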
For each top-right assumption, write an experiment card (full template in the reference): assumption → XYZ hypothesis ("at least X% of [Y] will [Z]") → test method (interview / landing page / fake door / concierge / Wizard of Oz / pre-sale) → metric → pass/fail threshold set before running → cost & time → evidence strength.
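One way to hold an experiment card in code is a small frozen dataclass, so the threshold cannot be edited after the test has run. This is a sketch under stated assumptions: the field names mirror the chain described above, but the full template lives in the reference, and the `ExperimentCard` class and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentCard:
    assumption: str
    segment: str          # the [Y] in the XYZ hypothesis
    behavior: str         # the [Z] in the XYZ hypothesis
    threshold_pct: float  # the X, fixed before running
    method: str           # interview, landing page, fake door, ...
    metric: str
    cost_and_time: str
    evidence_strength: str

    def hypothesis(self) -> str:
        return f"At least {self.threshold_pct:g}% of {self.segment} will {self.behavior}"

    def passed(self, observed: int, exposed: int) -> bool:
        """Compare the observed rate against the pre-set threshold."""
        if exposed <= 0:
            raise ValueError("exposed must be positive")
        return 100 * observed / exposed >= self.threshold_pct


# Hypothetical card, for illustration only.
card = ExperimentCard(
    assumption="Freelance designers want automated invoicing",
    segment="freelance designers who visit the landing page",
    behavior="join the waitlist",
    threshold_pct=10,
    method="landing page",
    metric="waitlist sign-up rate",
    cost_and_time="one week, small ad budget",
    evidence_strength="medium (real behavior, low commitment)",
)

print(card.hypothesis())
print(card.passed(observed=14, exposed=200))  # 7% is below the 10% threshold
```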
Because real customer testing isn't possible from here, also estimate how each assumption would most likely resolve based on the research synthesis — give a directional call (likely true / uncertain / likely false) with an explicit confidence flag and the evidence behind it. Be honest where the estimate is weak; the estimate never replaces the test, it prioritizes it. Prefer experiments that measure real behavior (a click, a payment) over stated intent, and tell the human how to run interviews truthfully (Mom Test: ask about past specifics, never pitch, listen more than you talk).
Synthesize everything into the discovery brief: a sharp problem statement and a sharp bet, using the templates in the reference. The bet must include the single riskiest assumption and a falsification condition (metric, threshold, date).
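The falsification condition is the part of the bet that is easiest to leave vague, so it can help to write it as data. The sketch below assumes one particular reading: the bet counts as falsified once the date has arrived and the metric is still below the threshold. The class name, that rule, and the sample values are illustrative assumptions, not the reference's template.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FalsificationCondition:
    metric: str
    threshold: float
    deadline: date

    def is_falsified(self, observed: float, today: date) -> bool:
        """True once the deadline has arrived and the metric is below threshold."""
        return today >= self.deadline and observed < self.threshold


# Hypothetical condition, for illustration only.
condition = FalsificationCondition(
    metric="paid conversions from the pre-sale page",
    threshold=20,
    deadline=date(2030, 1, 31),
)

print(condition.is_falsified(observed=12, today=date(2030, 1, 15)))  # deadline not reached
print(condition.is_falsified(observed=12, today=date(2030, 2, 1)))   # missed the threshold
```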
Then run the "is it sharp enough to exit discovery?" checklist (full version in the reference) — problem-as-need not solution, named target customer, first-hand-ish evidence with confidence, measurable outcome, prioritized opportunity with rationale, explicit point of view, named riskiest assumption, falsification condition, riskiest assumption tested or cheaply testable with real behavior.
Close with an explicit verdict and the handoff:
product-idea PR/FAQ (e.g. wrong
customer, inflated TAM, weak differentiation) and note what changed.
Run discovery the way an elite, multidisciplinary research team would, never as one search summarized off the first page. Fan out multiple independent threads in parallel, each owned by a different specialist lens and pulling from a different class of source, then converge, triangulate, and adversarially verify. Breadth across domains and source types is what separates a sharp, defensible bet from a confident guess. The orchestration below is the same in every environment; only the tools change.
Scope each thread to a sharp question, never the whole idea. At minimum cover the threads that bear on the load-bearing claims:
Add domain-specific threads whenever the bet lives in a vertical (e.g. clinical evidence for health, on-chain & market data for fintech, scientific literature for deep tech, procurement cycles for govtech). An elite team brings in the right specialist per domain.
General web search · academic & preprint (arXiv, SSRN, scholar) · patents · primary competitor docs / pricing / changelogs / job posts · industry & analyst reports · news & recency scans · practitioner & expert long-form · community & review sites · and domain data sources (financial, scientific, government/statistics) when relevant. A claim that survives across multiple modalities is far stronger than one that appears in many pages of the same kind.
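The point about modalities can be made concrete by counting distinct source types per claim rather than raw page hits. This is a minimal sketch: the function name `modalities_per_claim` and the sample evidence are hypothetical.

```python
from collections import defaultdict


def modalities_per_claim(evidence):
    """evidence: iterable of (claim, source_modality) pairs.

    Returns the number of distinct modalities supporting each claim.
    """
    seen = defaultdict(set)
    for claim, modality in evidence:
        seen[claim].add(modality)
    return {claim: len(modalities) for claim, modalities in seen.items()}


# Hypothetical evidence log, for illustration only.
evidence = [
    ("market is growing", "news"),
    ("market is growing", "news"),
    ("market is growing", "news"),
    ("incumbents underprice", "competitor pricing page"),
    ("incumbents underprice", "community review site"),
    ("incumbents underprice", "analyst report"),
]

print(modalities_per_claim(evidence))
```

In this sample both claims have three supporting pages, but only the second is backed by three independent kinds of source.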
Use the strongest stack available — the thread plan above does not change.
In Claude Code (full power — orchestrate a real fan-out):
deep-research skill: the workhorse for each major thread. Hand it ONE sharply scoped question per thread (problem reality, sizing, competition, frontier-tech, why-now), never the whole idea. It fans out searches, fetches sources, adversarially verifies, and returns a cited report. Run several in parallel, one per thread, like a team working concurrently.
Workflow fan-out: spawn multiple ce-web-researcher agents (or author a Workflow that pipelines thread → verify) so threads run concurrently and independently. Give each agent a distinct lens so they don't converge on the same handful of sources; diversity of angle is the point.
jina: parallel_search_web for broad multi-query sweeps, search_arxiv / search_ssrn for the academic/frontier thread, parallel_read_url to pull many primary sources at once, search_images / screenshots for product teardowns.
firecrawl: crawl and scrape competitor sites, docs, and pricing pages for primary-source numbers rather than secondary summaries.
last30days: the recency / "why now" thread in fast-moving markets.
WebSearch / WebFetch: targeted gap-filling and reading specific pages.
Default to running the threads concurrently (parallel agents or a Workflow), then synthesize once all return; that is what makes this an elite team rather than one researcher doing serial lookups.
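The shape of that fan-out (launch every thread at once, then synthesize only after all return) can be sketched in plain Python. This only illustrates the control flow: in Claude Code the threads are run by the skills and agents named above, not by Python code, and `run_thread`, its return shape, and the sample questions are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def run_thread(lens: str, question: str) -> dict:
    """Hypothetical placeholder for one research thread.

    A real thread would be handed to a research skill or agent;
    this stub only echoes its inputs.
    """
    return {"lens": lens, "question": question, "findings": []}


def fan_out(threads: dict[str, str]) -> list[dict]:
    """Run every thread concurrently and return reports in submission order."""
    with ThreadPoolExecutor(max_workers=max(1, len(threads))) as pool:
        futures = [pool.submit(run_thread, lens, q) for lens, q in threads.items()]
        # result() blocks, so synthesis starts only once all threads are done.
        return [f.result() for f in futures]


# One sharply scoped question per thread (sample questions are hypothetical).
threads = {
    "problem reality": "How do teams handle this task today, and what does it cost them?",
    "sizing": "How many organizations have this problem?",
    "competition": "Which alternatives exist, and how are they priced?",
    "why-now": "What changed recently that makes this viable?",
}

reports = fan_out(threads)
print([r["lens"] for r in reports])
```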
In Claude Cowork / Claude Desktop:
jina, firecrawl, or domain-data servers) via Customize → Connectors; this is how you recover academic/parallel-read/crawl power and domain data.
deep-research, last30days, ce-web-researcher, Workflow) do not exist here. Replicate their effect by running the thread plan yourself across many native searches plus connectors, and keep the exact same triangulation and verification discipline.
Anywhere else (API / Agent SDK): use the WebSearch tool plus any attached MCP research servers, following the same thread plan.
If no research tooling is reachable at all, say so explicitly, proceed from reasoning and the user's input, mark the synthesis confidence Low throughout, and tell the user exactly which searches would raise it.
Write four Markdown files into a discovery/ folder next to the input (or a sensible
project location), so the artifacts can be reviewed and handed off independently:
00-discovery-brief.md: the tying-together brief, containing the sharp problem statement, the sharp bet (with falsification condition), the exit-gate checklist, and the verdict. This is the file that hands back to product-idea or forward to build.
01-research-synthesis.md
02-opportunity-map.md
03-assumption-tests.md
The brief should stand on its own; the other three are its evidence. Render the brief inline for quick one-offs if the user prefers not to create files.
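A minimal sketch of laying out that folder, assuming the input is a file on disk and that empty placeholder files are acceptable starting points. The helper name `scaffold_discovery` and the placeholder heading it writes are illustrative assumptions; only the four file names and the discovery/ folder come from the text above.

```python
from pathlib import Path

ARTIFACTS = [
    "00-discovery-brief.md",
    "01-research-synthesis.md",
    "02-opportunity-map.md",
    "03-assumption-tests.md",
]


def scaffold_discovery(input_path: str) -> list[Path]:
    """Create a discovery/ folder next to the input and return the four artifact paths."""
    folder = Path(input_path).resolve().parent / "discovery"
    folder.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in ARTIFACTS:
        target = folder / name
        if not target.exists():
            # Placeholder heading only; existing files are never overwritten.
            target.write_text(f"# {name}\n", encoding="utf-8")
        paths.append(target)
    return paths
```

Calling `scaffold_discovery("path/to/prfaq.md")` would create `path/to/discovery/` with the four files, leaving any that already exist untouched.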
Before handing over, reread with a cold eye for the discovery anti-patterns (expanded in the reference):
references/discovery-methods.md — the full frameworks and reusable templates: Teresa
Torres opportunity-solution trees, Marty Cagan's four big risks, the Lean Startup /
Testing Business Ideas assumptions map and experiment library + experiment-card template,
JTBD/ODI desired-outcome statements and the opportunity-score formula, the Mom Test rules,
the sharp-problem and sharp-bet templates, the exit-gate checklist, and the anti-pattern
table with counters. Read it before your first run.
npx claudepluginhub zone17/the-pipeline --plugin the-pipeline