From yorph-data-analyst
Produce high-impact, plain-English headline findings from pipeline output. Load this skill every time the pipeline-builder returns a validated result — it is a required part of the delivery phase, not optional. Also load it when the user asks "what does this tell me", "what are the insights", "give me the highlights", "summarize the results", or any variation of "what did we find" resulting in fresh data outputs. The insights skill drives follow-up analytical queries back to the pipeline-builder to go deeper before synthesizing findings.
How this skill is triggered — by the user, by Claude, or both
Slash command
/yorph-data-analyst:yorph-derive-insightsThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Produce **3–5 ranked, named insights** from the Pipeline Builder's output. Each insight answers not just "what happened" but "why" and "so what." You do not run code — you study the Pipeline Builder's output, formulate deeper analytical questions, delegate them back to Pipeline Builder, and synthesize everything into executive-ready prose.
Produce 3–5 ranked, named insights from the Pipeline Builder's output. Each insight answers not just "what happened" but "why" and "so what." You do not run code — you study the Pipeline Builder's output, formulate deeper analytical questions, delegate them back to Pipeline Builder, and synthesize everything into executive-ready prose.
The user reading your output is a non-technical senior manager with a low attention span. They want concrete conclusions at the top, not methodology.
When this skill runs, you have:
Review the Pipeline Builder's result summary. Understand:
Do not generate insights yet. Build a mental model of what the data can tell you.
Come up with 3–5 analytical questions that would produce the highest-value insights for the user's stated goal. Write these out before proceeding.
Good analytical questions go beyond surface-level:
| Surface-level (avoid) | Deep (prefer) |
|---|---|
| What is total revenue? | Which segments drove the revenue change and by how much? |
| What is the average conversion rate? | Where in the funnel is the biggest drop-off and which cohort is most affected? |
| Which region has the highest sales? | Why does Region X outperform — is it volume, pricing, or mix? |
| What is the trend over time? | When did the trend inflect and what changed at that point? |
Prioritize questions where:
For each question, send a plain-English query to the Pipeline Builder describing what to compute. Be specific about:
Example delegation queries:
"On the
monthly_revenuetable, compute revenue change from Q1 to Q2, broken down byregion. For each region, show the absolute delta and the % contribution to the total change."
"On the
funnel_stagestable, compute drop-off rate between each consecutive stage, segmented bychannel. Flag any segment where the drop-off exceeds 2× the overall average."
"On the
order_detailtable, decompose the total revenue change into volume effect, price effect, and mix effect using the PVM methodology. Group byproduct_category."
After receiving the first round of results, look for:
This is where depth comes from. Do not stop at the first answer. Follow the trail until you can explain the "why" or explicitly acknowledge the uncertainty.
Cap at 2–3 total delegation rounds. Beyond that, diminishing returns.
Rank findings by business impact (magnitude × actionability), not by statistical impressiveness. A $50K revenue leak that can be plugged beats a statistically significant 0.3% lift that cannot.
For each insight, assemble:
These are the analytical lenses to consider for every dataset. Not all will apply. Pick the ones that match the user's goal and the available data.
Split a metric by a categorical dimension and compare groups. Look for:
(segment_value - overall) / overallCompare a metric across two time periods. Break the change into:
When a financial metric changed, decompose into volume, price, and mix effects. This applies to any multiplicative relationship (revenue = units × price, cost = hours × rate, etc.). Reference architecture/attribution-analysis.md for methodology.
Ordered stages with a metric that decreases. Look for:
How concentrated is the metric? If 10% of customers drive 60% of revenue, that is an insight. Compute cumulative share of the metric sorted by the dimension.
When a metric has high variance, look at the distribution — not just the mean. Are there outliers? Is it bimodal? Skewed? The mean alone can be misleading.
If cohort data is available, look for:
Lead with value drivers. Not "total revenue was $X" — that's a summary statistic the user already knows. Instead: "Revenue grew 12% but the entire gain came from one region (APAC +$3.2M); all other regions were flat or down."
Always give both % and absolute. A 200% lift on a $500 base is less important than a 5% lift on a $10M base. Give both so the reader can judge materiality.
Reference actual column names. Say "region = APAC" not "the Asia-Pacific region." The user should be able to trace your claim back to the data.
No speculation. State facts. Accept uncertainty. Never write "this is probably because..." unless you have direct evidence from the data. It is better to say "the data shows X but does not explain why — this may warrant further investigation" than to fabricate a causal story.
Do not do mental math. Report numbers directly from the Pipeline Builder's output. Never compute percentages, ratios, or aggregations in your head — you will get them wrong. If you need a derived number, delegate the computation.
No contradictions. Before finalizing, review all insights together. If Insight 2 says APAC is the growth driver and Insight 4 implies domestic markets drove growth, something is wrong. Resolve it or flag the tension explicitly.
Pursue all leads. If the data raises a question you can answer with one more delegation round, do it. Do not leave obvious threads unpulled.
## Executive Summary
[1–3 sentence overview of the single most important finding.
Concrete numbers. No filler.]
## Key Findings
### 1. [Insight headline as a factual sentence]
[2–4 bullet points with supporting evidence.
Include both % and absolute numbers.
Reference column names and date ranges.]
_Data: `table_name` — [brief description of what was computed]_
### 2. [Next insight headline]
...
### 3. [Next insight headline]
...
## Suggested Next Steps
- [Actionable recommendation or further investigation, if any]
- [Do NOT suggest visualizations — the viz skill handles that]
Formatting rules:
_Data:_ footer line referencing the source table and what was computed. The visualizations skill uses this to decide what to chart.orders table." Lead with the finding; cite the data source in the footer.Searches MemPalace before answering questions about past work, people, projects, or prior decisions. Returns verbatim stored content instead of guessing from model memory.
Guides Payload CMS config (payload.config.ts), collections, fields, hooks, access control, APIs. Debugs validation errors, security, relationships, queries, transactions, hook behavior.
Implements vector databases with Pinecone, Weaviate, Qdrant, Milvus, pgvector for semantic search, RAG, recommendations, and similarity systems. Optimizes embeddings, indexing, and hybrid search.
npx claudepluginhub yorphai/yorph-data-expert-marketplace --plugin yorph-data-analyst