Self-assesses AI usage quality across 5 dimensions (thought amplification, self-improvement, etc.) using session metrics and memory data. Invoke via /reflect.
Slash command: /epic:reflect
This skill is for **you** (the human) to reflect on how well you're leveraging AI as a thought amplifier — not a review of agent performance.
Data source: The reflect hook (session-end) automatically collects observations, analyzes patterns, and updates metrics.json. This skill consumes that hook-produced data to produce a human-readable self-assessment.
Hook (auto) Skill (on-demand /reflect)
───────────── ──────────────────────────
observe → obs/*.jsonl ──→ epic reflect --context 30
evolve → metrics.json ──→ 5-dimension scorecard
seed → evolved skills ──→ Action items for the human
ingest → memory graph ──→ Trend analysis
No score without evidence. Every rating must directly cite at least one of: obs stats, evolution patterns, memory nodes, or session summaries. Block self-serving bias: "doing well" conclusions require concrete metrics.
# Uses Rust subcommand — works on all platforms (Linux, macOS, Windows)
epic reflect --context 30 > /tmp/reflect_ctx.json
Fallback if subcommand fails:
echo "obs_files: $(ls "$HARNESS_DIR/obs/" | wc -l)"
python3 -c "import json; m=json.load(open('$HARNESS_DIR/metrics.json')); print('total_sessions:', m.get('total_sessions',0))"
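Where shell quoting gets in the way, the same fallback can be sketched in Python. This is a minimal sketch that assumes only what the commands above imply: an obs/ directory and a metrics.json with a total_sessions key under the harness directory. The function name `collect_fallback` is hypothetical, and unlike `ls | wc -l` it counts regular files only.

```python
import json
from pathlib import Path


def collect_fallback(harness_dir: str) -> dict:
    """Count observation files and read total_sessions, tolerating missing data."""
    root = Path(harness_dir)

    # A missing obs/ directory counts as zero observation files.
    obs_dir = root / "obs"
    obs_files = 0
    if obs_dir.is_dir():
        obs_files = sum(1 for entry in obs_dir.iterdir() if entry.is_file())

    # A missing metrics.json (or a missing key) counts as zero sessions.
    metrics_path = root / "metrics.json"
    total_sessions = 0
    if metrics_path.is_file():
        with metrics_path.open(encoding="utf-8") as fh:
            total_sessions = json.load(fh).get("total_sessions", 0)

    return {"obs_files": obs_files, "total_sessions": total_sessions}
```

Returning zeros instead of raising keeps the reflection going on thin data, which fits the rule that insufficient data is a finding to report rather than a reason to skip the assessment.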
Query memory (if active):
epic mem recall "AI usage patterns decisions metacognition" --limit 8
epic mem list --type decision --limit 5
epic mem list --type pattern --limit 5
Score each dimension independently: 1–10 + evidence citation + one-line diagnosis.
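One way to make the "no score without evidence" rule mechanical is a record type that refuses to exist without a citation. The sketch below is illustrative only: the class name, field names, and evidence-kind labels are assumptions, chosen to mirror the four evidence sources listed earlier.

```python
from dataclasses import dataclass

# Evidence categories named by the "no score without evidence" rule.
# The string labels themselves are hypothetical.
EVIDENCE_KINDS = {"obs_stats", "evolution_patterns", "memory_nodes", "session_summaries"}


@dataclass(frozen=True)
class DimensionScore:
    dimension: str
    score: int          # 1-10
    evidence_kind: str  # one of EVIDENCE_KINDS
    evidence: str       # the cited figure, pattern, or node
    diagnosis: str      # one-line diagnosis

    def __post_init__(self) -> None:
        if not 1 <= self.score <= 10:
            raise ValueError("score must be between 1 and 10")
        if self.evidence_kind not in EVIDENCE_KINDS:
            raise ValueError(f"unknown evidence kind: {self.evidence_kind}")
        if not self.evidence.strip():
            raise ValueError("no score without evidence")
        if "\n" in self.diagnosis.strip():
            raise ValueError("diagnosis must fit on one line")
```

For example, `DimensionScore("Self-Improvement", 6, "evolution_patterns", "trend=stable, stagnation=2x", "Plateau after early gains")` is accepted, while the same call with an empty evidence string raises `ValueError`.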
### Thought Amplification
Question: Is AI a mere executor (code typist) or a genuine thought partner?
Metrics:
- Agent / total_obs (higher = delegated thinking)
Scoring:
| Score | Signal |
|---|---|
| 8–10 | Agent delegation ≥ 5%, diverse skill usage, council execution recorded |
| 5–7 | Mostly Bash/Read/Edit, occasional Agent, complex decisions made solo |
| 1–4 | Bash+Edit 90%+, AI at autocomplete level |
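The numeric thresholds in this table can be checked with a few lines of Python. This sketch assumes the observation data has already been reduced to a per-tool call count (a hypothetical `tool_counts` mapping), and it covers only the two measurable signals. Checking the low band first is also an assumption, since the table does not say which rule wins when both thresholds are met.

```python
def thought_amplification_band(tool_counts: dict[str, int]) -> str:
    """Map per-tool call counts to a score band from the table above."""
    total = sum(tool_counts.values())
    if total == 0:
        # Nothing observed: no band can be justified.
        return "no-evidence"

    agent_ratio = tool_counts.get("Agent", 0) / total
    bash_edit_share = (tool_counts.get("Bash", 0) + tool_counts.get("Edit", 0)) / total

    if bash_edit_share >= 0.90:
        return "1-4"
    if agent_ratio >= 0.05:
        # Candidate only: diverse skill usage and council execution
        # still need to be confirmed separately.
        return "8-10"
    return "5-7"
```

The returned band narrows the range; the exact score inside it still depends on the qualitative signals in the same row.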
### Self-Improvement
Question: Learning from mistakes, or repeating the same patterns?
Metrics:
- evolution_stats.pattern_frequency: recurring patterns?
- evolution_stats.stagnation_count: stagnant sessions
- evolution_stats.trend_last10: improving/stable/declining distribution
Scoring:
| Score | Signal |
|---|---|
| 8–10 | Trend improving ≥ 60%, no pattern recurrence, evolved skills growing |
| 5–7 | Mostly stable trend, some pattern repeats, evolved skills plateau |
| 1–4 | Trend declining or frequent stagnation, same mistakes repeated |
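The trend threshold can be computed directly from the last-10 distribution. The sketch below assumes `trend_last10` is a list of labels ("improving", "stable", "declining"), one per session; the real field may be stored differently. Treating "trend declining" as "declining is the single most common label" is also an assumption, and stagnation and pattern recurrence are not covered here.

```python
from collections import Counter

LABELS = ("improving", "stable", "declining")


def trend_shares(trend_last10: list[str]) -> dict[str, float]:
    """Return the share of each trend label, or all zeros for an empty list."""
    counts = Counter(trend_last10)
    n = len(trend_last10)
    return {label: (counts[label] / n if n else 0.0) for label in LABELS}


def self_improvement_band(trend_last10: list[str]) -> str:
    """Map the trend distribution to a score band from the table above."""
    if not trend_last10:
        return "no-evidence"
    shares = trend_shares(trend_last10)
    if shares["improving"] >= 0.60:
        return "8-10"
    if shares["declining"] > max(shares["improving"], shares["stable"]):
        return "1-4"
    return "5-7"
```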
### Metacognitive Expansion
Question: Are conversations with AI helping you recognize and upgrade your own thinking?
Metrics:
- /discover /spec execution history (problem-framing practice)
Scoring:
| Score | Signal |
|---|---|
| 8–10 | Regular decision nodes, /discover /spec used, ADRs exist |
| 5–7 | Intermittent recording, decision rationale only in code, no explicit notes |
| 1–4 | Almost no memory nodes, context breaks between sessions |
### Prompt Engineering
Question: Is prompt quality improving over time?
Metrics:
- output_quality dimension average trend (metrics score_history)
- tool_success rate trend
Scoring:
| Score | Signal |
|---|---|
| 8–10 | output_quality ≥ 0.80, score trend upward, evolved prompts increasing |
| 5–7 | output_quality 0.65–0.80, improvement plateau |
| 1–4 | output_quality < 0.65, downward trend, frequent re-edits |
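The output_quality cut-offs translate into a small lookup. This sketch assumes output_quality is an average on a 0 to 1 scale, which the thresholds suggest but the text does not state. It ignores the trend and evolved-prompt signals, so it gives a starting band rather than a final score.

```python
def prompt_engineering_band(output_quality: float) -> str:
    """Map an average output_quality value to a score band from the table above."""
    if not 0.0 <= output_quality <= 1.0:
        raise ValueError("output_quality is assumed to lie between 0 and 1")
    if output_quality >= 0.80:
        return "8-10"
    if output_quality >= 0.65:
        return "5-7"
    return "1-4"
```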
### Execution Efficiency
Question: Are you achieving the same goals faster and cheaper through AI?
Metrics:
- execution_cost dimension average (1.0 = optimal)
Scoring:
| Score | Signal |
|---|---|
| 8–10 | execution_cost ≥ 0.90, parallel Agent usage, compaction < 20% of sessions |
| 5–7 | Bash 40–60%, mostly single-agent serial execution |
| 1–4 | Bash 70%+, no sub-agent usage, trivial tasks delegated to AI |
## AI Thought-Amplifier Reflection Report
Generated: {ISO-8601} | Window: {N} days | Total sessions: {total_sessions}
| Dimension | Score | Grade | Key Evidence |
|----------------------|-------|-------|--------------|
| Thought Amplification | X/10 | 🔴/🟡/🟢 | Agent {N}x ({P}%), Skill {M}x |
| Self-Improvement | X/10 | 🔴/🟡/🟢 | trend={T}, stagnation={S}x |
| Metacognitive Expansion | X/10 | 🔴/🟡/🟢 | decisions {D}, session notes {M} |
| Prompt Engineering | X/10 | 🔴/🟡/🟢 | output_quality={Q}, trend={Δ} |
| Execution Efficiency | X/10 | 🔴/🟡/🟢 | execution_cost={C}, Bash%={B} |
| **Overall** | **X/10** | | |
Grade: 🟢 8–10 (good) 🟡 5–7 (fair) 🔴 1–4 (needs improvement)
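The grade legend and the row layout of the scorecard can be produced by a small helper. This is a sketch with hypothetical function names; it renders one table row at a time and leaves the Overall row alone, since the source does not say how the overall score is derived.

```python
def grade(score: int) -> str:
    """Return the traffic-light grade for a 1-10 score, per the legend above."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    if score >= 8:
        return "🟢"
    if score >= 5:
        return "🟡"
    return "🔴"


def render_row(dimension: str, score: int, evidence: str) -> str:
    """Format one scorecard row in the report's table layout."""
    return f"| {dimension} | {score}/10 | {grade(score)} | {evidence} |"
```

For instance, `render_row("Self-Improvement", 6, "trend=stable, stagnation=2x")` yields a row graded 🟡.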
3–5 sentences. Rules:
Format: [Priority] Title — Concrete action — Expected impact
### Next Reflection Actions
1. [HIGH] {title}
- Action: {concrete steps}
- Metric: {how to measure}
- Deadline: {session count or date}
2. [MED] ...
3. [LOW] ...
Recommended action pool (select by low-scoring dimensions):
- /evolve history periodic review, manual pattern notes
- epic mem add --type decision after every important decision
Save reflection to memory (if mem tools active):
epic mem add \
--type session \
--title "AI usage reflection {date}" \
--tags "reflection,metacognition" \
--importance 0.8 \
--body "Overall: {score}/10. Lowest: {lowest_dim}. Top action: {top_action}"
| Excuse | Rebuttal | What to do instead |
|---|---|---|
| "Too few sessions for accurate reflection" | Even 3 sessions reveal patterns. Insufficient data is not an excuse for a high score. | Reflect on available data; add data collection improvement as an action item. |
| "Bash-heavy because it's a Rust project" | Tool usage distribution doesn't only reflect task type. Check for Bash calls that could be delegated to Agent. | Find ≥ 3 Bash calls that could be replaced by Agent delegation. |
| "Score 0.75 is good enough" | 0.75 is not an absolute standard. Trend vs previous period matters more. | Compare last 5-session average vs prior 5-session average in score_history. |
| "No memory needed — context is enough" | Context dies at session end. Without cross-session learning continuity, you start from scratch every time. | Start the habit of epic mem add after every important decision — now. |
| "Code output is already good, so it's fine" | Code output quality ≠ thought amplification. Good code can still mean AI is doing the thinking for you. | Count how many decisions in the last 5 sessions you actually designed yourself. |
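The comparison suggested in the "0.75 is good enough" row (last 5 sessions versus the prior 5) is a short calculation. The sketch assumes score_history can be read as a chronological list of per-session scores, oldest first; the actual layout in metrics.json may differ, and the function name is hypothetical.

```python
from statistics import mean
from typing import Optional


def recent_vs_prior(score_history: list[float], window: int = 5) -> Optional[float]:
    """Return mean(last `window`) minus mean(previous `window`).

    Returns None when there are fewer than 2 * window scores, so a
    missing comparison is never mistaken for "no change".
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(score_history) < 2 * window:
        return None
    recent = mean(score_history[-window:])
    prior = mean(score_history[-2 * window:-window])
    return recent - prior
```

A positive result means the recent window scored higher than the one before it; a score of 0.75 with a negative delta is a decline, not a steady state.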
- /tmp/reflect_ctx.json or manually collected data exists