From claude-impl-tools
Logs cross-project outcomes and recalls lessons to inform new sessions, avoiding past mistakes. Analyzes skill executions for better routing. Use /memento modes: log, global recall, health, route.
Slash command: /claude-impl-tools:memento
> Inspired by [Memento-Skills (arXiv:2603.18743)](https://arxiv.org/abs/2603.18743) — "Let Agents Design Agents"
Core idea: Skills evolve through execution experience, not just manual editing. LLM parameters stay frozen; only SKILL.md files and routing knowledge change.
Cross-project knowledge memory with two primary purposes:
Skill intelligence is secondary — analyzing execution data to improve future routing:
Invoke with: /memento <mode>
| Mode | Purpose | When |
|---|---|---|
| log | Record what happened: successes, failures, decisions | After any significant outcome |
| global recall <topic> | Retrieve cross-project learnings on a topic | Session start, before new work |
| global search <query> | Full-text search across all projects | Need specific past experience |
These modes analyze skill execution data to improve future routing. They are a means of improving recommendations, not the primary purpose.
| Mode | Purpose | When |
|---|---|---|
| route <task> | Recommend best skill with experience weighting | workflow-guide static rules insufficient |
| health | Skill ecosystem dashboard | Periodic review |
| reflect <skill> | Failure pattern analysis + improvement suggestions | Skill underperforms repeatedly |
| profile <skill> | Detailed execution history for one skill | Before modifying or deprecating |
| harness <skill> | Auto-generate deterministic guardrails from failures | Recurring failures caught by code |
| Mode | Purpose |
|---|---|
| global health | Unified ecosystem dashboard across all projects |
| global sync | Sync MEMORY.md files → DuckDB |
| global sql <query> | Direct SQL on unified experience store |
log
Record what happened after a skill ran. This is the foundation: without experience data, routing and reflection have nothing to learn from.
Read the experience schema from references/experience-schema.md, then create an entry:
# Append to the project's experience store
STORE="${PROJECT_ROOT}/.claude/memento/experience.jsonl"
mkdir -p "$(dirname "$STORE")"
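The shell lines above only prepare the store. A minimal Python sketch of the append step follows, assuming one JSON object per line. The field names (`timestamp`, `skill`, `task`, `outcome`) and the helper `log_experience` are hypothetical; the authoritative format is whatever references/experience-schema.md defines.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_experience(project_root, skill, task, outcome):
    """Append one experience entry as a single JSON line; return the store path."""
    store = Path(project_root) / ".claude" / "memento" / "experience.jsonl"
    store.parent.mkdir(parents=True, exist_ok=True)  # equivalent of mkdir -p
    entry = {
        # Hypothetical field names; see references/experience-schema.md.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "skill": skill,
        "task": task,
        "outcome": outcome,  # success | partial | failure | unknown
    }
    # Append mode keeps the log append-only: existing lines are never rewritten.
    with store.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return store
```

JSON Lines suits an append-only log because each entry can be written and parsed independently of the rest of the file.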
Observe these signals to judge success. Don't ask the user — infer from context:
| Signal | Interpretation | Confidence |
|---|---|---|
| User proceeds to next task | success | high |
| Explicit positive ("좋아" (Korean for "good"), "perfect") | success | very high |
| Quality gate passed (/checkpoint, /audit) | success | very high |
| Same skill re-invoked immediately | partial | medium |
| User corrects output or says "아니" (Korean for "no") | failure | high |
| Quality gate failed | partial | high |
| Session ends without feedback | unknown — exclude from stats | low |
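One way to make the signal table machine-readable is a plain lookup, sketched here. The signal keys (`user_proceeds`, `skill_reinvoked`, and so on) are hypothetical labels for the table rows, not names the skill defines; detecting the signals from a live session is out of scope.

```python
# Hypothetical signal keys mapped to (outcome, confidence), one per table row.
SIGNALS = {
    "user_proceeds": ("success", "high"),
    "explicit_positive": ("success", "very high"),
    "quality_gate_passed": ("success", "very high"),
    "skill_reinvoked": ("partial", "medium"),
    "user_corrects": ("failure", "high"),
    "quality_gate_failed": ("partial", "high"),
    "session_ended": ("unknown", "low"),
}


def infer_outcome(signal):
    """Return (outcome, confidence); unrecognized signals are treated as unknown."""
    return SIGNALS.get(signal, ("unknown", "low"))


def counts_toward_stats(outcome):
    """Entries with an unknown outcome are excluded from statistics."""
    return outcome != "unknown"
```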
The ideal setup: a PostToolUse hook on Skill invocations that writes experience entries automatically. See references/hook-setup.md for the hook configuration. Until automated, log manually after significant skill runs.
route <task description>
Recommend the best skill using a 3-layer scoring system that improves as experience accumulates.
Layer 1: Rule-based matching (existing workflow-guide logic)
  ↓ produces rule_scores: {skill: 0.0-1.0}
Layer 2: Experience-based matching
  ↓ find similar past tasks in experience.jsonl
  ↓ compute success_rate per skill, weighted by:
  ↓   - recency (recent experiences count more, decay=0.95)
  ↓   - similarity (closer task signatures count more)
  ↓ produces exp_scores: {skill: 0.0-1.0}
Layer 3: Blend
  ↓ alpha = min(0.7, 0.3 + experience_count * 0.04)
  ↓   → starts at 0.3 (rules dominate)
  ↓   → grows to 0.7 (experience dominates at 10+ data points)
  ↓ final_score = (1-alpha)*rule_score + alpha*exp_score
Cold start is handled gracefully: with zero experience, alpha=0.3 and exp_score=0.5 (neutral), so the existing workflow-guide rules drive routing. As experience accumulates, data gradually takes over.
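The Layer 3 blend can be written out directly from the formulas given. This is a sketch of the arithmetic only; the function names `blend_alpha` and `final_score` are illustrative, and the full algorithm with edge cases lives in references/smart-router.md.

```python
NEUTRAL_EXP_SCORE = 0.5  # experience score used when no past data exists


def blend_alpha(experience_count):
    """Experience weight: 0.3 with no data, capped at 0.7 from 10 data points on."""
    return min(0.7, 0.3 + experience_count * 0.04)


def final_score(rule_score, exp_score, experience_count):
    """Blend the rule-based and experience-based scores for one skill."""
    alpha = blend_alpha(experience_count)
    return (1 - alpha) * rule_score + alpha * exp_score
```

With the figures from the sample recommendation (rule match 0.82, experience 0.91, 7 data points), alpha is 0.58 and the blended score is about 0.87.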
From the user's request, extract:
Use these dimensions for similarity matching against past experiences.
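A rough sketch of how Layer 2 might combine similarity and recency into an experience score. Several details are assumptions, not taken from the source: the signature keys (`intent`, `scale`), the entry fields (`skill`, `outcome`, `date`, `signature`), applying the 0.95 decay per day of age, and giving a partial outcome half credit.

```python
from datetime import date

DECAY = 0.95  # recency decay from the routing description; per-day use is assumed
CREDIT = {"success": 1.0, "partial": 0.5, "failure": 0.0}  # assumed credit values


def similarity(sig_a, sig_b):
    """Fraction of signature dimensions on which two tasks agree."""
    keys = set(sig_a) | set(sig_b)
    if not keys:
        return 0.0
    return sum(1 for k in keys if sig_a.get(k) == sig_b.get(k)) / len(keys)


def experience_score(task_sig, past_entries, skill, today):
    """Weighted success rate for one skill; 0.5 (neutral) when nothing applies."""
    weighted_credit = 0.0
    total_weight = 0.0
    for entry in past_entries:
        # Skip other skills and outcomes without credit (for example "unknown").
        if entry["skill"] != skill or entry["outcome"] not in CREDIT:
            continue
        age_days = max((today - entry["date"]).days, 0)
        weight = (DECAY ** age_days) * similarity(task_sig, entry["signature"])
        weighted_credit += weight * CREDIT[entry["outcome"]]
        total_weight += weight
    return weighted_credit / total_weight if total_weight > 0 else 0.5
```

Multiplying the two weights means an old but very similar task and a recent but loosely related one can contribute comparably; other combinations are equally defensible.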
Present the recommendation with confidence and evidence:
📊 Memento Route: maintenance (confidence: 0.87)
Rule match: 0.82 (source_code=yes, intent=bugfix)
Experience: 0.91 (7 similar tasks, 6 succeeded with /maintenance)
Blend α: 0.58 (experience weight, based on 7 data points)
Alternative: /agile iterate (0.64)
Recent similar:
• 2026-03-25 auth middleware fix → /maintenance → success
• 2026-03-22 payment validation → /maintenance → success
• 2026-03-20 cross-domain refactor → /maintenance → partial
health
Display a skill ecosystem dashboard. Read all experience data, generate profiles, and present:
For each skill with experience data, auto-generate a profile. See references/skill-profile-schema.md for the schema. Profiles include:
Store profiles in .claude/memento/profiles/<skill-name>.json.
reflect <skill-name>
Analyze why a skill underperforms and suggest concrete improvements.
When confidence is high (>0.8) and the pattern is clear:
/autoresearch-skills with the new test case
When confidence is medium (0.5-0.8):
When confidence is low (<0.5):
🔍 Memento Reflect: maintenance
Evidence: 12 runs analyzed (8 success, 3 partial, 1 failure)
Pattern detected: cross-domain failures
• 3/3 cross-domain tasks resulted in partial or failure
• All succeeded for single_file and multi_file scales
Hypothesis: SKILL.md lacks guidance for cross-domain impact analysis.
The 5-stage ITIL process jumps to modification without checking
cross-domain dependencies first.
Proposed fix: Add "Step 0: Run /impact for cross-domain changes"
before Stage 3 (Safe Modification).
Confidence: 0.75 (3 consistent data points)
Recommendation: Present to user for approval before modifying.
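The three confidence bands used by reflect can be expressed as a small classifier. This sketch only names the band. How the boundary values are handled is an assumption: the source gives >0.8, 0.5-0.8, and <0.5, so exactly 0.8 and exactly 0.5 are treated here as medium.

```python
def confidence_tier(confidence):
    """Map a reflect confidence value to its band: high, medium, or low."""
    if confidence > 0.8:
        return "high"
    if confidence >= 0.5:  # assumed: both boundary values count as medium
        return "medium"
    return "low"
```

The sample report's confidence of 0.75 falls in the medium band, which matches its recommendation to ask the user before modifying anything.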
profile <skill-name>
Show the complete execution history and statistics for one skill.
Read references/skill-profile-schema.md for the data structure. Display:
harness <skill-name>
Inspired by AutoHarness (arXiv:2603.03329): LLM agents that generate their own guardrail code.
Analyze failure patterns and generate validation scripts that prevent recurring failures. The key insight from AutoHarness: a small model + code guardrails beats a large model without them.
| Type | File | Purpose | LLM needed at runtime? |
|---|---|---|---|
| Pre-check | scripts/harness/pre_check.sh | Validate environment before skill runs | No |
| Action-verifier | scripts/harness/verify_action.py | Check proposed actions are valid | No |
| Post-verify | scripts/harness/post_verify.sh | Confirm skill achieved its goal | No |
1. Read experience.jsonl for target skill
2. Group failures by root cause:
   - Missing prerequisites (file not found, tool not installed)
   - Invalid actions (wrong file modified, forbidden operation)
   - Incomplete results (partial output, missing verification)
3. For each failure pattern, generate a validation script:
   - pre_check.sh: catches prerequisite failures
   - verify_action.py: catches invalid action patterns
   - post_verify.sh: catches incomplete results
4. Test the harness against past failure cases
5. If it would have caught the failures → install to skill's scripts/harness/
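Step 2 of the procedure above, grouping failures by root cause, might be approximated with keyword heuristics as sketched here. The `notes` field, the keyword lists, and the choice to count `partial` outcomes as failures are all assumptions for illustration; the actual algorithm is described in references/harness-generation.md.

```python
from collections import defaultdict

# Hypothetical keyword heuristics for the three root-cause groups.
ROOT_CAUSES = {
    "missing_prerequisite": ("not found", "not installed", "no such file"),
    "invalid_action": ("wrong file", "forbidden"),
    "incomplete_result": ("partial output", "missing verification"),
}

# Which harness script addresses each root cause (from the list above).
HARNESS_FOR = {
    "missing_prerequisite": "pre_check.sh",
    "invalid_action": "verify_action.py",
    "incomplete_result": "post_verify.sh",
}


def classify(note):
    """Return the first root cause whose keywords appear in the note."""
    text = note.lower()
    for cause, keywords in ROOT_CAUSES.items():
        if any(keyword in text for keyword in keywords):
            return cause
    return "unclassified"


def group_failures(entries, skill):
    """Group one skill's failed or partial runs by root cause."""
    groups = defaultdict(list)
    for entry in entries:
        if entry["skill"] == skill and entry["outcome"] in ("failure", "partial"):
            groups[classify(entry.get("notes", ""))].append(entry)
    return dict(groups)
```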
If the maintenance skill fails 3/3 times on cross-domain changes because it doesn't check dependencies first, the generated pre-check could look like this:
#!/bin/bash
# scripts/harness/pre_check.sh (auto-generated)
# Harness: cross-domain dependency check
# Generated from 3 failure cases (2026-03-25, 03-22, 03-20)
# Files changed relative to HEAD (empty outside a git repository)
CHANGED_FILES=$(git diff --name-only HEAD 2>/dev/null)
# Count distinct top-level path components, treated here as "domains"
DOMAINS=$(echo "$CHANGED_FILES" | sed 's|/.*||' | sort -u | wc -l)
if [ "$DOMAINS" -gt 1 ]; then
    echo "HARNESS_WARN: Cross-domain change detected ($DOMAINS domains)."
    echo "HARNESS_SUGGEST: Run /impact first to map dependencies."
    exit 1  # non-zero exit blocks the skill run
fi
exit 0
skill-name/
├── SKILL.md
└── scripts/
    └── harness/             ← Auto-generated by /memento harness
        ├── pre_check.sh     ← Runs before skill, exit 1 = block
        ├── verify_action.py ← Validates proposed actions
        └── post_verify.sh   ← Runs after skill, exit 1 = warn
Harness scripts are deterministic (no LLM calls). They're the cheapest possible guardrails.
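A sketch of how a caller might run the shell harness scripts and interpret their exit codes. The runner itself (`run_harness`, `interpret`) is hypothetical. The source only specifies exit 1, so treating every non-zero code the same way is an assumption, as is returning "ok" when no harness has been generated yet.

```python
import subprocess
from pathlib import Path

SCRIPTS = {"pre": "pre_check.sh", "post": "post_verify.sh"}


def interpret(stage, returncode):
    """Exit 0 passes; otherwise a pre-check blocks and a post-verify warns."""
    if returncode == 0:
        return "ok"
    return "block" if stage == "pre" else "warn"


def run_harness(skill_dir, stage):
    """Run the harness script for a stage ('pre' or 'post') if it exists."""
    script = Path(skill_dir) / "scripts" / "harness" / SCRIPTS[stage]
    if not script.exists():
        return "ok"  # assumed: a skill without a harness is not blocked
    result = subprocess.run(["bash", str(script)], capture_output=True, text=True)
    return interpret(stage, result.returncode)
```

Keeping the exit-code policy in one small function makes the block-versus-warn distinction easy to test without running any scripts.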
See references/harness-generation.md for the full generation algorithm and templates.
global <subcommand>
Unified Experience Store powered by DuckDB. Breaks project silos by making 66 memory files from 22 projects searchable as a single SQL database.
pip install duckdb # one-time setup
python3 ~/.claude/memento/query.py sync # sync MEMORY.md → DB
global search <query>: search learnings across all projects
python3 ~/.claude/memento/query.py search "cross-domain bugfix"
# → finds related learnings across 14 projects
global recall <topic>: retrieve cross-project knowledge on a specific topic
python3 ~/.claude/memento/query.py recall "FastAPI authentication"
# → prioritizes feedback/project types, returns actionable knowledge
global health: unified dashboard
python3 ~/.claude/memento/query.py health
# → per-project learning status + skill health (when experience data exists)
global sync: sync MEMORY.md files
python3 ~/.claude/memento/query.py sync
# → upserts every project's memory files into the DB
global sql <query>: direct SQL
python3 ~/.claude/memento/query.py sql "SELECT type, COUNT(*) FROM learnings GROUP BY type"
~/.claude/memento/experience.duckdb ← Global unified store
~/.claude/memento/query.py ← CLI query tool
| Name | Type | Content |
|---|---|---|
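| experience | table | Skill execution experience (accumulated from memento log) |
| learnings | table | MEMORY.md files (unified across 22 projects) |
| skill_health | view | Per-skill usage count, success rate, average tokens |
| project_knowledge | view | Per-project learning statistics |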
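The skill_health view is defined inside the DuckDB store, which this document does not show. As an illustration of what such a view aggregates, here is a pure-Python sketch over experience entries. The field names (`skill`, `outcome`, `tokens`) are assumptions, as is the choice to count unknown outcomes toward usage while excluding them from the success rate.

```python
from collections import defaultdict


def skill_health(entries):
    """Per-skill usage count, success rate, and average tokens."""
    raw = defaultdict(lambda: {"runs": 0, "rated": 0, "successes": 0, "tokens": 0})
    for entry in entries:
        stats = raw[entry["skill"]]
        stats["runs"] += 1
        stats["tokens"] += entry.get("tokens", 0)
        if entry["outcome"] != "unknown":  # unknown outcomes are left out of the rate
            stats["rated"] += 1
            stats["successes"] += int(entry["outcome"] == "success")
    return {
        skill: {
            "usage_count": s["runs"],
            "success_rate": s["successes"] / s["rated"] if s["rated"] else None,
            "avg_tokens": s["tokens"] / s["runs"],
        }
        for skill, s in raw.items()
    }
```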
All memento data lives in .claude/memento/ within the project:
.claude/memento/
├── experience.jsonl    ← Append-only execution log
└── profiles/           ← Auto-generated skill profiles
    ├── maintenance.json
    ├── agile.json
    └── ...
Experience is project-scoped because skill effectiveness varies by project type. A skill that works well for a web app may not suit a CLI tool.
For detailed schemas and technical specifications:
- references/experience-schema.md: Experience log entry format
- references/skill-profile-schema.md: Skill profile data structure
- references/smart-router.md: Full routing algorithm with edge cases
- references/hook-setup.md: Automated experience logging via hooks
- references/harness-generation.md: AutoHarness-inspired validation script generation
Install: npx claudepluginhub insightflo/claude-impl-tools --plugin claude-impl-tools