From field-report
Generate structured performance reports on plugins, skills, and agents by analysing Claude Code session conversations. Produces evidence-based narrative reports with actionable recommendations for artifact developers. Triggers: "field report", "analyse session", "skill performance", "agent report", "session analysis", "how did X perform".
Slash command: /field-report:field-report
Generate a structured field report for one subject (plugin, skill, or agent) using one Claude Code session as evidence. The output is a narrative evaluation focused on concrete session evidence, clear limitations, and actionable recommendations for the subject maintainer.
This skill supports two input modes for selecting the session: explicit session ID (ses_...) or natural language description (for example, "last session", "today's session", or "session where I used reason"). Session discovery is part of the core workflow.
The report must stay privacy-safe through explicit sanitisation rules, avoid dumping raw conversation content, and avoid numeric scoring language. The workflow terminates after writing the report and presenting a concise completion summary.
You are an evidence-first analyst. Follow all steps in order. Do not skip steps. Do not infer evidence that is not present in the session data.
Parse the command arguments into:
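- Session reference: explicit session ID (ses_...) or natural language description.
- Subject: for example reason, crosscheck/reason, or byfuglien.
- Optional path to the subject's SKILL.md or agent.md file.
Rules: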
- If the session reference starts with ses_, treat it as explicit session ID and continue to Step 2.
Abort conditions:
/field-report ses_abc123 reason
/field-report "last session" crosscheck/reason
/field-report "session where I used byfuglien" byfuglien crosscheck/agents/byfuglien.md
Resolve natural language session descriptions into one session ID using session_list and session_search.
Discovery strategy:
- session_list(limit=1) and select that session.
- session_list(from_date=<today-iso>, limit=20) and prefer newest match.
- session_list(limit=10).
- session_search(query=<X>, session_id=<candidate>).
Selection and disambiguation:
Abort conditions:
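The argument handling that feeds session discovery can be sketched in Python as below. This is a minimal illustration, not the skill's own implementation: the names FieldReportArgs and parse_field_report_args are hypothetical, and the ValueError cases are assumed stand-ins for abort conditions that are not spelled out here. Only the ses_ prefix rule comes from the text above.

```python
import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldReportArgs:
    session_ref: str                # "ses_..." or a natural language description
    subject: str                    # e.g. "reason" or "crosscheck/reason"
    definition_path: Optional[str]  # optional SKILL.md / agent.md path
    explicit_session: bool          # True when session_ref starts with "ses_"


def parse_field_report_args(raw: str) -> FieldReportArgs:
    """Split the text that follows /field-report into two or three arguments."""
    tokens = shlex.split(raw)  # keeps a quoted description together as one token
    if len(tokens) < 2:
        # Assumed abort behaviour: both session and subject are required.
        raise ValueError("expected: <session> <subject> [definition-path]")
    if len(tokens) > 3:
        raise ValueError("too many arguments")
    session_ref, subject = tokens[0], tokens[1]
    definition_path = tokens[2] if len(tokens) == 3 else None
    return FieldReportArgs(
        session_ref=session_ref,
        subject=subject,
        definition_path=definition_path,
        explicit_session=session_ref.startswith("ses_"),
    )
```

When explicit_session is True the discovery step can be skipped; otherwise session_ref is the description to resolve.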
Call session_info(session_id) for the resolved session.
Capture:
Validation:
- message_count is fewer than 10 because evidence density is too low for a reliable report.
Record the metadata for the report header.
Run a probe read before deep analysis:
- session_read(session_id, include_transcript=true, limit=5).
Inspect and record:
This step governs later analysis behavior. If tool call structure is unavailable, do not fabricate tool metrics later.
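One way to make the probe result govern later steps is to record a single flag and gate the Tool Usage section on it. The Python sketch below assumes the probe sample arrives as a list of dictionaries and that tool calls appear under keys such as tool_calls or tool_use. Those field names are hypothetical, since the actual session_read output format is not described here.

```python
from typing import Any, Mapping, Sequence

# Hypothetical field names; the real session_read output may differ.
TOOL_CALL_KEYS = ("tool_calls", "tool_use")


def probe_tool_call_structure(sample: Sequence[Mapping[str, Any]]) -> bool:
    """Return True if any sampled message carries a non-empty tool-call field."""
    return any(msg.get(key) for msg in sample for key in TOOL_CALL_KEYS)


def tool_usage_section(tool_data_available: bool, tools_seen: Sequence[str]) -> str:
    """Report observed tools only when the probe found tool-call structure."""
    if not tool_data_available:
        # Never fabricate tool metrics when the format does not expose them.
        return "Tool usage data not available in session format."
    return "Tools observed: " + ", ".join(sorted(set(tools_seen)))
```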
Determine expected behavior for the subject.
Lookup order:
1. If a definition file path was provided, Read it first.
2. Glob for **/skills/{subject-name}/SKILL.md.
3. Glob for **/agents/{subject-name}.md.
If subject is provided in namespaced form (crosscheck/reason), search using the terminal segment (reason) plus the full token.
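As an illustration of the lookup order, the following Python sketch builds the glob patterns for a subject, trying the terminal segment before the full namespaced token. The function name definition_globs is hypothetical, and the ordering of terminal segment before full token is an assumption.

```python
from typing import List


def definition_globs(subject: str) -> List[str]:
    """Build glob patterns for a subject: skills first, then agents."""
    terminal = subject.rsplit("/", 1)[-1]
    # A plain subject yields one name; a namespaced one yields two.
    names = [terminal] if terminal == subject else [terminal, subject]
    skill_patterns = [f"**/skills/{name}/SKILL.md" for name in names]
    agent_patterns = [f"**/agents/{name}.md" for name in names]
    return skill_patterns + agent_patterns
```

For crosscheck/reason this produces patterns for both reason and crosscheck/reason, so a definition stored under either layout can be found.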
If no subject definition file is found:
Collect subject-relevant session evidence.
Process:
- session_search(subject_name, session_id=...).
- Subject name variants (reason, /reason, crosscheck/reason).
Read strategy by size:
- session_search excerpts plus targeted session_read windows around key moments.
- session_read to inspect the full session.
Abort condition:
Evidence discipline:
Evaluate whether the subject helped complete the user goal.
Required checks:
Evidence criteria:
Fallback statement:
- Insufficient data — <reason>.
Evaluate how closely execution matched subject instructions.
When subject definition exists:
When subject definition is unavailable:
- Subject definition not found — unavailable.
Output style:
Evaluate tool selection and execution behavior.
If tool-call data is available from Step 3:
If tool-call data is not available:
- Tool usage data not available in session format.
Evidence criteria:
Measure progress quality over the session.
Method:
Evidence criteria:
Fallback statement:
- Insufficient data — <reason>.
Interpretation guidance:
Inspect observable failures and recoveries.
For each error event found:
If none observed:
- No errors observed.
Evidence criteria:
Evaluate how clear the initial request was and how much clarification was needed.
Checks:
Evidence criteria:
Fallback statement:
- Insufficient data — <reason>.
Sanitise all evidence snippets before report generation.
Apply explicit STRIP / KEEP / NOTE rules.
STRIP (replace with [REDACTED]):
- Secrets and tokens with known prefixes (sk-, ghp_, Bearer , AKIA).
- Environment-style assignments ([A-Z_]+=...).
KEEP:
NOTE:
- [N items redacted for privacy/security].
Abort condition:
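The STRIP and NOTE rules can be sketched in Python roughly as follows. The regular expressions are illustrative assumptions built around the listed prefixes (sk-, ghp_, Bearer, AKIA) and the [A-Z_]+=... assignment shape. They are not the skill's actual patterns and would need tuning against real session data.

```python
import re
from typing import Tuple

# Assumed patterns. Assignments come first so NAME=secret counts as one redaction.
STRIP_PATTERNS = [
    re.compile(r"\b[A-Z_]+=\S+"),                    # environment-style assignments
    re.compile(r"\bsk-[A-Za-z0-9_-]+"),              # sk- prefixed keys
    re.compile(r"\bghp_[A-Za-z0-9]+"),               # ghp_ prefixed tokens
    re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+"),   # bearer credentials
    re.compile(r"\bAKIA[A-Z0-9]{16}\b"),             # AKIA prefixed access key IDs
]


def sanitise(snippet: str) -> Tuple[str, int]:
    """Replace every STRIP match with [REDACTED] and count the replacements."""
    redacted = 0
    for pattern in STRIP_PATTERNS:
        snippet, hits = pattern.subn("[REDACTED]", snippet)
        redacted += hits
    return snippet, redacted


def redaction_note(count: int) -> str:
    """Build the NOTE line, or an empty string when nothing was redacted."""
    return f"[{count} items redacted for privacy/security]" if count else ""
```

Summing the counts across all snippets gives the N for the note at the end of the report.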
Create final report content and file target.
Output path:
- field-reports/ (create if missing).
- {subject-slug}--{session-id-short}--{YYYY-MM-DD}.md.
- session-id-short.
Required report section order:
1. ## Context
2. ## Task Completion
3. ## Instruction Adherence
4. ## Tool Usage
5. ## Conversation Efficiency
6. ## Error Handling
7. ## Input Clarity
8. ## Lessons Learned
9. ## Summary
Quality rules:
Write the report file and terminate the workflow.
Actions:
- field-reports/... path.
Termination rules:
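A possible Python sketch of building the output path and writing the file is shown below. The slug rule (lowercase, non-alphanumeric runs collapsed to a hyphen) is an assumption, and session_id_short is taken as an input because its derivation is not restated here. The function names are hypothetical.

```python
import re
from datetime import date
from pathlib import Path


def subject_slug(subject: str) -> str:
    """Assumed slug rule: lowercase, with other character runs turned into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", subject.lower()).strip("-")


def report_path(subject: str, session_id_short: str, on: date,
                root: Path = Path("field-reports")) -> Path:
    """Compose {subject-slug}--{session-id-short}--{YYYY-MM-DD}.md under root."""
    filename = f"{subject_slug(subject)}--{session_id_short}--{on.isoformat()}.md"
    return root / filename


def write_report(path: Path, body: str) -> None:
    """Create the target directory if it is missing, then write the report."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body, encoding="utf-8")
```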
Use this template structure exactly when building the report body.
# Field Report: <subject-name>
- Session ID: <ses_xxx>
- Date: <YYYY-MM-DD>
- Session Period: <start> to <end>
- Messages Analysed: <count>
## Context
<1-2 paragraphs describing session purpose, subject role, and analysis scope.>
## Task Completion
- Initial request: "<quoted snippet>"
- Expected deliverable: <deliverable>
- Observed outcome: <what was produced>
- Evidence:
  - "<snippet 1>"
  - "<snippet 2>"
- Assessment: <completion status or Insufficient data - reason>
## Instruction Adherence
- Subject definition source: <path or unavailable>
- Workflow adherence:
  1. <step expectation> - <completed|skipped|partial|n/a> - Evidence: "<snippet>"
  2. <step expectation> - <completed|skipped|partial|n/a> - Evidence: "<snippet>"
  3. <step expectation> - <completed|skipped|partial|n/a> - Evidence: "<snippet>"
- Notes: <gaps, constraints, unavailable evidence>
## Tool Usage
- Tools observed: <list or unavailable message>
- Alignment with subject guidance: <aligned or deviated>
- Effective choices: <specific examples>
- Friction patterns: <retry or misuse patterns>
- Evidence:
  - "<snippet>"
## Conversation Efficiency
- Advancing messages: <count>
- Correction/retry messages: <count>
- Overall balance: <predominantly advancing / predominantly correcting / mixed>
- Momentum observations:
  - <exchange summary with evidence>
  - <exchange summary with evidence>
## Error Handling
- Error 1: <issue>
  - Handling: <action>
  - Recovery: <success or not>
  - Recovery latency (turns): <value>
  - Evidence: "<snippet>"
- Error 2: <issue or none>
## Input Clarity
- Initial clarity: <clear/mixed/unclear with evidence>
- Clarifications required: <count and why>
- Turns before productive execution: <count>
- Ambiguity impact: <rework or none>
## Lessons Learned
1. <Actionable recommendation tied to observed evidence>
2. <Actionable recommendation tied to observed evidence>
3. <Actionable recommendation tied to observed evidence>
## Summary
<2-3 sentence synthesis of the most important outcome and next practical adjustment.>
[<N> items redacted for privacy/security]
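Before writing the file, a generated report body could be checked against the required section order with a small helper like the one below. This is a hedged sketch. The function name section_order_problems is hypothetical and the check only looks at lines that start with "## ".

```python
from typing import List

REQUIRED_SECTIONS = [
    "Context",
    "Task Completion",
    "Instruction Adherence",
    "Tool Usage",
    "Conversation Efficiency",
    "Error Handling",
    "Input Clarity",
    "Lessons Learned",
    "Summary",
]


def section_order_problems(report_body: str) -> List[str]:
    """List missing sections and flag required sections that appear out of order."""
    found = [line[3:].strip() for line in report_body.splitlines()
             if line.startswith("## ")]
    problems = [f"missing section: {name}"
                for name in REQUIRED_SECTIONS if name not in found]
    present = [name for name in found if name in REQUIRED_SECTIONS]
    expected = [name for name in REQUIRED_SECTIONS if name in present]
    if present != expected:
        problems.append("sections out of order")
    return problems
```

An empty list means every required heading is present and in the expected order.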
Required inputs:
- ses_... or natural language description.
Optional input:
Examples:
/field-report ses_abc123 reason
/field-report "last session" crosscheck/reason
/field-report "today's session" byfuglien
/field-report "session where I used /reason" reason
/field-report ses_def456 byfuglien crosscheck/agents/byfuglien.md
Notes:
npx claudepluginhub nicholls-inc/claude-code-marketplace --plugin field-report