From ionworks
Build a comprehensive multi-section data analysis report for battery cells whose measurements live on the Ionworks platform. Use when the user asks to "build a report", "summarize the data", "characterize cells", "create a data overview", "make an analysis PDF" for one or more cell designs — or when they hand over a dataset and ask "what's in it / what do we have / what does the BOL performance look like / how does it age". Produces a markdown + PDF with rate capability, DCIR, OCV, GITT/entropic, aging, and gap-analysis sections. Strongly prefer this skill whenever there is platform-resident measurement data and the user wants either a full report or any subset of these characterization sections.
Slash command: /ionworks:build-data-report
You're building a reproducible analysis report for cells whose measurements are already on the Ionworks platform. The report fetches **everything live from the platform** at build time — no local data files. Output is a markdown file with embedded plots, rendered to PDF.
The report is structured by protocol family (rate capability, HPPC, GITT, etc.). Each section follows the same shape: pick a representative measurement, show its raw signal in detail, then summarize statistics across all measurements in that family.
Before writing any plot code, find out what protocol families exist for the cells in question. This determines which sections the report will have.
```python
from collections import defaultdict

from ionworks import Ionworks, Navigator, set_dataframe_backend

set_dataframe_backend("pandas")
nav = Navigator(Ionworks())

# families[cell][family] -> number of measurements in that family
families = defaultdict(lambda: defaultdict(int))
for cell in cell_specs:  # the cell-spec names the user named, e.g. ("CellA", "CellB")
    for inst in nav.instances(cell):
        for m in nav.measurements(inst.id):
            # proto_family is a report helper; see references/prod_data.md
            families[cell][proto_family(m, inst)] += 1
```
Print this — share what's available with the user before deciding the section list. Don't assume; some cells have rich BOL data and no aging, others have heavy aging and minimal BOL. Some families (Pre_cycle, Precon_DCR) are operational overhead and not worth their own section.
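A minimal sketch of printing that inventory, assuming `families` has the nested shape built above (cell name, then family, then count). The `format_inventory` name and the sample counts are illustrative only and are not platform data.

```python
def format_inventory(families):
    """Render {cell: {family: count}} as an aligned plain-text listing."""
    lines = []
    for cell in sorted(families):
        counts = families[cell]
        lines.append(cell)
        width = max((len(str(family)) for family in counts), default=0)
        # str() keeps a None family (an unclassified measurement) visible
        for family in sorted(counts, key=str):
            lines.append(f"  {str(family):<{width}}  {counts[family]}")
    return "\n".join(lines)


# Made-up counts, for illustration only
sample = {
    "CellA": {"HPPC": 4, "Rated_Discharge": 12},
    "CellB": {"GITT": 2, None: 1},
}
print(format_inventory(sample))
```

Printing unclassified measurements as `None` rather than hiding them makes missing protocol metadata obvious before any section list is agreed.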
Critical rule: every filter, group, and selection uses measurement.protocol.* fields:

- protocol.family: Rated_Discharge, HPPC, GITT, Fast_charge, Drive_cycle, Entropic, CC_Cycling, etc.
- protocol.ambient_temperature_degc: float, may be None
- protocol.c_rate: float, may be None
- protocol.mode: "charge" / "discharge" / "profile" / "calendar" / "mixed"
- protocol.soc_pct: int, may be None

measurement.protocol is the single source of truth for what kind of test this is and under what conditions. Every section of the report (what to include in it, how to group within it, which measurements to compare) comes from this dict. Reach for m.name, file paths, or instance names only when you've confirmed protocol cannot answer the question.
If protocol fields are missing or wrong, fix the protocol — don't work around it. Stop building the report, go back to the processing-pipeline step that classifies protocols, and make family / ambient_temperature_degc / c_rate / mode / soc_pct explicit on every affected measurement. Then re-upload (or patch in place) and resume. Once you've worked around a missing field in analysis code, every future report inherits the workaround and the underlying data stays wrong.
Concrete examples of the fix-the-protocol mindset:

- Measurements share family=HPPC but have different pulse widths that shouldn't be compared together. Don't filter by m.name; add a pulse_duration_s (or whatever it actually is) field to protocol and split the report by it.
- ambient_temperature_degc is None. Don't infer from the folder name; populate the field at processing time.
- Some measurements have c_rate=0.5 and others have it unset. Don't guess; populate.

Some families legitimately don't have all fields. Entropic measurements typically have soc_pct set but no ambient_temperature_degc (the protocol varies SOC at a fixed ambient), so adapt grouping accordingly (group Entropic by SOC, not temperature). The rule is "every field that makes a measurement distinguishable from another measurement of the same family must be populated", not "every field must always be set."
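One way to catch missing fields before they turn into workarounds is a small pre-flight check over each protocol dict. The sketch below is an assumption-laden illustration: the `DISTINGUISHING_FIELDS` mapping and the `missing_fields` name are hypothetical, and the per-family requirements should be replaced with whatever actually distinguishes measurements in your dataset.

```python
# Hypothetical mapping: the protocol fields that distinguish two measurements
# of the same family. Adjust per dataset; this is not a platform schema.
DISTINGUISHING_FIELDS = {
    "Rated_Discharge": ("ambient_temperature_degc", "c_rate"),
    "HPPC": ("ambient_temperature_degc",),
    "Entropic": ("soc_pct",),  # SOC varies at a fixed ambient
}
DEFAULT_FIELDS = ("ambient_temperature_degc",)


def missing_fields(protocol):
    """Return the names of required protocol fields that are unset."""
    family = protocol.get("family")
    if family is None:
        return ["family"]
    required = DISTINGUISHING_FIELDS.get(family, DEFAULT_FIELDS)
    return [name for name in required if protocol.get(name) is None]
```

A non-empty result is a signal to go back to the processing pipeline, not something to patch around in the report script.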
The report is a selection from a library of section types, not a fixed 12-section template. Pick whatever the inventory in step 1 says is available — a small dataset might be 3 sections; a comprehensive one might be 12+. There's no "complete" report; there's a report that matches the data on hand.
Common section types, in the typical order they appear:
| Type | Source family/families | Skip when |
|---|---|---|
| Data Inventory | all | never (always section 1) |
| Executive Summary | computed | never (always section 2) |
| BOL Discharge Rate Cap | Rated_Discharge, Cap_rated, Dis_cap_rated | no discharge ladder at varying rate |
| BOL Charge Rate Cap | Rated_Charge | no charge ladder |
| Fast Charge | Fast_charge | no Fast_charge measurements |
| Drive Cycle | Drive_cycle | no real-load-profile data |
| Pulse Resistance / DCIR | HPPC | no HPPC |
| Open Circuit Voltage | Rated_Discharge (C/20) ± HPPC rest ± GITT rest | no slow discharge and no rest-based reconstruction |
| GITT | GITT | no GITT |
| Entropic Coefficient | Entropic | no entropic measurements |
| Cycle Life | CC_Cycling, Profile_Cycling, Calendar | no aging campaign |
| Data Quality & Gaps | computed | never (always last) |
You may also need section types not in this list — e.g., formation cycles, abuse tests, EIS spectra, calorimetry. Treat the list as a starting palette, not a contract; the section-types listed are the ones most commonly seen on the platform.
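As a rough first pass, the table can be turned into a lookup that keeps a section whenever at least one of its source families is present. This is only a sketch: `SECTION_TYPES` and `select_sections` are illustrative names, and family presence is a coarser test than the "Skip when" column (for example, it does not check that a discharge ladder actually spans several rates).

```python
# (section title, source families); an empty tuple marks an always-present section.
SECTION_TYPES = [
    ("Data Inventory", ()),
    ("Executive Summary", ()),
    ("BOL Discharge Rate Cap", ("Rated_Discharge", "Cap_rated", "Dis_cap_rated")),
    ("BOL Charge Rate Cap", ("Rated_Charge",)),
    ("Fast Charge", ("Fast_charge",)),
    ("Drive Cycle", ("Drive_cycle",)),
    ("Pulse Resistance / DCIR", ("HPPC",)),
    ("Open Circuit Voltage", ("Rated_Discharge", "HPPC", "GITT")),
    ("GITT", ("GITT",)),
    ("Entropic Coefficient", ("Entropic",)),
    ("Cycle Life", ("CC_Cycling", "Profile_Cycling", "Calendar")),
    ("Data Quality & Gaps", ()),
]


def select_sections(available):
    """Return numbered section titles for the families found in the inventory."""
    available = set(available)
    titles = [
        title
        for title, source_families in SECTION_TYPES
        if not source_families or available.intersection(source_families)
    ]
    # Number after filtering so the result always uses clean integers
    return [f"{number}. {title}" for number, title in enumerate(titles, start=1)]
```

Numbering after filtering means a skipped section never leaves a gap or tempts a lettered suffix.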
Ordering principles: inventory and exec summary first. Then group by purpose — BOL performance (discharge → charge → application tests like fast charge & drive cycle), then characterisation (DCIR → OCV → GITT → Entropic in order from practical to model-oriented), then aging, then gaps. Application tests follow rate capability because they live in the same conceptual lane (what does the cell do at the terminals); characterisation follows because it goes deeper into structure/physics.
Numbering: clean integers, no 3b, 3c, 5b, 5c. If you find yourself reaching for a letter suffix, renumber instead — a few edits to renumber later sections cost less than confusing the reader.
Cell count agnostic: every section pattern works for 1, 2, or N cells. Single-cell reports skip the comparison subsections; multi-cell reports add one subsection per cell. The report shape doesn't change based on how many cells are in scope.
Every section follows the same shape:

- (a) Brief prose explaining what the protocol measures and why it matters for the cell engineer
- (b) One representative measurement shown in detail (raw V vs t, V vs Q, or a 2×2 grid across temperatures/SOCs)
- (c) Summary across all measurements in that family: either a multi-line overlay plot or a markdown table

The representative + summary pattern is what makes the report useful to skim and still drillable: a reader sees one example trace, understands the protocol, then sees the across-cells statistics for the same family.
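```python
# Template: the "..." placeholders stand for section-specific arguments.
def plot_X(nav: Navigator, spec_name: str, ..., out_path: Path) -> bool:
    """..."""
    inst, m = find_one(nav, spec_name, "FamilyName", temperature_C=...)
    if m is None:
        return False  # no matching measurement: the caller skips the section
    try:
        ts = nav.time_series(m.id)
    except Exception:
        return False  # fetch failed: treat as missing data
    # ... build plot ...
    fig.savefig(out_path)
    plt.close(fig)
    return True
```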
Returning bool lets write_report() skip the section when data is missing. This is preferable to raising — every cell is different and we expect some sections to be empty for some cells.
Same shape for summary functions:
```python
def summarise_X(nav: Navigator, spec_name: str) -> pd.DataFrame: ...
```
main() calls every plot/summary function once and stuffs results into a ctx dict. write_report(ctx, path) walks the sections and renders markdown — referencing ctx["plot_name"] (a bool) and ctx["summary_name"] (a DataFrame). Keep the two stages separate; it lets you regenerate the markdown without re-fetching every plot, and lets you test plot functions in isolation.
```python
ctx = {
    "rate_plot_tcell": rate_plot_tcell,
    "rate_tcell": rate_tcell_df,  # raw DataFrame for the executive summary
    "fast_charge_summary": fc_df,
    # ...
}
write_report(ctx, OUT / "data_report.md")
```
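To make the second stage concrete, here is a simplified sketch of a renderer that skips sections with nothing to show. It is not the skill's actual `write_report`: the extra `layout` argument, the `render_section` helper, and the use of lists of dicts in place of DataFrames are all assumptions made to keep the example dependency-free.

```python
from pathlib import Path


def render_section(number, title, figure, plot_ok, rows):
    """Return markdown for one section, or None when there is nothing to show."""
    if not plot_ok and not rows:
        return None
    lines = [f"## {number}. {title}", ""]
    if plot_ok:
        lines += [f"![{title}]({figure})", ""]
    if rows:
        headers = list(rows[0])
        lines.append("| " + " | ".join(headers) + " |")
        lines.append("|" + "---|" * len(headers))
        for row in rows:
            lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
        lines.append("")
    return "\n".join(lines)


def write_report(ctx, path, layout):
    """layout: list of (title, plot_key, summary_key, figure_path) tuples."""
    parts, number = [], 1
    for title, plot_key, summary_key, figure in layout:
        section = render_section(
            number, title, figure, ctx.get(plot_key, False), ctx.get(summary_key, [])
        )
        if section is not None:  # missing data: skip and keep numbering contiguous
            parts.append(section)
            number += 1
    Path(path).write_text("\n".join(parts), encoding="utf-8")
```

With pandas summaries, `df.to_dict("records")` yields the list-of-dicts shape this sketch expects.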
Use Chrome headless or weasyprint. A separate render_pdf.py script (not part of build_report.py) keeps the data-fetching path independent from the rendering path. Both should be runnable independently with uv run python ....
These patterns recur across most plot functions — extract them as helpers in your script.
Use ionworks.Navigator: it memoises specs / instances / measurements / steps / time_series, paginates automatically, and returns listings sorted by name. Set the dataframe backend to pandas once at the top of the script.

```python
from ionworks import Ionworks, Navigator, set_dataframe_backend

set_dataframe_backend("pandas")
nav = Navigator(Ionworks())
```
See references/prod_data.md for the report-specific helpers built on top.
```python
def find_one(nav, spec_name, family, temperature_C=None, instance_filter=None):
    """Return first (inst, m) matching the family + optional temp."""

def find_measurements(nav, spec_name, families: tuple[str, ...]):
    """Return all [(inst, m), ...] in the given families."""
```
find_one is for representative-plot selection; find_measurements is for summary functions.
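For orientation, the two helpers might be filled in roughly as below. This is a sketch, not the canonical implementation in references/prod_data.md: it assumes `nav.instances(spec_name)` and `nav.measurements(inst.id)` behave as in the inventory loop, that `m.protocol` is a dict, that `instance_filter` is a callable taking an instance, and that `proto_family` / `proto_temp` simply read protocol fields.

```python
def proto_family(m, inst):
    """Assumed accessor: read the family from the protocol dict."""
    return (m.protocol or {}).get("family")


def proto_temp(m, inst):
    """Assumed accessor: read the ambient temperature from the protocol dict."""
    return (m.protocol or {}).get("ambient_temperature_degc")


def find_measurements(nav, spec_name, families):
    """Return all [(inst, m), ...] whose protocol family is in `families`."""
    return [
        (inst, m)
        for inst in nav.instances(spec_name)
        for m in nav.measurements(inst.id)
        if proto_family(m, inst) in families
    ]


def find_one(nav, spec_name, family, temperature_C=None, instance_filter=None):
    """Return the first matching (inst, m), or (None, None) if nothing matches."""
    for inst, m in find_measurements(nav, spec_name, (family,)):
        if instance_filter is not None and not instance_filter(inst):
            continue
        if temperature_C is not None and proto_temp(m, inst) != temperature_C:
            continue
        return inst, m
    return None, None
```

Returning `(None, None)` keeps the `inst, m = find_one(...)` unpacking in the plot template safe when no measurement matches.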
To isolate a complete CC discharge or charge (excluding partial steps, rest steps, CV tails), filter the steps DataFrame:

- Discharge: dV < −1 V AND End V < 2.7 V AND cap in (2.5, 6.5) Ah AND dur > 500 s AND |I| > 0.3 A
- Charge: dV > 1 V AND Start V < 3.5 V AND End V > 4.0 V AND cap in (2.5, 6.5) Ah AND dur > 200 s AND |I| > 0.3 A

The voltage limits, capacity range, and duration thresholds depend on the cell; the values above are for a 5 Ah, 2.5–4.2 V cell. For another cell, set the lower capacity bound to roughly 50% of rated capacity and replace the voltage limits with that cell's cutoffs.
See references/step_filtering.md for the canonical implementation and the rationale for each threshold.
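The rules above can be sketched as a single per-step predicate. This is not the canonical implementation from references/step_filtering.md: the key names (`start_v`, `end_v`, `capacity_ah`, `duration_s`, `current_a`) are hypothetical stand-ins for whatever the steps DataFrame actually calls these columns, capacity is assumed to be a positive magnitude, and the defaults are the 5 Ah, 2.5–4.2 V values quoted above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class StepLimits:
    """Thresholds for a 5 Ah, 2.5-4.2 V cell; override per cell."""
    min_abs_dv: float = 1.0
    discharge_end_v_max: float = 2.7
    charge_start_v_max: float = 3.5
    charge_end_v_min: float = 4.0
    cap_range_ah: Tuple[float, float] = (2.5, 6.5)
    discharge_min_dur_s: float = 500.0
    charge_min_dur_s: float = 200.0
    min_abs_current_a: float = 0.3


def classify_step(step: Mapping[str, float], lim: StepLimits = StepLimits()) -> Optional[str]:
    """Return "discharge", "charge", or None for partial/rest/CV-tail steps."""
    dv = step["end_v"] - step["start_v"]
    low, high = lim.cap_range_ah
    if not (low < step["capacity_ah"] < high):
        return None
    if abs(step["current_a"]) <= lim.min_abs_current_a:
        return None
    if (dv < -lim.min_abs_dv and step["end_v"] < lim.discharge_end_v_max
            and step["duration_s"] > lim.discharge_min_dur_s):
        return "discharge"
    if (dv > lim.min_abs_dv and step["start_v"] < lim.charge_start_v_max
            and step["end_v"] > lim.charge_end_v_min
            and step["duration_s"] > lim.charge_min_dur_s):
        return "charge"
    return None
```

Keeping the thresholds in one frozen dataclass makes it explicit which numbers change when the report is pointed at a different cell.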
When you have many measurements at varying rates and want one curve per C-rate bin, bucket by nearest standard rate ([0.05, 0.1, 0.2, 0.33, 0.5, 1.0, 1.5, 2.0, 3.0]) within ±15%, and keep the first record per bucket.
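A minimal sketch of that bucketing, under two stated assumptions: the ±15% tolerance is taken relative to the standard rate, and each record is a dict with a `c_rate` key (a hypothetical shape). The function keeps the first record it sees per bucket, so the caller's input order decides which record wins.

```python
STANDARD_RATES = (0.05, 0.1, 0.2, 0.33, 0.5, 1.0, 1.5, 2.0, 3.0)


def nearest_standard_rate(c_rate, tolerance=0.15):
    """Return the closest standard rate, or None if it is more than 15% away."""
    nearest = min(STANDARD_RATES, key=lambda standard: abs(c_rate - standard))
    if abs(c_rate - nearest) <= tolerance * nearest:
        return nearest
    return None


def first_record_per_rate(records):
    """Keep the first record per C-rate bucket, preserving the caller's ordering."""
    chosen = {}
    for record in records:
        c_rate = record.get("c_rate")
        if c_rate is None:  # protocol.c_rate may be unset
            continue
        bucket = nearest_standard_rate(c_rate)
        if bucket is not None and bucket not in chosen:
            chosen[bucket] = record
    return dict(sorted(chosen.items()))
```

Because Python's `sorted` is stable, pre-sorting the records with an explicit key gives a deterministic winner per bucket.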
Family priority within a bucket matters. If a standard family (Rated_Charge, Cap_rated) and an auxiliary family (Fast_charge) both have a step in the same C-rate bucket, prefer the standard family — the auxiliary may have a different voltage cutoff or protocol structure that makes the trace look weird. Sort records by (family_priority, c_rate) before bucketing.
```python
# Lower value wins when two families land in the same C-rate bucket
_FAMILY_PRIORITY = {
    "Rated_Charge": 0,
    "Rated_Discharge": 0,
    "Cap_rated": 0,
    "Dis_cap_rated": 0,
    "Fast_charge": 1,
}
```
Single-thermocouple cyclers (Arbin) write Temperature [degC]. Multi-channel cyclers (Maccor) write Temperature 1 [degC] ... Temperature N [degC] — preserve each as a separate column at processing time; don't average. For analysis plots that need a single trace, pick the canonical name or fall back to Temperature 1 [degC]:
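```python
def _pick_temperature_col(columns):
    # Prefer the canonical single-thermocouple column
    if "Temperature [degC]" in columns:
        return "Temperature [degC]"
    # Otherwise fall back to the lowest-numbered channel present
    for n in (1, 2, 3):
        col = f"Temperature {n} [degC]"
        if col in columns:
            return col
    return None
```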
Surface multi-channel layouts in the report — show all channels overlaid on one example measurement so the reader knows what's there, then note which channel the analysis plots use.
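To overlay every channel, the columns first have to be enumerated. The helper below is a sketch with an illustrative name (`temperature_columns`); it assumes the column naming described above, `Temperature [degC]` or `Temperature N [degC]`, and ignores anything else.

```python
import re

# Matches "Temperature [degC]" and "Temperature N [degC]" only
_TEMP_COL = re.compile(r"^Temperature(?: (\d+))? \[degC\]$")


def temperature_columns(columns):
    """Return all temperature columns, canonical name first, then by channel number."""
    found = []
    for col in columns:
        match = _TEMP_COL.match(col)
        if match:
            channel = int(match.group(1)) if match.group(1) else 0
            found.append((channel, col))
    return [col for _, col in sorted(found)]
```

Sorting by the numeric channel keeps `Temperature 10 [degC]` after `Temperature 2 [degC]`, which a plain string sort would not.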
For techniques like GITT and Entropic where you want one panel per temperature (or per SOC), use a 2×2 layout with preferred values plus fallback:
```python
preferred = [0, 10, 30, 40]

# First measurement seen at each temperature
available = {}
for inst, m in find_measurements(nav, spec_name, (family,)):
    T = proto_temp(m, inst)
    if T is not None and T not in available:
        available[T] = (inst, m)

# Preferred temperatures first
chosen = []
for T in preferred:
    if T in available:
        chosen.append((T, *available[T]))

# Then fill the remaining panels with whatever else exists, coldest first
for T, pair in sorted(available.items()):
    if len(chosen) >= 4:
        break
    if T not in [c[0] for c in chosen]:
        chosen.append((T, *pair))
```
This preferred-then-fallback selection avoids hard-coding temperatures that may not exist for every cell.
Build the report incrementally. After every new section, run uv run python scripts/analysis/build_report.py end-to-end.

The fastest debug loop is a small diagnostic Python snippet that calls one helper (_select_full_steps_at_temp, _representative_discharge_per_rate) and prints what gets chosen for the suspicious case. Inline uv run python -c "..." works well; there is no need to write throwaway files.
Silently dropped columns. Cycler readers in ionworksdata return canonical columns only. If the raw file has aux thermocouples (Maccor Temp 2..4, Arbin aux probes) and your report's temperature plots are empty, the reader probably dropped them at processing time. Fix in the processing pipeline by re-reading the source for those channels and join_asof-merging them in, not in the analysis layer.
Missing protocol metadata. A measurement without ambient_temperature_degc will silently be excluded from temperature-filtered selections. If a section is sparse, check whether the classifier ran on the affected cohort — proto_family(m, inst) returning None is the giveaway.
Anomalous representative traces. When _representative_discharge_per_rate picks "the wrong" measurement, the cause is usually iteration order (instances sort alphabetically; the first one wins). The fix is a stable sort with explicit priority, not a hack to filter out the offending measurement.
Section ordering drift. It's tempting to add new sections with lettered suffixes (3b, 3c) to avoid renumbering. Resist — a few renumber edits now are cheaper than confusion later. If the new section logically belongs between 3 and 4, renumber 4-onward.
Mixing protocol families in one plot. Rated_Charge (cutoff 4.20 V) and Fast_charge (cutoff 4.12 V) look similar at first glance but produce different traces. Keep them in separate sections. If two families could plausibly be combined, the rule is: combine when the protocol intent is the same (e.g., Rated_Discharge + Cap_rated both measure rate capability), separate when the intent differs (rate capability vs. multi-cycle stress test).

- references/section_templates.md: Detailed markdown templates for each of the 12 sections, with the prose conventions and figure-caption style
- references/plot_patterns.md: Recipes for the recurring plot types (V vs Q multi-rate, T vs Q multi-rate, rate-capability scatter, twin-axes V+T vs time, multi-temperature grid)
- references/step_filtering.md: Detailed step-filter heuristics with rationale for each threshold
- references/prod_data.md: Report-specific platform helpers (find_one, find_measurements, proto_family, etc.); the cached accessor itself lives in the SDK as ionworks.Navigator