From lab-notebook-skills
Use when designing a new experiment (read `experiments/NOTES.md` first to avoid repeating known-failed approaches), when a cell has finished and its metrics are committed (append an entry), or when correcting a wrong number in a prior entry (in-place fix is allowed and often required — leaving a wrong number on the page poisons future context; preserve the trail with strikethrough or a correction note when feasible, but accuracy of current evidence comes first) — enforces read-before-design, append-only-after-completion, every number traceable to an actual run (no estimates, no assumptions, no projected numbers), and root-cause hypotheses for every failed variant.
Slash command: `/lab-notebook-skills:lab-journal-discipline`
experiments/NOTES.md is the project's running journal of completed runs — the lab notebook of what has actually been tried. Three operations: read it before designing a new experiment, append to it after a run finishes and its metrics are committed, and correct prior numbers in place when they turn out to be wrong against the actual run artifacts. Future-you treats it as primary evidence about what the project has learned, so it must reflect what the runs actually produced — not plans, not guesses, not retroactive reinterpretations of hypotheses or conclusions.
Violating the letter of the rules is violating the spirit of the rules.
Read `experiments/NOTES.md` before drafting the design
Before drafting any new experiment design (this skill's "designing a new experiment" trigger), open `experiments/NOTES.md` and skim it.
The point is to avoid re-running an experiment whose result is already on the page (especially failures). If you're about to propose variant X and NOTES.md already has a section explaining why X regressed and what the root cause was, surface that to the user and propose the next step from there instead.
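As a rough aid for the read-before-design step, a keyword search can surface earlier mentions of a variant before it is proposed again. This is a minimal sketch, not part of the skill itself: the function names are hypothetical, and a keyword hit only tells you where to start reading, it does not replace reading the entry.

```python
from __future__ import annotations

import re
from pathlib import Path


def find_prior_mentions(notes_text: str, variant: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs in the journal text that mention `variant`."""
    pattern = re.compile(re.escape(variant), re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(notes_text.splitlines(), start=1)
        if pattern.search(line)
    ]


def check_before_design(variant: str, notes_path: str = "experiments/NOTES.md") -> list[tuple[int, str]]:
    """Search the journal on disk; an absent journal simply yields no hits."""
    path = Path(notes_path)
    if not path.exists():
        return []
    return find_prior_mentions(path.read_text(encoding="utf-8"), variant)
```

Any hit is a prompt to open that entry, read its "Failed variants" root cause, and surface it to the user before proposing the next step.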
Append a new entry only after a cell or experiment has actually run and its metrics are committed. Every number in the entry must trace to an actual run — no estimates, no assumptions, no projected numbers, no "approximately" fill-ins. Do not pre-register expected results, do not write the entry while the run is in flight, do not write up an experiment that exists only as a plan.
If a cell crashed before producing metrics, that is also a recordable result — write the entry with what actually happened (e.g. "OOM at step 3000, did not complete") rather than projected numbers.
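The append-after-completion rule can be sketched as a guard that refuses to produce a results row unless the run left something on disk. The artifact name `results.json`, the row layout, and the function name are assumptions made for illustration; the skill does not prescribe them.

```python
import json
from pathlib import Path
from typing import Optional


def result_row(cell_dir: str, crash_note: Optional[str] = None) -> str:
    """Build one results-table row from what the run actually left behind."""
    cell = Path(cell_dir)
    metrics_file = cell / "results.json"  # hypothetical artifact name
    if metrics_file.exists():
        metrics = json.loads(metrics_file.read_text(encoding="utf-8"))
        values = ", ".join(f"{key}={value}" for key, value in sorted(metrics.items()))
        return f"| {cell.name} | {values} | `{metrics_file.name}` |"
    if crash_note:
        # A crash is a recordable result: write what happened, not a projection.
        return f"| {cell.name} | {crash_note} | no metrics produced |"
    raise FileNotFoundError(
        f"{metrics_file} is missing: the run has not finished, so there is nothing to record yet"
    )
```

With no metrics file and no crash note the function raises instead of returning a placeholder, which mirrors the rule that an in-flight run gets no entry.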
Corrections are allowed and often required. A number that turns out to be wrong (typo, re-run produced different value, judge / metric definition changed, data bug discovered later) must be corrected — leaving a known-wrong number on the page poisons every future read of the journal, because the next designer (human or agent) will treat it as evidence and build the next experiment on top of it.
In-place editing is permitted. Accuracy of current evidence beats strict immutability.
Preferred form, when the trail is useful to preserve:
- accuracy: ~~0.823~~ → 0.847 *(corrected 2026-04-29: original was the pre-fix run; see `expNN_*/eNNx/results_v2.json`)*
The strikethrough keeps the original visible, the arrow gives the corrected value, and the parenthetical names the date, reason, and the artifact the new number comes from.
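The preferred correction form is regular enough to generate. The helper below is a small sketch with a hypothetical name and signature; values are passed as strings so the digits written to the journal are exactly the digits copied from the artifact.

```python
from datetime import date


def format_correction(metric: str, old: str, new: str, reason: str, artifact: str, when: date) -> str:
    """Render a strikethrough correction line that names the date, reason, and source artifact."""
    if not artifact:
        raise ValueError("a correction must name the artifact the new number comes from")
    return (
        f"- {metric}: ~~{old}~~ → {new} "
        f"*(corrected {when.isoformat()}: {reason}; see `{artifact}`)*"
    )
```

Calling it with the values from the example above reproduces that line, and an empty artifact path is rejected outright.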
Acceptable when context-pollution is the dominant concern (e.g. a headline metric in the running summary table that downstream readers will skim first): replace the number outright and add a brief footnote or appended "Corrections" entry noting what changed and why.
What is not acceptable in either form:
The constraint is evidence, not immutability: every number must trace to an actual run artifact, and every correction must name the artifact the new number comes from. The audit trail is a nice-to-have — accuracy of the current state is the must-have.
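One way to check that a journal number traces to a run artifact is to compare it against the artifact's value at the precision the journal uses. This sketch assumes the artifact is a flat JSON object of numeric metrics; the function name and the rounding convention are illustrative choices, not something the skill specifies.

```python
import json
from decimal import Decimal


def matches_artifact(claimed: str, artifact_json: str, key: str) -> bool:
    """True if `claimed` equals the artifact value rounded to the journal's written precision."""
    metrics = json.loads(artifact_json, parse_float=Decimal)
    if key not in metrics:
        return False  # no artifact value means the number has no source
    written = Decimal(claimed)
    recorded = Decimal(str(metrics[key]))
    # quantize() rounds `recorded` to the same number of decimal places as `written`.
    return recorded.quantize(written) == written
```

A mismatch means either the journal or the artifact reference is wrong, and both cases call for a correction that names the right artifact.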
Each entry covers one experiment (one expNN_* dir) and includes:
| Section | Content |
|---|---|
| Goal | What question this experiment was trying to answer. |
| Hypothesis | What you expected to see and why (1–3 sentences). |
| Method | What was actually run — which cells, which knobs varied, which were fixed. Reference the cell dirs by name. |
| Results table | One row per cell, columns for the metrics you care about, with a marked baseline / control row for comparison. |
| Key findings | The 2–5 things that would change someone's design choices going forward. |
| Failed variants | Any cell that regressed vs control. Include a root-cause hypothesis for each failure ("why this failed"), not just the numbers. This is the highest-value content in the journal — it is what prevents re-running the same losing experiment. |
| Conclusion / shipped? | Whether anything from this experiment was integrated into the root pipeline, and which cell won. |
| Files | Pointers to the cell dirs and the result JSONs. |
A worked example of a notebook with multiple entries lives at examples/NOTES.md.
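The section list in the table above can be checked mechanically before an entry is appended. The sketch below assumes an entry is held as a mapping from section name to text, which is a representation chosen here for illustration only.

```python
from __future__ import annotations

# Section names taken from the entry table; the dict representation is an assumption.
SECTIONS = [
    "Goal",
    "Hypothesis",
    "Method",
    "Results table",
    "Key findings",
    "Failed variants",
    "Conclusion / shipped?",
    "Files",
]


def missing_sections(entry: dict[str, str]) -> list[str]:
    """Return the required sections that are absent or blank, in table order."""
    return [name for name in SECTIONS if not entry.get(name, "").strip()]
```

If no cell regressed, writing that explicitly under "Failed variants" keeps the check honest rather than leaving the section blank.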
The journal is written in Markdown. New entries are appended at the bottom (or grouped by experiment number; pick one convention per project and keep it). Prior entries are edited only to correct numbers against the actual run artifacts, never to reinterpret, reword hypotheses, or polish conclusions. Tables are encouraged for results because they are easy to skim and easy to diff. A single running summary table at the top of the file ("Pipeline Evolution" or similar) that captures the headline metric of each experiment is very useful for spotting trends across experiments.
| Excuse / mistake | Reality |
|---|---|
| "I'll write the NOTES.md entry now while I set up the run, and fill in the numbers later" | Forbidden. NOTES.md is append-only-after-completion. A draft entry that gets numbers patched in later quietly turns into "what I expected" rather than "what I observed." Wait until the run finishes. |
| "v6 will probably hit ~0.05 better, let me note that" | Speculation never goes in NOTES.md. Hypotheses go in the experiment's own README before running; observed results go in NOTES.md after. |
| "The prior number is wrong but I'll leave it and just append a correction below" | Half-right. Append-only is the fallback, not the goal. A wrong number in a results table will be picked up by the next reader who skims; fix it in place (strikethrough → corrected, with date and artifact reference), and only fall back to an appended note when in-place editing would be misleading or destructive. |
| "I don't have the exact number handy, I'll put ~0.82 and update later" | Forbidden. Every number traces to a specific run artifact. If you don't have the artifact in front of you, don't write the entry yet. |
| "The prior hypothesis sounds dumb in hindsight, let me reword it" | Forbidden. Corrections are for numbers that turn out to be wrong against actual evidence, not for retroactively making yourself look smarter. The hypothesis stays. |
| "I'll skip reading NOTES.md, I already know what I want to try" | Read it anyway. The whole point of the journal is that prior failures aren't always intuitive. Five minutes of reading saves a wasted run. |
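Several of the mistakes in the table leave textual traces, such as a lone `~` before a number or the word "approximately". A heuristic scan like the sketch below can flag those lines before an entry is appended. The pattern list is an assumption and will produce both false positives and misses, so treat a hit as a reason to look, not as a verdict.

```python
from __future__ import annotations

import re

# Heuristic markers of estimated numbers; this list is illustrative, not exhaustive.
ESTIMATE_PATTERNS = [
    re.compile(r"(?<!~)~(?!~)\s*\d"),  # "~0.82", but not strikethrough "~~0.823~~"
    re.compile(r"≈\s*\d"),
    re.compile(r"\b(approximately|approx|roughly|TBD)\b", re.IGNORECASE),
]


def find_estimates(entry_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that look like estimates rather than run results."""
    flagged = []
    for number, line in enumerate(entry_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in ESTIMATE_PATTERNS):
            flagged.append((number, line.strip()))
    return flagged
```

The strikethrough form used for corrections is deliberately excluded, so a properly corrected number does not trip the scan.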
If you find yourself about to do any of these, STOP, surface the situation to the user, and propose a non-destructive alternative:
- Writing to `experiments/NOTES.md` before the experiment has actually run and produced metrics
- … `experiments/NOTES.md`
The rules look pedantic. They exist because every weakening creates a path to non-reproducible results.
- experiment-layout: where new experiments go
- cell-source-isolation: what files the entry's "Files" section should point at
`npx claudepluginhub nchc-bio/nchc-marketplace --plugin lab-notebook-skills`