From datascience
Jupyter notebook structure and organization standards. Auto-triggered when creating or modifying Jupyter notebooks (.ipynb or Jupytext .py). Enforces section order, GPU selection, import grouping, configuration patterns, and rich output formatting.
How this skill is triggered — by the user, by Claude, or both
Slash command
/datascience:notebook-standardsThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Patterns for Jupyter notebook creation.
Patterns for Jupyter notebook creation.
CUDA_VISIBLE_DEVICES BEFORE any torch/tf/jax importset_seed(42) for random, numpy, torchimport os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
os.environ["TOKENIZERS_PARALLELISM"] = "false"
%load_ext autoreload
%autoreload 2
# Standard library
import os
from pathlib import Path
# Data processing
import numpy as np
import polars as pl
# Deep learning
import torch
import torch.nn as nn
# Rich console output
import rich.jupyter as rich
from rich.progress import Progress, BarColumn
ALL imports in this cell. Never import later.
All hyperparameters defined here with inline # comment per field. End with a Rich render that groups fields into sections so a scrolling reader can scan structure without reading code. Canonical template:
from rich import print as rprint
# Model
MODEL_NAME = "answerdotai/ModernBERT-base" # 149M params, 8192 token context
MAX_TOKEN_LENGTH = 512 # cap input sequences
# Training
BATCH_SIZE = 32 # per device
EPOCHS = 3
LR = 2e-5
WEIGHT_DECAY = 0.01
WARMUP_RATIO = 0.1
# Paths
DATA_PATH = Path("../data/processed/train.parquet")
OUTPUT_DIR = Path("../models/v1")
# Device - resolved at runtime; GPU info shown in render below
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
gpu_name = torch.cuda.get_device_name(0) if torch.cuda.is_available() else "N/A"
rprint(f"""[bold cyan]Configuration[/bold cyan]
[dim]{"─" * 40}[/dim]
[bold]Model[/bold]
Name: [yellow]{MODEL_NAME}[/yellow]
Max token length: [yellow]{MAX_TOKEN_LENGTH}[/yellow]
[bold]Training[/bold]
Batch size: [yellow]{BATCH_SIZE}[/yellow] [dim](per device)[/dim]
Epochs: [yellow]{EPOCHS}[/yellow]
Learning rate: [yellow]{LR}[/yellow]
Weight decay: [yellow]{WEIGHT_DECAY}[/yellow]
Warmup ratio: [yellow]{WARMUP_RATIO}[/yellow]
[bold]Paths[/bold]
Data: [cyan]{DATA_PATH}[/cyan]
Output: [cyan]{OUTPUT_DIR}[/cyan]
[bold]Device[/bold]
Using: [green]{device}[/green]
GPU: [cyan]{gpu_name}[/cyan]
""")
Colour grammar (cross-link to datascience:rich-output for full palette):
[bold]Section name[/bold] - sub-section headers within the render[yellow]value[/yellow] - numeric hyperparameters[cyan]value[/cyan] - paths, model IDs, identifiers[green]value[/green] - active device / runtime state[dim]hint[/dim] - parenthetical context, dividersGPU rule (mandatory when CUDA is used): the rendered Configuration MUST include the GPU's torch.cuda.get_device_name(0) in a [bold]Device[/bold] sub-section. Catches the silent wrong-device bug where CUDA_VISIBLE_DEVICES was set wrong but device="cuda" still works on the wrong GPU. See GPU-SETUP.md for the full Device-block template (adds compute capability + memory).
Minimal variant (for notebooks without sub-sections, e.g. quick data-loading scripts):
rprint(f"""[white]Configuration[/white]
Model: [cyan]{MODEL_NAME}[/cyan]
Batch size: [yellow]{BATCH_SIZE}[/yellow]
Device: [green]{device}[/green]
GPU: [cyan]{gpu_name}[/cyan]
""")
Use this when the notebook has <5 hyperparameters and no natural sub-sections. The sectioned template above is the default.
Each section MUST have a 1-2 sentence overview directly below its ## Section Name header, BEFORE the first code cell. Bullet lists (3-5 items) acceptable when the content is naturally listy. The overview answers "what does this section do, and why" so a scrolling reader can decide whether to expand the code or skip. Empty sections without overviews fail the standard.
Sole exception: the file-level Header section (the very top of the notebook) - the header itself IS the overview for the notebook as a whole.
## Configuration ->
Training hyperparameters for the ModernBERT contrastive run. Lower temperature increases discrimination but risks gradient instability; larger effective batch size improves contrastive learning but requires more memory.
## Data Loading ->
Loads the cached parquet from
data/processed/. Schema is validated against the training contract; rows with null targets are dropped here so downstream cells assume clean input.
## Evaluation ->
Reports the four metrics tracked across iterations:
- Accuracy on the held-out test split
- F1 macro to weight rare classes equally
- Latency p95 measured on a 1k-row sample
- Memory peak captured via
torch.cuda.max_memory_allocated
The overview lives in the SAME markdown cell as the header (so you don't need a third cell type). Pattern:
## Configuration
Training hyperparameters for the ModernBERT contrastive run. Lower
temperature increases discrimination but risks gradient instability.
$f(x) = ax + b$, display $$P(A|B) = \frac{P(B|A) P(A)}{P(B)}$$. JupyterLab renders both via MathJax. USE these for any actual mathematical content$ means "dollar", escape as \$5 so MathJax does not eat everything between the two unescaped dollars.md files OUTSIDE notebooks (READMEs, SKILL.md, docs/): workspace rule applies - escape always with \$, no equations expected (the rendered Markdown surfaces don't run MathJax)rich.print() with semantic colors. Single multiline call. Never multiple individual prints for related output. See datascience:rich-output.
npx claudepluginhub stellarshenson/claude-code-plugins --plugin datascienceProvides UI/UX resources: 50+ styles, color palettes, font pairings, guidelines, charts for web/mobile across React, Next.js, Vue, Svelte, Tailwind, React Native, Flutter. Aids planning, building, reviewing interfaces.
Searches MemPalace before answering questions about past work, people, projects, or prior decisions. Returns verbatim stored content instead of guessing from model memory.