Skill

pandas-consistency

Write, review, refactor, or debug Python code that uses pandas (DataFrames, Series, pd.read_csv, groupby, merge, pivot) using one canonical, modern idiom set. Use this skill whenever code creates or manipulates DataFrames, cleans or reshapes tabular data, fixes a SettingWithCopyWarning, migrates off deprecated pandas APIs (df.append, inplace=True, applymap, chained indexing), or when the user asks "why did my assignment not stick," "why is this slow," or "is this the right pandas way." Trigger it even when the user just says "load this CSV and ..." or shows a stack trace mentioning pandas, without ever using the phrase "pandas idioms."

Slash command: /pandas-consistency:pandas-consistency

SKILL.md: 93 lines, ~1.5k tokens. Last commit: Jun 12, 2026.

pandas — consistent, modern idioms

pandas is stable and well documented, yet generated code drifts between three eras of the API: pre-1.0 patterns (chained indexing, df.ix), the deprecated-but-familiar middle era (df.append, inplace=True everywhere), and the modern 2.x style written to stay correct under copy-on-write (opt-in in 2.x, the only behavior in 3.0). This skill pins one canonical idiom set, pandas 2.x semantics, so every snippet you produce or review follows the same rules instead of mixing eras.

Canonical idioms — always X, never Y

| Always | Never | Why |
| --- | --- | --- |
| `df.loc[mask, "col"] = value` | `df[mask]["col"] = value` | Chained indexing assigns to a temporary; under copy-on-write it never updates `df`. |
| `df.loc[...]` / `df.iloc[...]` | `df.ix[...]` | `ix` was removed in 1.0. |
| `pd.concat([df1, df2], ignore_index=True)` | `df1.append(df2)` | `DataFrame.append` was removed in 2.0. |
| `df = df.drop(columns=["a"])` (reassign) | `df.drop("a", axis=1, inplace=True)` | `inplace=True` rarely avoids a copy, returns `None` so it blocks method chaining, and is being phased out. |
| `df["c"] = df["a"] * df["b"]` (vectorized) | `for i, row in df.iterrows(): ...` | `iterrows` is orders of magnitude slower and yields copies, so writes to `row` are lost. |
| `df.map(func)` (elementwise, 2.1+) | `df.applymap(func)` | `applymap` was deprecated in 2.1 and renamed `DataFrame.map`. |
| `pd.NA` / `df["col"].isna()` | `value == np.nan` comparisons | `NaN != NaN`, so equality checks silently return `False`. |
| `df.groupby("k", observed=True)` for categoricals | relying on the old default | The `observed=False` default has been deprecated since 2.1 and changes to `True` in 3.0; be explicit. |
| `pd.to_datetime(s, format="...")` or `format="mixed"` | bare `pd.to_datetime(s)` on messy strings | Silent element-wise format guessing was removed in 2.0; mixed formats now raise. |
| explicit `dtype=` / `parse_dates=` in `read_csv` | letting inference decide IDs and dates | Lost leading zeros, int64 overflow, and object-dtype dates are silent corruption. |
| `df.copy()` when you mean an independent frame | slicing and mutating | Under copy-on-write, mutating a slice never propagates to the parent; before CoW it sometimes did. Be explicit either way. |
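As a rough illustration, several of the "Always" forms from the table can be exercised together on a toy frame. The data and the names `df`, `extra`, `combined`, `missing`, and `trimmed` are assumptions made up for this sketch, not part of the skill.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical data).
df = pd.DataFrame({"a": [1, 2, 3], "b": [10.0, np.nan, 30.0]})

# One .loc call carries both the row mask and the column label,
# so the write lands on df itself rather than on a temporary.
df.loc[df["a"] > 1, "b"] = 0.0

# Vectorized column arithmetic instead of looping with iterrows.
df["c"] = df["a"] * df["b"]

# pd.concat stands in for the removed DataFrame.append.
extra = pd.DataFrame({"a": [4], "b": [40.0], "c": [160.0]})
combined = pd.concat([df, extra], ignore_index=True)

# Missing values are detected with isna(); `== np.nan` is always False.
missing = pd.Series([1.0, np.nan]).isna()

# Reassign the result instead of passing inplace=True.
trimmed = combined.drop(columns=["c"])
```

Each statement is independent of copy-on-write being enabled, which is the point of the canonical set: the same lines should behave identically on 2.x with or without the option.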

Method chaining is the house style for transformation pipelines:

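```python
import pandas as pd

result = (
    # Explicit dtypes at load time: dates parsed, store IDs kept as strings.
    pd.read_csv("sales.csv", parse_dates=["date"], dtype={"store_id": "string"})
    .rename(columns=str.lower)
    .assign(revenue=lambda d: d["units"] * d["unit_price"])
    .query("revenue > 0")
    # Named aggregation: one output column per (source column, function) pair.
    .groupby("store_id", as_index=False)
    .agg(total_revenue=("revenue", "sum"), n_orders=("revenue", "size"))
    .sort_values("total_revenue", ascending=False)
)
```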

Pitfalls that produce silently wrong results

- Chained-indexing assignment "works" in some environments without copy-on-write and is a silent no-op in others; with copy-on-write it never updates the original. Always route writes through a single .loc[rows, cols] call.
- SettingWithCopyWarning is a symptom, not the disease: find the slice-then-mutate and replace it with .loc on the original or an explicit .copy().
- Merge row explosion: keys duplicated on both sides multiply rows (m left matches times n right matches per key). Use df.merge(other, on="k", validate="one_to_one") (or "many_to_one", etc.) to assert the expected cardinality, and pass indicator=True when debugging to see which side each row came from.
- groupby drops NaN keys by default; pass dropna=False if NaN groups matter.
- object dtype hides mixed types (ints + strings sort/compare strangely). Prefer nullable dtypes: "string", "Int64", "boolean", or df.convert_dtypes().
- Timezone traps: tz_localize attaches a zone to naive timestamps; tz_convert translates an aware one. Localizing an aware value or converting a naive one raises TypeError; the silent failure is localizing to the wrong zone, which shifts every instant.
- fillna then compare: filling with sentinel values like 0 or "" changes aggregation results (a filled 0 counts toward a mean, a NaN does not); fill late, or use nullable dtypes end-to-end.
- axis confusion: axis=0/"index" aggregates down each column, giving one result per column; axis=1/"columns" aggregates across each row, giving one result per row. Spell out the string form in reviews.
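Three of the pitfalls above (merge row explosion, dropped NaN groups, and the localize/convert distinction) can be reproduced in a few lines. This is only a sketch: the frames `orders`, `stores`, and `sales` and their values are invented for illustration, and the timezone part assumes the IANA zone database is available to pandas.

```python
import numpy as np
import pandas as pd

# Merge row explosion: "a" appears twice on both sides.
orders = pd.DataFrame({"k": ["a", "a", "b"], "qty": [1, 2, 3]})
stores = pd.DataFrame({"k": ["a", "a", "b"], "city": ["X", "Y", "Z"]})
exploded = orders.merge(stores, on="k")  # 2*2 rows for "a" plus 1 for "b"

# validate= turns the silent explosion into an explicit error.
try:
    orders.merge(stores, on="k", validate="many_to_one")
    merge_rejected = False
except pd.errors.MergeError:
    merge_rejected = True

# groupby drops the NaN key unless dropna=False is passed.
sales = pd.DataFrame({"region": ["n", np.nan, "n"], "amt": [1, 2, 3]})
default_total = sales.groupby("region")["amt"].sum()
kept_total = sales.groupby("region", dropna=False)["amt"].sum()

# tz_localize attaches a zone; tz_convert re-expresses the same instant.
naive = pd.Timestamp("2024-01-15 12:00")
utc = naive.tz_localize("UTC")
new_york = utc.tz_convert("America/New_York")

# Localizing an already-aware timestamp is rejected rather than shifted.
try:
    utc.tz_localize("UTC")
    relocalize_rejected = False
except TypeError:
    relocalize_rejected = True
```

The two groupby totals differ by exactly the amount attached to the NaN key, which is the kind of discrepancy that otherwise surfaces only when a report fails to reconcile.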

Version notes

Target pandas 2.x. The key breaking line is 2.0: DataFrame.append is removed, datetime parsing is stricter, and copy-on-write is available as an opt-in mode (it becomes the only behavior in 3.0). df.ix was already gone in 1.0. If the user is pinned to 1.x, the canonical idioms still run because they are backward-compatible, so do not write era-mixed code; just avoid names that exist only in newer releases (DataFrame.map is 2.1+, format="mixed" is 2.0+) and note the difference.
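Where a codebase genuinely has to run on both sides of the 2.1 rename, one possible approach is to isolate the difference in a single helper rather than scatter both spellings through the code. The helper name `elementwise` is hypothetical and not something the skill or pandas defines.

```python
import pandas as pd


def elementwise(df: pd.DataFrame, func):
    """Apply func to every cell of df (hypothetical compatibility helper)."""
    # DataFrame.map exists from pandas 2.1 onward; older releases only
    # have applymap, which 2.1 deprecated in favor of the new name.
    if hasattr(pd.DataFrame, "map"):
        return df.map(func)
    return df.applymap(func)


frame = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
doubled = elementwise(frame, lambda v: v * 2)
```

Keeping the branch in one place means call sites stay in the canonical style, and the helper can be deleted once the 1.x pin is lifted.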

Workflow

1. Establish the frame contract first: expected columns, dtypes, index, and key uniqueness. Load with explicit dtype=/parse_dates=.
2. Write transformations as vectorized, chained, reassigning steps (df = df... or one pipeline). Reach for .apply/loops only after confirming no vectorized form exists.
3. Route every write through a single .loc/.iloc indexer; add .copy() when forking.
4. Validate structure-changing ops: validate= on merges, shape checks after filters, observed=/dropna= decisions made explicitly on groupbys.
5. When reviewing existing code, flag any "Never" column pattern above and rewrite it in the canonical form rather than patching around it.
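The first four workflow steps might look like this on a small in-memory CSV. The column names, the `stores` lookup table, the "tier" column, and the threshold of 5 are all assumptions chosen for the sketch.

```python
import io

import pandas as pd

# Hypothetical input; a real pipeline would read from a file path.
raw = io.StringIO(
    "store_id,date,units,unit_price\n"
    "007,2024-01-05,3,2.50\n"
    "012,2024-01-06,1,4.00\n"
    "007,2024-01-07,2,2.50\n"
)

# Step 1: frame contract. IDs stay strings (leading zeros kept), dates parsed.
sales = pd.read_csv(raw, dtype={"store_id": "string"}, parse_dates=["date"])
assert {"store_id", "date", "units", "unit_price"}.issubset(sales.columns)

stores = pd.DataFrame(
    {"store_id": ["007", "012"], "city": ["Oslo", "Bergen"]}
).astype({"store_id": "string"})
assert stores["store_id"].is_unique  # key uniqueness on the lookup side

# Step 2: vectorized, reassigning transformation.
sales = sales.assign(
    revenue=lambda d: d["units"] * d["unit_price"],
    tier="standard",
)

# Step 3: a single .loc indexer for the conditional write.
sales.loc[sales["revenue"] > 5, "tier"] = "high"

# Step 4: validate the merge cardinality and check the shape afterwards.
merged = sales.merge(stores, on="store_id", how="left", validate="many_to_one")
assert len(merged) == len(sales)
```

The in-line assertions are deliberately cheap; they document the contract next to the code that depends on it and fail loudly if an upstream file changes shape.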

For the fuller migration map (old API → modern API), expanded gotcha explanations, and more worked examples, read references/pandas-patterns.md.
