From r-package-skills
Use when code loads or uses collapse (library(collapse), collapse::), performing fast grouped or weighted statistics in R, or seeking faster alternatives to dplyr aggregation
Slash command: /r-package-skills:r-collapse
**collapse provides C/C++-based high-performance grouped and weighted statistics.** 50-100x faster than dplyr for grouped operations, matches data.table speed while working with any data frame type (tibbles, data.tables, xts).
Core principle: Fast aggregation, transformation, and panel data operations through vectorized C code.
Read references/API.md before writing code.
- references/API.md - Complete function reference
- references/collapse-for-tidyverse-users.md - Migration guide and patterns
- references/collapse-documentation.md - Core concepts and usage
- references/collapse-and-sf.md - Working with spatial data
- references/collapse-object-handling.md - Data structure handling

Use collapse when:
Don't use:
vs Alternatives:
| Scenario | Use This |
|---|---|
| Large grouped stats | collapse |
| Weighted computations | collapse |
| sf manipulation | dplyr |
| Reference semantics | data.table |
| Complex joins | data.table |
| Arbitrary group functions | dplyr |

Quick reference:

| Task | Function/Example |
|---|---|
| Grouped stats | fmean(), fsum(), fsd(), fmedian() |
| Aggregation | collap(df, ~ by, list(fmean, fsd)) |
| Transform | ftransform(), fmutate() |
| Selection | fselect(), fsubset() (~100x faster) |
| Time series | flag(), fdiff(), fgrowth() |
| Panel data | fwithin(), fbetween(), qsu() |
| Grouping | fgroup_by(), GRP() |
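
The time series row of the table can be illustrated with a minimal sketch. The vectors `x` and `id` are hypothetical toy data, and the grouped call assumes rows are already ordered in time within each group.

```r
library(collapse)

x  <- c(10, 12, 15, 21)        # hypothetical series, ordered in time
id <- c("a", "a", "b", "b")    # hypothetical group identifier

flag(x)       # lag by one period:            NA 10 12 15
fdiff(x)      # first difference:             NA  2  3  6
fgrowth(x)    # growth rate in percent:       NA 20 25 40

# Lag within groups: the first observation of each group becomes NA
flag(x, 1, g = id)   # NA 10 NA 15
```

For panels that are not sorted, these functions also take a time variable through the `t` argument.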
```r
library(collapse)

# Basic: grouped mean (50-100x faster than dplyr)
data |> fgroup_by(category) |> fmean()

# Weighted aggregation
data |> fgroup_by(region) |> fmean(w = weight_col)

# Multiple stats at once
collap(data, ~ category, list(fmean, fsd, fmedian))

# TRA transformations (key differentiator - single C pass)
data |> fgroup_by(id) |> fmean(TRA = "-")    # Demean: subtract group mean
data |> fgroup_by(id) |> fsd(TRA = "/")      # Scale: divide by group SD
data |> fgroup_by(id) |> fmean(TRA = "fill") # Fill: replace all values (including NAs) with group mean

# See references/API.md for full TRA options ("-", "/", "fill", "-+", "replace")
```

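
To make the TRA idea concrete, here is a small sketch on hypothetical toy vectors. It compares `fmean(..., TRA = "-")` with a two-step base R computation and with `fwithin()`. The names `x`, `id`, `demeaned`, and `base_way` are made up for illustration.

```r
library(collapse)

# Hypothetical toy data: group means are 2 (a) and 12 (b)
x  <- c(1, 3, 10, 14)
id <- c("a", "a", "b", "b")

# Compute the group mean and subtract it in one call
demeaned <- fmean(x, g = id, TRA = "-")   # -1  1 -2  2

# Base R equivalent: compute group means, then subtract
base_way <- x - ave(x, id)

# fwithin() is the dedicated centering function and gives the same values
centered <- fwithin(x, g = id)

# fbetween() returns the group means, one per row (no rows are dropped)
group_means <- fbetween(x, g = id)        # 2  2 12 12
```

All three centered results have the same length as `x`, which is what distinguishes a TRA transformation from an aggregation.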
| Mistake | Fix |
|---|---|
| Using group_by() with collapse functions | Use fgroup_by() or pass g = GRP(groupvar) |
| collap() applies to ALL numeric columns | Explicitly select columns before calling |
| Expecting na.rm = FALSE default | collapse defaults to na.rm = TRUE |
| fwithin()/fbetween() collapse rows | They return same # rows (centered/group means) |
| Global options affect behavior | Set arguments explicitly in package code |
| Ignoring sort = FALSE speedup | Add sort = FALSE when order doesn't matter (3x faster) |

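
A few of the rows above can be checked on a tiny data frame. This is a sketch only: `df` and its columns are hypothetical, and the two-sided formula is one way (not necessarily the only one) to limit `collap()` to chosen columns.

```r
library(collapse)

# Hypothetical toy data
df <- data.frame(
  category = c("b", "b", "a", "a"),
  value    = c(1, NA, 3, 5),
  other    = c(10, 20, 30, 40)
)

# na.rm defaults to TRUE in collapse, unlike base R
mean(df$value)                   # NA
fmean(df$value)                  # 3
fmean(df$value, na.rm = FALSE)   # NA

# A two-sided formula aggregates only the columns on the left-hand side
res <- collap(df, value ~ category, fmean)

# sort = FALSE skips sorting; groups come back in order of first appearance
unsorted <- df |> fgroup_by(category, sort = FALSE) |> fmean()
```

Passing `na.rm` explicitly, as in the third call, is also the simplest way to stay independent of any global option settings.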
See references/ for API reference, vignette content (tidyverse comparison, sf integration, object handling, development guidelines), and panel data patterns.
Validator: lib/r-validators/numerical-validator.R
Resources: Docs
npx claudepluginhub arthurgailes/r-package-skills --plugin r-package-skills