Split, Apply, Combine
data-wrangling
reproducibility
teaching
A walkthrough of six approaches to the split-apply-combine pattern in R, from explicit column-by-column summaries to parallel execution with furrr, using the Palmer Penguins dataset as a running example.
Split, Apply, Combine works through a common problem in research workflows: running the same analysis across multiple subsets of a dataset. Starting from fully explicit code and moving through for loops, across(), group_modify(), map(), and finally furrr::future_map(), the post shows how the choice of tool reflects a tradeoff between specificity and generality. It closes with a timing comparison and a practical guide to choosing among the approaches.
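As a rough illustration of the two ends of that spectrum, the sketch below contrasts a fully explicit summary of one subset with a grouped summary using across(). It is not taken from the post itself: the choice of columns (bill_length_mm, body_mass_g) and the object names (adelie, adelie_mass, by_species) are assumptions made for the example, and it assumes the dplyr and palmerpenguins packages are installed.

```r
# Illustrative sketch only; object names and column choices are assumptions.
library(dplyr)

# Refer to the dataset explicitly so it is unambiguous which `penguins` is used.
penguins <- palmerpenguins::penguins

# Explicit end: one subset, one column, written out by hand.
adelie <- penguins[penguins$species == "Adelie", ]
adelie_mass <- mean(adelie$body_mass_g, na.rm = TRUE)

# General end: split by species, apply mean() to several columns, combine.
by_species <- penguins %>%
  group_by(species) %>%
  summarise(
    across(
      c(bill_length_mm, body_mass_g),
      function(x) mean(x, na.rm = TRUE)
    ),
    .groups = "drop"
  )

print(by_species)
```

The explicit version is easy to read but has to be repeated for every species and every column; the grouped version returns one row per species with the same statistic applied to each selected column.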