
Split a data frame by a grouping column and write each group to a CSV file
Source: R/write-by-group.R
write_by_group.Rd
Splits a data frame by a single grouping column and writes each group to a separate CSV file. Optionally writes a manifest file listing the output files, their group values, and row counts.
Arguments
- data
A data frame or tibble to split and save.
- group_col
A string. The name of the column to group by.
- output_dir
A string or NULL. Path to the directory where output files will be written. Created if it does not exist. If NULL, the user must supply a path explicitly.
- manifest
A logical. Whether to write a manifest.csv file to output_dir listing the output files, group values, and row counts. Defaults to FALSE.
Details
Output filenames are derived from the group values of group_col.
Values are sanitized before use as filenames: converted to lowercase,
spaces and special characters replaced with -, consecutive dashes
collapsed, and leading/trailing dashes stripped.
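The sanitization steps above can be sketched in base R as follows. This is an illustration only, not the package's actual code: sanitize_filename() is a hypothetical helper, and treating every character outside a-z and 0-9 as "special" is an assumption.

```r
# Hypothetical helper; the real implementation inside write_by_group()
# may differ in which characters it treats as "special".
sanitize_filename <- function(x) {
  x <- tolower(as.character(x))   # convert to lowercase
  x <- gsub("[^a-z0-9]", "-", x)  # replace spaces and special characters with "-"
  x <- gsub("-+", "-", x)         # collapse consecutive dashes
  x <- gsub("^-|-$", "", x)       # strip a leading or trailing dash
  x
}

sanitize_filename(c("Adelie", "North  Island (2020)", "--Gentoo!"))
#> [1] "adelie"            "north-island-2020" "gentoo"
```

Under these rules, distinct group values such as "A B" and "a-b" would map to the same name. The sketch does not handle such collisions.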
If manifest = TRUE, a manifest.csv is written to output_dir
containing three columns: group_value, n_rows, and file_path.
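As a rough sketch of what such a manifest could look like, the base R code below splits a toy data frame, writes one CSV per group, and builds a table with the three columns named above. The subdirectory name and the use of tolower() as a simplified stand-in for the filename sanitization are assumptions made for illustration. The function itself may assemble the manifest differently.

```r
# Illustration only: a manifest-like table for a toy data set.
data <- data.frame(
  species = c("Adelie", "Adelie", "Gentoo"),
  mass = c(3750, 3800, 5000)
)

# Hypothetical subdirectory of the session temp directory
output_dir <- file.path(tempdir(), "write-by-group-sketch")
dir.create(output_dir, showWarnings = FALSE, recursive = TRUE)

# One data frame per distinct value of the grouping column
groups <- split(data, data$species)

manifest <- data.frame(
  group_value = names(groups),
  n_rows = vapply(groups, nrow, integer(1), USE.NAMES = FALSE),
  # tolower() stands in for the full sanitization step here
  file_path = file.path(output_dir, paste0(tolower(names(groups)), ".csv")),
  stringsAsFactors = FALSE
)

# Write each group, then the manifest itself
for (i in seq_along(groups)) {
  write.csv(groups[[i]], manifest$file_path[i], row.names = FALSE)
}
write.csv(manifest, file.path(output_dir, "manifest.csv"), row.names = FALSE)
```

Reading manifest.csv back with read.csv() gives one row per group, which makes it easy to check row counts against the original data.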
Note: output_dir has no default value. Always supply an explicit path
to avoid writing files to unexpected locations. Use tempdir() for
temporary output during testing or exploration.
Examples
# \donttest{
# Split a small data frame by group and write to a temp directory
data <- data.frame(
  species = c("Adelie", "Adelie", "Gentoo"),
  mass = c(3750, 3800, 5000)
)
write_by_group(data, group_col = "species", output_dir = tempdir())
#> ✔ Written "Adelie" (2 rows) to /tmp/Rtmpip9DJO/adelie.csv
#> ✔ Written "Gentoo" (1 rows) to /tmp/Rtmpip9DJO/gentoo.csv
# Same but also write a manifest
write_by_group(data, group_col = "species",
               output_dir = tempdir(), manifest = TRUE)
#> ✔ Written "Adelie" (2 rows) to /tmp/Rtmpip9DJO/adelie.csv
#> ✔ Written "Gentoo" (1 rows) to /tmp/Rtmpip9DJO/gentoo.csv
#> ✔ Manifest written to /tmp/Rtmpip9DJO/manifest.csv
# }