Preparing data is the most time-consuming part of data analysis
dplyr is a toolbox for working with data in tibbles/data frames
- The most important data manipulation operations are covered
  - Selection and manipulation of
    - observations,
    - variables and
    - values.
  - Summarizing
  - Grouping
  - Joining and intersecting tibbles
- In a workflow, dplyr typically follows reshaping operations from tidyr
- Fast, because key pieces are written in C++ (using Rcpp)
- Standard interfaces to databases (dbplyr) or data.table (dtplyr)
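
The operations listed above can be sketched in one short pipeline. This is a minimal illustration, not a canonical workflow: the tibbles `measurements` and `sites` and all their column names are hypothetical and were invented for this example. It assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
measurements <- tibble(
  id     = c(1, 2, 3, 4),
  site   = c("A", "A", "B", "B"),
  height = c(10, 12, 20, 24)
)
sites <- tibble(
  site   = c("A", "B"),
  region = c("north", "south")
)

result <- measurements %>%
  filter(height > 10) %>%               # select observations (rows)
  select(site, height) %>%              # select variables (columns)
  mutate(height_m = height / 100) %>%   # manipulate values
  group_by(site) %>%                    # grouping
  summarise(
    mean_height_m = mean(height_m),     # summarizing, one row per group
    n = n()
  ) %>%
  left_join(sites, by = "site")         # joining tibbles

# Intersecting: distinct rows that occur in both tibbles
common_sites <- intersect(select(measurements, site), select(sites, site))

result
```

Each verb takes a tibble as its first argument and returns a tibble, which is why the steps chain with the pipe.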
Data preparation is also an essential part of understanding the data, so it is hard to avoid.


