This guided practical demonstrates how the tidyverse makes it easy to compute summary statistics and visualize datasets. The datasets are already compiled in a tidy tibble; cleaning steps will come in future practicals.
Check that the datasauRus package is installed by loading it:
library(datasauRus)
If the error
there is no package called ‘datasauRus’
appears, the package needs to be installed. Use this:
install.packages("datasauRus")
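The check-then-install step above can also be written so that it runs unattended. This is a minimal sketch, assuming a CRAN mirror is reachable; `requireNamespace()` is base R and only tests whether the package can be loaded.

```r
# Install datasauRus only when it is missing, then attach it.
if (!requireNamespace("datasauRus", quietly = TRUE)) {
  install.packages("datasauRus")
}
library(datasauRus)
```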
Since we are dealing with a tibble, we can just type
datasaurus_dozen
Only the first 10 rows are displayed.
dataset | x | y |
---|---|---|
dino | 55.3846 | 97.1795 |
dino | 51.5385 | 96.0256 |
dino | 46.1538 | 94.4872 |
dino | 42.8205 | 91.4103 |
dino | 40.7692 | 88.3333 |
dino | 38.7179 | 84.8718 |
dino | 35.6410 | 79.8718 |
dino | 33.0769 | 77.5641 |
dino | 28.9744 | 74.4872 |
dino | 26.1538 | 71.4103 |
To check the dimensions of the dataset:
base version, using either dim(), or ncol() and nrow()
tidyverse version
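Both versions mentioned above might look like the following sketch. The choice of `glimpse()` for the tidyverse version is an assumption; the practical may have intended something else, such as simply printing the tibble, whose header also reports the dimensions.

```r
library(datasauRus)
library(dplyr)

# base version
dim(datasaurus_dozen)    # number of rows, then number of columns
nrow(datasaurus_dozen)
ncol(datasaurus_dozen)

# tidyverse version (assumed): glimpse() reports rows, columns and column types
glimpse(datasaurus_dozen)
```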
Assign datasaurus_dozen to the ds_dozen object. This aims at populating the Global Environment.
Count the distinct datasets:
summarise(ds_dozen, n = n_distinct(dataset))
## # A tibble: 1 x 1
##       n
##   <int>
## 1    13
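The assignment and the distinct count above can be sketched together as follows. The variable `n_datasets` is a name introduced here for illustration, and the `distinct()` call is an extra that lists the dataset names rather than counting them.

```r
library(datasauRus)
library(dplyr)

# Copy the packaged tibble into an object in the Global Environment
ds_dozen <- datasaurus_dozen

# Same count as summarise(ds_dozen, n = n_distinct(dataset)), written with the pipe
n_datasets <- ds_dozen %>%
  summarise(n = n_distinct(dataset)) %>%
  pull(n)
n_datasets

# List the dataset names themselves, one row per dataset
distinct(ds_dozen, dataset)
```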
To get the number of observations per dataset, count() in dplyr does the group_by() by the specified column followed by summarise(n = n()), which returns the number of observations per defined group.
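To make the equivalence described above concrete, here is a small sketch that computes the per-dataset counts both ways. The object names `by_count` and `by_hand` are illustrative only.

```r
library(datasauRus)
library(dplyr)

ds_dozen <- datasaurus_dozen

# Shortcut: count() groups by the column and tallies the rows
by_count <- count(ds_dozen, dataset)

# Long form: explicit grouping followed by summarise(n = n())
by_hand <- ds_dozen %>%
  group_by(dataset) %>%
  summarise(n = n())

by_count
```

Both objects hold one row per dataset with the same `n` column.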
Compute summary statistics for the x & y columns. For this, you need to group_by() the appropriate column and then summarise().
In summarise() you can define as many new columns as you wish. No need to call it for every single variable.
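A grouped summary along these lines might look like the sketch below. The choice of mean and standard deviation is an assumption made for illustration, as are the object names `stats` and `stats_if`. The second call shows a column-wise variant with `summarise_if()`, which applies the same functions to every numeric column and therefore leaves the dataset column out.

```r
library(datasauRus)
library(dplyr)

ds_dozen <- datasaurus_dozen

# Several new columns defined in a single summarise() call
stats <- ds_dozen %>%
  group_by(dataset) %>%
  summarise(mean_x = mean(x), mean_y = mean(y),
            sd_x = sd(x), sd_y = sd(y))
stats

# Column-wise variant: summarise every numeric column with the same functions
stats_if <- ds_dozen %>%
  group_by(dataset) %>%
  summarise_if(is.numeric, list(mean = mean, sd = sd))
stats_if
```

In recent dplyr releases `summarise_if()` is superseded by `across()`, for example `summarise(across(where(is.numeric), list(mean = mean, sd = sd)))`.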
Summarise the x & y columns in the same way using summarise_if, so that we exclude the dataset column and compute the others.
Plot ds_dozen with ggplot such that the aesthetics are aes(x = x, y = y), with the geometry geom_point().
The ggplot() and geom_point() functions must be linked with a + sign.
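The scatter plot described above can be sketched as follows. The object name `p` is introduced here only so the plot can be printed or reused.

```r
library(datasauRus)
library(ggplot2)

ds_dozen <- datasaurus_dozen

# The data and aesthetics go in ggplot(); the + sign adds the point geometry
p <- ggplot(ds_dozen, aes(x = x, y = y)) +
  geom_point()
p
```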
Colour the points by the dataset column.
Use %in% to test if there is a match of the left operand in the right one (most probably a vector).
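As an illustration of `%in%`, the sketch below keeps a few datasets and colours the points by the dataset column. The three names in `keep` are an arbitrary selection made for this example, and `subset_ds` and `p_subset` are hypothetical object names.

```r
library(datasauRus)
library(dplyr)
library(ggplot2)

ds_dozen <- datasaurus_dozen

# Arbitrary selection of datasets to keep (assumption for this example)
keep <- c("dino", "star", "circle")

# %in% returns TRUE for each row whose dataset value appears in keep
subset_ds <- filter(ds_dozen, dataset %in% keep)

p_subset <- ggplot(subset_ds, aes(x = x, y = y, colour = dataset)) +
  geom_point()
p_subset
```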
Plot one dataset per facet.
Use theme_void and remove the legend.
Install gganimate; its dependencies will be installed automatically.
Pass the dataset variable as the argument of the transition_states() layer.
never trust summary statistics alone; always visualize your data | Alberto Cairo
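The faceting, theming and animation steps described above might be sketched as follows. This assumes gganimate is installed; the `transition_length` and `state_length` values are arbitrary, and the object names `p_facets` and `anim` are introduced for illustration.

```r
library(datasauRus)
library(ggplot2)
library(gganimate)

ds_dozen <- datasaurus_dozen

# One panel per dataset, with an empty theme and no legend
p_facets <- ggplot(ds_dozen, aes(x = x, y = y, colour = dataset)) +
  geom_point() +
  facet_wrap(~ dataset) +
  theme_void() +
  theme(legend.position = "none")

# Animated version: each value of dataset becomes one state of the animation
anim <- ggplot(ds_dozen, aes(x = x, y = y)) +
  geom_point() +
  theme_void() +
  transition_states(dataset, transition_length = 2, state_length = 1)
```

Printing `p_facets` draws the static panels. Rendering `anim`, for example with `animate(anim)`, additionally needs a renderer package such as gifski.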
Authors
from this post