This tutorial is inspired by and adapted from the STHDA practical guide, published under a Creative Commons license.
Download the mousew dataset from here and load it into R. This dataset contains the weight (in grams) of mice from two strains, for both sexes.
library(tidyverse)  # provides read_csv(), %>% and ggplot()

mousew <- read_csv("data/mousew.csv",
                   col_types = cols(strain = col_character(),
                                    sex = col_character(),
                                    weight = col_double()))

mousew %>%
  ggplot(aes(x = weight)) +
  geom_histogram(binwidth = 0.5)
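Fill the bars by sex so that the two distributions can be compared, and see the position argument of geom_histogram().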
mousew %>%
  ggplot(aes(x = weight, fill = sex)) +
  geom_histogram(binwidth = 0.5, position = "identity", alpha = 0.5)
# or position = "dodge", but the result is not very readable

mousew %>%
  # colour also colours the outlines of the curves
  ggplot(aes(x = weight, fill = sex, colour = sex)) +
  geom_density(alpha = 0.5)
mousew %>%
  # after_stat() supersedes stat() (ggplot2 >= 3.3.0)
  ggplot(aes(x = weight, y = after_stat(density))) +
  geom_histogram(aes(fill = sex), alpha = 0.6, position = "identity", binwidth = 0.5) +
  geom_density(aes(color = sex))
ggplot2 automatically adjusts the range of the axes. Try to override this behaviour and let the x axis start at 0; see the expand_limits() function.
Each geom_*() can take its own data argument, which overrides the one inherited from the ggplot() call. It might be worth summarising the weights by their mean into a 4-row tibble, then passing it to geom_vline(). Note that geom_vline() does not inherit aesthetics from the ggplot() call, so any mapping it needs must be given in its own aes().
For the facet labels (strain and sex), see the labeller argument of facet_wrap().

mousew_summary <- mousew %>%
  group_by(strain, sex) %>%
  summarise(median_weight = median(weight),
            mean_weight = mean(weight),
            sd_weight = sd(weight))
mousew %>%
  ggplot(aes(x = weight)) +
  geom_density(alpha = 0.5, fill = "lightgray") +
  # linewidth replaces size for lines (ggplot2 >= 3.4.0)
  geom_vline(data = mousew_summary, aes(xintercept = mean_weight),
             linetype = "dashed", linewidth = 1, show.legend = FALSE) +
  expand_limits(x = 0) +
  facet_wrap(~ strain + sex, scales = "free") +
  labs(title = "Weight density curves of mice per\nsex and strain",
       x = "Weight (in grams)",
       y = "Density")
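The labeller hint can be sketched as follows. This is a minimal illustration, not the tutorial's own solution: the data frame toy holds invented values (it only mimics the three columns of mousew), and label_both is the ggplot2 labeller that prefixes each facet label with the name of its variable.

```r
library(ggplot2)

# Hypothetical toy data with the same columns as mousew (values are invented)
toy <- data.frame(
  strain = rep(c("A", "B"), each = 6),
  sex    = rep(c("f", "m"), times = 6),
  weight = c(20.1, 24.3, 21.0, 25.2, 19.6, 23.8,
             22.4, 27.9, 23.1, 28.5, 21.7, 27.2)
)

p_labelled <- ggplot(toy, aes(x = weight)) +
  geom_density(fill = "lightgray") +
  # label_both writes "strain: A" and "sex: f" instead of "A" and "f"
  facet_wrap(~ strain + sex, labeller = label_both)

p_labelled
```

With the default labeller (label_value) only the values are shown, which can be ambiguous when two faceting variables are combined.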
mousew %>%
  ggplot(aes(x = weight, fill = sex)) +
  geom_density(alpha = 0.5) +
  # linewidth replaces size for lines (ggplot2 >= 3.4.0)
  geom_vline(data = mousew_summary, aes(xintercept = mean_weight, color = sex),
             linetype = "dashed", linewidth = 1, show.legend = FALSE) +
  facet_grid(strain ~ .) +
  expand_limits(x = 0) +
  ggtitle("Weight density curves of mice per\nsex and strain") +
  xlab("Weight (in grams)") +
  ylab("Density")
mousew %>%
  ggplot(aes(x = sex, y = weight, fill = sex)) +
  geom_boxplot() +
  expand_limits(y = 0)
geom_col() requires a y aesthetic mapped to the continuous variable.
Use the alpha parameter to add some transparency; it will help to spot inconsistencies.
mousew %>%
  ggplot(aes(x = sex, y = weight, fill = strain)) +
  # alpha belongs in the geom; passed to ggplot() it has no effect
  geom_col(alpha = 0.5)
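What the transparency reveals can be checked numerically. The sketch below uses an invented data frame toy (hypothetical values, same columns as mousew) and layer_data() to inspect what geom_col() actually draws: one rectangle per row, stacked, so the height of each bar is the sum of the individual weights rather than a mean.

```r
library(ggplot2)

# Hypothetical toy data with the same columns as mousew (values are invented)
toy <- data.frame(
  strain = rep(c("A", "B"), each = 6),
  sex    = rep(c("f", "m"), times = 6),
  weight = c(20.1, 24.3, 21.0, 25.2, 19.6, 23.8,
             22.4, 27.9, 23.1, 28.5, 21.7, 27.2)
)

p_stack <- ggplot(toy, aes(x = sex, y = weight, fill = strain)) +
  geom_col(alpha = 0.5)

# One rectangle per row of the data, stacked on top of each other
bars <- layer_data(p_stack)

# Top of each stack (per x position) versus the summed weight per sex
stack_tops <- tapply(bars$ymax, as.numeric(bars$x), max)
sex_totals <- tapply(toy$weight, toy$sex, sum)
```

Comparing stack_tops with sex_totals shows that the bars grow with the number of mice, which is why the data are summarised before plotting in the next step.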
See the position argument of geom_col(): the default is "stack"; the alternatives are "dodge" (side by side) and "fill" (proportions).
mousew %>%
  group_by(strain, sex) %>%
  summarise(mean_weight = mean(weight), sd_weight = sd(weight)) %>%
  ggplot(aes(x = sex, y = mean_weight, fill = strain)) +
  geom_col(position = "dodge")
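For comparison, here is a minimal sketch of the "fill" alternative. The data frame toy_means contains invented group means (not computed from mousew). With position = "fill", every stack is rescaled so that it reaches exactly 1, and each segment shows a share rather than an absolute value.

```r
library(ggplot2)

# Hypothetical group means (invented values, not computed from mousew)
toy_means <- data.frame(
  strain      = c("A", "A", "B", "B"),
  sex         = c("f", "m", "f", "m"),
  mean_weight = c(21, 25, 23, 28)
)

p_fill <- ggplot(toy_means, aes(x = sex, y = mean_weight, fill = strain)) +
  geom_col(position = "fill")

# After rescaling, the top of every stack sits at 1
fill_bars <- layer_data(p_fill)
fill_tops <- tapply(fill_bars$ymax, as.numeric(fill_bars$x), max)

p_fill
```

Because the absolute weights are lost, "dodge" is the more natural choice for comparing means, as in the solution above.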
Add error bars with geom_errorbar(), using the standard deviations (sd).
position = "dodge" calls the position_dodge() function. Look at the help of this function: one example describes how to align narrower elements such as error bars.
Use the width argument of geom_errorbar() to reduce the default, which is too large.
mousew %>%
  group_by(strain, sex) %>%
  summarise(mean_weight = mean(weight), sd_weight = sd(weight)) %>%
  ggplot(aes(x = sex, y = mean_weight, fill = strain)) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymin = mean_weight - sd_weight, ymax = mean_weight + sd_weight),
                width = 0.25, position = position_dodge(width = 0.9))
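Why width = 0.9 in position_dodge()? By default geom_col() draws bars that are 90% of the data resolution wide, so dodging the narrower error bars over the same 0.9 width puts them at the centre of their bars. The sketch below checks this with layer_data(); toy_summary is an invented summary table (hypothetical values), standing in for the summarised mousew data.

```r
library(ggplot2)

# Hypothetical summary table (invented values, not computed from mousew)
toy_summary <- data.frame(
  strain      = c("A", "A", "B", "B"),
  sex         = c("f", "m", "f", "m"),
  mean_weight = c(21, 25, 23, 28),
  sd_weight   = c(1.5, 2.0, 1.2, 2.4)
)

p_err <- ggplot(toy_summary, aes(x = sex, y = mean_weight, fill = strain)) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymin = mean_weight - sd_weight, ymax = mean_weight + sd_weight),
                width = 0.25, position = position_dodge(width = 0.9))

# Horizontal centres of the bars (layer 1) and of the error bars (layer 2)
bar_centres      <- as.numeric(layer_data(p_err, 1)$x)
errorbar_centres <- as.numeric(layer_data(p_err, 2)$x)

p_err
```

If position = "dodge" were used for the error bars instead, they would be dodged over their own width of 0.25 and would no longer line up with the bars.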