As seen in the previous example, coral reef monitoring data often include several levels of spatial replication, as well as species nested within functional or higher-level taxonomic groups.
Visualising data using ggplot2 provides a way to retain and visualise different levels within the data.
Having data in a “tidy” format facilitates these types of visualisation. Knowing how to “map” aesthetics (e.g. colours, fills) to certain variables, and how to use facets, provides ways of separating out levels within the data.
In this lesson, part of the Visualisation of Status & Trends module, we will learn the basics of mapping aesthetics and faceting.
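Before turning to the real data, the two ideas can be sketched on a toy example. The data frame below (toy_cover, with made-up values and site names) is purely hypothetical and only mimics the Year, Site and mean_cover columns used in this lesson; it is a minimal sketch of the difference between mapping a variable to colour and faceting by it, not the lesson's actual data.

```r
library(ggplot2)

# Hypothetical toy data in "tidy" form: one row per Site x Year observation
toy_cover <- data.frame(
  Year       = rep(2015:2018, times = 2),
  Site       = rep(c("Site A", "Site B"), each = 4),
  mean_cover = c(32, 30, 27, 29, 18, 21, 20, 24)
)

# Mapping an aesthetic: Site is mapped to colour, so both sites share one panel
p_colour <- ggplot(toy_cover, aes(Year, mean_cover)) +
  geom_line(aes(colour = Site)) +
  geom_point(aes(colour = Site))

# Faceting: Site defines the panels, so each site gets its own row
p_facet <- ggplot(toy_cover, aes(Year, mean_cover)) +
  geom_line() +
  geom_point() +
  facet_grid(Site ~ .)

print(p_colour)
print(p_facet)
```

Both plots are built from the same tidy data frame; only the way Site is used changes.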
Taking our example from the coral percent cover data from Kenya, we will plot mean percent cover over time for each site:
# visualise
# (assumes dplyr and ggplot2 are loaded, and that percent_cover_kenya_time_series
#  and groups_of_interest exist from the previous session)
percent_cover_kenya_time_series %>%
  dplyr::filter(level1_code %in% groups_of_interest) %>%
  # summarise to one row per Year x Site
  group_by(Year,
           Site) %>%
  summarise(sd = sd %>% sd(na.rm = TRUE),
            mean_cover = mean_cover %>% mean(na.rm = TRUE)) %>%
  # upper and lower limits for the error bars
  mutate(sd_upper = mean_cover + sd,
         sd_lower = mean_cover - sd) %>%
  ggplot(aes(Year, mean_cover)) +
  geom_line(aes(colour = Site),
            alpha = 0.7) +
  geom_point(size = 2.0, alpha = 0.7) +
  geom_errorbar(aes(ymax = sd_upper,
                    ymin = sd_lower,
                    colour = Site),
                alpha = 0.7) +
  theme_bw() +
  ylab("Mean percent cover") +
  ggtitle("Kenya") +
  theme(plot.title = element_text(hjust = 0.5))
The result looks something like this:
Not the most beautiful plot on the planet: because the mean percent cover at several sites overlaps, it is difficult to follow the trends at individual sites.
In some of the later Exercises we will go through additional ways to improve the visualisation through further mapping of aesthetics and control over elements of the plot.
Given the overlap in the percent cover of the sites, we can use facets to better visualise individual site trends. This is simply done with one additional line of code, facet_grid(Site ~ .):
percent_cover_kenya_time_series %>%
  dplyr::filter(level1_code %in% groups_of_interest) %>%
  group_by(Year,
           Site) %>%
  summarise(sd = sd %>% sd(na.rm = TRUE),
            mean_cover = mean_cover %>% mean(na.rm = TRUE)) %>%
  mutate(sd_upper = mean_cover + sd,
         sd_lower = mean_cover - sd) %>%
  ggplot(aes(Year, mean_cover)) +
  geom_line(aes(colour = Site),
            alpha = 0.7) +
  geom_point(size = 2.0, alpha = 0.7) +
  geom_errorbar(aes(ymax = sd_upper,
                    ymin = sd_lower,
                    colour = Site),
                alpha = 0.7) +
  theme_bw() +
  # the additional line: one panel (row) per Site
  facet_grid(Site ~ .) +
  ylab("Mean percent cover") +
  ggtitle("Kenya") +
  theme(plot.title = element_text(hjust = 0.5))
Which looks something like this:
One aspect of this particular visualisation is that we have “double faceted” by site: the lines are coloured by Site and each site is also drawn in its own panel. That is, the Site legend (and colour) is somewhat redundant. It might make more sense to colour by Station instead, to visualise the variation at the Station level:
# visualise
percent_cover_kenya_time_series %>%
  dplyr::filter(level1_code %in% groups_of_interest) %>%
  # no group_by()/summarise() here, so every Station is kept
  mutate(sd_upper = mean_cover + sd,
         sd_lower = mean_cover - sd) %>%
  ggplot(aes(Year, mean_cover)) +
  geom_line(aes(colour = Station),
            alpha = 0.7) +
  geom_point(size = 1.0, alpha = 0.7) +
  theme_bw() +
  facet_grid(Site ~ .) +
  ylab("Mean percent cover") +
  ggtitle("Kenya") +
  # horizontal Site labels, no strip background, no legend
  theme(strip.background = element_blank(),
        strip.text.y = element_text(angle = 0),
        plot.title = element_text(hjust = 0.5),
        legend.position = "none")
Which now gives us something that looks like this:
In order to produce this plot, we have removed the group_by() and summarise() statements to retain the variation in Station. We have also removed the error bars (for clarity, but they could be added back in) and the legend, and we have changed the orientation of the Site names to make them easier to read.
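As a rough illustration of how the error bars could be added back at the Station level, the sketch below reuses the sd_upper and sd_lower columns in a geom_errorbar() layer. The toy_stations data frame and its values are hypothetical stand-ins for the filtered Kenya data, and the width value is an arbitrary choice; unlike the plot above, Station is mapped to colour for all layers here.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the filtered Kenya data: one row per Station x Year
toy_stations <- data.frame(
  Year       = rep(2015:2017, times = 4),
  Site       = rep(c("Site A", "Site B"), each = 6),
  Station    = rep(c("A1", "A2", "B1", "B2"), each = 3),
  mean_cover = c(30, 28, 26, 35, 33, 34, 18, 20, 22, 15, 14, 17),
  sd         = c(4, 3, 5, 2, 4, 3, 3, 2, 4, 5, 3, 2)
)

p_station <- toy_stations %>%
  # upper and lower limits for the error bars
  mutate(sd_upper = mean_cover + sd,
         sd_lower = mean_cover - sd) %>%
  # colour is mapped to Station for lines, points and error bars
  ggplot(aes(Year, mean_cover, colour = Station)) +
  geom_line(alpha = 0.7) +
  geom_point(size = 1.0, alpha = 0.7) +
  geom_errorbar(aes(ymin = sd_lower,
                    ymax = sd_upper),
                width = 0.2, alpha = 0.7) +
  theme_bw() +
  facet_grid(Site ~ .) +
  ylab("Mean percent cover") +
  theme(strip.background = element_blank(),
        strip.text.y = element_text(angle = 0),
        legend.position = "none")

print(p_station)
```

With many stations per site the error bars can quickly clutter each panel, which is presumably why they were left out above.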
In the Exercises below we will practise using different aesthetic “mappings” and facets to produce data visualisations that can communicate variations and different levels in the data while maintaining a “tidy” data structure.
As there are so many different aspects that can be controlled in ggplot2, we want to go through some additional examples that will use the basic skills of data manipulation and teach a bit more of the “grammar of graphics”.
Exercises for Percent Cover data can be found here.
Exercises for Fish transect data can be found here.