Previous steps

If you would like to return to information from the previous session, please click here.

Context

As seen in the previous example, coral reef monitoring data often include several nested levels: spatial replication (e.g. sites and stations) and species grouped within functional or higher-level taxonomic groups. Visualising data with ggplot2 provides a way to retain and display these different levels.

Having data in a “tidy” format facilitates these types of visualisation. Knowing how to “map” aesthetics (e.g. colour, fill) to variables and how to use facets provides ways of separating out the levels within the data.

As part of this lesson, we will learn the basics of mapping aesthetics and faceting as part of the Visualisation of Status & Trends module.
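To make the idea of tidy data and mapped aesthetics concrete, here is a minimal sketch. The data frame `toy_cover` and its values are made up for illustration and are not part of the Kenya dataset; only the column names (Year, Site, mean_cover) follow the lesson.

```r
library(ggplot2)

# Hypothetical tidy data: one row per Year x Site observation
toy_cover <- data.frame(
  Year       = rep(2015:2017, times = 2),
  Site       = rep(c("Site A", "Site B"), each = 3),
  mean_cover = c(30, 28, 25, 40, 42, 45)
)

# Map Year and mean_cover to position, and Site to colour
p <- ggplot(toy_cover, aes(x = Year, y = mean_cover, colour = Site)) +
  geom_line() +
  geom_point()

# Adding a facet separates the same levels into one panel per Site
p_faceted <- p + facet_grid(Site ~ .)
```

Because each variable sits in its own column, changing what is mapped to colour or to the facets only requires changing a column name, not reshaping the data.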

Mapping aesthetics

Taking our example of the coral percent cover data from Kenya, we will map the colour aesthetic to Site:

    # visualise
    percent_cover_kenya_time_series %>%
      dplyr::filter(level1_code %in% groups_of_interest) %>%
      group_by(Year,
               Site) %>%
      summarise(sd         =         sd %>%   sd(na.rm = TRUE),
                mean_cover = mean_cover %>% mean(na.rm = TRUE)) %>%
      mutate(sd_upper = mean_cover + sd,
             sd_lower = mean_cover - sd) %>%
    ggplot(aes(Year, mean_cover)) +
      geom_line(aes(colour = Site),
                alpha = 0.7) +
      geom_point(size = 2.0, alpha = 0.7) +
      geom_errorbar(aes(ymax = sd_upper,
                        ymin = sd_lower,
                        colour = Site),
                    alpha = 0.7) +
      theme_bw() +
      ylab("Mean percent cover") +
      ggtitle("Kenya") +
      theme(plot.title = element_text(hjust = 0.5))

The result looks something like this:

This is not the most beautiful plot on the planet. Because the mean percent cover of several sites overlaps, it is difficult to follow the trends at individual sites.

In some of the later Exercises we will go through some additional ways to improve the visualisation through additional mapping of aesthetics and control over elements of the plot.
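The group_by() and summarise() steps in the code above collapse the data to one row per Year and Site before plotting. A minimal sketch of that step on made-up numbers is shown below; `toy_stations` and its values are hypothetical, and the calculation simply mirrors the lesson code (the mean of mean_cover and the standard deviation of the sd column).

```r
library(dplyr)

# Hypothetical station-level rows for one year at two sites
toy_stations <- data.frame(
  Year       = 2015,
  Site       = c("Site A", "Site A", "Site B", "Site B"),
  Station    = c("A1", "A2", "B1", "B2"),
  mean_cover = c(20, 30, 40, 50),
  sd         = c(4, 6, 5, 9)
)

toy_sites <- toy_stations %>%
  group_by(Year, Site) %>%
  # one row per Year x Site: spread of the sd column, mean of mean_cover
  summarise(sd         = sd(sd, na.rm = TRUE),
            mean_cover = mean(mean_cover, na.rm = TRUE),
            .groups    = "drop") %>%
  # upper and lower limits used for the error bars
  mutate(sd_upper = mean_cover + sd,
         sd_lower = mean_cover - sd)
```

Here Site A ends up with a mean_cover of 25 and Site B with 45, and the Station column disappears, which is why station-level variation is no longer visible in the plot.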

Using facets

Given the overlap in the percent cover of the sites, we can use facets to better visualise individual site trends.

This takes just one additional line of code, facet_grid(Site ~ .):

    percent_cover_kenya_time_series %>%
      dplyr::filter(level1_code %in% groups_of_interest) %>%
      group_by(Year,
               Site) %>%
      summarise(sd         =         sd %>%   sd(na.rm = TRUE),
                mean_cover = mean_cover %>% mean(na.rm = TRUE)) %>%
      mutate(sd_upper = mean_cover + sd,
             sd_lower = mean_cover - sd) %>%
    ggplot(aes(Year, mean_cover)) +
      geom_line(aes(colour = Site),
                alpha = 0.7) +
      geom_point(size = 2.0, alpha = 0.7) +
      geom_errorbar(aes(ymax = sd_upper,
                        ymin = sd_lower,
                        colour = Site),
                    alpha = 0.7) +
      theme_bw() +
      facet_grid(Site ~ .) +
      ylab("Mean percent cover") +
      ggtitle("Kenya") +
      theme(plot.title = element_text(hjust = 0.5))

Which looks something like this:

One drawback of this particular visualisation is that Site is now encoded twice: by the colour of the lines and by the separate facet panels. That is, the Site legend (and colour) is redundant. It might make more sense to colour by Station instead, to visualise the variation at the Station level:

    # visualise
    percent_cover_kenya_time_series %>%
      dplyr::filter(level1_code %in% groups_of_interest) %>%
      mutate(sd_upper = mean_cover + sd,
             sd_lower = mean_cover - sd) %>%
    ggplot(aes(Year, mean_cover)) +
      geom_line(aes(colour = Station),
                alpha = 0.7) +
      geom_point(size = 1.0, alpha = 0.7) +
      theme_bw() +
      facet_grid(Site ~ .) +
      ylab("Mean percent cover") +
      ggtitle("Kenya") +
      theme(strip.background = element_blank(),
            strip.text.y     = element_text(angle = 0),
            plot.title       = element_text(hjust = 0.5),
            legend.position  = "none")

Which now gives us something that looks like this:

To produce this plot, we removed the group_by() and summarise() steps to retain the variation among Stations. We also removed the error bars (for clarity; they could be added back in), hid the legend, and rotated the Site names in the facet strips to make them easier to read.
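As a rough sketch of how the error bars could be added back at the Station level, the example below uses a made-up data frame, `toy_station_series`, whose columns follow the lesson (Year, Site, Station, mean_cover, sd). The values and the error bar `width` are assumptions chosen for illustration only.

```r
library(ggplot2)

# Hypothetical station-level time series: two stations at each of two sites
toy_station_series <- data.frame(
  Year       = rep(2015:2017, times = 4),
  Site       = rep(c("Site A", "Site B"), each = 6),
  Station    = rep(c("A1", "A2", "B1", "B2"), each = 3),
  mean_cover = c(20, 22, 25, 30, 28, 27, 40, 43, 41, 50, 47, 45),
  sd         = c(4, 3, 5, 6, 5, 4, 5, 6, 4, 9, 7, 8)
)
toy_station_series$sd_upper <- toy_station_series$mean_cover + toy_station_series$sd
toy_station_series$sd_lower <- toy_station_series$mean_cover - toy_station_series$sd

# Colour by Station, one panel per Site, with error bars per Station
p_stations <- ggplot(toy_station_series,
                     aes(Year, mean_cover, colour = Station)) +
  geom_line(alpha = 0.7) +
  geom_point(size = 1.0, alpha = 0.7) +
  geom_errorbar(aes(ymin = sd_lower, ymax = sd_upper),
                width = 0.2, alpha = 0.7) +
  facet_grid(Site ~ .) +
  theme_bw() +
  theme(legend.position = "none")
```

Mapping colour in the top-level aes() means the lines, points and error bars for a Station all share one colour. With many stations per site the error bars may clutter the panels, which is presumably why they were left out above.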

In the Exercises below we will practise using different aesthetic “mappings” and facets to produce data visualisations that can communicate variations and different levels in the data while maintaining a “tidy” data structure.

Next Steps

As there are so many different aspects that can be controlled in ggplot2, we will go through some additional examples that build on the basic data manipulation skills and teach a bit more of the “grammar of graphics”.

Exercises for Percent Cover data can be found here.

Exercises for Fish transect data can be found here.