The compilation of regional data on the status and trends of coral reefs often involves a number of different taxonomic levels, groupings and taxonomic standards. One way to standardise taxonomy for benthic classes (i.e. corals, macroalgae, sessile invertebrates) and fishes is to use taxonomic databases.
These databases, such as the Catalogue of Life and WoRMS, provide higher-level taxonomic information and naming authorities, and can be used to track changes in taxonomy over time.
This wiki page provides an overview of how to link with these taxonomic databases using the package taxize.
The package taxize is available on the ROpenSci platform and can be installed using devtools::install_github():
# install from ropensci for taxonomic databases
install.packages("devtools")
devtools::install_github("ropensci/taxize")
devtools::install_github("ropensci/taxizesoap")
devtools::install_github("cran/XMLSchema")
The other two packages, taxizesoap and XMLSchema, provide further functionality for accessing the databases.
Most often, users will want to obtain the higher taxonomy for a list of taxa (e.g. taken from a data object of monitoring data). This helps in summarising data at Family, Order or another higher taxonomic category when records were made at different taxonomic levels (e.g. species, genera).
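When the taxa come from monitoring data, the vector of names can be derived from the data object rather than typed by hand. The following is a minimal sketch only: the data frame monitoring_data, its columns and its values are hypothetical and typed in for illustration.

```r
# hypothetical monitoring data; names and values are illustrative only
monitoring_data <- data.frame(
  site         = c("A", "A", "B", "B", "B"),
  benthic_name = c("Acropora", "Favites", "Acropora", NA, "Galaxea"),
  cover        = c(12.5, 3.0, 20.1, 0.5, 4.2),
  stringsAsFactors = FALSE
)

# keep the unique, non-missing names and sort them alphabetically
recorded_names <- monitoring_data$benthic_name
taxa_from_data <- sort(unique(recorded_names[!is.na(recorded_names)]))
taxa_from_data
```

A vector built this way can be passed to the lookup in the same way as a hand-typed one.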
To provide an example, we first create a list of taxa of interest:
# create taxa of interest
taxa_to_get <-
c("Acanthastrea",
"Acropora",
"Astreopora",
"Cespitularia",
"Coscinaraea",
"Cyphastrea",
"Dendronephthya",
"Diploastrea",
"Dipsastraea",
"Echinopora",
"Favia",
"Favites",
"Fungia",
"Galaxea",
"Goniastrea",
"Goniopora")
We then create an empty object to hold the results and loop through the list to extract the taxonomy:
# load the packages used in this and the following steps
library(taxize)    # classification()
library(tibble)    # tibble()
library(dplyr)     # mutate(), bind_rows(), filter(), select(), distinct()
library(tidyr)     # spread()
library(magrittr)  # %<>% assignment pipe
# create empty object to hold results
wio_benthic_taxa_eol <- tibble()
# loop to get taxa
for(i in seq_along(taxa_to_get)) {
  # get classification from WoRMS (db = "eol" was the alternative tried)
  dat <-
    classification(sci_id = taxa_to_get[i],
                   db = "worms",
                   rows = 1)
  # skip taxa for which no classification table was returned
  if (!is.data.frame(dat[[1]])) next
  # add identifier
  dat <-
    dat[[1]] %>%
    mutate(benthic_name = taxa_to_get[i])
  # harvest results
  wio_benthic_taxa_eol %<>%
    bind_rows(dat)
}
Note that in this example, we are using the WoRMS database instead of the Catalogue of Life (i.e. db = "worms"). The rows = 1 parameter automatically selects the first entry when a name returns several matches (e.g. for a genus like Pocillopora, there will be numerous entries related to individual species).
To tidy the information extracted from the taxonomic databases, we keep only the taxonomic ranks of interest and reshape the result to wide format (i.e. so that each taxonomic rank is a column):
# set ranks of interest
ranks_of_interest <-
c("Kingdom",
"Phylum",
"Class",
"Subclass",
"Order",
"Suborder",
"Family",
"Genus")
# filter ranks of interest
wio_benthic_taxa_eol %<>%
dplyr::filter(rank %in% ranks_of_interest)
# set to wide
wio_benthic_taxa_eol %<>%
dplyr::select(name,
rank,
benthic_name) %>%
spread(rank, name)
# put in order
wio_benthic_taxa_eol %<>%
dplyr::select(Kingdom,
Phylum,
Class,
Subclass,
Order,
Suborder,
Family,
Genus,
# species,
benthic_name) %>%
distinct()
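To make the long-to-wide step concrete, the sketch below applies the same reshaping to a small hand-typed table. The rows mimic the name and rank columns of the classification output but were not retrieved from WoRMS and may not match its current classification. It uses tidyr::pivot_wider(), the current tidyr replacement for spread(); the object names long_taxa and wide_taxa are illustrative.

```r
library(dplyr)
library(tidyr)

# hand-typed long-format rows for illustration; not retrieved from WoRMS
long_taxa <- data.frame(
  name = c("Animalia", "Cnidaria", "Anthozoa", "Scleractinia",
           "Acroporidae", "Acropora",
           "Animalia", "Cnidaria", "Anthozoa", "Scleractinia",
           "Merulinidae", "Favites"),
  rank = rep(c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus"),
             times = 2),
  benthic_name = rep(c("Acropora", "Favites"), each = 6),
  stringsAsFactors = FALSE
)

# one row per benthic_name, one column per rank, ranks ordered high to low
wide_taxa <-
  long_taxa %>%
  pivot_wider(names_from = rank, values_from = name) %>%
  select(Kingdom, Phylum, Class, Order, Family, Genus, benthic_name)

wide_taxa
```

Each benthic_name ends up on a single row, which is the shape needed for joining back to the monitoring data.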
After saving the intermediate data object as an *.rda file, it can be linked to the main data object (e.g. by keeping the benthic_name column as the join key) in a separate script. This keeps the extraction of the taxonomic information separate from other data grooming tasks.
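The save-then-link workflow can be sketched as follows. This is an illustration under stated assumptions, not the project's actual script: the objects benthic_taxonomy and monitoring_data, the file name and all values are hypothetical, and a temporary directory stands in for the project's data folder. Base R merge() is used here; dplyr::left_join() would work equally well.

```r
# stand-in for the tidied taxonomy table; values typed in for illustration
benthic_taxonomy <- data.frame(
  Family       = c("Acroporidae", "Merulinidae"),
  Genus        = c("Acropora", "Favites"),
  benthic_name = c("Acropora", "Favites"),
  stringsAsFactors = FALSE
)

# script 1: save the intermediate object (temporary location used here)
rda_path <- file.path(tempdir(), "benthic_taxonomy.rda")
save(benthic_taxonomy, file = rda_path)
rm(benthic_taxonomy)

# script 2: load it again; load() restores the object under its saved name
load(rda_path)

# hypothetical monitoring data sharing the benthic_name column
monitoring_data <- data.frame(
  site         = c("A", "A", "B"),
  benthic_name = c("Acropora", "Favites", "Turf algae"),
  cover        = c(12.5, 3.0, 30.0),
  stringsAsFactors = FALSE
)

# left join: keep every monitoring record, matched or not
monitoring_with_taxa <- merge(monitoring_data, benthic_taxonomy,
                              by = "benthic_name", all.x = TRUE)

# summarise cover by Family; records without a Family are dropped here
family_cover <- aggregate(cover ~ Family, data = monitoring_with_taxa,
                          FUN = sum)
family_cover
```

Records that have no match in the taxonomy table keep NA in the taxonomic columns after the join, which makes unmatched names easy to find and review.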
Next, we will have a look at extracting additional information from external databases such as FishBase and the IUCN Red List.