Previous steps

If you would like to return to information from the previous section, please click here.

Context

Not all coral reef monitoring data can be treated the same. Data vary in the number of replicates, the accuracy of measurements, the level of taxonomic detail and the expertise of observers. As part of the Data Standards & Reproducible Research training module, this section gives an overview of the GCRMN data standards and the criteria GCRMN uses to assign quality levels. We will use these standards later in the course when learning how to deal with, for example, data of different taxonomic resolutions; they may also help to improve data capture and monitoring in your particular region.

The GCRMN data model emphasises data sharing and interoperability based on the FAIR (Findable, Accessible, Interoperable, Reusable) principles.

The general characteristics of FAIR principles can be summarised as follows:

As one can see in this table, a central element of the FAIR principles is the metadata, which provides the means to identify, access, interpret and reference the data. As part of this training course, we will provide a general framework and some tools for documenting coral reef monitoring data, including the use of version control and documenting the steps of data standardisation and cleaning in code.

In addition, we will introduce ways to link raw data, standardisation and cleaning with visualisation and reporting. This makes data analyses and reports reproducible and supports the Reusable principle.

Data Quality model

The GCRMN data quality model assigns scores of 1 to 3 that reflect team experience, sample design, level of evidence and documentation. The criteria include:

  1. number of sampling units, of specified types
  2. physical grain of unit measure (i.e. spacing/size issues)
  3. taxonomic identification level (how precise)
  4. experience and training of the monitoring team

Each variable (e.g. coral, algae, fish) is scored against these criteria in standardised tables to provide the data quality rating.
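As a rough illustration of how such per-variable ratings could be recorded alongside a dataset, the sketch below stores one score from 1 to 3 for each of the four criteria. The class name, field names and example scores are hypothetical and are not taken from the GCRMN tables. No overall score is computed, because how the individual scores are combined is not described here.

```python
from dataclasses import dataclass, asdict

# Hypothetical field names for the four criteria listed above.
CRITERIA = (
    "sampling_units",
    "sample_resolution",
    "taxonomic_resolution",
    "team_experience",
)


@dataclass(frozen=True)
class QualityRating:
    """One row of a quality table: a variable and its four criterion scores."""

    variable: str
    sampling_units: int
    sample_resolution: int
    taxonomic_resolution: int
    team_experience: int

    def __post_init__(self):
        # Every criterion must carry a score of 1, 2 or 3.
        for name in CRITERIA:
            score = getattr(self, name)
            if score not in (1, 2, 3):
                raise ValueError(f"{name} must be 1, 2 or 3, got {score!r}")


# Made-up example scores, for illustration only.
ratings = [
    QualityRating("coral", 3, 2, 2, 3),
    QualityRating("fish", 2, 2, 1, 3),
]

# Plain dictionaries are easy to write out as a table or as metadata.
table = [asdict(rating) for rating in ratings]
for row in table:
    print(row)
```

Validating the scores when the record is created means that an out-of-range value is caught at data entry rather than during later aggregation.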

For example, the criterion Number of sampling units, of specified types matters because the number of sampling units is fundamental to estimating variance at the spatial scale of the variable being measured.
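To see why the number of sampling units matters, the sketch below compares the standard error of mean coral cover estimated from three transects and from six. The cover values are invented for illustration, and the standard error of the mean is used here only as one common way to express the uncertainty; it is not presented as the GCRMN scoring rule.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two sampling units to estimate variance")
    return statistics.stdev(values) / math.sqrt(n)


# Hypothetical percent coral cover per transect at one site.
cover_3_transects = [32.0, 41.0, 25.0]
cover_6_transects = [32.0, 41.0, 25.0, 36.0, 29.0, 38.0]

print("SE with 3 transects:", round(standard_error(cover_3_transects), 2))
print("SE with 6 transects:", round(standard_error(cover_6_transects), 2))
```

With a single sampling unit no variance can be estimated at all, which is why the function refuses fewer than two values.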

The Sample Resolution criterion refers to the size of the unit of measure, such as the spacing between points in a point intersect transect, the size classes used for fish or the group classes used for counts.
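For a point intersect transect, the effect of point spacing can be worked out directly: fewer points mean a coarser smallest detectable step in percent cover. The sketch below assumes points are recorded at every multiple of the spacing along the transect; whether the start point is also recorded depends on the protocol. The function name and the transect dimensions are hypothetical.

```python
def point_intercept_resolution(transect_length_m, point_spacing_m):
    """Return (number of points, smallest step in percent cover).

    Assumes one point at every multiple of the spacing along the transect.
    """
    if transect_length_m <= 0 or point_spacing_m <= 0:
        raise ValueError("length and spacing must be positive")
    n_points = round(transect_length_m / point_spacing_m)
    if n_points < 1:
        raise ValueError("spacing is larger than the transect")
    # Each point contributes an equal share of the total cover estimate.
    return n_points, 100.0 / n_points


# Two hypothetical protocols on a 50 m transect.
print(point_intercept_resolution(50, 0.5))
print(point_intercept_resolution(50, 1.0))
```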

The taxonomic resolution of the data has important implications for how the data can be used to inform the state or trends of variables related to reef health. As GCRMN aggregates data across numerous scales of time and space, the finest scales of taxonomic resolution may not provide useful additional material for GCRMN reporting. To assign a data quality score for taxonomic resolution, GCRMN applies the following template:

Lastly, the experience of monitoring team members is important in determining the quality of the resulting data. This criterion aims to address both formal education and practical experience (e.g. fishermen monitoring fish, experienced and trained diver-monitors), taking into account years of experience and other such factors.

As part of the data quality model framework, GCRMN intends to review and update these standards over time. Because changes in technology, sampling techniques and the availability of information resources (e.g. for identifying taxonomic levels, photographic quadrat techniques) can influence data quality, it is important to provide a way to link these standards with particular data sets collected at a given period in time.

Documenting these criteria as part of the metadata is crucial for those aggregating data at regional or global levels and also provides a useful benchmark for constructing long time series.
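One possible way to keep this link explicit is to store the quality scores, the version of the standards they were assessed against and the sampling period together in a small metadata record. The sketch below is an assumption about how that could look: the field names, the dataset identifier and the version label are hypothetical and do not come from a GCRMN schema.

```python
import json
from datetime import date


def quality_metadata(dataset_id, standards_version, period_start, period_end, scores):
    """Bundle quality scores with the standards version and sampling period."""
    if period_end < period_start:
        raise ValueError("sampling period ends before it starts")
    return {
        "dataset_id": dataset_id,
        "quality_standards_version": standards_version,
        "sampling_period": {
            "start": period_start.isoformat(),
            "end": period_end.isoformat(),
        },
        "quality_scores": dict(scores),
    }


# Hypothetical dataset and scores, for illustration only.
record = quality_metadata(
    dataset_id="example-site-2019",
    standards_version="hypothetical-v1",
    period_start=date(2019, 3, 1),
    period_end=date(2019, 3, 15),
    scores={"coral": {"sampling_units": 3, "taxonomic_resolution": 2}},
)

# JSON text can be kept under version control next to the raw data.
metadata_json = json.dumps(record, indent=2, sort_keys=True)
print(metadata_json)
```

Because the record names the version of the standards, scores assessed under an earlier version remain interpretable after the standards are revised.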

For more information

GCRMN is considering registering with the GoFair Implementation Network and on the IODE Best Practices repository, as part of its data access objectives.

Next Steps

To better understand the relevant knowledge management and version control, have a look here.