Imagine you wanted to build a database of traits. You might start by compiling data from existing datasets, but you'd quickly find that there are many ways to name and measure the same trait, that different studies use different units, and that some use an outdated name for a species or taxon.
The `traits.build` package provides a workflow for harmonising data from disconnected primary sources. It arises from the AusTraits project (austraits.org). In 2023 it was spun out of the `austraits.build` repository as a separate package.
The goals of this package are to:
- Enable users to create open-source, harmonised, reproducible databases from disparate datasets.
- Provide a fully transparent workflow, where all decisions on how the data are handled are exposed.
- Offer a relational database structure that fully documents the contextual data essential to interpreting ecological data.
- Offer a straightforward, robust template for building a trait dictionary.
- Offer a database structure that is flexible enough to accommodate the complexities inherent to ecological data.
- Offer a database structure that is underlain by a documented ontology, ensuring each database field is interpretable and interoperable with other databases and data structures.
- Have no dependencies on proprietary software and no costs to set up and maintain (beyond person time).
To harmonise diverse data sources, we use a reproducible workflow that applies the changes each source needs before it can be incorporated into a harmonised compilation. Such changes include restructuring datasets, renaming variables, changing variable units, and updating taxon names.
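The kinds of changes listed above can be sketched in a few lines of base R. This is a minimal illustration only, not the `traits.build` API: the function `harmonise_source()`, the column names, the trait values, and the `taxon_updates` lookup are all hypothetical and invented for this example.

```r
# Hypothetical raw data from one source (names and values invented for illustration).
raw <- data.frame(
  species = c("Eucalyptus maculata", "Acacia longifolia"),
  LeafArea_cm2 = c(12.5, 3.0),
  stringsAsFactors = FALSE
)

# Hypothetical lookup mapping outdated taxon names to current ones.
taxon_updates <- c("Eucalyptus maculata" = "Corymbia maculata")

# Illustrative helper (not part of traits.build): applies the four kinds of
# change to a single source and returns a long-format table.
harmonise_source <- function(data, taxon_updates) {
  # Update taxon names found in the lookup; keep all others unchanged.
  matched <- taxon_updates[data$species]
  taxon_name <- unname(ifelse(is.na(matched), data$species, matched))

  # Restructure to long format, rename the variable, and convert cm2 to mm2.
  data.frame(
    taxon_name = taxon_name,
    trait_name = "leaf_area",
    value = data$LeafArea_cm2 * 100,
    unit = "mm2",
    stringsAsFactors = FALSE
  )
}

harmonised <- harmonise_source(raw, taxon_updates)
print(harmonised)
```

Keeping every change inside a scripted function, rather than editing the source file by hand, is what makes each decision visible and repeatable.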
Working with `traits.build` assumes the following background:
- Familiarity with the R programming language, covered in R for Data Science.
- Data science workflow management techniques.
- How to write functions to prepare data, analyse data, and summarise results in a data analysis project.
- Appreciation of the `traits.build` workflow, including the required file structure.
There are multiple ways to install the `traits.build` package itself, and both the latest release and the development version are available.
| Type | Source | Command |
|---|---|---|
| Release | CRAN | coming |
| Development | GitHub | `remotes::install_github("traitecoevo/traits.build")` |
- User manual: in-depth discussion about how to use `traits.build`.
- Reference website: formal documentation of all user-side functions.
Please read the help guide to learn how best to ask for help using `traits.build`.
- Please note that the package follows the Contributor Code of Conduct for the AusTraits projects. By contributing to this project you agree to abide by its terms.
A publication describing the `traits.build` workflow:
Wenk E, Bal P, Coleman D, Gallagher R, Yang S, Falster D (2024) Traits.build: A data model, workflow and R package for building harmonised ecological trait databases. Ecological Informatics 83: 102773. DOI: 10.1016/j.ecoinf.2024.102773
A publication describing the largest database built with the `traits.build` workflow:
Falster D, Gallagher R, Wenk E, et al. (2021) AusTraits, a curated plant trait database for the Australian flora. Scientific Data 8: 254. DOI: 10.1038/s41597-021-01006-6
Funding: The AusTraits project received investment (https://doi.org/10.47486/TD044, https://doi.org/10.47486/DP720) from the Australian Research Data Commons (ARDC). The ARDC is funded by the National Collaborative Research Infrastructure Strategy (NCRIS).