Add DataPipeline run to examples
claireh93 committed Aug 28, 2023
1 parent 5e546f0 commit fb13f27
Showing 4 changed files with 161 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/make.jl
@@ -14,6 +14,7 @@ makedocs(
"Diversity" => "diversity.md",
"Examples" => "examples.md",
"Africa" => "africa.md",
"Data Pipeline" => "pipeline.md"
]
],
strict=true,
18 changes: 18 additions & 0 deletions docs/src/pipeline.md
@@ -0,0 +1,18 @@
# Data pipeline

**[DataPipeline.jl](https://github.com/FAIRDataPipeline/DataPipeline.jl)** is a Julia package that provides functionality for the [FAIR Data Pipeline](https://www.fairdatapipeline.org/). The pipeline is intended to track the provenance of FAIR (findable, accessible, interoperable and reusable) data. We use it in the example under the `examples/pipeline` folder, which runs a single species across Africa.

See the [installation guide](https://www.fairdatapipeline.org/docs/data_registry/installation/) for details on what to install to set up the Data Pipeline. Once installed, the workflow is to initialise the pipeline in the repository, pull in the external data needed for the simulation (as described in `AfricaRun.yaml`), and run the simulation. The output, which is produced by the corresponding run file `AfricaRun.jl`, is also described in `AfricaRun.yaml`. The output and its provenance can then be pushed back to the online [data registry](https://data.fairdatapipeline.org/) to be inspected further.

```
# Initialise the data pipeline in the git repository
fair init
# Pull in any external data described in the yaml
fair pull .\examples\pipeline\AfricaRun.yaml
# Run the simulation described in the yaml
fair run .\examples\pipeline\AfricaRun.yaml
# Stage the code run using the unique identifier
fair add <code-run>
# Push the run and corresponding metadata back to the online registry
fair push
```
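
Inside the run script, the pipeline is driven through a handle. The sketch below condenses the calls that `AfricaRun.jl` makes (`DataPipeline.initialise`, `link_read!`, `link_write!` and `DataPipeline.finalise`) with the model code removed. It assumes the script is launched through `fair run` against a running local registry, so it will not work as a standalone script, and the variable names `input_path` and `output_path` are only illustrative.

```julia
using DataPipeline

# Open a pipeline session for this code run
handle = DataPipeline.initialise()

# Resolve an input named in AfricaRun.yaml to a local file path
input_path = link_read!(handle, "AfricaModel/WorldClim")

# ... run the model using the file at `input_path` ...

# Request a path for an output declared under `write:` in AfricaRun.yaml
output_path = link_write!(handle, "Africa-plot")

# ... write the output file to `output_path`, e.g. Plots.pdf(output_path) ...

# Close the session so that the inputs and outputs of the run are recorded
DataPipeline.finalise(handle)
```

The names passed to `link_read!` and `link_write!` need to match the entries in the yaml, since that file is where the inputs and outputs of the run are described.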
105 changes: 105 additions & 0 deletions examples/pipeline/AfricaRun.jl
@@ -0,0 +1,105 @@
#### SINGLE SPECIES ####
# Code to run single species across Africa with WorldClim data.
using EcoSISTEM
using EcoSISTEM.ClimatePref
using EcoSISTEM.Units
using RasterDataSources
using AxisArrays
using Unitful
using Unitful.DefaultSymbols
using StatsBase
using Plots
using DataPipeline

# Initialise datapipeline
handle = DataPipeline.initialise()

# Download temperature and precipitation data
path = link_read!(handle, "AfricaModel/WorldClim")
newpath = unzip(path)
world = readbioclim(newpath)
africa_temp = world.array[-25°.. 50°, -35° .. 40°, 1]
bio_africa = uconvert.(K, africa_temp .* °C)
bio_africa = Worldclim_bioclim(AxisArray(bio_africa, AxisArrays.axes(africa_temp)))
africa_water = world.array[-25°.. 50°, -35° .. 40°, 12] .* mm
africa_water = Worldclim_bioclim(AxisArray(africa_water, AxisArrays.axes(africa_temp)))
bio_africa_water = WaterBudget(africa_water)

# Find which grid cells are land
active = Array{Bool, 2}(.!isnan.(bio_africa.array))

heatmap(africa_temp')

# Set up initial parameters for ecosystem
numSpecies = 1; grid = size(active); req = 0.1mm; individuals = 0; area = 64e6km^2; totalK = 1000.0kJ/km^2

# Set up how much water each species consumes
energy_vec = WaterRequirement(fill(req, numSpecies))

# Set rates for birth and death
birth = 0.6/year
death = 0.6/year
longevity = 1.0
survival = 0.2
boost = 1.0
# Collect model parameters together
param = EqualPop(birth, death, longevity, survival, boost)

# Create kernel for movement
kernel = fill(GaussianKernel(15.0km, 10e-10), numSpecies)
movement = AlwaysMovement(kernel, Torus())

# Create species list, including their temperature preferences, seed abundance and native status
opts = fill(280.0K, numSpecies)
vars = fill(10.0K, numSpecies)
traits = GaussTrait(opts, vars)
native = fill(true, numSpecies)
abun = fill(div(individuals, numSpecies), numSpecies)
sppl = SpeciesList(numSpecies, traits, abun, energy_vec,
    movement, param, native)

# Create abiotic environment - with temperature and water resource
abenv = bioclimAE(bio_africa, bio_africa_water, active)

# Set relationship between species and environment (gaussian)
rel = Gauss{typeof(1.0K)}()

# Create ecosystem and fill every active grid square with an individual
eco = Ecosystem(sppl, abenv, rel)
rand_start = findall(active)
for i in rand_start
    eco.abundances.grid[1, i[1], i[2]] += 1
end

# Run simulation
times = 10years; timestep = 1month; record_interval = 1month; repeats = 1
lensim = length(0years:record_interval:times)
abuns = zeros(Int64, numSpecies, prod(grid), lensim)
@time simulate_record!(abuns, eco, times, record_interval, timestep);

# Reshape abundances for plotting
abuns = reshape(abuns[1, :, :, 1], grid[1], grid[2], lensim)

# Plot start and end abundances, next to temperature and rainfall
africa_startabun = Float64.(abuns[:, :, 1])
africa_startabun[.!(active)] .= NaN
africa_endabun = Float64.(abuns[:, :, end])
africa_endabun[.!(active)] .= NaN
heatmap(africa_startabun', clim = (0, maximum(abuns)),
    background_color = :lightblue, background_color_outside = :white,
    grid = false, color = cgrad(:algae, scale = :exp),
    layout = (@layout [a b; c d]))
heatmap!(africa_endabun', clim = (0, maximum(abuns)),
    background_color = :lightblue, background_color_outside = :white,
    grid = false, color = cgrad(:algae, scale = :exp),
    subplot = 2)

africa_temp = world.array[-25°.. 50°, -35° .. 40°, 1]
africa_water = world.array[-25°.. 50°, -35° .. 40°, 12]
heatmap!(africa_temp', grid = false, subplot = 3)
heatmap!(africa_water', grid = false, subplot = 4)

path = link_write!(handle, "Africa-plot")
Plots.pdf(path)

DataPipeline.finalise(handle)
37 changes: 37 additions & 0 deletions examples/pipeline/AfricaRun.yaml
@@ -0,0 +1,37 @@
run_metadata:
  default_input_namespace: claireh93
  description: Africa model inputs
  script: |
    julia --project=examples examples/pipeline/AfricaRun.jl
register:
- namespace: UCDavis
  full_name: University of California Davis
  website: https://ror.org/05rrcem69
- namespace: GBIF
  full_name: Global Biodiversity Information Facility
  website: https://ror.org/05fjyn938

- external_object: AfricaModel/WorldClim
  namespace_name: UCDavis
  root: https://biogeo.ucdavis.edu/
  path: data/worldclim/v2.1/base/wc2.1_10m_bio.zip
  title: WorldClim Bioclimatic variables
  description: Bioclimatic variables are derived from the monthly temperature and rainfall values in order to generate more biologically meaningful variables.
  identifier: https://doi.org/10.1002/joc.5086
  file_type: zip
  release_date: 2017-03-28T12:00
  version: "1.0.0"
  primary: True
  authors:
  - https://ror.org/05rrcem69

write:
- data_product: Africa-plot
  description: Plot start and end abundances, next to temperature and rainfall
  file_type: pdf
  use:
    data_product: AfricaModel/Africa-plot

