A pipeline for generating maps of the species of flowers around a bee colony using drone imagery. The pipeline uses Agisoft Metashape to stitch the drone images together into an orthomosaic, various computer vision algorithms to segment each plant from its background, and a pre-trained random forest classifier to label each plant by its species.
Execute the following commands or download the latest release manually.
git clone https://github.com/beelabhmc/flower_map.git
The pipeline is written as a Snakefile which can be executed via Snakemake. We recommend using version 5.20.1 for reproducibility:
conda create -n snakemake -c bioconda -c conda-forge --no-channel-priority 'snakemake==5.20.1'
We highly recommend you install Snakemake via conda like this so that you can use the --use-conda flag when calling snakemake to let it automatically handle all dependencies of the pipeline. Otherwise, you must manually install the dependencies listed in the env files.
Per Snakemake's newer installation instructions, snakemake and its dependencies are better installed with mamba, a faster C++ reimplementation of conda. To install snakemake version 5.20.1 in a new conda environment called snakemake:
# if you have not yet installed mamba
# conda install -n base -c conda-forge mamba
mamba create -c conda-forge -c bioconda -n snakemake snakemake=5.20.1
As with any other conda environment, you can activate this one with
conda activate snakemake
Our Snakefile assumes that there is a metashape.lic file containing the Metashape License in the same directory as the run.bash script. Without this file, the pipeline will attempt to run Metashape unlicensed, which usually fails on import. To create the file, run the following command after activating your snakemake conda environment:
metashape_LICENSE="your-25-digit-license-key-goes-here" ./run.bash -U create_license
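Because a missing license file only surfaces once Metashape is invoked, it can help to check for the file before launching a long run. The following is a minimal sketch, assuming metashape.lic sits in the same directory as run.bash as described above. The function name check_metashape_license is hypothetical and not part of the pipeline, and treating an empty file as missing is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): verify that a non-empty
# metashape.lic exists in the directory that holds run.bash.
check_metashape_license() {
    # Directory containing run.bash; defaults to the current directory.
    local dir="${1:-.}"
    if [ -s "$dir/metashape.lic" ]; then
        echo "license file found: $dir/metashape.lic"
    else
        echo "license file missing or empty: $dir/metashape.lic" >&2
        return 1
    fi
}
```

One possible use is to gate the run on the check, for example: check_metashape_license . && ./run.bash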
- Activate snakemake via conda:
  conda activate snakemake
- Execute the pipeline
Locally:
./run.bash &
or on an SGE cluster:
qsub run.bash
Log files describing the output of the pipeline will be created within the output directory. The log file contains a basic description of the progress of each rule, while the qlog file is more detailed.
You must modify the config.yaml file to specify paths to your data. See our wiki for more information.
We recommend that you run snakemake --help to read about Snakemake's options. For example, to check that the pipeline will be executed correctly before you run it, you can call Snakemake with the -n -p -r flags. This is also a good way to familiarize yourself with the steps of the pipeline and their inputs and outputs (the latter of which are inputs to the first rule in the pipeline, i.e. the all rule).
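The dry run described above can be sketched as follows. In Snakemake 5.x, -n performs a dry run, -p prints the shell commands that would be executed, and -r prints the reason each job is triggered. This sketch assumes that run.bash forwards its arguments to snakemake, as this README states; the dry_run function name is hypothetical. Depending on how run.bash redirects output, the listing may end up in the log file rather than on the terminal.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ask Snakemake which jobs it would run, without
# executing any of them. The first argument is the runner script and
# defaults to ./run.bash.
dry_run() {
    local runner="${1:-./run.bash}"
    # -n: dry run, -p: print shell commands, -r: print the reason for each job
    "$runner" -n -p -r
}
```

Calling dry_run with no arguments from the repository root is equivalent to running ./run.bash -n -p -r.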
Note that Snakemake will not recreate output that it has already generated, unless you request it. If a job fails or is interrupted, subsequent executions of Snakemake will just pick up where it left off. This can also apply to files that you create and provide in place of the files it would have generated.
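One way to request regeneration is Snakemake's --forcerun option (short form -R), which re-executes the named rules even when their output already exists. The sketch below is illustrative only: force_rerun and the RUNNER variable are hypothetical names, and it assumes that run.bash passes its arguments through to snakemake.

```shell
#!/usr/bin/env bash
# Hypothetical helper: force Snakemake to re-run the given rule(s) even if
# their output already exists. RUNNER defaults to ./run.bash, which is
# assumed to forward its arguments to snakemake.
force_rerun() {
    if [ "$#" -eq 0 ]; then
        echo "usage: force_rerun RULE [RULE...]" >&2
        return 2
    fi
    "${RUNNER:-./run.bash}" --forcerun "$@"
}
```

Combining this with the -n flag first is a cheap way to see which jobs would be repeated before committing to the run.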
A Snakefile for running the entire pipeline. It uses overlapping drone imagery to create a map of the species of flowers surrounding a bee colony.
A config file that defines options and input for the pipeline. You should start by filling this out.
Various scripts used by the pipeline. See the script README for more information.
An example bash script for executing the pipeline using snakemake and conda. Any arguments to this script are passed directly to snakemake.