kw-detector

HALLO killer whale detector

Below are instructions for building a deep-learning model to detect killer-whale calls, following the approach described here.

Prerequisites

You will need a Python environment with ketos v2.6.1 (or newer) and ketos-scripts.
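Before going further, it can be useful to confirm that the environment meets the version requirement above. The following is a minimal sketch using only the Python standard library. The helper names (`parse_version`, `meets_minimum`, `installed_version`) are illustrative and not part of ketos, and the distribution name `ketos-scripts` is an assumption based on the name of the source archive.

```python
import re
from importlib import metadata

MIN_KETOS = (2, 6, 1)  # minimum ketos version stated in the prerequisites


def parse_version(text):
    """Return the leading numeric part of a version string as a tuple of ints.

    For example, '2.6.1' gives (2, 6, 1); suffixes such as 'rc1' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum=MIN_KETOS):
    """True if the installed version string is at least the minimum version."""
    return parse_version(installed) >= minimum


def installed_version(package):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    ketos_version = installed_version("ketos")
    if ketos_version is None:
        print("ketos is not installed")
    elif not meets_minimum(ketos_version):
        print(f"ketos {ketos_version} is older than the required 2.6.1")
    else:
        print(f"ketos {ketos_version} is recent enough")

    # Assumption: the distribution is registered under the name 'ketos-scripts'.
    scripts_version = installed_version("ketos-scripts")
    print("ketos-scripts:", scripts_version or "not installed")
```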

Installation

Ketos can be installed with pip install ketos, whereas ketos-scripts must be obtained from this GitLab repository and built from source with

python setup.py sdist
pip install dist/ketos-scripts-0.0.1.tar.gz

Configuration files

  • db.yaml: contains the specifications for creating a database file of sound samples used to train and test a deep-learning model that detects killer-whale vocalisations.

  • spec.json: specifies the form in which the sound samples will be stored. In the present case, the sound samples are transformed into spectrograms of 5-second duration.

  • train.yaml: contains the specifications for training the deep learning model.

  • resnet_recipe.json: specifies the architecture of the neural network, in this case a fairly standard ResNet.

Usage

The audio data required to reproduce steps 1 and 2 below have not yet been made publicly available (we are working on it). The trained models are included in this repository, so you can try out step 3 in the meantime.

1. Prepare the data

To create the database file, run the following command from within HALLO-models/kw-detector/:

kt-db --config db.yaml --output_path db.h5

This will create a database file called db.h5 in the same directory.

2. Train the model

Next, to train the deep learning model, run the command

kt-train --config train.yaml

When the training is complete, the model will automatically be saved to output/kw-det_000/kw-det_000-00.kt.

3. Apply the trained model

Finally, to run the model on a set of audio files, use the command

kt-run output/kw-det_000/kw-det_000-00.kt <path>

where <path> is the path to the folder containing the audio files. Killer-whale calls detected by the model are saved to a CSV file in RavenPro-compatible format. Use kt-run -h to see the full list of command-line arguments available for configuring the detector at run time.
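If you prefer to drive the three steps from a script, the commands shown above can be assembled and launched from Python. The sketch below is only a thin wrapper around the same command lines. The function names `build_commands` and `run_steps` and the example folder name `recordings` are hypothetical, and the default is a dry run that prints the commands without executing them.

```python
import shlex
import subprocess


def build_commands(audio_dir,
                   db_config="db.yaml",
                   db_path="db.h5",
                   train_config="train.yaml",
                   model_path="output/kw-det_000/kw-det_000-00.kt"):
    """Return the command line for each step as a list of arguments."""
    return {
        "prepare": ["kt-db", "--config", db_config, "--output_path", db_path],
        "train": ["kt-train", "--config", train_config],
        "detect": ["kt-run", model_path, str(audio_dir)],
    }


def run_steps(commands, steps, dry_run=True):
    """Print each selected command and, unless dry_run is set, execute it."""
    for step in steps:
        cmd = commands[step]
        print("$", shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a step exits with an error
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # 'recordings' is a placeholder for the folder holding your audio files
    commands = build_commands("recordings")
    # Only step 3 can be run until the training data are released
    run_steps(commands, ["detect"], dry_run=True)
```

Pass dry_run=False to actually run the selected steps. The script should be started from within HALLO-models/kw-detector/ so that the relative paths resolve.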

If you want to try out the trained model in PAMGuard, you first have to convert it to a slightly different format. This is easily accomplished with a few lines of Python code:

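from ketos_scripts.utils.nn_utils import import_nn_interface, which_nn_interface

input_path = "output/kw-det_000/kw-det_000-00.kt"  # model saved by kt-train
new_path = "kw-det_000-00.ktpb"  # converted model for PAMGuard

# Look up the neural-network interface used by the saved model, then import it
name, module_path = which_nn_interface(input_path)
nn_interface = import_nn_interface(name=name, module_path=module_path)

# Load the model together with its audio representation and save both in the new format
model, audio_repr = nn_interface.load(input_path, load_audio_repr=True)
model.save(new_path, audio_repr=audio_repr[0])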

Trained models

The trained model can be downloaded in two formats:

This model was trained on approximately 29,000 killer-whale calls extracted from underwater recordings obtained at Roberts Bank and Boundary Pass in the Salish Sea. More details can be found here.

The figure below shows the performance of the model at detecting killer-whale calls in sections of the Roberts Bank and Boundary Pass recordings that were held out (i.e. not used) during training.

The performance is quantified in terms of:

  • Recall (R): The model's ability to detect a killer-whale call when it is present, where R=0 means that the model is not detecting any of the calls and R=1 means it is detecting all the calls.

  • False-positive probability (FPP): The model's tendency to issue a false alarm when no killer-whale call is in fact present, where FPP=0 means that the model is issuing no false alarms at all and FPP=1 means it is issuing false alarms for every 'empty' sound sample.
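To make the two definitions above concrete, here is a small sketch that computes both quantities from per-sample labels and detections. It assumes each sound sample carries a binary ground-truth label (1 = call present, 0 = empty) and a binary model decision. The function name `recall_and_fpp` and the example data are illustrative only and do not reproduce the evaluation behind the reported results.

```python
def recall_and_fpp(labels, detections):
    """Compute recall and false-positive probability for binary decisions.

    labels:     1 if a killer-whale call is present in the sample, else 0
    detections: 1 if the model reported a call for the sample, else 0
    """
    if len(labels) != len(detections):
        raise ValueError("labels and detections must have the same length")

    tp = fn = fp = tn = 0
    for truth, detected in zip(labels, detections):
        if truth and detected:
            tp += 1  # call present and detected
        elif truth and not detected:
            fn += 1  # call present but missed
        elif detected:
            fp += 1  # false alarm on an empty sample
        else:
            tn += 1  # empty sample correctly ignored

    # Recall: fraction of samples with a call that were detected
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    # FPP: fraction of empty samples that triggered a false alarm
    fpp = fp / (fp + tn) if (fp + tn) else float("nan")
    return recall, fpp


# Made-up example: 4 samples with calls, 6 empty samples
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
detections = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
recall, fpp = recall_and_fpp(labels, detections)
print(f"R = {recall:.2f}, FPP = {fpp:.2f}")
```

In this made-up example the model finds 3 of the 4 calls and raises 1 false alarm on 6 empty samples.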

Citation

This work was presented at the 181st Meeting of the Acoustical Society of America, held in December 2021 in Seattle, USA.