ESIP Reporting for GSoC 2019

This page collects the reporting for the 2019 GSoC project titled "OrcaCNN: Detecting and Classifying Killer Whales from Acoustic Data".

Activity:

Established a communication protocol with Jesse.
Having been active since the beginning of the project, I have a fair idea of its requirements and vision, and I've communicated this to Jesse.
We've decided to keep a consistent code style throughout and to use Colab for pre-processing and further training.
To keep complexity to a minimum and to aid the later template-matching process, we decided to pre-process audio files into images for now.
We will proceed with the pre-processing steps already described and coded.
Once we get the data, expected within a few days, I will standardize and visualize the samples.
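
As an illustration of what "pre-processing audio files into images" can look like, here is a minimal sketch that turns a waveform into a log-magnitude spectrogram using only NumPy. It is not the project's actual script: the function name, frame_len, hop, and the synthetic test tone are all assumptions chosen for the example.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram of shape (n_freq_bins, n_frames).

    frame_len and hop are illustrative defaults, not project settings.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One FFT per frame; keep only the non-negative frequencies.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small constant avoids log(0).
    return 20.0 * np.log10(magnitude + 1e-10).T

# Synthetic stand-in for a recording: a 1 kHz tone, 1 second at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
print(spec.shape)  # (129, 61)
```

The resulting 2-D array can be rendered as an image, with time on the horizontal axis and frequency on the vertical axis.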

Next Week's Agenda:

Activity:

Currently, the data is being uploaded to Google Cloud from the drives Jesse received from Dan.
To start, a pre-processing script has been written, with details in the PR, and some changes were made to conform to standards (#10, #11).
We are looking to label the incoming data in the cloud and to load it efficiently into Colab, where further development of the model will take place.
The data was uploaded around June 2nd.
As discussed with Jesse, a script (#12) was developed that divides the samples into fixed-size chunks and computes their spectrograms while maintaining the directory structure.
I'll be starting the labeling work for now, and we are expecting Dan to fill us in on the actual categorization of the data on June 12th, when he'll be back from field work.
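
The chunking step described above can be sketched roughly as follows. This is a hedged illustration rather than the code in #12: the function names, the directory names, the output file naming, and the choice to drop a trailing partial chunk are all assumptions made for the example.

```python
from pathlib import Path

def chunk_bounds(n_samples, sample_rate, chunk_seconds):
    """Return (start, end) sample indices of each full, non-overlapping chunk.

    Assumption: a trailing remainder shorter than one chunk is dropped.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk length must be positive")
    return [
        (start, start + chunk_len)
        for start in range(0, n_samples - chunk_len + 1, chunk_len)
    ]

def mirrored_output_path(audio_path, input_root, output_root, chunk_index):
    """Map input_root/a/b/x.wav to output_root/a/b/x_<index>.png.

    Keeping the path relative to input_root is what preserves the
    directory structure on the output side.
    """
    relative = Path(audio_path).relative_to(input_root)
    name = f"{relative.stem}_{chunk_index:04d}.png"
    return Path(output_root) / relative.parent / name

# A 3.5 s recording at 1000 samples/s yields three full 1 s chunks.
bounds = chunk_bounds(n_samples=3500, sample_rate=1000, chunk_seconds=1)
print(bounds)  # [(0, 1000), (1000, 2000), (2000, 3000)]

# Hypothetical paths, used only to show the mapping.
out = mirrored_output_path("data/2005/pod_a/rec.wav", "data", "spectrograms", 2)
print(out.as_posix())  # spectrograms/2005/pod_a/rec_0002.png
```

Each (start, end) pair selects one chunk of the waveform, and the spectrogram image for that chunk would be written to the mirrored path.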

Activity:

The pre-processing script was completed with this pull and can be seen here. The README has also been updated so that new users can catch up on the work.
I'm looking for ways to optimize how the Colab notebook saves spectrogram images.
Moving to the next stage, I've started the labelling work after checking in with Jesse. Currently, I'm downloading the images for both the 1s chunks (which I'll be using) and the 3s chunks (just in case 1s doesn't work as expected), which I'll be labelling for the detection and classification model.
Dan still seems to be in the field, and he is supposed to provide us with the correct labelling for the classification of pods. There's still time, so we can wait a few more days.

Activity:

For a chunkSize of 1s, there seem to be around 161k samples across the 14 years of data Dan provided. I will be labelling them into positive and negative classes, hopefully by the end of this week.
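
To put the sample count in perspective, here is a rough back-of-the-envelope calculation. It assumes every one of the roughly 161k samples is a full, non-overlapping 1s chunk; the exact count and the way partial chunks are handled are not stated above, so the figures are estimates only.

```python
n_chunks = 161_000   # approximate count reported above ("around 161k")
chunk_seconds = 1    # the 1s chunkSize

total_seconds = n_chunks * chunk_seconds
total_hours = total_seconds / 3600

# Upper bound on the number of 3s chunks covering the same audio;
# per-file remainders would make the real number somewhat smaller.
max_three_second_chunks = total_seconds // 3

print(round(total_hours, 1))    # 44.7
print(max_three_second_chunks)  # 53666
```

So the labelling task covers on the order of 45 hours of audio, and falling back to 3s chunks would cut the number of images to label to at most about a third.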
