
ESIP Reporting for GSoC 2019


This page is the location for the reporting related to the 2019 GSoC project titled OrcaCNN: Detecting and Classifying Killer Whales from Acoustic Data

Student: Abhishek Singh; Mentor: Jesse Lopez; ESIP POC: Annie Bryant Burgess

Community Bonding Period (May 6th - May 27th):

Activity:

  • Established a communication protocol with Jesse
  • Being very active since the beginning of the project, I have a fair idea of its requirements and vision, and I've communicated the same to Jesse.
  • We've decided to keep a consistent code style throughout and to use Google Colab for pre-processing and further training.
  • To keep complexity at a minimum and to aid the later template-matching process, we decided to pre-process the audio files into images for now.
  • With the pre-processing steps outlined and coded, we will proceed in the same way.
  • Once we get the data within the next few days, I will standardize and visualize the samples (a minimal sketch of what this could look like follows this list).
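For illustration, below is a minimal sketch of what standardizing (fixed sample rate, mono) and visualizing one sample could look like, assuming librosa and a placeholder file sample.wav; the actual pre-processing script lives in the repository.

```python
# Minimal sketch (assumed paths and sample rate) of standardizing a recording
# and visualizing its spectrogram; not the project's final pre-processing code.
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

SR = 22050  # assumed standard sample rate for all samples

# "sample.wav" is a placeholder path
y, sr = librosa.load("sample.wav", sr=SR, mono=True)

# dB-scaled spectrogram for a quick visual check
S = np.abs(librosa.stft(y, n_fft=1024, hop_length=256))
S_db = librosa.amplitude_to_db(S, ref=np.max)

librosa.display.specshow(S_db, sr=SR, hop_length=256, x_axis="time", y_axis="hz")
plt.colorbar(format="%+2.0f dB")
plt.title("Spectrogram of sample.wav")
plt.tight_layout()
plt.show()
```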

Next Week's Agenda:

May 28 - June 10:

Activity:

  • Currently, the data is being uploaded to Google Cloud from the drives Jesse received from Dan.
  • For starters, a pre-processing script has been written, with details in the PR; some changes were made to fit the coding standards (#10, #11).
  • We are looking to label the incoming data in the cloud and to load it efficiently into Colab, where further development of the model will take place.
  • The data was uploaded around June 2nd.
  • As discussed with Jesse, a script (#12) was developed that divides the samples into fixed-size chunks and computes their spectrograms while maintaining the directory structure (a rough sketch of the idea appears after this list).
  • I'll be starting the labelling work for now, and we are expecting Dan to fill us in on the actual categorization of the data on June 12th, when he'll be back from fieldwork.
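For reference, here is a rough sketch of the chunk-and-spectrogram idea behind #12; the sample rate, helper name and file layout are assumptions rather than the script's exact implementation.

```python
# Rough sketch: split each recording into fixed-size chunks and save one
# spectrogram image per chunk, mirroring the input directory structure.
# Names and parameters are assumptions, not the actual #12 script.
import os
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

SR = 22050          # assumed sample rate
CHUNK_SECONDS = 1   # 1 s chunks (3 s chunks were also generated as a fallback)

def chunk_to_spectrograms(in_dir, out_dir):
    for root, _, files in os.walk(in_dir):
        for name in files:
            if not name.lower().endswith(".wav"):
                continue
            y, _ = librosa.load(os.path.join(root, name), sr=SR, mono=True)
            chunk_len = SR * CHUNK_SECONDS
            # mirror the input directory structure under out_dir
            target = os.path.join(out_dir, os.path.relpath(root, in_dir))
            os.makedirs(target, exist_ok=True)
            for i in range(len(y) // chunk_len):
                chunk = y[i * chunk_len:(i + 1) * chunk_len]
                S = librosa.amplitude_to_db(np.abs(librosa.stft(chunk)), ref=np.max)
                plt.figure(figsize=(3, 3))
                librosa.display.specshow(S, sr=SR)
                plt.axis("off")
                plt.savefig(os.path.join(target, f"{name[:-4]}_{i:04d}.png"),
                            bbox_inches="tight", pad_inches=0)
                plt.close()
```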

June 10 - June 17:

Activity:

  • The pre-processing script was completed with this pull and can be seen here. The README has also been updated so that new users can catch up on the work.
  • I'm looking for ways to optimize the Colab notebook for saving the spectrogram images.
  • Moving to the next stage, I've started the labelling work after checking in with Jesse. Currently, I'm downloading the images of both the 1 s chunks (which I'll be using) and the 3 s chunks (just in case 1 s doesn't work as expected), which I'll be labelling for the detection and classification models.
  • Dan still seems to be in the field; he is supposed to provide us with the correct labelling for the classification of pods. There's still time, so we can wait a few more days.

June 18 - June 24:

Activity:

  • For a chunk size of 1 s, there seem to be around 161k samples across the 14 years of data Dan provided. I will be labelling them into positive and negative classes for the detection model, hopefully by the end of this week.

  • The detection model used to help label the images is now done, and soon I'll be starting on the detection model for the whole dataset. Below are a few things I tried and how they failed or succeeded:

    • Having taken only a small subset of around 500 training and 180 validation images, one would expect only a few layers to do the job. So I started small, with 3 convolutional layers using ReLU activations and 3x3 filters with no padding, followed by a final fully-connected layer with a sigmoid activation.
    • The learning rates I used were 0.01, 0.001, 0.0001 and 3e-4, with the Adam, SGD and RMSprop optimizers.
    • After a few iterations I found that none of the three optimizers, across the varying learning rates, learned anything useful from the data, i.e. they were essentially guessing the class of unseen data at random, and mostly getting it wrong. I chose the image size to be 224x224 (most images in the data were of size 425x623) without any strong reason.
    • The symptom of the above was that my validation accuracy and loss were fluctuating and very volatile. Several things could cause this: a learning rate that is too high, an insufficient batch size, or insufficient model capacity. To rectify this, I started by changing my hyperparameters while keeping the model architecture the same. The result was the same.
    • Using a very small network of just 1-2 layers, I found SGD actually generalizing somewhat better than the Adam optimizer, but the result was still not satisfactory.
    • That was when I started going through my data (something I often find important). I noticed that almost every image was pixelated and somewhat blurry, which, as I read in this paper, reduces performance. I was also squashing my images down to 224x224. These could be some of the reasons my network was not learning well.
    • What I did next was to use input images of size 300x500 (which increases the number of parameters in the network), increase the number of layers, use 5x5 filters, and introduce padding. It was also necessary to shuffle the training dataset, as Keras otherwise takes the batches in the same order.
    • Using SGD for this larger network with lr=3e-4, it did not learn well, and the validation loss was, I'm guessing, stuck in a local minimum. I then switched to Adam, believing its adaptive learning rate should work well, and it did (a minimal Keras sketch of a comparable network appears at the end of this list).

**A few links that were very useful: 1 2 3**

  • As soon as I'm done with labelling, I'll start with the detection model.
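For reference, here is a minimal tf.keras sketch of a small binary detector along the lines described above; the input size, filter counts and learning rate are illustrative and may differ from what ends up in the repository.

```python
# Minimal sketch of a small orca/not-orca spectrogram classifier of the kind
# discussed above; layer sizes and hyperparameters are illustrative only.
from tensorflow.keras import layers, models, optimizers

def build_detector(input_shape=(300, 500, 3)):
    model = models.Sequential([
        layers.Conv2D(16, (5, 5), padding="same", activation="relu",
                      input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(32, (5, 5), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (5, 5), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # binary: orca vs. not-orca
    ])
    # Adam with its adaptive learning rate worked better here than plain SGD
    model.compile(optimizer=optimizers.Adam(learning_rate=3e-4),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

When fitting on arrays, passing shuffle=True to model.fit keeps the batches from arriving in the same order every epoch, which is the shuffling concern mentioned above.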

June 24 - July 1:

Activity:

  • The labelling for the detection model is done. The positive and negative classes have 20,608 and 20,999 samples respectively. Around 8k images of humpbacks were augmented and added to the negative class.
  • The dataset has been split in the ratio 70:25:5 into training, validation and test sets (a quick illustration of such a split follows this list).
  • This week I'll be training and tuning hyperparameters to find the optimal model for the orca vs. humpback detector.
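As a quick illustration of what a 70:25:5 split could look like (scikit-learn here is an assumption; the notebook may split the data differently):

```python
# Illustrative 70:25:5 train/validation/test split of a labelled file list;
# scikit-learn is an assumption, not necessarily what the notebook uses.
from sklearn.model_selection import train_test_split

paths = [f"img_{i}.png" for i in range(1000)]   # placeholder file names
labels = [i % 2 for i in range(1000)]           # placeholder 0/1 labels

# first take 70% for training
train_paths, rest_paths, train_labels, rest_labels = train_test_split(
    paths, labels, train_size=0.70, stratify=labels, random_state=42)

# the remaining 30% is split 25:5 overall, i.e. 5/6 validation vs. 1/6 test
val_paths, test_paths, val_labels, test_labels = train_test_split(
    rest_paths, rest_labels, test_size=1/6, stratify=rest_labels, random_state=42)
```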

July 2 - July 15:

Activity:

  • The training and the search for optimal hyperparameters on the given data are being carried out at the time of writing.
  • Jesse mentioned there is a possibility of getting more humpback samples from MBARI, so I'll add those samples too if we get them.
  • I've been training the detection model for quite a few days now (July 2 - July 9) and recently reached 95% test accuracy using a few conv-conv layers with strides of 3 and 5x5 filter sizes. I've shared some of my approaches here. One of the configurations shows promising results, with room for further reduction in loss, so I'm currently training it for more epochs before testing it (a hypothetical sketch of extending the training follows this list).
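A hypothetical sketch of extending the training of the most promising configuration while keeping the best weights; `model`, `train_data` and `val_data` are placeholders for the detector and the labelled spectrogram datasets, and the callback settings are illustrative.

```python
# Hypothetical sketch: train the promising configuration for more epochs while
# saving the best weights. `model`, `train_data` and `val_data` are placeholders
# for the detector and the labelled spectrogram datasets.
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # keep the weights from the epoch with the lowest validation loss
    ModelCheckpoint("detector_best.h5", monitor="val_loss", save_best_only=True),
    # stop once validation loss has not improved for a while
    EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True),
]

history = model.fit(train_data, validation_data=val_data,
                    epochs=200, callbacks=callbacks)
```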

July 15 - July 22:

Activity:

  • Last week was fully spent training and tuning the hyperparameters and testing on unseen samples. The model has so far been quite successful at distinguishing between orcas and humpbacks, with some false positives.
  • The code was already ready at the start of this phase, and after checking in with Jesse, I'll push it.
  • The code also features a method to find the number of orca calls in unseen samples, along with their start and end times (a hypothetical illustration of this idea follows this list).
  • This blog post shares my journey in developing the detection model.
  • I'll start creating the pod-classification dataset, and also begin training on it, by the end of this week.
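Below is a hypothetical illustration of how per-chunk detector outputs could be turned into a call count with start and end times; the repository's actual method may differ.

```python
# Hypothetical illustration: merge consecutive positive 1 s chunks into call
# intervals. The repository's actual post-processing may differ.
def calls_from_chunk_predictions(preds, chunk_seconds=1.0, threshold=0.5):
    """preds: detector probabilities for consecutive fixed-size chunks."""
    calls = []
    start = None
    for i, p in enumerate(preds):
        if p >= threshold and start is None:
            start = i * chunk_seconds                    # a call begins here
        elif p < threshold and start is not None:
            calls.append((start, i * chunk_seconds))     # the call just ended
            start = None
    if start is not None:                                # call runs to the end
        calls.append((start, len(preds) * chunk_seconds))
    return calls

# Example: three positive chunks in the middle of a 6 s recording
print(calls_from_chunk_predictions([0.1, 0.8, 0.9, 0.7, 0.2, 0.1]))
# -> [(1.0, 4.0)]  one call, from 1 s to 4 s
```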

July 22 - July 29:

Activity:

  • This week is going to be spent on curating the pod-classification dataset. There are a total of 22 pods in our dataset.
  • As soon as I'm done creating the dataset, I'll start training the pod-classification model.

July 30 - Aug 8:

Activity:

  • Due to a small delay, training of the pod-classification model will start after August 3rd. The same has been communicated to Jesse.
  • The dataset has been developed, with around 250-300 images in each of 20 pods (AJ22 and AN10 have no images). Training and hyperparameter tuning started on August 6th.

Aug 8 - Aug 26 (End of 3rd Phase):

Activity:

  • After running a few iterations, the accuracy was found to stabilise around 61% on unseen data. To prevent overfitting, techniques like batch normalization, weight regularizers and dropout were applied and tested. Batch normalization and the regularizers seemed to have adverse effects and were naturally dropped (a minimal sketch of the kept configuration appears after this list).
  • Given the distribution of the data among 20 classes, 61% accuracy was much better than our expectations. The final pod-classification code will be pushed soon, along with proper documentation.
  • The required milestones have been completed well in time.
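For context, here is a minimal tf.keras sketch of a 20-way pod classifier that keeps only dropout for regularization; the layer sizes and learning rate are illustrative and not the repository's final architecture.

```python
# Minimal sketch of a 20-way pod classifier keeping only dropout (the
# regularization variant that was retained); layer sizes are illustrative.
from tensorflow.keras import layers, models, optimizers

NUM_PODS = 20  # pods with labelled images (AJ22 and AN10 had none)

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(300, 500, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),   # dropout helped; batch norm and weight regularizers did not
    layers.Dense(NUM_PODS, activation="softmax"),
])
model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
```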