Scott Veirs edited this page Jul 27, 2020 · 22 revisions

There are many recordings of killer whales available, but relative to other marine mammal species, there is a paucity of labeled data. For example, many toothed whale species are included in the [Mobysound archive](https://www.mobysound.org/mobysound.html), but not yet killer whales (as of July 2020).

This page documents the growing array of labeled data specific to killer whale ecotypes, with a primary focus on Southern Resident Killer Whales, and a secondary focus on other ecotypes of the Northeast Pacific Ocean. Open data sources, including those provided by Orcasound member organizations, are listed first to promote collaboration. Closed data sources are listed in the hope that they become valuable to the open-source and open data community in the future.

Open data sources

Orcasound data

This section contains Orcasound data sets aimed at training machine learning models that detect or classify the signals of killer whales. The primary focus is on binary classification (yes/no) of any Southern Resident Killer Whale (SRKW) calls, but labels may also indicate call type, whistles, or clicks, and there are also some resources related to Bigg's (transient) killer whales.

NOTE: These data cannot be accessed through a browser. Instead, note the URL and use the AWS Command Line Interface (CLI) in a terminal window to access the public files. See the Data access via AWS CLI page to learn more about the AWS CLI. Many of these data are aggregated within the Orcasound "Acoustic Sandbox" (a public S3 bucket).
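As a rough illustration of the access pattern described in the note, the sketch below assembles AWS CLI commands for anonymous access to a public S3 location. The helper name `build_public_s3_command` and the URL `s3://example-public-bucket/` are placeholders invented for this example, not actual Orcasound names; substitute the URL you noted from this page. The only AWS CLI feature relied on is the global `--no-sign-request` option, which sends the request without credentials.

```python
import shlex


def build_public_s3_command(action, s3_url, dest=None):
    """Return an AWS CLI argument list for anonymous access to a public S3 URL.

    Hypothetical helper for illustration; not part of any Orcasound tooling.
    """
    if not s3_url.startswith("s3://"):
        raise ValueError("expected a URL of the form s3://bucket/prefix")
    if action == "ls":
        # List the objects under the given bucket or prefix
        args = ["aws", "s3", "ls", s3_url]
    elif action in ("cp", "sync"):
        # Copy one object, or mirror a whole prefix, to a local path
        if dest is None:
            raise ValueError(f"'{action}' needs a local destination path")
        args = ["aws", "s3", action, s3_url, dest]
    else:
        raise ValueError(f"unsupported action: {action}")
    # --no-sign-request skips credential signing, so no AWS account is needed
    args.append("--no-sign-request")
    return args


if __name__ == "__main__":
    # Placeholder URL: substitute the s3:// location noted from this page.
    cmd = build_public_s3_command("ls", "s3://example-public-bucket/")
    print(shlex.join(cmd))
```

Once the AWS CLI is installed, the returned list can be passed to `subprocess.run(cmd, check=True)`, or the printed string can be pasted into a terminal.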

Closed or restricted data sources

Non-Orcasound labeled data sources (not necessarily or not yet open)

  • SRKWs
    • Orca Behavior Institute (Monika Wieland), historic data from cabled Lime Kiln State Park and some
    • NOAA (Marla Holt, Candice Emmons), mostly autonomous recorders on outer coast WA
    • ONC (Kristen Kanes, Science open data set in 2020?), cabled arrays on outer BC shelf (Barkley Canyon; and Georgia Strait? Early versions were not specific to ecotype?)
    • DFO (James Pilkington), mostly autonomous recorders on outer coast BC (mostly clips? may be specific to ecotype)
    • SMRU/TWM (Jason Wood), some labeled by Alex Harris (30,000 general KWs; 30,000 non-KWs)
    • JASCO (David Hannay? Ruth Joy?), 5 second clips
  • NRKWs
    • OrcaLab (Paul Spong, Helena Symonds), cabled near-shore hydrophones in Johnstone Strait, B.C.
    • [Pacific Wild unlabeled archive](https://soundcloud.com/pacificwild) (SoundCloud), cabled near-shore hydrophones in central B.C., near Bella Bella.
  • Alaska residents
    • OrcaCNN (Dan Olsen), autonomous recorders with signals from KWs (also Bigg's?)
  • Bigg's (transients)
    • No labeled data (to our knowledge)
    • Raw data sources:
      • U.S. Navy recording of transients in Dabob Bay (2005, ~42 minutes of vocalization, echolocation, percussives; mp3 format from original AIFF...)
      • John Ford's contribution to the Orcasound open-access data project of transient call types (T1,3,7,8) (recorded by F. Thomsen on August 25, 1996 near Numas I., Queen Charlotte Strait, with many calls from T014 and T015)
      • Alaskan transients (via Dan Olsen and Hannah Myers)
        • Many recordings of AT1s (only 7 individuals left; unique sounding calls relative to other transients)
        • Gulf of Alaska transients (need to be digitized)