Commit d171bfa — deploy: 5c9d40c
hagenw committed Jan 3, 2024
1 parent 330dae8
Showing 40 changed files with 2,779 additions and 34 deletions.
Binary file modified .doctrees/datasets.doctree
Binary file added .doctrees/datasets/air.doctree
Binary file added .doctrees/datasets/cough-speech-sneeze.doctree
Binary file added .doctrees/datasets/crema-d.doctree
Binary file modified .doctrees/datasets/emodb.doctree
Binary file added .doctrees/datasets/micirp.doctree
Binary file added .doctrees/datasets/musan.doctree
Binary file added .doctrees/datasets/vadtoolkit.doctree
Binary file modified .doctrees/environment.pickle
Binary file added _images/air.png
Binary file added _images/cough-speech-sneeze.png
Binary file added _images/crema-d.png
Binary file added _images/emodb.png
Binary file added _images/micirp.png
Binary file added _images/musan.png
Binary file added _images/vadtoolkit.png
6 changes: 6 additions & 0 deletions _sources/datasets.rst.txt
@@ -18,4 +18,10 @@ For each dataset, the latest version is shown.
:maxdepth: 1
:hidden:

datasets/air
datasets/cough-speech-sneeze
datasets/crema-d
datasets/emodb
datasets/micirp
datasets/musan
datasets/vadtoolkit
64 changes: 64 additions & 0 deletions _sources/datasets/air.rst.txt
@@ -0,0 +1,64 @@
.. _air:

air
---

Created by Marco Jeub, Magnus Schäfer, Hauke Krüger, Christoph Matthias Nelke, Christophe Beaugeant, Peter Vary


============= ======================
version `1.4.2 <https://github.com/audeering/air/blob/main/CHANGELOG.md>`__
license `MIT <https://opensource.org/licenses/MIT>`__
source https://www.iks.rwth-aachen.de/en/research/tools-downloads/databases/aachen-impulse-response-database/
usage commercial
languages
format wav
channel 2
sampling rate 48000
bit depth 16
duration 0 days 00:04:43.719958333
files 107
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/air>`__
published 2023-12-21 by audeering-unittest
============= ======================


Description
^^^^^^^^^^^

The Aachen Impulse Response (AIR) database is a set of impulse responses that were measured in a wide variety of rooms. The initial aim of the AIR database was to allow for realistic studies of signal processing algorithms in reverberant environments, with a special focus on hearing aid applications. The first version was published in 2009 and offers binaural room impulse responses (BRIR) measured with a dummy head in different locations with different acoustical properties, such as reverberation time and room volume. Besides the evaluation of dereverberation algorithms and perceptual investigations of reverberant speech, this part of the database allows for the investigation of head shadowing influence, since all recordings were made both with and without the dummy head. In a first update, the database was extended to BRIRs with various azimuth angles between head and desired source. This further allows the investigation of (binaural) direction-of-arrival (DOA) algorithms as well as of the influence of signal processing algorithms on the binaural cues. Since dereverberation can also be applied to telephone speech, the latest extension includes (dual-channel) impulse responses between the artificial mouth of a dummy head and a mock-up phone. The measurements were carried out in compliance with the ITU standards for both the hand-held and the hands-free position. Additional microphone configurations were added in the latest extension. For the third big extension, the IKS carried out measurements of binaural room impulse responses in the Aula Carolina Aachen. The former church, with a ground area of 570 m² and a high ceiling, shows very strong reverberation effects. The database will successively be extended to further application scenarios.

Example
^^^^^^^

:file:`data/air_binaural_stairway_1_1_0.wav`

.. image:: ../air.png

.. raw:: html

<p><audio controls src="air/data/air_binaural_stairway_1_1_0.wav"></audio></p>

Tables
^^^^^^

.. csv-table::
:header: ID,Type,Columns
:widths: 20, 10, 70

"brir", "filewise", "room, azimuth"
"phone", "filewise", "room, mode"
"rir", "filewise", "room, distance, reverberation-time"


Schemes
^^^^^^^

.. csv-table::
:header: ID,Dtype,Labels,Mappings

"azimuth", "float", ""
"distance", "float", ""
"mode", "str", "hand-held, hands-free"
"reverberation-time", "float", ""
"room", "str", "aula_carolina, bathroom, booth, corridor, kitchen, lecture, meeting, office, stairway", "floor cover, furniture, room height, room length, room width, wall surface"
59 changes: 59 additions & 0 deletions _sources/datasets/cough-speech-sneeze.rst.txt
@@ -0,0 +1,59 @@
.. _cough-speech-sneeze:

cough-speech-sneeze
-------------------

Created by S Amiriparian, S Pugachevskiy, N Cummins, D Hantke, J Pohjalainen, G Keren, BW Schuller


============= ======================
version `2.0.1 <https://github.com/audeering/cough-speech-sneeze/blob/main/CHANGELOG.md>`__
license `CC-BY-4.0 <https://creativecommons.org/licenses/by/4.0/>`__
source Dataset based on the publication of Shahin Amiriparian: "Amiriparian, S., Pugachevskiy, S., Cummins, N., Hantke, S., Pohjalainen, J., Keren, G., Schuller, B., 2017. CAST a database: Rapid targeted large-scale big data acquisition via small-world modelling of social media platforms, in: 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE, pp. 340–345. https://doi.org/10.1109/ACII.2017.8273622"
usage commercial
languages
format wav
channel 1
sampling rate 16000, 44100
bit depth 16
duration 0 days 03:02:29.436148526
files 4310
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/cough-speech-sneeze>`__
published 2024-01-02 by audeering
============= ======================


Description
^^^^^^^^^^^

Cough-speech-sneeze: a data set of human sounds. This dataset was collected by Dr. Shahin Amiriparian. It contains samples of human speech, coughing, and sneezing collected from YouTube, as well as silence clips. The original publication of this (possibly then extended) dataset is the following: Amiriparian, S., Pugachevskiy, S., Cummins, N., Hantke, S., Pohjalainen, J., Keren, G., Schuller, B., 2017. CAST a database: Rapid targeted large-scale big data acquisition via small-world modelling of social media platforms, in: 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE, pp. 340–345. https://doi.org/10.1109/ACII.2017.8273622

Example
^^^^^^^

:file:`coughing/6hw6_4eb_hq_18.41-19.81.wav`

.. image:: ../cough-speech-sneeze.png

.. raw:: html

<p><audio controls src="cough-speech-sneeze/coughing/6hw6_4eb_hq_18.41-19.81.wav"></audio></p>

Tables
^^^^^^

.. csv-table::
:header: ID,Type,Columns
:widths: 20, 10, 70

"files", "filewise", "category, duration"


Schemes
^^^^^^^

.. csv-table::
:header: ID,Dtype,Labels

"category", "str", "coughing, silence, sneezing, speech"
"duration", "time", ""
96 changes: 96 additions & 0 deletions _sources/datasets/crema-d.rst.txt
@@ -0,0 +1,96 @@
.. _crema-d:

crema-d
-------

Created by Houwei Cao, David G. Cooper, Michael K. Keutmann, Ruben C. Gur, Ani Nenkova, Ragini Verma, Samantha L Moore, Adam Savitt


============= ======================
version `1.2.0 <https://github.com/audeering/crema-d/blob/main/CHANGELOG.md>`__
license `Open Data Commons Open Database License (ODbL) v1.0 <http://opendatacommons.org/licenses/odbl/1.0/>`__
source https://github.com/CheyneyComputerScience/CREMA-D
usage commercial
languages English
format wav
channel 1
sampling rate 16000
bit depth 16
duration 0 days 05:15:21.404187500
files 7441
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/crema-d>`__
published 2024-01-02 by audeering
============= ======================


Description
^^^^^^^^^^^

CREMA-D: Crowd-Sourced Emotional Multimodal Actors Dataset. CREMA-D is a data set of 7,442 original clips from 91 actors. These clips were recorded by 48 male and 43 female actors between the ages of 20 and 74, coming from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified). When using the database commercially, the database must be referenced together with its license.

Example
^^^^^^^

:file:`1001/1001_TAI_HAP_XX.wav`

.. image:: ../crema-d.png

.. raw:: html

<p><audio controls src="crema-d/1001/1001_TAI_HAP_XX.wav"></audio></p>

Tables
^^^^^^

.. csv-table::
:header: ID,Type,Columns
:widths: 20, 10, 70

"emotion.categories.desired.dev", "filewise", "emotion, emotion.intensity"
"emotion.categories.desired.test", "filewise", "emotion, emotion.intensity"
"emotion.categories.desired.train", "filewise", "emotion, emotion.intensity"
"emotion.categories.dev", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level"
"emotion.categories.dev.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement"
"emotion.categories.dev.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness"
"emotion.categories.face.dev", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level"
"emotion.categories.face.dev.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement"
"emotion.categories.face.dev.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness"
"emotion.categories.face.test", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level"
"emotion.categories.face.test.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement"
"emotion.categories.face.test.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness"
"emotion.categories.face.train", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level"
"emotion.categories.face.train.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement"
"emotion.categories.face.train.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness"
"emotion.categories.multimodal.dev", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level"
"emotion.categories.multimodal.dev.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement"
"emotion.categories.multimodal.dev.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness"
"emotion.categories.multimodal.test", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level"
"emotion.categories.multimodal.test.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement"
"emotion.categories.multimodal.test.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness"
"emotion.categories.multimodal.train", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level"
"emotion.categories.multimodal.train.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement"
"emotion.categories.multimodal.train.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness"
"emotion.categories.test", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level"
"emotion.categories.test.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement"
"emotion.categories.test.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness"
"emotion.categories.train", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level"
"emotion.categories.train.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement"
"emotion.categories.train.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness"
"files", "filewise", "speaker, corrupted"
"sentence", "filewise", "sentence"


Schemes
^^^^^^^

.. csv-table::
:header: ID,Dtype,Min,Max,Labels,Mappings

"corrupted", "bool", "", "", ""
"emotion", "str", "", "", "anger, disgust, fear, happiness, neutral, no_agreement, sadness"
"emotion.agreement", "float", "", "1", ""
"emotion.intensity", "str", "", "", "high, low, mid, unspecified"
"emotion.level", "float", "", "100", ""
"sentence", "str", "", "", "DFA, IEO, IOM, ITH, ITS, IWL, IWW, MTI, TAI, TIE, TSI, WSI", "✓"
"speaker", "int", "", "", "1001, 1002, 1003, 1004, 1005, 1006, 1007, [...], 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091", "age, ethnicity, race, sex"
"votes", "int", "", "", ""
6 changes: 4 additions & 2 deletions _sources/datasets/emodb.rst.txt
@@ -7,7 +7,7 @@ Created by Felix Burkhardt, Astrid Paeschke, Miriam Rolfes, Walter Sendlmeier, B


============= ======================
-version       `1.3.0 <https://github.com/audeering/emodb/blob/main/CHANGELOG.md>`__
+version       `1.4.1 <https://github.com/audeering/emodb/blob/main/CHANGELOG.md>`__
license `CC0-1.0 <https://creativecommons.org/publicdomain/zero/1.0/>`__
source http://emodb.bilderbar.info/download/download.zip
usage unrestricted
@@ -19,7 +19,7 @@ bit depth 16
duration 0 days 00:24:47.092187500
files 535
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/emodb>`__
-published     2022-08-05 by audeering-unittest
+published     2023-04-05 by audeering-unittest
============= ======================


@@ -33,6 +33,8 @@ Example

:file:`wav/13b09La.wav`

+.. image:: ../emodb.png
+
.. raw:: html

<p><audio controls src="emodb/wav/13b09La.wav"></audio></p>
58 changes: 58 additions & 0 deletions _sources/datasets/micirp.rst.txt
@@ -0,0 +1,58 @@
.. _micirp:

micirp
------

Created by Stewart Tavener (Xaudia.com)


============= ======================
version `1.0.0 <https://github.com/audeering/micirp/blob/main/CHANGELOG.md>`__
license `CC-BY-SA-4.0 <https://creativecommons.org/licenses/by-sa/4.0/>`__
source http://micirp.blogspot.com/
usage commercial
languages
format wav
channel 1
sampling rate 44100, 48000
bit depth 24
duration 0 days 00:00:27.341591837
files 66
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/micirp>`__
published 2023-12-21 by audeering
============= ======================


Description
^^^^^^^^^^^

The Microphone Impulse Response Project (MicIRP) contains impulse response data for vintage microphones. The impulse response files were created with the analysis software FuzzMeasure. The microphones were tested using a swept-sine method in a small booth treated with plenty of acoustic foam, placed about 20 to 30 cm from the source. Although the recording system and booth are calibrated regularly with a Beyerdynamic measurement microphone, there are problems when comparing, for example, a figure-8 ribbon with an omnidirectional standard, as they will pick up different amounts of reflections from the sides. It should therefore be noted that the impulse response files describe the microphones as measured in the booth, rather than in free space.

Example
^^^^^^^

:file:`dirs/IR_AKGD12.wav`

.. image:: ../micirp.png

.. raw:: html

<p><audio controls src="micirp/dirs/IR_AKGD12.wav"></audio></p>

Tables
^^^^^^

.. csv-table::
:header: ID,Type,Columns
:widths: 20, 10, 70

"files", "filewise", "manufacturer"


Schemes
^^^^^^^

.. csv-table::
:header: ID,Dtype,Labels

"manufacturer", "str", "AKG, Altec, American, Amperite, Astatic, B&O, BBC, [...], Oktava, RCA, Reslo, STC, Shure, Sony, Telefunken, Toshiba"
77 changes: 77 additions & 0 deletions _sources/datasets/musan.rst.txt
@@ -0,0 +1,77 @@
.. _musan:

musan
-----

Created by David Snyder, Guoguo Chen, Daniel Povey


============= ======================
version `1.0.0 <https://github.com/audeering/musan/blob/main/CHANGELOG.md>`__
license `CC-BY-4.0 <https://creativecommons.org/licenses/by/4.0/>`__
source http://www.openslr.org/17/
usage commercial
languages ara, zho, dan, nld, eng, fra, deu, heb, hun, ita, jpn, lat, pol, por, rus, spa, tgl
format wav
channel 1
sampling rate 16000
bit depth 16
duration 4 days 13:17:22.582937499
files 2016
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/musan>`__
published 2023-12-20 by audeering-unittest
============= ======================


Description
^^^^^^^^^^^

The goal of this corpus is to provide data for music/speech discrimination, speech/nonspeech detection, and voice activity detection. The corpus is divided into music, speech, and noise portions. In total there are approximately 109 hours of audio. Reference: https://arxiv.org/abs/1510.08484

Example
^^^^^^^

:file:`noise/free-sound/noise-free-sound-0324.wav`

.. image:: ../musan.png

.. raw:: html

<p><audio controls src="musan/noise/free-sound/noise-free-sound-0324.wav"></audio></p>

Tables
^^^^^^

.. csv-table::
:header: ID,Type,Columns
:widths: 20, 10, 70

"files", "filewise", "duration"
"music", "filewise", "genre, vocals, artist, composer"
"music.fma", "filewise", "genre, vocals, artist, composer"
"music.fma-western-art", "filewise", "genre, vocals, artist, composer"
"music.hd-classical", "filewise", "genre, vocals, artist, composer"
"music.jamendo", "filewise", "genre, vocals, artist, composer"
"music.rfm", "filewise", "genre, vocals, artist, composer"
"noise", "filewise", "background_noise"
"noise.free-sound", "filewise", "background_noise"
"noise.sound-bible", "filewise", "background_noise"
"speech", "filewise", "gender, language"
"speech.librivox", "filewise", "gender, language"
"speech.us-gov", "filewise", "gender, language"


Schemes
^^^^^^^

.. csv-table::
:header: ID,Dtype,Labels

"artist", "str", ""
"background_noise", "bool", ""
"composer", "str", ""
"duration", "time", ""
"gender", "str", "female, male"
"genre", "str", ""
"language", "str", "ara, dan, deu, eng, fra, heb, hun, [...], lat, nld, pol, por, rus, spa, tgl, zho"
"vocals", "bool", ""