Showing 40 changed files with 2,779 additions and 34 deletions.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
.. _air: | ||
|
||
air | ||
--- | ||
|
||
Created by Marco Jeub, Magnus Schäfer, Hauke Krüger, Christoph Matthias Nelke, Christophe Beaugeant, Peter Vary | ||
|
||
|
||
============= ====================== | ||
version `1.4.2 <https://github.com/audeering/air/blob/main/CHANGELOG.md>`__ | ||
license `MIT <https://opensource.org/licenses/MIT>`__ | ||
source https://www.iks.rwth-aachen.de/en/research/tools-downloads/databases/aachen-impulse-response-database/ | ||
usage commercial | ||
languages | ||
format wav | ||
channel 2 | ||
sampling rate 48000 | ||
bit depth 16 | ||
duration 0 days 00:04:43.719958333 | ||
files 107 | ||
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/air>`__ | ||
published 2023-12-21 by audeering-unittest | ||
============= ====================== | ||
|
||
|
||
Description | ||
^^^^^^^^^^^ | ||
|
||
The Aachen Impulse Response (AIR) database is a set of impulse responses that were measured in a wide variety of rooms. The initial aim of the AIR database was to allow for realistic studies of signal processing algorithms in reverberant environments, with a special focus on hearing aid applications. The first version was published in 2009 and offers binaural room impulse responses (BRIRs) measured with a dummy head in different locations with different acoustical properties, such as reverberation time and room volume. Besides the evaluation of dereverberation algorithms and perceptual investigations of reverberant speech, this part of the database allows for the investigation of head shadowing influence, since all recordings were made with and without the dummy head. In a first update, the database was extended to BRIRs with various azimuth angles between head and desired source. This further allows the investigation of (binaural) direction-of-arrival (DOA) algorithms as well as the influence of signal processing algorithms on the binaural cues. Since dereverberation can also be applied to telephone speech, the latest extension includes (dual-channel) impulse responses between the artificial mouth of a dummy head and a mock-up phone. The measurements were carried out in compliance with the ITU standards for both the hand-held and the hands-free position. Additional microphone configurations were added in the latest extension. For the third big extension, the IKS has carried out measurements of binaural room impulse responses in the Aula Carolina Aachen. The former church, with a ground area of 570 m² and a high ceiling, shows very strong reverberation effects. The database will successively be extended to further application scenarios. | ||
|
||
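Impulse responses such as these are typically used by convolving them with a dry (anechoic) signal to simulate the room. The following is a minimal sketch of that idea in plain Python, not code from the database or its tooling. The function names `convolve` and `apply_brir` and all sample values are hypothetical and chosen for illustration only; real use would read the two channels of a BRIR file and would normally rely on an FFT-based convolution for speed.

```python
def convolve(signal, impulse_response):
    """Full discrete convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def apply_brir(signal, left_ir, right_ir):
    """Convolve a mono signal with each channel of a two-channel response."""
    return convolve(signal, left_ir), convolve(signal, right_ir)


# Toy values for illustration only; they are NOT taken from the database.
dry = [1.0, 0.5, 0.0, -0.5]
left_ir = [1.0, 0.0, 0.25]
right_ir = [0.5, 0.25]

left, right = apply_brir(dry, left_ir, right_ir)
print(left)
print(right)
```

Each output channel is `len(signal) + len(impulse_response) - 1` samples long, so the reverberant tail extends past the end of the dry signal.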
Example | ||
^^^^^^^ | ||
|
||
:file:`data/air_binaural_stairway_1_1_0.wav` | ||
|
||
.. image:: ../air.png | ||
|
||
.. raw:: html | ||
|
||
<p><audio controls src="air/data/air_binaural_stairway_1_1_0.wav"></audio></p> | ||
|
||
Tables | ||
^^^^^^ | ||
|
||
.. csv-table:: | ||
:header: ID,Type,Columns | ||
:widths: 20, 10, 70 | ||
|
||
"brir", "filewise", "room, azimuth" | ||
"phone", "filewise", "room, mode" | ||
"rir", "filewise", "room, distance, reverberation-time" | ||
|
||
|
||
Schemes | ||
^^^^^^^ | ||
|
||
.. csv-table:: | ||
:header: ID,Dtype,Labels,Mappings | ||
|
||
"azimuth", "float", "" | ||
"distance", "float", "" | ||
"mode", "str", "hand-held, hands-free" | ||
"reverberation-time", "float", "" | ||
"room", "str", "aula_carolina, bathroom, booth, corridor, kitchen, lecture, meeting, office, stairway", "floor cover, furniture, room height, room length, room width, wall surface" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
.. _cough-speech-sneeze: | ||
|
||
cough-speech-sneeze | ||
------------------- | ||
|
||
Created by S Amiriparian, S Pugachevskiy, N Cummins, S Hantke, J Pohjalainen, G Keren, BW Schuller | ||
|
||
|
||
============= ====================== | ||
version `2.0.1 <https://github.com/audeering/cough-speech-sneeze/blob/main/CHANGELOG.md>`__ | ||
license `CC-BY-4.0 <https://creativecommons.org/licenses/by/4.0/>`__ | ||
source Dataset based on the publication of Shahin Amiriparian: "Amiriparian, S., Pugachevskiy, S., Cummins, N., Hantke, S., Pohjalainen, J., Keren, G., Schuller, B., 2017. CAST a database: Rapid targeted large-scale big data acquisition via small-world modelling of social media platforms, in: 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE, pp. 340–345. https://doi.org/10.1109/ACII.2017.8273622" | ||
usage commercial | ||
languages | ||
format wav | ||
channel 1 | ||
sampling rate 16000, 44100 | ||
bit depth 16 | ||
duration 0 days 03:02:29.436148526 | ||
files 4310 | ||
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/cough-speech-sneeze>`__ | ||
published 2024-01-02 by audeering | ||
============= ====================== | ||
|
||
|
||
Description | ||
^^^^^^^^^^^ | ||
|
||
Cough-speech-sneeze: a data set of human sounds. This dataset was collected by Dr. Shahin Amiriparian. It contains samples of human speech, coughing, and sneezing collected from YouTube, as well as silence clips. The original publication of this (possibly since extended) dataset is the following: Amiriparian, S., Pugachevskiy, S., Cummins, N., Hantke, S., Pohjalainen, J., Keren, G., Schuller, B., 2017. CAST a database: Rapid targeted large-scale big data acquisition via small-world modelling of social media platforms, in: 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE, pp. 340–345. https://doi.org/10.1109/ACII.2017.8273622 | ||
|
||
Example | ||
^^^^^^^ | ||
|
||
:file:`coughing/6hw6_4eb_hq_18.41-19.81.wav` | ||
|
||
.. image:: ../cough-speech-sneeze.png | ||
|
||
.. raw:: html | ||
|
||
<p><audio controls src="cough-speech-sneeze/coughing/6hw6_4eb_hq_18.41-19.81.wav"></audio></p> | ||
|
||
Tables | ||
^^^^^^ | ||
|
||
.. csv-table:: | ||
:header: ID,Type,Columns | ||
:widths: 20, 10, 70 | ||
|
||
"files", "filewise", "category, duration" | ||
|
||
|
||
Schemes | ||
^^^^^^^ | ||
|
||
.. csv-table:: | ||
:header: ID,Dtype,Labels | ||
|
||
"category", "str", "coughing, silence, sneezing, speech" | ||
"duration", "time", "" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
.. _crema-d: | ||
|
||
crema-d | ||
------- | ||
|
||
Created by Houwei Cao, David G. Cooper, Michael K. Keutmann, Ruben C. Gur, Ani Nenkova, Ragini Verma, Samantha L Moore, Adam Savitt | ||
|
||
|
||
============= ====================== | ||
version `1.2.0 <https://github.com/audeering/crema-d/blob/main/CHANGELOG.md>`__ | ||
license `Open Data Commons Open Database License (ODbL) v1.0 <http://opendatacommons.org/licenses/odbl/1.0/>`__ | ||
source https://github.com/CheyneyComputerScience/CREMA-D | ||
usage commercial | ||
languages English | ||
format wav | ||
channel 1 | ||
sampling rate 16000 | ||
bit depth 16 | ||
duration 0 days 05:15:21.404187500 | ||
files 7441 | ||
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/crema-d>`__ | ||
published 2024-01-02 by audeering | ||
============= ====================== | ||
|
||
|
||
Description | ||
^^^^^^^^^^^ | ||
|
||
CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset. CREMA-D is a data set of 7,442 original clips from 91 actors. These clips were from 48 male and 43 female actors between the ages of 20 and 74 coming from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified). When using the database commercially, the database must be referenced together with its license. | ||
|
||
Example | ||
^^^^^^^ | ||
|
||
:file:`1001/1001_TAI_HAP_XX.wav` | ||
|
||
.. image:: ../crema-d.png | ||
|
||
.. raw:: html | ||
|
||
<p><audio controls src="crema-d/1001/1001_TAI_HAP_XX.wav"></audio></p> | ||
|
||
Tables | ||
^^^^^^ | ||
|
||
.. csv-table:: | ||
:header: ID,Type,Columns | ||
:widths: 20, 10, 70 | ||
|
||
"emotion.categories.desired.dev", "filewise", "emotion, emotion.intensity" | ||
"emotion.categories.desired.test", "filewise", "emotion, emotion.intensity" | ||
"emotion.categories.desired.train", "filewise", "emotion, emotion.intensity" | ||
"emotion.categories.dev", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level" | ||
"emotion.categories.dev.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement" | ||
"emotion.categories.dev.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness" | ||
"emotion.categories.face.dev", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level" | ||
"emotion.categories.face.dev.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement" | ||
"emotion.categories.face.dev.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness" | ||
"emotion.categories.face.test", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level" | ||
"emotion.categories.face.test.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement" | ||
"emotion.categories.face.test.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness" | ||
"emotion.categories.face.train", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level" | ||
"emotion.categories.face.train.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement" | ||
"emotion.categories.face.train.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness" | ||
"emotion.categories.multimodal.dev", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level" | ||
"emotion.categories.multimodal.dev.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement" | ||
"emotion.categories.multimodal.dev.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness" | ||
"emotion.categories.multimodal.test", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level" | ||
"emotion.categories.multimodal.test.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement" | ||
"emotion.categories.multimodal.test.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness" | ||
"emotion.categories.multimodal.train", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level" | ||
"emotion.categories.multimodal.train.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement" | ||
"emotion.categories.multimodal.train.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness" | ||
"emotion.categories.test", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level" | ||
"emotion.categories.test.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement" | ||
"emotion.categories.test.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness" | ||
"emotion.categories.train", "filewise", "emotion.0, emotion.0.level, emotion.1, emotion.1.level, emotion.2, emotion.2.level, emotion.3, emotion.3.level, emotion.4, emotion.4.level" | ||
"emotion.categories.train.gold_standard", "filewise", "emotion, emotion.level, emotion.agreement" | ||
"emotion.categories.train.votes", "filewise", "anger, disgust, fear, happiness, neutral, sadness" | ||
"files", "filewise", "speaker, corrupted" | ||
"sentence", "filewise", "sentence" | ||
|
||
|
||
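The `*.votes` tables above hold one vote count per emotion, while the `*.gold_standard` tables hold a single `emotion` label together with `emotion.agreement`. One plausible way to relate the two is a plain majority vote, sketched below. This is an assumption made for illustration: the diff does not state how the gold standard in this database was actually derived, and the function name `majority_vote`, the tie handling, and the example counts are all hypothetical.

```python
def majority_vote(votes):
    """Return (label, agreement) for a dict mapping emotion -> vote count.

    Hypothetical rule: the label with the most votes wins, agreement is
    the winning share of all votes, and ties yield "no_agreement".
    """
    total = sum(votes.values())
    if total == 0:
        return "no_agreement", 0.0
    top = max(votes.values())
    winners = sorted(label for label, count in votes.items() if count == top)
    if len(winners) > 1:
        return "no_agreement", top / total
    return winners[0], top / total


# Made-up vote counts; NOT taken from the database.
example = {
    "anger": 1,
    "disgust": 0,
    "fear": 0,
    "happiness": 7,
    "neutral": 2,
    "sadness": 0,
}
label, agreement = majority_vote(example)
print(label, agreement)
```

Under this rule the agreement value never exceeds 1, which is consistent with the maximum of 1 listed for `emotion.agreement` in the schemes table, but the actual computation may differ.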
Schemes | ||
^^^^^^^ | ||
|
||
.. csv-table:: | ||
:header: ID,Dtype,Min,Max,Labels,Mappings | ||
|
||
"corrupted", "bool", "", "", "" | ||
"emotion", "str", "", "", "anger, disgust, fear, happiness, neutral, no_agreement, sadness" | ||
"emotion.agreement", "float", "", "1", "" | ||
"emotion.intensity", "str", "", "", "high, low, mid, unspecified" | ||
"emotion.level", "float", "", "100", "" | ||
"sentence", "str", "", "", "DFA, IEO, IOM, ITH, ITS, IWL, IWW, MTI, TAI, TIE, TSI, WSI", "✓" | ||
"speaker", "int", "", "", "1001, 1002, 1003, 1004, 1005, 1006, 1007, [...], 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091", "age, ethnicity, race, sex" | ||
"votes", "int", "", "", "" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
.. _micirp: | ||
|
||
micirp | ||
------ | ||
|
||
Created by Stewart Tavener (Xaudia.com) | ||
|
||
|
||
============= ====================== | ||
version `1.0.0 <https://github.com/audeering/micirp/blob/main/CHANGELOG.md>`__ | ||
license `CC-BY-SA-4.0 <https://creativecommons.org/licenses/by-sa/4.0/>`__ | ||
source http://micirp.blogspot.com/ | ||
usage commercial | ||
languages | ||
format wav | ||
channel 1 | ||
sampling rate 44100, 48000 | ||
bit depth 24 | ||
duration 0 days 00:00:27.341591837 | ||
files 66 | ||
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/micirp>`__ | ||
published 2023-12-21 by audeering | ||
============= ====================== | ||
|
||
|
||
Description | ||
^^^^^^^^^^^ | ||
|
||
The Microphone Impulse Response Project (MicIRP) contains impulse response data for vintage microphones. The impulse response files were created using the analysis software FuzzMeasure. The microphones were tested using a swept-sine method in a small booth heavily treated with acoustic foam, with each microphone placed about 20 to 30 cm from the source. Although the recording system and booth are calibrated regularly with a Beyerdynamic measurement microphone, there are problems comparing, for example, a figure-8 ribbon with an omnidirectional standard, as they will see different amounts of reflections from the side. It should therefore be noted that the impulse response files describe the microphones as measured in the booth, rather than in free space. | ||
|
||
Example | ||
^^^^^^^ | ||
|
||
:file:`dirs/IR_AKGD12.wav` | ||
|
||
.. image:: ../micirp.png | ||
|
||
.. raw:: html | ||
|
||
<p><audio controls src="micirp/dirs/IR_AKGD12.wav"></audio></p> | ||
|
||
Tables | ||
^^^^^^ | ||
|
||
.. csv-table:: | ||
:header: ID,Type,Columns | ||
:widths: 20, 10, 70 | ||
|
||
"files", "filewise", "manufacturer" | ||
|
||
|
||
Schemes | ||
^^^^^^^ | ||
|
||
.. csv-table:: | ||
:header: ID,Dtype,Labels | ||
|
||
"manufacturer", "str", "AKG, Altec, American, Amperite, Astatic, B&O, BBC, [...], Oktava, RCA, Reslo, STC, Shure, Sony, Telefunken, Toshiba" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
.. _musan: | ||
|
||
musan | ||
----- | ||
|
||
Created by David Snyder, Guoguo Chen, Daniel Povey | ||
|
||
|
||
============= ====================== | ||
version `1.0.0 <https://github.com/audeering/musan/blob/main/CHANGELOG.md>`__ | ||
license `CC-BY-4.0 <https://creativecommons.org/licenses/by/4.0/>`__ | ||
source http://www.openslr.org/17/ | ||
usage commercial | ||
languages ara, zho, dan, nld, eng, fra, deu, heb, hun, ita, jpn, lat, pol, por, rus, spa, tgl | ||
format wav | ||
channel 1 | ||
sampling rate 16000 | ||
bit depth 16 | ||
duration 4 days 13:17:22.582937499 | ||
files 2016 | ||
repository `data-public <https://audeering.jfrog.io/artifactory/webapp/#/artifacts/browse/tree/General/data-public/musan>`__ | ||
published 2023-12-20 by audeering-unittest | ||
============= ====================== | ||
|
||
|
||
Description | ||
^^^^^^^^^^^ | ||
|
||
The goal of this corpus is to provide data for music/speech discrimination, speech/nonspeech detection, and voice activity detection. The corpus is divided into music, speech, and noise portions. In total there are approximately 109 hours of audio. Reference: https://arxiv.org/abs/1510.08484 | ||
|
||
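The "approximately 109 hours" figure can be cross-checked against the `duration` field in the table above. The sketch below converts a duration string of that form into hours using only the standard library. The helper name `duration_to_hours` is hypothetical, and the regular expression assumes the `N days HH:MM:SS.fraction` layout seen in these dataset cards.

```python
import re


def duration_to_hours(text):
    """Convert a string like '4 days 13:17:22.582937499' to hours."""
    match = re.fullmatch(r"(\d+) days (\d+):(\d+):(\d+(?:\.\d+)?)", text)
    if match is None:
        raise ValueError(f"unexpected duration format: {text!r}")
    days, hours, minutes = (int(match.group(i)) for i in (1, 2, 3))
    seconds = float(match.group(4))
    return days * 24 + hours + minutes / 60 + seconds / 3600


hours = duration_to_hours("4 days 13:17:22.582937499")
print(round(hours, 2))
```

The result is a little over 109 hours, in line with the total stated in the description.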
Example | ||
^^^^^^^ | ||
|
||
:file:`noise/free-sound/noise-free-sound-0324.wav` | ||
|
||
.. image:: ../musan.png | ||
|
||
.. raw:: html | ||
|
||
<p><audio controls src="musan/noise/free-sound/noise-free-sound-0324.wav"></audio></p> | ||
|
||
Tables | ||
^^^^^^ | ||
|
||
.. csv-table:: | ||
:header: ID,Type,Columns | ||
:widths: 20, 10, 70 | ||
|
||
"files", "filewise", "duration" | ||
"music", "filewise", "genre, vocals, artist, composer" | ||
"music.fma", "filewise", "genre, vocals, artist, composer" | ||
"music.fma-western-art", "filewise", "genre, vocals, artist, composer" | ||
"music.hd-classical", "filewise", "genre, vocals, artist, composer" | ||
"music.jamendo", "filewise", "genre, vocals, artist, composer" | ||
"music.rfm", "filewise", "genre, vocals, artist, composer" | ||
"noise", "filewise", "background_noise" | ||
"noise.free-sound", "filewise", "background_noise" | ||
"noise.sound-bible", "filewise", "background_noise" | ||
"speech", "filewise", "gender, language" | ||
"speech.librivox", "filewise", "gender, language" | ||
"speech.us-gov", "filewise", "gender, language" | ||
|
||
|
||
Schemes | ||
^^^^^^^ | ||
|
||
.. csv-table:: | ||
:header: ID,Dtype,Labels | ||
|
||
"artist", "str", "" | ||
"background_noise", "bool", "" | ||
"composer", "str", "" | ||
"duration", "time", "" | ||
"gender", "str", "female, male" | ||
"genre", "str", "" | ||
"language", "str", "ara, dan, deu, eng, fra, heb, hun, [...], lat, nld, pol, por, rus, spa, tgl, zho" | ||
"vocals", "bool", "" |