Multi-Device Speech Enhancement for Privacy and Quality

Problem scenario: When several people in the same room speak at the same time, their speech overlaps, and each speaker leaks into the recording of the other speaker's conversation. This degrades speech quality and poses a privacy risk when confidential information leaks into a different conversation.

Key metrics of solution:

3.7 million network parameters, enabling real-time processing
PESQ score of 3.7 after attenuation
Listening tests confirm that the subjective speech quality is doubled (MUSHRA score of 67)
No prior information is needed to identify the target speaker
Reduction of mutual information by 60%

A multi-device setup can isolate the dominant speaker by attenuating the undesired speaker. For this, we use an adapted convolutional time-domain audio separation network (Conv-TasNet). It takes two microphone inputs: (1) the mixed channel of the speaker to be isolated, and (2) the mixed channel of the speaker to be attenuated.
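As a rough illustration of the two-input arrangement described above, the recordings from the two devices could be paired into a single two-channel array before being passed to the network. This is a minimal sketch using NumPy only; the function name `stack_device_channels`, the channel order, and the trimming to a common length are assumptions made for illustration and are not taken from `dataloader.py`.

```python
import numpy as np

def stack_device_channels(target_mix, interferer_mix):
    """Pair two single-channel recordings into one (2, T) array.

    Row 0: mixture recorded by the device of the speaker to isolate.
    Row 1: mixture recorded by the device of the speaker to attenuate.
    (Channel order is an assumption of this sketch.)
    """
    target_mix = np.asarray(target_mix, dtype=np.float32)
    interferer_mix = np.asarray(interferer_mix, dtype=np.float32)
    # Trim both recordings to the shorter length so they can be stacked.
    n = min(len(target_mix), len(interferer_mix))
    return np.stack([target_mix[:n], interferer_mix[:n]], axis=0)

# Placeholder signals standing in for real recordings.
x = stack_device_channels(np.zeros(10), np.ones(8))
print(x.shape)
```

Real recordings from separate devices would typically also need time alignment and a common sample rate, which this sketch does not handle.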

The network consists of two parts. The first is a masking network, which generates a mask for the undesired speaker; the inverse of that mask is then applied to the target speaker's channel to remove the undesired speech content. The second is an enhancement block, which further improves the speech quality.
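The inverse-masking step can be sketched as follows. The text does not define "inverse" precisely; this example assumes a mask with values in [0, 1] and takes the inverse to be `1 - mask`, applied element-wise to a feature representation of the target channel. The function name `apply_inverse_mask` and the feature shape are hypothetical, and the actual implementation in `model.py` may differ.

```python
import numpy as np

def apply_inverse_mask(target_features, undesired_mask):
    """Suppress the undesired speaker in the target channel's features.

    Assumes the mask marks where the undesired speaker is active
    (1 = fully undesired, 0 = not present), so 1 - mask keeps the rest.
    """
    mask = np.clip(undesired_mask, 0.0, 1.0)
    return target_features * (1.0 - mask)

rng = np.random.default_rng(0)
features = rng.standard_normal((4, 6))  # hypothetical (basis channels, frames)
mask = np.zeros((4, 6))
mask[:, 3:] = 1.0                       # undesired speaker active in last 3 frames
cleaned = apply_inverse_mask(features, mask)
```

In this toy case the first three frames pass through unchanged and the last three are zeroed. A learned mask would take intermediate values, and the enhancement block would then operate on the masked output.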

Listening Example 1

Listening Example for original speech mixture

Listening Example for isolated speaker

Listening Example 2

Listening Example for original speech mixture

Listening Example for isolated speaker

Listening Example 3

Listening Example for original speech mixture

Listening Example for isolated speaker

Repository files:

AudioFiles.txt
IsoNet.pt
PossibleFemaleSpeakers.txt
PossibleMaleSpeakers.txt
SPEAKERS.txt
dataloader.py
main.py
metafiles.py
metafiles_creation.txt
model.py
readme.md
requirements.txt
roomacoustics.py
tools.py
training.py
training_list.txt

Speech-Interaction-Technology-Aalto-U/IsoNet
