Audio-WestlakeU/Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement

 
 

Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement

This repository is the official PyTorch implementation of Microphone Array Generalization for Multichannel Narrowband Deep Speech Enhancement, accepted at Interspeech 2021.

Introduction

Our work addresses the problem of microphone array generalization for deep-learning-based end-to-end multichannel speech enhancement. We aim to train a single network that can potentially perform well on unseen microphone arrays. The goal is for the network to learn the universal information for speech enhancement that is available for any array geometry, rather than the characteristics of one particular array. To this end, a single network is trained on data recorded by various virtual microphone arrays of different geometries, simulated with the RIR Generator [1] and simulated diffuse noise [2]. We design three variants of our recently proposed original NarrowBand Deep Filtering (NBDF) network [3] to cope with a variable number of microphones.
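The idea of training on many virtual arrays can be illustrated with a minimal sketch that draws random array geometries inside a shoebox room. This is not the repository's simulation code: the function name `sample_virtual_array`, the planar disc layout, and all default values (`max_radius`, `margin`, the room size, the range of 2 to 8 microphones) are assumptions chosen for illustration only. In practice the sampled positions would be passed to an RIR simulator such as the RIR Generator [1].

```python
import math
import random


def sample_virtual_array(num_mics, room_dims, max_radius=0.1, margin=0.5, rng=None):
    """Sample a hypothetical random planar microphone array in a shoebox room.

    room_dims is (length, width, height) in metres.
    Returns a list of (x, y, z) microphone positions in metres.
    """
    if rng is None:
        rng = random.Random()
    # Choose an array centre so that every microphone stays at least
    # `margin` metres away from the walls, floor, and ceiling.
    lo = margin + max_radius
    cx = rng.uniform(lo, room_dims[0] - lo)
    cy = rng.uniform(lo, room_dims[1] - lo)
    cz = rng.uniform(margin, room_dims[2] - margin)
    mics = []
    for _ in range(num_mics):
        # Uniform sample in a horizontal disc of radius max_radius.
        r = max_radius * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        mics.append((cx + r * math.cos(theta), cy + r * math.sin(theta), cz))
    return mics


# Draw a few virtual arrays with different geometries and microphone counts.
rng = random.Random(0)
room = (6.0, 5.0, 3.0)  # assumed room size in metres
arrays = [sample_virtual_array(rng.randint(2, 8), room, rng=rng) for _ in range(4)]
for i, mics in enumerate(arrays):
    print(f"array {i}: {len(mics)} microphones")
```

Each call produces a different geometry, so a training set built this way exposes the network to many arrays rather than a single fixed one.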

figure 1

Key Features

  • Simulated_RIR_Generator
  • Network
    • original NBDF (CP-NBDF)
    • CC-NBDF
    • PW-NBDF
  • Train
  • Inference
  • Evaluation
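As a rough illustration of what "narrow-band" means for the network variants listed above, the sketch below rearranges a multichannel STFT into one sequence per frequency, following the general formulation of [3]. It is an assumption-laden sketch and not the repository's data pipeline: the function name `to_narrowband_sequences`, the nested-list layout, and the interleaved real/imaginary feature order are all hypothetical.

```python
def to_narrowband_sequences(stft):
    """Rearrange a multichannel STFT into per-frequency sequences.

    stft[m][f][t] is the complex STFT coefficient of microphone m at
    frequency f and frame t (hypothetical layout).
    Returns seqs[f][t] = [Re(mic 0), Im(mic 0), Re(mic 1), Im(mic 1), ...].
    """
    num_mics = len(stft)
    num_freqs = len(stft[0])
    num_frames = len(stft[0][0])
    seqs = []
    for f in range(num_freqs):
        frames = []
        for t in range(num_frames):
            feat = []
            for m in range(num_mics):
                c = stft[m][f][t]
                feat.extend((c.real, c.imag))
            frames.append(feat)
        seqs.append(frames)
    return seqs


# Toy input: 2 microphones, 3 frequencies, 4 frames.
toy = [[[complex(m, 10 * f + t) for t in range(4)] for f in range(3)]
       for m in range(2)]
seqs = to_narrowband_sequences(toy)
print(len(seqs), len(seqs[0]), len(seqs[0][0]))  # 3 4 4
print(seqs[1][2])  # [0.0, 12.0, 1.0, 12.0]
```

In this layout the feature size is twice the number of microphones, so it changes from array to array. That dependence is the reason a network needs some mechanism to handle a variable number of microphones.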

Get started

(1) Clone:

$ git clone https://github.com/Audio-WestlakeU/Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement.git

(2) Requirements:

$ pip install -r requirements.txt

The RIR Generator [1], the coherent multichannel noise generator [2], and the wind noise simulator [4] are also required.

References

[1] E. A. Habets, “Room impulse response generator,” Technische Universiteit Eindhoven, Tech. Rep., vol. 2, no. 2.4, p. 1, 2006.

[2] E. A. Habets, I. Cohen, and S. Gannot, “Generating nonstationary multisensor signals under a spatial coherence constraint,” The Journal of the Acoustical Society of America, vol. 124, no. 5, pp. 2911–2917, 2008.

[3] X. Li and R. Horaud, “Narrow-band deep filtering for multichannel speech enhancement,” arXiv preprint arXiv:1911.10791, 2019.

[4] D. Mirabilii and E. A. Habets, “Simulating multi-channel wind noise based on the Corcos model,” in 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC). IEEE, 2018, pp. 560–564.
