UDiffSE: Unsupervised Diffusion-based Speech Enhancement

This repository contains the PyTorch implementation for the following paper:

B. Nortier, M. Sadeghi, and R. Serizel, “Unsupervised speech enhancement with diffusion-based generative models,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024.

Installation

Create a virtual environment using Python 3.8 and install the package dependencies via

pip install -r requirements.txt

The line pypesq==1.2.4 in requirements.txt may cause installation errors. In that case, we recommend installing pypesq directly from its GitHub repository instead:

pip install https://github.com/vBaiCai/python-pesq/archive/master.zip
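
Before or after installing, it can help to confirm that the active interpreter matches the recommended Python version and that pypesq is importable. The following is a minimal sketch and not part of the repository: the helper name check_environment is hypothetical, and the version check simply mirrors the Python 3.8 recommendation above.

```python
import importlib.util
import sys


def check_environment(required=(3, 8), module="pypesq"):
    """Report whether the interpreter version and a module are as expected.

    `required` mirrors the Python 3.8 recommendation in this README.
    This helper is illustrative and not part of the repository.
    """
    version_ok = tuple(sys.version_info[:2]) == tuple(required)
    # find_spec returns None when a top-level module cannot be located.
    module_found = importlib.util.find_spec(module) is not None
    return {"python_ok": version_ok, "module_found": module_found}


if __name__ == "__main__":
    status = check_environment()
    print(f"Python version matches 3.8: {status['python_ok']}")
    print(f"pypesq importable: {status['module_found']}")
```

If pypesq is reported as not importable after running the first command, the alternative install command above is the one to try.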

Training

A diffusion-based clean speech generative model can be trained using train.py:

python train.py --transform_type exponent --format wsj0 --gpus 2 --batch_size 14 --resume_from_checkpoint file/to/last.ckpt
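
When launching training from a script, the same command can be assembled programmatically. The sketch below uses only the flags that appear in the command above. The helper name build_train_command is hypothetical, and treating --resume_from_checkpoint as optional is an assumption based on the flag's name rather than something this README states.

```python
import shlex
import sys


def build_train_command(gpus=2, batch_size=14, checkpoint=None):
    """Assemble the train.py invocation shown in this README as an argument list."""
    cmd = [
        sys.executable, "train.py",
        "--transform_type", "exponent",
        "--format", "wsj0",
        "--gpus", str(gpus),
        "--batch_size", str(batch_size),
    ]
    # Assumption: the resume flag can be left out when starting from scratch.
    if checkpoint is not None:
        cmd += ["--resume_from_checkpoint", str(checkpoint)]
    return cmd


if __name__ == "__main__":
    cmd = build_train_command(checkpoint="file/to/last.ckpt")
    # Print the command for inspection instead of running it.
    print(shlex.join(cmd))
    # To launch training from the repository root, uncomment:
    # import subprocess; subprocess.run(cmd, check=True)
```

Keeping the arguments as a list avoids shell quoting problems when the checkpoint path contains spaces.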

Pretrained checkpoint

A pretrained checkpoint for a clean speech generative model trained on the WSJ0 dataset can be downloaded via this Google Drive link.
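
After downloading, a quick way to check that the file is intact is to list its top-level keys. This is a minimal sketch under stated assumptions: it supposes the checkpoint was written with torch.save and holds a dictionary, as PyTorch Lightning checkpoints do, which this README does not confirm. The helper name list_checkpoint_keys and the example path are hypothetical.

```python
from pathlib import Path


def list_checkpoint_keys(path):
    """Return the sorted top-level keys stored in a checkpoint file.

    Assumes the file was written with torch.save and contains a dict;
    that format is an assumption, not something the README states.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"No checkpoint found at {path}")
    import torch  # imported lazily so the path check works without PyTorch
    # map_location="cpu" lets the file load on machines without a GPU.
    checkpoint = torch.load(str(path), map_location="cpu")
    return sorted(checkpoint.keys())


if __name__ == "__main__":
    # Hypothetical location; replace with wherever the download was saved.
    try:
        print(list_checkpoint_keys("path/to/checkpoint.ckpt"))
    except FileNotFoundError as err:
        print(err)
```

Because torch.load unpickles the file, it should only be used on checkpoints from a trusted source.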

Demo

A demo of the UDiffSE framework is provided in demo.ipynb. The notebook first samples from the clean speech prior learned by the diffusion-based generative model, then enhances a noisy test speech signal.

Audio samples

Audio samples comparing the speech enhancement performance of UDiffSE, RVAE [1], and SGMSE+ [2] on the WSJ0-QUT and TCD-TIMIT datasets can be found on UDiffSE's webpage.

Supplementary material

Supplementary material, including additional details, discussions, and parameter studies that expand on our work, is provided in the docs directory (direct link).

Bibtex

@inproceedings{nortier2023unsupervised,
  title={Unsupervised speech enhancement with diffusion-based generative models},
  author={Nortier, Bern{\'e} and Sadeghi, Mostafa and Serizel, Romain},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2024},
  organization={IEEE}
}

References

[1] S. Leglaive, X. Alameda-Pineda, L. Girin, and R. Horaud, “A recurrent variational autoencoder for speech enhancement,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020.

[2] J. Richter, S. Welker, J.-M. Lemercier, B. Lay, and T. Gerkmann, “Speech enhancement and dereverberation with diffusion-based generative models,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023.
