Denoising Adversarial Network (DAN)

A denoising autoencoder trained within a generative adversarial network (GAN), intended for denoising audio signals.

Overview

An extremely lightweight hybrid denoising AE/GAN in which the generator acts as the denoiser. The current configuration uses only 282,177 generator parameters and 10,577 discriminator parameters, which makes it feasible to train on a standard laptop (no external GPU needed!). The tradeoff is capacity: the model is currently set up to remove noise only from signals sampled at 256 samples/second. In particular, each signal is 1 second long (256 samples) and is composed of 3 sine waves whose frequencies are drawn uniformly at random from [10 Hz, 64 Hz]. The Gaussian white noise therefore occupies the band (0 Hz, 128 Hz], up to the Nyquist frequency.

Trained denoiser outputs are compared against a lowpass filter. As we'll see, the lowpass filter excels at removing high frequencies but fails to eliminate noise within the generated sine band. The lightweight denoiser does not remove high frequencies completely, but it can significantly reduce noise within the generated sine band.
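For reference, such a lowpass baseline takes only a few lines. This is a sketch, not the repo's exact filter (which is not specified here); a zero-phase Butterworth filter with its cutoff at the 64 Hz band edge is assumed:

import numpy as np
from scipy.signal import butter, filtfilt

FS = 256  # samples/second

def lowpass_baseline(mixed: np.ndarray, cutoff_hz: float = 64.0, order: int = 4) -> np.ndarray:
    # Zero-phase Butterworth lowpass: removes noise above the sine band,
    # but leaves in-band noise (10-64 Hz) untouched.
    b, a = butter(order, cutoff_hz / (FS / 2), btype="low")
    return filtfilt(b, a, mixed)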

Data Creation

The sinusoidal compositions (clean signals) are generated with component frequencies drawn uniformly at random from [10 Hz, 64 Hz]. Composing each signal from 3 components significantly reduces the likelihood of training signals reappearing in the validation set.

The noise is Gaussian white noise (GWN): each of the 256 amplitude samples in a generated signal is drawn from a normal distribution with mean 0 and standard deviation 0.5.
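A minimal sketch of this generation step, assuming unit amplitude and zero phase for each sine component (the repo does not state these here):

import numpy as np

FS, N_SINES = 256, 3
t = np.arange(FS) / FS  # 256 samples spanning 1 second

def make_clean(rng: np.random.Generator) -> np.ndarray:
    # Sum of 3 sines with frequencies drawn uniformly from [10, 64] Hz.
    freqs = rng.uniform(10.0, 64.0, size=N_SINES)
    return np.sin(2 * np.pi * freqs[:, None] * t).sum(axis=0)

def make_noise(rng: np.random.Generator) -> np.ndarray:
    # Gaussian white noise: each sample ~ N(0, 0.5**2).
    return rng.normal(0.0, 0.5, size=t.shape)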

The clean and noise signals are then mixed into "mixed" signals at a signal-to-noise ratio (SNR) of -2 dB. To make the reconstruction loss possible later on, the i-th generated clean signal is paired with the i-th generated mixed signal, for all generated signals.
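Mixing at a target SNR amounts to rescaling the noise before adding it. A sketch, assuming the quoted SNR of -2 is in decibels:

def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float = -2.0) -> np.ndarray:
    # Scale the noise so that 10*log10(P_clean / P_noise) equals snr_db.
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise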

A total of 1800 samples are generated and split into training and validation sets of 1600 and 200 signals, respectively. The following image shows each type of signal from each split:

Created Data

Generator

The Generator is an autoencoder based closely on this article by MathWorks. It takes a noisy signal as input and outputs a denoised version of that signal.

Generator Architecture
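For orientation, an illustrative PyTorch encoder/decoder of this shape is sketched below. The layer sizes are assumptions for illustration and do not reproduce the repo's exact 282,177-parameter generator:

import torch.nn as nn

class Denoiser(nn.Module):
    # Illustrative 1-D convolutional autoencoder: two stride-2 conv layers
    # compress 256 samples down to 64, two transposed convs expand back to 256.
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=9, stride=2, padding=4), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=9, stride=2, padding=4), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(64, 32, kernel_size=9, stride=2,
                               padding=4, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(32, 1, kernel_size=9, stride=2,
                               padding=4, output_padding=1),
        )

    def forward(self, x):  # x: (batch, 1, 256) noisy signal
        return self.decoder(self.encoder(x))  # (batch, 1, 256) denoised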

Discriminator

The Discriminator is tasked with distinguishing real clean signals from faux clean signals (denoised signals produced by the generator).

Discriminator Architecture
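Continuing the sketch above, a small 1-D CNN classifier in the same spirit (again, sizes are assumptions, not the repo's exact 10,577-parameter model):

class Discriminator(nn.Module):
    # Small 1-D CNN that outputs a single real-vs-fake logit per signal.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=9, stride=2, padding=4), nn.LeakyReLU(0.2),
            nn.Conv1d(8, 16, kernel_size=9, stride=2, padding=4), nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.Linear(16 * 64, 1),  # 256 -> 128 -> 64 after two stride-2 convs
        )

    def forward(self, x):  # x: (batch, 1, 256)
        return self.net(x)  # (batch, 1) logit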

Training

The Discriminator is trained in classic GAN fashion, using BCE loss on its predictions for clean (real) versus denoised (fake) signals.

The Generator, however, is trained on a hybrid loss function. It uses an adversarial loss, as in a standard GAN, but also a reconstruction loss. The adversarial loss is just the BCE loss measuring how well the generator "fools" the discriminator. The reconstruction loss is the L1 norm between the clean signal and the generated denoised signal. Here is another MathWorks article that incorporates this same logic. A sketch of one combined update is shown below.
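This sketch uses the models above, assumes the discriminator outputs logits, and introduces an assumed reconstruction weight lambda_rec (the repo's actual weighting may differ):

import torch
import torch.nn.functional as F

def gan_step(G, D, opt_g, opt_d, clean, mixed, lambda_rec=100.0):
    # One adversarial update; lambda_rec balances the two generator terms.
    real = torch.ones(clean.size(0), 1)
    fake = torch.zeros(clean.size(0), 1)

    # Discriminator: BCE on clean (real) vs. denoised (fake) signals.
    denoised = G(mixed).detach()  # detach so only D's weights update here
    d_loss = (F.binary_cross_entropy_with_logits(D(clean), real)
              + F.binary_cross_entropy_with_logits(D(denoised), fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: adversarial BCE (fool D) + L1 reconstruction loss.
    denoised = G(mixed)
    g_loss = (F.binary_cross_entropy_with_logits(D(denoised), real)
              + lambda_rec * F.l1_loss(denoised, clean))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()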

Training Process

Training on the same dataset from the created-data example above, we trained the generator alone for 40 epochs, then trained both the generator and discriminator together for a set maximum number of epochs (200) or until patience (50) was exhausted. Patience was ultimately reached at epoch 107. A sketch of this schedule follows.
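Here the early-stopping criterion is assumed to be validation reconstruction (L1) loss; the repo's train_gan.py may track a different metric. This reuses gan_step from the previous sketch:

def train(G, D, opt_g, opt_d, train_loader, val_loader,
          pretrain_epochs=40, max_epochs=200, patience=50):
    # Phase 1: generator-only warm-up on the reconstruction loss.
    for _ in range(pretrain_epochs):
        for clean, mixed in train_loader:
            loss = F.l1_loss(G(mixed), clean)
            opt_g.zero_grad(); loss.backward(); opt_g.step()

    # Phase 2: joint adversarial training with early stopping.
    best, stale = float("inf"), 0
    for epoch in range(max_epochs):
        for clean, mixed in train_loader:
            gan_step(G, D, opt_g, opt_d, clean, mixed)
        with torch.no_grad():
            val = sum(F.l1_loss(G(m), c).item() for c, m in val_loader)
        best, stale = (val, 0) if val < best else (best, stale + 1)
        if stale >= patience:  # e.g., epoch 107 in the run above
            break

The following are examples from the validation set, shown in the order clean, mixed, denoised, and lowpass-filtered: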

Models Training

Denoiser Output Example 1

Denoiser Output Example 2

Denoiser Output Example 3

Denoiser Output Example 4

Denoiser Output Example 5

Denoiser Output Example 6

Denoiser Output Example 7

Denoiser Output Example 8

From this, we can see the variation in the generator's performance on denoising data from the validation set. In examples 4, 5, and 8 the signal is denoised/reconstructed almost seamlessly, with minimal leftover noise and little to no change in each component's original magnitude. In examples 1, 3, 6, and 7 we can see alterations in the magnitudes of some components, along with some strange random-frequency artifacts left over. In example 2, the denoiser deteriorates completely and collapses the signal into a single sine wave.

Setup

Training can be done on a standard laptop and takes around 10-15 minutes for 300 epochs (with the current model architectures).

Install the dependencies listed in requirements.txt.

Set up a data folder as follows:

GAN-Denoiser
├── data
│   ├── resampled
│   ├── train
│   │   ├── clean
│   │   ├── mixed
│   │   └── noise
│   └── validation
│       ├── clean
│       ├── mixed
│       └── noise
└── ...

This is where the created noise, sine, and mixed (noise + sine) signals will be stored.
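A convenience sketch for creating this layout (any equivalent mkdir commands work just as well):

from pathlib import Path

# Create the data tree expected by create_data.py.
for split in ("train", "validation"):
    for kind in ("clean", "mixed", "noise"):
        Path("data", split, kind).mkdir(parents=True, exist_ok=True)
Path("data", "resampled").mkdir(parents=True, exist_ok=True)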

Run create_data.py. This should fill the folders with data and display plots of a random signal from each folder.

Then run dataset.py to verify that everything is working properly (this is only a check; it won't modify anything).

Now we can run train_gan.py to train the models.

Fine-tune hyperparameters (generator architecture, discriminator architecture, data creation parameters, training parameters) as desired.

Citations

As mentioned above, much of the confidence in building this model, and the layout of the models and training process itself, is strongly inspired by this article by MathWorks. In essence, this project attempts a lightweight reproduction of that approach.

There have been notable advancements in denoising methods using similar architectures, such as the deep feature loss utilized by Zhang et al. 2024. This repo was inherently inspired by that paper, as an attempt to build a foundational understanding of such CNN denoisers.
