I want to reproduce this work at 16 kHz. I used frame length and frame shift settings analogous to the 22.05 kHz configuration: the frame shift is 11.6 ms (256 samples) at 22.05 kHz and 10 ms (160 samples) at 16 kHz, and the frame length is four times the frame shift. The FFT size is still 1024.
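For reference, here is a small sketch of how I derive those settings from the sample rate (the helper name and arguments are my own, not this repo's API):

```python
def stft_params(sr, shift_ms=10.0, win_factor=4, n_fft=1024):
    """Derive STFT settings that keep the same time scale across sample rates.

    Assumptions: frame shift is given in milliseconds, the frame length is
    win_factor times the shift, and the FFT size stays fixed at 1024
    (frames shorter than n_fft are zero-padded).
    """
    hop_length = round(sr * shift_ms / 1000)   # frame shift in samples
    win_length = hop_length * win_factor       # frame length = 4x shift
    return {"n_fft": n_fft, "hop_length": hop_length, "win_length": win_length}

# 16 kHz: 10 ms shift -> 160-sample hop, 640-sample window
print(stft_params(16000))
# 22.05 kHz: 11.6 ms shift -> 256-sample hop, 1024-sample window
print(stft_params(22050, shift_ms=256 / 22050 * 1000))
```

This is just to confirm the two configurations are time-aligned; the actual feature extraction in the repo may differ.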
Currently, my model has been trained for 200k steps on a 256-dimensional input, but noticeable phase artifacts remain. I would like to know whether this model is sensitive to the sampling rate, or whether my current number of training steps is simply insufficient.
The paper does not seem to specify how many steps the model was trained for, and I'm curious how many steps are generally sufficient for acceptable results (do I really need to run the full 3100 epochs?).
I would appreciate it if you could share your pretrained model weights! Thanks a lot!
Also, I noticed that the trained model parameter files are very small: the encoder file is only 543 kB and the generator file only 545 kB. Is this normal? It's impressive that so few parameters can handle this task!
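As a rough sanity check on those file sizes, here is how I estimated the parameter counts (assuming float32 weights at 4 bytes each and ignoring any file-format overhead, which is an assumption on my part):

```python
def approx_param_count(file_size_kb):
    """Estimate the number of float32 parameters in a weights file.

    Assumption: 1 kB = 1024 bytes, 4 bytes per parameter, no metadata overhead.
    """
    return file_size_kb * 1024 // 4

print(approx_param_count(543))  # encoder: roughly 139k parameters
print(approx_param_count(545))  # generator: roughly 139k parameters
```

So each file corresponds to only about 140k parameters, which is indeed tiny for a vocoder-like model.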
Besides, my mel loss on the validation set at 200k steps is about 0.3, which is much higher than the curve you provided. I trained this project in 'both' mode; could this also be a sampling-rate issue?