v2.2.1: Important notice and minor patch release
Vocoder fine-tuning is available
Everything about vocoder training, fine-tuning and research now has its own place: https://github.com/openvpi/SingingVocoders
Users can now fine-tune the shared NSF-HiFiGAN vocoder model on their own datasets without large computing resources. In most cases, vocoder fine-tuning can reduce the noise caused by mismatches between predicted and ground-truth mel-spectrograms on unseen datasets, improving the final audio quality. See the documentation in this repository for how to use custom vocoder models and deploy them to ONNX format.
Mutual influence between variance modules
Recent research by the developer team found mutual influence between the duration predictor, the pitch predictor and the variance predictor of a variance model. The findings have been written into the documentation as formal suggestions. Following these suggestions when training your variance models can improve accuracy and avoid unstable loudness.
Changes and bug fixes
This patch release contains the following changes:
- The pitch expressiveness factor is now exposed by default but can be disabled by --freeze_expr
- Note glide type can now be frozen by --freeze_glide for compatibility with OpenUTAU
- Shallow diffusion and FP16 AMP are now enabled by default
- The default f0_max configuration value is changed from 800 to 1100
- Model path can be specified by --ckpt when exporting custom vocoder model to ONNX
- Documentation about preparing and deploying custom vocoders is added and re-organized
- Melody encoder is added to the new variance model architecture graph
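To put the f0_max change in musical terms, the sketch below converts both default values to the nearest equal-tempered note. It assumes f0_max is a frequency in Hz and uses A4 = 440 Hz tuning; the helper names are illustrative and not part of the project.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def nearest_note(freq_hz):
    """Return the nearest note name and the deviation from it in cents."""
    midi = hz_to_midi(freq_hz)
    nearest = round(midi)
    cents = round((midi - nearest) * 100)
    # MIDI note 60 is C4, so the octave number is (note // 12) - 1.
    name = NOTE_NAMES[nearest % 12] + str(nearest // 12 - 1)
    return name, cents


# Assumption: both defaults are pitch ceilings expressed in Hz.
for f0_max in (800, 1100):
    name, cents = nearest_note(f0_max)
    print(f"f0_max = {f0_max} Hz is about {name} ({cents:+d} cents)")
```

Under these assumptions, the old default sits near G5 and the new one near C#6, so the ceiling rises by roughly five and a half semitones.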
The following bugs are fixed:
- A relative path bug caused by a custom checkpoint saving directory
- An interpolation error raised during variance model inference when all notes are rests
- Breathiness unexpectedly becoming NaN in some rare edge cases
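The all-rest case is a typical interpolation edge case: when rest notes are filled from their non-rest neighbours, there is nothing to interpolate from if every note is a rest. The sketch below illustrates one way to guard against it. It is not the project's actual fix; the function name, the data layout and the fallback value are all hypothetical.

```python
from bisect import bisect_left


def fill_rest_pitches(pitches, is_rest, fallback=60.0):
    """Fill the pitch of rest notes by linear interpolation over note indices.

    Hypothetical helper: `pitches` holds one value per note, `is_rest` flags
    the rests, and `fallback` is used when no non-rest note exists.
    """
    xs = [i for i, rest in enumerate(is_rest) if not rest]
    ys = [pitches[i] for i in xs]
    if not xs:
        # All notes are rests: no sample points, so skip interpolation.
        return [fallback] * len(pitches)

    filled = []
    for i, (pitch, rest) in enumerate(zip(pitches, is_rest)):
        if not rest:
            filled.append(pitch)
            continue
        k = bisect_left(xs, i)
        if k == 0:
            filled.append(ys[0])       # before the first non-rest note
        elif k == len(xs):
            filled.append(ys[-1])      # after the last non-rest note
        else:
            x0, x1 = xs[k - 1], xs[k]
            y0, y1 = ys[k - 1], ys[k]
            filled.append(y0 + (y1 - y0) * (i - x0) / (x1 - x0))
    return filled
```

With a mix of rests and notes the rests take interpolated values; with only rests the function returns the fallback for every note instead of raising.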
Known issues
When training with DDP, TensorBoard sometimes raises an error and stops updating after validation. As a temporary workaround, add the option --reload_multifile=true when launching TensorBoard.
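A minimal sketch of launching TensorBoard with this option from Python follows. The log directory is a placeholder, and the helper names are illustrative; only the --logdir and --reload_multifile flags come from TensorBoard itself.

```python
import subprocess


def tensorboard_command(logdir):
    """Build the TensorBoard command line with multi-file reloading enabled."""
    return ["tensorboard", "--logdir", str(logdir), "--reload_multifile=true"]


def launch_tensorboard(logdir):
    """Start TensorBoard as a child process (requires tensorboard on PATH)."""
    return subprocess.Popen(tensorboard_command(logdir))
```

Calling `launch_tensorboard("checkpoints")` would start the server for a directory named `checkpoints`; substitute the directory your training run writes to.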
Full change log: v2.2.0...v2.2.1