
Global Style Token Module #605

Open · wants to merge 5 commits into base: main
Conversation

roedoejet
Member

@roedoejet roedoejet commented Nov 29, 2024

PR Goal?

Add a Global Style Token Module à la https://arxiv.org/abs/1803.09017
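For readers unfamiliar with the cited paper (Wang et al., 2018), the core idea can be sketched in a few lines: a learned bank of "style tokens" is combined by attention over an embedding of reference audio to produce a single style embedding. The sketch below uses NumPy with illustrative names and shapes only; it is not the PR's actual code.

```python
import numpy as np

# Hedged sketch of the Global Style Token idea: attention over a learned
# token bank, conditioned on a reference embedding. All names/shapes are
# illustrative assumptions, not EveryVoice identifiers.
rng = np.random.default_rng(0)
num_tokens, token_dim, ref_dim = 10, 64, 128

style_tokens = rng.normal(size=(num_tokens, token_dim))  # learned token bank
ref_embedding = rng.normal(size=(ref_dim,))              # from reference audio
W = rng.normal(size=(ref_dim, num_tokens))               # attention projection

scores = ref_embedding @ W                 # one score per token
weights = np.exp(scores - scores.max())
weights /= weights.sum()                   # softmax attention weights
style_embedding = weights @ style_tokens   # weighted sum over the bank
```

At inference time this is why a style reference is needed: without `ref_embedding` there is nothing to attend with.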

Fixes?

#293

Feedback sought?

Sanity check: training and synthesis with GST turned on.

Priority?

medium

Tests added?

The changes are all model code, so no tests were added, but I still need to update existing tests since the commands and schemas have changed.

How to test?

  1. Train a new model and try synthesizing with a style reference.

Confidence?

Modelling - medium
I'm medium-confident on the modelling side: it has successfully trained models and seems to produce better models when training with noisy data. That said, I'm not enabling it by default because of the extra complications at inference time (you have to provide reference audio).

Versioning - low
Here's an example of a new item in a config. Should the versioning logic automatically add use_global_style_token_module=False when Version==1.0 and that key is not found? Should I also bump the config version here?

Version change?

minor version bump for FastSpeech2 and change to schemas

Related PRs?

EveryVoiceTTS/FastSpeech2_lightning#100


@joanise
Member

joanise commented Dec 9, 2024

For config file versioning:

  • yes, you need a minor bump; touching the schemas automatically requires a minor bump
  • yes, you should write logic so that when we load a model written with an older version, the new parameter is instantiated with the value that keeps the old semantics, i.e. with False in this case. Samuel left hooks in place just for that when we load models.

@joanise
Member

joanise commented Dec 9, 2024

  • yes, you should write logic so that when we load a model written with an older version, the new parameter is instantiated with the value that keeps the old semantics, i.e. with False in this case. Samuel left hooks in place just for that when we load models

Oh, since you made the field optional and defaulting to False, this may already all happen automatically for you. The test would be to load a model written with 0.2 and write it back with 0.3; if it works as expected, it's all good.
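The migration being discussed can be sketched in plain Python; this is a hypothetical illustration (the function name and dict-based config are assumptions, not EveryVoice's actual hooks): on load, an older config gets the new key with the value that preserves the old semantics.

```python
# Hypothetical migration sketch, not the EveryVoice API: when loading a
# config written before the GST feature existed, fill in the new key with
# the value that keeps the old behaviour (False), then stamp the version.
def upgrade_config(cfg: dict) -> dict:
    upgraded = dict(cfg)
    if upgraded.get("version", "0.2.0") < "0.3.0":
        # Only add the key if the old config didn't have it.
        upgraded.setdefault("use_global_style_token_module", False)
        upgraded["version"] = "0.3.0"
    return upgraded

old = {"version": "0.2.0"}
new = upgrade_config(old)  # gains use_global_style_token_module=False
```

A real implementation would compare parsed version objects rather than strings, but the shape of the check is the same.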

@joanise
Member

joanise commented Dec 9, 2024

  • yes, you need a minor bump; touching the schemas automatically requires a minor bump

And let's bump to 0.3.0, without the a element, as discussed at the meeting today.

@marctessier
Collaborator

When I run all tests, I get the errors below (see the attached log file):

======================================================================
ERROR: test_filelist_language (model.feature_prediction.FastSpeech2_lightning.fs2.tests.test_cli.PrepareSynthesizeDataTest)
Use a different language than the one provided in the filelist.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/gpfs/fs5/nrc/nrc-fs1/ict/others/u/tes001/TxT2SPEECH/EveryVoice_pr605_gst/everyvoice/model/feature_prediction/FastSpeech2_lightning/fs2/tests/test_cli.py", line 117, in test_filelist_language
    data = prepare_synthesize_data(
TypeError: prepare_data() missing 1 required positional argument: 'style_reference'

======================================================================
ERROR: test_filelist_speaker (model.feature_prediction.FastSpeech2_lightning.fs2.tests.test_cli.PrepareSynthesizeDataTest)
Use a different speaker than the one provided in the filelist.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/gpfs/fs5/nrc/nrc-fs1/ict/others/u/tes001/TxT2SPEECH/EveryVoice_pr605_gst/everyvoice/model/feature_prediction/FastSpeech2_lightning/fs2/tests/test_cli.py", line 140, in test_filelist_speaker
    data = prepare_synthesize_data(
TypeError: prepare_data() missing 1 required positional argument: 'style_reference'

======================================================================
ERROR: test_plain_filelist (model.feature_prediction.FastSpeech2_lightning.fs2.tests.test_cli.PrepareSynthesizeDataTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/gpfs/fs5/nrc/nrc-fs1/ict/others/u/tes001/TxT2SPEECH/EveryVoice_pr605_gst/everyvoice/model/feature_prediction/FastSpeech2_lightning/fs2/tests/test_cli.py", line 166, in test_plain_filelist
    data = prepare_synthesize_data(
TypeError: prepare_data() missing 1 required positional argument: 'style_reference'

----------------------------------------------------------------------
Ran 213 tests in 81.990s

FAILED (errors=3)
============ Finished job 3365462 on Wed 11 Dec 2024 02:59:24 PM EST with rc=1

TEST-all.e3365462.txt
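All three tracebacks point at the same API change: `prepare_data()` gained a required positional argument, `style_reference`. One hedged way to keep the pre-GST call sites (and tests) working is to give the new parameter a default; the sketch below mirrors the names in the traceback but is illustrative, not the PR's actual signature.

```python
# Hypothetical illustration, not the PR's actual code: making the new
# argument optional means old call sites without a style reference keep
# working instead of raising TypeError.
from typing import Optional

def prepare_data(filelist: list, style_reference: Optional[str] = None) -> dict:
    # With no reference audio, behave as before the GST change.
    return {"filelist": filelist, "style_reference": style_reference}

# Old call site (no style reference) no longer raises TypeError:
data = prepare_data(["utt1.wav", "utt2.wav"])
```

The alternative is to update every existing call site and test to pass the new argument explicitly, which the PR author already flags as pending test updates.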

@marctessier
Collaborator

I tried to train a feature prediction (FP) model; it starts, but crashes before finishing the first epoch when run as-is with use_global_style_token_module: true

RuntimeError: It looks like your LightningModule has parameters that were not 
used in producing the loss returned by training_step. If this is intentional, 
you must enable the detection of unused parameters in DDP, either by setting the
string value `strategy='ddp_find_unused_parameters_true'` or by setting the flag
in the strategy with `strategy=DDPStrategy(find_unused_parameters=True)`.
Loading EveryVoice modules: 100%|██████████| 4/4 [00:10<00:00,  2.73s/it]   
srun: error: ib14gpu-002: task 0: Exited with exit code 1

gst.e3365525.txt
gst.o3365525.txt
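The error message itself names the workaround: some of the GST module's parameters presumably don't contribute to the loss in every `training_step`, which trips DDP's unused-parameter check. Assuming the trainer is a standard PyTorch Lightning `Trainer` (a sketch of the Lightning API, not the EveryVoice config surface):

```python
# Sketch only: enable unused-parameter detection in DDP, as the error
# message suggests. This adds some overhead, so it should probably be set
# only when the GST module is enabled.
from lightning.pytorch import Trainer
from lightning.pytorch.strategies import DDPStrategy

trainer = Trainer(strategy=DDPStrategy(find_unused_parameters=True))
# or equivalently: Trainer(strategy="ddp_find_unused_parameters_true")
```

A cleaner long-term fix would be to ensure every registered parameter participates in the loss, so the default DDP strategy keeps working.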

@marctessier
Collaborator

If I use use_global_style_token_module: false, training works.
