I would like to ask the following two questions and hope you can help.
I'm fine-tuning SpeechTokenizer on my own dataset, which I cut into 3-second wav files. With a batch size of 10, training consumes a whopping 63.72 GB of GPU memory. Is that reasonable, or could something be wrong elsewhere in my settings?
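For context on the memory question, here is a rough back-of-envelope check (a sketch only; the 16 kHz sample rate and float32 dtype are assumptions, not taken from the issue). It shows that the raw waveform batch itself is tiny, so a 63.72 GB footprint would have to come from activations, the discriminators, and optimizer state rather than the input data:

```python
# Hypothetical back-of-envelope estimate of the raw input batch size.
# Assumptions (not stated in the original issue): 16 kHz mono audio, float32.
sample_rate = 16_000      # assumed sampling rate (Hz)
clip_seconds = 3          # clip length from the issue
batch_size = 10           # batch size from the issue
bytes_per_sample = 4      # float32

input_bytes = sample_rate * clip_seconds * batch_size * bytes_per_sample
print(f"raw input batch: {input_bytes / 1e6:.2f} MB")  # ~1.92 MB
```

Since the inputs are on the order of megabytes, if 63.72 GB is unexpected it is worth checking things like segment length actually fed to the model, discriminator count, and whether mixed precision or gradient checkpointing is enabled.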
Also, roughly what values do the losses (Gen Loss, Mel Error, Q Loss, Distill Loss) converge to after training? I'd like to make a preliminary judgement of how well my fine-tuning turned out by comparing against your loss values.
Here is my fine-tuning log so far:
Epoch 0 -- Step 48410: Gen Loss: 192.288; Mel Error:0.343; Q Loss: 5.999; Distill Loss: 0.643; Time cost per step: 3.779s
Epoch 0 -- Step 48420: Gen Loss: 200.187; Mel Error:0.329; Q Loss: 6.648; Distill Loss: 0.627; Time cost per step: 3.761s
Epoch 0 -- Step 48430: Gen Loss: 206.039; Mel Error:0.341; Q Loss: 6.689; Distill Loss: 0.616; Time cost per step: 3.722s
Epoch 0 -- Step 48440: Gen Loss: 178.676; Mel Error:0.358; Q Loss: 5.539; Distill Loss: 0.615; Time cost per step: 3.758s
Epoch 0 -- Step 48450: Gen Loss: 188.434; Mel Error:0.327; Q Loss: 5.698; Distill Loss: 0.625; Time cost per step: 3.734s
Epoch 0 -- Step 48460: Gen Loss: 185.933; Mel Error:0.348; Q Loss: 5.768; Distill Loss: 0.620; Time cost per step: 3.711s
Epoch 0 -- Step 48470: Gen Loss: 196.693; Mel Error:0.344; Q Loss: 6.094; Distill Loss: 0.621; Time cost per step: 3.733s
Epoch 0 -- Step 48480: Gen Loss: 206.974; Mel Error:0.323; Q Loss: 7.111; Distill Loss: 0.607; Time cost per step: 3.739s
Epoch 0 -- Step 48490: Gen Loss: 201.758; Mel Error:0.370; Q Loss: 6.769; Distill Loss: 0.625; Time cost per step: 3.692s
Thanks!