I would like to ask the following two questions and hope you can help.
I'm fine-tuning SpeechTokenizer on my own dataset, which I cut into 3-second wav files. With a batch size of 10, training consumes a whopping 63.72 GB of GPU memory. Is that reasonable, or could something be wrong elsewhere in my settings?
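For context on the memory question, here is a rough back-of-envelope check (a sketch only; the 16 kHz sample rate and float32 dtype are assumptions, not taken from the issue). It shows that the raw waveform batch itself is tiny, so a 63.72 GB footprint would have to come from activations, the discriminators, and optimizer state rather than the input data:

```python
# Hypothetical back-of-envelope estimate of the raw input batch size.
# Assumptions (not stated in the original issue): 16 kHz mono audio, float32.
sample_rate = 16_000      # assumed sampling rate (Hz)
clip_seconds = 3          # clip length from the issue
batch_size = 10           # batch size from the issue
bytes_per_sample = 4      # float32

input_bytes = sample_rate * clip_seconds * batch_size * bytes_per_sample
print(f"raw input batch: {input_bytes / 1e6:.2f} MB")  # ~1.92 MB
```

Since the inputs are on the order of megabytes, if 63.72 GB is unexpected it is worth checking things like segment length actually fed to the model, discriminator count, and whether mixed precision or gradient checkpointing is enabled.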
Also, roughly what values do the losses (Gen Loss, Mel Error, Q Loss, Distill Loss) converge to after training? I'd like to make a preliminary judgement of how well my fine-tuning turned out by comparing against your loss values.
Here is my fine-tuning log so far:
Epoch 0 -- Step 48410: Gen Loss: 192.288; Mel Error:0.343; Q Loss: 5.999; Distill Loss: 0.643; Time cost per step: 3.779s
Epoch 0 -- Step 48420: Gen Loss: 200.187; Mel Error:0.329; Q Loss: 6.648; Distill Loss: 0.627; Time cost per step: 3.761s
Epoch 0 -- Step 48430: Gen Loss: 206.039; Mel Error:0.341; Q Loss: 6.689; Distill Loss: 0.616; Time cost per step: 3.722s
Epoch 0 -- Step 48440: Gen Loss: 178.676; Mel Error:0.358; Q Loss: 5.539; Distill Loss: 0.615; Time cost per step: 3.758s
Epoch 0 -- Step 48450: Gen Loss: 188.434; Mel Error:0.327; Q Loss: 5.698; Distill Loss: 0.625; Time cost per step: 3.734s
Epoch 0 -- Step 48460: Gen Loss: 185.933; Mel Error:0.348; Q Loss: 5.768; Distill Loss: 0.620; Time cost per step: 3.711s
Epoch 0 -- Step 48470: Gen Loss: 196.693; Mel Error:0.344; Q Loss: 6.094; Distill Loss: 0.621; Time cost per step: 3.733s
Epoch 0 -- Step 48480: Gen Loss: 206.974; Mel Error:0.323; Q Loss: 7.111; Distill Loss: 0.607; Time cost per step: 3.739s
Epoch 0 -- Step 48490: Gen Loss: 201.758; Mel Error:0.370; Q Loss: 6.769; Distill Loss: 0.625; Time cost per step: 3.692s
Thanks!