
10 batchsize = 63.72G??? (all files are 3s wavs) #13

Open
coding-sharks opened this issue Jul 16, 2024 · 1 comment

@coding-sharks

I have two questions and would appreciate your help:

  1. I'm fine-tuning SpeechTokenizer on my own dataset, which I cut into 3s wav files. With a batch size of 10, training uses 63.72 GB of GPU memory. Is that reasonable, or could something in my other settings be wrong?

  2. Roughly what values did the losses (Gen Loss, Mel Error, Q Loss, Distill Loss) converge to in your training? I'd like to compare against them to get a preliminary sense of how well my fine-tuning turned out.

Here are my current fine-tuning logs:

Epoch 0 -- Step 48410: Gen Loss: 192.288; Mel Error:0.343; Q Loss: 5.999; Distill Loss: 0.643; Time cost per step: 3.779s
Epoch 0 -- Step 48420: Gen Loss: 200.187; Mel Error:0.329; Q Loss: 6.648; Distill Loss: 0.627; Time cost per step: 3.761s
Epoch 0 -- Step 48430: Gen Loss: 206.039; Mel Error:0.341; Q Loss: 6.689; Distill Loss: 0.616; Time cost per step: 3.722s
Epoch 0 -- Step 48440: Gen Loss: 178.676; Mel Error:0.358; Q Loss: 5.539; Distill Loss: 0.615; Time cost per step: 3.758s
Epoch 0 -- Step 48450: Gen Loss: 188.434; Mel Error:0.327; Q Loss: 5.698; Distill Loss: 0.625; Time cost per step: 3.734s
Epoch 0 -- Step 48460: Gen Loss: 185.933; Mel Error:0.348; Q Loss: 5.768; Distill Loss: 0.620; Time cost per step: 3.711s
Epoch 0 -- Step 48470: Gen Loss: 196.693; Mel Error:0.344; Q Loss: 6.094; Distill Loss: 0.621; Time cost per step: 3.733s
Epoch 0 -- Step 48480: Gen Loss: 206.974; Mel Error:0.323; Q Loss: 7.111; Distill Loss: 0.607; Time cost per step: 3.739s
Epoch 0 -- Step 48490: Gen Loss: 201.758; Mel Error:0.370; Q Loss: 6.769; Distill Loss: 0.625; Time cost per step: 3.692s

Thanks!

gyt1145028706 commented Oct 17, 2024

To reduce memory usage, you can modify the code at this line to:

dists = -(samples.pow(2).sum(1, keepdim=True) - 2 * samples @ means.t() + means.t().pow(2).sum(0, keepdim=True))

This should reduce memory consumption while maintaining the same functionality.
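For context, here is a minimal sketch of why the expanded form helps. It assumes the original line computes pairwise distances by broadcasting samples against means (materializing an (N, C, D) intermediate); the shapes and variable names below are made up purely for illustration. The expanded form ||x - m||^2 = ||x||^2 - 2 x·m + ||m||^2 only ever materializes an (N, C) matrix, and the nearest-centroid assignment is unchanged.

import torch

# Hypothetical shapes, for illustration only: N frames, C codebook entries, D dims.
N, C, D = 4096, 1024, 1024
samples = torch.randn(N, D)   # flattened encoder outputs
means = torch.randn(C, D)     # codebook / k-means centroids

# Broadcast form (assumed original): builds an (N, C, D) tensor before reducing
# over D -- with these shapes that is ~4.3e9 fp32 values, roughly 16 GB.
dists_broadcast = -(samples.unsqueeze(1) - means.unsqueeze(0)).pow(2).sum(-1)

# Expanded form (the suggested change): the largest intermediate is only (N, C).
dists_expanded = -(samples.pow(2).sum(1, keepdim=True)
                   - 2 * samples @ means.t()
                   + means.t().pow(2).sum(0, keepdim=True))

# Same distances up to floating-point error, so the argmax (nearest centroid) matches.
assert torch.allclose(dists_broadcast, dists_expanded, atol=1e-2, rtol=1e-4)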
