While training, the log is full of warnings and errors; the weights are not loaded #671
Comments
I have the same error. Env: Error:
I also got the same issue; I don't know why the weights are not loaded.
Self Checks
Cloud or Self Hosted
Cloud
Environment Details
Linux VM
Steps to Reproduce
While training
✔️ Expected Behavior
Training runs without errors or warnings and loads the correct pretrained weights. I have downloaded the pretrained weights and the path is correct.
❌ Actual Behavior
2024-11-08 15:55:29,094][fish_speech.models.text2semantic.llama][INFO] - [rank: 0] Loaded weights with error: _IncompatibleKeys(missing_keys=['embeddings.lora_A', 'embeddings.lora_B', 'codebook_embeddings.lora_A', 'codebook_embeddings.lora_B', 'layers.0.attention.wqkv.lora_A', 'layers.0.attention.wqkv.lora_B', 'layers.0.attention.wo.lora_A', 'layers.0.attention.wo.lora_B', 'layers.0.feed_forward.w1.lora_A', 'layers.0.feed_forward.w1.lora_B', 'layers.0.feed_forward.w3.lora_A', 'layers.0.feed_forward.w3.lora_B', 'layers.0.feed_forward.w2.lora_A', 'layers.0.feed_forward.w2.lora_B', 'layers.1.attention.wqkv.lora_A', 'layers.1.attention.wqkv.lora_B', 'layers.1.attention.wo.lora_A', 'layers.1.attention.wo.lora_B', 'layers.1.feed_forward.w1.lora_A', 'layers.1.feed_forward.w1.lora_B', 'layers.1.feed_forward.w3.lora_A', 'layers.1.feed_forward.w3.lora_B', 'layers.1.feed_forward.w2.lora_A', 'layers.1.feed_forward.w2.lora_B', 'layers.2.attention.wqkv.lora_A', 'layers.2.attention.wqkv.lora_B', 'layers.2.attention.wo.lora_A', 'layers.2.attention.wo.lora_B', 'layers.2.feed_forward.w1.lora_A', 'layers.2.feed_forward.w1.lora_B', 'layers.2.feed_forward.w3.lora_A', 'layers.2.feed_forward.w3.lora_B', 'layers.2.feed_forward.w2.lora_A', 'layers.2.feed_forward.w2.lora_B', 'layers.3.attention.wqkv.lora_A', 'layers.3.attention.wqkv.lora_B', 'layers.3.attention.wo.lora_A', 'layers.3.attention.wo.lora_B', 'layers.3.feed_forward.w1.lora_A', 'layers.3.feed_forward.w1.lora_B', 'layers.3.feed_forward.w3.lora_A', 'layers.3.feed_forward.w3.lora_B', 'layers.3.feed_forward.w2.lora_A', 'layers.3.feed_forward.w2.lora_B', 'layers.4.attention.wqkv.lora_A', 'layers.4.attention.wqkv.lora_B', 'layers.4.attention.wo.lora_A', 'layers.4.attention.wo.lora_B', 'layers.4.feed_forward.w1.lora_A', 'layers.4.feed_forward.w1.lora_B', 'layers.4.feed_forward.w3.lora_A', 'layers.4.feed_forward.w3.lora_B', 'layers.4.feed_forward.w2.lora_A', 'layers.4.feed_forward.w2.lora_B', 'layers.5.attention.wqkv.lora_A', 
'layers.5.attention.wqkv.lora_B', 'layers.5.attention.wo.lora_A', 'layers.5.attention.wo.lora_B', 'layers.5.feed_forward.w1.lora_A', 'layers.5.feed_forward.w1.lora_B', 'layers.5.feed_forward.w3.lora_A', 'layers.5.feed_forward.w3.lora_B', 'layers.5.feed_forward.w2.lora_A', 'layers.5.feed_forward.w2.lora_B', 'layers.6.attention.wqkv.lora_A', 'layers.6.attention.wqkv.lora_B', 'layers.6.attention.wo.lora_A', 'layers.6.attention.wo.lora_B', 'layers.6.feed_forward.w1.lora_A', 'layers.6.feed_forward.w1.lora_B', 'layers.6.feed_forward.w3.lora_A', 'layers.6.feed_forward.w3.lora_B', 'layers.6.feed_forward.w2.lora_A', 'layers.6.feed_forward.w2.lora_B', 'layers.7.attention.wqkv.lora_A', 'layers.7.attention.wqkv.lora_B', 'layers.7.attention.wo.lora_A', 'layers.7.attention.wo.lora_B', 'layers.7.feed_forward.w1.lora_A', 'layers.7.feed_forward.w1.lora_B', 'layers.7.feed_forward.w3.lora_A', 'layers.7.feed_forward.w3.lora_B', 'layers.7.feed_forward.w2.lora_A', 'layers.7.feed_forward.w2.lora_B', 'layers.8.attention.wqkv.lora_A', 'layers.8.attention.wqkv.lora_B', 'layers.8.attention.wo.lora_A', 'layers.8.attention.wo.lora_B', 'layers.8.feed_forward.w1.lora_A', 'layers.8.feed_forward.w1.lora_B', 'layers.8.feed_forward.w3.lora_A', 'layers.8.feed_forward.w3.lora_B', 'layers.8.feed_forward.w2.lora_A', 'layers.8.feed_forward.w2.lora_B', 'layers.9.attention.wqkv.lora_A', 'layers.9.attention.wqkv.lora_B', 'layers.9.attention.wo.lora_A', 'layers.9.attention.wo.lora_B', 'layers.9.feed_forward.w1.lora_A', 'layers.9.feed_forward.w1.lora_B', 'layers.9.feed_forward.w3.lora_A', 'layers.9.feed_forward.w3.lora_B', 'layers.9.feed_forward.w2.lora_A', 'layers.9.feed_forward.w2.lora_B', 'layers.10.attention.wqkv.lora_A', 'layers.10.attention.wqkv.lora_B', 'layers.10.attention.wo.lora_A', 'layers.10.attention.wo.lora_B', 'layers.10.feed_forward.w1.lora_A', 'layers.10.feed_forward.w1.lora_B', 'layers.10.feed_forward.w3.lora_A', 'layers.10.feed_forward.w3.lora_B', 'layers.10.feed_forward.w2.lora_A', 
'layers.10.feed_forward.w2.lora_B', 'layers.11.attention.wqkv.lora_A', 'layers.11.attention.wqkv.lora_B', 'layers.11.attention.wo.lora_A', 'layers.11.attention.wo.lora_B', 'layers.11.feed_forward.w1.lora_A', 'layers.11.feed_forward.w1.lora_B', 'layers.11.feed_forward.w3.lora_A', 'layers.11.feed_forward.w3.lora_B', 'layers.11.feed_forward.w2.lora_A', 'layers.11.feed_forward.w2.lora_B', 'layers.12.attention.wqkv.lora_A', 'layers.12.attention.wqkv.lora_B', 'layers.12.attention.wo.lora_A', 'layers.12.attention.wo.lora_B', 'layers.12.feed_forward.w1.lora_A', 'layers.12.feed_forward.w1.lora_B', 'layers.12.feed_forward.w3.lora_A', 'layers.12.feed_forward.w3.lora_B', 'layers.12.feed_forward.w2.lora_A', 'layers.12.feed_forward.w2.lora_B', 'layers.13.attention.wqkv.lora_A', 'layers.13.attention.wqkv.lora_B', 'layers.13.attention.wo.lora_A', 'layers.13.attention.wo.lora_B', 'layers.13.feed_forward.w1.lora_A', 'layers.13.feed_forward.w1.lora_B', 'layers.13.feed_forward.w3.lora_A', 'layers.13.feed_forward.w3.lora_B', 'layers.13.feed_forward.w2.lora_A', 'layers.13.feed_forward.w2.lora_B', 'layers.14.attention.wqkv.lora_A', 'layers.14.attention.wqkv.lora_B', 'layers.14.attention.wo.lora_A', 'layers.14.attention.wo.lora_B', 'layers.14.feed_forward.w1.lora_A', 'layers.14.feed_forward.w1.lora_B', 'layers.14.feed_forward.w3.lora_A', 'layers.14.feed_forward.w3.lora_B', 'layers.14.feed_forward.w2.lora_A', 'layers.14.feed_forward.w2.lora_B', 'layers.15.attention.wqkv.lora_A', 'layers.15.attention.wqkv.lora_B', 'layers.15.attention.wo.lora_A', 'layers.15.attention.wo.lora_B', 'layers.15.feed_forward.w1.lora_A', 'layers.15.feed_forward.w1.lora_B', 'layers.15.feed_forward.w3.lora_A', 'layers.15.feed_forward.w3.lora_B', 'layers.15.feed_forward.w2.lora_A', 'layers.15.feed_forward.w2.lora_B', 'layers.16.attention.wqkv.lora_A', 'layers.16.attention.wqkv.lora_B', 'layers.16.attention.wo.lora_A', 'layers.16.attention.wo.lora_B', 'layers.16.feed_forward.w1.lora_A', 
'layers.16.feed_forward.w1.lora_B', 'layers.16.feed_forward.w3.lora_A', 'layers.16.feed_forward.w3.lora_B', 'layers.16.feed_forward.w2.lora_A', 'layers.16.feed_forward.w2.lora_B', 'layers.17.attention.wqkv.lora_A', 'layers.17.attention.wqkv.lora_B', 'layers.17.attention.wo.lora_A', 'layers.17.attention.wo.lora_B', 'layers.17.feed_forward.w1.lora_A', 'layers.17.feed_forward.w1.lora_B', 'layers.17.feed_forward.w3.lora_A', 'layers.17.feed_forward.w3.lora_B', 'layers.17.feed_forward.w2.lora_A', 'layers.17.feed_forward.w2.lora_B', 'layers.18.attention.wqkv.lora_A', 'layers.18.attention.wqkv.lora_B', 'layers.18.attention.wo.lora_A', 'layers.18.attention.wo.lora_B', 'layers.18.feed_forward.w1.lora_A', 'layers.18.feed_forward.w1.lora_B', 'layers.18.feed_forward.w3.lora_A', 'layers.18.feed_forward.w3.lora_B', 'layers.18.feed_forward.w2.lora_A', 'layers.18.feed_forward.w2.lora_B', 'layers.19.attention.wqkv.lora_A', 'layers.19.attention.wqkv.lora_B', 'layers.19.attention.wo.lora_A', 'layers.19.attention.wo.lora_B', 'layers.19.feed_forward.w1.lora_A', 'layers.19.feed_forward.w1.lora_B', 'layers.19.feed_forward.w3.lora_A', 'layers.19.feed_forward.w3.lora_B', 'layers.19.feed_forward.w2.lora_A', 'layers.19.feed_forward.w2.lora_B', 'layers.20.attention.wqkv.lora_A', 'layers.20.attention.wqkv.lora_B', 'layers.20.attention.wo.lora_A', 'layers.20.attention.wo.lora_B', 'layers.20.feed_forward.w1.lora_A', 'layers.20.feed_forward.w1.lora_B', 'layers.20.feed_forward.w3.lora_A', 'layers.20.feed_forward.w3.lora_B', 'layers.20.feed_forward.w2.lora_A', 'layers.20.feed_forward.w2.lora_B', 'layers.21.attention.wqkv.lora_A', 'layers.21.attention.wqkv.lora_B', 'layers.21.attention.wo.lora_A', 'layers.21.attention.wo.lora_B', 'layers.21.feed_forward.w1.lora_A', 'layers.21.feed_forward.w1.lora_B', 'layers.21.feed_forward.w3.lora_A', 'layers.21.feed_forward.w3.lora_B', 'layers.21.feed_forward.w2.lora_A', 'layers.21.feed_forward.w2.lora_B', 'layers.22.attention.wqkv.lora_A', 
'layers.22.attention.wqkv.lora_B', 'layers.22.attention.wo.lora_A', 'layers.22.attention.wo.lora_B', 'layers.22.feed_forward.w1.lora_A', 'layers.22.feed_forward.w1.lora_B', 'layers.22.feed_forward.w3.lora_A', 'layers.22.feed_forward.w3.lora_B', 'layers.22.feed_forward.w2.lora_A', 'layers.22.feed_forward.w2.lora_B', 'layers.23.attention.wqkv.lora_A', 'layers.23.attention.wqkv.lora_B', 'layers.23.attention.wo.lora_A', 'layers.23.attention.wo.lora_B', 'layers.23.feed_forward.w1.lora_A', 'layers.23.feed_forward.w1.lora_B', 'layers.23.feed_forward.w3.lora_A', 'layers.23.feed_forward.w3.lora_B', 'layers.23.feed_forward.w2.lora_A', 'layers.23.feed_forward.w2.lora_B', 'output.lora_A', 'output.lora_B', 'fast_embeddings.lora_A', 'fast_embeddings.lora_B', 'fast_layers.0.attention.wqkv.lora_A', 'fast_layers.0.attention.wqkv.lora_B', 'fast_layers.0.attention.wo.lora_A', 'fast_layers.0.attention.wo.lora_B', 'fast_layers.0.feed_forward.w1.lora_A', 'fast_layers.0.feed_forward.w1.lora_B', 'fast_layers.0.feed_forward.w3.lora_A', 'fast_layers.0.feed_forward.w3.lora_B', 'fast_layers.0.feed_forward.w2.lora_A', 'fast_layers.0.feed_forward.w2.lora_B', 'fast_layers.1.attention.wqkv.lora_A', 'fast_layers.1.attention.wqkv.lora_B', 'fast_layers.1.attention.wo.lora_A', 'fast_layers.1.attention.wo.lora_B', 'fast_layers.1.feed_forward.w1.lora_A', 'fast_layers.1.feed_forward.w1.lora_B', 'fast_layers.1.feed_forward.w3.lora_A', 'fast_layers.1.feed_forward.w3.lora_B', 'fast_layers.1.feed_forward.w2.lora_A', 'fast_layers.1.feed_forward.w2.lora_B', 'fast_layers.2.attention.wqkv.lora_A', 'fast_layers.2.attention.wqkv.lora_B', 'fast_layers.2.attention.wo.lora_A', 'fast_layers.2.attention.wo.lora_B', 'fast_layers.2.feed_forward.w1.lora_A', 'fast_layers.2.feed_forward.w1.lora_B', 'fast_layers.2.feed_forward.w3.lora_A', 'fast_layers.2.feed_forward.w3.lora_B', 'fast_layers.2.feed_forward.w2.lora_A', 'fast_layers.2.feed_forward.w2.lora_B', 'fast_layers.3.attention.wqkv.lora_A', 
'fast_layers.3.attention.wqkv.lora_B', 'fast_layers.3.attention.wo.lora_A', 'fast_layers.3.attention.wo.lora_B', 'fast_layers.3.feed_forward.w1.lora_A', 'fast_layers.3.feed_forward.w1.lora_B', 'fast_layers.3.feed_forward.w3.lora_A', 'fast_layers.3.feed_forward.w3.lora_B', 'fast_layers.3.feed_forward.w2.lora_A', 'fast_layers.3.feed_forward.w2.lora_B', 'fast_output.lora_A', 'fast_output.lora_B'], unexpected_keys=[])
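A note on reading the message above: every entry under missing_keys ends in `.lora_A` or `.lora_B`, and unexpected_keys is empty. That pattern is consistent with the base weights having been matched while only the LoRA adapter tensors were absent from the checkpoint, but this is an interpretation and not something the log or the maintainers confirm here. A minimal sketch for checking the pattern programmatically is below; `split_missing_keys` and `LORA_SUFFIXES` are hypothetical names for illustration and are not part of fish_speech.

```python
# Suffixes taken from the key names that appear in the log above.
LORA_SUFFIXES = (".lora_A", ".lora_B")


def split_missing_keys(missing_keys):
    """Split missing state-dict keys into LoRA adapter keys and all others.

    Hypothetical helper: if the second list is non-empty, some
    non-adapter weights were not found in the checkpoint.
    """
    lora = [key for key in missing_keys if key.endswith(LORA_SUFFIXES)]
    other = [key for key in missing_keys if not key.endswith(LORA_SUFFIXES)]
    return lora, other


# A few keys copied from the report above.
sample = [
    "embeddings.lora_A",
    "layers.0.attention.wqkv.lora_B",
    "fast_output.lora_A",
]
lora_keys, other_keys = split_missing_keys(sample)
print(f"{len(lora_keys)} LoRA keys missing, {len(other_keys)} other keys missing")
```

The same function can be applied to the `missing_keys` field of the object that PyTorch's `load_state_dict(..., strict=False)` returns. If `other_keys` comes back non-empty, base weights really are missing and the checkpoint path or format deserves a closer look.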
[2024-11-08 15:55:29,099][main][INFO] - [rank: 0] Instantiating callbacks...
[2024-11-08 15:55:29,099][fish_speech.utils.instantiators][INFO] - [rank: 0] Instantiating callback <lightning.pytorch.callbacks.ModelCheckpoint>
[2024-11-08 15:55:29,103][fish_speech.utils.instantiators][INFO] - [rank: 0] Instantiating callback <lightning.pytorch.callbacks.ModelSummary>
[2024-11-08 15:55:29,104][fish_speech.utils.instantiators][INFO] - [rank: 0] Instantiating callback <lightning.pytorch.callbacks.LearningRateMonitor>
[2024-11-08 15:55:29,104][fish_speech.utils.instantiators][INFO] - [rank: 0] Instantiating callback <fish_speech.callbacks.GradNormMonitor>
[2024-11-08 15:55:29,109][main][INFO] - [rank: 0] Instantiating loggers...
[2024-11-08 15:55:29,109][fish_speech.utils.instantiators][INFO] - [rank: 0] Instantiating logger <lightning.pytorch.loggers.tensorboard.TensorBoardLogger>
[2024-11-08 15:55:29,113][main][INFO] - [rank: 0] Instantiating trainer <lightning.pytorch.trainer.Trainer>
[2024-11-08 15:55:29,150][pytorch_lightning.utilities.rank_zero][INFO] - Trainer already configured with model summary callbacks: [<class 'lightning.pytorch.callbacks.model_summary.ModelSummary'>]. Skipping setting a default `ModelSummary` callback.
[2024-11-08 15:55:29,195][pytorch_lightning.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-11-08 15:55:29,196][pytorch_lightning.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-11-08 15:55:29,196][pytorch_lightning.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
[2024-11-08 15:55:29,197][main][INFO] - [rank: 0] Logging hyperparameters!
[2024-11-08 15:55:42,725][main][INFO] - [rank: 0] Starting training!
INFO: Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/2
[2024-11-08 15:55:42,798][lightning.fabric.utilities.distributed][INFO] - Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/2
[2024-11-08 15:55:48,877][numexpr.utils][INFO] - NumExpr defaulting to 4 threads.
[2024-11-08 15:55:49,063][datasets][INFO] - PyTorch version 2.4.1 available.
[2024-11-08 15:55:49,065][datasets][INFO] - Polars version 1.9.0 available.
[2024-11-08 15:55:49,066][datasets][INFO] - TensorFlow version 2.16.1 available.
[2024-11-08 15:55:49,068][datasets][INFO] - JAX version 0.4.26 available.
2024-11-08 15:56:06.023 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for embeddings.lora_A
2024-11-08 15:56:06.023 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for embeddings.lora_B
2024-11-08 15:56:06.023 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for codebook_embeddings.lora_A
2024-11-08 15:56:06.023 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for codebook_embeddings.lora_B
2024-11-08 15:56:06.023 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.0.attention.wqkv.lora_A
2024-11-08 15:56:06.024 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.0.attention.wqkv.lora_B
2024-11-08 15:56:06.024 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.0.attention.wo.lora_A
2024-11-08 15:56:06.024 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.0.attention.wo.lora_B
2024-11-08 15:56:06.024 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.0.feed_forward.w1.lora_A
2024-11-08 15:56:06.024 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.0.feed_forward.w1.lora_B
2024-11-08 15:56:06.024 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.0.feed_forward.w3.lora_A
2024-11-08 15:56:06.024 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.0.feed_forward.w3.lora_B
2024-11-08 15:56:06.024 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.0.feed_forward.w2.lora_A
2024-11-08 15:56:06.024 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.0.feed_forward.w2.lora_B
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.1.attention.wqkv.lora_A
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.1.attention.wqkv.lora_B
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.1.attention.wo.lora_A
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.1.attention.wo.lora_B
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.1.feed_forward.w1.lora_A
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.1.feed_forward.w1.lora_B
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.1.feed_forward.w3.lora_A
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.1.feed_forward.w3.lora_B
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.1.feed_forward.w2.lora_A
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.1.feed_forward.w2.lora_B
2024-11-08 15:56:06.025 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.2.attention.wqkv.lora_A
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.2.attention.wqkv.lora_B
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.2.attention.wo.lora_A
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.2.attention.wo.lora_B
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.2.feed_forward.w1.lora_A
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.2.feed_forward.w1.lora_B
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.2.feed_forward.w3.lora_A
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.2.feed_forward.w3.lora_B
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.2.feed_forward.w2.lora_A
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.2.feed_forward.w2.lora_B
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.3.attention.wqkv.lora_A
2024-11-08 15:56:06.026 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.3.attention.wqkv.lora_B
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.3.attention.wo.lora_A
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.3.attention.wo.lora_B
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.3.feed_forward.w1.lora_A
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.3.feed_forward.w1.lora_B
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.3.feed_forward.w3.lora_A
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.3.feed_forward.w3.lora_B
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.3.feed_forward.w2.lora_A
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.3.feed_forward.w2.lora_B
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.4.attention.wqkv.lora_A
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.4.attention.wqkv.lora_B
2024-11-08 15:56:06.027 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.4.attention.wo.lora_A
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.4.attention.wo.lora_B
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.4.feed_forward.w1.lora_A
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.4.feed_forward.w1.lora_B
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.4.feed_forward.w3.lora_A
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.4.feed_forward.w3.lora_B
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.4.feed_forward.w2.lora_A
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.4.feed_forward.w2.lora_B
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.5.attention.wqkv.lora_A
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.5.attention.wqkv.lora_B
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.5.attention.wo.lora_A
2024-11-08 15:56:06.028 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.5.attention.wo.lora_B
2024-11-08 15:56:06.029 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.5.feed_forward.w1.lora_A
2024-11-08 15:56:06.029 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.5.feed_forward.w1.lora_B
2024-11-08 15:56:06.029 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.5.feed_forward.w3.lora_A
2024-11-08 15:56:06.029 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.5.feed_forward.w3.lora_B
2024-11-08 15:56:06.029 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.5.feed_forward.w2.lora_A
2024-11-08 15:56:06.029 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.5.feed_forward.w2.lora_B
2024-11-08 15:56:06.029 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.6.attention.wqkv.lora_A
2024-11-08 15:56:06.029 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.6.attention.wqkv.lora_B
2024-11-08 15:56:06.029 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.6.attention.wo.lora_A
2024-11-08 15:56:06.029 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.6.attention.wo.lora_B
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.6.feed_forward.w1.lora_A
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.6.feed_forward.w1.lora_B
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.6.feed_forward.w3.lora_A
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.6.feed_forward.w3.lora_B
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.6.feed_forward.w2.lora_A
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.6.feed_forward.w2.lora_B
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.7.attention.wqkv.lora_A
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.7.attention.wqkv.lora_B
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.7.attention.wo.lora_A
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.7.attention.wo.lora_B
2024-11-08 15:56:06.030 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.7.feed_forward.w1.lora_A
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.7.feed_forward.w1.lora_B
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.7.feed_forward.w3.lora_A
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.7.feed_forward.w3.lora_B
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.7.feed_forward.w2.lora_A
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.7.feed_forward.w2.lora_B
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.8.attention.wqkv.lora_A
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.8.attention.wqkv.lora_B
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.8.attention.wo.lora_A
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.8.attention.wo.lora_B
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.8.feed_forward.w1.lora_A
2024-11-08 15:56:06.031 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.8.feed_forward.w1.lora_B
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.8.feed_forward.w3.lora_A
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.8.feed_forward.w3.lora_B
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.8.feed_forward.w2.lora_A
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.8.feed_forward.w2.lora_B
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.9.attention.wqkv.lora_A
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.9.attention.wqkv.lora_B
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.9.attention.wo.lora_A
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.9.attention.wo.lora_B
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.9.feed_forward.w1.lora_A
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.9.feed_forward.w1.lora_B
2024-11-08 15:56:06.032 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.9.feed_forward.w3.lora_A
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.9.feed_forward.w3.lora_B
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.9.feed_forward.w2.lora_A
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.9.feed_forward.w2.lora_B
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.10.attention.wqkv.lora_A
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.10.attention.wqkv.lora_B
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.10.attention.wo.lora_A
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.10.attention.wo.lora_B
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.10.feed_forward.w1.lora_A
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.10.feed_forward.w1.lora_B
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.10.feed_forward.w3.lora_A
2024-11-08 15:56:06.033 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.10.feed_forward.w3.lora_B
2024-11-08 15:56:06.034 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.10.feed_forward.w2.lora_A
2024-11-08 15:56:06.034 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.10.feed_forward.w2.lora_B
2024-11-08 15:56:06.034 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.11.attention.wqkv.lora_A
2024-11-08 15:56:06.034 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.11.attention.wqkv.lora_B
2024-11-08 15:56:06.034 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.11.attention.wo.lora_A
2024-11-08 15:56:06.034 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.11.attention.wo.lora_B
2024-11-08 15:56:06.034 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.11.feed_forward.w1.lora_A
2024-11-08 15:56:06.034 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.11.feed_forward.w1.lora_B
2024-11-08 15:56:06.034 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.11.feed_forward.w3.lora_A
2024-11-08 15:56:06.034 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.11.feed_forward.w3.lora_B
2024-11-08 15:56:06.035 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.11.feed_forward.w2.lora_A
2024-11-08 15:56:06.035 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.11.feed_forward.w2.lora_B
2024-11-08 15:56:06.035 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.12.attention.wqkv.lora_A
2024-11-08 15:56:06.035 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.12.attention.wqkv.lora_B
2024-11-08 15:56:06.035 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.12.attention.wo.lora_A
2024-11-08 15:56:06.035 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.12.attention.wo.lora_B
2024-11-08 15:56:06.035 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.12.feed_forward.w1.lora_A
2024-11-08 15:56:06.035 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.12.feed_forward.w1.lora_B
2024-11-08 15:56:06.035 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.12.feed_forward.w3.lora_A
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.12.feed_forward.w3.lora_B
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.12.feed_forward.w2.lora_A
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.12.feed_forward.w2.lora_B
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.13.attention.wqkv.lora_A
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.13.attention.wqkv.lora_B
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.13.attention.wo.lora_A
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.13.attention.wo.lora_B
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.13.feed_forward.w1.lora_A
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.13.feed_forward.w1.lora_B
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.13.feed_forward.w3.lora_A
2024-11-08 15:56:06.036 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.13.feed_forward.w3.lora_B
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.13.feed_forward.w2.lora_A
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.13.feed_forward.w2.lora_B
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.14.attention.wqkv.lora_A
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.14.attention.wqkv.lora_B
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.14.attention.wo.lora_A
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.14.attention.wo.lora_B
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.14.feed_forward.w1.lora_A
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.14.feed_forward.w1.lora_B
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.14.feed_forward.w3.lora_A
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.14.feed_forward.w3.lora_B
2024-11-08 15:56:06.037 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.14.feed_forward.w2.lora_A
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.14.feed_forward.w2.lora_B
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.15.attention.wqkv.lora_A
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.15.attention.wqkv.lora_B
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.15.attention.wo.lora_A
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.15.attention.wo.lora_B
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.15.feed_forward.w1.lora_A
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.15.feed_forward.w1.lora_B
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.15.feed_forward.w3.lora_A
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.15.feed_forward.w3.lora_B
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.15.feed_forward.w2.lora_A
2024-11-08 15:56:06.038 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.15.feed_forward.w2.lora_B
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.16.attention.wqkv.lora_A
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.16.attention.wqkv.lora_B
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.16.attention.wo.lora_A
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.16.attention.wo.lora_B
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.16.feed_forward.w1.lora_A
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.16.feed_forward.w1.lora_B
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.16.feed_forward.w3.lora_A
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.16.feed_forward.w3.lora_B
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.16.feed_forward.w2.lora_A
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.16.feed_forward.w2.lora_B
2024-11-08 15:56:06.039 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.17.attention.wqkv.lora_A
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.17.attention.wqkv.lora_B
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.17.attention.wo.lora_A
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.17.attention.wo.lora_B
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.17.feed_forward.w1.lora_A
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.17.feed_forward.w1.lora_B
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.17.feed_forward.w3.lora_A
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.17.feed_forward.w3.lora_B
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.17.feed_forward.w2.lora_A
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.17.feed_forward.w2.lora_B
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.18.attention.wqkv.lora_A
2024-11-08 15:56:06.040 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.18.attention.wqkv.lora_B
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.18.attention.wo.lora_A
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.18.attention.wo.lora_B
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.18.feed_forward.w1.lora_A
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.18.feed_forward.w1.lora_B
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.18.feed_forward.w3.lora_A
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.18.feed_forward.w3.lora_B
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.18.feed_forward.w2.lora_A
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.18.feed_forward.w2.lora_B
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.19.attention.wqkv.lora_A
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.19.attention.wqkv.lora_B
2024-11-08 15:56:06.041 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.19.attention.wo.lora_A
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.19.attention.wo.lora_B
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.19.feed_forward.w1.lora_A
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.19.feed_forward.w1.lora_B
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.19.feed_forward.w3.lora_A
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.19.feed_forward.w3.lora_B
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.19.feed_forward.w2.lora_A
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.19.feed_forward.w2.lora_B
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.20.attention.wqkv.lora_A
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.20.attention.wqkv.lora_B
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.20.attention.wo.lora_A
2024-11-08 15:56:06.042 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.20.attention.wo.lora_B
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.20.feed_forward.w1.lora_A
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.20.feed_forward.w1.lora_B
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.20.feed_forward.w3.lora_A
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.20.feed_forward.w3.lora_B
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.20.feed_forward.w2.lora_A
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.20.feed_forward.w2.lora_B
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.21.attention.wqkv.lora_A
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.21.attention.wqkv.lora_B
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.21.attention.wo.lora_A
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.21.attention.wo.lora_B
2024-11-08 15:56:06.043 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.21.feed_forward.w1.lora_A
2024-11-08 15:56:06.044 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.21.feed_forward.w1.lora_B
2024-11-08 15:56:06.044 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.21.feed_forward.w3.lora_A
2024-11-08 15:56:06.044 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.21.feed_forward.w3.lora_B
2024-11-08 15:56:06.044 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.21.feed_forward.w2.lora_A
2024-11-08 15:56:06.044 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.21.feed_forward.w2.lora_B
2024-11-08 15:56:06.044 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.22.attention.wqkv.lora_A
2024-11-08 15:56:06.044 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.22.attention.wqkv.lora_B
2024-11-08 15:56:06.044 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.22.attention.wo.lora_A
2024-11-08 15:56:06.044 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.22.attention.wo.lora_B
2024-11-08 15:56:06.045 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.22.feed_forward.w1.lora_A
2024-11-08 15:56:06.045 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.22.feed_forward.w1.lora_B
2024-11-08 15:56:06.045 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.22.feed_forward.w3.lora_A
2024-11-08 15:56:06.045 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.22.feed_forward.w3.lora_B
2024-11-08 15:56:06.045 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.22.feed_forward.w2.lora_A
2024-11-08 15:56:06.045 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.22.feed_forward.w2.lora_B
2024-11-08 15:56:06.045 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.23.attention.wqkv.lora_A
2024-11-08 15:56:06.045 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.23.attention.wqkv.lora_B
2024-11-08 15:56:06.045 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.23.attention.wo.lora_A
2024-11-08 15:56:06.045 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.23.attention.wo.lora_B
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.23.feed_forward.w1.lora_A
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.23.feed_forward.w1.lora_B
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.23.feed_forward.w3.lora_A
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.23.feed_forward.w3.lora_B
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.23.feed_forward.w2.lora_A
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for layers.23.feed_forward.w2.lora_B
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for output.lora_A
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for output.lora_B
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_embeddings.lora_A
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_embeddings.lora_B
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.0.attention.wqkv.lora_A
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.0.attention.wqkv.lora_B
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.0.attention.wo.lora_A
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.0.attention.wo.lora_B
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.0.feed_forward.w1.lora_A
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.0.feed_forward.w1.lora_B
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.0.feed_forward.w3.lora_A
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.0.feed_forward.w3.lora_B
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.0.feed_forward.w2.lora_A
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.0.feed_forward.w2.lora_B
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.1.attention.wqkv.lora_A
2024-11-08 15:56:06.047 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.1.attention.wqkv.lora_B
2024-11-08 15:56:06.048 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.1.attention.wo.lora_A
2024-11-08 15:56:06.048 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.1.attention.wo.lora_B
2024-11-08 15:56:06.048 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.1.feed_forward.w1.lora_A
2024-11-08 15:56:06.048 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.1.feed_forward.w1.lora_B
2024-11-08 15:56:06.048 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.1.feed_forward.w3.lora_A
2024-11-08 15:56:06.048 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.1.feed_forward.w3.lora_B
2024-11-08 15:56:06.048 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.1.feed_forward.w2.lora_A
2024-11-08 15:56:06.048 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.1.feed_forward.w2.lora_B
2024-11-08 15:56:06.048 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.2.attention.wqkv.lora_A
2024-11-08 15:56:06.049 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.2.attention.wqkv.lora_B
2024-11-08 15:56:06.049 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.2.attention.wo.lora_A
2024-11-08 15:56:06.049 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.2.attention.wo.lora_B
2024-11-08 15:56:06.049 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.2.feed_forward.w1.lora_A
2024-11-08 15:56:06.049 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.2.feed_forward.w1.lora_B
2024-11-08 15:56:06.049 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.2.feed_forward.w3.lora_A
2024-11-08 15:56:06.049 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.2.feed_forward.w3.lora_B
2024-11-08 15:56:06.049 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.2.feed_forward.w2.lora_A
2024-11-08 15:56:06.050 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.2.feed_forward.w2.lora_B
2024-11-08 15:56:06.050 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.3.attention.wqkv.lora_A
2024-11-08 15:56:06.050 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.3.attention.wqkv.lora_B
2024-11-08 15:56:06.050 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.3.attention.wo.lora_A
2024-11-08 15:56:06.050 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.3.attention.wo.lora_B
2024-11-08 15:56:06.050 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.3.feed_forward.w1.lora_A
2024-11-08 15:56:06.050 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.3.feed_forward.w1.lora_B
2024-11-08 15:56:06.051 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.3.feed_forward.w3.lora_A
2024-11-08 15:56:06.051 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.3.feed_forward.w3.lora_B
2024-11-08 15:56:06.051 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.3.feed_forward.w2.lora_A
2024-11-08 15:56:06.051 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_layers.3.feed_forward.w2.lora_B
2024-11-08 15:56:06.051 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_output.lora_A
2024-11-08 15:56:06.051 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for fast_output.lora_B
INFO: Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/2
[2024-11-08 15:56:06,333][lightning.fabric.utilities.distributed][INFO] - Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/2
[W1108 15:56:06.706582522 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
[W1108 15:56:06.715281420 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
[2024-11-08 15:56:06,345][pytorch_lightning.utilities.rank_zero][INFO] - ----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 2 processes
INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
INFO: LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1]
[2024-11-08 15:56:07,062][lightning.pytorch.accelerators.cuda][INFO] - LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1]
[2024-11-08 15:56:07,062][lightning.pytorch.accelerators.cuda][INFO] - LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
[2024-11-08 15:56:17,373][fish_speech.models.text2semantic.lit_module][INFO] - [rank: 0] Set weight decay: 0 for 432 parameters
[2024-11-08 15:56:17,373][fish_speech.models.text2semantic.lit_module][INFO] - [rank: 0] Set weight decay: 0.0 for 61 parameters
INFO:
   | Name                     | Type              | Params | Mode
------------------------------------------------------------------
0  | model                    | DualARTransformer | 499 M  | train
1  | model.embeddings         | Embedding         | 33.0 M | train
2  | model.codebook_embeddings | Embedding        | 8.5 M  | train
3  | model.layers             | ModuleList        | 362 M  | train
4  | model.norm               | RMSNorm           | 1.0 K  | train
5  | model.output             | Linear            | 33.0 M | train
6  | model.fast_project_in    | Identity          | 0      | train
7  | model.fast_embeddings    | Embedding         | 1.1 M  | train
8  | model.fast_layers        | ModuleList        | 60.4 M | train
9  | model.fast_norm          | RMSNorm           | 1.0 K  | train
10 | model.fast_output        | Linear            | 1.1 M  | train
------------------------------------------------------------------
5.1 M     Trainable params
494 M     Non-trainable params
499 M     Total params
1,998.053 Total estimated model params size (MB)
433       Modules in train mode
0         Modules in eval mode
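Every key named in the warnings above ends in `.lora_A` or `.lora_B`, and the summary reports 5.1 M trainable against 494 M non-trainable parameters. That pattern would be consistent with a base checkpoint loading into a LoRA-wrapped model, where the adapter tensors are simply absent from the checkpoint, though the log alone does not prove it. A minimal sketch for checking a saved log, assuming the `No weight for <key>` message format shown above; the helper names `missing_keys_from_log` and `split_missing_keys` are hypothetical and not part of fish_speech:

```python
import re

# Suffixes of the adapter parameters named in the warnings above.
LORA_SUFFIXES = (".lora_A", ".lora_B")

# Matches the "No weight for <key>" message format seen in the log.
WARNING_RE = re.compile(r"No weight for (\S+)")


def missing_keys_from_log(log_text):
    """Return every parameter name reported as missing in the log text."""
    return WARNING_RE.findall(log_text)


def split_missing_keys(missing_keys):
    """Separate LoRA adapter keys from any other missing keys."""
    lora = [k for k in missing_keys if k.endswith(LORA_SUFFIXES)]
    other = [k for k in missing_keys if not k.endswith(LORA_SUFFIXES)]
    return lora, other


# Two lines copied from the log above, used here as sample input.
sample_log = """\
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for output.lora_A
2024-11-08 15:56:06.046 | WARNING | fish_speech.models.text2semantic.llama:from_pretrained:424 - No weight for output.lora_B
"""

lora_keys, other_keys = split_missing_keys(missing_keys_from_log(sample_log))
print(f"LoRA adapter keys missing: {len(lora_keys)}")
print(f"Other keys missing: {len(other_keys)}")
if other_keys:
    print("Non-LoRA weights were not loaded:", other_keys)
```

If `other_keys` comes back non-empty on the full log, some base weights really were not loaded, and that would be the part worth reporting here.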