Qwen model issues & embedding and loss has nan #52

Open
lylcst opened this issue Nov 3, 2023 · 5 comments
Comments

@lylcst

lylcst commented Nov 3, 2023

After one loss.backward() and optimizer.step(), the next forward pass produces inf hidden states from the embedding layer and the loss becomes nan.
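For anyone debugging this, here is a minimal sketch (assuming a standard PyTorch training loop; `model` is whatever nn.Module you train, and `register_nan_checks` is a hypothetical helper name) that raises on the first module whose activations go non-finite, so you can tell whether the embedding weights themselves or a later layer diverges first. Gradient clipping and training in bf16 rather than fp16 are the commonly reported mitigations for Qwen-family models.

```python
import torch

def register_nan_checks(model: torch.nn.Module) -> None:
    """Raise on the first module whose forward output is non-finite."""
    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                raise RuntimeError(f"non-finite activation in module: {name}")
        return hook
    for name, module in model.named_modules():
        module.register_forward_hook(make_hook(name))

# Usual mitigation in the training loop, before optimizer.step():
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```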

@LMXKO

LMXKO commented Nov 28, 2023

+1

@kar9999

kar9999 commented Dec 14, 2023

Is this happening in the SFT stage or the DPO stage? Under the author's framework, the loss also goes nan for me when fine-tuning chatglm3 with SFT.

@lylcst
Author

lylcst commented Dec 14, 2023

Is this happening in the SFT stage or the DPO stage? Under the author's framework, the loss also goes nan for me when fine-tuning chatglm3 with SFT.

dpo
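For reference, one plausible failure mode in the DPO stage is fp16 overflow in the log-ratio term: the sigmoid saturates, log(0) gives -inf, and the nan propagates through the backward pass. A minimal sketch of the DPO loss upcast to float32 (not this repo's exact implementation; the logp arguments are hypothetical per-sequence sums of token log-probabilities):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss computed in float32 to reduce fp16 overflow risk."""
    logits = beta * (
        (policy_chosen_logps.float() - ref_chosen_logps.float())
        - (policy_rejected_logps.float() - ref_rejected_logps.float())
    )
    # logsigmoid keeps intermediates finite where a naive
    # log(sigmoid(x)) chain can underflow to log(0) = -inf.
    return -F.logsigmoid(logits).mean()
```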

@akshayraghavan21

Hi, any update on this? Were you able to fix this issue?

@John-Watson123

I've got this problem too when using the model Qwen2.5-7B.
Python Output:
Computing eval metrics: 100%|██████████| 16/16 [00:20<00:00, 1.26s/it]
Generating samples...: 0%| | 0/1 [00:00<?, ?it/s]
Both max_new_tokens (=2048) and max_length (=512) seem to have been set. max_new_tokens will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)
Error executing job with overrides: []
Traceback (most recent call last):
  File "/direct-preference-optimization/train.py", line 114, in main
    worker_main(0, 1, config, policy, reference_model)
  File "/direct-preference-optimization/train.py", line 44, in worker_main
    trainer.train()
  File "/direct-preference-optimization/trainers.py", line 320, in train
    policy_samples, reference_samples = self.get_batch_samples(local_eval_batch)
  File "/direct-preference-optimization/trainers.py", line 188, in get_batch_samples
    policy_output = self.policy.generate(
  File "/anaconda3/envs/dpo/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/anaconda3/envs/dpo/lib/python3.10/site-packages/transformers/generation/utils.py", line 2024, in generate
    result = self._sample(
  File "~/anaconda3/envs/dpo/lib/python3.10/site-packages/transformers/generation/utils.py", line 3020, in _sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either inf, nan or element < 0
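The multinomial crash is most likely the training nan surfacing at sampling time: once the policy's weights contain non-finite values, the logits do too. As a stopgap, transformers' generate() accepts remove_invalid_values=True, which wires in InfNanRemoveLogitsProcessor to replace inf/nan logits before sampling; note this masks rather than fixes the divergence. A minimal sketch with a placeholder prompt:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B"  # the checkpoint reported above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

inputs = tokenizer("Hello", return_tensors="pt")
# remove_invalid_values=True strips inf/nan from the logits before sampling,
# so torch.multinomial no longer sees a non-finite probability tensor.
output = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=32,
    remove_invalid_values=True,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```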
