Using Mistral 7B with transformers v4.38.1 on MATH dataset, and facing memory leaks #80

Jayant1234 · 2024-05-04T22:09:33Z

With both trainers (Basic and FSDP), I see the same pattern: GPU memory is not being freed. Allocated memory keeps increasing in steps while utilization remains roughly constant.
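
One way to pin down where the steps occur is to log allocated and reserved memory at each training step and flag the steps where allocation jumps. The following is only a minimal sketch, assuming PyTorch with CUDA. `snapshot` and `growth_steps` are hypothetical helper names for illustration, not part of either trainer, and the 1 MiB threshold is an arbitrary assumption.

```python
def snapshot(step):
    """Return allocated/reserved GPU memory in MiB, or None if CUDA is unavailable."""
    try:
        import torch  # assumption: PyTorch is the training backend
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    mib = 1024 ** 2
    return {
        "step": step,
        # Memory currently occupied by live tensors.
        "allocated_mib": torch.cuda.memory_allocated() / mib,
        # Memory held by the caching allocator (live tensors plus cached blocks).
        "reserved_mib": torch.cuda.memory_reserved() / mib,
    }


def growth_steps(allocated_mib, min_jump_mib=1.0):
    """Return indices where allocated memory rose by at least min_jump_mib."""
    return [
        i
        for i in range(1, len(allocated_mib))
        if allocated_mib[i] - allocated_mib[i - 1] >= min_jump_mib
    ]
```

Calling `snapshot(step)` once per training step and passing the collected `allocated_mib` values to `growth_steps` shows which steps the jumps line up with. If `allocated_mib` itself keeps climbing, some tensors are probably still referenced somewhere; if only `reserved_mib` grows, the caching allocator is a more likely cause than a true leak.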

Does anyone have suggestions about what might be going wrong?
