Issues: dvlab-research/LongLoRA
Issues list
I am unable to reproduce the results from the paper for llama-7B-32k-longlora ppl.
#188
opened May 28, 2024 by
masteryqq
When I set per_device_train_batch_size=2, the S2-Attn would not shift as expected
#182
opened Mar 1, 2024 by
linhaojia13
merge_lora_weights_and_save_hf_model.py Error while deserializing header: HeaderTooLarge
#172
opened Jan 23, 2024 by
Spongeorge
For the evaluation results in the paper, is the attention used at inference shifted sparse attention or full attention?
#170
opened Jan 19, 2024 by
zhangxiann
The loss value is too unstable during supervised fine-tuning of the 7b-100k-ft model
#168
opened Jan 18, 2024 by
seanxuu