Pull requests: bigscience-workshop/Megatron-DeepSpeed

Pull requests list

Bump black from 21.4b0 to 24.3.0 (label: dependencies)
#402 opened Mar 20, 2024 by dependabot[bot]
Add xPos embeddings
#370 opened Mar 7, 2023 by janEbert
Fix various small problems
#367 opened Feb 28, 2023 by janEbert
Bloom model training with AML
#365 opened Feb 21, 2023 by savitamittal1
Add UL2 data sampling and pretraining
#358 opened Dec 13, 2022 by janEbert
Add FlashAttention
#357 opened Dec 12, 2022 by NouamaneTazi
Enable ROCm support
#353 opened Oct 7, 2022 by luukkonenr
Add multiple evaluation compat
#336 opened Aug 30, 2022 by Muennighoff
Prefix LM Eval
#313 opened Jul 16, 2022 by Muennighoff
Add Bitfit
#311 opened Jul 10, 2022 by Muennighoff
Tool for CKPT averaging
#310 opened Jul 10, 2022 by Muennighoff
Enable loading ckpt for t0 finetuning
#309 opened Jul 10, 2022 by Muennighoff
BigScience Eval Harness
#291 opened Jun 29, 2022 by Muennighoff
No-ZeRO reshaping
#289 opened Jun 23, 2022 by Muennighoff
WIP: Shared t5 code
#286 opened Jun 21, 2022 by thomasw21
2 of 4 tasks
[WIP] add debug utils
#275 opened Mar 28, 2022 by stas00
Sync layer norm
#271 opened Mar 24, 2022 by thomasw21 Draft
Test different layer norm
#270 opened Mar 24, 2022 by thomasw21 Draft