Commit

Ulysses parallelism
brunomaga committed Sep 18, 2024
1 parent 0496011 commit d3c0429
Showing 6 changed files with 314 additions and 365 deletions.

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion _posts/2023-08-18-GPT-lite-DeepSpeed-sharding.md
@@ -1,6 +1,6 @@
---
layout: post
-title: "Distributed training of a GPT model: sharding, offloading, activation checkpointing, and communication quantization via DeepSpeed"
+title: "Distributed GPT model: sharding, offloading, activation checkpointing, and communication quantization via DeepSpeed"
categories: [machine learning, Transformer, GPT, DeepSpeed]
tags: [machinelearning]
---
2 changes: 1 addition & 1 deletion _posts/2023-08-30-GPT-lite-DeepSpeed-pipeline.md
@@ -1,6 +1,6 @@
---
layout: post
-title: "Distributed training of a GPT model (part 2): pipeline parallelism via DeepSpeed"
+title: "Distributed GPT model (part 2): pipeline parallelism via DeepSpeed"
categories: [machine learning, Transformer, GPT, DeepSpeed]
tags: [machinelearning]
---
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
layout: post
-title: "Distributed training of a GPT model (part 3): Megatron-LM model parallelism from scratch"
+title: "Distributed GPT model (part 3): Megatron-LM model parallelism"
categories: [machine learning, Transformer, GPT, DeepSpeed]
tags: [machinelearning]
---