[Perf] Reduce peak memory usage
Keeping two names here (`gate_up` and `x`) means both tensors are referenced at the same time, so neither can be freed, which increases peak memory. In the memory profile this shows up as more blocks stacked on top of each other.
andoorve authored Nov 14, 2024
1 parent 9d5b4e4 commit 358dd7e
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions vllm/model_executor/models/llama.py
@@ -90,8 +90,8 @@ def __init__(
         self.act_fn = SiluAndMul()

     def forward(self, x):
-        gate_up, _ = self.gate_up_proj(x)
-        x = self.act_fn(gate_up)
+        x, _ = self.gate_up_proj(x)
+        x = self.act_fn(x)
         x, _ = self.down_proj(x)
         return x
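
The effect described in the commit message can be reproduced in plain Python without vLLM or PyTorch. The sketch below is illustrative only: `Buffer`, `proj`, `forward_two_names`, and `forward_one_name` are hypothetical stand-ins, not code from the repository, and it assumes CPython's reference counting, where an object is freed as soon as its last reference goes away. A weak reference is used to check whether the intermediate buffer is still alive after the activation step.

```python
import weakref


class Buffer:
    """Hypothetical stand-in for a large tensor (not a PyTorch or vLLM type)."""

    def __init__(self, label):
        self.label = label


def proj(x, label):
    """Hypothetical stand-in for a layer: allocates and returns a new buffer."""
    return Buffer(label)


def forward_two_names(x, probe):
    # Old pattern: the projection output gets its own name.
    gate_up = proj(x, "gate_up")
    watch = weakref.ref(gate_up)
    x = proj(gate_up, "act")
    # gate_up is still bound, so its buffer coexists with the activation output.
    probe.append(watch() is not None)
    x = proj(x, "down")
    return x


def forward_one_name(x, probe):
    # New pattern: every intermediate result reuses the name x.
    x = proj(x, "gate_up")
    watch = weakref.ref(x)
    # Rebinding x drops the last reference to the gate_up buffer.
    x = proj(x, "act")
    probe.append(watch() is not None)
    x = proj(x, "down")
    return x


two_names, one_name = [], []
forward_two_names(Buffer("input"), two_names)
forward_one_name(Buffer("input"), one_name)
print("intermediate alive with two names:", two_names[0])
print("intermediate alive with one name:", one_name[0])
```

With two names, the intermediate stays alive until the function returns; with one name, it becomes collectable as soon as the activation output replaces it. For real tensors the saving also depends on whether anything else (autograd, a caller, a cache) still holds a reference to the intermediate.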
