derivative of softmax is independent of max
chenyuxyz committed Oct 12, 2024
1 parent f79e05c commit f7b8c41
Showing 1 changed file with 1 addition and 1 deletion.
tinygrad/tensor.py (1 addition, 1 deletion)

@@ -1682,7 +1682,7 @@ def std_mean(self, axis:Optional[Union[int, Sequence[int]]]=None, keepdim=False,

   def _softmax(self, axis, dtype:Optional[DTypeLike]=None):
     x = self.cast(dtype) if dtype is not None else self
-    m = x - x.max(axis=axis, keepdim=True)
+    m = x - x.max(axis=axis, keepdim=True).detach()
     e = m.exp()
     return m, e, e.sum(axis=axis, keepdim=True)
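
Why the change is safe: softmax is shift-invariant, i.e. softmax(x - c) == softmax(x) for any constant c, so the subtracted max contributes nothing to the gradient; detaching it only prunes the max reduction from the backward graph, saving backward work without changing results. Below is a minimal sketch (not part of the commit) that checks the two formulations produce identical gradients, assuming a tinygrad checkout on PYTHONPATH; the helper names are illustrative:

from tinygrad import Tensor

def softmax_detached(x: Tensor, axis=-1) -> Tensor:
  # max is subtracted purely for numerical stability; detach() keeps it out of the graph
  m = x - x.max(axis=axis, keepdim=True).detach()
  e = m.exp()
  return e / e.sum(axis=axis, keepdim=True)

def softmax_attached(x: Tensor, axis=-1) -> Tensor:
  # same values, but the max reduction stays in the backward graph
  m = x - x.max(axis=axis, keepdim=True)
  e = m.exp()
  return e / e.sum(axis=axis, keepdim=True)

x1 = Tensor([[1.0, 2.0, 3.0]], requires_grad=True)
x2 = Tensor([[1.0, 2.0, 3.0]], requires_grad=True)
out1 = softmax_detached(x1)
out2 = softmax_attached(x2)
(out1 * out1).sum().backward()
(out2 * out2).sum().backward()
print(x1.grad.numpy())  # identical to x2.grad: the max path receives zero gradient
print(x2.grad.numpy())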
