Matmul Python Benchmarks #2088
Conversation
benchmarks/python/test_matmul.py (Outdated)

def matmul_fusion(fd: FusionDefinition, dtype: DataType) -> None:
    # Decide contiguity based on layout
    a = fd.define_tensor(
Maybe we should pass a and b here instead as torch.Tensors, then use fd.from_pytorch to define these. The reason is that this definition as stated will not adapt to different layouts, but we will get the right stride order for the operands from fd.from_pytorch.
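A minimal sketch of what that could look like, assuming the nvFuser Python frontend's fd.from_pytorch, fd.ops.matmul, and fd.add_output; the exact fusion body is illustrative, not the PR's actual definition:

```python
import torch
from nvfuser import FusionDefinition


def matmul_fusion(fd: FusionDefinition, inputs: list[torch.Tensor]) -> None:
    # Derive the tensor definitions from the concrete torch operands so the
    # stride order (and hence the layout) is captured automatically.
    a = fd.from_pytorch(inputs[0])
    b = fd.from_pytorch(inputs[1])
    out = fd.ops.matmul(a, b)
    fd.add_output(out)
```

The benchmark would then build the fusion inside a `with FusionDefinition() as fd:` block and run it with `fd.execute(inputs)`, passing the same tensors used to define it.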
That of course assumes we have #2058 fixed first. Until then we probably need to pass in the layout, or better yet, bools indicating whether we need to transpose each operand first. For transposed operands we should allocate them as, for example, randn(k, m) instead of randn(m, k).
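For illustration, one way to allocate the operands per layout; make_operands and the transpose_a/transpose_b flags are hypothetical names, not taken from this PR:

```python
import torch


def make_operands(m: int, n: int, k: int, transpose_a: bool, transpose_b: bool,
                  dtype=torch.half, device="cuda"):
    # Allocate each operand in the layout it is actually stored in, then
    # transpose the view so the matmul still sees (m, k) @ (k, n).
    a = (torch.randn(k, m, dtype=dtype, device=device).t() if transpose_a
         else torch.randn(m, k, dtype=dtype, device=device))
    b = (torch.randn(n, k, dtype=dtype, device=device).t() if transpose_b
         else torch.randn(k, n, dtype=dtype, device=device))
    return a, b
```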
So fd.from_pytorch should be able to describe the inputs correctly, right? But fd.define_tensor with the contiguity flags will not work.
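To make that contrast concrete, a hard-coded definition along the lines below (a sketch, assuming the frontend's shape/contiguity/dtype keywords) pins the operands to a fully contiguous row-major layout and so cannot describe transposed inputs, whereas fd.from_pytorch would infer the right stride order from the tensors actually passed in:

```python
from nvfuser import FusionDefinition, DataType


def fixed_layout_fusion(fd: FusionDefinition) -> None:
    # Hypothetical hard-coded definition: always fully contiguous, row-major
    # 2-D operands, regardless of how the benchmark allocated the inputs.
    a = fd.define_tensor(shape=[-1, -1], contiguity=[True, True],
                         dtype=DataType.Half)
    b = fd.define_tensor(shape=[-1, -1], contiguity=[True, True],
                         dtype=DataType.Half)
    out = fd.ops.matmul(a, b)
    fd.add_output(out)
```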
For the future, we should probably have a separate benchmark for baselines (eager / torch.compile).
Maybe we should do that in another PR, since the current one is enough to benchmark ATen vs. nvFuser using env vars.
@jacobhinkle Should we merge this PR?
Full speed ahead! Thanks for the changes.
Fixes #2082