Host benchmarking for a fusion with multiple segments #3307
Can we validate that the fusion results in the intended number of segments?
Done. I mistakenly wrote that it is 11 segments; it is 12 segments with the permute operation.
fd.validate(input, [eager_output])

# Validate number of segments
_ = fd.execute(input, profile=True)
Note to self:
Create an issue: Allow fd.validate to accept nvfuser outputs in which case, they are directly compared with the reference outputs.
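The segment-count check discussed above could look roughly like the sketch below. It is not the code from this PR: `check_segment_count` is a hypothetical helper, and it assumes that the object returned by `fd.profile()` after `fd.execute(..., profile=True)` exposes an integer `segments` attribute. A stand-in object is used so the sketch runs without a GPU or nvfuser.

```python
from types import SimpleNamespace


def check_segment_count(profile, expected_segments):
    """Assert that a profiled fusion has the expected number of segments.

    `profile` is any object with an integer `segments` attribute.
    Assumption: the profile returned by `fd.profile()` exposes this field.
    """
    num_segments = profile.segments
    assert num_segments == expected_segments, (
        f"Expected {expected_segments} fusion segments, got {num_segments}."
    )
    return num_segments


# Stand-in for a real profile object (hypothetical, for illustration only).
fake_profile = SimpleNamespace(segments=12)
check_segment_count(fake_profile, expected_segments=12)
```

With a real fusion, the same helper would be called as `check_segment_count(fd.profile(), 12)` right after the profiled `fd.execute` call, assuming the attribute name holds.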
LGTM
!build
This benchmark uses matmul + pointwise op to create a fusion with 12 segments instead of using `segment_set` to force segmentation.
For `host_benchmark_mode='compile'`, the profile is shown below. The `Finding valid segment solutions` pass takes 52 ms.