-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add information for coordinating segments in python frontend. #3289
Conversation
* Track inputs, outputs, and extents
!build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the delayed review.
} | ||
TensorView* tv = v->as<TensorView>(); | ||
std::vector<IterDomain*> logical_dom = | ||
TensorDomain::noReductions(tv->getLogicalDomain()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wujingyue is trying to change how we bind IO buffers to kernels. i.e. we might rethink which domain and how we are going to use here.
Not proposing any change, just trying to raise awareness.
std::vector<Val*> extents = getExtents(fusion_); | ||
for (Val* extent : extents) { | ||
int64_t num_extents = (int64_t)extents_fid_.size(); | ||
int64_t extent_fid = -num_extents - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a negative index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All scalars, vectors, and tensors use positive indices. The extents do not exist in the FusionState
, so I used the negative numbers exclusively for the extent scalars.
The extents are the size of iterDomain
in CPP fusion. We don't track those in FusionDefinition
but they can become input arguments to a FusionDefinition
after segmentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the negative number here is just an initialization? does the number carry any meaning or does a global -1
would do it just fine?
sorry I might miss the part where extents_fid_ is being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an ordering component to the extent index.
It is used for the same purpose as collecting the extents in prepareRuntimeOrder
.
https://github.com/NVIDIA/Fuser/blob/main/csrc/runtime/fusion_cache_utils.cpp#L199-L208
We're mapping the tensor sizes to the extents like so https://github.com/NVIDIA/Fuser/pull/3025/files#diff-e512bea3b02f75ab1e81b759562879c5867e6e863679d6e7696fa34087dc3dc9R98-R100.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add this in a comment listing the use of negative indices to avoid conflict with other indices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got'ya. It's hard to figure out the necessity without looking at the actual use. We can keep it as-is and revisit in follow up PRs.
// The extent can already exist in the fusion. However, since scalars cannot | ||
// be passed between segments, always overwrited existing fids. The original | ||
// fusion definition will provide scalar extents. | ||
map_value_to_fid_[extent] = extent_fid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit lost here.
iiuc, the map_value_to_fid_ on other values are mapped from the Val* to their index field in FusionState
. Here looks like we are trying to create a the same thing for each TensorView's logical domain. Where are we creating the python container for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not exposing the TensorView's logical domain to the python frontend, but I am tracking it in the FusionState
. We may have to pass the scalar extents of the TensorView's logical domain as an input argument to a fusion segment.
Is it possible to add a test demonstrating what new information the |
2cbb9d4
to
8310ace
Compare
LGTM overall. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::vector<Val*> extents = getExtents(fusion_); | ||
for (Val* extent : extents) { | ||
int64_t num_extents = (int64_t)extents_fid_.size(); | ||
int64_t extent_fid = -num_extents - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got'ya. It's hard to figure out the necessity without looking at the actual use. We can keep it as-is and revisit in follow up PRs.
!build |
Summary
Information Flow
|
Overview
This PR adds information necessary for coordinating segments in the python frontend. Changes pulled from #3025.
PR Details
Implementation Details
FusionState
is a lightweight representation of a CPPFusion
.buildFusionIr
, a CPPFusion
is created from the PythonFusionDefinition
. At this point, theFusionState
creates a mapping from CPPFusion
to itsState
objects.FusionState
is temporary and the CPPFusion
is cached inFusionCache
. The information linking the CPPFusion
and PythonFusionDefinition
is stored inFusionCache
.FusionState
, we look for a cached CPPFusion
. If it exists, we restore the mapping from the data stored inFusionSchedules
.