Enable segmentation support to python frontend user-scheduling (#3025)
## Overview:

The original `FusionDefinition` stores the sequence of sub-fusions and acts as an argument manager. It gathers the input arguments before running each sub-fusion and stores that sub-fusion's results. To do this, it uses a map from the segment index space to the original index space. This mapping was generated while creating the python definition for each sub-fusion.

## Changes in this PR

This PR implements the `_execute_segments` function for user-scheduler segmentation. It is the third PR in a stack, preceded by #3334 and #3335.

1. Implement the `_execute_segments` function in `nvfuser/__init__.py` to orchestrate the segments of the original fusion.
2. Add a `supports_segmentation` flag to `exec_nvfuser`, so segmentation testing is enabled by default for all python tests.

## Example:

### Original Fusion:

A reduction + broadcast + pointwise fusion.

```python
from nvfuser import FusionDefinition, DataType

def nvfuser_fusion_id1(fd : FusionDefinition) -> None :
    # Two 2-D float inputs with dynamic extents.
    T0 = fd.define_tensor(shape=[-1, -1], contiguity=[True, True], dtype=DataType.Float, is_cpu=False)
    T1 = fd.define_tensor(shape=[-1, -1], contiguity=[True, True], dtype=DataType.Float, is_cpu=False)
    # Reduce T0 over axis 1, broadcast the result back to 2-D, then add it to T1.
    T2 = fd.ops.sum(T0, dims=[1], keepdim=False, dtype=DataType.Float)
    T3 = fd.ops.broadcast(T2, is_broadcast_dim=[False, True])
    T4 = fd.ops.add(T1, T3)
    fd.add_output(T4)
```

## Step-by-Step execution of `_execute_segments`

### Step 1 before running any segments.

#### `map_original_fid_to_value` state: 6 entries

| Original Index | Description |
| -------------- | ----------- |
| 0 | The first tensor argument for the original fusion. |
| 1 | The second tensor argument for the original fusion. |
| -1 | Extent of axis 0 for first tensor argument. |
| -2 | Extent of axis 1 for first tensor argument. |
| -3 | Extent of axis 0 for second tensor argument. |
| -4 | Extent of axis 1 for second tensor argument. |

* The extent entries (-1 through -4) are omitted from the tables in later steps because these segments do not need them. They still count toward the entry totals.

### Step 2 after running the first segment.

#### `map_original_fid_to_value` state: 6 entries

| Original Index | Description |
| -------------- | ----------- |
| 1 | The second tensor argument for the original fusion. |
| 3 | The broadcast reduction tensor, which is the output from the first segment. |

* Removed the entry for `T0` because the first tensor argument is not required for the second segment.
* Added the entry for `T3`, which is the output from the first segment.

### Step 3 after running the second segment.

#### `map_original_fid_to_value` state: 5 entries

| Original Index | Description |
| -------------- | ----------- |
| 4 | The pointwise addition, which is the output for the original fusion. |

* Removed the entries for `T1` and `T3` because they are no longer needed.
* Added the entry for `T4`, which is the output from the second segment.

### Step 4 after running all segments.

* Return `T4` from `map_original_fid_to_value` as the result for the original fusion.
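
### Sketch of the bookkeeping

The four steps above can be mimicked without nvfuser at all. The following is a minimal sketch, not the code from this PR: `Segment`, `execute_segments_sketch`, `reduce_and_broadcast`, and `pointwise_add` are hypothetical names, nested Python lists stand in for tensors, and the rule "drop a non-negative entry once no later segment reads it" is an assumption inferred from the tables. Only the map keyed by original index mirrors `map_original_fid_to_value`.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class Segment:
    """Hypothetical stand-in for one sub-fusion (not an nvfuser class)."""

    def __init__(self, inputs: Sequence[int], outputs: Sequence[int],
                 fn: Callable[..., list]):
        self.inputs = list(inputs)    # original indices this segment reads
        self.outputs = list(outputs)  # original indices this segment writes
        self.fn = fn                  # returns one value per output index


def execute_segments_sketch(
    segments: List[Segment],
    original_inputs: Dict[int, object],
    original_outputs: Sequence[int],
) -> Tuple[list, List[List[int]]]:
    # Step 1: seed the map with the arguments of the original fusion.
    value_map = dict(original_inputs)
    history = [sorted(value_map)]

    for position, segment in enumerate(segments):
        # Gather this segment's arguments by original index and run it.
        args = [value_map[index] for index in segment.inputs]
        results = segment.fn(*args)

        # Store the segment's results under their original indices.
        for index, value in zip(segment.outputs, results):
            value_map[index] = value

        # Assumption: tensor entries (non-negative keys) are dropped once no
        # later segment reads them; extent entries (negative keys) are kept.
        still_needed = set(original_outputs)
        for later in segments[position + 1:]:
            still_needed.update(later.inputs)
        for index in list(value_map):
            if index >= 0 and index not in still_needed:
                del value_map[index]
        history.append(sorted(value_map))

    # Step 4: return the outputs of the original fusion.
    return [value_map[index] for index in original_outputs], history


def reduce_and_broadcast(t0):
    # First segment: sum over axis 1, then broadcast back to 2-D (T3).
    return [[[sum(row)] for row in t0]]


def pointwise_add(t1, t3):
    # Second segment: add the broadcast column T3 to T1 (T4).
    return [[[x + r[0] for x in row] for row, r in zip(t1, t3)]]


t0 = [[1.0, 2.0], [3.0, 4.0]]
t1 = [[10.0, 20.0], [30.0, 40.0]]
segments = [
    Segment(inputs=[0], outputs=[3], fn=reduce_and_broadcast),
    Segment(inputs=[1, 3], outputs=[4], fn=pointwise_add),
]
original_inputs = {0: t0, 1: t1, -1: 2, -2: 2, -3: 2, -4: 2}
outputs, history = execute_segments_sketch(segments, original_inputs, [4])
```

Here `history` records the keys held after each step, so its three snapshots line up with Steps 1 to 3: six entries, six entries, then five.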
Showing 6 changed files with 144 additions and 22 deletions.