forked from flexflow/flexflow-train
Bert fix2 #6
Open
xinhaoc wants to merge 82 commits into merged_bert from bert_fix2
Changes from 81 commits
Commits
2436152  push changes to print dot graph
7030ae0  padded input to 512, added isExact to slice_tensor
280d29a  Add multiprecision support to replicate (lockshaw)
8e84401  Add explicit template instantiations for replicate kernels (lockshaw)
a2ddb91  Fix incorrect instantiations (lockshaw)
52cc8e8  Add nop init_task for replicate (lockshaw)
9eb530a  Fix replicate init_task registration (lockshaw)
d7a219c  Hopefully print hip errors (lockshaw)
f323a6e  Instantiate extra hip replicate kernels (lockshaw)
9a75f24  fix (jiazhihao)
fe61cbe  Merge branch 'BertMLM_fixes' of https://github.com/flexflow/FlexFlow … (jiazhihao)
c9277f3  debug changs (jiazhihao)
fe66561  Add slice_tensor fix
9e04302  Merge branch 'BertMLM_fixes' of github.com:flexflow/FlexFlow into Ber…
1177748  Add logging for metrics (lockshaw)
5b7cace  Add the cuda metrics hack to hip kernel as well (lockshaw)
e798a91  Add parallel dim pretty printing (lockshaw)
90541cf  [Embedding] bug fix (jiazhihao)
63fcde6  Merge branch 'BertMLM_fixes' of https://github.com/flexflow/FlexFlow … (jiazhihao)
7862143  Add replica dim to pretty print (lockshaw)
9663d96  Merge remote-tracking branch 'refs/remotes/origin/BertMLM_fixes' into… (lockshaw)
ef43c36  Fix replicate issue with python hack (tnoyola)
dd8090e  Use local json submodule (lockshaw)
0dc6187  ofi conduit-related fixes
0950ac7  Add mpi flags for hip (lockshaw)
4b06040  fix fusion bug (jiazhihao)
6796b1c  Merge branch 'BertMLM_fixes' of https://github.com/flexflow/FlexFlow … (jiazhihao)
99e9f95  increase the max number of regions in a ZeroInitMeta from 64 to 128 (jiazhihao)
282c44a  support mixed precision (jiazhihao)
992dcb9  undo changes to Fused::Transpose (jiazhihao)
f528774  undo changes to config.linux (jiazhihao)
a68150d  try to fix layernorm (jiazhihao)
2bf9afc  fix typo (jiazhihao)
f6f7a32  Add possible layernorm fix (lockshaw)
5e03b0a  Fix additional layernorm bug due to get_piece_size return size in bytes (lockshaw)
53fb8bd  Bugfixes (tnoyola)
449a14c  Actually check elementwise_affine (lockshaw)
c737be6  Revert "Actually check elementwise_affine" (tnoyola)
a98e09d  Change optimizer to adam with correct hyperparams (lockshaw)
66b805e  Merge remote-tracking branch 'refs/remotes/origin/BertMLM_fixes' into… (tnoyola)
4bec811  fix training bert model. (xinhaoc)
2d28c15  revert changes (xinhaoc)
2025d56  fix bert training issue. (#832) (xinhaoc)
5f793c1  Improve machine_view hash (lockshaw)
2c09397  Fix bugs in improved hashing (lockshaw)
862e9d7  fix weight dimension in layernorm (xinhaoc)
d29bf1d  Merge branch 'BertMLM_fixes' of https://github.com/flexflow/FlexFlow … (xinhaoc)
88ad5fa  Merge remote-tracking branch 'origin/master' into BertMLM_fixes (lockshaw)
2eee875  fix `preregister_task_variant` issue, linting (goliaro)
b9d1332  try to run graph_optimize on each node (jiazhihao)
b5b0815  remove unnecessary file (jiazhihao)
94e35d9  fix hip build (xinhaoc)
ac185e3  Merge branch 'BertMLM_fixes' of https://github.com/flexflow/FlexFlow … (xinhaoc)
ded175c  bypass simulator creation when only_data_parallel is specified (jiazhihao)
1f7e8b7  add nccl prints (jiazhihao)
3fb70f6  . (jiazhihao)
d652b62  rccl (xinhaoc)
b39528b  fix fuse (xinhaoc)
0cf3c8e  fix hip (xinhaoc)
17a1c4e  more fix to hip (xinhaoc)
f65044d  customized kernel for broadcasting add. (xinhaoc)
bcab56a  dropout (xinhaoc)
fa1fffc  optimizer (xinhaoc)
40d830c  opt (xinhaoc)
e825526  fix (xinhaoc)
d2bdb15  fix (xinhaoc)
3b9e1c6  . (xinhaoc)
9f8bb9e  fix (xinhaoc)
fb91122  remove print (xinhaoc)
c162d4c  fix hip (xinhaoc)
ea79317  fix multinodes (xinhaoc)
58d84ed  fix (xinhaoc)
01c9d4c  fix (xinhaoc)
a31f8e9  fix (xinhaoc)
9141c46  tp (xinhaoc)
8185289  timer (xinhaoc)
d958805  rmv (xinhaoc)
38dfd87  fix tp (xinhaoc)
355d4b4  try a fix (xinhaoc)
8488ba0  fix hip (xinhaoc)
1753e7e  Merge remote-tracking branch 'xinhao/merged_bert' into bert_fix2 (xinhaoc)
d5496e9  update submodule (xinhaoc)
@@ -1,4 +1 @@
-include(FetchContent)
-
-FetchContent_Declare(json URL https://github.com/nlohmann/json/releases/download/v3.10.5/json.tar.xz)
-FetchContent_MakeAvailable(json)
+add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/deps/json)
Submodule tokenizers-cpp deleted from c0fab1
@@ -0,0 +1,5 @@
+changes:
+cudnnSetTensorDescriptorFromDomain4SoftMax
+try_one_lambda in grpah.cc
+
+field_space = runtime->create_field_space(lg_ctx in model.cc
Are you sure this is the right version of Legion? It looks like you used an old version of Legion.