Commit
pt: fix se_a type_one_side performance degradation (#3361)
The code in this PR is ugly, but applying a mask causes a performance degradation of ~3 ms/step. When a mask is applied, `aten::nonzero` has a high host time because it forces a host-device synchronization:

![image](https://github.com/deepmodeling/deepmd-kit/assets/9496702/86b3518c-206d-410d-928e-2f605746147c)

After fixing:

![image](https://github.com/deepmodeling/deepmd-kit/assets/9496702/af9e86fa-7908-4bbb-ace7-58b4602e167f)

See pytorch/pytorch#12461 for more information.

Signed-off-by: Jinzhe Zeng <[email protected]>
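For illustration only, here is a minimal sketch of the general pattern, not the code in this PR. It contrasts boolean-mask indexing, which goes through `aten::nonzero`, with a `torch.where` selection that keeps every tensor at a fixed shape and so needs no index list on the host. The function names, shapes, and the per-type weight layout are hypothetical.

```python
import torch


def embed_with_mask(x, atype, weights):
    """Per-type matmul using boolean-mask indexing (hypothetical helper).

    x: (n, d_in), atype: (n,) integer types, weights: (ntypes, d_in, d_out).
    Boolean-mask indexing is lowered to aten::nonzero, whose output size
    depends on the data, so on a GPU the host has to wait for the device.
    """
    out = torch.zeros(x.shape[0], weights.shape[-1], dtype=x.dtype, device=x.device)
    for t in range(weights.shape[0]):
        mask = atype == t
        out[mask] = x[mask] @ weights[t]
    return out


def embed_without_mask(x, atype, weights):
    """Same result with fixed-shape ops only (hypothetical helper).

    Every row is multiplied by every type's weights and torch.where keeps
    the matching rows. This does extra arithmetic but never calls nonzero.
    """
    out = torch.zeros(x.shape[0], weights.shape[-1], dtype=x.dtype, device=x.device)
    for t in range(weights.shape[0]):
        mask = (atype == t).unsqueeze(-1)  # (n, 1), broadcasts over d_out
        out = torch.where(mask, x @ weights[t], out)
    return out
```

Whether the extra arithmetic is cheaper than the synchronization depends on the tensor sizes and the number of types, so profile both variants before choosing one.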