cpu: aarch64: batch_normalization : Expand ARM SVE support in jit_uni_batch_normalization #1918

Merged
merged 2 commits into from
Jun 3, 2024

Conversation

nikhilfujitsu

@nikhilfujitsu nikhilfujitsu commented May 15, 2024

Description

This commit extends the existing ARM SVE support in jit_uni_batch_normalization to additional vector lengths, so that the implementation is no longer tied to a single SVE vector length.

Major Code changes:

Updated the block size definition to accommodate other SVE vector lengths.
Added 'OR' and 'AND' conditions to extend support to other SVE vector lengths.
Predicate registers are set according to the ISA vector length.
ldr and str instructions are replaced by ld1w and st1w respectively, so that load and store operations respect the ISA vector length.
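
To illustrate why the block size, the predicate setup, and the switch to predicated loads go together, here is a minimal scalar model in plain C++. It is only a sketch and not the code from this PR: `simd_w`, `make_predicate`, and `predicated_load` are hypothetical names, and the "register" is just a `std::vector<float>`. The assumption being modelled is that a 256-bit ISA may run on hardware with wider vectors, so only the lanes covered by the ISA vector length should be read.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of fp32 lanes in a vector of vlen_bytes
// (for example 32 bytes -> 8 lanes, 64 bytes -> 16 lanes).
constexpr std::size_t simd_w(std::size_t vlen_bytes) {
    return vlen_bytes / sizeof(float);
}

// Hypothetical model of a predicate register: one flag per hardware lane,
// set only for the lanes covered by the ISA vector length.
std::vector<bool> make_predicate(
        std::size_t hw_vlen_bytes, std::size_t isa_vlen_bytes) {
    assert(isa_vlen_bytes <= hw_vlen_bytes);
    std::vector<bool> pred(simd_w(hw_vlen_bytes), false);
    for (std::size_t i = 0; i < simd_w(isa_vlen_bytes); ++i)
        pred[i] = true;
    return pred;
}

// Models a zeroing predicated load in the spirit of ld1w: active lanes are
// read from src, inactive lanes become 0 and src is not accessed for them.
std::vector<float> predicated_load(
        const std::vector<bool> &pred, const float *src) {
    std::vector<float> reg(pred.size(), 0.0f);
    for (std::size_t i = 0; i < pred.size(); ++i)
        if (pred[i]) reg[i] = src[i];
    return reg;
}
```

In this model an unpredicated full-register load (the role ldr plays) would read all `simd_w(hw_vlen_bytes)` elements, which is more than one block when the ISA vector length is smaller than the hardware vector length.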

Checklist

General
[✓] Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit? Yes
Test output is the same with and without this commit.

make test output:
99% tests passed, 1 tests failed out of 103

Total Test time (real) = 36.37 sec

The following tests FAILED:
102 - test_graph_unit_cpu (Failed)
Errors while running CTest
Output from these tests are in: /home/nikhil/TEST/oneDNN/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
make: *** [Makefile:71: test] Error 8

make test_benchdnn_* output:
make: *** No rule to make target 'test_benchdnn_*'. Stop.

make test_benchdnn_bnorm_ci_cpu/fast output.
tests:4445 passed:1232 skipped:2989 mistrusted:224 unimplemented:0 invalid_arguments:0 failed:0 listed:0

cpu: aarch64: Expand ARM SVE support in jit_uni_batch_normalization

Added sve_256 to the implementation list
Update jit_uni_batch_normalization.cpp
Updated the block size definition to accommodate different ISAs.
Added 'OR' conditions to extend support for additional block_size values.
Predicate registers are set according to the ISA vector length.
ldr and str instructions changed to ld1w and st1w respectively, to support load and store operations as per the ISA.
@mgouicem mgouicem added the platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64 label May 17, 2024
@mgouicem mgouicem requested a review from jondea May 17, 2024 07:46

@mgouicem mgouicem left a comment

Thank you for the contribution. Tagging @jondea

@jondea jondea left a comment

Thank you for the contribution, we really appreciate your work on vlen 256. I just have a few comments about whether it could be more general. If it works for an SVE vlen of 256 and 512, why not any length? Then you could remove any hard coded vlen variables, and there could be a single implementation.
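
As a rough illustration of the vector-length-agnostic style suggested here, the following plain C++ sketch takes the lane count as a run-time value and handles the tail with a whilelt-style active-lane count, so the same routine covers any vector length. It is a scalar model under stated assumptions, not the JIT kernel: `normalize_vla` and its parameters are hypothetical, and only the basic `(x - mean) / sqrt(var + eps)` step of batch normalization is shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: normalise x using a lane count chosen at run time.
// `lanes` stands in for the hardware vector length in fp32 elements.
std::vector<float> normalize_vla(const std::vector<float> &x, float mean,
        float var, float eps, std::size_t lanes) {
    assert(lanes > 0);
    std::vector<float> y(x.size());
    const float inv_std = 1.0f / std::sqrt(var + eps);
    for (std::size_t base = 0; base < x.size(); base += lanes) {
        // whilelt-style predicate: only lanes with base + i < n are active,
        // so the final partial block needs no separate code path.
        const std::size_t active = std::min(lanes, x.size() - base);
        for (std::size_t i = 0; i < active; ++i)
            y[base + i] = (x[base + i] - mean) * inv_std;
    }
    return y;
}
```

Because nothing in the loop depends on a fixed width, calling it with 4, 8, or 16 lanes gives the same result, which is the property a single implementation for all vector lengths would rely on.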

@vpirogov vpirogov added this to the v3.6 milestone May 21, 2024
@abhijain1204fujitsu

@vpirogov, hello.
Could you please merge the PR, since the changes are approved?
If anything else is required from our end, please let us know.

@densamoilov densamoilov merged commit 0a1d0fb into oneapi-src:main Jun 3, 2024
10 checks passed
@nikhilfujitsu nikhilfujitsu deleted the batch_norm branch September 3, 2024 10:03
@nikhilfujitsu nikhilfujitsu restored the batch_norm branch September 3, 2024 10:03
Labels
platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64

6 participants