-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Auto merge of #16350 - roife:neon-support-for-line-index, r=Veykril
internal: Speedup line index calculation via NEON for aarch64 This commit provides SIMD acceleration (via NEON) for `line-index` library on aarch64 architecture, which improves performance for Apple Silicon users (and potentially for future aarch64-based chips). The algorithm used here follows the same process as the original implementation using SSE2. Most of the vector instructions in SSE2 have corresponding parts in neon. The only issue is that there is no corresponding instruction for `_mm_movemask_epi8` in neon. To address this problem, I referred to the article at https://community.arm.com/arm-community-blogs/b/infrastructure-solutions-blog/posts/porting-x86-vector-bitmask-optimizations-to-arm-neon.
- Loading branch information
Showing
1 changed file
with
112 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters