Architecture specific optimizations #83

Open
codekatana opened this issue Oct 4, 2017 · 6 comments

Comments

@codekatana

Hello, I would like to know whether ARM SIMD (NEON) routines could be added to the huff0 and/or FSE encode/decode parts. That would let me make them run a bit faster on a Raspberry Pi.
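For context, a NEON routine of the kind asked about might look roughly like the sketch below. It is only an illustration under stated assumptions: `add_counts_simd` is a hypothetical helper that sums two count tables, it is not taken from huff0 or FSE, and it uses NEON intrinsics from `<arm_neon.h>` rather than hand-written assembly. A scalar fallback keeps it buildable on targets without NEON.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not part of huff0/FSE): dst[i] += src[i] for i in [0, n). */
void add_counts_simd(uint32_t *dst, const uint32_t *src, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Process four 32-bit lanes per iteration with NEON intrinsics. */
    for (; i + 4 <= n; i += 4) {
        uint32x4_t a = vld1q_u32(dst + i);
        uint32x4_t b = vld1q_u32(src + i);
        vst1q_u32(dst + i, vaddq_u32(a, b));
    }
#endif
    /* Scalar tail; also the whole loop when NEON is unavailable. */
    for (; i < n; i++) {
        dst[i] += src[i];
    }
}
```

Whether anything in the actual entropy coders has this simple, independent-element shape is exactly what would need to be investigated.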

@MarcusJohnson91

Why not just compile it with clang and tell it to vectorize the loops?
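As a minimal sketch of what this suggestion relies on: the loop below is written in a shape that clang's auto-vectorizer can typically handle (a simple counted loop, no cross-iteration dependency, `restrict` to rule out aliasing). The function `add_counts` is hypothetical and not part of huff0 or FSE. Building with something like `clang -O3 -Rpass=loop-vectorize` should report whether the loop was vectorized; on 32-bit ARM an option such as `-mfpu=neon` may also be needed, depending on the toolchain defaults.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of huff0/FSE): dst[i] += src[i] for i in [0, n).
 * `restrict` promises the compiler that dst and src do not overlap,
 * which removes one common obstacle to auto-vectorization. */
void add_counts(uint32_t *restrict dst, const uint32_t *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] += src[i];
    }
}
```

Loops where each iteration depends on the previous one, as bitstream decoding often does, may not be vectorized this way, so the compiler's report is worth checking rather than assuming.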

@codekatana
Author

Indeed, that's a nice way to do it. However, wouldn't it be nicer if, just like BLAS (OpenBLAS), we had some hand-coded assembly?

@Cyan4973
Owner

Cyan4973 commented Oct 5, 2017

It's a non-trivial amount of work, with no guarantee of success.
I'm certainly open to a patch if someone wants to try it.

@codekatana
Author

I agree, Yann. I have been going through the huff_* and fse* files to understand the code and identify candidate areas. I have also been reading your blog to understand zstd and find a suitable area that could be accelerated with SIMD on ARM. I would very much appreciate any pointers on that.

@MarcusJohnson91

@codekatana No. If it were my repo, I'd want to keep the codebase as clean as possible.

@codekatana
Author

codekatana commented Oct 5, 2017

@bumblebritches57 - Yes, I can understand. Assembly tends to be hard to read and maintain, but in some situations it provides good results. That's why BLAS libraries do their calculations in assembly rather than in a high-level language.
