Architecture specific optimizations #83

codekatana · 2017-10-04T13:07:39Z

Hello, would it be possible to add ARM SIMD (NEON) routines to the huff0 and/or FSE encode/decode paths? That would let them run a bit faster on a Raspberry Pi.

MarcusJohnson91 · 2017-10-04T15:23:17Z

Why not just compile it with clang and tell it to vectorize the loops?
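A minimal sketch of what that suggestion could look like in practice. The function `xor_bytes` is hypothetical and not part of huff0 or FSE; it only illustrates a loop shape (independent iterations, no aliasing) that an auto-vectorizer can usually handle.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from huff0 or FSE: XOR two byte buffers.
 * Each iteration is independent of the others, and `restrict` promises
 * the compiler that the buffers do not overlap, so the loop is a good
 * candidate for auto-vectorization. */
void xor_bytes(uint8_t *restrict dst, const uint8_t *restrict a,
               const uint8_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = (uint8_t)(a[i] ^ b[i]);
    }
}
```

Building with something like `clang -O3 -Rpass=loop-vectorize -c file.c` should make clang report which loops it vectorized; on a 32-bit ARM target, NEON may also need to be enabled, for example with `-mfpu=neon`. Loops that carry state from one iteration to the next, as entropy decoders tend to, may not vectorize this way.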

codekatana · 2017-10-05T05:25:22Z

Indeed, that's a nice way to do it. However, wouldn't it be nicer if, just like BLAS (OpenBLAS), we had some hand-coded assembly?

Cyan4973 · 2017-10-05T07:04:58Z

It's a non-trivial amount of work, with no guarantee of success.
I'm certainly open to a patch if someone wants to try it.

codekatana · 2017-10-05T07:54:43Z

I agree, Yann. I have been going through the huff_* and fse* files to understand the code and identify possible areas. I have also been reading your blog to understand zstd and find a suitable area that could be accelerated using SIMD on ARM. I would very much appreciate any pointers.

MarcusJohnson91 · 2017-10-05T09:18:28Z

@codekatana No. If it were my repo, I'd want to keep the code base as clean as possible.

codekatana · 2017-10-05T09:56:02Z

@bumblebritches57 - Yes, I can understand. Assembly tends to be hard to read and maintain, but in some situations it gives good results. That's why BLAS libraries do their calculations in assembly rather than in a high-level language.

Cyan4973 added the help needed label Oct 5, 2017
