Using SIMD for dealing with json (and more) at speed #13773

hpvd · 2024-08-07T20:41:38Z

Using SIMD for dealing with json at speed

Inspired by PostgreSQL (up to 4-fold speedup), see:
https://www.phoronix.com/news/PostgreSQL-Opt-JSON-Esc-SIMD

More and more CPUs support AVX-512 or its successors:
https://www.phoronix.com/review/simdjson-avx-512
https://simdjson.org/ (used by ClickHouse, Apache Doris, ...)
https://github.com/simdjson/simdjson (Apache 2.0 license)

abhioncbr · 2024-08-09T01:31:09Z

@hpvd, what would you suggest, using the simdjson library for all JSON data handling or something else?

hpvd · 2024-08-09T17:10:01Z

I think this would be a multistep approach. We can look at what is possible on https://simdjson.org/, pick one place in Pinot, and give it a try. In the end we can utilize it in many ways.

siddharthteotia · 2024-08-10T20:21:45Z

@hpvd @abhioncbr - I have been very interested in exploring wider, more holistic use of SIMD in Pinot. Historically, that endeavor has not been successful because Java had no support for the low-level primitives. JNI is of course an option.

For this issue, how are you planning to use SIMD in the Pinot codebase? Is it via a JNI bridge that we build over Intel compiler intrinsics, or using an abstraction (e.g. the JDK Vector API, available as an incubator module since JDK 16), or something else?

siddharthteotia · 2024-08-10T20:26:07Z

My high-level suggestion would be that if there is indeed a possible path to leverage SIMD acceleration in Java, then rather than doing piecewise work for a specific scenario, it would be better to first get a handle on how it will be integrated into the Pinot codebase, so that we can also reuse it in other appropriate places (e.g. in the query engine). We also need to evaluate the portability aspect.

> We can look at what is possible on https://simdjson.org/ and just pick one place in Pinot and give it a try. In the end we can utilize it in many ways..

Agree with POCing one aspect, but when we actually decide to build the feature, it should ideally be designed with broader, long-term use in mind, since we are likely going to introduce platform-specific dependencies into the codebase.

hpvd · 2024-08-10T20:52:46Z

@siddharthteotia have you already looked into this one:
https://github.com/simdjson/simdjson-java

hpvd · 2024-08-10T21:23:17Z

We could also look into how Apache Doris leverages it.

abhioncbr · 2024-08-10T21:43:52Z

Yes, my understanding was also to use the Java version of simdjson. As @hpvd suggested, we can explore how JDK-based projects are using it and take a path forward based on that.

siddharthteotia · 2024-08-11T07:09:47Z

> we can explore how jdk based projects are using it and we can take a path forward based on that.

+1. Yes, let's do a survey.

> https://github.com/simdjson/simdjson-java

This is based on the incubator version of vector support in the JDK (Project Panama by OpenJDK, AFAIK). Note that the package still says "incubator", so I am not sure about production use / support for this. We have done this in the past, where we took a dependency on a less-than-productionized library (Lbuffer), and it proved to be unstable once in a while. We have recently removed it.

So I think as a first step it would be good to see whether any of the latest JDK versions actually support it before we go much deeper into the POC / performance evaluation with the above library.

Take a look at project Gandiva (under Arrow) too. We can also build a JNI bridge ourselves.

I think the investment really depends on some value via POC.

Curious if @gortiz / @richardstartin have any advice / suggestions.

hpvd · 2024-08-11T11:45:49Z

This article is already a year old but pretty interesting: it shows how Elastic / Lucene leverage SIMD and handle incubating features, and it includes some benchmarks.
https://www.elastic.co/de/blog/accelerating-vector-search-simd-instructions

hpvd · 2024-08-11T11:51:42Z

This includes the history, state and goals of the Vector API in Java:
https://openjdk.org/jeps/469

kishoreg · 2024-08-11T16:14:32Z

This is a fantastic initiative and +100 on getting native SIMD. Given the pace at which Java is moving, it might be a good idea to slowly extract interfaces where SIMD can benefit. This will allow users/companies to stay with older jdk while other companies can move forward.

We don't want to get stuck in the same mode as last time, where moving off Java 8 meant waiting for all users to migrate away from it.
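To make the interface-extraction idea concrete, here is a minimal sketch, not code from Pinot. The names `DotProduct`, `ScalarDotProduct` and `DotProducts` are hypothetical, and dot product is only a stand-in operation. Callers depend on the interface, a scalar implementation works on any JDK, and a factory probes whether the `jdk.incubator.vector` module was resolved at startup. The vectorized implementation itself is deliberately left out.

```java
/** Hypothetical seam: callers depend on this interface, never on a SIMD API directly. */
interface DotProduct {
  float dot(float[] a, float[] b);
}

/** Portable baseline implementation with no SIMD-specific dependency. */
final class ScalarDotProduct implements DotProduct {
  @Override
  public float dot(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("length mismatch");
    }
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }
}

/** Hypothetical factory that picks an implementation once, at startup. */
final class DotProducts {
  private DotProducts() {}

  /** True only if the JVM resolved the incubator module (e.g. via --add-modules). */
  static boolean vectorModuleAvailable() {
    return ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
  }

  static DotProduct create() {
    if (vectorModuleAvailable()) {
      // Assumption: a Vector API-backed implementation, compiled separately for
      // newer JDKs, would be loaded here. It is omitted from this sketch, so the
      // scalar version is returned in both cases.
    }
    return new ScalarDotProduct();
  }
}
```

With a seam like this, users on older JDKs keep the scalar path, while deployments that opt in to a newer JDK could get a faster implementation without any change to calling code. `ModuleLayer` itself requires Java 9 or later, so a Java 8 build would need a different probe.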

hpvd · 2024-08-11T16:42:00Z

Yep, it would be great if we found a way for the people who want to and can (no hard internal restrictions, suitable hardware selection, ...) to benefit from new possibilities without having to wait until everybody is ready.

hpvd · 2024-08-11T16:44:20Z

just edited the title to SIMD for dealing with json *(and more)* at speed :-)

gortiz · 2024-08-12T06:33:17Z

> Curious if @gortiz / @richardstartin have any advice / suggestions.

I think explorations in this area are very interesting, but AFAIK Panama is not fast enough yet. Last month at JCrete we were discussing how to access native code efficiently, and it looks like nothing has changed (yet). Calling JNI/Panama code per row is prohibitively slow. The good news is that in the single-stage engine and in the leaf stages of the multi-stage engine these calls can be made at block level, so we should be able to absorb the cost of the JNI call.
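The per-row versus per-block point can be illustrated with a small sketch. It uses plain Java only: `FakeNativeSum` is a hypothetical stand-in that counts invocations rather than a real JNI or Panama call, and the block size is an arbitrary illustrative value, not a Pinot constant. It only shows how batching reduces the number of boundary crossings; it does not measure actual call overhead.

```java
/** Hypothetical stand-in for a native kernel; each invocation models one boundary crossing. */
final class FakeNativeSum {
  private long calls;

  /** Sums values[offset, offset + length) and records that one call was made. */
  long sum(int[] values, int offset, int length) {
    calls++;
    long total = 0;
    for (int i = offset; i < offset + length; i++) {
      total += values[i];
    }
    return total;
  }

  long calls() {
    return calls;
  }
}

final class BlockCalls {
  private BlockCalls() {}

  /** One call per row: the fixed per-call cost is paid values.length times. */
  static long sumPerRow(FakeNativeSum kernel, int[] values) {
    long total = 0;
    for (int i = 0; i < values.length; i++) {
      total += kernel.sum(values, i, 1);
    }
    return total;
  }

  /** One call per block: the fixed cost is paid ceil(values.length / blockSize) times. */
  static long sumPerBlock(FakeNativeSum kernel, int[] values, int blockSize) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    long total = 0;
    for (int start = 0; start < values.length; start += blockSize) {
      int length = Math.min(blockSize, values.length - start);
      total += kernel.sum(values, start, length);
    }
    return total;
  }
}
```

For 25,000 rows and a block size of 10,000, the per-row variant crosses the boundary 25,000 times and the per-block variant 3 times, with the same result. That is the sense in which a block-oriented engine could absorb a fixed per-call cost.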

hpvd · 2024-08-12T09:43:50Z

A good overview and starter:
"SIMD Parallel Programming with the Vector API" by José Paumard

> This session explains the differences between parallel streams and parallel computing, and how SIMD computations are working internally on simple examples. It then shows the patterns of code that the Vector API is giving along with their performances, and how you can use them to improve your in-memory data processing computations. More advanced techniques are also presented, to go beyond the basic examples.

https://www.youtube.com/watch?v=36DN9sE7ja4

It includes use cases and basic speed comparisons.

hpvd · 2024-09-10T10:07:40Z

Just to get an understanding of how other projects handle this:
for Apache Lucene, easier use of SIMD is one of the reasons for making Java 21 mandatory in the upcoming major release (v10, planned for October 1, 2024, see https://github.com/apache/lucene/milestone/2).

> Vectorization
>
> Parallelism and concurrency, while distinct, often translate to "splitting a task so that it can be performed more quickly", or "doing more tasks at once". Lucene is continually looking at new algorithms and striving to implement existing ones in more performant and efficient ways. One area that is now more straightforward to us in Java is data level parallelism - the use of SIMD (Single Instruction Multiple Data) vector instructions to boost performance.
>
> Lucene is using the latest JDK Vector API to implement vector distance computations that result in efficient hardware specific SIMD instructions. These instructions, when run on supporting hardware, can perform floating point dot product computations 8 times faster than the equivalent scalar code. This blog contains more specific information on this particular optimization.
>
> With the move to Java 21 minimum, it is a lot more straightforward to see how we can use the JDK Vector API in more places. We're even experimenting with the possibility of calling customized SIMD implementations with FFI, since the overhead of the native call is now quite minimal.

https://www.elastic.co/search-labs/blog/lucene-and-java-moving-forward-together

hpvd · 2024-10-29T09:37:12Z

As expected, Lucene changed its requirements and v10 now requires Java 21, see https://lucene.apache.org/core/corenews.html#apache-lucenetm-1000-available

hpvd · 2024-10-29T09:41:43Z

Just started a list to get an overview of what we are missing by staying compatible with older Java versions,
and to determine the right point in time when it may be worth dropping one or finding a way to work around it:
#14325

Jackie-Jiang added the feature request label Aug 7, 2024

hpvd changed the title ~~Using SIMD for dealing with json at speed~~ Using SIMD for dealing with json (and more) at speed Aug 11, 2024
