Weird setup leading to surprising results? #69
Comments
Are you saying that byte based parsers/generators have an advantage in some tests here?
Note that the fastest parsers and generators use "pre-encoded keys" to skip the encoding and escaping of keys (and IMO they gain a lot of performance from this, for example by skipping the byte <-> char encoding/decoding altogether). Some JSON libraries do this for BOTH parsing and generation and some ONLY for generation (but the benefits of pre-encoded keys can be seen for both parsing and generation, and are a function of the ratio of keys to data in the content). Hence, instead of the servlet API readers/writers, some JSON libraries prefer the servlet InputStream/OutputStream and process bytes, because some byte <-> char conversions can be skipped entirely. A servlet container can also add further encoding/decoding such as compression, of course (and then we get into buffering details).
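To make the "pre-encoded keys" idea concrete, here is a minimal sketch, not taken from any library in the benchmark. The class name `PreEncodedKeySketch`, the `name` field and the helper methods are hypothetical, and the value is assumed to need no JSON escaping.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PreEncodedKeySketch {
    // Hypothetical key, quoted and encoded to UTF-8 once when the class loads.
    private static final byte[] NAME_KEY = "\"name\":".getBytes(StandardCharsets.UTF_8);

    // Writes {"name":"<value>"} straight to a byte stream.
    // Only the value is encoded per call; the key bytes are reused as-is.
    static void writeUser(OutputStream out, String name) throws IOException {
        out.write('{');
        out.write(NAME_KEY);
        out.write('"');
        // Assumption: the value contains nothing that needs JSON escaping.
        out.write(name.getBytes(StandardCharsets.UTF_8));
        out.write('"');
        out.write('}');
    }

    // Convenience wrapper so the result can be inspected as text.
    static String render(String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            writeUser(out, name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(render("Ada"));
    }
}
```

The more of the payload that consists of keys rather than values, the larger the share of per-call encoding work this kind of shortcut avoids, which matches the ratio argument above.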
It's more that the tests are not comparable (the String/StringBuilder ones will likely be slower than the byte-based ones, for example, assuming everything else is equal). I'm not talking about the implementation or pre-encoding here, but really about the "stream pipeline". So overall, some alignment is needed, and both cases should be reported for the results to be relevant, IMHO.
Valid concern. This benchmark was originally designed to test the fastest (theoretical) code paths for de/serializing JSON and to help devs pick a JSON lib in general, not in a particular context like the servlet API. We could introduce more tests that would do this sort of apples-to-apples comparison for "char[]"-based coders only. Just make sure to pick the type of input/output that gives the lowest overhead / best perf in the servlet API context. Best to control this via a new Api flag so we can easily generate distinct results for it. Contributions are welcome.
A lot of these libs don't have native integration with servlet libraries, so to use them you'd have to work directly with the input/output streams anyway, skipping the servlet readers and writers.
Well, servlet was really just an example; it is exactly the same with plain files, network streams, etc. Basically you have two choices: use byte data and handle the encoding at the JSON layer, which works well for built-in encodings (mainly UTF-8), or delegate the handling to a charset. From a caller's perspective, readers/writers are always safer when the encoding is contextual (but a bit slower; that said, the perf difference is negligible as soon as you add any I/O ;)) than bytes and multiple encoding layers. As usual there is no silver bullet, so it depends a bit on the context, but I just wanted to highlight that benchmarking in a relevant context can require tuning the suite, and that the default results should be taken with caution/review (not blaming; benchmarking in a relevant manner is hard).
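The two choices described above can be sketched as follows. This is only an illustration: the class `StreamChoiceSketch` and its methods are hypothetical, and an already-serialized JSON string stands in for a real generator.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamChoiceSketch {
    // Choice 1: the JSON layer owns the encoding and always emits UTF-8 bytes.
    static byte[] viaBytes(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Choice 2: the JSON layer only produces chars; the Writer, configured by
    // the caller or container, decides how those chars become bytes.
    static byte[] viaWriter(String json, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, charset)) {
            writer.write(json);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String json = "{\"k\":\"\u00e9\"}"; // 9 chars, one of them non-ASCII
        System.out.println(viaBytes(json).length);                              // 10
        System.out.println(viaWriter(json, StandardCharsets.ISO_8859_1).length); // 9
    }
}
```

The byte path is fixed to one encoding, while the writer path follows whatever charset the surrounding context dictates, at the cost of an extra conversion layer.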
Hi,
Not sure if I misread the setup, but there are generally two main categories of implementations out there:
Indeed, it depends on several things, and even if byte-based ones are supposed to be faster, char-based ones are generally relevant when the chain uses char-based objects (readers/writers in the servlet API, for example) and delegates the char->byte handling to some backbone such as the servlet container, so as not to worry about the implementation details, or similarly in the case of files.
The issue with the current setup is that, depending on the provider, it uses either byte-based inputs/outputs or some non-optimal converters, including `StringBuilder` or plain `String` method shortcuts. So overall we can't compare the same chain between providers, nor easily pick the performance figures that match our app's use case (the switch can more than double or halve the perf). So it can be worth testing the different flavors, and even using the implementation shortcuts where possible, which allow providing a `char[]` directly to the JSON parser or equivalent when relevant.

Romain
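As a rough illustration of the "different flavors" point, the sketch below feeds the same payload to a char-based entry point from three kinds of input. Everything here is hypothetical: `InputFlavorSketch` is not part of the benchmark, and `countOpenBraces` is a toy stand-in for a real parser that accepts a `char[]`.

```java
import java.nio.charset.StandardCharsets;

class InputFlavorSketch {
    // Toy stand-in for a char-based parser entry point: it just counts '{'
    // characters (including any inside string values, so it is not a parser).
    static int countOpenBraces(char[] buf, int len) {
        int count = 0;
        for (int i = 0; i < len; i++) {
            if (buf[i] == '{') {
                count++;
            }
        }
        return count;
    }

    // Flavor 1: the caller already holds a char[]; no conversion step.
    static int fromChars(char[] json) {
        return countOpenBraces(json, json.length);
    }

    // Flavor 2: the caller holds a String; toCharArray() adds one copy.
    static int fromString(String json) {
        char[] chars = json.toCharArray();
        return countOpenBraces(chars, chars.length);
    }

    // Flavor 3: the caller holds UTF-8 bytes; decoding plus a copy are added.
    static int fromBytes(byte[] utf8) {
        return fromString(new String(utf8, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String json = "{\"a\":{\"b\":1}}";
        System.out.println(fromChars(json.toCharArray()));
        System.out.println(fromString(json));
        System.out.println(fromBytes(json.getBytes(StandardCharsets.UTF_8)));
    }
}
```

All three flavors give the same answer, but each adds a different amount of conversion work before the parser sees the data. That conversion cost is what differs between providers when the benchmark does not align the input type.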