Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tape reading without alloc #226

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Tape reading without alloc #226

wants to merge 1 commit into from

Conversation

poncito
Copy link

@poncito poncito commented Aug 14, 2022

Hi,

This PR is not finished, just a POC to get the discussion started.
So, the idea is to be able to read the tape without allocation. I created a benchmark with those results, where f0 just calls read, and f1 builds the tape, but then uses my non allocating code.

julia> @benchmark f0($str)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min  max):  995.800 ns  174.571 μs  ┊ GC (min  max): 0.00%  99.10%
 Time  (median):       1.025 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):     1.144 μs ±   3.819 μs  ┊ GC (mean ± σ):  7.44% ±  2.21%

  ▅██▇▆▅▄▃▂▂▃▃▃▃▃▂▂▁▁    ▁                                      ▂
  ████████████████████████████▇▇▆▇▆▅▆▇▅▆▄▅▆▅▆▆▆▅▇▇▅▄▆▅▅▆▅▆▅▆▅▅▆ █
  996 ns        Histogram: log(frequency) by time       1.51 μs <

 Memory estimate: 2.58 KiB, allocs estimate: 11.

julia> @benchmark f1($reader, $str)
BenchmarkTools.Trial: 10000 samples with 201 evaluations.
 Range (min  max):  394.900 ns   13.140 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     402.363 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   415.439 ns ± 162.861 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁█                                                             
  ██▅▆▄▂▂▃█▆▃▃▂▂▃▅▅▄▄▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  395 ns           Histogram: frequency by time          506 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

The implementation is completely non-mutable, and has two main concepts:

  • Cursor they just contain an Int that is an index of the tape,
  • JSONItem that contain the cursor, the tape and the original string.
    There are some variations of JSONItem, that implement some handy interface given the data they represent:
  • JSONField (that represents a field of an object), key and value can be accessed with key and value
  • JSONObject, that represents an object and is used as a dictionary (and that could be an AbstractDict),
  • JSONArray that acts represents an array, and can be iterated over efficiently.

A few details about the implementation:

  • To be non allocating, the interface has to be type stable and hence, the user is expected to provide expected types. If he does not, then JSONItem are returned.
  • The JSONItem contain the tape, and hence should be allocating. Yet, it's generally possible for the compiler to remove to those. In any case, users can use a low level use of the isbits Cursor to not rely on those compiler optimizations.
  • Objects' fields are not stored in a dictionary. This means that when looking for a key, the tape is walked through. It's possible to continue walking it from a previous cursor, which makes sense when the order of the fields is known to the user. I ran a few benchmarks, and for a reasonnable number of fields, it does not seem to be slower than using a dictionnary. Also, user would be free to iterate over the fields, and store the corresponding cursors in a dictionnary, if it would be more performant for them.
  • Similarly, arrays are not stored in an Vector.

@quinnj Are you interested?

To pursue this PR, I would like to:

  • Rename those objects. The JSON prefix is useless, but you already have an Object and an Array...
  • Implement a proper bound checking. I put some @inbounds in the code this weekend, they are intended for the benchmark, but they are not acceptable (especially for the getindex methods that take a Cursor as an argument).
  • I should raise proper errors when the type in the tape is not compatible with the one requested by the user. Also more importantly, I have not coded the type check for Arrays yet! But that's a complicated one. Typically the eltype in the tape when parsing "[1.0,1.5]" will be Union{Int64,Float64}, where the user would most certainly expect a JSONArray{<:AbstractFloat}. So there's a bit of logic to code here to handle those kind of cases.

To go even further:

  • Parsing the numbers in the tape could be optionnal. If we encoded them in the tape as we do for strings, then we could parse them 'on demand' with this interface. This would have several advantages:
    • faster parsing (when not all fields are consumed),
    • the user could specify if the type is an AbstractFloat or an Integer, eliminating the need to branch on the result of 'modf',
    • it would solve the difficulty stated above when type checking the arrays.
  • Provide a way to deactivate the type checks (for perfomance).

@quinnj
Copy link
Owner

quinnj commented Aug 30, 2022

Hey @poncito! Thanks for opening a PR. Some really interesting ideas here. I've also been thinking here and there about how we can support more lazy/non-allocating queries/workflows. I'm playing with some ideas here. In truth, I'd like to move away from the tape all together, since that itself requires a big upfront allocation and to be held onto via lazy objects. Ideally, I'm hopeful we can figure out a purely functional solution where we/users could make custom JSONBase.AbstractContext to overload different parsing events (objectEnter, objectExit, etc). The idea is that we could then implement lazy querying via a custom context, in addition to regular parsing to a Dict, or even all the StructTypes integration we have in JSON3.jl.

Anyway, I'm not sure if I'll be able push forward on it much in the near term (depends on a few other projects going on), but I'd be happy to discuss some ideas more.

@poncito
Copy link
Author

poncito commented Aug 30, 2022

Hey @quinnj , thanks for your answer. I can dedicate a few hours a week of my free time to develop this kind of stuff. So if you have time to coach me on that, I could propose some code.

My understanding of JSONBase.jl and your explanation is the following:

  • we want to extract the data from the serialized text in different ways. It could be a dict, a struct (with the StructTypes integration), or also a "tape", etc.
  • we you would like a solution that basically separates the "parsing", from the "storage/organisation" of the content. This would remove duplicated code from 'structs.jl' and 'read.jl' for instance. The dict/struct/tape would correspond to different implementations of 'AbstractContext'. Typically, the 'objectValueExit' would either push some stuff on the tape, or insert a pair in a dictionnary.

Am I understanding correctly? I'm really not sure because of your comment on lazy objects. You could also mean that you would like to make the parsing lazy? I'm not sure what it would mean and where the performance would come.

My guess on how to efficiently parse and read a json would be the following dichotomy (and their corresponding 'AbstractContext'?),

  • ordered json:
    provide some kind of schema to the parsing to trigger a compilation of a specialized code for this schema.
    • if the json represents a small serialized stack allocated structure
      • build it directly
    • otherwise
      • use a simplified tape (no need to keep the keys, the types, ...)
  • non ordered json:
    • JSON3's tape may be the most efficient solution type with the kind of reading I propose on this PR.
  • large (enough) non ordered json:
    • dict

From this dichotomy, it appears that not only the "storage" should be abstract, but also the "parsing" (that does not appear in your JSONBase.jl). It is required to exploit the schema. But it can also be used to relax some checks. For instance, if I want to take the risk to suppose that the serialized json is well formed, I could decide to replace

pos + 3 <= len &&
        b            == UInt8('t') &&
        buf[pos + 1] == UInt8('r') &&
        buf[pos + 2] == UInt8('u') &&
        buf[pos + 3] == UInt8('e')

by

b == UInt8('t')

I could also not read the keys, and directly jump ahead from the size of the expected key (and the next delimiter), ...

I can find some free time if you want to chat: [email protected]

@Robert-j7
Copy link

Any updates on this?

I really would like to help since I really need it, the problem is that Im inexperienced, so I might cause some troubles :D

@quinnj
Copy link
Owner

quinnj commented Oct 25, 2022

I've been slowly chipping away at some ideas at https://github.com/quinnj/JSONBase.jl. I've gotten pretty far, but haven't been able to do really thorough benchmarking yet to make sure everything is squared away. There's also some polish, package admin, and testing to do, but I think the fundamentals are in pretty good shape. Happy to chat in that repo with anyone who's interested to collaborate and try things out with me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants