Introduce DecodeStreamer for Client.Do #48
base: master
This enables a client to stream values out of the HTTP response. It does so by providing a result that implements DecodeStreamer with something like the following. Error handling has been omitted for clarity.

```go
// DecodeStream decodes and processes each record of a JSON array.
func (streamingDecoder) DecodeStream(r io.Reader) error {
	dec := json.NewDecoder(r)
	_, _ = dec.Token() // want: json.Delim('[')
	for dec.More() {
		var record json.RawMessage // placeholder type; substitute the row type for the query
		_ = dec.Decode(&record)
		// do something with record
	}
	_, _ = dec.Token() // want: json.Delim(']')
	_, _ = dec.Token() // want: err == io.EOF
	return nil
}
```
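For readers who want the error handling that the description leaves out, here is a minimal sketch of the same token-by-token pattern with every error checked. It is only an illustration: `row`, `rowStreamer`, `expectDelim`, and `countRows` are hypothetical names that do not come from the PR, and the row shape is invented, since real Druid results depend on the query.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// row is a hypothetical record shape; real Druid rows depend on the query.
type row struct {
	Page  string `json:"page"`
	Count int    `json:"count"`
}

// rowStreamer is a hypothetical result type. It passes each decoded row to
// handle as soon as it is parsed, before the rest of the body is read.
type rowStreamer struct {
	handle func(row) error
}

// expectDelim reads one token and checks that it is the wanted delimiter.
func expectDelim(dec *json.Decoder, want json.Delim) error {
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != want {
		return fmt.Errorf("expected %v, got %v", want, tok)
	}
	return nil
}

// DecodeStream decodes a JSON array one element at a time.
func (s rowStreamer) DecodeStream(r io.Reader) error {
	dec := json.NewDecoder(r)
	if err := expectDelim(dec, '['); err != nil {
		return err
	}
	for dec.More() {
		var rec row
		if err := dec.Decode(&rec); err != nil {
			return err
		}
		if err := s.handle(rec); err != nil {
			return err // a handler error ends the stream early
		}
	}
	if err := expectDelim(dec, ']'); err != nil {
		return err
	}
	// After the closing bracket the next read should report io.EOF.
	switch _, err := dec.Token(); {
	case err == io.EOF:
		return nil
	case err != nil:
		return err
	default:
		return errors.New("unexpected data after closing ']'")
	}
}

// countRows streams body and reports how many rows were handled.
func countRows(body string) (int, error) {
	n := 0
	s := rowStreamer{handle: func(row) error { n++; return nil }}
	err := s.DecodeStream(strings.NewReader(body))
	return n, err
}

func main() {
	n, err := countRows(`[{"page":"a","count":1},{"page":"b","count":2}]`)
	fmt.Println(n, err)
}
```

A truncated body such as `[{"page":"a"}` surfaces as an error from the closing-bracket check rather than being silently accepted, which is the main thing the abbreviated version hides.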
Hello @MichaelUrman Line 144 in aef5369
Should:
become something like:
then
Hi @jbguerraz, thanks for reviewing! Yes, that

For your alternate d.Query.Stream entry point, are you proposing the

As tradeoffs, the approach currently in the PR enables the caller to use any JSON parser, and makes it easy to exit early, but by the same token it requires understanding the format of Druid's response. I think your proposed approach requires some use of encoding/json (to declare the callback), but could partially abstract Druid's response formats. A third approach might mix these: a new API calls a callback/interface with an

Let me know if you need me to prototype another approach. I'm fairly inexpert in Druid, so can't yet speak towards its other response formats, and whether they're compatible with a
Hi @jbguerraz and any other members, This is just a ping to see if there's anything I can do to further facilitate landing some sort of data streaming in go-druid. I'm willing to prototype other approaches, or to refine this one, if it will help suit your vision.
Hello @MichaelUrman
Thanks! While I strongly expect to need streaming, I'm not blocked on it. So I'd rather you have a chance to get it right, and not have to live with the remnants of the "wrong" implementation. (If I need it before it's ready here, there's always go mod replace. 😄)
Hello @MichaelUrman
Apologies for the length of these thoughts. I'm not sure how to condense them in the time I have available. I hope they're at least clear. As always, if you'd like me to prototype some of this, let me know.

I believe we're considering some implicit questions here, and would like to try to make them explicit:
Regarding 1: The PR proposes an implicit approach by testing for an interface, and you've proposed an explicit separate call. I'm assuming some of the worry about the implicit approach is that a caller might already "accidentally" implement the interface, breaking compatibility, or that a typo in implementing the interface could result in a hard-to-diagnose failure to use it. One might address these by introducing a new go-druid type to wrap the interface, and checking specifically for that type.

Regarding 2: This PR gave all responsibility to the caller; this is powerful and liberating, but potentially repeats a lot of complexity. You've proposed having go-druid implement the streaming, which avoids this. However, having go-druid implement this requires choosing an API for it.

So regarding 3: An API for streaming leaves us with several suboptimal options, due to limitations in Go's syntax.
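The wrapper idea under "Regarding 1" could look roughly like the sketch below. Everything here except the DecodeStreamer interface name is an assumption made for illustration: `Streamed`, `decodeResult`, `decodeString`, and `recorder` are hypothetical, and `decodeResult` only stands in for whatever step of the client decodes the response body.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// DecodeStreamer mirrors the interface name proposed in this PR.
type DecodeStreamer interface {
	DecodeStream(r io.Reader) error
}

// Streamed is a hypothetical opt-in wrapper. Streaming happens only when the
// caller passes this exact type, not whenever a result happens to have a
// DecodeStream method.
type Streamed struct {
	Decoder DecodeStreamer
}

// decodeResult is a hypothetical stand-in for the client's decoding step.
func decodeResult(body io.Reader, result interface{}) error {
	if s, ok := result.(Streamed); ok {
		return s.Decoder.DecodeStream(body)
	}
	return json.NewDecoder(body).Decode(result)
}

// recorder is a plain JSON target that also happens to implement
// DecodeStream; streamed records which path was taken.
type recorder struct {
	Values   []int
	streamed bool
}

func (rec *recorder) DecodeStream(r io.Reader) error {
	rec.streamed = true
	return json.NewDecoder(r).Decode(rec)
}

// decodeString is a small helper so the example can be driven from a string.
func decodeString(body string, result interface{}) error {
	return decodeResult(strings.NewReader(body), result)
}

func main() {
	body := `{"Values":[1,2]}`

	plain := &recorder{}
	fmt.Println(decodeString(body, plain), plain.streamed)

	wrapped := &recorder{}
	fmt.Println(decodeString(body, Streamed{Decoder: wrapped}), wrapped.streamed)
}
```

With this shape, a type that implements the method by accident keeps its existing behavior, and a misspelled method name fails at compile time when the value is placed in the wrapper, instead of being silently ignored at run time.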
I'm skeptical of exposing channels at all, as they are easy to misuse, for instance by not consuming all rows from them. I'm strongly skeptical of having the API accept a channel, as the ownership is backwards: typically the producer creates, sends on, and closes the channel; but here the consumer would create it and the producer would send and close.

With that framework in mind here's what I think about your proposed

By contrast, I don't love how this PR's approach is so implicit and requires both heavy lifting and knowledge of the underlying response format, but I do like its ownership semantics and ability to use other JSON parsers. (Though it seems go-druid already requires knowledge of the response format when creating a struct for result.)

(Note that the two proposals can be combined. An explicit go-druid interface-wrapping type can be the building block upon which your proposed additional API is built. Then if the interface-wrapping type is exported, it can be used by those with unusual needs.)
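To make the ownership concern concrete, here is a minimal sketch of a callback-style alternative to a channel, where the producer keeps control of the read loop and the consumer can stop early without leaving anything unconsumed. It is not an API from go-druid or from either proposal: `eachElement`, `errStop`, and `firstN` are hypothetical names chosen for the example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// errStop is a hypothetical sentinel a callback returns to end iteration early.
var errStop = errors.New("stop iteration")

// eachElement calls fn once per element of the JSON array read from r.
// The producer owns the loop, so there is no channel to create, drain, or close.
func eachElement(r io.Reader, fn func(json.RawMessage) error) error {
	dec := json.NewDecoder(r)
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if tok != json.Delim('[') {
		return fmt.Errorf("expected '[', got %v", tok)
	}
	for dec.More() {
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil {
			return err
		}
		if err := fn(raw); err != nil {
			if errors.Is(err, errStop) {
				return nil // early exit requested by the consumer
			}
			return err
		}
	}
	_, err = dec.Token() // consume the closing ']'
	return err
}

// firstN collects at most n elements and then stops reading.
func firstN(body string, n int) ([]string, error) {
	if n <= 0 {
		return nil, nil
	}
	var out []string
	err := eachElement(strings.NewReader(body), func(raw json.RawMessage) error {
		out = append(out, string(raw))
		if len(out) >= n {
			return errStop
		}
		return nil
	})
	return out, err
}

func main() {
	rows, err := firstN(`[{"a":1},{"a":2},{"a":3}]`, 2)
	fmt.Println(rows, err)
}
```

Handing each element over as json.RawMessage keeps the choice of parser for the row itself with the caller, at the cost of still tying the outer array handling to encoding/json.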
This enables a client to stream values out of the HTTP response. It does
so by providing a result that implements DecodeStreamer with something
like the following. Error handling has been omitted for clarity.
I'm open to name changes, or adding tests or docs if you point me to where they'd go. I have tested this with an implementation that looks like the above, and it allows me to receive records before the body is closed.