Replies: 1 comment
We had some discussions. We formalized potential async op outputs as channels, but in the midst of all this I found out that my initial reasoning was wrong. My thinking was that we would need bespoke streaming ops for some one-shot ops (as in …). But there is another way! We could design streaming as …. This means we can propagate the streaming calls entirely to clients. They decide what (and whether) to stream.

I think we can completely ditch op progress, too. It's too vague as it is and not applicable in most cases. We should rely on lower-level APIs which can be wrapped in higher-level ones independently. Progress can be implemented in terms of the stream and the domain-specific knowledge it contains (say, token 1 of 10, or intermediate image 3/5). By that I mean that defining …
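The "progress in terms of the stream" point above can be sketched as follows. This is a minimal illustration, not the actual API: the names `TokenChunk` and `progressOf` are hypothetical, assuming stream items that carry domain-specific position info such as "token 1 of 10".

```cpp
#include <cassert>

// Hypothetical stream item for a token-generating op: the item itself
// carries its position, so no separate progress channel is needed.
struct TokenChunk {
    int index; // 0-based token index
    int total; // total expected tokens ("token 1 of 10")
};

// Progress is derived from the domain-specific stream data,
// not reported through a dedicated op-progress mechanism.
float progressOf(const TokenChunk& c) {
    return float(c.index + 1) / float(c.total);
}
```

The same pattern would work for an image-generating op ("intermediate image 3/5"): progress becomes a client-side interpretation of stream items rather than a separate server-side concept.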
---
Currently op progress is provided via a callback¹. Streaming is not facilitated in any way; it is currently envisioned as a "pull": calling multiple ops.
Neither of these plays well with servers or with the current progress in Acord, especially the streaming.
Streaming
While "pull"-type streaming is OK for an edge app, it's a bad idea for a server. If we propagate the pull to clients, every streamed item adds a full round trip to its latency. If we hide it from clients, we would have to create an entirely different client-server API². In the context of schemas and interfaces, this is prohibitively expensive.
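To make the latency argument concrete, here is a minimal sketch of "pull"-type streaming; `PullStream` and `pullNext` are hypothetical names, not part of the actual API. The client drives the stream by calling an op once per item, so behind a client-server boundary every call is a full round trip.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical "pull"-type streaming: the client invokes an op once per
// item. Locally this is cheap, but on a server deployment each pullNext()
// crosses the network, so per-item latency includes a full round trip.
struct PullStream {
    std::vector<std::string> items;
    size_t next = 0;
    int calls = 0; // op invocations (~ round trips on a server)

    std::optional<std::string> pullNext() {
        ++calls;
        if (next >= items.size()) return std::nullopt;
        return items[next++];
    }
};
```

Draining a stream of N items costs N+1 calls (the last call is how the client learns the stream ended), each one a ping on a server deployment.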
Streaming is easily designed in an asynchronous API. This was the initial version of our Inference API, but we discarded it because it would mean hiding the parallelism from the implementers, and they may come with vastly different needs for it. For now we're keeping the synchronous API as a hard requirement.
How do we deal with streaming then?
(more on this below)
Progress
Op progress is more or less a stream of floating-point values. Whatever decisions we make for streaming op results would likely be applicable to progress as well. Let's keep that in mind when we discuss result streaming. Having a unified solution would be best.
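One way a unified solution could look is a single channel type whose items are either a progress value or a partial result. This is a sketch under that assumption; `StreamItem` and `Channel` are hypothetical names, not the formalized channels mentioned above.

```cpp
#include <deque>
#include <optional>
#include <string>
#include <variant>

// Hypothetical unified stream channel: a progress value is just another
// item type, so one streaming mechanism serves both results and progress.
using StreamItem = std::variant<float, std::string>; // progress | partial result

struct Channel {
    std::deque<StreamItem> queue;

    void push(StreamItem item) { queue.push_back(std::move(item)); }

    // Returns the next item, or nullopt when the channel is drained.
    std::optional<StreamItem> pop() {
        if (queue.empty()) return std::nullopt;
        StreamItem item = std::move(queue.front());
        queue.pop_front();
        return item;
    }
};
```

With this shape, an op that only reports progress and an op that streams partial results use the same plumbing, which is the unification argued for above.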
Ideas
... or rather, notes that are not yet complete ideas
getOpStreamResult
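One possible reading of the `getOpStreamResult` note: the op buffers partial results as it runs, and the caller drains them one at a time until an empty result signals the end. Everything below is a guess at the shape, assuming a synchronous op, not the actual API.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical shape for getOpStreamResult: the op accumulates partial
// results internally; the caller drains them until nullopt marks the end.
class Op {
public:
    void run() {
        // A real op would produce these incrementally (e.g. tokens).
        results_ = {"par", "tial", "done"};
    }

    std::optional<std::string> getOpStreamResult() {
        if (cursor_ >= results_.size()) return std::nullopt;
        return results_[cursor_++];
    }

private:
    std::vector<std::string> results_;
    size_t cursor_ = 0;
};
```

Note that this reading is still "pull"-shaped, so the server latency concern from the Streaming section would apply to it as well.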
Footnotes

¹ ...and our word that it's in the same call stack, but this doesn't really matter. We can remove the requirement with practically no repercussions as long as the calls are not concurrent and the call is synchronous.

² Entirely different from the current Inference API.