
Streaming Model API #13

Draft

hgarrereyn wants to merge 19 commits into master
Conversation

hgarrereyn

Streaming Model API

This PR introduces a streaming model API for RunInference.

Overview

In addition to the current API:

predictions = examples | RunInference(inference_spec_type)

It is now possible to pass a PCollection of InferenceSpecType as a side input, or to run inference on (spec, example) queries directly:

# Model specs provided as a side input:
models: PCollection[InferenceSpecType]
predictions = examples | RunInference(models)

# Model spec paired with each query:
queries: PCollection[Tuple[InferenceSpecType, Example]]
predictions = queries | RunInference()
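
For reference, a minimal end-to-end sketch of the side-input mode is shown below. The module paths, the model_spec_pb2 message fields, and the placeholder inputs are assumptions based on the existing tfx_bsl RunInference API, not part of this PR:

import apache_beam as beam
import tensorflow as tf
from tfx_bsl.public.beam.run_inference import RunInference
from tfx_bsl.public.proto import model_spec_pb2

# Hypothetical inputs for illustration.
example_protos = [tf.train.Example()]
spec = model_spec_pb2.InferenceSpecType(
    saved_model_spec=model_spec_pb2.SavedModelSpec(
        model_path='/path/to/saved_model'))

with beam.Pipeline() as p:
  # Model specs as a (streaming) side input; here a single static spec.
  models = p | 'ModelSpecs' >> beam.Create([spec])
  examples = p | 'Examples' >> beam.Create(example_protos)
  # Side-input mode introduced by this PR.
  predictions = examples | RunInference(models)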

Notes

This PR depends on #11 and #12.

This PR introduces beam.GroupIntoBatches and _TemporalJoin, which both require stateful DoFn support. Stateful DoFns are not supported in Dataflow v1, and RunInference is currently broken in Dataflow v2 due to BEAM-2717.
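
For context, beam.GroupIntoBatches is the standard Beam transform used for batching keyed elements, and it is what pulls in the stateful DoFn requirement. A standalone example (with made-up keys and batch size, unrelated to this PR's internals):

import apache_beam as beam

with beam.Pipeline() as p:
  batches = (
      p
      | beam.Create([('spec_a', 1), ('spec_a', 2), ('spec_b', 3)])
      # Buffers values per key in DoFn state until batch_size elements
      # arrive (or the window ends), then emits the batch.
      | beam.GroupIntoBatches(batch_size=2)
      | beam.Map(print))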

@hgarrereyn
Author

@rose-rong-liu @SherylLuo

hgarrereyn added 10 commits August 6, 2020 21:50
Benchmarks showed that TagByOperation was a performance bottleneck, as it
requires disk access per query batch. To mitigate this, I implemented
operation caching inside the DoFn. For readability, I also renamed this
operation to "SplitByOperation", as that more accurately describes its
purpose.

On a dataset with 1M examples, TagByOperation took ~25% of the total wall
time. After implementing caching, this was reduced to ~2%.
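
The caching pattern described above can be illustrated with a rough sketch; the class name, the element shape, and the _load_operation helper are hypothetical stand-ins rather than the PR's actual code:

import apache_beam as beam

class _SplitByOperationDoFn(beam.DoFn):
  """Illustrative sketch of caching the loaded operation per DoFn instance."""

  def __init__(self):
    # Shared across all batches processed by this DoFn instance, so the
    # disk read happens once per spec rather than once per query batch.
    self._operation_cache = {}

  def process(self, element):
    spec, batch = element  # hypothetical element shape
    key = spec.SerializeToString()
    operation = self._operation_cache.get(key)
    if operation is None:
      operation = _load_operation(spec)  # hypothetical disk-backed lookup
      self._operation_cache[key] = operation
    yield operation, batch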