
Add RunInference.with_errors() API #14

Draft · wants to merge 21 commits into master
Conversation

hgarrereyn
with_errors() API

This PR introduces the RunInference(...).with_errors() API, which allows users to catch runtime errors as a separate PCollection stream.

By default, runtime errors (for example, invalid model specs or invalid examples) are raised, which can crash the pipeline. If this is not desirable, users can call .with_errors() to catch runtime errors:

inference_result = examples | RunInference(inference_spec).with_errors()
inference_result['errors'] | LogToFile(...)
inference_result['predictions'] | ...

The error output stream has type Tuple[Exception, Any]: each element pairs the original error with whatever object is relevant to that error.

Note: when runtime errors are allowed to be raised, they are raised from their original location (e.g. inside a nested PTransform) which makes debugging easier.
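As a sketch of how a consumer might work with the error stream: each element is an (Exception, Any) tuple, so a downstream step could format it for logging. The helper name below is hypothetical, not part of this PR:

```python
from typing import Any, Tuple

def format_error(err: Tuple[Exception, Any]) -> str:
    """Render an (exception, offending object) pair as a log line.

    Hypothetical formatter for elements of the 'errors' PCollection.
    """
    exc, obj = err
    return "%s: %r (input: %r)" % (type(exc).__name__, str(exc), obj)
```

A DoFn (or a simple beam.Map(format_error)) applied to inference_result['errors'] could then write these lines to a file or metrics sink.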

Internal details

  • RunInferenceImpl is now a class (rather than a function with @beam.ptransform_fn). This enables us to add a with_errors method that can take effect in the expand method.

  • RunInferenceCore returns a dict containing {'predictions': ..., 'errors': ...} and takes an additional catch_errors: bool = False parameter which indicates whether to catch or allow runtime errors.

  • Added the _ParDoExceptionWrapper utility which runs beam.ParDo on a provided beam.DoFn and optionally catches exceptions raised in the process() method.

  • Operation wrapper transforms (e.g. _Classify, _Regress, ...) accept an additional catch_errors: bool = False parameter and return a dict containing {'predictions': ..., 'errors': ...}.
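The core pattern behind _ParDoExceptionWrapper can be illustrated in plain Python (no Beam runtime): run the processing function per element, and either re-raise or divert failures to an errors collection depending on catch_errors. This is a simplified sketch with hypothetical names, not the PR's actual implementation:

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

def run_with_caught_errors(
    process: Callable[[Any], Iterable[Any]],
    elements: Iterable[Any],
    catch_errors: bool = False,
) -> Dict[str, List[Any]]:
    """Plain-Python sketch of the exception-catching wrapper pattern.

    In the real transform, beam.ParDo runs the DoFn and exceptions from
    process() are routed to a tagged 'errors' output; here we emulate
    the two output streams with two lists.
    """
    predictions: List[Any] = []
    errors: List[Tuple[Exception, Any]] = []
    for element in elements:
        try:
            predictions.extend(process(element))
        except Exception as e:  # pylint: disable=broad-except
            if not catch_errors:
                raise  # default behavior: let the error propagate
            errors.append((e, element))
    return {'predictions': predictions, 'errors': errors}
```

With catch_errors=False the first failure propagates from its original location, matching the default behavior described above; with catch_errors=True failing inputs end up in the 'errors' list instead.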

Dependencies

This PR depends on #13

@hgarrereyn
Author

@rose-rong-liu @SherylLuo

      return beam.pvalue.TaggedOutput(_get_operation_type(batch[0]), batch)
    else:
      try:
        return beam.pvalue.TaggedOutput(_get_operation_type(batch[0]), batch)


Will this throw an error? It seems _get_operation_type will return str or unicode.

Author

_get_operation_type has return value Text (str/unicode) and beam.pvalue.TaggedOutput accepts either str or unicode for a tag. See: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pvalue.py#L328-L343

hgarrereyn added 13 commits August 6, 2020 01:01
Benchmarks showed that TagByOperation was a performance bottleneck, as it
requires disk access per query batch. To mitigate this I implemented
operation caching inside the DoFn. For readability, I also renamed this
operation to "SplitByOperation", as that more accurately describes its
purpose.

On a dataset with 1M examples, TagByOperation took ~25% of the total wall
time. After implementing caching, this was reduced to ~2%.
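The caching idea in the commit above can be sketched in plain Python: resolve the operation type once per inference spec via a (disk-backed) lookup and memoize the result, so repeated batches for the same spec skip the expensive load. Names below are hypothetical, not the PR's actual classes:

```python
from typing import Callable, Dict

class OperationSplitter:
    """Sketch of per-spec operation caching inside a DoFn.

    The expensive lookup (loading a saved model's signature from disk)
    runs once per distinct spec; subsequent batches hit the cache.
    """

    def __init__(self, load_operation_type: Callable[[str], str]):
        self._load = load_operation_type  # expensive, disk-backed lookup
        self._cache: Dict[str, str] = {}

    def operation_for(self, spec: str) -> str:
        if spec not in self._cache:
            self._cache[spec] = self._load(spec)
        return self._cache[spec]
```

This mirrors why the wall-time share dropped from ~25% to ~2%: the disk access now happens once per spec rather than once per query batch.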