Add require_typed_event_stream to compute contexts #16706

smackesey · 2023-09-21T22:16:12Z

Summary & Motivation

Per conversation with @schrockn and @rexledesma, add a require_typed_event_stream switch on OpExecutionContext. This is off by default and has to be explicitly turned on, which we do upstack in ext_protocol. The switch has two effects:

- Skips the special validation done on values returned from ops, instead wrapping the returned result without alteration in a generator and treating it as if it had been yielded.
- Causes an error to be thrown when an expected result is missing.

The implementation here is greasy as hell and should be considered temporary. A better solution will take time since the code paths governing op results are complex.
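
For readers skimming the thread, the two effects of the switch can be illustrated with a toy model that does not import Dagster at all. Everything here (`TypedOutput`, `MissingOutputError`, `as_event_stream`, `run_compute`) is a hypothetical stand-in, and treating a returned tuple as several events is an assumption; this is a sketch of the described behavior, not the code in this PR.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Iterator, List, Sequence


@dataclass
class TypedOutput:
    """Hypothetical stand-in for a typed result event such as an Output."""

    name: str
    value: Any


class MissingOutputError(Exception):
    """Hypothetical error; the real error type is not shown in this thread."""


def as_event_stream(result: Any) -> Iterator[Any]:
    """Effect 1: pass a returned value through unaltered, as if it were yielded."""
    if result is None:
        return  # nothing returned means nothing emitted
    if inspect.isgenerator(result):
        yield from result  # the function already yielded its events
    elif isinstance(result, tuple):
        yield from result  # assumption: a tuple carries several events
    else:
        yield result


def run_compute(
    fn: Callable[[], Any],
    expected: Sequence[str],
    require_typed_event_stream: bool,
) -> List[Any]:
    events = list(as_event_stream(fn()))
    if require_typed_event_stream:
        emitted = {e.name for e in events if isinstance(e, TypedOutput)}
        missing = [name for name in expected if name not in emitted]
        if missing:
            # Effect 2: a missing expected result is an error, not a warning.
            raise MissingOutputError(f"missing expected outputs: {missing}")
    return events


def returns_both():
    return TypedOutput("a", 1), TypedOutput("b", 2)


def returns_nothing():
    return None
```

With the switch on, `run_compute(returns_nothing, ["a", "b"], True)` raises in this toy model, while `run_compute(returns_both, ["a", "b"], True)` returns both events in order. The non-strict path of the real code (which warns instead of raising) is not modeled.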

How I Tested These Changes

New unit tests.

smackesey · 2023-09-21T22:16:24Z

Current dependencies on/for this PR:

master
- PR Add require_typed_event_stream to compute contexts #16706 👈
  - PR [ext] Use MaterializeResult for ext_protocol #16624
    - PR [ext] asset check support #16466
      - PR [pipes] databricks unstructured log forwarding #16674

This comment was auto-generated by Graphite.

schrockn

comments inline

python_modules/dagster/dagster/_core/execution/plan/compute_generator.py

python_modules/dagster/dagster/_core/execution/context/system.py

smackesey · 2023-09-22T11:25:26Z

python_modules/dagster/dagster/_core/execution/plan/compute.py

+        else:
+            output_name = None
+        if output_name:
+            emitted_result_names.add(output_name)


Would be very nice to have a universal function you can call to get output name from result object somewhere, not sure how to cover all cases or where to put it at this time.
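
One possible shape for such a universal function is a `functools.singledispatch` dispatcher keyed on the result type. The sketch below uses made-up stand-ins (`FakeOutput`, `FakeMaterializeResult`, `FakeAssetCheckResult`, `FakeAssetLayer`) rather than the real Dagster classes, so it only shows the dispatch pattern, not where the function should live or whether it covers all cases.

```python
from dataclasses import dataclass, field
from functools import singledispatch
from typing import Dict, Optional, Tuple


@dataclass
class FakeOutput:
    output_name: str


@dataclass
class FakeMaterializeResult:
    asset_key: Optional[str] = None


@dataclass
class FakeAssetCheckResult:
    asset_key: str
    check_name: str


@dataclass
class FakeAssetLayer:
    """Hypothetical lookup tables standing in for the job's asset layer."""

    output_by_asset: Dict[str, str] = field(default_factory=dict)
    output_by_check: Dict[Tuple[str, str], str] = field(default_factory=dict)
    default_asset_key: Optional[str] = None


@singledispatch
def output_name_for(result: object, layer: FakeAssetLayer) -> Optional[str]:
    # Unknown result types have no associated output name.
    return None


@output_name_for.register(FakeOutput)
def _from_output(result, layer):
    return result.output_name


@output_name_for.register(FakeMaterializeResult)
def _from_materialize_result(result, layer):
    # Fall back to the node's default asset key when none is given.
    asset_key = result.asset_key or layer.default_asset_key
    return layer.output_by_asset.get(asset_key)


@output_name_for.register(FakeAssetCheckResult)
def _from_check_result(result, layer):
    return layer.output_by_check.get((result.asset_key, result.check_name))
```

Adding a new result type then means registering one more handler instead of extending an `isinstance` chain.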

schrockn

I think this is good, but I absolutely want @alangenfeld's eyes on this and his approval.

alangenfeld · 2023-09-22T15:03:38Z

python_modules/dagster/dagster_tests/execution_tests/test_require_typed_event_stream.py

+def test_explicit_mode_op():
+    @op(out={"a": Out(int), "b": Out(int)})
+    def explicit_mode_op(context: OpExecutionContext):
+        context.set_require_typed_event_stream(error_message=EXTRA_ERROR_MESSAGE)
+
+    with raises_missing_output_error():
+        wrap_op_in_graph_and_execute(explicit_mode_op)
+
+
+def test_explicit_mode_asset():
+    @asset
+    def explicit_mode_asset(context: OpExecutionContext):
+        context.set_require_typed_event_stream(error_message=EXTRA_ERROR_MESSAGE)
+        pass
+
+    with raises_missing_output_error():
+        materialize([explicit_mode_asset])
+
+
+def test_explicit_mode_multi_asset():
+    @multi_asset(specs=[AssetSpec("foo"), AssetSpec("bar")])
+    def explicit_mode_multi_asset(context: OpExecutionContext):
+        context.set_require_typed_event_stream(error_message=EXTRA_ERROR_MESSAGE)
+        yield Output(None, output_name="foo")
+        pass
+
+    with raises_missing_output_error():
+        materialize([explicit_mode_multi_asset])


I think it's worth having success conditions under test as well as error conditions

good idea, added many tests

alangenfeld · 2023-09-22T15:20:03Z

python_modules/dagster/dagster/_core/execution/context/system.py

+    @property
+    def has_require_typed_event_stream(self) -> bool:
+        return self._require_typed_event_stream


nit: I find has_require odd, maybe requires_ or has_set_require_

changed to requires_typed_event_stream

alangenfeld · 2023-09-22T15:26:02Z

python_modules/dagster/dagster/_core/execution/plan/compute.py

+            output_name = step_output.output_name
+        elif isinstance(step_output, MaterializeResult):
+            asset_key = (
+                step_output.asset_key
+                or step_context.job_def.asset_layer.asset_key_for_node(step_context.node_handle)
+            )
+            output_name = step_context.job_def.asset_layer.node_output_handle_for_asset(
+                asset_key
+            ).output_name
+        elif isinstance(step_output, AssetCheckEvaluation):
+            output_name = step_context.job_def.asset_layer.get_output_name_for_asset_check(
+                step_output.asset_check_handle
+            )
+        elif isinstance(step_output, AssetCheckResult):
+            if step_output.asset_key and step_output.check_name:
+                handle = AssetCheckHandle(step_output.asset_key, step_output.check_name)
+            else:
+                handle = step_output.to_asset_check_evaluation(step_context).asset_check_handle
+            output_name = step_context.job_def.asset_layer.get_output_name_for_asset_check(handle)
+        else:
+            output_name = None
+        if output_name:
+            emitted_result_names.add(output_name)


this is the scariest part of this PR since it's not gated by the new bool, and I am skeptical that the current "print a warning message" path is under much test coverage

alangenfeld · 2023-09-22T15:28:37Z

python_modules/dagster/dagster/_core/execution/plan/compute.py

+        elif isinstance(step_output, AssetCheckResult):
+            if step_output.asset_key and step_output.check_name:
+                handle = AssetCheckHandle(step_output.asset_key, step_output.check_name)
+            else:
+                handle = step_output.to_asset_check_evaluation(step_context).asset_check_handle
+            output_name = step_context.job_def.asset_layer.get_output_name_for_asset_check(handle)


there's custom logic to bypass check handle outputs on 232 to avoid printing the warning message. I think that means asset checks currently would bypass, and so not trigger, this has_require_typed_event_stream check? Needs a test

good catch, they were indeed not being caught, fixed and tests added
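
To make the failure mode concrete, here is a toy reconstruction of how a bypass for check outputs can hide a missing check result. The names (`FakeOutputDef`, `unreported_outputs`, `bypass_checks`) are hypothetical, and the thread does not show how the fix was actually made.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class FakeOutputDef:
    name: str
    is_asset_check: bool = False


def unreported_outputs(
    output_defs: Iterable[FakeOutputDef],
    emitted: Set[str],
    bypass_checks: bool,
) -> List[str]:
    """Return the names of outputs that would be flagged as missing."""
    missing = []
    for output_def in output_defs:
        if output_def.name in emitted:
            continue
        if bypass_checks and output_def.is_asset_check:
            # The bypass: check outputs are skipped before any
            # missing-output handling, so they are never flagged.
            continue
        missing.append(output_def.name)
    return missing


defs = [FakeOutputDef("result"), FakeOutputDef("result_check", is_asset_check=True)]
with_bypass = unreported_outputs(defs, {"result"}, bypass_checks=True)
without_bypass = unreported_outputs(defs, {"result"}, bypass_checks=False)
```

In this sketch `with_bypass` is empty even though the check result was never emitted, while `without_bypass` contains `"result_check"`. One plausible fix, stated purely as an assumption, is to apply the bypass only when the typed event stream is not required.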

alangenfeld · 2023-09-22T15:32:30Z

python_modules/dagster/dagster/_core/execution/plan/compute_generator.py

+    # Skip any return-specific validation and treat it like a generator op
+    elif output_defs and context.has_require_typed_event_stream:


I think it's worth fleshing this comment out a bit, maybe just explaining more about has_require_typed_event_stream

github-actions · 2023-09-22T19:08:07Z

Deploy preview for dagster-docs ready!

Preview available at https://dagster-docs-cakyddxs1-elementl.vercel.app
https://sean-explicit-mode.dagster.dagster-docs.io


approved in comment and alex approved

rexledesma · 2023-09-22T20:56:49Z

@smackesey Can we add a test case around using this API with a multi asset that supports subsetting? Otherwise, I think this API would produce incorrect errors if we integrate it against dagster-dbt.
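
The subsetting concern can be sketched without Dagster: if the missing-output check compares emitted results against every output the multi-asset defines, rather than against the outputs selected for the current run, a subsetted run would be flagged incorrectly. The function and parameter names below are hypothetical and only illustrate the set logic.

```python
from typing import Iterable, List, Optional


def missing_outputs(
    all_outputs: Iterable[str],
    emitted: Iterable[str],
    selected: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return expected outputs that were not emitted.

    When `selected` is given, only the selected subset is expected.
    """
    outputs = list(all_outputs)
    if selected is not None:
        selected_set = set(selected)
        outputs = [name for name in outputs if name in selected_set]
    emitted_set = set(emitted)
    return [name for name in outputs if name not in emitted_set]


# A multi-asset defines "foo" and "bar", but this run selects only "foo".
naive = missing_outputs(["foo", "bar"], emitted=["foo"])
subset_aware = missing_outputs(["foo", "bar"], emitted=["foo"], selected=["foo"])
```

Here `naive` reports `"bar"` as missing even though it was never requested, while `subset_aware` reports nothing.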

smackesey force-pushed the sean/explicit-mode branch from de74f48 to c733061 Compare September 21, 2023 23:18

This was referenced Sep 21, 2023

[ext] Use MaterializeResult for ext_protocol #16624

Merged

[ext] asset check support #16466

Merged

[pipes] databricks unstructured log forwarding #16674

Merged

smackesey marked this pull request as ready for review September 21, 2023 23:23

smackesey requested review from schrockn, alangenfeld and rexledesma September 21, 2023 23:23

smackesey force-pushed the sean/explicit-mode branch from c733061 to eafa0ea Compare September 21, 2023 23:27

schrockn previously requested changes Sep 22, 2023


smackesey force-pushed the sean/explicit-mode branch 3 times, most recently from f5469be to e6198f2 Compare September 22, 2023 11:18

smackesey requested a review from schrockn September 22, 2023 11:23

smackesey commented Sep 22, 2023


smackesey force-pushed the sean/explicit-mode branch 2 times, most recently from 7124a54 to d18dea9 Compare September 22, 2023 12:33

schrockn reviewed Sep 22, 2023


smackesey changed the title ~~Add explicit_mode to compute contexts~~ Add require_typed_event_stream to compute contexts Sep 22, 2023

alangenfeld reviewed Sep 22, 2023


smackesey force-pushed the sean/explicit-mode branch from d18dea9 to e9ff404 Compare September 22, 2023 19:04

smackesey requested a review from alangenfeld September 22, 2023 19:06

Add explicit_mode to compute contexts

71b478f

smackesey force-pushed the sean/explicit-mode branch from e9ff404 to 71b478f Compare September 22, 2023 19:19

alangenfeld approved these changes Sep 22, 2023


smackesey merged commit 547a624 into master Sep 22, 2023
1 check passed

smackesey deleted the sean/explicit-mode branch September 22, 2023 20:09
