Add require_typed_event_stream to compute contexts #16706

Merged
smackesey merged 1 commit into master from sean/explicit-mode
Sep 22, 2023

@smackesey smackesey commented Sep 21, 2023

Summary & Motivation

Per conversation with @schrockn and @rexledesma, add a require_typed_event_stream switch on OpExecutionContext. This is off by default and has to be explicitly turned on, which we do upstack in ext_protocol. The switch has two effects:

  • Skips the special validation done on values returned from ops, instead wrapping the returned result without alteration in a generator and treating it as if it had been yielded.
  • Causes an error to be thrown when an expected result is missing.

The implementation here is greasy as hell and should be considered temporary. A better solution will take time since the code paths governing op results are complex.
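
As a rough illustration of the two effects listed above, here is a minimal toy model in plain Python. It is not the code in this PR: `ToyOutput`, `MissingOutputError`, `to_event_stream`, and `collect_outputs` are hypothetical names, and the return-value validation that the switch skips is not modeled at all.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator, Set


@dataclass
class ToyOutput:
    """Stand-in for a typed result event (hypothetical, not Dagster's Output)."""

    output_name: str
    value: Any = None


class MissingOutputError(Exception):
    """Hypothetical error raised when an expected output is never emitted."""


def to_event_stream(result: Any) -> Iterator[Any]:
    """Treat a returned value as if it had been yielded, without altering it."""
    if inspect.isgenerator(result):
        yield from result
    elif result is not None:
        yield result


def collect_outputs(
    fn: Callable[[], Any],
    expected: Iterable[str],
    require_typed_event_stream: bool,
) -> Set[str]:
    """Run fn, record which outputs were emitted, and enforce the switch."""
    emitted: Set[str] = set()
    for event in to_event_stream(fn()):
        if isinstance(event, ToyOutput):
            emitted.add(event.output_name)
    missing = sorted(set(expected) - emitted)
    if missing and require_typed_event_stream:
        # Effect 2: a missing expected result is an error, not a silent default.
        raise MissingOutputError(f"missing expected outputs: {missing}")
    # With the switch off, this toy simply returns what was emitted.
    return emitted
```

In this sketch a function that returns `ToyOutput("a")` and one that yields it are handled identically, which is the "as if yielded" behavior described in the first bullet.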

How I Tested These Changes

New unit tests.


@schrockn schrockn left a comment

comments inline

@smackesey smackesey force-pushed the sean/explicit-mode branch 3 times, most recently from f5469be to e6198f2 Compare September 22, 2023 11:18
@smackesey smackesey requested a review from schrockn September 22, 2023 11:23
else:
    output_name = None
if output_name:
    emitted_result_names.add(output_name)

Collaborator Author

Would be very nice to have a universal function you can call to get output name from result object somewhere, not sure how to cover all cases or where to put it at this time.
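
One possible shape for such a helper is a single dispatch point keyed on the result type. The sketch below is only an illustration under stated assumptions: `ToyOutput`, `ToyMaterializeResult`, `OUTPUT_NAME_BY_ASSET_KEY`, and `output_name_for` are hypothetical stand-ins, and a real version would presumably also need the step context to resolve asset keys and check handles.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Optional


@dataclass
class ToyOutput:
    """Stand-in for a result that already carries its output name."""

    output_name: str


@dataclass
class ToyMaterializeResult:
    """Stand-in for a result identified by asset key instead of output name."""

    asset_key: str


# Hypothetical lookup table standing in for the job's asset layer.
OUTPUT_NAME_BY_ASSET_KEY = {"foo": "result_foo", "bar": "result_bar"}


@singledispatch
def output_name_for(result: object) -> Optional[str]:
    """Fallback: result types we do not recognize have no output name."""
    return None


@output_name_for.register
def _(result: ToyOutput) -> Optional[str]:
    return result.output_name


@output_name_for.register
def _(result: ToyMaterializeResult) -> Optional[str]:
    # Resolve the output name indirectly through the asset key.
    return OUTPUT_NAME_BY_ASSET_KEY.get(result.asset_key)
```

Adding a new result type then means registering one more function rather than extending an `if`/`elif` chain at each call site.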

@smackesey smackesey force-pushed the sean/explicit-mode branch 2 times, most recently from 7124a54 to d18dea9 Compare September 22, 2023 12:33

@schrockn schrockn left a comment

I think this is good, but I absolutely want @alangenfeld's eyes on this and his approval.

@smackesey smackesey changed the title Add explicit_mode to compute contexts Add require_typed_event_stream to compute contexts Sep 22, 2023
Comment on lines 24 to 51
def test_explicit_mode_op():
    @op(out={"a": Out(int), "b": Out(int)})
    def explicit_mode_op(context: OpExecutionContext):
        context.set_require_typed_event_stream(error_message=EXTRA_ERROR_MESSAGE)

    with raises_missing_output_error():
        wrap_op_in_graph_and_execute(explicit_mode_op)


def test_explicit_mode_asset():
    @asset
    def explicit_mode_asset(context: OpExecutionContext):
        context.set_require_typed_event_stream(error_message=EXTRA_ERROR_MESSAGE)
        pass

    with raises_missing_output_error():
        materialize([explicit_mode_asset])


def test_explicit_mode_multi_asset():
    @multi_asset(specs=[AssetSpec("foo"), AssetSpec("bar")])
    def explicit_mode_multi_asset(context: OpExecutionContext):
        context.set_require_typed_event_stream(error_message=EXTRA_ERROR_MESSAGE)
        yield Output(None, output_name="foo")
        pass

    with raises_missing_output_error():
        materialize([explicit_mode_multi_asset])

Member

I think it's worth having success conditions under test as well as error conditions

Collaborator Author

good idea, added many tests

Comment on lines 567 to 569
@property
def has_require_typed_event_stream(self) -> bool:
    return self._require_typed_event_stream

Member

nit: I find has_require odd, maybe requires_ or has_set_require_

Collaborator Author

changed to requires_typed_event_stream

Comment on lines 207 to 234
    output_name = step_output.output_name
elif isinstance(step_output, MaterializeResult):
    asset_key = (
        step_output.asset_key
        or step_context.job_def.asset_layer.asset_key_for_node(step_context.node_handle)
    )
    output_name = step_context.job_def.asset_layer.node_output_handle_for_asset(
        asset_key
    ).output_name
elif isinstance(step_output, AssetCheckEvaluation):
    output_name = step_context.job_def.asset_layer.get_output_name_for_asset_check(
        step_output.asset_check_handle
    )
elif isinstance(step_output, AssetCheckResult):
    if step_output.asset_key and step_output.check_name:
        handle = AssetCheckHandle(step_output.asset_key, step_output.check_name)
    else:
        handle = step_output.to_asset_check_evaluation(step_context).asset_check_handle
    output_name = step_context.job_def.asset_layer.get_output_name_for_asset_check(handle)
else:
    output_name = None
if output_name:
    emitted_result_names.add(output_name)

Member

this is the scariest part of this PR since it's not gated by the new bool, and I am skeptical that the current warning-message path is under much test coverage

Comment on lines +220 to +233
elif isinstance(step_output, AssetCheckResult):
    if step_output.asset_key and step_output.check_name:
        handle = AssetCheckHandle(step_output.asset_key, step_output.check_name)
    else:
        handle = step_output.to_asset_check_evaluation(step_context).asset_check_handle
    output_name = step_context.job_def.asset_layer.get_output_name_for_asset_check(handle)

Member

there's custom logic to bypass check handle outputs on 232 to avoid printing the warning message. I think that means asset checks currently would not trigger this has_require_typed_event_stream check? Needs a test

Collaborator Author

@smackesey smackesey Sep 22, 2023

good catch, they were indeed not being caught, fixed and tests added
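
To make the failure mode concrete, here is a toy illustration of how an early skip for check outputs can also hide them from a strict missing-output check. It is not the code that was fixed: `ToyOutputDef` and `missing_outputs` are hypothetical names, and the `skip_asset_checks` flag is only an assumption about how such a bypass might look.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class ToyOutputDef:
    """Stand-in for an output definition (hypothetical)."""

    name: str
    is_asset_check: bool = False


def missing_outputs(
    defs: Iterable[ToyOutputDef],
    emitted: Set[str],
    skip_asset_checks: bool,
) -> List[str]:
    """Return the names of outputs that were defined but never emitted."""
    missing = []
    for output_def in defs:
        if output_def.name in emitted:
            continue
        if skip_asset_checks and output_def.is_asset_check:
            # This skip suppresses the warning for check outputs, but it also
            # hides them from any stricter check that runs on the same list.
            continue
        missing.append(output_def.name)
    return missing
```

With the skip enabled, an unemitted check output never reaches the list of missing names, so a strict mode built on top of that list cannot raise for it.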

Comment on lines 270 to 271
# Skip any return-specific validation and treat it like a generator op
elif output_defs and context.has_require_typed_event_stream:

Member

I think it's worth fleshing this comment out a bit, maybe just explaining more about has_require_typed_event_stream

Collaborator Author

done

@smackesey smackesey dismissed schrockn’s stale review September 22, 2023 20:08

approved in comment and alex approved

@smackesey smackesey merged commit 547a624 into master Sep 22, 2023
1 check passed
@smackesey smackesey deleted the sean/explicit-mode branch September 22, 2023 20:09
@rexledesma
Contributor

@smackesey Can we add a test case around using this API with a multi asset that supports subsetting? Otherwise, I think this API would produce incorrect errors if we integrate it against dagster-dbt.
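
One reading of this concern is that a subsetted run should only be required to emit the outputs that were actually selected. The toy sketch below shows that idea under that assumption; `expected_output_names` and `missing_for_run` are hypothetical helpers, not part of this PR or of dagster-dbt.

```python
from typing import Iterable, List, Optional, Set


def expected_output_names(
    all_outputs: Iterable[str],
    selected: Optional[Iterable[str]] = None,
) -> Set[str]:
    """Outputs a run must emit: all of them, or only the selected subset."""
    all_set = set(all_outputs)
    if selected is None:
        return all_set
    return all_set & set(selected)


def missing_for_run(
    all_outputs: Iterable[str],
    emitted: Iterable[str],
    selected: Optional[Iterable[str]] = None,
) -> List[str]:
    """Names that should have been emitted for this run but were not."""
    return sorted(expected_output_names(all_outputs, selected) - set(emitted))
```

If the strict check compared emitted names against every defined output instead, a run that selected only `foo` would be flagged for not emitting `bar`, which would be the kind of incorrect error described above.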
