-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test dynamic outputs + partitions #26725
Conversation
added sean and removed myself, I don't really have context on this one |
ef247df
to
28c74fd
Compare
assert result._output_capture == { # noqa: SLF001 | ||
StepOutputHandle( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
theres gotta be a better way to express this without using _output_capture
or StepOutputHandle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
output_for_node (the current best way of retrieving in-memory outputs) does not support mapping_key right now. We should definitely add support for that but didn't want to do so in this PR
return collect_fruit_vals(vals.collect()) | ||
|
||
with instance_for_test() as instance: | ||
instance.add_dynamic_partitions(dg._check.not_none(partitions_def.name), partitions) # noqa: SLF001 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case we are adding the dynamic partitions before executing anything. In a real deployment this would happen where, in a sensor that then requests this run?
Is there a similar but slightly different example that we would want to have that calculates and adds the dynamic partitions as part of the run? Is that possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a real deployment it would happen in a sensor. I see the utility of having a run that calculates and adds the dynamic partitions as well as processing, but it is not currently possible. For a run to be partitioned (that is, for all the partitioned properties to be filled out), you need to know the partition keys ahead of time (at least right now)
Summary & Motivation
Proof of concept of current interoperability behavior between dynamic outputs and dynamic partitions. It actually works, I think this would make an interesting / useful github discussion.
There's some hackiness to this approach currently though - for example, the graph-backed asset defines an implicit output, which means that you need to return an output at the end of the line in the graph - in the case of accumulating dynamic partitions; it might not always make sense to return an output in terms of partition keys at all. I think in that case you're left in a weird state:
How I Tested These Changes
New test