blocking check factory method #16612

johannkm · 2023-09-19T03:47:45Z

Quick sketch of a factory method for building a graph asset that materializes, runs checks, and blocks downstreams if the checks fail.

    @asset
    def my_asset():
        ...

    @asset_check(asset="blocking_my_asset")
    def my_check():
        ...

    blocking_asset = build_blocking_asset_check(asset=my_asset, checks=[pass_check, my_check])

Do we have other asset factory methods that I could mimic for all the other asset attributes?

python_modules/dagster/dagster/_core/definitions/asset_checks.py

johannkm · 2023-09-19T03:50:43Z

python_modules/dagster/dagster/_core/execution/plan/execute_step.py

@@ -131,7 +131,7 @@ def _process_user_event(
        output_name = step_context.job_def.asset_layer.get_output_name_for_asset_check(
            asset_check_evaluation.asset_check_handle
        )
-        output = Output(value=None, output_name=output_name)
+        output = Output(value=asset_check_evaluation, output_name=output_name)


We could fetch the check evaluations through the event storage instead, but I turned on io managers for them here. That makes the io manager case nicer.

Yeah this is nice.

replaced with fetching via checks table for now, since that works with both the io managed and not managed cases

schrockn

This is quite lovely actually.

❤️ composition over cursed bools

python_modules/dagster/dagster/_core/definitions/asset_checks.py

schrockn · 2023-09-19T17:46:10Z

Yeah this is nice.

johannkm · 2023-09-24T02:02:21Z

python_modules/dagster/dagster/_core/definitions/asset_checks.py

+        ins={name: AssetIn(key) for name, key in asset_def.keys_by_input_name.items()},
+        resource_defs=asset_def.resource_defs,
+        metadata=asset_def.metadata_by_key.get(asset_def.key),
+        freshness_policy=asset_def.freshness_policies_by_key.get(asset_def.key),


I don't really like this: it's easy to miss a field, and annoying to test them all. Any recs?

In terms of easy to miss a field and for future proofing, the only way I can really think of is to refactor graph_asset to immediately call a function graph_asset_no_defaults and then write test cases against graph_asset_no_defaults. That way anyone who adds something to graph_asset in the future will break all the tests, which is the desired behavior, as the right fix will undoubtedly be to change some code here.

On the "test them all" front, need more context on what you need/want to test

tests that assert these parameters are threaded through. 👍 to graph_asset_no_defaults
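A minimal sketch of the `graph_asset_no_defaults` idea discussed above, in plain Python with no Dagster imports. The parameter list, the dict return value, and the `build_wrapped` and `params_without_defaults` helpers are hypothetical stand-ins, not the real signatures. The point is only that a factory calling the no-defaults function must pass every field, so adding a parameter there raises a `TypeError` in the factory and its tests instead of being silently skipped.

```python
import inspect


def graph_asset_no_defaults(*, name, key_prefix, description, metadata):
    """Hypothetical stand-in: every parameter is required (no defaults)."""
    return {
        "name": name,
        "key_prefix": key_prefix,
        "description": description,
        "metadata": metadata,
    }


def graph_asset(*, name, key_prefix=None, description=None, metadata=None):
    """Public entry point: supplies the defaults, then delegates immediately."""
    return graph_asset_no_defaults(
        name=name, key_prefix=key_prefix, description=description, metadata=metadata
    )


def build_wrapped(source):
    """Hypothetical factory that copies every field from an existing definition."""
    return graph_asset_no_defaults(
        name=source["name"],
        key_prefix=source["key_prefix"],
        description=source["description"],
        metadata=source["metadata"],
    )


def params_without_defaults(fn):
    """Names of the parameters a caller must pass explicitly."""
    return [
        p.name
        for p in inspect.signature(fn).parameters.values()
        if p.default is inspect.Parameter.empty
    ]
```

A test can then call `build_wrapped` against a fully populated source and compare field by field. If someone later adds a required parameter to `graph_asset_no_defaults`, that call fails until the factory threads the new field through.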

schrockn · 2023-09-24T01:58:53Z

python_modules/dagster/dagster/_core/definitions/asset_checks.py

+    @graph_asset(
+        name=asset_def.key.path[-1],
+        key_prefix=asset_def.key.path[:-1] if len(asset_def.key.path) > 1 else None,
+        check_specs=check_specs,
+        description=asset_def.descriptions_by_key.get(asset_def.key),
+        ins={name: AssetIn(key) for name, key in asset_def.keys_by_input_name.items()}
+    )


As a follow-up, let's have graph_asset support a bare key argument just like all the other variants so we don't have to do these silly gymnastics around key_prefix.
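For reference, the gymnastics in the hunk above amount to splitting a key path into a name and an optional prefix. A small sketch in plain Python, assuming the path is a non-empty sequence of strings; `split_key_path` is a hypothetical helper name, not something in the codebase:

```python
from typing import List, Optional, Sequence, Tuple


def split_key_path(path: Sequence[str]) -> Tuple[str, Optional[List[str]]]:
    """Split an asset key path into (name, key_prefix).

    The last component is the name. Everything before it is the prefix,
    or None when the path has a single component.
    """
    if not path:
        raise ValueError("key path must have at least one component")
    name = path[-1]
    key_prefix = list(path[:-1]) if len(path) > 1 else None
    return name, key_prefix
```

With a bare `key` argument the caller would pass the whole path through and this split would not be needed at the call site.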

schrockn · 2023-09-24T01:59:32Z

python_modules/dagster/dagster/_core/definitions/asset_checks.py

+    check_output_names = [c.get_python_identifier() for c in check_specs]
+
+    @op(ins={"materialization": In(Any), "check_evaluations": In(Nothing)})
+    def fan_in_checks_and_materialization(context: OpExecutionContext, materialization):


typehint would be very helpful on materialization

^--- unaddressed

I added "materialization": In(asset_out_type). Is there some fancy way to get a type var with the output type of the asset op?

python_modules/dagster/dagster/_core/execution/plan/execute_step.py

updated

schrockn

Remaining question around the differing treatment of asset check evaluations and materializations.

schrockn · 2023-09-24T09:24:49Z

^--- unaddressed

schrockn · 2023-09-24T09:28:05Z

python_modules/dagster/dagster/_core/definitions/asset_checks.py

+            if execution.status != AssetCheckExecutionRecordStatus.SUCCEEDED:
+                raise DagsterAssetCheckFailedError()
+
+    def blocking_asset(**kwargs):


Do we know what the arguments are ahead of time? If we do, let's break them out. If we do not, leave a comment explaining what they represent.

they're the inputs to the passed in asset op. commented

schrockn · 2023-09-24T09:31:05Z

python_modules/dagster/dagster/_core/definitions/asset_checks.py

+
+    @op(ins={"materialization": In(asset_out_type), "check_evaluations": In(Nothing)})
+    def fan_in_checks_and_materialization(context: OpExecutionContext, materialization):
+        yield Output(materialization)


This is also very worthy of explanation as it is quite confusing at first blush.

You may want to consider, for no other reason than clarifying things for code readers, a couple of alternative approaches:

1. A wrapper class around AssetMaterialization that expresses the intent of the author (i.e. you) that you are shuffling this information between ops on purpose. Without more context, this looks like the author misunderstood concepts in the framework and screwed things up.

2. Since we ended up just fetching check evaluations from the instance anyway, just do that for the materialization as well.

I bias towards figuring out a way to make 1) work for all the events you are shuffling around via wrapper classes. However, I don't have full context on the asset check eval decisions. If there is a very good reason to do that, then let's make the materialization treatment similar.

added a comment. I think the crux is getting the correct io manager output written, not passing the events around
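To make the fan-in step easier to follow, here is a plain-Python analogy with no Dagster imports. It is not the code in this PR. `CheckResult`, `CheckFailedError`, `fan_in` and `run_blocking` are hypothetical names, and unlike the real op the check results are passed in directly rather than fetched from the instance. It only shows the shape: the asset's return value passes through a final step that raises if any check failed, so downstream consumers only ever see a value that passed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


class CheckFailedError(Exception):
    """Hypothetical stand-in for DagsterAssetCheckFailedError."""


@dataclass
class CheckResult:
    name: str
    passed: bool


def fan_in(asset_return_value: Any, results: Sequence[CheckResult]) -> Any:
    """Re-emit the asset's value unchanged, but only if every check passed."""
    failed = [r.name for r in results if not r.passed]
    if failed:
        raise CheckFailedError(f"blocking checks failed: {failed}")
    return asset_return_value


def run_blocking(
    compute: Callable[[], Any],
    checks: Sequence[Callable[[Any], CheckResult]],
) -> Any:
    """Compute the value, run each check on it, then gate it through fan_in."""
    value = compute()
    results = [check(value) for check in checks]
    return fan_in(value, results)
```

Because the gate returns the original value, whatever writes the final output sees the same object the asset function produced, which is the io manager concern raised in the comment above.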

another go

schrockn · 2023-09-26T00:48:38Z

python_modules/dagster/dagster/_core/definitions/asset_checks.py

+    asset_out_type = asset_def.op.output_defs[0].dagster_type
+
+    @op(ins={"asset_result": In(asset_out_type), "check_evaluations": In(Nothing)})
+    def fan_in_checks_and_asset_result(context: OpExecutionContext, asset_result):


are we not type-hinting asset_result on purpose because of dagster typing machinery?

#16612 (comment)

what would the hint be? The input is the output from the @asset decorated function, and I don't know how to turn that into a type variable. You can get it at runtime with op_def.get_output_annotation()

I think we'd have to do some very fancy generics with @asset down to the AssetsDefinition to here to get an annotation

Got it, understood. Just typing asset_result as Any would be helpful. Using the term "result" is a little problematic here since we now have "Result" objects, and this was the source of my confusion. I think choosing a different variable name would be helpful: asset_return_value, perhaps.
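A small illustration of the point above, using only the standard library rather than Dagster's own machinery. The return annotation of a decorated function is available at runtime, but a factory that receives an arbitrary function cannot name that type in its own static signature without generics. `return_annotation`, `my_asset` and `untyped_asset` are hypothetical names for this sketch.

```python
import inspect
from typing import Any, Callable


def my_asset() -> int:
    return 1


def untyped_asset():
    return "x"


def return_annotation(fn: Callable[..., Any]) -> Any:
    """Look up fn's return annotation at runtime, falling back to Any."""
    annotation = inspect.signature(fn).return_annotation
    if annotation is inspect.Signature.empty:
        return Any
    return annotation
```

The runtime value can be used to build something like `In(asset_out_type)`, while the parameter itself is still best hinted as `Any`.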

schrockn

Minor naming thing, but in this case I think it is important because the current state is a bit confusing.

rename

johannkm · 2023-09-26T02:50:52Z

renaming fn to build_asset_with_blocking_check from build_blocking_asset_check

schrockn

👍🏻

schrockn · 2023-09-26T02:52:07Z

renaming fn to build_asset_with_blocking_check from build_blocking_asset_check

Hmm, why? Seems an odd name for a function that returns an AssetsDefinition.

schrockn

name change kind of a big shift

johannkm · 2023-09-26T02:52:56Z

I think that build_blocking_asset_check was an odd name, sounds like it would return an AssetChecksDefinition

schrockn

Oh I read that the opposite way thinking you were renaming it to build_blocking_asset_check. You are building an asset not a check. 👍🏻

@johannkm

## Summary & Motivation

`graph_asset`'s parameters are out-of-sync with `asset`'s on a structural level, and we do not have the discipline or the tests to keep them in sync. This PR adds an explicit `key` which is nice for users and would have made a recent PR of @johannkm's nicer, who had to do some contortions in #16612 around `key_prefix`. This diff also adds documentation for `key` on `@asset`. An upstack PR will consolidate the logic in `@asset` and `@graph_asset` around this stuff into a single helper function, since it is pretty gross, tricky code.

## How I Tested These Changes

johannkm changed the title ~~blocking check factory method~~ [rfc] blocking check factory method Sep 19, 2023

johannkm requested review from rexledesma and schrockn September 19, 2023 04:01

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch from c0d8bd2 to 58aa295 Compare September 19, 2023 04:01

schrockn previously requested changes Sep 19, 2023

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch from 58aa295 to a46586f Compare September 20, 2023 16:55

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch 3 times, most recently from 3d84229 to f7e32d4 Compare September 24, 2023 01:41

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch from f7e32d4 to 01d7720 Compare September 24, 2023 02:01

schrockn reviewed Sep 24, 2023

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch 3 times, most recently from de4173e to 0540761 Compare September 24, 2023 02:10

johannkm changed the title ~~[rfc] blocking check factory method~~ blocking check factory method Sep 24, 2023

johannkm requested a review from schrockn September 24, 2023 02:20

schrockn previously requested changes Sep 24, 2023

schrockn mentioned this pull request Sep 25, 2023

Make @graph_asset support explicit key #16751

Merged

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch 2 times, most recently from cc1c7ae to 446c5d7 Compare September 26, 2023 00:27

johannkm requested a review from schrockn September 26, 2023 00:29

schrockn reviewed Sep 26, 2023

schrockn approved these changes Sep 26, 2023

schrockn previously requested changes Sep 26, 2023

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch from 446c5d7 to 9702291 Compare September 26, 2023 02:46

johannkm requested a review from schrockn September 26, 2023 02:46

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch from 9702291 to dbdbb56 Compare September 26, 2023 02:48

schrockn approved these changes Sep 26, 2023

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch from dbdbb56 to 03fd1a9 Compare September 26, 2023 02:51

schrockn requested changes Sep 26, 2023

schrockn approved these changes Sep 26, 2023

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch from 03fd1a9 to 54abeaa Compare September 26, 2023 15:53

johannkm added 4 commits September 26, 2023 12:32

blocking check factory method

9bf52b9

feedback

342c406

fix inputs

eb1f7f0

graph_asset_no_defaults

2fec27e

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch from 54abeaa to c7b6e26 Compare September 26, 2023 16:33

rebase on asset decs

ad85f4a

johannkm force-pushed the johann/09-18-blocking_check_factory_method branch from c7b6e26 to ad85f4a Compare September 26, 2023 16:34

johannkm merged commit 4e2db4e into master Sep 26, 2023

johannkm deleted the johann/09-18-blocking_check_factory_method branch September 26, 2023 17:38
