[DOCS] Custom Actions #10772
Changes from 24 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -25,16 +25,17 @@ The following table defines the GX Cloud, GX Core, and Community Supported integ | |
| Data Sources<sup>1</sup> | Snowflake<br/>Databricks (SQL)<br/> PostgreSQL<sup>2</sup> | Snowflake<br/>Databricks (SQL)<br/>PostgreSQL<br/>SQLite<br/>BigQuery<br/>Spark<br/>Pandas | MSSQL<br/>MySQL<br/> | | ||
| Configuration Stores<sup>3</sup> | In-app | File system | None | | ||
| Data Doc Stores | In-app | File system | None | | ||
| Actions | Email | Slack <br/>Email <br/>Microsoft Teams | None | | ||
| Credential Stores | Environment variables | Environment variables <br/> YAML<sup>4</sup> | None | | ||
| Orchestrator | Airflow <sup>5</sup> <sup>6</sup> | Airflow <sup>5</sup> <sup>6</sup> | None | | ||
| Actions | Email | Slack <br/>Email <br/>Microsoft Teams <br/>Custom<sup>4</sup> | None | | ||
| Credential Stores | Environment variables | Environment variables <br/> YAML<sup>5</sup> | None | | ||
| Orchestrator | Airflow <sup>6</sup> <sup>7</sup> | Airflow <sup>6</sup> <sup>7</sup> | None | | ||
|
||
<sup>1</sup> We've also seen GX work with the following data sources in the past but we can't guarantee ongoing compatibility. These data sources include Clickhouse, Vertica, Dremio, Teradata, Athena, EMR Spark, AWS Glue, Microsoft Fabric, Trino, Pandas on (S3, GCS, Azure), Databricks (Spark), and Spark on (S3, GCS, Azure).<br/> | ||
<sup>2</sup> Support for BigQuery in GX Cloud will be available in a future release.<br/> | ||
<sup>3</sup> This includes configuration storage for Expectations, Checkpoints, Validation Definitions, and Validation Result<br/> | ||
<sup>4</sup> config_variables.yml<br/> | ||
<sup>5</sup> Although only Airflow is supported, GX Cloud and GX Core should work with any orchestrator that executes Python code.<br/> | ||
<sup>6</sup> Airflow version 2.9.0+ required<br/> | ||
<sup>3</sup> This includes configuration storage for Expectations, Checkpoints, Validation Definitions, and Validation Results.<br/> | ||
<sup>4</sup> We support the general workflow for creating custom Actions but cannot help troubleshoot the domain-specific logic within a custom Action.<br/> | ||
<sup>5</sup> Use `config_variables.yml`.<br/> | ||
Note for reviewer: apologies for scope creep but I went ahead and made these footnotes more consistently structured as long as I was editing this content. |
||
<sup>6</sup> Although only Airflow is supported, GX Cloud and GX Core should work with any orchestrator that executes Python code.<br/> | ||
<sup>7</sup> Airflow version 2.9.0+ required.<br/> | ||
|
||
### GX components | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
""" | ||
This is an example script for how to create a custom Action. | ||
Note for technical reviewer: this is my first time putting a new code sample under test. I did get some errors with it that I was able to resolve so I think the test is working. But, please let me know if I need to add anything else to make sure this is sufficiently tested.
Hmm so usually these snippet files represent actual tests and I don't think we have any test logic here. At the bottom of the file, could we instantiate an instance of this custom class and assert against its run method? You may need to create a checkpoint and that can be a little cumbersome so let me know if you want me to jump in and add these details on your behalf.
Thank you - I would appreciate if you'd jump in and add the extra code to make this a sufficient test!
I've updated the logic to show a concrete example of what a user might want to do but feel free to modify/critique as you see fit
Thank you! How does this custom calculated percentage of successful expectations across all validation results compare to the If these two things provide the same exact information in different ways, are there advantages to using a custom action instead of the built-in statistics? Depending on how similar these two things are and what if any advantages there are to choosing one option over another in various scenarios, I think we should either acknowledge the similarities and tradeoffs or pick a different concrete example. (this feedback is somewhat similar to the feedback you had for me about the email use case in the intro overlapping a bit with our built in email action.)
connecting some dots: more discussion about this in slack at https://greatexpectationslabs.slack.com/archives/C05V0M18TEJ/p1734482967962709?thread_ts=1734360165.007409&cid=C05V0M18TEJ |
||
|
||
To test, run: | ||
pytest --docs-tests -k "docs_example_create_a_custom_action" tests/integration/test_script_runner.py | ||
""" | ||
|
||
# EXAMPLE SCRIPT STARTS HERE: | ||
|
||
# <snippet name="docs/docusaurus/docs/core/trigger_actions_based_on_results/_examples/create_a_custom_action.py - full code example"> | ||
|
||
from typing import Literal | ||
|
||
from typing_extensions import override | ||
Note for technical reviewer: I got an error about override not being defined so I added this based on other code I see in the repo. It resolved the error so I think it's right but let me know if there's anything amiss here. |
||
|
||
from great_expectations.checkpoint.actions import ActionContext, ValidationAction | ||
from great_expectations.checkpoint.checkpoint import CheckpointResult | ||
|
||
|
||
# 1. Extend the `ValidationAction` class. | ||
# <snippet name="docs/docusaurus/docs/core/trigger_actions_based_on_results/_examples/create_a_custom_action.py - extend class"> | ||
class MyCustomAction(ValidationAction): | ||
# </snippet> | ||
|
||
# 2. Set the `type` attribute to a unique string that identifies the Action. | ||
# <snippet name="docs/docusaurus/docs/core/trigger_actions_based_on_results/_examples/create_a_custom_action.py - set type"> | ||
type: Literal["my_custom_action"] = "my_custom_action" | ||
# </snippet> | ||
|
||
# 3. Override the `run()` method to perform the desired task. | ||
# <snippet name="docs/docusaurus/docs/core/trigger_actions_based_on_results/_examples/create_a_custom_action.py - override run"> | ||
@override | ||
def run( | ||
self, | ||
checkpoint_result: CheckpointResult, | ||
action_context: ActionContext, # Contains results from prior Actions in the same Checkpoint run. | ||
) -> dict: | ||
self._do_my_custom_action(...) # Domain-specific logic | ||
return {"some": "info"} # Return information about the Action | ||
|
||
def _do_my_custom_action(self): ... | ||
|
||
# </snippet> | ||
|
||
|
||
# </snippet> |
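For readers following the review thread about a more concrete example, here is a minimal sketch of the same three steps (subclass, set `type`, override `run()`) with a `run()` that actually computes something: the percentage of successful expectations. It is deliberately self-contained, so `FakeExpectationResult`, `FakeCheckpointResult`, and `FakeValidationAction` are hypothetical stand-ins invented for illustration and are not the GX classes. In a real project you would import `ValidationAction` and `ActionContext` from `great_expectations.checkpoint.actions` as the script above does, and the shape of the real `CheckpointResult` may differ from the stand-in.

```python
from dataclasses import dataclass, field
from typing import Literal


# --- Stand-ins (NOT the GX API): minimal shapes for illustration only. ---
@dataclass
class FakeExpectationResult:
    success: bool


@dataclass
class FakeCheckpointResult:
    # Hypothetical layout: validation name -> list of expectation results.
    run_results: dict[str, list[FakeExpectationResult]] = field(default_factory=dict)


class FakeValidationAction:
    """Stand-in base class; subclasses override `type` and `run()`."""

    type: str = "base"

    def run(self, checkpoint_result, action_context=None) -> dict:
        raise NotImplementedError


# --- The pattern from the example script. ---
# 1. Extend the base class.
class SuccessRateAction(FakeValidationAction):
    # 2. Set `type` to a unique string that identifies the Action.
    type: Literal["success_rate"] = "success_rate"

    # 3. Override `run()` with the domain-specific logic.
    def run(self, checkpoint_result, action_context=None) -> dict:
        return {"success_percent": self._success_rate(checkpoint_result)}

    def _success_rate(self, checkpoint_result) -> float:
        # Flatten every expectation outcome across all validations.
        outcomes = [
            result.success
            for results in checkpoint_result.run_results.values()
            for result in results
        ]
        if not outcomes:
            return 0.0  # Nothing was validated; avoid dividing by zero.
        return 100.0 * sum(outcomes) / len(outcomes)


# Usage: three of four expectations passed.
checkpoint_result = FakeCheckpointResult(
    run_results={
        "orders": [FakeExpectationResult(True), FakeExpectationResult(False)],
        "customers": [FakeExpectationResult(True), FakeExpectationResult(True)],
    }
)
info = SuccessRateAction().run(checkpoint_result)
print(info)  # {'success_percent': 75.0}
```

Note that the helper's signature matches the call site here; in the placeholder script above, `_do_my_custom_action(...)` is called with an argument that the stub does not accept, so the stub would need a matching parameter before it could run.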
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,14 +11,14 @@ import PrereqValidationDefinition from '../_core_components/prerequisites/_valid | |
|
||
A Checkpoint executes one or more Validation Definitions and then performs a set of Actions based on the Validation Results each Validation Definition returns. | ||
|
||
<h2>Prerequisites</h2> | ||
## Prerequisites | ||
|
||
- <PrereqPythonInstalled/>. | ||
- <PrereqGxInstalled/>. | ||
- <PrereqPreconfiguredDataContext/>. In this guide the variable `context` is assumed to contain your Data Context. | ||
- <PrereqValidationDefinition/>. | ||
|
||
### Procedure | ||
## Procedure | ||
Note for reviewer: apologies for scope creep but I noticed some header level issues on this page when I copied it to scaffold the new page, so I went ahead and fixed the problems on this page. |
||
|
||
<Tabs | ||
queryString="procedure" | ||
|
@@ -40,7 +40,7 @@ A Checkpoint executes one or more Validation Definitions and then performs a set | |
|
||
2. Determine the Actions that the Checkpoint will automate. | ||
|
||
After a Checkpoint receives Validation Results from running a Validation Definition, it executes a list of Actions. The returned Validation Results determine what task is performed for each Action. Actions can include updating Data Docs with the new Validation Results or sending alerts when validations fail. The Actions list is executed once for each Validation Definition in a Checkpoint. | ||
After a Checkpoint receives Validation Results from running a Validation Definition, it executes a list of Actions. The returned Validation Results determine what task is performed for each Action. Actions can include updating Data Docs with the new Validation Results, sending alerts when validations fail, or your own [custom logic](/core/trigger_actions_based_on_results/create_a_custom_action.md). The Actions list is executed once for each Validation Definition in a Checkpoint. | ||
|
||
Actions can be found in the `great_expectations.checkpoint` module. All Action class names end with `*Action`. | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
--- | ||
title: Create a custom Action | ||
description: Run custom logic based on Validation Results to integrate with 3rd-party tools and business workflows. | ||
--- | ||
import TabItem from '@theme/TabItem'; | ||
import Tabs from '@theme/Tabs'; | ||
|
||
import PrereqPythonInstalled from '../_core_components/prerequisites/_python_installation.md'; | ||
import PrereqGxInstalled from '../_core_components/prerequisites/_gx_installation.md'; | ||
|
||
Great Expectations provides [Actions for common workflows](/application_integration_support.md#integrations) such as sending emails and updating Data Docs. If these don't meet your needs, you can create a custom Action to integrate with different tools or apply custom business logic based on Validation Results. Example use cases for custom Actions include: | ||
- Opening tickets in an issue tracker when Validation runs fail. | ||
- Sending emails to different teams depending on which Expectations fail. | ||
This has some overlap with our current email action right? Should we perhaps pick another example? |
||
- Running follow-up ETL jobs to fill in missing values. | ||
|
||
A custom Action can do anything that can be done with Python code. | ||
|
||
To create a custom Action, you subclass the `ValidationAction` class, overriding the `type` attribute with a unique name and the `run()` method with custom logic. | ||
|
||
|
||
## Prerequisites | ||
|
||
- <PrereqPythonInstalled/>. | ||
- <PrereqGxInstalled/>. | ||
|
||
## Procedure | ||
|
||
<Tabs | ||
queryString="procedure" | ||
defaultValue="instructions" | ||
values={[ | ||
{value: 'instructions', label: 'Instructions'}, | ||
{value: 'sample_code', label: 'Sample code'} | ||
]} | ||
> | ||
|
||
<TabItem value="instructions" label="Instructions"> | ||
|
||
1. Create a new custom Action class that inherits the `ValidationAction` class. | ||
|
||
```python title="Python" name="docs/docusaurus/docs/core/trigger_actions_based_on_results/_examples/create_a_custom_action.py - extend class" | ||
``` | ||
|
||
2. Set a unique name for `type`. | ||
|
||
```python title="Python" name="docs/docusaurus/docs/core/trigger_actions_based_on_results/_examples/create_a_custom_action.py - set type" | ||
``` | ||
|
||
3. Override the `run()` method with the logic for the Action. | ||
|
||
```python title="Python" name="docs/docusaurus/docs/core/trigger_actions_based_on_results/_examples/create_a_custom_action.py - override run" | ||
``` | ||
|
||
</TabItem> | ||
|
||
<TabItem value="sample_code" label="Sample code"> | ||
|
||
```python title="Python" name="docs/docusaurus/docs/core/trigger_actions_based_on_results/_examples/create_a_custom_action.py - full code example" | ||
``` | ||
|
||
</TabItem> | ||
|
||
</Tabs> | ||
|
||
Now you can use your custom Action like you would any built-in Action. [Create a Checkpoint with Actions](/core/trigger_actions_based_on_results/create_a_checkpoint_with_actions.md) to start automating responses to Validation Results. |
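The example script notes that `action_context` contains results from prior Actions in the same Checkpoint run. As a rough illustration of that hand-off, the toy model below runs two Actions in order and lets the second read what the first returned, using the "open tickets when Validation runs fail" use case from the introduction. Everything here is an assumption made for the sketch: `FakeActionContext`, `CountFailuresAction`, `OpenTicketAction`, and `run_actions` are hypothetical names, the checkpoint result is reduced to a plain dict, and none of it is the GX implementation.

```python
from typing import Any


class FakeActionContext:
    """Stand-in (NOT the GX class): records (action, result) pairs in run order."""

    def __init__(self) -> None:
        self.data: list[tuple[Any, dict]] = []

    def update(self, action: Any, action_result: dict) -> None:
        self.data.append((action, action_result))


class CountFailuresAction:
    type = "count_failures"

    def run(self, checkpoint_result: dict, action_context: FakeActionContext) -> dict:
        # Hypothetical result shape: validation name -> overall success flag.
        failed = [name for name, ok in checkpoint_result.items() if not ok]
        return {"failed": failed}


class OpenTicketAction:
    type = "open_ticket"

    def run(self, checkpoint_result: dict, action_context: FakeActionContext) -> dict:
        # Read what earlier Actions reported instead of recomputing it.
        failed = [
            name
            for _, result in action_context.data
            for name in result.get("failed", [])
        ]
        # A real Action would call an issue tracker here; this only builds titles.
        return {"tickets": [f"Investigate {name}" for name in failed]}


def run_actions(actions: list, checkpoint_result: dict) -> FakeActionContext:
    """Run each Action in order, recording its result for later Actions."""
    context = FakeActionContext()
    for action in actions:
        result = action.run(checkpoint_result, context)
        context.update(action, result)
    return context


context = run_actions(
    [CountFailuresAction(), OpenTicketAction()],
    {"orders": False, "customers": True},
)
print(context.data[-1][1])  # {'tickets': ['Investigate orders']}
```

Under this model the order of the Action list matters: an Action can only see results from Actions that ran before it.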
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -182,10 +182,10 @@ def __new__(cls, clsname, bases, attrs): | |
@public_api | ||
class ValidationAction(BaseModel, metaclass=MetaValidationAction): | ||
""" | ||
ValidationActions define a set of steps to be run after a validation result is produced. | ||
Actions define a set of steps to run after a Validation Result is produced. Subclass `ValidationAction` to create a `custom Action </docs/core/trigger_actions_based_on_results/create_a_custom_action>`_. | ||
Huh that is a little odd - I don't know the exact logic that does this but it might be ingrained in one of the docs libs we use. Perhaps try MD syntax for hyperlinks? Not sure that's supported but worth a shot?
MD syntax worked! 🎉 |
||
|
||
Through a Checkpoint, one can orchestrate the validation of data and configure notifications, data documentation updates, | ||
and other actions to take place after the validation result is produced. | ||
and other actions to take place after the Validation Result is produced. | ||
""" # noqa: E501 | ||
|
||
class Config: | ||
|
Note for product reviewer: Please confirm the contribution readiness status for Actions. I tried to ask about this in my docs plan but didn't get an answer, so I want to double check that I've got the right status.
Not the product reviewer, but LGTM from an engineering perspective - the API is stable and the steps to create custom actions are quite simple (as laid out here 🎉 )