Commit
[ETL-616] Implement Great Expectations to run on parquet data (#139)
* initial commit for testing
* update sample expectations
* add two data types
* correct to fitbitdailydata
* fix expectation
* add complete script
* initial cf config and template
* correct formatting, refactor triggers
* fix job name
* refactor gx code, add tests, adjust gx version
* refactor gx code, add tests, adjust gx version
* make consistent naming
* remove hardcoded args
* add integration tests, remove null rows code, add dep for urllib3<2
* change to lowercase data type
* add prod cf configs, add perm for glue role for shareable artifacts bucket
* rename, include prod ver
* add test to catch exception
* add conditional creation of triggers due to what is available in expectations json
* update README for tests, add in testing for our scripts
* chain cmd together
* update prod
* gather tests, correct key_prefix to key, add missing params to prod glue role
* remove slash
* add gx glue version as var in config
Showing 17 changed files with 1,084 additions and 20 deletions.
18 changes: 18 additions & 0 deletions
config/develop/namespaced/glue-job-run-great-expectations-on-parquet.yaml
@@ -0,0 +1,18 @@
template:
  path: glue-job-run-great-expectations-on-parquet.j2
dependencies:
  - develop/glue-job-role.yaml
stack_name: "{{ stack_group_config.namespace }}-glue-job-RunGreatExpectationsParquet"
parameters:
  Namespace: {{ stack_group_config.namespace }}
  JobDescription: Runs great expectations on a set of data
  JobRole: !stack_output_external glue-job-role::RoleArn
  TempS3Bucket: {{ stack_group_config.processed_data_bucket_name }}
  S3ScriptBucket: {{ stack_group_config.template_bucket_name }}
  S3ScriptKey: '{{ stack_group_config.namespace }}/src/glue/jobs/run_great_expectations_on_parquet.py'
  GlueVersion: "{{ stack_group_config.great_expectations_job_glue_version }}"
  AdditionalPythonModules: "great_expectations~=0.18,urllib3<2"
stack_tags:
  {{ stack_group_config.default_stack_tags }}
sceptre_user_data:
  dataset_schemas: !file src/glue/resources/table_columns.yaml
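
For orientation: the `AdditionalPythonModules` value in this config is a comma-separated list of pip requirement specifiers, and AWS Glue accepts such a list through its `--additional-python-modules` job argument. The sketch below shows one way that string could be normalized into a Glue default-arguments mapping. It assumes the `.j2` template forwards the parameter this way; the template is not shown in this diff, and `build_default_arguments` is a hypothetical helper, not part of the commit.

```python
def build_default_arguments(additional_python_modules):
    """Build a Glue-style default-arguments mapping (hypothetical helper).

    Assumption: the real wiring lives in the .j2 template, which this
    diff does not show.
    """
    # Split the comma-separated specifier list and drop stray whitespace.
    modules = [m.strip() for m in additional_python_modules.split(",") if m.strip()]
    # Glue reads extra pip requirements from this special job argument.
    return {"--additional-python-modules": ",".join(modules)}


if __name__ == "__main__":
    print(build_default_arguments("great_expectations~=0.18, urllib3<2"))
```

Under pip's version rules, `~=0.18` is a compatible-release pin (at least 0.18, still within the 0.x series), and `urllib3<2` keeps the job on a urllib3 1.x release.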
18 changes: 18 additions & 0 deletions
config/prod/namespaced/glue-job-run-great-expectations-on-parquet.yaml
@@ -0,0 +1,18 @@
template:
  path: glue-job-run-great-expectations-on-parquet.j2
dependencies:
  - prod/glue-job-role.yaml
stack_name: "{{ stack_group_config.namespace }}-glue-job-RunGreatExpectationsParquet"
parameters:
  Namespace: {{ stack_group_config.namespace }}
  JobDescription: Runs great expectations on a set of data
  JobRole: !stack_output_external glue-job-role::RoleArn
  TempS3Bucket: {{ stack_group_config.processed_data_bucket_name }}
  S3ScriptBucket: {{ stack_group_config.template_bucket_name }}
  S3ScriptKey: '{{ stack_group_config.namespace }}/src/glue/jobs/run_great_expectations_on_parquet.py'
  GlueVersion: "{{ stack_group_config.great_expectations_job_glue_version }}"
  AdditionalPythonModules: "great_expectations~=0.18,urllib3<2"
stack_tags:
  {{ stack_group_config.default_stack_tags }}
sceptre_user_data:
  dataset_schemas: !file src/glue/resources/table_columns.yaml
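
The develop and prod configs in this commit differ only in the `dependencies` entry (`develop/glue-job-role.yaml` versus `prod/glue-job-role.yaml`). A minimal sketch of a line-by-line check that could confirm this when copying a config between environments. The excerpts are abbreviated to the first four lines of each file with assumed two-space indentation, and `changed_lines` is a hypothetical helper, not part of the repository.

```python
# Abbreviated excerpts of the two configs (first four lines only).
DEVELOP_HEAD = """\
template:
  path: glue-job-run-great-expectations-on-parquet.j2
dependencies:
  - develop/glue-job-role.yaml
"""

PROD_HEAD = """\
template:
  path: glue-job-run-great-expectations-on-parquet.j2
dependencies:
  - prod/glue-job-role.yaml
"""


def changed_lines(a, b):
    """Return (old, new) pairs for lines that differ at the same position.

    Assumes both texts have the same number of lines, as these configs do.
    """
    return [(x, y) for x, y in zip(a.splitlines(), b.splitlines()) if x != y]


if __name__ == "__main__":
    for old, new in changed_lines(DEVELOP_HEAD, PROD_HEAD):
        print(f"- {old.strip()}\n+ {new.strip()}")
```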