-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(ingest/snowflake): always ingest view and external table ddl lineage #12191
fix(ingest/snowflake): always ingest view and external table ddl lineage #12191
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Minor question in include_table_lineage
param description
@@ -163,26 +163,13 @@ class SnowflakeConfig( | |||
default=True, | |||
description="If enabled, populates the snowflake table-to-table and s3-to-snowflake table lineage. Requires appropriate grants given to the role and Snowflake Enterprise Edition or above.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we update this include_table_lineage
description here to emphasize that this enables view lineage too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The view upstream lineage would be ingested always irrespective of this flag.
) | ||
|
||
_include_view_lineage = pydantic_removed_field("include_view_lineage") | ||
_include_view_column_lineage = pydantic_removed_field("include_view_column_lineage") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo we should make this change everywhere
@@ -512,15 +513,14 @@ def get_workunits_internal(self) -> Iterable[MetadataWorkUnit]: | |||
discovered_datasets = discovered_tables + discovered_views | |||
|
|||
if self.config.use_queries_v2: | |||
self.report.set_ingestion_stage("*", "View Parsing") | |||
assert self.aggregator is not None | |||
self.report.set_ingestion_stage("*", VIEW_PARSING) | |||
yield from auto_workunit(self.aggregator.gen_metadata()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly for my own understanding - self.aggregator
is only used for view parsing? and then there's another aggregator within SnowflakeQueriesExtractor
?
if so, should the self.aggregator.gen_metadata()
call happen first and then lineage generation happen afterwards?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly for my own understanding - self.aggregator is only used for view parsing? and then there's another aggregator within SnowflakeQueriesExtractor?
Yes, when use_queries_v2 is true
When use_queries_v2 is false
:
- and
include_table_lineage is false
-> then self.aggregator is used for only view lineage - and
include_table_lineage is true
-> then self.aggregator is used for both view and table lineage.
Earlier disabling table lineage via
include_table_lineage
mandated disabling view lineage viainclude_view_lineage
which made it impossible to separate schema and lineage/usage ingestions in separate recipes without missing on view lineage entirely.Stacked on top of #12179
Checklist