Merge branch 'scotthuang/sc-29053/snowflake-iceberg-support' of github.com:MetaphorData/connectors into scotthuang/sc-29053/snowflake-iceberg-support
elic-eon committed Oct 1, 2024
2 parents 30a7114 + b14e1d7 commit ccb326d
Showing 3 changed files with 18 additions and 2 deletions.
8 changes: 8 additions & 0 deletions metaphor/snowflake/README.md
@@ -126,6 +126,14 @@ account_usage_schema: <db_name>.<schema_name>

See [Tag Matcher Config](../common/docs/tag_matcher.md) for more information on the optional `tag_matcher` config.

+#### Disable Platform Tags Collection
+
+To stop the crawler from collecting platform tags from Snowflake, set `collect_tags` to `False`:
+
+```yaml
+collect_tags: false # Default is true.
+```
+
#### Query Logs

By default, the snowflake connector will fetch a full day's query logs from yesterday, to be analyzed for additional metadata, such as dataset usage and lineage information. To backfill log data, one can set `lookback_days` to the desired value. To turn off query log fetching, set `lookback_days` to 0.
3 changes: 3 additions & 0 deletions metaphor/snowflake/config.py
@@ -58,6 +58,9 @@ class SnowflakeBaseConfig(SnowflakeAuthConfig):
     # The fully qualified schema that contains all the account_usage views
     account_usage_schema: str = "SNOWFLAKE.ACCOUNT_USAGE"

+    # Whether to collect platform tags.
+    collect_tags: bool = True
+

 @dataclass(config=ConnectorConfig)
 class SnowflakeConfig(SnowflakeBaseConfig):
9 changes: 7 additions & 2 deletions metaphor/snowflake/extractor.py
@@ -174,7 +174,10 @@ async def extract(self) -> Collection[ENTITY_TYPES]:

         self._fetch_primary_keys(cursor)
         self._fetch_unique_keys(cursor)
-        self._fetch_tags(cursor)
+
+        # Only fetch the tags when collect_tags is True
+        if self._config.collect_tags:
+            self._fetch_tags(cursor)

         datasets = list(self._datasets.values())
         tag_datasets(datasets, self._tag_matchers)
@@ -948,7 +951,9 @@ def _init_dataset(
             database=database, schema=schema, table=table
         )

-        dataset.system_tags = SystemTags(tags=[])
+        # Only initialize this when collect_tags is True
+        if self._config.collect_tags:
+            dataset.system_tags = SystemTags(tags=[])

         return dataset
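
The two guarded call sites in `extractor.py` suggest the following behavior: with `collect_tags` disabled, no tag query runs and `system_tags` is never initialized on a dataset. Below is a minimal, self-contained sketch of that gating. It is not the connector's code: `SketchConfig`, `SketchDataset`, `SketchExtractor`, the `tag_queries` counter, and the placeholder tag string are all hypothetical stand-ins, and a plain list stands in for `SystemTags`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SketchConfig:
    # Hypothetical stand-in for the new option; defaults to True as in the diff.
    collect_tags: bool = True


@dataclass
class SketchDataset:
    name: str
    # Stays None when tag collection is disabled (a list stands in for SystemTags).
    system_tags: Optional[List[str]] = None


class SketchExtractor:
    def __init__(self, config: SketchConfig) -> None:
        self._config = config
        self._datasets: Dict[str, SketchDataset] = {}
        self.tag_queries = 0  # hypothetical counter of simulated tag fetches

    def _init_dataset(self, name: str) -> SketchDataset:
        dataset = SketchDataset(name=name)
        # Only initialize the tag container when collect_tags is True.
        if self._config.collect_tags:
            dataset.system_tags = []
        self._datasets[name] = dataset
        return dataset

    def _fetch_tags(self) -> None:
        # Stand-in for the real tag query; appends one placeholder tag per dataset.
        self.tag_queries += 1
        for dataset in self._datasets.values():
            if dataset.system_tags is not None:
                dataset.system_tags.append("hypothetical_tag=value")

    def extract(self) -> List[SketchDataset]:
        self._init_dataset("db.schema.table")
        # Only fetch the tags when collect_tags is True.
        if self._config.collect_tags:
            self._fetch_tags()
        return list(self._datasets.values())
```

In this sketch, `SketchExtractor(SketchConfig(collect_tags=False)).extract()` returns datasets whose `system_tags` is `None` rather than an empty list, so downstream code that reads the field would need to handle the missing value.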
