Commit

Merge pull request #87 from get-select/use_comments_and_query_tags
Use comments and query tags
NiallRees authored Feb 26, 2023
2 parents 51b91a6 + b452418 commit 7716514
Showing 12 changed files with 117 additions and 27 deletions.
10 changes: 10 additions & 0 deletions .changes/3.0.0.md
@@ -0,0 +1,10 @@
## dbt-snowflake-monitoring 3.0.0 - February 26, 2023
After attempting to use only query comments or query tags in versions 1 and 2, we've learned we actually need both. The reasons for that are:
* The `is_incremental` macro is only available from within the query tag dbt context, and is needed to determine whether a model run is incremental or not.
* Query tags have a maximum character limit of 2000, which is easily exceeded by a list of refs containing more than a few models.
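
To make the second point concrete, here is a rough sketch of how quickly a serialized list of refs can outgrow a 2000-character tag. It is illustrative only: the payload shape, the model names, and the helpers `buildMetadata` and `fitsInQueryTag` are assumptions, not the package's actual format.

```javascript
// Limit stated in the changelog entry above.
const QUERY_TAG_MAX_CHARS = 2000;

// Hypothetical payload: the keys mimic metadata fields seen in this package,
// but the structure and the model names are made up for illustration.
function buildMetadata(refCount) {
  const nodeRefs = Array.from(
    { length: refCount },
    (_, i) => `stg_example_source__model_${String(i).padStart(3, "0")}`
  );
  return JSON.stringify({
    app: "dbt",
    node_id: "model.my_project.example_model",
    node_refs: nodeRefs,
  });
}

// True when the serialized metadata would fit inside a single query tag.
function fitsInQueryTag(serialized) {
  return serialized.length <= QUERY_TAG_MAX_CHARS;
}

// Print the ref count, the serialized length, and whether it fits.
for (const refCount of [5, 100]) {
  const payload = buildMetadata(refCount);
  console.log(refCount, payload.length, fitsInQueryTag(payload));
}
```

Query comments are not subject to this limit, so under this sketch's assumptions the long `node_refs` list would travel in the comment.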

To upgrade from 2.x.x, follow the latest instructions in the [Quickstart](https://github.com/get-select/dbt-snowflake-monitoring#quickstart). The package is still able to process query metadata generated by previous versions of the package, so no data will be lost on upgrade.

### Breaking Changes

- Use query comments and query tags to avoid query tag character limit ([#87](https://github.com/get-select/dbt-snowflake-monitoring/pull/87))
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,17 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html),
and is generated by [Changie](https://github.com/miniscruff/changie).

## dbt-snowflake-monitoring 3.0.0 - February 26, 2023
After attempting to use only query comments or query tags in versions 1 and 2, we've learned we actually need both. The reasons for that are:
* The `is_incremental` macro is only available from within the query tag dbt context, and is needed to determine whether a model run is incremental or not.
* Query tags have a maximum character limit of 2000, which is easily exceeded by a list of refs containing more than a few models.

To upgrade from 2.x.x, follow the latest instructions in the [Quickstart](https://github.com/get-select/dbt-snowflake-monitoring#quickstart). The package is still able to process query metadata generated by previous versions of the package, so no data will be lost on upgrade.

### Breaking Changes

- Use query comments and query tags to avoid query tag character limit ([#87](https://github.com/get-select/dbt-snowflake-monitoring/pull/87))

## dbt-snowflake-monitoring 2.0.2 - February 21, 2023

### Fixes
26 changes: 21 additions & 5 deletions README.md
@@ -4,21 +4,23 @@ From the [SELECT](https://select.dev) team, a dbt package to help you monitor Sn

## Quickstart

Grant dbt's role access to the `snowflake` database:
1. Grant dbt's role access to the `snowflake` database:

```sql
grant imported privileges on database snowflake to role your_dbt_role_name;
```

Add the following to your `packages.yml` file:
2. Add the package to your `packages.yml` file:

```yaml
packages:
- package: get-select/dbt_snowflake_monitoring
version: 2.0.2
version: 3.0.0
```
To attribute costs to individual models via the `dbt_metadata` column in the `query_history_enriched` model, query tags are added to all dbt-issued queries. To configure the tags, follow one of the two options below.
3. To attribute costs to individual models via the `dbt_metadata` column in the `query_history_enriched` model, query comments and tags are added to all dbt-issued queries. Both query comments and tags are needed to collect the required metadata for the `dbt_queries` model.

To add the query tags, follow one of these two options:

Option 1: If running dbt < 1.2, create a folder named `macros` in your dbt project's top level directory (if it doesn't exist). Inside, make a new file called `query_tags.sql` with the following content:

@@ -32,7 +34,7 @@ Option 1: If running dbt < 1.2, create a folder named `macros` in your dbt proje
{% endmacro %}
```

Option 2: If running dbt >= 1.2, you can simply configure the dispatch search order in your `dbt_project.yml`.
Option 2: If running dbt >= 1.2, simply configure the dispatch search order in `dbt_project.yml`.

```yaml
dispatch:
@@ -43,8 +45,22 @@ dispatch:
- dbt
```

4. To configure the query comments, add the following config to `dbt_project.yml`.

```yaml
query-comment:
comment: '{{ dbt_snowflake_monitoring.get_query_comment(node) }}'
append: true # Snowflake removes prefixed comments.
```

That's it! All dbt-issued queries will now be tagged and will start appearing in the `dbt_queries` model.

### dbt Cloud URLs

If you're using dbt Cloud, the `dbt_cloud_job_url` and `dbt_cloud_run_url` columns in the `dbt_queries` model can be populated by setting the `dbt_cloud_account_id` variable. The account ID is the path segment between `/deploy/` and `/projects/` in any dbt Cloud URL.

By default, the URL prefix `https://cloud.getdbt.com/deploy/` is used. If you're using a different region of dbt Cloud, override this prefix by setting the `dbt_cloud_url` variable.
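
As a rough illustration of how these two variables combine, the sketch below extracts an account ID from a dbt Cloud URL and assembles job and run URLs the same way the model concatenates them. The helper names, the IDs, and the overridden prefix `https://dbt.example.com/deploy/` are all hypothetical.

```javascript
// Default prefix stated above; other regions would override it via dbt_cloud_url.
const DEFAULT_DBT_CLOUD_URL = "https://cloud.getdbt.com/deploy/";

// Hypothetical helper: take the path segment between /deploy/ and /projects/.
function accountIdFromUrl(url) {
  const match = /\/deploy\/([^/]+)\/projects\//.exec(url);
  return match ? match[1] : null;
}

// Hypothetical helpers mirroring the prefix + account + project + job/run concatenation.
function dbtCloudJobUrl({ accountId, projectId, jobId, urlPrefix = DEFAULT_DBT_CLOUD_URL }) {
  return `${urlPrefix}${accountId}/projects/${projectId}/jobs/${jobId}`;
}

function dbtCloudRunUrl({ accountId, projectId, runId, urlPrefix = DEFAULT_DBT_CLOUD_URL }) {
  return `${urlPrefix}${accountId}/projects/${projectId}/runs/${runId}`;
}

// Made-up IDs, for illustration only.
const exampleAccountId = accountIdFromUrl(
  "https://cloud.getdbt.com/deploy/1234/projects/5678/jobs/42"
);
console.log(exampleAccountId);
console.log(dbtCloudJobUrl({ accountId: exampleAccountId, projectId: "5678", jobId: "42" }));
console.log(
  dbtCloudRunUrl({
    accountId: exampleAccountId,
    projectId: "5678",
    runId: "987",
    urlPrefix: "https://dbt.example.com/deploy/", // placeholder for a regional prefix
  })
);
```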

## Package Alternatives & Maintenance

Prior to the release of this package, [snowflake-spend](https://gitlab.com/gitlab-data/snowflake_spend) by the GitLab data team was the only package available for monitoring Snowflake spend. According to their README, the package is currently maintained by the GitLab data team, but there does not appear to be any active development on it (as of January 2023).
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'dbt_snowflake_monitoring'
version: '2.0.2'
version: '3.0.0'
config-version: 2

profile: dbt_snowflake_monitoring
4 changes: 4 additions & 0 deletions integration_test_project/dbt_project.yml
@@ -21,3 +21,7 @@ dispatch:

vars:
account_locator: a09e1

query-comment:
comment: '{{ dbt_snowflake_monitoring.get_query_comment(node) }}'
append: true # Snowflake removes prefixed comments.
12 changes: 12 additions & 0 deletions macros/create_merge_objects_udf.sql
@@ -0,0 +1,12 @@
{% macro create_merge_objects_udf(relation) %}

create or replace function {{ relation.database }}.{{ relation.schema }}.merge_objects(obj1 variant, obj2 variant)
returns variant
language javascript
comment = 'Created by dbt-snowflake-monitoring dbt package.'
as
$$
return Object.assign(OBJ1, OBJ2);
$$

{% endmacro %}
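
The merge behaviour this UDF relies on can be sketched in plain JavaScript. `Object.assign` copies keys from left to right, so when both objects define the same key, the value from the second object wins. The `mergeObjects` helper and the sample metadata are hypothetical. Unlike the UDF, the sketch copies into a fresh object so its inputs are left untouched.

```javascript
// Hypothetical stand-in for the merge_objects UDF body.
function mergeObjects(obj1, obj2) {
  // Later sources overwrite earlier ones on key collisions.
  return Object.assign({}, obj1, obj2);
}

// Made-up metadata: which keys arrive via the comment and which via the tag
// is an assumption for illustration.
const sampleCommentMeta = { app: "dbt", node_refs: ["stg_orders", "stg_customers"], source: "comment" };
const sampleTagMeta = { is_incremental: true, source: "tag" };

const mergedMeta = mergeObjects(sampleCommentMeta, sampleTagMeta);
console.log(mergedMeta);
```

In `query_history_enriched`, the comment metadata is passed first and the query tag metadata second, so on a key collision the query tag value would take precedence.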
File renamed without changes.
3 changes: 3 additions & 0 deletions macros/query_comment.sql
@@ -0,0 +1,3 @@
{% macro get_query_comment(node) %}
{{ return(dbt_snowflake_query_tags.get_query_comment(node)) }}
{% endmacro %}
27 changes: 19 additions & 8 deletions models/dbt_queries.sql
@@ -12,6 +12,7 @@ select
dbt_metadata['materialized']::string as dbt_node_materialized,
dbt_metadata['is_incremental']::boolean as dbt_node_is_incremental,
dbt_metadata['node_alias']::string as dbt_node_alias,
dbt_metadata['node_tags']::array as dbt_node_tags,
iff(dbt_snowflake_query_tags_version >= '1.1.3', dbt_metadata['node_refs']::array, []) as dbt_node_refs, -- correct refs available from 1.1.3 onwards
dbt_metadata['node_database']::string as dbt_node_database,
dbt_metadata['node_schema']::string as dbt_node_schema,
@@ -27,14 +28,24 @@ select
dbt_metadata['dbt_cloud_run_id']::string as dbt_cloud_run_id,
dbt_metadata['dbt_cloud_run_reason_category']::string as dbt_cloud_run_reason_category,
dbt_metadata['dbt_cloud_run_reason']::string as dbt_cloud_run_reason,
min(start_time) over (partition by dbt_invocation_id, dbt_node_id order by start_time asc) as node_start_time,
{% if var('dbt_cloud_account_id', none) -%}
'https://cloud.getdbt.com/next/deploy/' || '{{ var('dbt_cloud_account_id') }}' || '/projects/' || dbt_cloud_project_id || '/jobs/' || dbt_cloud_job_id as dbt_cloud_job_url,
'https://cloud.getdbt.com/next/deploy/' || '{{ var('dbt_cloud_account_id') }}' || '/projects/' || dbt_cloud_project_id || '/runs/' || dbt_cloud_run_id as dbt_cloud_run_url,
{%- else -%}
'Required dbt_cloud_account_id variable not set' as dbt_cloud_job_url, -- noqa
'Required dbt_cloud_account_id variable not set' as dbt_cloud_run_url,
{%- endif %}
case
when dbt_cloud_project_id is not null
then
{% if var('dbt_cloud_account_id', none) -%}
'{{ var('dbt_cloud_url', 'https://cloud.getdbt.com/deploy/') }}' || '{{ var('dbt_cloud_account_id') }}' || '/projects/' || dbt_cloud_project_id || '/jobs/' || dbt_cloud_job_id
{%- else -%}
'Required dbt_cloud_account_id variable not set' -- noqa
{%- endif %}
end as dbt_cloud_job_url,
case
when dbt_cloud_project_id is not null
then
{% if var('dbt_cloud_account_id', none) -%}
'{{ var('dbt_cloud_url', 'https://cloud.getdbt.com/deploy/') }}' || '{{ var('dbt_cloud_account_id') }}' || '/projects/' || dbt_cloud_project_id || '/runs/' || dbt_cloud_run_id
{%- else -%}
'Required dbt_cloud_account_id variable not set' -- noqa
{%- endif %}
end as dbt_cloud_run_url,
* exclude dbt_metadata
from {{ ref('query_history_enriched') }}
where dbt_metadata is not null
38 changes: 29 additions & 9 deletions models/dbt_queries.yml
@@ -4,24 +4,44 @@ models:
- name: dbt_queries
description: Filtered version of query_history_enriched just for queries issued by dbt. Adds additional dbt-specific columns.
columns:
- name: dbt_version
description: Version of dbt in use.
- name: dbt_target_name
description: The target name for the dbt invocation.
- name: dbt_target_database
description: The target database for the dbt invocation.
- name: dbt_target_schema
description: The target schema for the dbt invocation.
- name: dbt_snowflake_query_tags_version
description: Version of the dbt-snowflake-query-tags package that generated the metadata
- name: dbt_invocation_id
description: The id of the dbt invocation.
- name: dbt_node_id
description: The identifier for the node that the query relates to.
- name: dbt_node_resource_type
description: The resource type of the node that the query relates to.
- name: dbt_node_name
description: The name of the node that the query relates to.
- name: dbt_node_materialized
description: The materialization of the node that the query relates to.
- name: dbt_node_is_incremental
description: The materialization of the node that the query relates to.
description: Boolean describing if the node run was incremental.
- name: dbt_node_alias
description: Alias set for the node.
- name: dbt_node_tags
description: Array of all tags set for the node.
- name: dbt_node_refs
description: Array of all refs used by the node.
- name: dbt_node_database
description: The database configured for the node.
- name: dbt_node_schema
description: The schema configured for the node.
- name: dbt_version
description: Version of dbt in use.
- name: dbt_project_name
description: Name of the dbt project.
- name: dbt_target_name
description: The target name for the dbt invocation.
- name: dbt_target_database
description: The target database for the dbt invocation.
- name: dbt_target_schema
description: The target schema for the dbt invocation.
- name: dbt_node_package_name
description: The package name of the dbt node.
- name: dbt_node_original_file_path
description: The file path of the dbt node.
- name: dbt_cloud_project_id
description: If using dbt Cloud, the ID of the project.
- name: dbt_cloud_job_id
9 changes: 6 additions & 3 deletions models/query_history_enriched.sql
@@ -1,7 +1,7 @@
{{ config(
materialized='incremental',
unique_key=['query_id', 'start_time'],
pre_hook="{{ create_regexp_replace_udf(this) }}"
pre_hook=["{{ create_regexp_replace_udf(this) }}", "{{ create_merge_objects_udf(this) }}"]
) }}

with
@@ -12,10 +12,13 @@ query_history as (
-- this removes comments enclosed by /* <comment text> */ and single line comments starting with -- and either ending with a new line or end of string
{{ this.database }}.{{ this.schema }}.dbt_snowflake_monitoring_regexp_replace(query_text, $$(/\*(.|\n|\r)*?\*/)|(--.*$)|(--.*(\n|\r))$$, '') as query_text_no_comments,

regexp_substr(query_text, '/\\*\\s({"app":\\s"dbt".*})\\s\\*/', 1, 1, 'ie') as _dbt_json_meta,
try_parse_json(regexp_substr(query_text, '/\\*\\s({"app":\\s"dbt".*})\\s\\*/', 1, 1, 'ie')) as _dbt_json_comment_meta,
case
when try_parse_json(query_tag)['dbt_snowflake_query_tags_version'] is not null then try_parse_json(query_tag)
else try_parse_json(_dbt_json_meta)
end as _dbt_json_query_tag_meta,
case
when _dbt_json_comment_meta is not null or _dbt_json_query_tag_meta is not null then
{{ this.database }}.{{ this.schema }}.merge_objects(coalesce(_dbt_json_comment_meta, {}), coalesce(_dbt_json_query_tag_meta, {}))
end as dbt_metadata

from {{ ref('stg_query_history') }}
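
The way this model derives `dbt_metadata` can be sketched outside Snowflake as follows. This is a loose JavaScript approximation, not the package's code: `tryParseJson` and `extractDbtMetadata` are hypothetical helpers, the regular expression is a hand translation of the `regexp_substr` pattern, and the sample query text and tag are made up.

```javascript
// Hand-translated pattern: a /* ... */ comment wrapping a JSON object that
// starts with {"app": "dbt". The "i" flag mirrors the case-insensitive match.
const DBT_COMMENT_PATTERN = /\/\*\s(\{"app":\s"dbt".*\})\s\*\//i;

// Rough analogue of try_parse_json: return null instead of throwing.
function tryParseJson(text) {
  if (text == null) return null;
  try {
    return JSON.parse(text);
  } catch {
    return null;
  }
}

function extractDbtMetadata(queryText, queryTag) {
  // Metadata carried in the query comment, if any.
  const match = DBT_COMMENT_PATTERN.exec(queryText);
  const commentMeta = tryParseJson(match ? match[1] : null);

  // Metadata carried in the query tag, accepted only when it has the version key.
  const parsedTag = tryParseJson(queryTag);
  const tagMeta =
    parsedTag !== null &&
    typeof parsedTag === "object" &&
    parsedTag.dbt_snowflake_query_tags_version != null
      ? parsedTag
      : null;

  // No dbt metadata at all: the query was not issued by dbt.
  if (commentMeta === null && tagMeta === null) return null;

  // Comment first, tag second, so tag values win on key collisions.
  return Object.assign({}, commentMeta ?? {}, tagMeta ?? {});
}

const sampleQueryText = 'select 1\n/* {"app": "dbt", "node_refs": ["stg_orders"]} */';
const sampleQueryTag = '{"dbt_snowflake_query_tags_version": "2.0.1", "is_incremental": true}';
console.log(extractDbtMetadata(sampleQueryText, sampleQueryTag));
```

Queries with neither a matching comment nor a versioned tag yield `null` here, which corresponds to the rows that `dbt_queries` filters out with `where dbt_metadata is not null`.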
2 changes: 1 addition & 1 deletion packages.yml
@@ -2,4 +2,4 @@ packages:
- package: dbt-labs/dbt_utils
version: [">=0.8.0", "<2.0.0"]
- package: get-select/dbt_snowflake_query_tags
version: 1.1.3
version: 2.0.1
