Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add include_data_types argument to generate_model_yaml macro #122

Merged
merged 26 commits into from
Sep 25, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
0c3c737
Add column data types
Apr 7, 2023
e2f5b09
Update tests to include data_type field
Apr 7, 2023
6acccf5
Update macros to produce lowercase strings
Apr 8, 2023
ad019b2
Update generate_source_yaml to lower data_type field
Apr 8, 2023
0de3eb7
Add include_data_types arg to generate_model_yaml
Apr 8, 2023
12f21b4
Update default include_data_types in generate_source
Apr 8, 2023
4ff820a
Update README and changelog
Apr 8, 2023
43e7b7d
source -> model
Apr 8, 2023
a76c139
Use dbt.formt_column or fall back to vendored version
Apr 28, 2023
f638309
Fix typo
Apr 28, 2023
1cebade
Add missing spaces
Apr 28, 2023
39b88ab
Remove args from text_type_value and integer_type_value
Apr 28, 2023
1e55d67
Final newline
Apr 28, 2023
c1f5344
We need these parentheses
Apr 28, 2023
354865f
snowflake wants varchar
Apr 28, 2023
8320981
elif
Apr 28, 2023
e61af4c
Merge branch 'main' into add_types_to_generate_yaml
dbeatty10 Aug 30, 2023
5e08a1a
Vendor the `format_column` macro from dbt-core into dbt-codegen
dbeatty10 Sep 25, 2023
b2f463c
Completely rely on vendored version of the `format_column` macro
dbeatty10 Sep 25, 2023
09ba319
Default `include_data_types=True` for the `generate_source` macro
dbeatty10 Sep 25, 2023
db521b2
Alert to change in behaviors for `generate_source`
dbeatty10 Sep 25, 2023
d5aaa2b
Fix example by using a boolean instead of a string that will not work
dbeatty10 Sep 25, 2023
8b4f258
Example how to configure `include_data_types` to align with previous …
dbeatty10 Sep 25, 2023
2471363
Standardize formatting of default values
dbeatty10 Sep 25, 2023
c3abd44
Fix format_column is undefined error
dbeatty10 Sep 25, 2023
6f6aea6
Enable custom overrides of data type formatting via multiple dispatch
dbeatty10 Sep 25, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 23 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,11 +11,33 @@
--->

# Unreleased

## 🚨 Breaking change

`include_data_types` parameter added to `generate_model_yaml` and behavior changed for `generate_source`. Both default to `true`
and are lowercase to align with the dbt style guide. Scale & precision are **not** included. Previous logic for `generate_source` defaulted to `false` and the resulting data types were uppercase and included scale & precision ([#122](https://github.com/dbt-labs/dbt-codegen/pull/122)).

[Dispatch](https://docs.getdbt.com/reference/dbt-jinja-functions/dispatch) can be used to utilize the column data type formatting of previous versions. Namely, by adding this macro to your project:
```sql
{% macro default__data_type_format_source(column) %}
{{ return(column.data_type | upper) }}
{% endmacro %}
```

And then adding this within `dbt_project.yml`:
```yaml
dispatch:
- macro_namespace: codegen
search_order: ['my_project', 'codegen']
```


## New features
- Addition of the [create_base_models](macros/create_base_models.sql)
This macro generates a series of terminal commands (appended w) bash script which creates a new file in your dbt project based off the results of the [generate_base_model](macros/generate_base_model.sql) macro. Therefore, instead of outputting in the terminal, it will create the file for you.
- Add `include_data_types` flag to `generate_source` macro ([#76](https://github.com/dbt-labs/dbt-codegen/pull/76))
- Add `get_models` macro in helper macros. This macro retrieves a list of models with specified prefix at the specified directory. It is designed to make creating yamls for multiple models easier.
- Add `include_data_types` flag to `generate_model_yaml` macro ([#122](https://github.com/dbt-labs/dbt-codegen/pull/122))
- Add optional arguments to include database and schema properties in `sources.yml` generated from `generate_source` ([#123](https://github.com/dbt-labs/dbt-codegen/issues/123))

## Fixes
Expand All @@ -28,6 +50,7 @@ This macro generates a series of terminal commands (appended w) bash script whic
## Contributors:
- [@fivetran-joemarkiewicz](https://github.com/fivetran-joemarkiewicz) (#83)
- [@GSokol](https://github.com/GSokol) (#76)
- [@linbug](https://github.com/linbug) (#120)

# dbt-codegen v0.9.0

Expand Down
18 changes: 14 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ source data is in.
column names to your source definition.
* `include_descriptions` (optional, default=False): Whether you want to add
description placeholders to your source definition.
* `include_data_types` (optional, default=False): Whether you want to add data
* `include_data_types` (optional, default=True): Whether you want to add data
linbug marked this conversation as resolved.
Show resolved Hide resolved
types to your source columns definitions.
* `table_pattern` (optional, default='%'): A table prefix / postfix that you
want to subselect from all available tables within a given schema.
Expand Down Expand Up @@ -87,10 +87,16 @@ or
$ dbt run-operation generate_source --args '{"schema_name": "jaffle_shop", "database_name": "raw", "table_names":["table_1", "table_2"]}'
```

Including data types:
or if you want to include column names and data types:

```
$ dbt run-operation generate_source --args '{"schema_name": "jaffle_shop", "generate_columns": "true", "include_data_types": "true"}'
$ dbt run-operation generate_source --args '{"schema_name": "jaffle_shop", "generate_columns": true}'
```

or if you want to include column names without data types (the behavior dbt-codegen <= v0.9.0):

```
$ dbt run-operation generate_source --args '{"schema_name": "jaffle_shop", "generate_columns": true, "include_data_types": false}'
```

2. The YAML for the source will be logged to the command line
Expand Down Expand Up @@ -208,6 +214,7 @@ schema.yml file.
### Arguments:
* `model_names` (required): The model(s) you wish to generate YAML for.
* `upstream_descriptions` (optional, default=False): Whether you want to include descriptions for identical column names from upstream models.
* `include_data_types` (optional, default=True): Whether you want to add data types to your model column definitions.

### Usage:
1. Create a model.
Expand Down Expand Up @@ -241,10 +248,13 @@ version: 2

models:
- name: customers
description: ""
columns:
- name: customer_id
data_type: integer
description: ""
- name: customer_name
data_type: text
description: ""
```

Expand All @@ -255,7 +265,7 @@ This macro generates the SQL for a given model with all references pulled up int

### Arguments:
* `model_name` (required): The model you wish to generate SQL with import CTEs for.
* `leading_commas` (optional, default = false): Whether you want your commas to be leading (vs trailing).
* `leading_commas` (optional, default=False): Whether you want your commas to be leading (vs trailing).

### Usage:
1. Create a model with your original SQL query
Expand Down
6 changes: 3 additions & 3 deletions integration_tests/macros/integer_type_value.sql
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
{%- macro integer_type_value() -%}
{%- if target.type == "snowflake" -%}
NUMBER(38,0)
number
{%- elif target.type == "bigquery" -%}
INT64
int64
{%- else -%}
INTEGER
integer
{%- endif -%}
{%- endmacro -%}
12 changes: 6 additions & 6 deletions integration_tests/macros/text_type_value.sql
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
{%- macro text_type_value(text_length) -%}
{%- if target.type == "redshift" -%}
CHARACTER VARYING({{ text_length }})
{%- macro text_type_value() -%}
{%- if target.type == "redshift"-%}
character varying
{%- elif target.type == "snowflake" -%}
CHARACTER VARYING(16777216)
varchar
{%- elif target.type == "bigquery" -%}
STRING
string
{%- else -%}
TEXT
text
{%- endif -%}
{%- endmacro -%}
3 changes: 2 additions & 1 deletion integration_tests/tests/test_generate_model_struct_yaml.sql
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,8 @@
) %}

{% set actual_source_yaml = codegen.generate_model_yaml(
model_names=['model_struct']
model_names=['model_struct'],
include_data_types=False
)
%}

Expand Down
2 changes: 2 additions & 0 deletions integration_tests/tests/test_generate_model_yaml.sql
Original file line number Diff line number Diff line change
Expand Up @@ -11,9 +11,11 @@ models:
description: ""
columns:
- name: col_a
data_type: {{ integer_type_value() }}
description: ""

- name: col_b
data_type: {{ text_type_value() }}
description: ""

{% endset %}
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
{% set actual_model_yaml = codegen.generate_model_yaml(
model_names=['data__a_relation','data__b_relation']
model_names=['data__a_relation','data__b_relation'],
include_data_types=False
)
%}

Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
{% set actual_model_yaml = codegen.generate_model_yaml(
model_names=['child_model'],
upstream_descriptions=True
upstream_descriptions=True,
include_data_types=False
)
%}

Expand Down
12 changes: 6 additions & 6 deletions integration_tests/tests/test_generate_source_all_args.sql
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ sources:
data_type: {{ integer_type_value() }}
description: ""
- name: col_b
data_type: {{ text_type_value(1) }}
data_type: {{ text_type_value() }}
description: ""

- name: data__b_relation
Expand All @@ -43,26 +43,26 @@ sources:
data_type: {{ integer_type_value() }}
description: ""
- name: col_b
data_type: {{ text_type_value(1) }}
data_type: {{ text_type_value() }}
description: ""

- name: data__campaign_analytics
description: ""
columns:
- name: source
data_type: {{ text_type_value(8) }}
data_type: {{ text_type_value() }}
description: ""
- name: medium
data_type: {{ text_type_value(8) }}
data_type: {{ text_type_value() }}
description: ""
- name: source_medium
data_type: {{ text_type_value(2) }}
data_type: {{ text_type_value() }}
description: ""
- name: analytics
data_type: {{ integer_type_value() }}
description: ""
- name: col_x
data_type: {{ text_type_value(1) }}
data_type: {{ text_type_value() }}
description: ""

{% endset %}
Expand Down
3 changes: 2 additions & 1 deletion integration_tests/tests/test_generate_source_some_tables.sql
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,8 @@
database_name=target.database,
table_names=['data__a_relation'],
generate_columns=True,
include_descriptions=True
include_descriptions=True,
include_data_types=False
) %}


Expand Down
11 changes: 7 additions & 4 deletions macros/generate_model_yaml.sql
Original file line number Diff line number Diff line change
@@ -1,23 +1,26 @@
{% macro generate_column_yaml(column, model_yaml, column_desc_dict, parent_column_name="") %}
{% macro generate_column_yaml(column, model_yaml, column_desc_dict, include_data_types, parent_column_name="") %}
{% if parent_column_name %}
{% set column_name = parent_column_name ~ "." ~ column.name %}
{% else %}
{% set column_name = column.name %}
{% endif %}

{% do model_yaml.append(' - name: ' ~ column_name | lower ) %}
{% if include_data_types %}
{% do model_yaml.append(' data_type: ' ~ codegen.data_type_format_model(column)) %}
{% endif %}
{% do model_yaml.append(' description: "' ~ column_desc_dict.get(column.name | lower,'') ~ '"') %}
{% do model_yaml.append('') %}

{% if column.fields|length > 0 %}
{% for child_column in column.fields %}
{% set model_yaml = codegen.generate_column_yaml(child_column, model_yaml, column_desc_dict, parent_column_name=column_name) %}
{% set model_yaml = codegen.generate_column_yaml(child_column, model_yaml, column_desc_dict, include_data_types, parent_column_name=column_name) %}
{% endfor %}
{% endif %}
{% do return(model_yaml) %}
{% endmacro %}

{% macro generate_model_yaml(model_names=[], upstream_descriptions=False) %}
{% macro generate_model_yaml(model_names=[], upstream_descriptions=False, include_data_types=True) %}

{% set model_yaml=[] %}

Expand All @@ -38,7 +41,7 @@
{% set column_desc_dict = codegen.build_dict_column_descriptions(model) if upstream_descriptions else {} %}

{% for column in columns %}
{% set model_yaml = codegen.generate_column_yaml(column, model_yaml, column_desc_dict) %}
{% set model_yaml = codegen.generate_column_yaml(column, model_yaml, column_desc_dict, include_data_types) %}
{% endfor %}
{% endfor %}
{% endif %}
Expand Down
4 changes: 2 additions & 2 deletions macros/generate_source.sql
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@


---
{% macro generate_source(schema_name, database_name=target.database, generate_columns=False, include_descriptions=False, include_data_types=False, table_pattern='%', exclude='', name=schema_name, table_names=None, include_database=False, include_schema=False) %}
{% macro generate_source(schema_name, database_name=target.database, generate_columns=False, include_descriptions=False, include_data_types=True, table_pattern='%', exclude='', name=schema_name, table_names=None, include_database=False, include_schema=False) %}

{% set sources_yaml=[] %}
{% do sources_yaml.append('version: 2') %}
Expand Down Expand Up @@ -62,7 +62,7 @@
{% for column in columns %}
{% do sources_yaml.append(' - name: ' ~ column.name | lower ) %}
{% if include_data_types %}
{% do sources_yaml.append(' data_type: ' ~ (column.data_type | upper ) ) %}
{% do sources_yaml.append(' data_type: ' ~ codegen.data_type_format_source(column)) %}
{% endif %}
{% if include_descriptions %}
{% do sources_yaml.append(' description: ""' ) %}
Expand Down
22 changes: 21 additions & 1 deletion macros/helpers/helpers.sql
Original file line number Diff line number Diff line change
Expand Up @@ -57,4 +57,24 @@
{% endfor %}
{% endif %}
{{ return(model_names) }}
{% endmacro %}
{% endmacro %}

{% macro data_type_format_source(column) -%}
{{ return(adapter.dispatch('data_type_format_source', 'codegen')(column)) }}
{%- endmacro %}

{# format a column data type for a source #}
{% macro default__data_type_format_source(column) %}
{% set formatted = codegen.format_column(column) %}
{{ return(formatted['data_type'] | lower) }}
{% endmacro %}

{% macro data_type_format_model(column) -%}
{{ return(adapter.dispatch('data_type_format_model', 'codegen')(column)) }}
{%- endmacro %}

{# format a column data type for a model #}
{% macro default__data_type_format_model(column) %}
{% set formatted = codegen.format_column(column) %}
{{ return(formatted['data_type'] | lower) }}
{% endmacro %}
5 changes: 5 additions & 0 deletions macros/vendored/dbt_core/format_column.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
{% macro format_column(column) -%}
{% set data_type = column.dtype %}
{% set formatted = column.column.lower() ~ " " ~ data_type %}
{{ return({'name': column.name, 'data_type': data_type, 'formatted': formatted}) }}
{%- endmacro -%}