Skip to content

JorgenG/dbt-codegen

 
 

Repository files navigation

dbt-codegen

Macros that generate dbt code, and log it to the command line.

Contents

Installation instructions

New to dbt packages? Read more about them here.

  1. Include this package in your packages.yml file — check here for the latest version number.
  2. Run dbt deps to install the package.

Macros

generate_source (source)

This macro generates lightweight YAML for a Source, which you can then paste into a schema file.

Arguments

  • schema_name (required): The schema name that contains your source data
  • database_name (optional, default=target.database): The database that your source data is in.
  • table_names (optional, default=none): A list of tables that you want to generate the source definitions for.
  • generate_columns (optional, default=False): Whether you want to add the column names to your source definition.
  • include_descriptions (optional, default=False): Whether you want to add description placeholders to your source definition.
  • table_pattern (optional, default='%'): A table prefix / postfix that you want to subselect from all available tables within a given schema.
  • exclude (optional, default=''): A string you want to exclude from the selection criteria
  • name (optional, default=schema_name): The name of your source

Usage:

  1. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code
{{ codegen.generate_source('raw_jaffle_shop') }}

Alternatively, call the macro as an operation:

$ dbt run-operation generate_source --args 'schema_name: raw_jaffle_shop'

or

# for multiple arguments, use the dict syntax
$ dbt run-operation generate_source --args '{"schema_name": "jaffle_shop", "database_name": "raw", "table_names":["table_1", "table_2"]}'
  1. The YAML for the source will be logged to the command line
version: 2

sources:
  - name: raw_jaffle_shop
    database: raw
    tables:
      - name: customers
        description: ""
      - name: orders
        description: ""
      - name: payments
        description: ""
  1. Paste the output in to a schema .yml file, and refactor as required.

generate_base_model (source)

This macro generates the SQL for a base model, which you can then paste into a model.

Arguments:

  • source_name (required): The source you wish to generate base model SQL for.
  • table_name (required): The source table you wish to generate base model SQL for.
  • leading_commas (optional, default=False): Whether you want your commas to be leading (vs trailing).
  • case_sensitive_cols (optional, default=False): Whether your source table has case sensitive column names. If true, keeps the case of the column names from the source.

Usage:

  1. Create a source for the table you wish to create a base model on top of.
  2. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code
{{ codegen.generate_base_model(
    source_name='raw_jaffle_shop',
    table_name='customers'
) }}

Alternatively, call the macro as an operation:

$ dbt run-operation generate_base_model --args '{"source_name": "raw_jaffle_shop", "table_name": "customers"}'
  1. The SQL for a base model will be logged to the command line
with source as (

    select * from {{ source('raw_jaffle_shop', 'customers') }}

),

renamed as (

    select
        id,
        first_name,
        last_name,
        email,
        _elt_updated_at

    from source

)

select * from renamed
  1. Paste the output in to a model, and refactor as required.

generate_model_yaml (source)

This macro generates the YAML for a model, which you can then paste into a schema.yml file.

Arguments:

  • model_name (required): The model you wish to generate YAML for.
  • upstream_descriptions (optional, default=False): Whether you want to include descriptions for identical column names from upstream models.

Usage:

  1. Create a model.
  2. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code
{{ codegen.generate_model_yaml(
    model_name='customers'
) }}

Alternatively, call the macro as an operation:

$ dbt run-operation generate_model_yaml --args '{"model_name": "customers"}'
  1. The YAML for a base model will be logged to the command line
version: 2

models:
  - name: customers
    columns:
      - name: customer_id
        description: ""
      - name: customer_name
        description: ""
  1. Paste the output in to a schema.yml file, and refactor as required.

generate_model_import_ctes (source)

This macro generates the SQL for a given model with all references pulled up into import CTEs, which you can then paste back into the model.

Arguments:

  • model_name (required): The model you wish to generate SQL with import CTEs for.
  • leading_commas (optional, default = false): Whether you want your commas to be leading (vs trailing).

Usage:

  1. Create a model with your original SQL query
  2. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code
{{ codegen.generate_model_import_ctes(
    model_name = 'my_dbt_model'
) }}

Alternatively, call the macro as an operation:

$ dbt run-operation generate_model_import_ctes --args '{"model_name": "my_dbt_model"}'
  1. The new SQL - with all references pulled up into import CTEs - will be logged to the command line
with customers as (

    select * from {{ ref('stg_customers') }}

),

orders as (

    select * from {{ ref('stg_orders') }}

),

payments as (

    select * from {{ ref('stg_payments') }}

),

customer_orders as (

    select
        customer_id,
        min(order_date) as first_order,
        max(order_date) as most_recent_order,
        count(order_id) as number_of_orders
    from orders
    group by customer_id

),

customer_payments as (

    select
        orders.customer_id,
        sum(amount) as total_amount
    from payments
    left join orders on
         payments.order_id = orders.order_id
    group by orders.customer_id

),

final as (

    select
        customers.customer_id,
        customers.first_name,
        customers.last_name,
        customer_orders.first_order,
        customer_orders.most_recent_order,
        customer_orders.number_of_orders,
        customer_payments.total_amount as customer_lifetime_value
    from customers
    left join customer_orders
        on customers.customer_id = customer_orders.customer_id
    left join customer_payments
        on  customers.customer_id = customer_payments.customer_id

)

select * from final
  1. Replace the contents of the model's current SQL file with the compiled or logged code