dbt-spark 1.0.1rc0 (Release TBD)

Fixes

Closes the connection properly (#280, #285)

Contributors

@ueshin (#285)

dbt-spark 1.0.0 (Release TBD)

Fixes

Incremental materialization corrected to respect full_refresh config, by using should_full_refresh() macro (#260, #262)

Contributors

@grindheim (#262)

dbt-spark 1.0.0rc2 (November 24, 2021)

Features

Add support for Apache Hudi (hudi file format) which supports incremental merge strategies (#187, #210)

Under the hood

Refactor seed macros: remove duplicated code from dbt-core, and provide clearer logging of SQL parameters that differ by connection method (#249, #250)
Replace sample_profiles.yml with profile_template.yml, for use with new dbt init (#247)

Contributors

@vingov (#210)

dbt-spark 1.0.0rc1 (November 10, 2021)

Under the hood

Remove official support for Python 3.6, which is reaching end of life on December 23, 2021 (dbt-core#4134, #253)
Add support for structured logging (#251)

dbt-spark 0.21.1 (Release TBD)

dbt-spark 0.21.1rc1 (November 3, 2021)

Fixes

Fix --store-failures for tests, by suppressing irrelevant error in comment_clause() macro (#232, #233)
Add support for on_schema_change config in incremental models: ignore, fail, append_new_columns. For sync_all_columns, removing columns is not supported by Apache Spark or Delta Lake (#198, #226, #229)
Add persist_docs call to incremental model (#224, #234)
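The on_schema_change options listed above can be illustrated with a small sketch. This is not the adapter's code: resolve_target_columns is a hypothetical helper that only models which columns the target table ends up with, and it assumes "fail" raises on any difference between source and target columns. Because removing columns is not supported, the sketch simply rejects sync_all_columns; the adapter's actual behavior for that option may differ.

```python
def resolve_target_columns(target_cols, source_cols, on_schema_change="ignore"):
    """Hypothetical model of on_schema_change; returns the target's columns after a run."""
    if on_schema_change == "sync_all_columns":
        # Assumption: rejected outright here, since syncing may require dropping columns.
        raise NotImplementedError("sync_all_columns is not modelled in this sketch")
    if on_schema_change not in {"ignore", "fail", "append_new_columns"}:
        raise ValueError(f"unknown on_schema_change value: {on_schema_change!r}")

    new_cols = [c for c in source_cols if c not in target_cols]
    missing_cols = [c for c in target_cols if c not in source_cols]

    if on_schema_change == "fail" and (new_cols or missing_cols):
        raise RuntimeError("source and target schemas are out of sync")
    if on_schema_change == "append_new_columns":
        # New columns are added; existing columns are never removed.
        return list(target_cols) + new_cols
    # "ignore" (or "fail" with no differences): the target schema is left as it is.
    return list(target_cols)
```

With target columns ["id"] and source columns ["id", "name"], "ignore" leaves the target untouched, while "append_new_columns" adds name to it.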

Contributors

@binhnefits (#234)

dbt-spark 0.21.0 (October 4, 2021)

Fixes

Enhance the get_columns_in_relation method to work around a bug in open source Delta Lake, which doesn't return schema details in the output of the show table extended in databasename like '*' query. This affects dbt snapshots when the file format is open source Delta Lake (#207)
Properly parse columns when there are struct fields, to avoid treating inner fields as columns (#202)

Under the hood

Add unique_field to better understand adapter adoption in anonymous usage tracking (#211)

Contributors

@harryharanb (#207)
@SCouto (#204)

dbt-spark 0.21.0b2 (August 20, 2021)

Fixes

Add pyodbc import error message to dbt.exceptions.RuntimeException to get more detailed information when running dbt debug (#192)
Add support for ODBC Server Side Parameters, allowing options that need to be set with the SET statement to be used (#201)
Add retry_all configuration setting to retry all connection issues, not just those that the _is_retryable_error function identifies as retryable (#194)
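A minimal sketch of how a retry_all flag can widen retry behavior, under stated assumptions: connect_with_retries, max_retries, and wait_seconds are hypothetical names, and the is_retryable callback merely stands in for the adapter's _is_retryable_error check. It is not taken from the dbt-spark source.

```python
import time


def connect_with_retries(connect, is_retryable, retry_all=False,
                         max_retries=2, wait_seconds=0.0):
    """Call `connect` until it succeeds or the retry budget is used up.

    Hypothetical helper: `is_retryable` plays the role of a check such as
    _is_retryable_error, and `retry_all` bypasses that check entirely.
    """
    attempt = 0
    while True:
        try:
            return connect()
        except Exception as exc:
            should_retry = retry_all or is_retryable(exc)
            if not should_retry or attempt >= max_retries:
                raise  # not retryable, or no retries left
            attempt += 1
            time.sleep(wait_seconds)
```

With retry_all=False, an error that the callback does not recognize is raised on the first failure; with retry_all=True, the same error is retried up to max_retries times.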

Contributors

dbt-spark 0.21.0b1 (August 3, 2021)

dbt-spark 0.20.1 (August 2, 2021)

dbt-spark 0.20.1rc1 (August 2, 2021)

Fixes

Fix get_columns_in_relation when called on models created in the same run (#196, #197)

Contributors

@ali-tny (#197)

dbt-spark 0.20.0 (July 12, 2021)

dbt-spark 0.20.0rc2 (July 7, 2021)

Features

Add support for merge_update_columns config in merge-strategy incremental models (#183, #184)
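To illustrate what merge_update_columns changes, here is a rough sketch that models a merge on plain Python dictionaries. The helper merge_rows and the sample column names are hypothetical. The sketch assumes that matched rows have only the listed columns updated, that all columns are updated when no list is given, and that unmatched rows are inserted whole.

```python
def merge_rows(target, source, unique_key, merge_update_columns=None):
    """Hypothetical model of a merge-strategy incremental run over lists of dicts."""
    merged = {row[unique_key]: dict(row) for row in target}
    for row in source:
        key = row[unique_key]
        if key not in merged:
            merged[key] = dict(row)  # no match: insert the full row
            continue
        # Match: update only the configured columns, or every column by default.
        columns = merge_update_columns or list(row)
        for col in columns:
            merged[key][col] = row[col]
    return list(merged.values())


target = [{"id": 1, "name": "a", "status": "old"}]
source = [{"id": 1, "name": "b", "status": "new"},
          {"id": 2, "name": "c", "status": "new"}]
result = merge_rows(target, source, "id", merge_update_columns=["status"])
```

In this example the matched row keeps name "a" and only its status changes, while the row with id 2 is inserted as is.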

Fixes

Fix column-level persist_docs on Delta tables, add tests (#180)

dbt-spark 0.20.0rc1 (June 8, 2021)

Features

Allow user to specify use_ssl (#169)
Allow setting table OPTIONS using config (#171)
Add support for column-level persist_docs on Delta tables (#84, #170)

Fixes

Cast table_owner to string to avoid errors generating docs (#158, #159)
Explicitly cast column types when inserting seeds (#139, #166)

Under the hood

Parse information returned by list_relations_without_caching macro to speed up catalog generation (#93, #160)
More flexible host passing, https:// can be omitted (#153)
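One way to accept a host with or without the https:// prefix is to normalize it before building the connection URL. The sketch below is an assumption about how such normalization could look, not the adapter's implementation; normalize_host and the example hostname are hypothetical.

```python
def normalize_host(host):
    """Hypothetical helper: return the bare hostname, with or without 'https://' given."""
    host = host.strip()
    prefix = "https://"
    if host.lower().startswith(prefix):
        host = host[len(prefix):]
    return host.rstrip("/")
```

Both normalize_host("https://example.com/") and normalize_host("example.com") yield the same bare hostname, so later code can add the scheme exactly once.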

Contributors

dbt-spark 0.19.1 (April 2, 2021)

dbt-spark 0.19.1b2 (February 26, 2021)

Under the hood

Update serialization calls to use new API in dbt-core 0.19.1b2 (#150)

dbt-spark 0.19.0.1 (February 26, 2021)

Fixes

Fix package distribution to include incremental model materializations (#151, #152)

dbt-spark 0.19.0 (February 21, 2021)

Breaking changes

Incremental models have incremental_strategy: append by default. This strategy adds new records without updating or overwriting existing records. To update or overwrite existing records, use merge or insert_overwrite instead, depending on the file format, connection method, and attributes of your underlying data. dbt will try to raise a helpful error if you configure a strategy that is not supported for a given file format or connection. (#140, #141)
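The difference between the three strategies can be sketched on in-memory rows. This is a simplified model, not adapter code: apply_incremental is a hypothetical function, merge is assumed to replace rows that share a unique_key, and insert_overwrite is assumed to replace every partition present in the new data (or the whole table when no partition column is given).

```python
def apply_incremental(existing, new, strategy="append",
                      unique_key=None, partition_by=None):
    """Hypothetical model of incremental strategies over lists of dicts."""
    if strategy == "append":
        # New records are added; existing records are never touched.
        return existing + new
    if strategy == "merge":
        if unique_key is None:
            raise ValueError("this sketch requires unique_key for merge")
        merged = {row[unique_key]: row for row in existing}
        merged.update({row[unique_key]: row for row in new})
        return list(merged.values())
    if strategy == "insert_overwrite":
        if partition_by is None:
            return list(new)  # no partitions: the whole table is replaced
        touched = {row[partition_by] for row in new}
        kept = [row for row in existing if row[partition_by] not in touched]
        return kept + new
    raise ValueError(f"unknown strategy: {strategy!r}")


existing = [{"id": 1, "day": "mon", "v": 1}, {"id": 2, "day": "tue", "v": 2}]
new = [{"id": 2, "day": "tue", "v": 20}]
appended = apply_incremental(existing, new)  # id 2 now appears twice
merged = apply_incremental(existing, new, "merge", unique_key="id")
```

Under append, the updated record for id 2 sits alongside the old one, which is why merge or insert_overwrite is needed when existing records must change.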

Fixes

Capture hard-deleted records in snapshot merge, when invalidate_hard_deletes config is set (#109, #126)

dbt-spark 0.19.0rc1 (January 8, 2021)

Breaking changes

Users of the http and thrift connection methods need to install extra requirements: pip install dbt-spark[PyHive] (#109, #126)

Under the hood

Enable CREATE OR REPLACE support when using Delta. Instead of dropping and recreating the table, it will keep the existing table, and add a new version as supported by Delta. This will ensure that the table stays available when running the pipeline, and you can track the history.
Add changelog, issue templates (#119, #120)

Fixes

Handle case of 0 retries better for HTTP Spark Connections (#132)

Contributors

@danielvdende (#132)
@Fokko (#125)

dbt-spark 0.18.1.1 (November 13, 2020)

Fixes

Fix extras_require typo to enable pip install dbt-spark[ODBC] (#121, #122)

dbt-spark 0.18.1 (November 6, 2020)

Features

Allows users to specify auth and kerberos_service_name (#107)
Add support for ODBC driver connections to Databricks clusters and endpoints (#116)

Under the hood

Updated README links (#115)
Support complete atomic overwrite of non-partitioned incremental models (#117)
Update to support dbt-core 0.18.1 (#110, #118)

Contributors

dbt-spark 0.18.0 (September 18, 2020)

Under the hood

Make a number of changes to support dbt-adapter-tests (#103)
Update to support dbt-core 0.18.0. Run CI tests against local Spark, Databricks (#105)