This package was built to help you monitor the costs involved in using Databricks as a data platform. It makes it easier to keep track of the costs related to DBUs and clusters.
Before contributing, determine whether your change to this repository is a release (breaking changes), a feature (new functionality) or a patch (bug fixes). With that information, name your branch like this:
- Release: release/<branch-name> or major/<branch-name> (Release/<branch-name> and Major/<branch-name> work as well)
- Feature: feature/<branch-name> or minor/<branch-name> (capitalised prefixes work as well)
- Patch: patch/<branch-name>, fix/<branch-name> or hotfix/<branch-name> (capitalised prefixes work as well)
0.3.0 - For Snowflake warehouses
0.3.1 - For Databricks warehouses
dbt
dbt version >= 1.0.0
dbt_utils package:
dbt-labs/dbt_utils version: 1.1.1
Include the following package version in your packages.yml file.
packages:
  - git: https://github.com/techindicium/dbt-databricks-billing
    revision: # 0.3.0 or 0.3.1
Then run dbt deps to finish the setup.
This package uses four main sources:
- list_prices
- usage
- warehouses
- clusters
list_prices and usage are Databricks system tables, located inside system.billing.
The warehouses and clusters tables contain information captured by a REST API request.
More information about the endpoints can be found here:
You can use our adf tap to extract this information:
Platform Meltano on Databricks
The location of the raw data used by this package is configurable, so it's important to complete the following information in dbt_project.yml:
models:
  databricks_billing:
    marts:
      +materialized: table
    staging:
      +materialized: view
vars:
  databricks_billing_database: # name of the database
  databricks_billing_schema: # name of the schema
In this case, it's important to provide the names of the catalog and schema where the warehouses and clusters tables are located.
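As a rough illustration of how these two variables are typically consumed, a dbt source definition can read them with the var() function. This is only a sketch: the file name and the source name databricks_billing_raw are assumptions made for the example and are not taken from the package itself.

```yaml
# Hypothetical sources.yml; the source name "databricks_billing_raw" is illustrative only.
version: 2

sources:
  - name: databricks_billing_raw
    # Catalog and schema are resolved from the vars set in dbt_project.yml.
    database: "{{ var('databricks_billing_database') }}"
    schema: "{{ var('databricks_billing_schema') }}"
    tables:
      - name: warehouses
      - name: clusters
```

With a definition along these lines, pointing the package at a different catalog or schema only requires changing the two vars, not the models.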
We strongly recommend running this package in a job separate from the one that runs your production models.
This prevents errors in the package from causing production models to fail.
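One possible way to keep the two jobs apart is a YAML selector that excludes the package from the production run. The sketch below assumes the package name is databricks_billing (as used under models: in dbt_project.yml); the selector name production_without_billing is made up for the example.

```yaml
# Hypothetical selectors.yml placed at the root of your dbt project.
selectors:
  - name: production_without_billing
    description: "Every model except those that belong to the databricks_billing package"
    definition:
      union:
        # Start from all models in the project...
        - method: fqn
          value: "*"
        # ...then leave out anything that comes from the billing package.
        - exclude:
            - method: package
              value: databricks_billing
```

Under these assumptions, the production job could run dbt run --selector production_without_billing, while a separate job runs dbt run --select package:databricks_billing.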