get table & column descriptions from information_schema metadata columns when generating yml #119

Closed
jakub-auger opened this issue Mar 2, 2023 · 4 comments
Labels
enhancement New feature or request wontfix This will not be worked on

Comments

@jakub-auger

Describe the feature

As a dev, I want to leverage the metadata that already exists in my database, namely the table and column descriptions captured in the information_schema.tables and information_schema.columns metadata tables.
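For illustration, the metadata in question could be pulled with a query along these lines. This is only a sketch: it assumes the warehouse exposes a `comment` column on `information_schema.columns` (the column name differs between warehouses), and the catalog, schema, and table names are hypothetical.

```python
def column_comments_query(catalog: str, schema: str, table: str) -> str:
    """Build a query that lists column descriptions for one table.

    Assumption: the description lives in a column called `comment`.
    The arguments are interpolated as-is, so only pass trusted values.
    """
    return (
        "select table_name, column_name, comment\n"
        f"from {catalog}.information_schema.columns\n"
        f"where table_schema = '{schema}' and table_name = '{table}'\n"
        "order by ordinal_position"
    )


# Hypothetical names, for illustration only.
query = column_comments_query("main", "sales", "orders")
print(query)
```

The rows returned by such a query would still need to be written into the generated yml, which is the part this issue asks codegen to handle.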

Describe alternatives you've considered

Manually scripting out the info on a per-table basis.

Additional context

The current use case is on Databricks with Unity Catalog, but this would be useful elsewhere.

Who will this benefit?

The source tables are very wide (100s of columns), so manually entering descriptions into a yml file in Notepad isn't feasible.

It is more robust to enter the info on the tables and columns via the Databricks UI. I then need to pull it out to expose it in the dbt docs.

Are you interested in contributing this feature?

@jakub-auger jakub-auger added the enhancement New feature or request label Mar 2, 2023
@kellybh123

kellybh123 commented Apr 14, 2023

I have a similar ask. I have a bunch of source tables in BigQuery that already have descriptions, and I would like to port those over to the dbt model ymls as well. But when I create a base/source model yml using the generate_model_yaml macro, there is no way to tell it: hey, can you look for descriptions on the source table and, if present, automatically put them in those first models' ymls? I understand that once the descriptions are in the first model I can use upstream_descriptions in all downstream models, but I have 1000+ columns that I am not trying to copy-paste over, and they already have reasonably adequate descriptions.
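For context, the downstream step mentioned above might be invoked roughly as follows. This sketch only builds the command line; it assumes an installed dbt-codegen version whose generate_model_yaml macro accepts `model_names` and `upstream_descriptions`, and the model name is hypothetical.

```python
import json
import shlex


def codegen_command(model_names, upstream_descriptions=True):
    """Build the argv for `dbt run-operation generate_model_yaml`."""
    args = {
        "model_names": list(model_names),
        "upstream_descriptions": upstream_descriptions,
    }
    # --args takes a YAML string; JSON is a subset of YAML, so json.dumps works.
    return ["dbt", "run-operation", "generate_model_yaml", "--args", json.dumps(args)]


# Hypothetical model name, for illustration only.
cmd = codegen_command(["stg_orders"])
print(shlex.join(cmd))
```

This only carries descriptions forward from upstream models that already have them in yml; it does not read descriptions from the warehouse, which is the gap described here.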

@Rkejji

Rkejji commented Jun 27, 2023

@kellybh123 I am encountering the same issue. Note that in dbt-bigquery, the column class that codegen uses to retrieve columns from BigQuery does not have a description attribute.

@aaronsteers

aaronsteers commented Dec 6, 2023

I see we have some upvotes here and I'm interested also.

Any objection from maintainers about adding descriptions to columns and tables if those can be discovered from the table/column metadata?

Aka - if capacity opens up, would a contribution here be accepted? 😄

@dbeatty10
Contributor

Thank you for opening this, @jakub-auger, and thanks to all of you who have shown interest in it!

It's not a priority for us to add this to dbt-codegen at this time, so we won't be accepting contributions.

Alternative approaches are sketched out here:
dbt-labs/dbt-core#9198 (comment)

TLDR

Upon the next release of dbt-osmosis, you should be able to do something like this:

dbt docs generate
dbt-osmosis yaml document --catalog-file target/catalog.json

Or you (or a different 3rd party tool) can utilize programmatic invocations to generate a Catalog artifact and use it to scaffold your YAML files with comments included.
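As a rough sketch of the second option, the script below scaffolds a models yml from a Catalog artifact that has already been loaded into a dict. It assumes the usual `catalog.json` layout, where each entry under `nodes` has a `metadata` block with `name` and `comment` and a `columns` mapping whose entries carry `name`, `index`, and `comment`. The `scaffold_yaml` helper and the sample data are hypothetical, not part of dbt or dbt-codegen.

```python
import json


def scaffold_yaml(catalog: dict) -> str:
    """Render a `models:` yml block from the `nodes` of a catalog dict."""
    lines = ["version: 2", "", "models:"]
    for unique_id in sorted(catalog.get("nodes", {})):
        node = catalog["nodes"][unique_id]
        meta = node.get("metadata", {})
        lines.append(f"  - name: {meta.get('name', '')}")
        # json.dumps emits a double-quoted string, which is also valid YAML.
        lines.append(f"    description: {json.dumps(meta.get('comment') or '')}")
        lines.append("    columns:")
        # Keep the warehouse column order by sorting on the catalog index.
        columns = sorted(
            node.get("columns", {}).values(), key=lambda c: c.get("index", 0)
        )
        for col in columns:
            lines.append(f"      - name: {col['name']}")
            lines.append(
                f"        description: {json.dumps(col.get('comment') or '')}"
            )
    return "\n".join(lines) + "\n"


# Hand-written stand-in for the contents of target/catalog.json.
sample_catalog = {
    "nodes": {
        "model.my_project.orders": {
            "metadata": {"name": "orders", "comment": "One row per order"},
            "columns": {
                "order_id": {"name": "order_id", "index": 1, "comment": "Primary key"},
                "status": {"name": "status", "index": 2, "comment": None},
            },
        }
    }
}

yaml_text = scaffold_yaml(sample_catalog)
print(yaml_text)
```

In a real project the dict would come from `json.load` on `target/catalog.json` after `dbt docs generate`; sources live under a separate `sources` key and would need a similar loop.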


5 participants