-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature request] parse SQL to generate list of columns #152
Comments
Thats a cool idea. Implementation wise, its not an easy task. Its fraught with edge cases. Star expansions, lexical scoping, etc. Not to mention dbt SQL is not valid sql since it has jinja markup so we cannot generate an AST. The best bet for making this successful would be to use the compiled SQL. Compilation can be done without a connection to the DWH I think? Unless there are introspective queries ofc. So maybe we start there and do a best effort? |
There is built-in DBT macro get_columns_in_query But I think the main problem would be dealing with model dependencies. If we have 2 models A and B. The only alternative to that is tweaking materialization of every upstream model to I think the only reliable approach is to use new DBT 1.8 feature |
|
In a similar vein to #57, I'm looking for ways to run this without needing to connect to the data warehouse.
My understanding of the current implementation is that we either pull columns from the warehouse, or rely on them being manually generated ahead of time. What I would like to see additionally is an option to parse the compiled SQL, derive the names of the columns outputted by the model, and scaffold those into the yaml files - with appropriate upstream descriptions if applicable.
Presumably this would be added somewhere here, and controlled through a CLI flag. We can require a manifest, similar to the catalog option, to avoid parsing dbt during runtime.
This would enable fully self-contained generation and propagation of documentation, without the need for large manual data entry or connections to external systems.
I'm happy to take a stab at this if you think it makes sense to include?
The text was updated successfully, but these errors were encountered: