During import of SQL, dbt or DBML, the CLI must know which type system is used (i.e. which server) #423

Open
simonharrer opened this issue Sep 16, 2024 · 2 comments

Comments

@simonharrer
Contributor

datacontract import --format sql --source my-databricks-ddl.sql

should map the types in the SQL DDL to the data contract types using the Databricks type mapping. This is currently not possible, because the CLI has no way of knowing which type system the DDL uses. For that, we need to pass in additional info.

The same holds true when importing from the physical world, including any dbt or DBML import.

@jochenchrist
Contributor

I would align with the dialects from sqlglot:
https://github.com/tobymao/sqlglot/tree/main/sqlglot/dialects

@simonharrer
Contributor Author

Without sqlglot, the logic for importing from a physical type is as follows:

def find_logical_type(server_type, physical_type):
  if server_type == "databricks":
    if physical_type == "map":
      # no direct logical type: fall back to custom and keep the physical type in config
      return {"type": "custom", "config": {"databricksType": "map"}}
    elif physical_type == "int":
      return {"type": "int"}
    ...
  elif server_type == "snowflake":
    ...

server_type = "databricks"
logical_type = find_logical_type(server_type, physical_type)
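
The same lookup could also be written as one table per server type instead of nested if/elif branches. This is only a minimal sketch: the name `TYPE_MAPPINGS`, the `ValueError` for unknown servers, and the generalized `<server>Type` config key (derived from the `databricksType` key above) are assumptions, and the tables contain only the entries mentioned in this thread.

```python
# Hypothetical per-server lookup tables (physical type -> logical type).
# Only the entry named in this thread is filled in; the rest is left open.
TYPE_MAPPINGS = {
    "databricks": {"int": "int"},
    "snowflake": {},  # to be filled in
}


def find_logical_type(server_type, physical_type):
    mapping = TYPE_MAPPINGS.get(server_type)
    if mapping is None:
        # Assumption: an unknown server type is treated as an error.
        raise ValueError(f"unknown server type: {server_type}")
    logical = mapping.get(physical_type.lower())
    if logical is not None:
        return {"type": logical}
    # No logical equivalent known: keep the physical type as a custom type.
    return {"type": "custom", "config": {f"{server_type}Type": physical_type}}
```

With this shape, supporting another server means adding one more table, and any unmapped physical type is still preserved in the config.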

With sqlglot, we could do the following:

server_type = "databricks"
# define everything as custom with config+databricksType
parsed_expression_tree = sqlglot.parse(sql_ddl) # pseudo code!
for column in parsed_expression_tree.columns:
  # column.physical_type is pseudo code as well
  logical_type = map_from_sqlglot_types_to_data_contract_spec_types(column.type, column.physical_type, server_type)
  if logical_type:
    # set as type, remove config option
    ...

def map_from_sqlglot_types_to_data_contract_spec_types(sqlglot_type, physical_type, server_type):
  # map logic
  ...
  # else: no logical type is known, so the column stays custom
  return None
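
The two-step idea (start every column as custom, then upgrade it when a logical type is known) can be sketched without sqlglot. Everything here is hypothetical: `import_column`, `example_mapper`, and the `<server>Type` config key are illustrative names, and `example_mapper` is a stand-in for the sqlglot-based mapping, not an actual implementation of it.

```python
def import_column(server_type, physical_type, map_type):
    # Step 1: define everything as custom, keeping the physical type in config.
    field = {"type": "custom", "config": {f"{server_type}Type": physical_type}}
    # Step 2: if a logical type is known, set it and remove the config option.
    logical_type = map_type(physical_type, server_type)
    if logical_type is not None:
        field["type"] = logical_type
        del field["config"]
    return field


def example_mapper(physical_type, server_type):
    # Stand-in for the real mapping; returns None when no logical type is known.
    return {"int": "int"}.get(physical_type)


int_field = import_column("databricks", "int", example_mapper)
map_field = import_column("databricks", "map", example_mapper)
```

The benefit of this order is that an unmapped type never gets lost: the worst case is a custom field that still carries the original physical type.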


2 participants