-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix databricks pandas error #1443
Conversation
✅ Deploy Preview for dlt-hub-docs canceled.
|
e3c5d68
to
b22f4aa
Compare
4cfbb87
to
06de0b3
Compare
06de0b3
to
41a8a6f
Compare
@@ -264,7 +264,7 @@ def __init__(self, schema: Schema, config: DatabricksClientConfiguration) -> Non | |||
sql_client = DatabricksSqlClient(config.normalize_dataset_name(schema), config.credentials) | |||
super().__init__(schema, config, sql_client) | |||
self.config: DatabricksClientConfiguration = config | |||
self.sql_client: DatabricksSqlClient = sql_client | |||
self.sql_client: DatabricksSqlClient = sql_client # type: ignore[assignment] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how did it work before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea tbh. Maybe something to do with 2.9 driver not having type hints? 3.x does.
@@ -37,7 +38,9 @@ def __init__(self, dataset_name: str, credentials: DatabricksCredentials) -> Non | |||
|
|||
def open_connection(self) -> DatabricksSqlConnection: | |||
conn_params = self.credentials.to_connector_params() | |||
self._conn = databricks_lib.connect(**conn_params, schema=self.dataset_name) | |||
self._conn = databricks_lib.connect( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is my only issue here: could we support 2.9 driver? the 3.x has
paramstyle = "named"
and 2.9 pyformat so it is trivial to distinguish them
why: I bet there are people with constrained environments (ie Airflow) where upgrading dependencies is hard. and 3.0 was released only in march
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2.9 still works if you have it installed. There's no change in our code except this argument that's ignored in 2.9.
paramstyle
doesn't matter with use_inline_params
, this makes 3.x work like 2.x which does not use parametrized queries at all, it's some custom escaping and %s
string formatting under the hood. 🤷♂️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to switch to named paramstyle we can detect driver version and cluster version. But I don't know enough about it to tell if this is reliable in all environments: https://docs.databricks.com/en/sql/language-manual/functions/current_version.html
pyproject.toml
Outdated
@@ -81,8 +81,8 @@ weaviate-client = {version = ">=3.22", optional = true} | |||
adlfs = {version = ">=2022.4.0", optional = true} | |||
pyodbc = {version = "^4.0.39", optional = true} | |||
qdrant-client = {version = "^1.6.4", optional = true, extras = ["fastembed"]} | |||
databricks-sql-connector = {version = ">=2.9.3,<3.0.0", optional = true} | |||
dbt-databricks = {version = ">=1.7.3", optional = true} | |||
databricks-sql-connector = {version = "^3", optional = true} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK this is good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I cleaned up dbt stuff (moved from extras to dependency group), allowed databricks > 2.9.3 bumped dependency to >3.
all seems to work together
Description
Continued from #1436 without fork
Related Issues
Additional Context