Allow writing dataframes that are either a subset of table schema or in arbitrary order #829
Changes from all commits
f2949b7
397e31e
fd24c56
2c52276
6a9b612
9ee11c2
84ad3da
2ce6db3
@@ -484,10 +484,6 @@ def append(self, df: pa.Table, snapshot_properties: Dict[str, str] = EMPTY_DICT)
         _check_schema_compatible(
             self._table.schema(), other_schema=df.schema, downcast_ns_timestamp_to_us=downcast_ns_timestamp_to_us
         )
-        # cast if the two schemas are compatible but not equal
@syun64 I want to get your take on this part. Due to the timestamp change, do you know if the

Happy to extract this convo into an issue, to also continue the convo from #786 (comment)

I have a PR open to try to fix this behavior: #910. I think it's almost ready to merge 😄
-        table_arrow_schema = self._table.schema().as_arrow()
-        if table_arrow_schema != df.schema:
-            df = df.cast(table_arrow_schema)

         manifest_merge_enabled = PropertyUtil.property_as_bool(
             self.table_metadata.properties,
@@ -545,10 +541,6 @@ def overwrite(
         _check_schema_compatible(
             self._table.schema(), other_schema=df.schema, downcast_ns_timestamp_to_us=downcast_ns_timestamp_to_us
         )
-        # cast if the two schemas are compatible but not equal
-        table_arrow_schema = self._table.schema().as_arrow()
-        if table_arrow_schema != df.schema:
-            df = df.cast(table_arrow_schema)

         self.delete(delete_filter=overwrite_filter, snapshot_properties=snapshot_properties)
This doesn't work for nested structs; we need a better solution.