Pyarrow type error #541
Comments
@dev-goyal Thanks for raising this. It looks like that's a timestamp with nanosecond precision. Support for nanosecond timestamps is currently being added in the latest specification. Can I ask how you wrote the Parquet file?
Thanks @Fokko, makes sense! I was able to simply reduce precision on my end so it's not a big deal, but I figured it couldn't hurt to raise this. I wrote the data using DBT into an Iceberg table, with Athena/Trino as the engine. The data is sourced from a CDC topic, hence the nanosecond precision, and the values are originally represented as Timestamp.
Hey @dev-goyal, do you mind posting a snippet of your example above? I think this is very similar to #520. In #520, the Iceberg table is created with a pyarrow schema. Internally, Iceberg converts the schema and downcasts certain types (large_string -> string, timestamp nano -> timestamp). #523 should help solve this.
I'm facing a similar issue in my code. Tested using main@7fcdb8d25dfa2498ba98a2b8e8d2b327d85fa7c9 (the commit after …). In my case, I'm creating a new table from this arrow schema:
This is the full stacktrace:
So, in my case, the column was produced by this user-written snippet:
Unfortunately, I cannot control what a user writes or how she produces the table. What's the recommended solution for downcasting unsupported column types into something less precise, without raising an error?
Ciao @bigluck. Thanks for jumping in here. Until V3 is finalized, we can add a flag to cast nanosecond precision down to microsecond precision. Would that work for you?
@Fokko it sounds good to me! :)
For anyone else that stumbles across this, you can:
(where tbl is a pyarrow Table), as nulls are also not supported.
Apache Iceberg version
0.6.0 (latest release)
Please describe the bug 🐞
Given a table like so:
I get the following error
After some debugging, at this line I find
I imagine the fix is to do something like this on this line, but currently those overrides are not exposed. Am I on the right track?
I believe that this issue is somewhat similar to #520