I realized that DuckDB can only read Iceberg metadata files if there have been no updates/deletes in the Iceberg table. I verified this with the following setup:
INSTALL iceberg;
LOAD iceberg;
INSTALL httpfs;
LOAD httpfs;
SET s3_access_key_id='key';
SET s3_secret_access_key='secretKey';
SET s3_region='us-east-1';
SET s3_use_ssl=true;
SET s3_url_style='path';
SELECT *
FROM
iceberg_scan('s3://my-bucket/observation/metadata/00004-bc91e4be-ee63-4922-89eb-f7730dbbee82.metadata.json');
SQL Error: java.sql.SQLException: Binder Error: Table "iceberg_scan_deletes" does not have a column named "file_path"
I tried update/delete using Nessie as the catalog and Trino as the writer (Trino is the engine behind AWS Athena).
DuckDB 1.1.2 has no issue reading and providing accurate results for a table with deleted/updated rows.
I'd like to verify it on Glue/Athena just to be certain.
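Since the two reports in this thread use different DuckDB versions (1.0.0 and 1.1.2), it may help to confirm which DuckDB and extension builds are actually in use before comparing results. A minimal sketch, assuming a recent DuckDB build; `UPDATE EXTENSIONS` is only available in newer releases (1.1.0 onward, as far as I know), and `FORCE INSTALL iceberg;` is the older way to refresh the extension:

```sql
-- Report the DuckDB library version for this session.
SELECT version();

-- Show which builds of the relevant extensions are installed and loaded.
SELECT extension_name, installed, loaded, extension_version
FROM duckdb_extensions()
WHERE extension_name IN ('iceberg', 'httpfs');

-- Assumption: this statement exists only in newer DuckDB releases.
-- On older builds, use FORCE INSTALL iceberg; instead.
UPDATE EXTENSIONS (iceberg);
```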
Well, I am using AWS Firehose for writes, so I don't have any information on that. I suspect it uses Trino under the hood to write, and I think Trino does not do positional deletes/updates.
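One way to check what kind of delete files the writer actually produced, rather than guessing, is to list the manifest entries with the Iceberg extension's `iceberg_metadata` function. This is a rough sketch, not a confirmed diagnosis: it assumes the S3 settings from the report are already set in the session, it reuses the metadata path from the report, and the column names (`manifest_content`, `content`, `file_format`) are as I recall them from the extension's documentation.

```sql
INSTALL iceberg;
LOAD iceberg;

-- Count manifest entries by kind for the snapshot in the reported metadata file.
-- Rows whose manifest_content is DELETE indicate delete files; the content
-- column should then distinguish positional deletes from equality deletes.
SELECT manifest_content, content, file_format, count(*) AS files
FROM iceberg_metadata('s3://my-bucket/observation/metadata/00004-bc91e4be-ee63-4922-89eb-f7730dbbee82.metadata.json')
GROUP BY ALL
ORDER BY ALL;
```

If the result contains only data entries, the table has no delete files in that snapshot and the error likely has a different cause.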
Setup for the failing case:
Catalog: AWS Glue
Iceberg table format: v2
DuckDB version: 1.0.0
Writer: AWS Firehose
Update strategy: Merge on Read
#60 seems to be the same problem.