-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issues/29: Add version guessing functionality #79
Conversation
Signed-off-by: Teague Sterling <[email protected]>
Signed-off-by: Teague Sterling <[email protected]>
Signed-off-by: Teague Sterling <[email protected]>
Signed-off-by: Teague Sterling <[email protected]>
Signed-off-by: Teague Sterling <[email protected]>
Thanks for the PR @teaguesterling, I think this makes sense! There do seem to be some tests failing related to this PR |
Thanks, @samansmink. I'm trying to debug the test failures. I added some new ones to cover the new functionality introduced here but was seeing failures in untouched parts of the code related to the |
@samansmink It seems there was some weirdness with the test data and cases in my branch. I've cleaned those up and made some fixes to get this to pass. |
Happy to do that! Thank you for the review
…On Mon, Nov 18, 2024, 07:53 Thijs ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/common/iceberg.cpp
<#79 (comment)>:
> return path;
- }
-
- auto meta_path = fs.JoinPath(path, "metadata");
- string version_hint;
- if(StringUtil::EndsWith(table_version, ".text")||StringUtil::EndsWith(table_version, ".txt")) {
- version_hint = GetTableVersion(meta_path, fs, table_version);
- } else {
+ } else if (!fs.DirectoryExists(meta_path)) {
+ // Make sure we have a metadata directory to look in
+ throw IOException("Cannot open \""+path+"\": Metadata directory does not exist");
Bit of a nitpick, feel free to ignore this if there is no other reason to
touch the branch
Formatting can be used within an exception message:
This could instead be:"Cannot open \"%s\": Metadata directory does not
exist", path
—
Reply to this email directly, view it on GitHub
<#79 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAARLWK6OVGA63H26NPRJK32BIEQFAVCNFSM6AAAAABRTEMHPCVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDINBTGA2DINBSGU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I will update the test cases to cover the new flag. Thank you for implementing that! |
Signed-off-by: Teague Sterling <[email protected]>
Signed-off-by: Teague Sterling <[email protected]>
This PR addresses the review comments from @Tishj and @samansmink and updates the test cases to cover the new config variable. However, it seems that the OSX build is failing before getting building or testing the extension. |
I've fixed the problem with CI and opened a PR to your branch here teaguesterling#1 (for some sort of reason i can't seem to push directly to this branch using the GH cli like i normally can) |
Issues/29
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks a lot @teaguesterling
Thanks for all your support on this! I had been looking for a way to familiarize myself with the DuckDB extension system. |
Update the duckdb_iceberg documentation to better describe some of the optional version resolution functionality and document the version guessing behavior provided by duckdb/duckdb-iceberg#79. Also add a table of common parameters at the top of the document for easier reference.
This adds a new (default) version option that can be passed to iceberg table functions.
When
version="?"
, some simple logic will go into effect that does the following:version-hint.text
file, using it if it exists.version_format
patterns, and select the lexicographically "largest" metadata filename (e.g.,v00002-xxxxxxxxxxxxx.metadata.json
will be picked overv00001-xxxxxxxxxxxxx.metadata.json
).There's additional tests needed here and I'm running into a seemingly unrelated parquet bug when testing. There aslo some room for improvement on the logic of how versions are found and selected. However, at the very least, this works as we've all hoped from #29 :