-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add generated_data to repo to facilitate CI #59
Conversation
…unning the tests in CI (since generating the data requires to equip the test runners with pyspark, which at least mother prefers not to) - the data is minimized to 5MB, by setting the scale factor 10x smaller, and from expected_results/ only keeping last/ this motivates rewriting some testcases that used expected_results/q08 into expected_results/last (which is the same) - removed one testcase, which tests if we get an error when parquet is not loaded (in MotherDuck, when we run iceberg in so-called "remote_only" mode, this will actually not raise an error) it was kind of a trivial test, so I hope this is OK..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! added 1 small comment
Makefile
Outdated
@@ -9,11 +9,11 @@ include extension-ci-tools/makefiles/duckdb_extension.Makefile | |||
|
|||
# Custom makefile targets | |||
data: data_clean | |||
python3 scripts/test_data_generator/generate_iceberg.py 0.01 data/iceberg/generated_spec1_0_01 1 | |||
python3 scripts/test_data_generator/generate_iceberg.py 0.01 data/iceberg/generated_spec2_0_01 2 | |||
python3 scripts/test_data_generator/generate_iceberg.py 0.001 data/iceberg/generated_spec1_0_01 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the name of the directory is now no longer correct which could be a little confusing
windows CI is failing due to some upstream issue, however the tests also seem to fail due to only the json metadata files being added and not the avro and parquet files |
- also in Makefile and in the tests (name, and paths used)
apologies for that messup. I have made the name change and made sure all files are pushed now. |
Thanks for the changes! looks good! |
add generated_data to repo to facilitate CI
Commit the generated data into the repo, because this facilitates running the tests in CI
(since generating the data requires to equip the test runners with pyspark, which at least motherduck prefers not to).
The data is minimized to 5MB, by setting the scale factor 10x smaller, and from expected_results/ only keeping last/.
This motivates rewriting some testcases that used expected_results/q08 into expected_results/last (which is the same)
Removed one testcase, which tests if we get an error when parquet is not loaded (in MotherDuck, when we run iceberg in so-called "remote_only" mode, this will actually not raise an error).
It was kind of a trivial test, so I hope this is OK..