
[release] Pyiceberg 0.8.1 #1369

Merged 12 commits on Nov 27, 2024

@kevinjqliu kevinjqliu commented Nov 25, 2024

pyiceberg 0.8.1 proposal on devlist

0.8.1 Release Note

The behavior of Table.name is changed to return the table name without the catalog name. This is part of a broader effort to remove references to the catalog name in pyiceberg.

  • Replace usage of Table.identifier with Table.name which returns the table name without the catalog name
  • Replace the use of a deprecated function (identifier_to_tuple_without_catalog) in pyiceberg; remove unnecessary warnings
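
As a rough illustration of the intent behind the Table.name change, here is a minimal sketch that treats an identifier as a tuple of strings and drops a leading catalog name. The helper `strip_catalog_name` and the example names are hypothetical and are not pyiceberg's implementation; check the linked PRs for the actual behavior.

```python
from typing import Tuple

Identifier = Tuple[str, ...]


def strip_catalog_name(identifier: Identifier, catalog_name: str) -> Identifier:
    """Return the identifier without a leading catalog name, if one is present.

    Hypothetical helper for illustration only, not part of pyiceberg.
    """
    if len(identifier) > 1 and identifier[0] == catalog_name:
        return identifier[1:]
    return identifier


# An identifier that still carries the catalog name as its first element.
with_catalog = ("my_catalog", "my_namespace", "my_table")
print(strip_catalog_name(with_catalog, "my_catalog"))  # ('my_namespace', 'my_table')

# An identifier without the catalog name is returned unchanged.
print(strip_catalog_name(("my_namespace", "my_table"), "my_catalog"))
```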

Documentation updates are included to reflect the updated process in https://py.iceberg.apache.org/

  • Update “how to release” documentation
  • 0.8.0 post-release steps

Bug fixes

  • Fix add_files for parquet files without column stats
  • Allow leading underscore in column name used in row filter
  • Ignore tables without table_type property from Glue and Hive
  • Write null in manifest list metadata when there is no parent-snapshot-id
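
To make the leading-underscore fix concrete, the sketch below contrasts a column-name rule that requires a leading letter with one that also accepts a leading underscore. The regular expressions are assumptions chosen for illustration; pyiceberg's row-filter parser is not implemented this way, so treat this only as a picture of the before and after behavior.

```python
import re

# Assumed rules for illustration: the "old" rule requires a leading letter,
# the "new" rule also accepts a leading underscore.
OLD_COLUMN_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
NEW_COLUMN_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_column_name(pattern: "re.Pattern[str]", name: str) -> bool:
    """Return True when the whole name matches the given rule."""
    return pattern.fullmatch(name) is not None


for name in ("id", "_id"):
    print(
        name,
        is_valid_column_name(OLD_COLUMN_NAME, name),
        is_valid_column_name(NEW_COLUMN_NAME, name),
    )
```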

Remove upper bound restrictions for dependency libraries; allow early testing of new versions

  • Remove Python library version upper bound restriction; allow Python 3.13
  • Remove fsspec library version upper bound restriction

Commits

36 new commits since the 0.8.0 release.

12 new commits will be included in 0.8.1

  • 11 commits cherry-picked as bug fixes (listed below)
  • 1 commit to bump version to 0.8.1

11 bug fixes (cherry-picked)

acbd071 Write null when there is no parent-snapshot-id (#1383)
bb078cf Add instruction for patch release (#1373)
ab43c6c fix KeyError raised by add_files when parquet file does not have column stats (#1354)
cc1ab2c Improve documentation for "how to release" (#1359)
64dc6fe Remove Python 3.13 upper bound restriction (#1355)
d86ab6e Allow leading underscore in column name used in row filter (#1358)
7a4734e Replace reference of Table.identifier with Table.name (#1346)
a66ddc0 Ignore tables without table_type from Glue and Hive (#1332)
2cbc77d Drop upper bounds for fsspec and its implementations (#1341)
7660a5b 0.8.0 post release steps (#1334)
b2f0a9e use the non-deprecated func (#1326)

9 features (not cherry-picked)

b4395ed Extend bugfix report (#1380)
3230186 Update upload-artifact to use v4 (#1371)
3b559c4 Deprecate the use of last-column-id (#1367)
6316900 check mkdocs build strict in CI (#1360)
8e0e6a1 dont override global warning (#1350)
12e87a4 Boto Glue standard retry policy with configuration (#1307)
150fa0c Set default for SortField's transform (#1347)
5f0f770 Remove deprecated datetime functions (#1134)
a90c014 Tests: Bump Spark to 3.5.3 (#1322)

16 misc (not cherry-picked)

7fe8fdc Bump Poetry to 1.8.4 (#1379)
1e9bdc2 Bump pypa/cibuildwheel from 2.21.3 to 2.22.0 (#1374)
8f6a3d4 Bump coverage from 7.6.7 to 7.6.8 (#1375)
d5fa615 Bump mkdocs-material from 9.5.45 to 9.5.46 (#1376)
c21aefd Bump getdaft from 0.3.13 to 0.3.14 (#1361)
e8e0037 Bump pydantic from 2.10.0 to 2.10.1 (#1364)
7a83695 Bump mkdocs-material from 9.5.44 to 9.5.45 (#1351)
15cfc51 Bump pydantic from 2.9.1 to 2.10.0 (#1352)
102a3bb Bump pre-commit versions (#1344)
93ebd39 Bump deptry from 0.21.0 to 0.21.1 (#1342)
a2b11de Bump mypy-boto3-glue from 1.35.53 to 1.35.65 (#1343)
7ecfa71 Bump moto from 5.0.20 to 5.0.21 (#1339)
42145f1 Bump aiohttp from 3.10.5 to 3.10.11 (#1338)
b4c43b0 Bump coverage from 7.6.5 to 7.6.7 (#1329)
1cbf429 Bump mkdocstrings from 0.26.2 to 0.27.0 (#1324)
60800d8 Bump coverage from 7.6.4 to 7.6.5 (#1325)

kevinjqliu and others added 10 commits November 25, 2024 09:27
* Drop upper bounds for fsspec and its implementations

* Run poetry lock
* Ignore tables without table_type parameters while loading all iceberg table from Glue and Hive catalog (apache#1331)

* Use TABLE_TYPE

---------

Co-authored-by: Wenzhuo Zhao <[email protected]>
* fix Table.name

* replace Table.identifier with Table.name

* add warning filter
* Update parser.py

Allow leading underscore in column name used in row filter.

* Update test_parser.py

* Update test_parser.py

* Update test_parser.py
* Remove Python 3.13 upper bound restriction

* Fix missing poetry.lock file

* Upgrading numpy on the poetry.lock file from v1.26.0 to v1.26.4
* initial update

* edits

* add gpg instructions

* verify artifacts

* add twine not

* grammar

* edits

* remove old artifacts

* update doc workflow action

* and name

* add docs on patch vs major/minor release
…olumn stats (apache#1354)

* fix KeyError, by switching del to pop

* added unit test

* update test

* fix python 3.9 compatibility, and refactor test

* update test
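
The "switching del to pop" note in the add_files fix refers to a general Python idiom: `del mapping[key]` raises `KeyError` when the key is absent, while `mapping.pop(key, None)` does not. The sketch below shows the difference on a plain dictionary; the dictionary contents and function names are made up for illustration and do not mirror the actual pyiceberg code.

```python
from typing import Any, Dict


def drop_with_del(stats: Dict[str, Any], key: str) -> None:
    """Remove a key with del; raises KeyError when the key is missing."""
    del stats[key]


def drop_with_pop(stats: Dict[str, Any], key: str) -> None:
    """Remove a key with pop and a default; a missing key is ignored."""
    stats.pop(key, None)


# A stand-in for per-column statistics where one entry is absent.
stats = {"min_value": 1}

try:
    drop_with_del(stats, "max_value")
except KeyError:
    print("del raised KeyError for the missing key")

drop_with_pop(stats, "max_value")  # no exception
drop_with_pop(stats, "min_value")  # removes the existing key
print(stats)
```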
@kevinjqliu kevinjqliu requested a review from Fokko November 25, 2024 17:42

Fokko commented Nov 25, 2024

Looks good @kevinjqliu! Thanks for creating that nice overview!


Fokko commented Nov 25, 2024

@kevinjqliu One question, I noticed that this PR targets the pyiceberg-0.8.1 branch, what's the value of this? I missed this step when reviewing: https://github.com/apache/iceberg-python/pull/1359/files#r1857301535


kevinjqliu commented Nov 25, 2024

I assumed a new branch was needed for a patch release, but it seems like we just need a tag.
Originally I saw that the iceberg 1.7.1 patch release had its own branch, apache/iceberg#11629. But it seems like the commits were cherry-picked into the 1.7.x branch.

So for a patch release, we should:

  • cherry-pick into the 0.8.x branch
  • create a tag for 0.8.1
  • proceed with the rest of the instructions

Is that right?


Fokko commented Nov 25, 2024

I think we should target this PR to the pyiceberg-0.8.x branch I created earlier this week, and then proceed with the instructions.

The instruction states that we create a tag for each of the RCs and when the vote passes. And we need to create a tag when the vote passes (I'll follow up with a PR).
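
The patch-release flow discussed in this thread (cherry-pick onto the release branch, then tag) can be sketched as a list of git commands. This is only an outline under assumptions: the tag name `pyiceberg-0.8.1rc1` and the helper `patch_release_commands` are hypothetical, and the official "how to release" documentation remains the source of truth. The function only builds the commands and does not run them.

```python
from typing import List


def patch_release_commands(
    release_branch: str, commits: List[str], tag: str
) -> List[List[str]]:
    """Build (but do not run) git commands for a patch release.

    Hypothetical helper for illustration; the tag naming is an assumption.
    """
    commands = [["git", "checkout", release_branch]]
    # Cherry-pick each bug-fix commit onto the release branch, in order.
    commands += [["git", "cherry-pick", sha] for sha in commits]
    # Create an annotated tag for the release candidate.
    commands.append(["git", "tag", "-a", tag, "-m", f"Release candidate {tag}"])
    return commands


for command in patch_release_commands(
    release_branch="pyiceberg-0.8.x",
    commits=["bb078cf", "acbd071"],  # two of the cherry-picked commits listed above
    tag="pyiceberg-0.8.1rc1",  # assumed tag name
):
    print(" ".join(command))
```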


Fokko commented Nov 25, 2024

@kevinjqliu kevinjqliu changed the base branch from pyiceberg-0.8.1 to pyiceberg-0.8.x November 25, 2024 21:12
kevinjqliu commented

changed the base branch to apache:pyiceberg-0.8.x

I want to include #1373 in this too so the instructions are up to date

* add instruction for patch release

* create branch from tag

kevinjqliu commented

Thanks @Fokko, cherry-picked #1373 as well

kevinjqliu commented

Added #1383

@kevinjqliu kevinjqliu merged commit d24e0c9 into apache:pyiceberg-0.8.x Nov 27, 2024
7 checks passed
@kevinjqliu kevinjqliu deleted the pyiceberg-0.8.1 branch November 27, 2024 18:55
@kevinjqliu kevinjqliu restored the pyiceberg-0.8.1 branch November 27, 2024 18:58
@kevinjqliu kevinjqliu deleted the pyiceberg-0.8.1 branch November 27, 2024 19:23

kevinjqliu commented

accidentally squashed the commits... redoing in #1384
