GH-3091: Add verification guide and .rat-excludes.txt for release #3101

Open
wants to merge 3 commits into
base: master

Conversation

raulcd
Member

@raulcd raulcd commented Dec 9, 2024

Rationale for this change

There is currently no guide on how to verify a release. Having a guide will encourage people to verify releases.
This guide should be linked from the voting thread.

What changes are included in this PR?

Added documentation on how to verify a release, and added a .rat-excludes.txt file listing the files that do not contain a license header but are acceptable as they are. Those files are mainly test .avsc and .parquet files, .gitignore, and PULL_REQUEST_TEMPLATE.md.

Are these changes tested?

I've validated all steps locally.

Are there any user-facing changes?

No, just more docs.

Closes #3091

@@ -498,6 +498,7 @@
<consoleOutput>true</consoleOutput>
<excludes>
Member Author

I am pretty sure we can replace excludes with excludesFile (https://creadur.apache.org/rat/apache-rat-plugin/rat-mojo.html#excludesFile), but I am unsure why some of the regexes from the individual excludes don't seem to work with excludesFile when I ran:
java -jar apache-rat-0.16.1/apache-rat-0.16.1.jar -a -d apache-parquet-1.15.0.tar.gz -E $PARQUET_SRC_FOLDER/.rat-excludes.txt
I'll investigate how to consolidate those two lists
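
One possible explanation, which is an assumption and not something confirmed in this thread, is that the two lists are interpreted with different pattern syntaxes (wildcard-style in one place, regular expressions in the other). The sketch below is a small debugging aid for that hypothesis: it evaluates the same pattern string both ways and reports where the results disagree. The helper names `matches` and `compare` are hypothetical, and Python's `fnmatch` only approximates whatever matching RAT actually performs.

```python
import fnmatch
import re


def matches(pattern: str, path: str, mode: str) -> bool:
    """Return True if `path` matches `pattern` under the given mode.

    mode is "glob" (shell-style wildcards) or "regex" (full-match regular
    expression). A pattern that is not a valid regex counts as a non-match.
    """
    if mode == "glob":
        return fnmatch.fnmatch(path, pattern)
    try:
        return re.fullmatch(pattern, path) is not None
    except re.error:
        # e.g. "*.parquet" is a fine wildcard but an invalid regex
        return False


def compare(patterns, paths):
    """List (pattern, path, glob_result, regex_result) where the modes disagree."""
    return [
        (pat, p, matches(pat, p, "glob"), matches(pat, p, "regex"))
        for pat in patterns
        for p in paths
        if matches(pat, p, "glob") != matches(pat, p, "regex")
    ]
```

Feeding the entries of both exclude lists and a handful of real paths from the tarball into `compare` would show which entries need rewriting before the lists can be consolidated.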

@@ -91,3 +91,61 @@ Merge hash: 485658a5
Would you like to pick 485658a5 into another branch? (y/n):
```
For now just say n as we have 1 branch

# Release Verification
Member

Thanks for adding this! Would it be better to add this to the parquet-site repo: https://github.com/apache/parquet-site/blob/production/content/en/docs/Contribution%20Guidelines/releasing.md?

Contributor

I agree the "release verification" section should be moved to the parquet-site repo instead. Then we can even link to this section from the VOTE email template.

I'm not sure why we need to check for the license headers separately in the tarball. It is already in the build process so we shall not have license header issues in the repo. What I usually do instead is compare the contents of the tarball with a freshly cloned repo checked out at the release RC tag. There should be no differences.

Member Author

> I'm not sure why we need to check for the license headers separately in the tarball. It is already in the build process so we shall not have license header issues in the repo.

I see, we can probably remove the explicit license header check. In general, I've seen all projects do this as part of their release verification process, and I would say it falls under the "verify that they meet all requirements of ASF policy on releases as described below" point in the ASF release guide. But it is true that, as long as nothing has changed since those checks already ran in the build, repeating them feels unnecessary.

How do you perform the comparison between the content of the tarball with a freshly cloned repo set to the release RC tag? Do we want to add that as a step?

I will move the PR to the parquet-site one, I might take a couple of days as I am slightly busy at the moment.

Contributor

> How do you perform the comparison between the content of the tarball with a freshly cloned repo set to the release RC tag? Do we want to add that as a step?

I use meld as a diff tool, but I don't think it should be added. GNU diff can probably be configured to work on directory trees.

> I will move the PR to the parquet-site one, I might take a couple of days as I am slightly busy at the moment.

I don't think we need to hurry. Please reference the parquet-site PR here so anyone can follow up.

Thanks a lot for working on this!
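
For readers who want a scriptable version of the tarball-versus-clone comparison discussed above, here is a minimal sketch using Python's standard `filecmp` module. It is not the tool used in this thread (that was meld); the function name `tree_differences` and the choice to ignore `.git` are assumptions made for illustration. It expects the tarball to be already extracted and the repo already checked out at the RC tag.

```python
import filecmp
import os


def tree_differences(left: str, right: str, ignore=(".git",)):
    """Recursively compare two directory trees.

    Returns a sorted list of (kind, relative_path) tuples, where kind is
    "only_left", "only_right", or "differs".
    """
    results = []

    def walk(dc: filecmp.dircmp, prefix: str) -> None:
        for name in dc.left_only:
            results.append(("only_left", os.path.join(prefix, name)))
        for name in dc.right_only:
            results.append(("only_right", os.path.join(prefix, name)))
        # dircmp's own diff_files relies on a shallow (stat-based) comparison,
        # so re-check the common files byte by byte.
        _, mismatch, errors = filecmp.cmpfiles(
            dc.left, dc.right, dc.common_files, shallow=False
        )
        for name in mismatch + errors:
            results.append(("differs", os.path.join(prefix, name)))
        for name, sub in dc.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(left, right, ignore=list(ignore)), "")
    return sorted(results)
```

An empty list from `tree_differences(extracted_tarball_dir, cloned_repo_dir)` corresponds to the "no differences" outcome described in the comment; both directory arguments are placeholders.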

Contributor

@Fokko Fokko Dec 13, 2024

I agree that the parquet-site is more suitable for these steps.

Regarding the license headers: checking them is part of the verification, and the RAT check is just one way to do it. All code must have an ASv2 license header. It would also be good to do manual checks when a new version is being released, as the RAT check might miss something.

Thanks for working on this, this is really great 🙌
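
As a rough aid for the manual check mentioned above, the following sketch lists source files whose first bytes do not contain the opening phrase of the standard ASF source header. It is a complement to RAT, not a replacement. The function name `files_missing_header`, the default extension list, and the 2048-byte window are all assumptions chosen for illustration.

```python
import os

# Opening phrase of the standard ASF source header.
HEADER_MARKER = "Licensed to the Apache Software Foundation"


def files_missing_header(root, extensions=(".java", ".xml", ".sh"),
                         marker=HEADER_MARKER, head_bytes=2048):
    """Return sorted relative paths under `root` whose head lacks `marker`.

    Only files ending in one of `extensions` are inspected; .git is skipped.
    """
    missing = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            if not name.endswith(tuple(extensions)):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                head = fh.read(head_bytes).decode("utf-8", errors="replace")
            if marker not in head:
                missing.append(os.path.relpath(path, root))
    return sorted(missing)
```

Any path it returns still needs a human decision: the file may be a legitimate exclusion of the kind listed in .rat-excludes.txt.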

@wgtmac
Member

wgtmac commented Dec 11, 2024

cc @Fokko @gszadovszky

Successfully merging this pull request may close these issues.

Add documentation about how to verify the release
4 participants