[RFC] docs/dev-proc/standard-release-process.md: Update release process #944

miczyg1 · 2024-11-15T09:25:58Z

The idea here is to make it clearer that having a release branch doesn't necessarily work in our favor, for the following reasons:

1. A release may sometimes take months. When we finally merge the release branch to the main branch, a lot may have changed, so what lands in the main branch is far from what was released.
2. Due to 1, the code landing in the main branch may not even work anymore. This makes testing various changes between release cycles harder, because it takes additional effort to restore a working code base for a given platform. Having a release branch for a platform (and keeping it until the final release) minimizes problems during release cycles, but only postpones the problems described in the first two sentences. Do we really want potentially broken code for a platform right after the release is out and the code is merged? I don't think so.
3. IMO it is better to make smaller iterations with release candidates that are merged to the main branch once some subset of bugs is fixed. Tags should only happen on the main branch, after the branch for an RC is merged. Having all tags on the main branch opens the door for easier changelog generation based on git history.
4. The recent experiment with an NVC release from a branch living for months showed its weaknesses. Constant rebasing is tiresome. Smaller iterations with RCs merged to the main branch should help with that. There were also other problems, like being unable to make quick changes and leverage CI to build binaries for external testing, due to never-ending merge conflicts on the release branch.
5. Sometimes (or often) there may be multiple concurrent release cycles. If each platform undergoing a release lives on a release branch, the disorder only grows, especially when things get merged after release.
6. A very good example of disorder with branches is the OSFV repo. I saw the same platform being validated on multiple branches, each having a different set of fixes. Figuring out which source given tests were run against becomes harder. Fixes are scattered throughout many branches and pull requests, some of them also living for months (and probably suffering from problems similar to those described above). The main branch isn't really main; develop is instead. All of these little things contribute to the disorder and never-ending instability of the test environment. coreboot may end up the same way if we keep managing source like that.

As the PR states, it is an RFC. Comments and constructive critique are appreciated.

PS: I'm aware of different approaches being discussed, such as patch-queue and upstream-first.

Signed-off-by: Michał Żygowski <[email protected]>

pietrushnic · 2024-11-15T10:12:23Z

Dasharo/dasharo-issues#310 cc

krystian-hebel · 2024-11-25T17:44:09Z

There is one obvious(-ly impossible) solution: don't do multiple releases at the same time.

As we obviously can't afford that, perhaps we should focus on which parts of the release are blocking. It isn't development: that may happen on a dedicated branch (call it develop, release, rc, this is just a name so it doesn't matter that much) and it doesn't get in the way of other releases.

It also isn't the rebase itself; that would have to be done anyway. In addition, with properly split commits rebasing isn't that hard, unless there was a major redesign of common code. But in that case a rebase would be needed anyway, in the next release at the latest, and most likely before the merge. This is a cost that will have to be paid, and it should be included in the estimates. Smaller iterations will make it appear smaller, but I'm not sure if the total cost would be smaller. After all, RC preparation cost doesn't consist of only the developers' time.

Testing time IMHO is the least predictable (but not necessarily the longest) part of the release, for various reasons, including broken or misplaced hardware (DUT, RTE, pomona clip), lack of the required OS on the device, issues with OSFV itself, need for manual retests etc. Minimal regression was created partially to ensure that the environment is ready for the full set of tests, but I don't think it is widely used for that purpose.

Then there is the whole "housekeeping" part, like preparation of release notes, newsletter, publishing binaries etc. There are tools that may help with changelog automation, but most of them expect to work with linear history, which the current process doesn't always provide. The fact that a big part of the changes happens in edk2 also doesn't help.

In general, I like the idea of having RCs on main branch, but not necessarily at the cost of having many more smaller RCs. I'd slightly modify it to this:

1. Prepare the scope of the release.
2. Create a branch for RC from dasharo.
3. Develop things included in the scope.
4. 1st part of review: reviewer checks if the scope has been fulfilled, as well as general code quality. No merge at this point, but rebases may happen to make further work easier.
5. If that part of review passed, inform PM that the release testing can be scheduled soon.
6. Rebase if needed, run minimal regression on all platforms included in the release, report and/or fix any hardware issues, install of OSes required for full regression.
7. When PM gives green light:
    - rebase, 2nd review iteration,
    - merge and put an -rcN tag on the main branch,
    - forbid merges of PRs to the main branch from now on.
8. These can happen in parallel:
    - Run the full regression tests.
    - Prepare or update release metadata (release notes, newsletter).
9. Based on the test results, decide whether this RC becomes full release.
    - Not a full release:
        - allow merges of other PRs,
        - set new scope to a list of bugs that need to be fixed,
        - go to point 2.
    - Full release:
        - push commit with changed LOCALVERSION (and DRIVERS_EFI_MAIN_FW_VERSION, if applicable),
        - put the full release tag on the new commit,
        - allow merges of other PRs.
10. Publish the release.
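
To make the branching at point 9 easier to discuss, here is a minimal Python sketch of that decision. It is only an illustration: the function name `after_regression`, the returned dictionary layout and the assumption that tags look like `<version>-rcN` (with the full release tag being the same string without the `-rcN` suffix) are mine, not part of the current process document.

```python
import re

# Assumed tag shape: "<version>-rcN", e.g. "v0.9.0-rc2" (hypothetical example).
RC_TAG = re.compile(r"^(?P<base>.+)-rc(?P<n>\d+)$")


def after_regression(rc_tag: str, becomes_full_release: bool) -> dict:
    """Return what follows an RC once the full regression results are known."""
    match = RC_TAG.match(rc_tag)
    if match is None:
        raise ValueError(f"not an RC tag: {rc_tag!r}")
    base, n = match["base"], int(match["n"])

    if becomes_full_release:
        # Full release: only the version bump lands between the RC and the tag.
        return {
            "next_tag": base,
            "steps": [
                "push commit with changed LOCALVERSION",
                "put the full release tag on the new commit",
                "allow merges of other PRs",
            ],
            "goto_point": 10,
        }

    # Not a full release: unfreeze main and start another RC iteration.
    return {
        "next_tag": f"{base}-rc{n + 1}",
        "steps": [
            "allow merges of other PRs",
            "set new scope to a list of bugs that need to be fixed",
        ],
        "goto_point": 2,
    }
```

For example, `after_regression("v0.9.0-rc2", False)` points back to point 2 with `v0.9.0-rc3` as the next tag, while passing `True` yields the full release tag and moves on to publishing.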

Points 4-7 and 9 are new or modified, the rest is similar to what we have. This addresses a few things:

- both release and RC tags are on the main branch
    - linear history
    - easier changelogs
    - bisect works (at least as long as every commit is buildable)
- no other commits between final RC and proper release, other than change to the config file(s)
- all RCs are created from fresh dasharo
    - no need to implement the same fix on multiple branches
    - some fixes could (and should) be merged to dasharo directly, instead of RC branches
- higher level of certainty that the main branch produces working image
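
As a rough illustration of the "easier changelogs" point: with all tags on one linear branch, a changelog can be built by walking the history once and cutting a section at every tag. The sketch below is a simplification under my own assumptions; `changelog_by_tag`, the `(subject, tag)` input shape and the sample history are hypothetical, and in practice the input would have to be produced from something like `git log --reverse` on the main branch.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def changelog_by_tag(
    commits: Iterable[Tuple[str, Optional[str]]],
) -> Dict[str, List[str]]:
    """Group commit subjects by the tag that closes each section.

    `commits` must be ordered oldest first and come from a linear history;
    each item is (subject, tag name or None).
    """
    sections: Dict[str, List[str]] = {}
    pending: List[str] = []
    for subject, tag in commits:
        pending.append(subject)
        if tag is not None:
            # A tagged commit ends the current section.
            sections[tag] = pending
            pending = []
    if pending:
        # Commits merged after the most recent tag.
        sections["unreleased"] = pending
    return sections


# Hypothetical history, oldest first.
history = [
    ("fix A", None),
    ("add B", "v1.0.0-rc1"),
    ("fix C", None),
    ("bump LOCALVERSION", "v1.0.0"),
    ("new D", None),
]
log = changelog_by_tag(history)
```

This only works because every tag sits on the same branch; with tags scattered across release branches, the sections would overlap and a single pass would not be enough.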

Cons:

- main branch is frozen between points 7 and 9
    - may be hard to enforce, not sure if GH can reliably help with that
    - any "accidental" merge would require either a rewind of main branch, or creation of another RC
    - any conscious decision to merge despite the lock-down (e.g. security issue so severe that we can't wait with the fix, or broken submodule that breaks build) generates another RC, but usually it would be needed anyway
- regression tests become the bottleneck
    - tests themselves may need optimizations (time, reliability and degree of automation)
    - point 6 is critical to catch potential hardware issues before the branch gets frozen

docs/dev-proc/standard-release-process.md: Update release process

dce8387

Signed-off-by: Michał Żygowski <[email protected]>

miczyg1 requested review from mkopec, macpijan and krystian-hebel and removed request for mkopec November 15, 2024 09:26
