doc: show how to use to git to submit smaller and faster PRs #83839

marc-hb · 2025-01-11T01:04:52Z

The documentation had for a long time a section that specifically recommends to submit "smaller PRs" for review:
https://docs.zephyrproject.org/latest/contribute/contributor_expectations.html#defining-smaller-prs

Yet submitters keep submitting large PRs on a regular basis, sometimes very large ones. See a couple of very recent examples below.

(Reminder: submitting a giant, draft PR for pure testing purposes and NOT for review is a perfectly fine)

The "natural" explanation is that submitters optimistically and wrongly believe that dumping a large amount of code at once onto reviewers will be dealt with faster than in smaller chunks. This is most likely a contributing factor but most people should quickly learn from bad experience. Yet we keep seeing large PRs on a regular basis. So there must be other factors too.

Based on personal but fairly extensive git support experience, another top reason is likely git usability and some lack of git knowledge (neither the first nor the last time git usability would have a significant impact)

To help with this, add to the existing git guide the simple command that lets split a large submission in several, smaller PRs. This can only help demystify and promote smaller PRs while barely growing the size of the documentation.

While at it, also add a couple missing benefits of smaller PRs.

Recent examples of large PRs:

In the controversial and giant PR Add initial support for the Raspberry Pi Pico 2 #77368 (comment) the exhausted submitter wrote:

Every time any one person requests a rebase that changes the PR,
unless there's consensus, there's no mechanism (manual/project process
or built into GitHub) to know/prevent a different person from rejecting
the new changes.

That PR had 21 commits (18 in the final version), 82 files changed and 400 (!) comments. The sheer size massively increased the likelihood of the problem described. Re-submitting it in smaller chunks would obviously have mitigated that problem. Yet that PR was never split and stayed huge...?

In this second example, a large PR was submitted with different authors. A disagreement emerged about squashing across different authors:
Introduce Bouffalo Lab SoC's #78795 (comment) If this PR had been split in smaller chunks, then the squashing discussion might have been avoided entirely. Whether squashing is good or bad in this particular case is irrelevant (and already discussed at great in length in doc: contribution guidelines: Clarify and extend #83117). What matters here is: the submitter lost that chance by submitting a larger PR with different authors.

The documentation had for a long time a section that specifically recommends to submit "smaller PRs" for review: https://docs.zephyrproject.org/latest/contribute/contributor_expectations.html#defining-smaller-prs Yet submitters keep submitting large PRs on a regular basis, sometimes very large ones. See a couple of very recent examples below. (Reminder: submitting a giant, draft PR for pure _testing_ purposes and NOT for review is a perfectly fine) The "natural" explanation is that submitters optimistically and wrongly believe that dumping a large amount of code at once onto reviewers will be dealt with faster than in smaller chunks. This is most likely a contributing factor but most people should quickly learn from bad experience. Yet we keep seeing large PRs on a regular basis. So there must be other factors too. Based on personal but fairly extensive git support experience, another top reason is likely git usability and some lack of git knowledge (neither the first nor the last time git usability would have a significant impact) To help with this, add to the existing git guide the simple command that lets split a large submission in several, smaller PRs. This can only help demystify and promote smaller PRs while barely growing the size of the documentation. While at it, also add a couple missing benefits of smaller PRs. Recent examples of large PRs: - In the controversial and giant PR zephyrproject-rtos#77368 (comment) the exhausted submitter wrote: > Every time any one person requests a rebase that changes the PR, > unless there's consensus, there's no mechanism (manual/project process > or built into GitHub) to know/prevent a different person from rejecting > the new changes. That PR had 21 commits (18 in the final version), 82 files changed and 400 (!) comments. The sheer size massively increased the likelihood of the problem described. Re-submitting it in smaller chunks would obviously have mitigated that problem. Yet that PR was never split and stayed huge...? - In this second example, a large PR was submitted with different authors. A disagreement emerged about squashing across different authors: zephyrproject-rtos#78795 (comment) If this PR had been split in smaller chunks, then the squashing discussion might have been avoided entirely. Whether squashing is good or bad in this particular case is irrelevant (and already discussed at great in length in zephyrproject-rtos#83117). What matters here is: the submitter lost that chance by submitting a larger PR with different authors. Signed-off-by: Marc Herbert <[email protected]>

kartben · 2025-01-11T04:09:47Z

This is great—it really is—and I will provide feedback on the suggested additions that I don't see any reason not to include.
What this doesn't address (not that I have an obvious solution to this) is the double standard that can be, and I'm pretty sure will still be, observed where "experienced" maintainers/contributors can get away just fine with sometimes really large PRs that touch multiple areas (sometimes for good reason, as maybe said PRs are already as small as can be, but certainly not always), making it less than obvious to newcomers why they would be doing anything wrong after all given what they see on other PRs. You took two PRs touching boards/SoCs and this is certainly an area where large PRs from vendor themselves tend to not necessarily be frown upon. In fact—and not saying this is done intentionally—they sometimes pack so much that the somewhat opposite problem arises: PR gets merged easy-peasy, but contains changes that should have actually been submitted separately and would have had a different assignee, except they didn't see the PR in time...

To put it differently, I find it unfair to tell contributors something along the lines of "RTFM wrt how to 'properly' contribute" when their natural tendency will always be to use common sense and what they observe as being some kind of norm. "Do as we say, not as you see...".

Am I making sense? Thoughts?

marc-hb · 2025-01-11T05:12:53Z

I totally get what you mean and I think what you call "double standards" and "unfair", others would call it... "trust"?

Contributors with a good track record tend to get less scrutiny whereas brand new people rarely get any slack. I don't think this is intrinsically wrong because the risk level is actually different and trusting old timers reduces the total review workload which is always, always a problem[*]. But there is a very fine balance and it can definitely go too far sometimes.

[*] code reviews are like everything else in QA: everyone wants quality but no one wants to pay for it.

What this doesn't address (not that I have an obvious solution to this)

Me neither - I don't think there is any easy way to solve this besides maintainers trying to keep an eye on their own biases and helping each other. This is not an exact science.

If anything at all, I hope this PR only helps mitigate such "unfairness" by clarifying a guideline that should apply to all?

BTW there is a common misconception that newcomers cannot review code. It is true that new reviewers cannot be as thorough but they can still be very useful:

new eyes can help spot tribal knowledge
they can still find issues and lower the total review workload
great way to learn and progress
last but not least: only way to notice and draw attention to such "double standards"!

where large PRs from vendor themselves tend to not necessarily be frown upon.

This is a bit of a special case because... who cares if vendors break ONLY their own products? Small exaggeration to get the point across, it's getting late. You see what I mean.

but contains changes that should have actually been submitted separately and would have had a different assignee, except they didn't see the PR in time...

Now that is 100% wrong and should be flagged - and probably reverted temporarily!

making it less than obvious to newcomers why they would be doing anything wrong after all given what they see on other PRs.

What I'm trying to reinforce in this PR is: smaller PRs are also in the interest of the submitter themselves. Even experienced people who managed to fly quickly "under the radar" don't want to end up wasting time post merge dealing with regressions, drama and reverts. Do they?

marc-hb marked this pull request as ready for review January 11, 2025 01:26

zephyrbot added area: Documentation area: Coding Guidelines Coding guidelines and style labels Jan 11, 2025

zephyrbot requested review from carlescufi, jfischer-no, kartben, keith-zephyr, nashif and simhein January 11, 2025 01:27

zephyrbot assigned kartben and carlescufi Jan 11, 2025

marc-hb mentioned this pull request Jan 11, 2025

doc: contribution guidelines: Clarify and extend #83117

Open

marc-hb requested a review from henrikbrixandersen January 11, 2025 05:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

doc: show how to use to git to submit smaller and faster PRs #83839

doc: show how to use to git to submit smaller and faster PRs #83839

marc-hb commented Jan 11, 2025

kartben commented Jan 11, 2025

marc-hb commented Jan 11, 2025

doc: show how to use to git to submit smaller and faster PRs #83839

Are you sure you want to change the base?

doc: show how to use to git to submit smaller and faster PRs #83839

Conversation

marc-hb commented Jan 11, 2025

kartben commented Jan 11, 2025

marc-hb commented Jan 11, 2025