[policy] Decide on format of commit message

Created on 8 April 2024, 10 months ago

Problem/Motivation

The commit message format is not easy to scan to find the reason the change was made.

From this comment 🌱 [policy, no patch] Determine format for commit credit for individuals/organizations/customers Needs review drumm states the GitLab is unlikely to support a list of user names. GitLab provides Commit message templates which use a defined set of variables.

There have been several issues about the format of the commit message, some related to credit and some not. However, the credit system is implemented on Drupal.org and doesn't not rely on parsing the commit message. Thus, the message can change.

Steps to reproduce

Proposed resolution

TBA
The solution can be multiple lines, agreed to in 🌱 Discuss if git commit messages should be multiple lines Needs review

Remaining tasks

Discuss
Reach agreement
Implement the plan

User interface changes

API changes

Data model changes

Release notes snippet

🌱 Plan
Status

Active

Version

11.0 🔥

Component
Other 

Last updated about 1 hour ago

Created by

🇳🇿New Zealand quietone

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @quietone
  • 🇺🇸United States dww

    To me, the two most essential things we need are:

    1. A valid 1-line summary that shows you what the commit changed, without an incomprehensible wall of usernames.
    2. That summary must contain a way to find the issue where the change was proposed, discussed, implemented.

    https://www.conventionalcommits.org/en/v1.0.0/ is gaining popularity, and there's a lot to recommend it. It basically handles point #1, but doesn't solve for #2. Here's the TL;DR of the spec:

    <type>[optional scope]: <description>
    
    [optional body]
    
    [optional footer(s)]
    

    In a Slack thread on this topic a while ago, @larowlan and I decided that we could still put the issue ID into the "description" area without breaking anything. For example, here's a recent core commit:

    Issue #3420997 by sorlov, quietone, DanielVeza, smustgrave, alexpott, mstrelan: Convert MediaSource plugin discovery to attributes
    

    In the new world, this would become:

    task(media): #3420997: Convert MediaSource plugin discovery to attributes
    
    Authored-by: sorlov <xxx@no-reply...>
    Authored-by: quietone <xxx@no-reply...>
    Authored-by: DanielVeza <xxx@no-reply...>
    Reviewed-by: smustgrave <xxx@no-reply...>
    Reviewed-by: alexpott <xxx@no-reply...>
    Reviewed-by: mstrelan <xxx@no-reply...>
    

    Yes, I fully understand that we don't need the credits embedded in the commit history anymore (and haven't for a long time). But part of the joy of Git is that you can work offline with it. I don't want to need a live internet connection to d.o to see who was involved in a given change. I still think the history should speak for itself, as much as possible.

    I'll also note that the type aspect of conventional commits is exactly what I've been clamoring for in other issues on this topic, to stop using the meaningless "Issue" bytes, and replace that with the category of the issue (bug, task, feature, etc). In conventional, the type is supposed to be "fix" for bug and "feat" for feature, but you can use whatever other values you want, e.g. "task". I'm fine sticking to the standard for "fix" and "feat", although personally, I think "bug" and "feature" are more self-documenting. 🤓

    The optional scope would map to an issue's Component. That's a nice touch, but I'm not as set on including that as the other aspects.

    So that's my modified conventionalcommits proposal. Not putting it in the summary yet since it's just me, but putting it here for consideration.

    Thanks!
    -Derek

  • 🇫🇷France nod_ Lille

    I was pleasantly surprised by Typo3 commit messages that link to the review page (their setup is different than ours so grain of salt and everything). They're nice and informative, for example: https://forge.typo3.org/projects/typo3cms-core/repository/1749/revisions...

  • 🇳🇿New Zealand quietone

    Moved the suggestion in #3 to an option in the issue summary.

    Where does 'media' come from in "task(media):" ?

  • 🇺🇸United States cmlara

    For me a good commit message should tell you the main points you need to know, and should be self sufficient. The assumption for commit messages is that D.O. looses all of its data tomorrow and we have to rebuild from just the Git Logs.

    The issue number I would suggest would be better placed just above the signoffs and could use the classic "Closes: #isssue or !mr" syntax. It is useful if you need to get into the deep nitty gritty of an issue discussion, however you should only need to do that after fully reading the commit message that may answer your question.

    I believe we absolutely should drop usernames from the first line.

    Yes, I fully understand that we don't need the credits embedded in the commit history anymore (and haven't for a long time). B

    This does help maintain the copyright audit logs which are necessary for licensing. Our current use of the MR's from a shared issue fork means only the person to open the MR is credited as the author which is a legal problem long term.

  • 🇺🇸United States dww

    Re #5: Thanks for putting #3 into the summary.

    Where does 'media' come from in "task(media):" ?

    The spec:

    <type>[optional scope]: <description>
    

    My interpretation from #3:

    The optional scope would map to an issue's Component. That's a nice touch, but I'm not as set on including that as the other aspects.

    Re: #6: Maybe it's only because I've been doing this way in Drupal and every other kind of thing I code for decades, but having the "issue" ID prominent in the summaries is essential. There are lots of places in Git (CLI or otherwise) where *all* you see is the 1-line summary. I'd hate to not know the unique ID for the issue when scanning those.

  • 🇳🇱Netherlands bbrala Netherlands

    At our company we also use conventional commits in a form that DWW suggests. Only adjustment we use is brackets around the issue number

    [jira-ticket] <type>[optional scope]: <description>
    [optional body]
    [optional footer(s)]
    

    For example:

    [SWIS-1234] feat: add blog overview
    
    [SWIS-1234] refactor: show a message when there are no results
    
    Some categories don't have any blog posts at the moment.
    

    So i really really like the suggestion to switch to that. There is a few things this could also help with.

    For realease management you could just check all commits with feat and find a list of all added features that release. Same for different issue types. There are even automated tools that could generate that list for you.

    Other than that, the suggestion by DWW also keeps all the info we have right now, which is great to have the complete history in git.

    Using the scoping could be usefull, this could be subsystem. Just trying to think what would happen if it's a global think, not directly tied to a subsystem.

    I have not looked at how automatable it is, but i think this could actually be built from most information that an issue will have.

  • 🇳🇱Netherlands bbrala Netherlands

    Real log exmaple for reference:

  • 🇺🇸United States cmlara

    Fair point on the one line review.

    Interesting how different development lines happen where some
    have always done it first line and some never have.

    Last sample by @bbrala does look fairly clean IMHO and helps visually focus on the parts one is most interested in.

  • 🇦🇺Australia kim.pepper 🏄‍♂️🇦🇺Sydney, Australia

    Another example of a commit message that has no room for a meaningful message because it's just a list of contributors:

    commit dae385398eb9860650b50f2aac087c33ee35083b (HEAD -> 11.x, origin/11.x)
    Author: Lauri Eskola <lauri.eskola@acquia.com>
    Date:   Sat Apr 27 00:43:13 2024 +0300
    
        Issue #3438895 by finnsky, larowlan, plopesc, m4olivei, ckrina, mherchel, catch, alexpott, nod_, deviantintegral, jwitkowski79, nod_, jponch, jwitkowski79, rkoller, Ana Barcelona, YurkinPark, finnsky, javi-er, alvarito75, ctrlADel, AaronMcHale, Emma Horrell, akshayadhav, claireristow, baluv3, bronzehedwick, NikMis, deviantintegral, hot_sauce, kostyashupenko, gnuschichten, keyboardcowboy, gdd, silviu, Prashant.c, ehsann_95, pjudge, rkoller, Shreya_th, starshaped, bal krishna, meeni_dhobale, mherchel, jo→
    
  • 🇳🇿New Zealand quietone

    I have interpreted #6 and #8 and added them as options to the issue summary.

  • 🇳🇱Netherlands bbrala Netherlands

    For option #8 we might need a # in front to supprt issue linking in gitlab. See: https://docs.gitlab.com/ee/user/project/issues/crosslinking_issues.html

  • 🇳🇱Netherlands bbrala Netherlands
  • 🇳🇱Netherlands bbrala Netherlands

    Ok, that breaks the issue summary because of d.o. replacement... o_O

    Not sure how to communicated that then.

  • 🇳🇱Netherlands bbrala Netherlands

    Found a way (non existing node)

  • 🇺🇸United States bradjones1 Digital Nomad Life

    I am in favor of a first line that is descriptive of the thing being done, and an issue number. I _think_ it would need to be in the form of "Closes #XXX" for GitLab to auto-close the issue, however we might decide we don't need that magic.

    The authored-by annotations are great for linking users to commits and could be used for commit credit. Even if it's not necessary for commit credit (I know that's being wired up somehow) it's still helpful in so far as it is an audit log of who was involved, and even can flow through to other things like github where we have a mirror, and you would get "credit" in their glowing green bar thingie too.

  • Status changed to Needs review 4 months ago
  • 🇳🇿New Zealand quietone

    There are few people discussing this but of those that have I think there is a preference for Option B. I have updated the issue summary with that as the proposed resolution.

  • 🇦🇺Australia kim.pepper 🏄‍♂️🇦🇺Sydney, Australia

    Option B +1

  • 🇩🇪Germany FeyP

    Looking at commits in GitLab, it would be great if I didn't have to expand the commit message to get to the most important information.

    I like that authors and reviewers are moved into the description and that the information is presented in a more structured manner. This information is valuable to have, because it gives credit where credit is due, but it is not the information I'm immediately interested in when looking at commits.

    I'd like to have the following information on the first line:

    1. The issue number
    2. A meaningful commit summary

    The issue number should be in the first part of the first line. That way, I can click right through to the issue without having to expand the commit message.

    Ideally, the commit summary should start with an imperative verb in upper case (e.g. Add xyz, Remove xyz, Convert xyz).

    Honestly, I don't care at all about the conventional commit stuff, I don't need that. It just takes away valuable space on the first line in the part of the commit message that is immediately visible without expanding the message and is hard to correct later, if there was a mistake and it is wrong for some reason (e.g. a change not correctly identified as a breaking change on commit). It means we have to be even more careful checking issue metadata before marking an issue as RTBC, right now we often don't even manage to adjust the issue title into a meaningful commit summary. And I really don't like the "all lower-case" commit summary line in conventional commits, although that doesn't seem to be part of the standard, it is just in all the examples?

    W/r/t #40, the issue closing pattern in GitLab is just a regular expression that can be configured. The default issue closing pattern matches a whole bunch of key words already, but none of the proposals will auto-close issues in GitLab by default. B) is close to what would be needed, but won't match because of the colon. But I'm also not sure that we really want auto-closing for issues.

    Looking at the proposals, A) and C) match some of my requirements. B) does not.

  • 🇦🇺Australia acbramley

    I very much agree with #20, it would be a big loss IMO to lose the issue number from the beginning of the commit.

    IMHO "task(media):" is just cruft that I don't care about. Much more important to have the issue number in that place as 99% of the time when I'm looking at a commit message it's to find the issue that it belonged to.

    Other than that, I'm a fan of Option B but please reconsider moving the issue number

  • 🇸🇰Slovakia poker10

    Currently the issue number is the most important part of the commit message in our company, because it connects with the specific issue, where all details are stored. Issue titles can be sometimes pretty general (or not fully indicative), but the issue numbers is the one item here, which can provide all relevant information.

    So I agree with previous comments to keep the issue number in the beginning of the commit, thus we prefer option A (or B with the issue number in the beginning).

  • 🇺🇸United States dww

    Huge -1 to B. As I and others have said, the only things I REALLY care about in the first line are the issue ID and a short, meaningful summary of the change.

    I also don’t really care about conventional commits as such. However, I think (task|feat|bug) is vastly more useful in the summary than “Issue”.

    If the “scope” part of conventional commits is considered noise, let’s drop it. That was always intended to be optional. Added option D to the summary, which is basically #8.

    I’m very strongly in favor of D over B.

    Thanks,
    -Derek

  • 🇬🇧United Kingdom joachim

    I agree with all the comments about the issue number being the most important. That's the thing I want from a git log, if I'm doing `git log -p` or `git blame` then `git show`.

    I'm not sure about the usefulness of the scope part in option C. Would it be the module name? What about the core components? How do we distinguish between Core-Core and Core-Components, especially when they have the same name (e.g. there's Core/Plugin and Component/Plugin).

    Generally, if I want to see what parts of Drupal a commit touched, I do `git show --name-only`.

    +1 to D.

  • 🇺🇸United States fathershawn New York

    Thank you @dww for Option D. The scope is definitely noise in contrib as it would be the same for every commit in the project. Even in core, anything not in a module might be scope (core)?

    +1 for Option D.

  • 🇧🇪Belgium BramDriesen Belgium 🇧🇪

    Big + for conventional commits! We're using the one from Angular in our projects at work. That paired with semantic release is really super easy to work with!

    I think option D with an optional scope would work best. So you could still do things like "tagging" submodules.

    [#999999] task(submodulename): Changed to dependency injection
    
  • 🇺🇸United States dww

    Re #26: that’s exactly option C.

    But folks are worried about “optional scope”. I tend to agree it’s mostly going to be noise, often duplicate with the summary itself (Eg the summary should tell you basically what part(s) of core are changing), and if you really need to know, we’ll have it in the diff itself and the linked issue.

    That said, I’d be happy with D, C or A. Anything but B. 😂

  • 🇩🇪Germany FeyP

    That said, I’d be happy with D, C or A. Anything but B. 😂

    Same here, preference now for D, so thanks for adding that :). My comment was triggered by comment #18 really, which stated that the consensus seemed to be option B. Until then, I was just a quiet follower ;).

  • 🇺🇸United States dww

    P.s. I’m -1 to trying to have GitLab auto close issues based on commits. I don’t think that’s a useful goal for us, especially in core. We often need at least 2 different MRs for any given change depending on backports. There are change records to review and publish. While auto-close might make sense for smaller contribs (who can already format their commits however they/we want), this issue is for core, and I don’t think we ever can rely on auto close just because any single commit was pushed.

  • 🇺🇸United States dww
    • Removed the duplicate B from proposed resolution and set that to “TBD”.
    • Added a little clarification to D for context.
  • 🇧🇪Belgium BramDriesen Belgium 🇧🇪

    Re #27: Ah ok, that was not clear to me in the format of the IS as this is not following the way conventional commits is doing things.

    The context you added to the comments where this is in that format helps! Thanks ;-).

  • 🇭🇺Hungary Balu Ertl Budapest 🇪🇺

    I welcome this discussion as the Drupal Core (and many contrib) project's commit message customs definitely can be improved.

    Moving as one

    I see that this issue has been created for Drupal Core. However, in my view introducing a new pattern only for Drupal Core but leaving out the contrib space would inevitably alienate the two from each other. Many community members who contribute to both kinds of projects (therefore composing commit messages for their MRs) would have to keep in mind two different customs. Moreover, further development would be needed on Drupal․org to display the suggested commit message at the bottom on each issue page.

    Whichever proposal wins I'd suggest tying its introduction to the date of a Core version release (optimally a major one but if D12 is too far away on the timeline, then the next minor). Regarding the introduction to contrib space, we can choose a calendar day to appoint as the recommendation being in effect (eg. from Jan 1, 2025).

    Practicality first, fanciness after

    I can imagine that adopting the standards of Conventional Commits might ease newcomer contributors to join the Drupal community from the global developer society by decreasing the number of “drupalisms”. However, I also agree with previous comments prioritising the practical value of each message part to our needs rather than just bringing in some fancy brand name to put on the list showing off how standardised the Drupal project is.

    Bear with the people

    Let's face the fact: most Drupal contributors have no practice yet in writing commit messages. In our decade-old patch uploading workflow, a community member trying to fix the code was not required to compose a commit message but to share the suggested code change only in its raw form. Only a smaller subset of our community, the maintainers were dealing with commit messages while the significance of keeping commit history clean remained hidden for the wider audience.

    During this transitional period shifting over to a more modern MR-based workflow, I see in my everyday work even this change causes friction for many. Therefore we need to keep our commit message pattern dead simple as possible if we expect past and current contributors to adopt something totally new for themselves.

    Decrease focus on scope

    Partly based on my previous point, regarding the optional scope (the (media) part in the examples): if something causes questions (comment #5 🌱 [policy] Decide on format of commit message Active ) within the circle of experienced developers then projecting it to the scale of the wider masses will definitely cause general confusion.

    If any maintainer sees added value in including this information in the commits of their projects then offer them the opportunity to do so at the end of the commit description. This way commits will still remain searchable but do not spoil the visual consistency when multiple commits get listed some containing and some not.

    Do not “hardcode”* changeable information

    Regarding the (task|feat|bug) type: if I understand correctly this information would be in connection with the value of the Category field on the issue edit page.

    If my assumption is correct, then I find it unfortunate to copy over data from an easy-to-change source system into a less easily changeable system. This can (and I bet will) cause discrepancies leading to confusion again. IO would love to see some numerical statistics or hear maintainers' voices about how important role this issue category field plays in their everyday job. Such feedback would help us in this discussion a lot to make a more prepared decision about the necessity of sacrificing these 5 characters in a valuable spot of the commit message.
    *Yes, I know about amending to existing commits. This is why in quotation marks.

  • 🇭🇺Hungary Balu Ertl Budapest 🇪🇺

    Updating IS with option E.

  • Status changed to Needs work 4 months ago
  • 🇵🇪Peru marvil07

    Current commit message format

    There use to be documentation around commit message format, looking a bit around that use to be node nid 52287, but now that redirects to node nid 3156588, which is about issue credits.
    AFAIK that format is the following.

    Issue #[issue number] by [comma-separated usernames]: [Short summary of the change].
    

    The current official drupal core hint about what the commit message format should be , following Granting credit to issue contributors is:

    Commit the issue using the default commit message in the "Credit & committing" widget

    That is currently part of drupal.org custom code that produces a template for maintainer, including core maintainers.
    So a message like the example on the description is produced.

    Issue #3420997 by sorlov, quietone, DanielVeza, smustgrave, alexpott, mstrelan: Convert MediaSource plugin discovery to attributes
    

    The problem

    The commit message format is not easy to scan to find the reason the change was made.

    Yes and no.

    Commit messages have historically been succinct.
    The pattern in Drupal core development commit messages has been for almost all of its existence, especially since Drupal 5.x development around 2006, relying on drupal.org issues as the main way to have information around a given change.
    The issue usually has most details, especially around decisions, approaches considered, and more.

    So, for some of us it is just a reflex to go to the related issue for any commit we want to know more about.
    Granted, that is a something that needs to be learned if it is the first time you look at drupal core git history.
    That may not be the best developer experience, but the path is quite clear: go to the issue referred on the commit message.
    Also, the issue is the first of second thing on the first line of the commit, so it is part of the git subject, shown on most of the tooling that reads git.

    The other thing here, and likely the reason not directly highlighted on the issue description is that our current format implies listing all contributors of an issue, and that list can be long.
    That is a practice that comes from times where we did not have issue credits, #2288727: [meta] Provide credit to organizations / customers who contribute to Drupal issues , which is now the used solution.

    Drupal is a really special project, in which it is the norm that multiple people help a given issue.
    That is great, and helps in many ways.
    In fact in the last year more than 79% of the commits involve 2 or more mentions vs. the rest being one mention; which usually means two, i.e. the author and the reviewer/commiter. See the end of the comment for the source of this affirmation.

    If the amount of names, especially because they are before the actual description, is the actual problem that wants to be solved here, yes, moving that into git trailers is a good idea.

    Instead if by "not easy to scan to find the reason the change was made" we think about self-containment of a given commit message, then we may need some extra steps.
    Notice that other major software projects use self-contained commit messages, more on that below.

    Self-containment of a commit message can help to actually read commit messages without looking externally to drupal.org issues.
    That would be really great to have, but it would be a change in process.
    Given the collaborative nature of the drupal community, building that summary as part of the issue process could be a good idea.
    The simplest way is likely to use a new section on the issue summary description to do that.

    On proposed solutions

    All the existing alternatives proposed above use Conventional Commits 1.0.0 as its base, but they do not really address how we would like to embrace that in detail.
    Yes, there is one example per option; but that is far from a clear guideline, especially about git trailers.

    Also, for this to happen, the drupalorg project related code needs to be changed to provide the desired format.
    Otherwise commiters will need to do the user formatting on git trailers manually.

    Namely, what is missing would be to establish how to use \<type\>, [optional body], [optional scope], and [optional footer(s)].

    We should decide on the set of types to use here, or a guideline on what to use.
    E.g. embracing other project set or not.

    The optional scope is also not clear.
    The examples point to "media", which is the main module changed on drupal core git commit 7201d4bfba4aaedf5ff68ab9ccfa035554fa88be, from where the commit messag sample seems to come from.
    The natural fit may be the core subsystem, as defined over MAINTAINERS.txt, but that may needs some standardization of names.

    If we want to make the commit message more self-contained, then the body should be filled most of the time.

    The optional footers also need to be defined, conventional commits point to a custom BREAKING CHANGE: \<description\> footer and then point to other git trailers, which mean any git trailer.
    Examples suggest to use Authored-by: and Reviewed-by:.

    The list of possible git trailers should be defined, so I guess that is reason enough to move this issue back to NW.

    I would suggest to embrace other major project git trailers instead of creating new ones.
    Both linux kernel and git projects use Reviewed-by:, but Authored-by: is not used.
    For that linux kernel project use Co-developed-by:, and git uses Co-authored-by:.

    Apart from those, we may want to define other possible trailers, and if we want to embrace or not BREAKING CHANGE: \<description\>.

    To keep or not to keep issue number

    I think this depends on what we want to do on commit message self-containment.

    If we want to keep the existing approach to have most of the information on the issue, then the issue number should still be the first thing on the commit message.

    If if we move to self-contained commit messages, in our current process, the issue discussion is still useful to reference, so keeping it at least as a git trailer is a good idea.

    It may also be worth to point to a previous relevant comment around why Issue may still be useful:

    Because prepending "Issue" adds some semantic value to the self-contained commit message (even if it's implied, it makes explicit the fact that the following number refers to an issue).

    On some inspiration from other major projects

    Other major projects could be a source of inspiration on how to do things.

    The following links point to how the linux kernel and the git projects define their git trailers, along with more information about how they structure the commit messages.

    https://www.kernel.org/doc/html/latest/process/submitting-patches.html#w...
    https://git-scm.com/docs/SubmittingPatches#sign-off

    A couple of examples of each of them follow.

    netlink: specs: mptcp: fix port endianness from linux's commit 09a45a5553792bbf20beba0a1ac90b4692324d06

    netlink: specs: mptcp: fix port endianness
    
    The MPTCP port attribute is in host endianness, but was documented
    as big-endian in the ynl specification.
    
    Below are two examples from net/mptcp/pm_netlink.c showing that the
    attribute is converted to/from host endianness for use with netlink.
    
    Import from netlink:
      addr->port = htons(nla_get_u16(tb[MPTCP_PM_ADDR_ATTR_PORT]))
    
    Export to netlink:
      nla_put_u16(skb, MPTCP_PM_ADDR_ATTR_PORT, ntohs(addr->port))
    
    Where addr->port is defined as __be16.
    
    No functional change intended.
    
    Fixes: bc8aeb2045e2 ("Documentation: netlink: add a YAML spec for mptcp")
    Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
    Reviewed-by: Davide Caratti <dcaratti@redhat.com>
    Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
    Link: https://patch.msgid.link/20240911091003.1112179-1-ast@fiberby.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    

    wrapper: introduce log2u() from git's commit d343068e4abc5e43d1ef1d5fed42bf4d7aa8cff4

    wrapper: introduce log2u()
    
    We have an implementation of a function that computes the log2 for an
    integer. While we could instead use log2(3P), that involves floating
    point numbers and is thus needlessly complex and inefficient.
    
    We're about to add a second caller that wants to compute log2 for a
    `size_t`. Let's thus move the function into "wrapper.h" such that it
    becomes generally available.
    
    While at it, tweak the implementation a bit:
    
      - The parameter is converted from `int` to `uintmax_t`. This
        conversion is safe to do in "bisect.c" because we already check that
        the argument is positive.
    
      - The return value is an `unsigned`. It cannot ever be negative, so it
        is pointless for it to be a signed integer.
    
      - Loop until `!n` instead of requiring that `n > 1` and then subtract
        1 from the result and add a special case for `!sz`. This helps
        compilers to generate more efficient code.
    
    Compilers recognize the pattern of this function and optimize
    accordingly. On GCC 14.2 x86_64:
    
        log2u(unsigned long):
                test    rdi, rdi
                je      .L3
                bsr     rax, rdi
                ret
        .L3:
                mov     eax, -1
                ret
    
    Clang 18.1 does not yet recognize the pattern, but starts to do so on
    Clang trunk x86_64. The code isn't quite as efficient as the one
    generated by GCC, but still manages to optimize away the loop:
    
        log2u(unsigned long):
                test    rdi, rdi
                je      .LBB0_1
                shr     rdi
                bsr     rcx, rdi
                mov     eax, 127
                cmovne  rax, rcx
                xor     eax, -64
                add     eax, 65
                ret
        .LBB0_1:
                mov     eax, -1
                ret
    
    The pattern is also recognized on other platforms like ARM64 GCC 14.2.0,
    where we end up using `clz`:
    
        log2u(unsigned long):
                clz     x2, x0
                cmp     x0, 0
                mov     w1, 63
                sub     w0, w1, w2
                csinv   w0, w0, wzr, ne
                ret
    
    Note that we have a similar function `fastlog2()` in the reftable code.
    As that codebase is separate from the Git codebase we do not adapt it to
    use the new function.
    
    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    
    In fact in the last year more than 79% of the commits involve 2 or more mentions vs. the rest being one mention; which usually means two, i.e. the author and the reviewer/commiter. See the end of the comment for the source of this affirmation.

    To arrive there, the following was used as the base.

    $ date -u
    vie 13 set 2024 17:58:16 UTC
    $ git describe
    11.0.0-alpha1-541-g1c67d3b559
    $ git log --format=%s --since="1 year ago"| wc -l
    2141
    $ git log --format=%s --since="1 year ago"| sed -e 's/:.*//' -e 's/^Issue #[[:digit:]]* by //'|awk -F, '{print NF}'| sort |uniq -c |sort -nr
        491 2
        440 1
        338 3
        247 4
        183 5
        116 6
         97 7
         56 8
         39 9
         23 11
         20 10
         14 12
         13 14
         10 13
          8 16
          7 17
          6 18
          4 21
          4 20
          3 24
          3 22
          3 19
          3 15
          2 29
          2 25
          1 82
          1 55
          1 52
          1 46
          1 35
          1 34
          1 32
          1 28
          1 26
    

    P.S. It would be so nice to have <details> accepted on the comment format.

  • Status changed to Needs review 4 months ago
  • 🇺🇸United States dww
    1. This is a policy issue for core. We cannot and do not enforce what happens in the "Wild west" of contrib. Yes, many/most contrib maintainers follow core's lead on such things. Certainly the bespoke tooling on d.o issues is largely to "blame", since folks do what's easy. But it's entirely a distraction to worry about differences, enforcement, wringing our hands over how hard it will be if contrib maintainers continue to do their own thing, etc. I just want to be able to read the core Git history and understand WTF is going on, which is close to impossible with the current state of things. Please let's focus on that problem, and not delay this another 10 years (yes, the efforts to fix core's commit conventions have been going on at least that long) by worrying about contrib. If core starts doing this, contrib will (hopefully!) quickly catch up...
    2. This may or not happen before we migrate entirely to GitLab issues. Either way, "we" are almost certainly not going to be changing any drupalorg.module code to better support this. Sadly, there's likely to be some manual effort for crafting the right attribution footers for core commits. That's probably a deal breaker, but I/we can still dream. If it comes to it, we'll have to drop the attributions. Or hopefully there are ways within GitLab to make that possible/easy. I haven't hacked around enough in GitLab to know. There's a chance that GitLab has templates and variables that might be sufficient. So I don't think we need an elaborate bikeshed discussion on the exact formatting of the footer, since all that might have to be dropped for expediency, anyway.
    3. I've only said it dozens of times, but I'll say it again: "Issue #xxxx" is a waste of bytes. What kind of issue? That's part of the point of conventional commits and semantic versioning -- if we see "feat" in the Git logs for a given tag, it should be a new minor version. If we only see "task" and "fix", we know it's still viable for a patch release. Yes, people can / do get in shoving matches over bug vs. feature, and the d.o fields can change. Same with credit + attributions. Things change, but Git history does not. That said, I'd rather the Git history did the best it could to reflect and document what we thought was true at the time the commit was pushed. I don't think it's the end of the world if someone goes back and retroactively changes an issue from a task to a bug but the Git commit still says task.

    Back to NR. I don't see anything of that warrants needing more work here. We need to keep reviewing the options, and ideally converge on agreement. NW will doom this to further delays and more nearly useless core commit messages... 😅

    Thanks,
    -Derek

  • 🇺🇸United States dww

    p.s. Re: #33 and option E: If your concern is that Category and Component can change in the issue but not in the Git history, why put them in the commit message at all (even if only in the footer)? That seems to undo the whole point you were trying to make at the end of #32. I think the "optional scope" here has been a distraction, which is why I dropped it entirely in option D. "Category" is valuable, which is the premise of conventional commits. See #35.3. But option E seems to have all the drawbacks and none of the benefits of putting the category directly into the 1st line summary.

  • 🇭🇺Hungary Balu Ertl Budapest 🇪🇺

    #36 Logically correct, option E updated.

  • 🇬🇧United Kingdom joachim

    > Therefore we need to keep our commit message pattern dead simple as possible if we expect past and current contributors to adopt something totally new for themselves.

    I don't really see how this is an issue -- we usually squash MR commits into one, don't we?

    > There use to be documentation around commit message format, looking a bit around that use to be node nid 52287

    There was. Then I think what happened is that we added a commit message suggestion to d.org, and that just takes the issue title.

    This IMO has made commit messages a LOT WORSE, because issue titles are typically not worded in the same way. They state a problem, whereas the commit message should state an action done.

  • 🇭🇺Hungary Balu Ertl Budapest 🇪🇺

    “I don't really see how this is an issue -- we usually squash MR commits into one, don't we?”

    My point is regarding commits existing on feature branches before the MR gets merged. Your statement refers to the single one commit getting created after the MR got merged into the main branch.

  • 🇬🇧United Kingdom joachim

    I didn't think this issue was concerned with commits on a MR feature branch?

  • 🇺🇸United States dww

    Re #40: 100% correct. This is for the core Git history, the result of the squashed merges that get pushed to the canonical core repo.

    Re: #39: No one cares about intermediary commit messages in MRs, neither core nor contrib. Those aren’t “history”.

  • 🇭🇺Hungary Balu Ertl Budapest 🇪🇺

    #41 Keep your temper, please.

    I understand I misinterpreted the scope of the topic. To outline the base concept here: running a community decision about the pattern of how those few core committers will use when they merge MRs into the Core, right?

    What is it not about?

    • The Contrib space at large
    • Commits on feature branches
  • 🇺🇸United States tim.plunkett Philadelphia

    @baluertl, I see how you mistook @dww's comment as him "losing his temper", but he was stating a fact. Feature branch commit messages can be anything, and that's okay.

    The point I am not clear on is why this wouldn't affect contrib.

    Contrib projects have the same commit message format as core, and have for as long as I can remember.

    I would expect contrib to adopt this change too.

  • 🇳🇿New Zealand quietone

    Thank you to everyone who clearly indicated their preference. I have updated the Issue Summary to indicate that Option D is the preference

    A: 1
    B: 1
    C: 1
    D: 5
    E: 1

  • 🇦🇺Australia kim.pepper 🏄‍♂️🇦🇺Sydney, Australia

    D looks great. I'm going to change my preference to D. 😁

  • 🇺🇸United States dww

    Thanks, y’all! Indeed, I’m not angry, and still have my temper right here by my side where I left it. 😂

    To be extra clear: I agree that contrib will hopefully follow core’s lead on this, especially if we work out tooling to make it easier to get this right. But contrib can already do whatever they want. I’m acknowledging that freedom, and I’m not claiming this issue will mandate contrib to do the same.

    The thing we can change is the format core committers use when they push to the canonical core repo. Once that’s better (and easy), I imagine most of contrib will do the same.

    Thanks again,
    -Derek

  • 🇦🇺Australia acbramley

    +1 for D

  • 🇧🇪Belgium BramDriesen Belgium 🇧🇪

    #46 Totally agree!

    Would be nice if we can suggest the squashed commit message in the recommended format.

  • 🇭🇺Hungary Balu Ertl Budapest 🇪🇺
  • 🇳🇱Netherlands bbrala Netherlands

    I'm all for using my proposed solution. So guess there a decent consensus on D

  • 🇬🇧United Kingdom joachim

    > [#999999] task: Convert

    I kinda feel that the 'type' of fix/feat/task is a waste of bytes too.

    - fix: Commit message will be of the form 'Fixed the broken thingy.'
    - feat: 'Added support for the thingy'.
    - task: Any other verb as the first word.

    We used to have a commit message standard (~ 15 years ago!) which said the first word should be a verb like 'Added / Fixed / Changed'.

    So I prefer E to D, but D is my second choice.

  • 🇺🇸United States dww

    #51 I hear you that the component can usually be inferred by the summary text. But the promise of conventional commits and related tooling is that if you fully standardize on it you can actually grep the history to learn things, make release notes, etc, instead of parsing text and having to “understand” it.

  • 🇧🇪Belgium BramDriesen Belgium 🇧🇪

    Re #51 Without those, you won't ever be able to benefit from semantic-release. Which is something we use on all our projects.

    https://github.com/semantic-release/semantic-release

  • 🇳🇱Netherlands bbrala Netherlands

    In my opinion we have consolidated on option D. Updated the list so helpfully gathered by @xjm

    A: 1
    B: 1
    C: 1
    D: 8 (plus one second choice)
    E: 2

    I'll go so far to say this is RTBC. <3

  • 🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10

    I will add this to the committers next meeting agenda

  • 🇳🇱Netherlands bbrala Netherlands
  • 🇳🇿New Zealand quietone

    I checked on the next step here and that is to open an issue in the infrastructure project. That issue is, 📌 Change format for git commit message Active .

    So, the policy decision has been made and there is an issue to implement, so this issue can be closed as Fixed.

    Thanks everyone!

  • 🇳🇱Netherlands bbrala Netherlands

    Pro tip, print the list in comment 57 and hang it on your monitor ;) Saves a lot of time while getting used to it.

  • Status changed to Fixed about 2 months ago
  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024