There’s lots of places that you can find talking about how to write a good commit message. Some will even give good technical reasons for doing so, but only a few will give you actually good reasons. When’s the last time you had to use
git log --oneline or send a commit via email? Maybe Linus does this everyday, but you don’t.
The one reason I find most important when you’re thinking about the git history as a whole (not just on a per-commit basis) is during peer review. In my current work place and my last the way we did code reviews was by a collogue pushing to github, and opening a pull request. It was then someone else’s job to review and pull that in. That seems to be a very popular workflow from talking to a number of tech teams the past few months whilst interviewing.
Forget about being easily able to look back in the project’s history and see why something was changed, or being able to see a good message on
git bisect. They’ll come incidentally with the following method.
Review is the one stage in product development when you’ll be reading commit messages the most, but it seems like no commit messages I’ve encountered recently, at the very least on my team, are written for that part of the process.
Failing to describe the scope of the message
The first thing I want from a commit message is to know the mindset I should be in whilst reviewing. What are you trying to achieve with the code change you’ve introduced? This Doctrine commit message could be slightly better by saying “always add ‘isEmbeddedClass’ to the serialise array”. I could look at the commit message, and at the code change and see “yup, this is doing exactly what the author wanted it to do.” From that point, I can start thinking of side affects the code change might have on edge cases.
The problem with “fix bug” is that I can’t tell what issue with the code you’re talking about, and so the only review I can do is “yep, that’s syntactically correct”. I can still think about potential side affects, but I’ve no idea if the original writer has thought about and accepts them.
Splitting one unit of work across too many commits
A unit of work should be valuable in some way. A commit should hold one unit of work. If we were to revert this one commit from your pull request, how many other commits would have to be reverted too before we got to a working piece of code? It should be as few as possible without affecting other features.
This usually happens when you’re working in stages and commit each stage. That’s fine at giving you good rollback points, but they should be merged together later on when you’re self-reviewing. An example of this is in the BrowserQuest repo, two commits right after each other with the same commit message: ae773f and 9c1d5.
A more common (and more annoying) issue is when a file has been forgotten to be added, and so the second commit is “lol, forgot to add this file”.
These are exactly what
--amend flag to git was added for; instead of adding a new commit, just push the changes into the last commit. Slightly more complicatedly,
--fixup exists too.
Never mix styling or whitespace changes with functional changes
Most teams with an old codebase have the idea to clean as they go along – that’s a good idea (rather than a huge, up front investment).
These commits should be on their own, because I’m not even going to read them. Your team mates should just be able to auto-merge these commits in (especially in a language like PHP where whitespace doesn’t matter).
The problem is that if you’re changing from tabs to spaces, the git diff will say that every line has changed. Your functional changes will get buried and it won’t be possible to see if anything will have unexpected side effects.