Git rebase vs merge - there can be only one.
Introduction
Rebase vs merge, which is better? There are a lot of online discussions about these two “workflows”. One camp is pro-rebasing and the other swears by merging. Both sides are stuck in their own thinking and both think they are right. So which is it? In my opinion: neither or both. Before you get defensive, hear me out. I’ll try to explain this in the following post.
TL;DR
Rebase is a tool used while developing. Merging is a tool used for incorporating your branch into a target branch. Both tools or workflows serve each other in the end.
Rebase - But it rewrites history!
Yes, it does, but let’s not kid ourselves: it only applies your commits on top of the HEAD of your tracking branch after pulling in its changes. The easiest way to explain a rebase is:
A replay of your commits since the common ancestor (the merge-base) on top of the current tip of the branch you rebase onto, which becomes the new merge-base. We essentially reset the merge-base to the current HEAD of your tracking branch and then apply each commit again. Rewriting? Sure, but we just re-apply each commit, similar to cherry-picking. Rebasing is essentially cherry-picking on steroids, without having to create a new branch.
Now, if everything applies cleanly, the content of your commits stays unchanged (only the commit ids change) and they are applied without fuss. If there is a conflict, you’ll be informed and you are required to take action on it, just like in a merge with a conflict. The biggest difference is that this conflict is microscopic: one commit, not your whole merge.
The “it rewrites history” claim is technically true, but holistically false. We are just re-applying commits to ensure we have a linear history, so we can merge our work later cleanly, without conflicts.
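To make the “replay” idea concrete, here is a minimal sketch in a throwaway repository. The branch names, file names and commit messages are made up for the demo. It records the merge-base, the commit id and the patch id of a feature commit before and after a rebase: the first two change, the patch itself does not.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main   # demo branch name, an assumption

echo base > base.txt; git add base.txt; git commit -q -m "Base"

# Branch off and do some work.
git checkout -q -b feature
echo feature > feature.txt; git add feature.txt; git commit -q -m "Feature work"

# Meanwhile main moves on.
git checkout -q main
echo more > more.txt; git add more.txt; git commit -q -m "Main moved on"

git checkout -q feature
old_base=$(git merge-base main feature)
old_tip=$(git rev-parse feature)
old_patch=$(git show feature | git patch-id --stable | cut -d' ' -f1)

# Replay the feature commit on top of the current main.
git rebase -q main

new_base=$(git merge-base main feature)
new_tip=$(git rev-parse feature)
new_patch=$(git show feature | git patch-id --stable | cut -d' ' -f1)

echo "merge-base: $old_base -> $new_base"
echo "commit id:  $old_tip -> $new_tip"
echo "patch id:   $old_patch (before) / $new_patch (after)"
```

The new merge-base is simply the tip of main, which is what makes a later fast-forward or conflict-free merge possible.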
How to find the common ancestor and tracking branch
Once you branch off your starting point (main, master, or whatever branch
you used) and you rebase, git looks at the most recent common ancestor:
common_ancestor=$(git merge-base "$branch@{u}" "$branch")
Before you get confused: $branch@{u} is just a fancy way of asking what the
tracking branch is. You can often inspect this yourself by looking in
.git/config and finding your branch:
[branch "master"]
remote = origin
# Your tracking branch:
merge = refs/heads/master
If you remove the merge = line and run git status, you’ll see that your
branch isn’t tracking anything. If you add it back with git branch foo --set-upstream-to=remote/branchname, the line appears again.
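The lookup described above can be tried out in a scratch repository. This is only a sketch: to stay self-contained it uses the local main branch as the upstream of feature (an assumption for the demo; normally the upstream is a remote branch), and all names are invented.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

echo base > base.txt; git add base.txt; git commit -q -m "Base"
git checkout -q -b feature
echo feature > feature.txt; git add feature.txt; git commit -q -m "Feature work"

# Demo assumption: track the local main branch instead of a remote one.
git branch -q --set-upstream-to=main feature

branch=feature
# Ask git what the tracking branch is...
upstream=$(git rev-parse --abbrev-ref "$branch@{u}")
# ...and where the two histories last met.
common_ancestor=$(git merge-base "$branch@{u}" "$branch")

echo "upstream of $branch: $upstream"
echo "common ancestor:     $common_ancestor"

# The same information as stored in .git/config
# (a remote of "." means "this repository").
git config --get "branch.$branch.remote"
git config --get "branch.$branch.merge"
```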
There is one caveat, and this is a tricky one. When a branch has an upstream
configured, you have fetched, and you run git rebase, you might be in
for a surprise. It has everything to do with --fork-point. Fork-point tries to
be clever: it uses the reflog of the upstream branch to decide where your
branch forked, and it may leave out commits that were once on the upstream and
were removed from it during development, for whatever reason. When you rebase
without arguments, commits you’d expect to be there may suddenly be gone. To
clarify: this happens when you run git rebase, not when you run
git rebase @{u}. Running rebase without additional arguments adds
--fork-point. To counter it you must use --no-fork-point or configure it in
your config: git config set rebase.forkpoint false (on older Git versions
without the set subcommand: git config rebase.forkPoint false).
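A small sketch of how this surprise can come about. The scenario is an assumption chosen for the demo: commit B was once on main, feature forked from it, and main was later rewound so that B is no longer there. The plain merge-base and the fork-point then disagree, and so does the set of commits a rebase would replay.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

echo a > a.txt; git add a.txt; git commit -q -m "A"
a=$(git rev-parse HEAD)
echo b > b.txt; git add b.txt; git commit -q -m "B"
b=$(git rev-parse HEAD)

# feature forks from B.
git checkout -q -b feature
echo f > f.txt; git add f.txt; git commit -q -m "F"

# main is rewound: B is dropped and replaced by C.
git checkout -q main
git reset -q --hard "$a"
echo c > c.txt; git add c.txt; git commit -q -m "C"

plain_base=$(git merge-base main feature)
fork_point=$(git merge-base --fork-point main feature)

# Number of commits each mode would replay on top of main.
replayed_without=$(git rev-list --count "$plain_base"..feature)   # B and F
replayed_with=$(git rev-list --count "$fork_point"..feature)      # only F

echo "plain merge-base: $plain_base, replays $replayed_without commits"
echo "fork-point:       $fork_point, replays $replayed_with commits"
```

With --fork-point, B counts as something the upstream already had and threw away, so it is not replayed. With --no-fork-point it is.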
For more information about this:
Rewriting history - for real this time
Rebasing, in its simplest form, is an automated mass cherry-pick onto an up-to-date branch. But we can indeed rewrite history when we pick other commands in the rebase todo file. This file opens once you start an interactive rebase (and you can reopen it with git rebase --edit-todo when the rebase has stopped and you need to correct something).
You get several options, among them: pick, edit, reword, drop, fixup,
squash, exec, and break.
This is where your pragmatic friend comes in and allows you to do a whole lot
of actions to ensure your commits stay clean. You can reorder commits, you can
drop them, by either changing pick to drop or just removing them from the
TODO.
With reword you can change the commit message, with squash you can squash
two commits into one and rewrite the commit message. You get to see both so you
can combine the texts. With fixup you can squash commits without having to
change the commit message of the commit you are squashing into.
Now these actions aren’t trivial, although once you get accustomed to
them they become second nature.
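As a rough illustration of what editing the todo does, the sketch below changes the second pick into fixup. Normally you would do that by hand in your editor. Here a sed one-liner in GIT_SEQUENCE_EDITOR stands in for the editor so the example can run unattended. The repository contents are invented for the demo.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

echo base > base.txt; git add base.txt; git commit -q -m "Base"
git checkout -q -b topic
echo "helo" > greeting.txt; git add greeting.txt; git commit -q -m "Add greeting"
echo "hello" > greeting.txt; git commit -q -a -m "Fix typo"

# The todo starts out as:
#   pick <sha> Add greeting
#   pick <sha> Fix typo
# Stand-in for hand editing: turn the second "pick" into "fixup".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/'" git rebase -q -i main

commits=$(git rev-list --count main..topic)
subject=$(git log -1 --format=%s topic)
echo "$commits commit(s) left, subject: $subject"
```

The typo fix is folded into the first commit and its message is discarded, which is exactly the difference between fixup and squash.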
You can add breakpoints in a rebase with break. Break stops the rebasing
process and hands control back to your shell so you can perform actions. You
can, for example, split a commit, add a commit manually, or perform other
actions at that point in time. With edit you can pause at a specific commit so
you can amend it. It’s almost exactly the same as break.
Splitting a commit
- Reset the commit
- Reset the changes in the index
- Reapply the commit message from your original commit.
# undo the last commit, keeping its changes staged
git reset --soft HEAD^
# interactively unstage the hunks that belong in another commit
git reset -p
# hack the code, then stage what you want in this commit
git add -p
# commit the changes
git commit
# or
# reuse the previous message: it is often found in .git/COMMIT_EDITMSG
# (check that it holds the message you expect),
# but once you are in worktree land you need other ways to find .git
git commit -F "$(git rev-parse --git-path COMMIT_EDITMSG)"
If you want to modify some of its contents, almost the same logic applies. This is actually rewriting history, but for the greater good. You saw a change that needed to be made in a commit and you changed it before you wanted to merge it in your upstream project.
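Here is a runnable sketch of the split, under a few assumptions. The interactive git reset -p and git add -p are replaced by a plain mixed reset and per-file git add so it can run unattended, the todo is edited by a sed stand-in, and the files and messages are invented.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_EDITOR=true   # never open an editor in this demo
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

echo base > base.txt; git add base.txt; git commit -q -m "Base"
git checkout -q -b topic
echo a > a.txt; echo b > b.txt
git add a.txt b.txt; git commit -q -m "Add a and b"

# Mark the commit as "edit" so the rebase pauses right after applying it.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -q -i main

# Undo the commit but keep its changes in the working tree
# (non-interactive stand-in for reset --soft plus reset -p).
git reset -q HEAD^

# Commit the pieces separately, then let the rebase finish.
git add a.txt; git commit -q -m "Add a"
git add b.txt; git commit -q -m "Add b"
git rebase --continue

subjects=$(git log --reverse --format=%s main..topic | tr '\n' ',')
echo "topic now contains: $subjects"
```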
If you are wondering whether I forgot exec: no. It’s a rather simple thing, it
just executes a command. I use it to run scripts that modify the commit message
with an issue number, so I don’t need to rewrite the commit message manually:
pick <sha> I fixed the issue
exec git prepend-msg ISSUE-1234
And now my prepend-msg tooling changes the previous commit message to
ISSUE-1234: I fixed the issue. But you can do anything here. Go wild.
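A sketch of the same idea without custom tooling. git prepend-msg is my own script and is not shown here, so the inline git commit --amend below is only a hypothetical stand-in for it. The -x (--exec) option of rebase appends an exec line after every pick, which saves you from typing them into the todo.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

echo base > base.txt; git add base.txt; git commit -q -m "Base"
git checkout -q -b topic
echo fix > fix.txt; git add fix.txt; git commit -q -m "I fixed the issue"

# Runs the command after each replayed commit. The single quotes keep
# $(...) from expanding now; it is evaluated when the exec line runs.
git rebase -q -x 'git commit -q --amend -m "ISSUE-1234: $(git log -1 --format=%B)"' main

subject=$(git log -1 --format=%s topic)
echo "$subject"
```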
Processing reviews and feedback
Often when you submit an MR (or PR on GitHub) you get feedback, and this means
you need to alter code or other things. For years I myself used “Processed
feedback from reviewer” commits. And they suck! Hello my friend fixup: git commit --fixup. With the fixup flag
you tell git that this commit is here to fix that commit: git commit --fixup <commit-ish>. Now with git rebase -i --autosquash the rebase editor will
automatically line the fixup commit up under the correct commit and mark it as
fixup, so the two get squashed together.
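The fixup flow can be sketched like this, with invented files and messages. GIT_SEQUENCE_EDITOR=true simply accepts the todo that --autosquash generates, which stands in for opening the editor and saving without changes.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

echo base > base.txt; git add base.txt; git commit -q -m "Base"
git checkout -q -b topic
echo "parse v1" > parser.txt; git add parser.txt; git commit -q -m "Add parser"
parser_commit=$(git rev-parse HEAD)
echo docs > docs.txt; git add docs.txt; git commit -q -m "Add docs"

# Review feedback arrives for the parser commit.
echo "parse v2" > parser.txt
git commit -q -a --fixup "$parser_commit"   # creates "fixup! Add parser"

# Accept the reordered todo as generated.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

subjects=$(git log --reverse --format=%s main..topic | tr '\n' ',')
echo "topic now contains: $subjects"
```

The feedback ends up inside the commit it belongs to, and no “Processed feedback” commit is left behind.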
I also often used this when a tester gave feedback on a feature and I needed to fix a commit. Fix it up, and before submitting the code I just rebased and everything was structured correctly. You could argue we were rewriting history, but I think this is not a rewrite, we told a story, got corrected and now we have to tell a different story.
Rebasing on a shared branch
The biggest gripe rebase gets is that you shouldn’t rebase a shared branch because it is bad form. And I fully agree. You can, however, work perfectly sanely with a shared branch if you just work with it correctly, and it doesn’t even require advanced usage of git.
You and your co-workers work on a feature branch and someone creates the main
feature branch on the upstream remote. Everyone else then uses that branch as
their main starting point. People hack away and they want to incorporate
changes. The first thing you start doing is to pull --rebase before you push
your changes to the shared branch. After that you just push. Done.
$ git checkout -b foo
$ git reset --hard upstream/master
$ git push upstream foo:foo
# Everyone uses foo as their remote tracking branch:
$ git branch --set-upstream-to upstream/foo
$ git pull --rebase
# hack hack hack and commit
$ git pull --rebase
$ git push upstream foo:foo
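The workflow above can be simulated locally with a bare repository playing the upstream remote and two clones playing you and a co-worker. This is a sketch: the names alice and bob, the files and the messages are all invented for the demo.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/master

# One person creates the shared branch foo on the upstream remote.
git clone -q -o upstream upstream.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/master
echo base > base.txt; git add base.txt; git commit -q -m "Base"
git push -q upstream master
git checkout -q -b foo
git push -q upstream foo:foo
git branch -q --set-upstream-to upstream/foo
cd ..

# A co-worker picks up the shared branch, commits and pushes.
git clone -q -o upstream upstream.git bob
cd bob
git checkout -q -b foo upstream/foo
echo bob > bob.txt; git add bob.txt; git commit -q -m "Bob's change"
git pull -q --rebase
git push -q upstream foo:foo
cd ..

# Alice committed in the meantime. pull --rebase replays her commit
# on top of Bob's, so the push is a plain fast-forward.
cd alice
echo alice > alice.txt; git add alice.txt; git commit -q -m "Alice's change"
git pull -q --rebase
git push -q upstream foo:foo
merges=$(git rev-list --merges --count foo)
history=$(git log --reverse --format=%s foo | tr '\n' ',')
cd ..

echo "merge commits on foo: $merges"
echo "history: $history"
```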
You can even do ff-only merges if you really want to, but I think this creates
a ton of additional ceremony without a clear benefit, unless each merge is done
via an actual PR/MR workflow.
I think it creates a lot of additional bureaucracy while working on a feature
branch. Although I once saw our frontend team use a shared branch and treat
it as the master branch, so they could merge it into the master branch without
the review ceremony. So every real merge of theirs was treated as “Done”.
Once the shared branch is ready: git rebase upstream/target. This is done by
one of you who worked on the shared branch. And it doesn’t matter who this
person is, it could be the weakest git user in your team: It just needs to be
one person. Everyone else can just pull that branch afterwards, or reset it, or
just wipe it. It doesn’t matter. The work is done. The code is going to be
merged into your target branch.
git fetch upstream
git rebase upstream/targetbranch
git push --force # you can use --force-with-lease if you want, but in this case
                 # our local is leading all the way.
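A sketch of this final act, again with a bare repository as upstream and two invented clones. It uses --force-with-lease instead of --force to show what the lease buys you: the person doing the final rebase gets through, while a co-worker whose view of foo is stale is refused instead of silently overwriting the rebased branch.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/master

# Alice seeds the target branch (master) and the shared branch (foo).
git clone -q -o upstream upstream.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/master
echo base > base.txt; git add base.txt; git commit -q -m "Base"
git push -q upstream master
git checkout -q -b foo
echo feature > feature.txt; git add feature.txt; git commit -q -m "Feature work"
git push -q upstream foo:foo
git branch -q --set-upstream-to upstream/foo
cd ..

# Bob clones now and does not fetch again afterwards.
git clone -q -o upstream upstream.git bob

# The target branch moves on, then Alice performs the final rebase.
cd alice
git checkout -q master
echo more > more.txt; git add more.txt; git commit -q -m "Target moved on"
git push -q upstream master
git checkout -q foo
git fetch -q upstream
git rebase -q upstream/master
if git push -q --force-with-lease upstream foo:foo; then
  alice_push=accepted
else
  alice_push=rejected
fi
cd ..

# Bob still believes foo is where it was before the rebase.
cd bob
git checkout -q -b foo upstream/foo
echo late > late.txt; git add late.txt; git commit -q -m "Late change"
if git push -q --force-with-lease upstream foo:foo 2>/dev/null; then
  bob_push=accepted
else
  bob_push=rejected
fi
cd ..

echo "alice: $alice_push, bob: $bob_push"
```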
And if you took the approach of “my” former frontend team, rebase also allows
you to keep merges: git rebase --rebase-merges.
force or force-with-lease
You generally want to use --force-with-lease. It is safer because it expects the remote branch to still be at a particular commit (by default, the one your remote-tracking branch points to). If that isn’t the case, it aborts the push. Why do we use --force in our example? It’s the final act: everyone has stopped working and has pushed their changes. We rebased, and as the last step before merging we push our local changes back to the remote.
Merge
We talked ad infinitum about rebasing, so merge, what does that do for us?
Proponents of merging have a couple of good points:
- Shared branches cannot be rebased. They are right, you cannot rebase it if you keep changing the common ancestor. If you keep that stable, it’s fair game.
- It preserves the true history, we can see when things were actually integrated and built.
- Merge commits show which commits belonged together as a feature
- Safer for collaboration: No force pushing needed
- Easier to revert: git revert -m 1 undoes an entire feature cleanly
- Bisect-friendly: When a merge introduces a bug, you can bisect within that branch
- Less cognitive load: Junior devs don’t need to understand rebase
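Two of these points, grouping a feature under one merge commit and reverting it in one go, can be sketched as follows. The repository contents are invented for the demo.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

echo base > base.txt; git add base.txt; git commit -q -m "Base"
git checkout -q -b feature
echo one > one.txt; git add one.txt; git commit -q -m "Feature: part one"
echo two > two.txt; git add two.txt; git commit -q -m "Feature: part two"

# --no-ff forces a merge commit even though a fast-forward is possible.
git checkout -q main
git merge -q --no-ff -m "Merge feature" feature
merge_commit=$(git rev-parse HEAD)

# The merge commit ties the feature's commits together.
feature_commits=$(git rev-list --count "$merge_commit^1..$merge_commit^2")
echo "commits brought in by the merge: $feature_commits"

# Undo the whole feature. -m 1 names the first parent (main) as mainline.
git revert --no-edit -m 1 "$merge_commit" >/dev/null

remaining=$(ls | tr '\n' ',')
echo "files after revert: $remaining"
```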
And these are all valid comments, but they lack one crucial insight: merge is
not used during development. It is about the final act:
The merge into the target branch.
Let me explain why merge is the final act and not a during-development tool. I don’t want to dismiss these arguments, but I do want to show why they are incorrectly injected into development workflows.
Merge is a powerful tool for incorporating whole lines of work. It gives structure, shows collaboration, preserves full context as stated in the bullet points. But that strength is exactly what makes it unsuited to development-time syncing.
If one merges a finalized piece of work, I’m more than happy to agree that merge trumps rebase. If one merges upstream changes back into their own work, this should be done via a rebase instead. Why? In six months’ time we don’t care how one made a feature (we do, but we look at commit messages for this). We will more than likely see messy history, especially if one merged back several times during their development cycle. The day-to-day logistics of writing code aren’t that important. And if they were, the commit graph shouldn’t be the one telling you this: the commit message should.
Merge doesn’t allow me to rewrite commits or to reorder, split, drop, or include them. It just merges. It has only one task: merge. It does one thing and it does that thing well. And yes, merge does octopus merges and has other strategies. But none of these are mentioned when discussing merge vs rebase. This is probably because these strategies are often used in the final act: the ceremony of incorporating multiple branches into the target branch. --no-ff and --ff-only tell merge how to perform the final act.
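How the two tools serve each other can be sketched with --ff-only. In this invented scenario the target branch has moved on, so a fast-forward-only merge is refused. After rebasing the feature branch, the very same merge goes through and the history stays linear.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main

echo base > base.txt; git add base.txt; git commit -q -m "Base"
git checkout -q -b feature
echo feature > feature.txt; git add feature.txt; git commit -q -m "Feature work"
git checkout -q main
echo more > more.txt; git add more.txt; git commit -q -m "Main moved on"

# The branches have diverged, so --ff-only refuses to merge.
if git merge -q --ff-only feature 2>/dev/null; then
  before=merged
else
  before=refused
fi

# Development-time sync: replay the feature on top of main.
git checkout -q feature
git rebase -q main

# The final act: now the merge is a plain fast-forward.
git checkout -q main
if git merge -q --ff-only feature; then
  after=merged
else
  after=refused
fi

merges=$(git rev-list --merges --count main)
echo "before rebase: $before, after rebase: $after, merge commits: $merges"
```

Swap --ff-only for --no-ff in the last merge if you prefer an explicit merge commit marking the integration.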
Conclusion: Two sides of the same coin
Merge advocates are talking about the final integration. Rebase advocates are talking about the development process. They’re arguing past each other.
When you develop you’ll want to incorporate upstream changes into your work. It is generally good hygiene to do it prior to submitting your work to be included. It minimizes merge conflicts. You therefore rebase your branch. You do this to reorder and clean up commits. The history rewriting of these commits is a second utility of rebase. It allows fine-grained control of what you put in each commit. You can split commits, swap them, drop them, squash them, reword them. It is your cleaning tool and pragmatic friend.
Rebase is cleaner, and shows that your branch was ready at a particular merge-base. That’s when the feature got incorporated into the master branch. That’s the reality we want to see and what end users will experience. The in between of a developer workflow isn’t important to the product.
Merging is just the ceremony where you incorporate the final work into the target branch.
The next time you read a discussion about rebase vs merge: know git isn’t the Highlander world where there is only one — they both serve their purpose for different reasons in your workflow.