This post originated from an RSS feed registered with Java Buzz
by Elliotte Rusty Harold.
Original Post: How to squash git commits
Feed Title: The Cafes
Feed URL: http://cafe.elharo.com/feed/atom/?
Feed Description: Longer than a blog; shorter than a book
Lately I’ve been working on a project that uses git for storing source code. I’ve previously written the fourth edition of Java Network Programming in asciidoc with all files checked into git, but that was a very different experience: single author, no branches, always working against master. In other words, it was much like my experiences with Perforce, Subversion, CVS, and (this is really dating me) RCS.
The new project is more traditional git: many branches, many developers, many forks. Perhaps the git/bitkeeper distributed model makes sense for projects like the Linux kernel where there are many independent repositories on many developers’ machines, none authoritative. However, for a traditional single-team, single-repository project, git feels far too heavyweight and complex for my tastes. I find it slows me way down. Still, like most developers, I’m slowly getting used to it and developing my own small subset of the vast corpus of git functionality that I actually use.
Git is designed to support frequent commits and to pass change requests back and forth as lists of commits, so the development work is tracked, rather than passing file diffs back and forth like most other systems. What really confuses me, though, is that no one seems to actually use it this way. If you want to submit a patch, you do not in fact send the list of commits that shows the history and ongoing work. Instead you rebase everything against master and send a single commit that squashes all the changes together, which seems to be exactly what git is designed to make unnecessary. In other words, we’re using git as if it were a traditional single-master system such as Subversion. Why? And does any project actually expect developers to send their full list of commits rather than a single squashed commit?
(Side note: Perforce is the best of both worlds here. To my knowledge, Perforce and its clones are the only version control systems that manage to separate out the ongoing work in a change list and the final commit, and show you both depending on what you want to see.)
Regardless of the wisdom of discarding all history before submitting, like removing all the scaffolding before publishing a mathematical proof, it is how almost all git-based projects operate. Like most (all?) operations in git, it is far from obvious how to actually squash a series of commits down so it’s one clean diff with the current master. And also like most operations in git, there are multiple ways to do this. What follows is the approach I’ve found easiest and most reliable:
Assumptions:
Working on a branch named feature_branch, not a fork and definitely not master.
GitHub is central. (May not be relevant.)
“Head” is origin/master.
Only one developer works on a given branch at a time.
There may be other, implicit assumptions about workflow I don’t realize yet. E.g. I don’t know if this works with a non-GitHub system.
Finally, go to the GitHub UI and merge origin/feature_branch into origin/master. Of course, this may change if your team has a different workflow or does not use GitHub.
In more detail:
First make sure master is up-to-date:
$ git checkout master
$ git fetch
$ git pull
Now merge the master into the feature branch:
$ git checkout feature_branch
$ git merge master
At this point, you’ll be presented with a bunch of screens in your editor of choice. Just save them all.
Now here comes the magic. We’re going to throw away all the commits but leave the changes in place:
$ git reset origin/master
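To see in isolation what this kind of reset does, here is a minimal sketch in a throwaway repository. The directory, the file name notes.txt, and the commit messages are all made up for illustration, and because the sandbox has no remote, the reset target is a saved commit hash rather than origin/master. It relies on the default (mixed) behavior of git reset: the branch pointer and index move back, while the files on disk are left alone.

```shell
#!/bin/sh
set -e

# Hypothetical sandbox; nothing here touches a real project.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "dev@example.com"
git config user.name "Example Dev"

# One base commit, then two more commits on top of it.
echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
base=$(git rev-parse HEAD)

echo "two" >> notes.txt
git commit -q -a -m "Second commit"
echo "three" >> notes.txt
git commit -q -a -m "Third commit"

# Default (mixed) reset: the branch moves back to the first commit,
# but the working tree keeps all three lines.
git reset -q "$base"

commits_left=$(git rev-list --count HEAD)
lines_in_file=$(wc -l < notes.txt | tr -d ' ')
pending=$(git status --porcelain)
echo "commits: $commits_left, lines in file: $lines_in_file, status: $pending"
```

The later commits are gone from the branch, yet the file still contains their edits and shows up as modified, which is the state the next step starts from.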
Now you have a bunch of changed but uncommitted files in your local repository. If you’ve added any new files, they are now untracked, so add them explicitly:
$ git add path/to/untracked/file1 path/to/untracked/file2 ...
Then commit everything as a single change:
$ git commit -a -m "Here's what this feature does"
The one thing you lose in this process is your old commit message, so you need to enter it again.
Finally force push your new clean commit to the server, overwriting the previous commits:
$ git push -f origin feature_branch
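The whole sequence can be rehearsed end to end before trying it on a real project. The following is only a sketch under stated assumptions: a local bare repository stands in for the GitHub remote, the file names and commit messages are invented, the "other developer" is simulated by committing to master locally, and `--no-edit` accepts the default merge message instead of opening an editor.

```shell
#!/bin/sh
set -e

# Hypothetical sandbox: a bare repo plays the role of the central server.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/origin.git"
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.email "dev@example.com"
git config user.name "Example Dev"
git remote add origin "$sandbox/origin.git"

# Initial state of master on the server.
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"
git push -q -u origin master

# Feature branch with a couple of work-in-progress commits.
git checkout -q -b feature_branch
echo "step one" >> file.txt
git commit -q -a -m "WIP: step one"
echo "new" > new_file.txt
git add new_file.txt
git commit -q -m "WIP: add new file"
git push -q -u origin feature_branch

# Meanwhile master moves ahead (simulated here in the same clone).
git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "Unrelated change on master"
git push -q origin master

# The steps from the post.
git checkout -q master
git fetch -q
git pull -q
git checkout -q feature_branch
git merge -q --no-edit master
before_tree=$(git rev-parse "HEAD^{tree}")   # content before squashing

git reset -q origin/master                   # drop commits, keep changes
git add new_file.txt                         # new files are untracked again
git commit -q -a -m "Here's what this feature does"
git push -q -f origin feature_branch

# Checks: one commit ahead of origin/master, identical content as before.
after_tree=$(git rev-parse "HEAD^{tree}")
ahead=$(git rev-list --count origin/master..feature_branch)
echo "commits ahead of origin/master: $ahead"
[ "$before_tree" = "$after_tree" ] && echo "content unchanged by the squash"
```

Comparing the tree hashes before and after is a cheap way to confirm that squashing changed only the history, not the files.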
There are other ways to squash git commits, in particular using the rebase command. However, in my experience rebasing gets very confusing after more than a few commits, especially if you’ve had to merge changes from other developers or branches into your feature_branch in the meantime. Resetting to origin/master moves your branch back to head while leaving your files untouched, so what remains is effectively a single diff between your local branch and head, which is a lot easier to follow.