Sam Gentle.com

Metagit

There's an interesting problem with Git and, indeed, any version control system. These systems are designed to track changes over time, so you write some code, record those changes, write more code, record those new changes, and the history is preserved. The problem is, what if you want to change the history itself?

Let's say it turns out that your first change was actually a mistake. You could commit a third change that reverts the first one, but that's not the same thing as removing your mistake entirely; you can still see it in the history. Alternatively, you could go in and edit the history to make it appear like your mistake never happened. The decision between these options is a matter of considerable debate.
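To make the difference between those two options concrete, here is a toy sketch in Python. It is not how Git actually stores anything: the commit_id and build helpers are made up for illustration, and the only thing borrowed from Git is the idea that a commit's id depends on its parent.

```python
import hashlib

def commit_id(parent, change):
    """Derive an id from the parent id and the change (a loose imitation of Git)."""
    return hashlib.sha1(f"{parent}:{change}".encode()).hexdigest()[:8]

def build(changes):
    """Turn a list of change descriptions into a list of (id, change) commits."""
    history, parent = [], None
    for change in changes:
        parent = commit_id(parent, change)
        history.append((parent, change))
    return history

original = build(["add feature", "add tests"])

# Option 1: revert. The mistake stays visible, but the existing ids are untouched.
reverted = build(["add feature", "add tests", "revert 'add feature'"])

# Option 2: rewrite. The mistake is gone, but the surviving commit has a new id.
rewritten = build(["add tests"])

print(reverted[:2] == original)            # the old commits are still there
print(rewritten[0][0] == original[1][0])   # same change, different identity
```

In this toy model, anyone holding the original history can extend it to the reverted one, but has nothing in common with the rewritten one.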

Ultimately, the problem that Git is designed to solve is maintaining a consistent history, and it can't do that when you go back and meddle with the timelines. However, there are various aesthetic (who cares about my fifty "oops now it's really fixed" commits?) and practical (what do I do if I committed a secret key?) reasons to rewrite history. So you end up with weird compromises, like only editing history if you're really sure nobody else will see it, or coordinating your changes manually ("hey, do you have my latest rebase? No, I mean the one after that one..."). That's exactly the problem Git was meant to solve in the first place!

The central issue is mutability. We can have a consistent mutable codebase because we have an immutable history to coordinate it. Once you start making changes to that history, it stops being immutable, and you lose that consistency. If we want the history to be consistent and mutable, we need to use the same trick we used to make our mutable code consistent: another history.

Essentially, this is something like what I described in a null of nulls. Changes to code form a history; changes to the history form a meta-history: a history of histories. Each thing you change needs a level of history above it. Adding a meta-history to Git would allow you to have a nice, curated history for human consumption while giving the tools enough information to handle synchronisation properly.

There is something a little bit like a meta-history already, in the form of the git reflog, but it's nowhere near sophisticated enough. It's mostly designed for human repair when history-rewriting goes wrong. What if, instead, there was some in-system representation, like a reflog but with enough information to synchronise, merge and reconstruct history changes? It might even turn out to be similar enough to the existing commit structure that most of the code and storage format would be the same.
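As a rough sketch of what such a representation might look like, here is a small Python model. Everything in it is hypothetical: MetaCommit, MetaHistory, the field names and the short commit ids are invented for illustration and do not correspond to anything in Git itself. Each entry records that a branch tip moved, and it is chained to the previous entry the same way commits are chained to their parents.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class MetaCommit:
    """One entry in a hypothetical meta-history: 'the branch tip moved'."""
    parent: Optional[str]   # id of the previous meta-commit, or None for the first
    old_tip: Optional[str]  # commit id the branch pointed at before
    new_tip: str            # commit id the branch points at now
    reason: str             # free-form label, e.g. "commit" or "rebase"

    @property
    def id(self) -> str:
        data = f"{self.parent}|{self.old_tip}|{self.new_tip}|{self.reason}"
        return hashlib.sha1(data.encode()).hexdigest()[:8]

class MetaHistory:
    """An append-only log of tip movements for a single branch."""

    def __init__(self):
        self.entries = []

    def record(self, new_tip, reason):
        last = self.entries[-1] if self.entries else None
        entry = MetaCommit(
            parent=last.id if last else None,
            old_tip=last.new_tip if last else None,
            new_tip=new_tip,
            reason=reason,
        )
        self.entries.append(entry)
        return entry

    def current_tip(self):
        return self.entries[-1].new_tip if self.entries else None

    def resolve(self, tip):
        """Map any tip this branch has ever had to the current one."""
        if tip not in {e.new_tip for e in self.entries}:
            raise KeyError(f"unknown tip: {tip}")
        return self.current_tip()

meta = MetaHistory()
meta.record("a1", "commit")
meta.record("b2", "commit")
meta.record("c3", "rebase")  # history rewritten: b2 was replaced by c3

print(meta.resolve("b2"))
```

The point of the sketch is only that someone still holding b2 could, in principle, ask the meta-history what replaced it instead of asking a colleague. A real design would also need to say how the rewritten commits map onto the old ones, which this leaves out.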

Of course, the obvious question is: what if you want to make changes to the meta-history? That seems like the kind of problem that will come up eventually and there's no reason why you can't. Assuming the meta-git format is flexible enough to reference itself, you can get as many levels of meta-history as you need. That's not to say it's a good idea, though. One extra level of history is already complex enough.