Daniel Taylor

Poor man’s git

Git is “free as in freedom”, mine is “poor as in you-poor-thing.”

The goal here is to create a very minimal form of source code version control. I’d be shocked if any of this is new. If anything similar to this has been done before, hit me up, I’d love to hear about it.

For personal, non-collaborative use, git is a bit overkill. In my opinion.

A minimal version control system doesn’t include branches. I can count on two hands the number of times I’ve used branches in one of my personal solo projects. And I can count on zero hands the number of times that I’ve needed them.

That’s not a legitimate argument for why branches aren’t necessary for version control. I said those things because I just don’t like using them. I’m asserting that branches aren’t necessary.

Commit messages, commit signing, emails, conflict merging, and any kind of built-in server are all absent from a minimal version control system. The only things present are code snapshots and the ability to manipulate the history.

I’m going to lay out the general structure, which can be implemented in a variety of ways. Then I’m going to give a simple implementation using common tools.

Theory

Trying to not be needlessly rigorous about it, the setup is as follows:

You have files, named things like $F_1$, $R_n$, etc. You also have collections (folders) of files, written like $D = (F_1 \mid F_2 \mid F_3)$.

You have the $+$ and $-$ operators on these files and folders that result in another file or folder, such that $A = B - C \iff B = A + C$, etc etc. $-$ is supposed to be thought of as a file diff, and $+$ as applying that diff.

You have a number of snapshots of the codebase, $C_n, C_{n-1}, \dots, C_1, C_0$. $C_0 = \emptyset$, the null file. In this theory the null file isn’t an empty file, it’s a nonexistent file.

You have the codebase merged differences, $U_n = (C_n \mid U_{n-1} - C_n)$. The $U$ here, very suggestively, stands for “unidirectional”. The second file there, denoted $R_n = U_{n-1} - C_n$, I’m going to refer to below as a “reverse patch”. For completeness’ sake, $R_0 = \emptyset$, the null file.

That’s the main idea!

If you have a version of the codebase $C_n$, or rather the associated $U_n = (C_n \mid R_n)$, and want to recover the previous version of the codebase $C_{n-1}$, you just do $C_n + R_n = C_n + (U_{n-1} - C_n) = U_{n-1} = (C_{n-1} \mid R_{n-1})$. From there you can rewind again to the next previous version if you wish.
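To make the rewind concrete, here is the same calculation unrolled for two commits. It is only a worked instance of the definitions above, with folder contents written side by side and $U_0$ taken to be the empty starting state $(C_0 \mid R_0)$:

```latex
\begin{align*}
U_1 &= (C_1 \mid R_1), & R_1 &= U_0 - C_1,\\
U_2 &= (C_2 \mid R_2), & R_2 &= U_1 - C_2,\\
C_2 + R_2 &= C_2 + (U_1 - C_2) = U_1 = (C_1 \mid R_1),\\
C_1 + R_1 &= C_1 + (U_0 - C_1) = U_0.
\end{align*}
```

Each rewind hands you back both the older snapshot and the reverse patch needed for the next rewind, which is why only the latest $U_n$ has to be stored.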

Implementation

As a practical matter, to implement this you need two folders. One is your “code” or “working directory,” the one where you do your normal coding. In most situations this fills the role of $C_{n+1}$, the next version of your codebase before it gets committed.

You also need your “backup directory,” which corresponds to $U_n$. This is the one that you keep backed up.

You need two folders because at the very least you need to know what files have changed. This is no different from git, except there it’s stored in your .git folder.

You can implement the version backtracking with a couple GNU tools. $-$ corresponds to diff and $+$ to patch.
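As a quick sanity check of that correspondence, here is a minimal sketch on two throwaway files. The file names (a.txt, b.txt, restored.txt, b_minus_a.patch) are made up for the illustration, and it assumes GNU diff and patch are installed:

```shell
#!/bin/sh
# Hypothetical demo: diff plays the role of "-", patch plays the role of "+".
set -eu
work=$(mktemp -d)
cd "$work"

printf 'alpha\nbeta\n'  > a.txt
printf 'alpha\ngamma\n' > b.txt

# "b - a": diff exits with status 1 when the files differ, so tolerate that.
diff -u a.txt b.txt > b_minus_a.patch || true

# "a + (b - a)" should give back b.
cp a.txt restored.txt
patch -s restored.txt < b_minus_a.patch
cmp -s restored.txt b.txt && echo "a + (b - a) matches b"

# Applying the same patch in reverse undoes it, giving back a.
patch -s -R restored.txt < b_minus_a.patch
cmp -s restored.txt a.txt && echo "reverse patch gives back a"
```

The same two commands work on whole directory trees once diff is given -r and -N, which is what the commit script relies on.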

A commit, or transitioning $U_n$ to $U_{n+1}$, is pretty simple:

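# Start: CODE = C(n+1) , BACK = U(n)
# -p1 assumes $CODE and $BACK are single-level relative names (e.g. code, back);
# without -p, GNU patch keeps only each file's base name, which breaks subfolders.

diff -rN -u0 "$CODE" "$BACK" > reverse.patch
# reverse.patch = R(n+1)

patch -R -p1 -d "$BACK" < reverse.patch
# BACK = U(n) + -R(n+1) = U(n) + C(n+1) - U(n) = C(n+1)

mv reverse.patch "$BACK/reverse.patch"
# BACK = (C(n+1) | R(n+1)) = U(n+1)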

To rewind the backup directory:

# Start: BACK = U(n)
# -p1 assumes the patch was made from single-level relative names (e.g. code, back)

mv "$BACK/reverse.patch" reverse.patch
# BACK = C(n), reverse.patch = R(n)

patch -p1 -d "$BACK" < reverse.patch
# BACK = C(n) + R(n) = C(n) + U(n-1) - C(n) = U(n-1)

rm reverse.patch
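Here is a self-contained sketch that runs two commits and one rewind in a scratch directory. The function names commit and rewind, the directory names code and back, and the file hello.txt are all made up for the demo. The -p1 flag is my assumption so that patch keeps relative paths, and -s only silences patch's output:

```shell
#!/bin/sh
# Hypothetical demo of the unidirectional scheme: commit twice, rewind once.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir code back

commit() {
    # diff exits with status 1 when the trees differ, so tolerate that.
    diff -rN -u0 code back > reverse.patch || true
    patch -s -R -p1 -d back < reverse.patch    # back = C(n+1)
    mv reverse.patch back/reverse.patch        # back = U(n+1)
}

rewind() {
    mv back/reverse.patch reverse.patch        # back = C(n)
    patch -s -p1 -d back < reverse.patch       # back = U(n-1)
    rm reverse.patch
}

echo "version 1" > code/hello.txt
commit
echo "version 2" > code/hello.txt
commit

rewind
echo "backup now holds: $(cat back/hello.txt)"
```

After the rewind, back should again contain the first version of hello.txt together with the reverse.patch that belongs to it, ready for another rewind.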

Bidirectional movement

There are a couple obvious improvements that need to be made before this is usable.

First, being able to go forwards as well as backwards in the codebase history would be nice.

Instead of the unidirectional $U_n$, we’re going to be working with the bidirectional $B_n = (C_n \mid R_n \mid B_{n+1} - U_n)$. That last file, denoted $F_n = B_{n+1} - U_n$, is also called a “forward patch.” Note that you can also write $B_n = (U_n \mid F_n)$.

The problem is that $F_n$ can’t be constructed as you’re coding, because you’d need future knowledge of your codebase as you’re committing. So as you’re coding we’re going to adopt the convention that, if $n$ is the version number of the latest source, $F_n = \emptyset$, the null file. Then, when you start scrubbing back and forth through the history the $F_k$’s can be constructed because the final code state is known.

With this modification you can scrub the history forwards and backwards. Given $B_n = (C_n \mid R_n \mid F_n) = (U_n \mid F_n)$, going backwards can be done by $B_{n-1} = (C_n + R_n \mid B_n - (C_n + R_n))$. Going forwards through the source history can be done by $B_{n+1} = U_n + F_n$.
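As a consistency check, stepping back and then forward should land on the same $B_n$. Using $C_n + R_n = U_{n-1}$ from the theory section and the rule $A = B - C \iff B = A + C$, a short derivation goes:

```latex
\begin{align*}
B_{n-1} &= (U_{n-1} \mid F_{n-1}), \qquad F_{n-1} = B_n - U_{n-1},\\
U_{n-1} + F_{n-1} &= U_{n-1} + (B_n - U_{n-1}) = B_n.
\end{align*}
```

So the forward patch built while going backwards is exactly what the forward step needs.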

The second obvious improvement is that it’d be nice if the code directory updated when going back and forth through the history, so that you don’t have to switch your editor to a different directory.

[Figure: commit.svg. Panels: Commit, Backwards in history, Forwards in history]

The script for commit is the same as before:

# CODE = C(n+1), BACK = B(n) = U(n)
# -p1 assumes $CODE and $BACK are single-level relative names (e.g. code, back)

diff -rN -u0 "$CODE" "$BACK" > reverse.patch
patch -R -p1 -d "$BACK" < reverse.patch
mv reverse.patch "$BACK/reverse.patch"

To step back a commit:

# CODE = C(n) , BACK = B(n)
# -p1 assumes $CODE and $BACK are single-level relative names (e.g. code, back)

patch -p1 -d "$CODE" < "$BACK/reverse.patch"
# CODE = C(n) + R(n) = C(n) + U(n-1) - C(n) = U(n-1)

diff -rN -u0 "$CODE" "$BACK" > forward.patch
# forward.patch = B(n) - U(n-1) = F(n-1)

patch -R -p1 -d "$BACK" < forward.patch
# BACK = B(n) + -F(n-1) = B(n) + U(n-1) - B(n) = U(n-1)

mv forward.patch "$BACK/forward.patch"
# BACK = B(n-1)

rm -f "$CODE/reverse.patch"
# CODE = C(n-1); -f because U(0) has no reverse.patch to remove

To step forward a commit:

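# CODE = C(n) , BACK = B(n)
# -p1 assumes $CODE and $BACK are single-level relative names (e.g. code, back)

mv "$BACK/forward.patch" forward.patch
# BACK = U(n)

# R(0) is the null file, so at n = 0 there is nothing to copy
if [ -f "$BACK/reverse.patch" ]; then cp "$BACK/reverse.patch" "$CODE/reverse.patch"; fi
# CODE = U(n)

patch -p1 -d "$BACK" < forward.patch
# BACK = B(n+1)

patch -R -p1 -d "$CODE" < "$BACK/reverse.patch"
# CODE = U(n) + C(n+1) - U(n) = C(n+1)

rm forward.patch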
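To see the three scripts working together, here is a sketch that commits twice, steps back, and steps forward again in a scratch directory. The function names (commit, step_back, step_forward), the directory names code and back, and hello.txt are made up for the demo. As before, -p1 is my assumption for keeping relative paths and -s just keeps patch quiet:

```shell
#!/bin/sh
# Hypothetical demo of bidirectional movement through the history.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir code back

commit() {
    # diff exits with status 1 when the trees differ, so tolerate that.
    diff -rN -u0 code back > reverse.patch || true
    patch -s -R -p1 -d back < reverse.patch
    mv reverse.patch back/reverse.patch
}

step_back() {
    patch -s -p1 -d code < back/reverse.patch        # code = U(n-1)
    diff -rN -u0 code back > forward.patch || true   # F(n-1) = B(n) - U(n-1)
    patch -s -R -p1 -d back < forward.patch          # back = U(n-1)
    mv forward.patch back/forward.patch              # back = B(n-1)
    rm -f code/reverse.patch                         # code = C(n-1)
}

step_forward() {
    mv back/forward.patch forward.patch              # back = U(n)
    cp back/reverse.patch code/reverse.patch         # code = U(n)
    patch -s -p1 -d back < forward.patch             # back = B(n+1)
    patch -s -R -p1 -d code < back/reverse.patch     # code = C(n+1)
    rm forward.patch
}

echo "version 1" > code/hello.txt; commit
echo "version 2" > code/hello.txt; commit

step_back;    after_back=$(cat code/hello.txt)
step_forward; after_forward=$(cat code/hello.txt)
echo "after step back: $after_back"
echo "after step forward: $after_forward"
```

The working directory follows the history in both directions, and back ends up with no forward.patch once you are at the newest commit again, matching the $F_n = \emptyset$ convention.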

You can implement commit messages by editing a dedicated COMMITMSG file. You could implement a git log by stepping backwards in history in a loop, printing COMMITMSG with cat at each step, and then going forwards again. Something like:

while [ -f "$BACK/reverse.patch" ]
do
    cat "$BACK/COMMITMSG"
    ./reverse.sh
done

while [ -f "$BACK/forward.patch" ]
do
    ./forward.sh
done

Etc etc.