Model

I think a user can comprehend many of the git commands, actions, concepts and workflows by picturing just three things at first: the working directory, the index, and the commit graph.
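To make those three things concrete, here is a minimal sketch in which one file holds a different version in each of them. The throwaway repo, the file name notes.txt and the demo identity are assumptions made up for this illustration.

```shell
#!/bin/sh
# Sketch: one file, three versions (commit graph, index, working directory).
set -eu

# Hypothetical identity so git-commit works on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)            # throwaway directory for the demo repo
cd "$repo"
git init -q

echo "one" > notes.txt
git add notes.txt
git commit -q -m "first"     # the commit graph now holds "one"

echo "two" > notes.txt
git add notes.txt            # the index now holds "two"

echo "three" > notes.txt     # the working directory now holds "three"

in_commit=$(git show HEAD:notes.txt)   # version in the latest commit
in_index=$(git show :notes.txt)        # version staged in the index
in_workdir=$(cat notes.txt)            # version on disk

echo "commit: $in_commit, index: $in_index, working directory: $in_workdir"
# prints: commit: one, index: two, working directory: three
```

In the same repo, git diff compares the working directory with the index, and git diff --cached compares the index with HEAD.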

Terms

Note that terms like repo and tree are used in many ways, and it can take experience, fluency and context to pick up on the intended meaning. One might use the term repo for the entire project directory, the .git directory, or the commit graph. Tree can refer to the working directory, the commit graph, the virtual directory (or state) associated with each commit, and perhaps even the distributed nature of git.

Changes to the commit graph

Notice that git-commit adds a commit to the graph, based on the state represented by the index. On the other hand, git-merge, git-rebase and git-cherry-pick all work from commits that already exist in the commit graph, and create new commits from them. These are effectively the only commands that change your commit graph, though git-fetch can also add commits and move remote-tracking refs.
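A small sketch of the first point, that git-commit records what is in the index rather than what is on disk. The repo, the file name and the identity are placeholders invented for the demo.

```shell
#!/bin/sh
# Sketch: git-commit snapshots the index, not the working directory.
set -eu

# Hypothetical identity so git-commit works without any git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

echo "v1" > file.txt
git add file.txt
git commit -q -m "first"
before=$(git rev-list --count HEAD)    # commits reachable from HEAD so far

echo "v2" > file.txt
git add file.txt                       # stage v2
echo "v3" > file.txt                   # edit again, but do not stage
git commit -q -m "second"

after=$(git rev-list --count HEAD)
recorded=$(git show HEAD:file.txt)     # what the new commit actually holds

echo "commits: $before -> $after, recorded content: $recorded"
# prints: commits: 1 -> 2, recorded content: v2
```

The unstaged v3 stays in the working directory and shows up in git status as a modification.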

Movement of refs

Besides what these commands do to commits, one should also note what they do to refs. In general, whichever branch is currently checked out is the ref that moves. For example, when you git-merge, only the ref of the checked-out branch moves, even though other aspects of merging are symmetric.
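The merge example can be checked with a sketch like the following. The branch names main and topic, the file names and the identity are assumptions for the demo.

```shell
#!/bin/sh
# Sketch: after a merge, only the checked-out branch ref has moved.
set -eu

# Hypothetical identity so git-commit works without any git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the initial branch "main"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b topic                # diverge on a second branch
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic work"

git checkout -q main                    # back to main, which also advances
echo main > main.txt
git add main.txt
git commit -q -m "main work"

main_before=$(git rev-parse main)
topic_before=$(git rev-parse topic)

git merge -q --no-edit topic            # main is checked out during the merge

main_after=$(git rev-parse main)
topic_after=$(git rev-parse topic)

[ "$main_before" != "$main_after" ] && echo "main moved"
[ "$topic_before" = "$topic_after" ] && echo "topic stayed put"
```

Checking out topic and merging main instead would produce the same combined content, but it would be topic that moves.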

What is a commit‽

A commit consists of a virtual directory, a list of ancestors (more precisely, its direct parents), the author and date of the commit, who made the commit (the committer), and a commit message.
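These pieces can be seen by printing a raw commit with git cat-file -p. In this sketch the author and committer are given different, made-up names so the two fields are easy to tell apart.

```shell
#!/bin/sh
# Sketch: print the raw fields stored in a commit.
set -eu

# Hypothetical people: the author wrote the change, the committer recorded it.
export GIT_AUTHOR_NAME="Alice Author" GIT_AUTHOR_EMAIL=alice@example.com
export GIT_COMMITTER_NAME="Carol Committer" GIT_COMMITTER_EMAIL=carol@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

echo "v1" > file.txt
git add file.txt
git commit -q -m "first"

echo "v2" > file.txt
git add file.txt
git commit -q -m "second"      # a second commit, so that a parent exists

commit_text=$(git cat-file -p HEAD)
printf '%s\n' "$commit_text"
# The output has one "tree" line (the virtual directory), one "parent" line,
# an "author" line and a "committer" line (each with a timestamp),
# then a blank line followed by the commit message.
```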

List of ancestors‽

Commits often point to a single commit, which we refer to as the parent. However, a commit can also have zero parents, or any number of them. The first, or initial, commit has zero parents. When we merge two branches, the merge commit has two parent commits, but you can also merge any number of branches at once. In fact, the Linux kernel is said to have four initial commits [1].
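A repo with more than one initial commit is easy to build by merging two unrelated histories. This is only a sketch: the branch names are invented, and it assumes a git new enough to have the --allow-unrelated-histories option of git-merge (2.9 or later, as far as I know).

```shell
#!/bin/sh
# Sketch: zero-parent commits and a two-parent merge commit.
set -eu

# Hypothetical identity so git-commit works without any git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the initial branch "main"

echo a > a.txt
git add a.txt
git commit -q -m "root of main"

git checkout -q --orphan other          # new branch with no history
git rm -q -rf .                         # empty the index and working directory
echo b > b.txt
git add b.txt
git commit -q -m "root of other"        # a second initial commit

git checkout -q main
git merge -q --no-edit --allow-unrelated-histories other

# rev-list --parents prints the commit followed by its parents,
# so the number of parents is the word count minus one.
merge_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
root_parents=$(( $(git rev-list --parents -n 1 HEAD^1 | wc -w) - 1 ))
roots=$(git rev-list --count --max-parents=0 HEAD)

echo "merge parents: $merge_parents, root parents: $root_parents, initial commits: $roots"
# prints: merge parents: 2, root parents: 0, initial commits: 2
```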

Virtual directory‽

I’ve borrowed the term “virtual directory” from Aha! Moments When Learning Git – BetterExplained. A commit is actually one of a few types of objects stored in the git repo, so we call it the commit object. Each commit object points to precisely one tree object, and each tree object in turn points to any number of blob objects and any number of other tree objects. The blob objects represent the contents of files. In this way, the tree object that a commit points to represents a virtual directory, or a previous state, of our project.

Two points to add there. First, I really like the way these concepts about objects are explained by John Wiegley in his book “Git from the Bottom Up”, linked above. Second, you don’t have to think about this decomposition of commit objects as you begin your quest to learn git. I think it is sufficient to note that each commit essentially records the state of our project as it was when the commit was made.
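For readers who do want to peek at the decomposition, a sketch like this walks from a commit object to its tree object and down to a blob. The directory and file names are placeholders for the demo.

```shell
#!/bin/sh
# Sketch: walk commit -> tree -> (sub)tree -> blob with git cat-file.
set -eu

# Hypothetical identity so git-commit works without any git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir docs
echo hello > docs/readme.txt
echo top > top.txt
git add .
git commit -q -m "snapshot"

commit_type=$(git cat-file -t HEAD)                 # type of the commit object
tree_id=$(git rev-parse 'HEAD^{tree}')              # the one tree it points to
tree_type=$(git cat-file -t "$tree_id")

git ls-tree "$tree_id"                              # entries: a tree (docs) and a blob (top.txt)

subtree_type=$(git cat-file -t HEAD:docs)           # a directory is another tree
blob_type=$(git cat-file -t HEAD:docs/readme.txt)   # a file is a blob
blob_content=$(git cat-file -p HEAD:docs/readme.txt)

echo "$commit_type -> $tree_type -> $subtree_type -> $blob_type ($blob_content)"
# prints: commit -> tree -> tree -> blob (hello)
```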

Project, index, and HEAD

More on this later.

Random notes and snippets

I may collect git notes and snippets here that I hope to eventually fold into a larger understanding of git.

A trick described on the #git IRC channel to count the number of initial commits:

git rev-list --count --max-parents=0 --all

And another one, to count the initial commits reachable from the current HEAD:

git rev-list --count --max-parents=0 HEAD

15:45: I'm a fan of having $ watch -cn.5 'git log --graph --all --color' on the side


  1. The Biggest and Weirdest Commits in Linux Kernel Git History!