Your Branch Is 41 Bytes (Git Internals Explained)

Git is a content-addressable object store, not a diff tracker. See how blobs, trees, and commits actually work under the hood.

Tags: git, git internals, how git works, git tutorial, version control, software engineering

You've used Git almost every day of your career. git add, git commit, git push. But in a poll of experienced developers, 51% said Git stores the differences between versions — a chain of patches, one after another. That's not what's happening.

Snapshots, Not Diffs

Git's data model is built on snapshots. Every single commit captures the complete state of your project — every file, every folder, frozen in time.

But if every commit stores everything, wouldn't that be enormous? No. Git hashes every piece of content. If two files are identical across commits, they produce the same hash — stored once, referenced everywhere. Snapshots are cheap because most files don't change.
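To make the deduplication concrete, here is a minimal Python sketch of content addressing. It assumes a repository using SHA-1 (Git's default; SHA-256 repositories hash differently), and the names `blob_id`, `put`, and `store` are made up for illustration. They are not part of Git.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Hash content the way Git hashes a blob: SHA-1 over a small header plus the bytes."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

store = {}  # toy object store: hash -> content

def put(content: bytes) -> str:
    oid = blob_id(content)
    store[oid] = content  # identical content always lands on the same key
    return oid

first = put(b"hello world\n")   # a file as of commit 1
second = put(b"hello world\n")  # the same file, unchanged, in commit 2
print(first == second, len(store))  # True 1
```

Two "versions" of an unchanged file collapse into one stored entry. In a SHA-1 repository, `git hash-object` on a file with the same bytes should report the same ID as `blob_id`.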

Three Objects, Connected by Hashes

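At its core, Git is a key-value store. You give it content, it hashes it, and stores the result as an object. Git defines four object types; three of them make up every snapshot (the fourth, the annotated tag, is a named pointer to another object):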

Blob: raw file contents. No filename, no metadata. Just bytes and a hash.
Tree: maps filenames (each with a file mode) to blob hashes. Sub-directories are just trees pointing to other trees.
Commit: points to a root tree (the complete snapshot), plus its parent commit(s), author, committer, timestamps, and message.
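The three object types can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Git's implementation: it assumes SHA-1, sorts tree entries by plain name (Git's real ordering treats directories slightly differently), and uses a made-up identity string `WHO`. The helper names `object_id`, `tree_body`, and `commit_body` are hypothetical.

```python
import hashlib

WHO = "A U Thor <author@example.com> 1700000000 +0000"  # made-up identity and time

def object_id(kind: str, body: bytes) -> str:
    """Every object is hashed as '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries) -> bytes:
    """entries: (mode, name, hex_id) tuples. Each becomes 'mode name' + NUL + 20 raw hash bytes."""
    out = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):  # simplified ordering
        out += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    return out

def commit_body(tree_id: str, parent_ids, message: str) -> bytes:
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]
    lines += [f"author {WHO}", f"committer {WHO}", "", message]
    return "\n".join(lines).encode() + b"\n"

# Blobs hold bytes only; the names live in the trees that point at them.
readme = object_id("blob", b"hello world\n")
main_py = object_id("blob", b"print('hi')\n")
src_tree = object_id("tree", tree_body([("100644", "main.py", main_py)]))
root_tree = object_id("tree", tree_body([("100644", "README", readme),
                                         ("40000", "src", src_tree)]))
root_commit = object_id("commit", commit_body(root_tree, [], "Initial commit"))
print(root_tree, root_commit)
```

The commit never mentions a filename, and the blob never knows its own name. Everything is connected by hashes alone.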

Every commit records its parents, so the commits form a directed acyclic graph. Not a straight line — a graph. Branches split off, merge back together, and every node knows exactly where it came from.
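A small sketch of that graph, with single letters standing in for commit hashes (the letters and the `parents` table are invented for illustration). Walking parent links from any commit reaches its entire history, including both sides of a merge.

```python
# Hypothetical history: B branches into C (main) and D (feature), then M merges them.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "M": ["C", "D"],  # a merge commit has two parents
}

def ancestors(commit: str) -> set:
    """Collect every commit reachable from `commit` by following parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents[current])
    return seen

print(sorted(ancestors("M")))  # ['A', 'B', 'C', 'D', 'M']
print(sorted(ancestors("D")))  # ['A', 'B', 'D']
```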

Branches Are Just Text Files

This is the part that surprises most people. Open your .git/refs/heads/main file (if it's missing, Git has packed the ref into .git/packed-refs instead). You'll find one line:

a1b2c3d4e5f6... (40 hex characters)

That is your main branch. A branch in Git is a tiny text file containing a commit hash. Creating a new branch doesn't copy code or duplicate history. It writes 41 bytes to a new file: 40 hex characters plus a newline. This is why branching is nearly instant, even on massive projects.
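Here is a rough Python sketch of what creating a branch amounts to. It writes into a temporary directory standing in for .git, not a real repository, and `PLACEHOLDER_COMMIT` is a made-up 40-character ID. It also assumes SHA-1 and loose (unpacked) refs. `create_branch` is a hypothetical helper, not a Git API.

```python
import os
import tempfile

PLACEHOLDER_COMMIT = "ab" * 20  # 40 hex characters, not a real commit id

def create_branch(git_dir: str, name: str, commit_id: str) -> str:
    """Write the commit id plus a newline to refs/heads/<name>."""
    heads = os.path.join(git_dir, "refs", "heads")
    os.makedirs(heads, exist_ok=True)
    path = os.path.join(heads, name)
    with open(path, "wb") as f:
        f.write(commit_id.encode("ascii") + b"\n")
    return path

toy_git_dir = tempfile.mkdtemp()  # stand-in for a .git directory
branch_path = create_branch(toy_git_dir, "feature", PLACEHOLDER_COMMIT)
branch_size = os.path.getsize(branch_path)
print(branch_size)  # 41
```

No file contents or history are touched. The cost of a branch is independent of the size of the project.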

HEAD is another pointer — it usually points to a branch name, telling Git which branch you're on. The full cycle:

Step	What Happens
`git add`	Stages file contents as blobs
`git commit`	Creates a tree from staged content, wraps it in a commit object
Pointer update	The current branch file moves forward to the new commit hash

Add, snapshot, move pointer. That's the whole cycle.
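The same cycle as a toy in-memory model in Python. This is a sketch under simplifying assumptions: flat files only (no sub-directories), SHA-1, a made-up identity string `WHO`, and dictionaries standing in for the object store, the staging area, and the ref files. None of these names are Git APIs.

```python
import hashlib

WHO = "A U Thor <author@example.com> 1700000000 +0000"  # made-up identity and time

objects = {}              # object id -> (kind, body)
index = {}                # staging area: filename -> blob id
refs = {}                 # branch ref -> commit id
HEAD = "refs/heads/main"  # HEAD names the current branch

def write_object(kind: str, body: bytes) -> str:
    header = f"{kind} {len(body)}".encode() + b"\0"
    oid = hashlib.sha1(header + body).hexdigest()
    objects[oid] = (kind, body)
    return oid

def add(name: str, content: bytes) -> None:
    """Step 1: store the content as a blob and record it in the index."""
    index[name] = write_object("blob", content)

def commit(message: str) -> str:
    """Steps 2 and 3: build a tree from the index, wrap it in a commit, move the branch."""
    tree_body = b"".join(
        f"100644 {name}".encode() + b"\0" + bytes.fromhex(oid)
        for name, oid in sorted(index.items())
    )
    tree_id = write_object("tree", tree_body)
    lines = [f"tree {tree_id}"]
    parent = refs.get(HEAD)
    if parent is not None:
        lines.append(f"parent {parent}")
    lines += [f"author {WHO}", f"committer {WHO}", "", message]
    commit_id = write_object("commit", "\n".join(lines).encode() + b"\n")
    refs[HEAD] = commit_id  # the pointer update
    return commit_id

add("README", b"hello world\n")
c1 = commit("Initial commit")
add("notes.txt", b"remember the milk\n")
c2 = commit("Add notes")
print(refs[HEAD] == c2, len(objects))  # True 6
```

Two commits produce six objects: two blobs, two trees, two commits. The unchanged README blob is written once and referenced by both trees.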

Why This Matters

Once you see Git as objects and pointers, everything else falls into place. rebase replays commits onto a new base: it creates new commit objects with different parents. cherry-pick applies the change a single commit introduced on top of your current commit, producing a new commit object. The reflog is a history of where your pointers have been.
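Why does rebase have to create new commits instead of moving the old ones? Because the parent hash is part of the content being hashed. The Python sketch below isolates that one effect, assuming SHA-1 and using placeholder parent IDs and a made-up identity string. A real rebase also re-applies each change, so the tree and committer timestamp usually differ as well.

```python
import hashlib

WHO = "A U Thor <author@example.com> 1700000000 +0000"  # made-up identity and time

def commit_id(tree: str, parent: str, message: str) -> str:
    """Hash a single-parent commit object for the given tree, parent, and message."""
    body = (
        f"tree {tree}\nparent {parent}\n"
        f"author {WHO}\ncommitter {WHO}\n\n{message}\n"
    ).encode()
    header = f"commit {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # Git's well-known empty tree id
OLD_BASE = "aa" * 20  # placeholder parent ids, not real commits
NEW_BASE = "bb" * 20

before = commit_id(TREE, OLD_BASE, "Add feature")
after = commit_id(TREE, NEW_BASE, "Add feature")
print(before != after)  # True
```

Same snapshot, same message, different parent: a different object. The original commit is untouched and stays reachable through the reflog until it is eventually cleaned up.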

You don't need to memorize Git commands. You need the right mental model. And the model is: content-addressed objects linked by hashes, navigated by pointer files.

Watch the full animated breakdown: Git Internals — What's Actually Inside .git