Neural DownloadNEURAL DOWNLOAD
← cd ../blog

How Databases Search a Billion Rows

Every database you use is built on B-trees. See how they minimize disk I/O to search billions of rows efficiently.

b-treedatabasedata structurepostgresqlmysqlsqlite
Share

Your database has a billion rows. You run a query. The answer comes back in milliseconds. How?

Not with binary search. Your data lives on disk, and disk reads come in fixed-size pages — four to sixteen kilobytes each. A spinning disk takes about ten milliseconds per read. Binary search on a billion records means ~30 page reads. On a spinning disk, that's 300 milliseconds for a single query.

The bottleneck isn't comparisons — your CPU handles millions per second. The bottleneck is disk I/O. What if each node could hold not one key, but hundreds?

One Node, One Disk Page

A B-tree node holds many keys in sorted order, with child pointers between them. The critical design choice: each node is sized to fill exactly one disk page.

One disk read loads hundreds of keys into RAM. Binary search within that page is essentially free. Find which range your target falls in, follow the child pointer, read the next page.

Keys Per Node2 Levels3 Levels4 Levels
500250,000125 million62.5 billion

A billion rows? Three to four disk reads. That's it. This is why PostgreSQL, MySQL, and SQLite all use B-trees as their default index structure.

Splits: Growing from the Top

Inserting a key is straightforward — slot it into sorted position within a node. Until the node is full.

When a node overflows, it splits. The node breaks into two halves, and the median key gets pushed up to the parent. If the parent is full, it splits too. The cascade continues upward until it finds room — or the root itself splits.

When the root splits, a new root is created above it. The tree grows from the top, not the bottom. This guarantees something remarkable: every leaf is always at the same depth. Not approximately. Not eventually. Mathematically guaranteed on every insert and every delete.

That means the number of disk reads for any lookup is bounded by the height of the tree. Predictable. Small. Consistent.

Why It's Even Faster in Practice

That root node your database keeps? It's cached in RAM. The second level usually is too — they're accessed so frequently they almost always live in the page cache.

So in practice, finding any row among a billion takes two, maybe three actual disk reads. Not thirty. Two.

Most databases actually use a variant called a B+ tree, where all data lives in the leaves and the leaves are linked together. Need a range query? Find the start, then walk sideways through the leaf chain. No jumping back to the root.

FeatureBinary Search TreeB-Tree
Keys per node1Hundreds
Disk reads per lookup~30 (billion rows)3-4
Balance guaranteeRequires rebalancingAlways perfectly balanced
Optimized forRAMDisk page I/O

B-trees aren't just a data structure. They're the bridge between your data and the physical reality of storage hardware. Next time your query returns instantly, you'll know what's working underneath.

Watch the full animated breakdown: B-Trees: How Your Database Finds Data