Every developer has heard “add an index” as the fix for a slow query. Most do it. Few understand why it works — or why it sometimes doesn’t. This post is the explanation I wish I had earlier: what indexes actually are under the hood, how the database decides to use them, and the specific mistakes that silently make them useless.
Let’s start from the beginning.
What an index actually is
An index is a separate data structure that the database maintains alongside your actual data. Its only job is to answer one question faster: “where is the row I’m looking for?”
Without an index, the database has no choice but to scan every single row in the table to find the ones that match your query. This is called a full table scan — and on a table with a million rows, it’s exactly as slow as it sounds.
Query: SELECT * FROM users WHERE email = 'alice@gmail.com'
Without index:
Row 1 → email: bob@gmail.com → not a match, skip
Row 2 → email: carol@yahoo.com → not a match, skip
Row 3 → email: alice@gmail.com → match! return this
...continues scanning all 1 million rows anyway...
With an index on the email column, the database builds a separate sorted structure — a B-Tree (a self-balancing tree that keeps all entries sorted and can be searched in O(log n)) — that maps email values to row locations. Finding Alice’s row becomes a tree traversal: a few pointer hops from root to leaf, directly to the right row. O(log n) instead of O(n).
The tradeoff is real though. Every index takes up disk space, and every write — every INSERT, UPDATE, DELETE — must update not just the table but every index on it too. An index is not free. You’re trading write performance and storage for read performance. This is why you shouldn’t index everything speculatively.
You never call an index directly
Here’s something that surprises many developers: you never actually tell the database to use an index. You just write a normal SQL query. The database’s query planner — an internal component that analyses your query before executing it — decides whether an index exists for your query, and if using it would be faster than a full scan.
SELECT * FROM orders WHERE user_id = 7291;
The planner thinks through this automatically:
- Is there an index on
user_id? Yes —idx_orders_user_id - Is using the index cheaper than scanning the full table? Yes — the index is highly selective
- Use the index → find
user_id = 7291→ fetch matching rows → return
You can see exactly what the planner decided by running EXPLAIN before your query:
EXPLAIN SELECT * FROM orders WHERE user_id = 7291;
This outputs the execution plan — whether it used an index, which one, how many rows it estimated scanning. It’s one of the most useful debugging tools in SQL, and most engineers never use it until something is already on fire.
One important nuance: the planner doesn’t always use an index even when one exists. If a column has low cardinality — few distinct values — the index isn’t selective enough to be useful. For example, a boolean column is_active might be true for 90% of rows. If you query WHERE is_active = true, scanning 90% of the table is barely better than scanning all of it. The planner knows this and will often skip the index entirely.
The primary index — the one you always have
Every table with a primary key has a primary index automatically, whether you think about it or not. Under the hood, most databases build this as a B-Tree sorted by primary key — meaning the data is organised and retrievable in O(log n) time. We’ll get into exactly where this data physically lives shortly, but the short version is: querying by primary key is always the fastest possible lookup.
CREATE TABLE orders (
id INT PRIMARY KEY, ← primary index created automatically
user_id INT,
amount DECIMAL,
status VARCHAR,
created_at TIMESTAMP
);
Querying by primary key is always the fastest possible lookup — one B-Tree traversal, data is right there at the leaf node. This is why WHERE id = 1001 is always fast, regardless of table size.
Everything else — filtering by amount, sorting by created_at, joining on user_id — has no index by default. Those queries hit the full table scan path until you explicitly create secondary indexes.
Secondary indexes — created for your query patterns
A secondary index is an index you create on a non-primary-key column because your application frequently queries on it. Same underlying structure (B-Tree in most cases), just on a different column.
Say your application frequently fetches all orders for a specific user:
-- Without a secondary index on user_id:
SELECT * FROM orders WHERE user_id = 7291;
-- Full table scan. If you have 10 million orders, this reads all 10 million. Slow.
You fix this by creating a secondary index:
CREATE INDEX idx_orders_user_id ON orders(user_id);
Now the same query uses the index — O(log n) lookup, directly to the rows for user 7291. The query itself doesn’t change at all. The planner just now has a faster path available.
The write cost is real though. Every time a new order is inserted, or an existing order’s user_id changes, the database updates this index. On a write-heavy table with many indexes, this overhead compounds. A table with 10 secondary indexes pays 10 index update costs on every write.
This is why the rule is: index for the queries you actually have, not the ones you might have. You can always add indexes later — and removing them is just as easy.
Composite indexes — when one column isn’t enough
Sometimes a single column isn’t selective enough on its own. Maybe you frequently filter by both user_id and status together. A composite index covers multiple columns:
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
But here’s the crucial thing most developers don’t know: the order of columns in a composite index matters enormously.
To understand why, think about a phone book. A phone book is sorted first by last name, then by first name within each last name. That’s exactly how a composite index works — sorted by the first column first, then by the second column within each value of the first.
So the (user_id, status) index physically looks like this on disk:
user_id: 100, status: 'completed'
user_id: 100, status: 'pending'
user_id: 200, status: 'completed'
user_id: 7291, status: 'cancelled'
user_id: 7291, status: 'pending'
user_id: 7291, status: 'shipped'
user_id: 8000, status: 'pending'
All entries are sorted by user_id first. Within the same user_id, they’re sorted by status. Now map the phone book directly to each query:
WHERE user_id = 7291 — you know the last name (Smith). The index is sorted by user_id first, so all rows for user 7291 are physically grouped together. Jump straight there. ✓
WHERE user_id = 7291 AND status = 'pending' — you know last name AND first name (Smith, Alice). Jump to user 7291’s section, then within it find status = 'pending'. Even more precise. ✓
WHERE status = 'pending' — you only know the first name (Alice). Look at the index above — 'pending' rows are scattered at positions 2, 5, and 7. There’s no way to jump to them because the index isn’t sorted by status first. You’d have to scan every single entry. The phone book is useless when you don’t know the last name — and the index is useless when you skip the first column. ✗
-- These queries CAN use the (user_id, status) index:
WHERE user_id = 7291 -- "last name" known → jump to section
WHERE user_id = 7291 AND status = 'pending' -- "last name + first name" → precise jump
-- This query CANNOT use the (user_id, status) index:
WHERE status = 'pending' -- only "first name" known → full scan
If you frequently query by status alone, you need a separate index on just status. The composite index doesn’t help you — because the sort order that makes the index fast simply doesn’t apply to the column you’re searching on.
The general rule: put the most selective column first (the one that narrows results the most), and put columns you filter by range (BETWEEN, >, <) last.
Clustered vs non-clustered — where does the data actually live?
This is one of those distinctions most developers are fuzzy on even after years of using databases — because it rarely comes up until something goes wrong with performance.
To understand it properly, you need to know about one more piece of the puzzle: the heap file.
A heap file is simply a disk file where the database stores your actual row data — in no particular order, just wherever there’s free space. When you insert Alice, then Bob, then Carol, the heap stores them in insertion order regardless of their IDs:
HEAP FILE (on disk — rows in insertion order, no sorting)
──────────────────────────────────────────────────────────
slot 1: {id:3, user:"Carol", amount:510} ← inserted first
slot 2: {id:1, user:"Alice", amount:250} ← inserted second
slot 3: {id:2, user:"Bob", amount:80} ← inserted third
The word “heap” here has nothing to do with RAM — it’s borrowed from the general computer science meaning of an unordered collection. Think of it as a filing cabinet where documents are thrown in wherever there’s space, with no sorting. The heap is always on disk, just like your index files.
Now the clustered vs non-clustered distinction makes sense in physical terms.
Non-clustered: two separate files on disk
In PostgreSQL (which is non-clustered by default), two things exist on disk as separate files:
/data/
├── orders_heap ← the actual rows, unordered
└── orders_id_idx ← the B-Tree index, sorted by id
The index file contains only keys and pointers — not the actual row data:
INDEX FILE (sorted B-Tree) HEAP FILE (unordered rows)
────────────────────────── ──────────────────────────
id:1 → pointer ────────────────→ slot 2: {id:1, user:"Alice", amount:250}
id:2 → pointer ────────────────→ slot 3: {id:2, user:"Bob", amount:80}
id:3 → pointer ────────────────→ slot 1: {id:3, user:"Carol", amount:510}
So SELECT * FROM orders WHERE id = 1 does two hops:
Step 1: Walk the B-Tree index → find id:1 → get pointer to heap slot 2
Step 2: Go to heap file, slot 2 → read the full row
Two files touched. Two reads.
Clustered: one single file on disk
In MySQL InnoDB, there is no separate heap file. The table itself is stored as a B-Tree sorted by primary key. One file, two jobs:
/data/
└── orders.ibd ← this IS both the table and the index. One file.
The B-Tree leaf nodes don’t store pointers — they store the complete rows directly:
orders.ibd — B-Tree sorted by id, full rows live in leaf nodes
──────────────────────────────────────────────────────────────
leaf node: id:1 → {id:1, user:"Alice", amount:250, status:"pending"}
leaf node: id:2 → {id:2, user:"Bob", amount:80, status:"done"}
leaf node: id:3 → {id:3, user:"Carol", amount:510, status:"done"}
Notice Carol is no longer first — because the B-Tree enforces sort order by id, not insertion order. The rows are physically reorganised on disk.
So SELECT * FROM orders WHERE id = 1 does one hop:
Step 1: Walk the B-Tree → find id:1 → full row is RIGHT HERE in the leaf node
One file touched. One read. Done.
A good way to think about this: a non-clustered index is like a library card catalogue — it tells you which shelf to go to, but you still have to walk to the shelf and pick up the book. A clustered index is the card catalogue and the shelf in one — the moment you find the card, the book is attached to it.
MySQL InnoDB always clusters on the primary key. All secondary indexes store the primary key as their pointer, so a secondary index lookup goes: find primary key in secondary index → look it up in the clustered primary index. Two hops, but the second hop is fast because it hits a well-cached B-Tree.
PostgreSQL uses the heap model for everything by default. All indexes — primary or secondary — store a pointer into the heap. Simpler to manage, slightly more I/O per read.
Neither is universally better. InnoDB’s clustered primary key makes primary key reads faster. PostgreSQL’s heap model makes managing many secondary indexes simpler, since they all point to one place.
Covering indexes — answering queries entirely from the index
Here’s a powerful optimisation that most developers discover too late: if all the columns a query needs are stored inside the index itself, the database never has to touch the actual table. The index alone is enough to answer the query. This is called a covering index.
Say your most common query is:
SELECT user_id, amount FROM orders WHERE user_id = 7291;
A regular index on user_id finds the right rows — but then has to make a second trip to the heap file to get the amount column. Two hops per row, for every row returned.
A covering index includes amount directly:
CREATE INDEX idx_orders_covering ON orders(user_id) INCLUDE (amount);
Now the leaf node of the B-Tree contains both user_id and amount. The query is answered entirely from the index — zero trips to the heap file. For hot queries that run thousands of times per second, this difference is significant.
The tradeoff: every included column increases the index size and the write cost. Use covering indexes surgically — for your absolute hottest queries, not as a general strategy.
When indexes silently stop working
This is the section that will save you the most pain in practice. Having an index doesn’t guarantee it gets used. There are specific patterns that prevent the query planner from using an index — and they’re all invisible unless you run EXPLAIN.
Wrapping a column in a function:
-- Index on created_at exists. This query WON'T use it:
WHERE YEAR(created_at) = 2024
-- The function transforms the column value before comparing.
-- The index is sorted on raw created_at values, not YEAR(created_at).
-- The planner can't use the index.
-- Fix: rewrite without the function:
WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31'
A leading wildcard in LIKE:
-- Index on email exists. This query WON'T use it:
WHERE email LIKE '%gmail.com'
-- A leading % means "can start with anything."
-- The index is sorted alphabetically — it can't help when the start is unknown.
-- Fix: trailing wildcards are fine:
WHERE email LIKE 'alice%' -- index works, narrows to emails starting with 'alice'
Implicit type conversion:
-- Index on user_id (INT) exists. This may NOT use it:
WHERE user_id = '7291' -- string, not int
-- The database converts every row's user_id to string before comparing.
-- The conversion bypasses the index.
-- Fix: match the column's actual type:
WHERE user_id = 7291
All three of these look completely normal. None of them throw an error. The query returns correct results — just slowly, silently doing a full table scan instead of using your index. EXPLAIN is the only way to catch them.
The practical mental model
After everything above, here’s how to actually think about indexing on a real project:
Start with no extra indexes. The primary key index is always there. Don’t add secondary indexes speculatively — every index has a write cost, and premature indexing is a real problem on write-heavy systems.
Find your slow queries first. Use EXPLAIN or your database’s slow query log to identify queries doing full table scans on large tables. Those are your actual candidates for indexing.
Index for your most common WHERE, JOIN, and ORDER BY columns. Columns you filter on, join on, or sort by are the highest-value targets. The more distinct values a column has — higher cardinality — the more an index helps.
Watch your write performance after adding indexes. Always benchmark writes after adding indexes to a write-heavy table. A table with 10 indexes will have significantly slower inserts than one with 2.
Use EXPLAIN obsessively. On any query that matters, run EXPLAIN and verify the index you think is being used is actually being used. The three patterns above — functions, leading wildcards, type mismatches — are far more common than they should be.
The golden rule: index for the queries you have, not the queries you might have. Indexes are easy to add. Easy to remove. The hard part is knowing which ones are actually earning their keep.
This is part of the BytesByAbhi series on database internals — building intuition from first principles, one concept at a time. Heavily inspired by Designing Data-Intensive Applications by Martin Kleppmann.
