RMRM Full Stack & AI Engineer · All guides · Roadmaps
Databases · guide

Database Indexing Explained

A concise, beginner-friendly technical guide covering what database indexes are, why they matter for performance, how they work under the hood, and best practices to avoid common pitfalls.

What Is a Database Index?

A database index is a separate data structure that stores a sorted subset of column values along with pointers to the corresponding rows in a table. It works similarly to a book's index: instead of scanning every page, you jump directly to the relevant entry. Indexes are created on one or more columns and are maintained automatically by the database engine as data changes.

Why Indexes Matter for Performance

Without an index, the database performs a full table scan, reading every row to find matches — an O(n) operation that becomes painfully slow on large tables. With an index, lookups can drop to O(log n) or even O(1) depending on the index type. This difference can mean queries that took minutes completing in milliseconds, making indexes one of the highest-impact performance tools available to developers.

How Indexes Work: B-Tree and Beyond

The most common index structure is the B-Tree (Balanced Tree), which keeps values sorted and allows searches, insertions, and deletions in O(log n) time. Hash indexes offer O(1) exact-match lookups but cannot support range queries. Specialized types like bitmap indexes suit low-cardinality columns, while full-text indexes enable linguistic search across large text fields.

Composite and Covering Indexes

A composite index is built on multiple columns, such as (last_name, first_name), and is most effective when queries filter on those columns in the same left-to-right order. A covering index includes all the columns a query needs, allowing the database to satisfy the query entirely from the index without touching the base table. This eliminates an extra 'heap fetch' step and can dramatically reduce I/O.

The Write Overhead Trade-Off

Every index must be updated whenever rows are inserted, updated, or deleted, adding write overhead and consuming additional disk space. Over-indexing a write-heavy table can slow down INSERT and UPDATE operations significantly. A good rule of thumb is to index columns used frequently in WHERE clauses, JOIN conditions, and ORDER BY expressions, while avoiding indexes on columns with very high write rates and low read selectivity.

Key Gotchas and Best Practices

Indexes are silently ignored when a function is applied to an indexed column in a WHERE clause (e.g., WHERE YEAR(created_at) = 2024 bypasses an index on created_at — use a range condition instead). Regularly run EXPLAIN or EXPLAIN ANALYZE to verify that the query planner is actually using your indexes. Monitor index usage statistics and drop unused indexes, as they cost write performance and storage without providing any query benefit.

Go deeper with an AI tutor that teaches this in context — and quizzes you on it.
Open the app — free to start

© RM Full Stack & AI Engineer · All guides · Roadmaps · Open the app