What Is an Index?
A database index is a data structure that improves the speed of data retrieval at the cost of additional storage and slower writes. Think of it like the index at the back of a textbook: instead of scanning every page to find a topic, you look up the page number in the index and go directly there.
How B-Tree Indexes Work
The most common index type is the B-tree. It organizes values in a balanced tree structure where each node contains sorted keys and pointers to child nodes. A lookup starts at the root and traverses down, eliminating half the remaining candidates at each level. This gives O(log n) lookup time, even for tables with billions of rows.
When to Create an Index
Create indexes on columns that appear frequently in WHERE clauses, JOIN conditions, and ORDER BY clauses. However, do not index everything. Each index slows down INSERT, UPDATE, and DELETE operations because the index must be maintained alongside the table data.
A good rule of thumb: if a query scans more than 5-10% of a table, an index is unlikely to help. The database optimizer may choose a sequential scan instead.
Specialized Index Types
Beyond B-trees, modern databases offer specialized index types:
- GIN indexes for full-text search and JSONB queries. - GiST indexes for geometric and range data. - BRIN indexes for very large tables where data is naturally ordered (e.g., time-series). - Hash indexes for simple equality lookups (rarely better than B-tree in practice).
Monitoring Index Usage
Use pg_stat_user_indexes to identify unused indexes that are wasting storage and slowing writes. AI for Database can surface this information with a simple question: "Which indexes in my database are never used?" This helps you keep your schema lean and performant.