Indexes are one of the most important performance tools in MySQL.

They can turn a query from a full table scan into a fast lookup. They can help MySQL avoid sorting, reduce disk reads, speed up joins, and make pagination more predictable.

But indexes are also easy to misunderstand.

Many developers know that "indexes make queries faster," but they are not always sure:

  • which columns should be indexed;
  • why composite index order matters;
  • why an index is not used even when it exists;
  • what cardinality and selectivity mean;
  • why LIKE '%text%' is slow;
  • why ORDER BY sometimes ignores an index;
  • why adding too many indexes can hurt writes;
  • how to read EXPLAIN.

This article explains MySQL indexes through real query examples.

The goal is not to memorize rules. The goal is to build the mental model you need when reviewing slow queries in a real backend application.


Learning Plan

Before going deep, here is the plan.

Part 1: Build the Mental Model

You need to understand what an index is, how MySQL uses B-tree indexes, and why indexes are not magic.

You will learn:

  • how a B-tree index is ordered;
  • why equality lookups are fast;
  • why range queries are still efficient;
  • why an index can help with ORDER BY;
  • why a secondary index in InnoDB still needs to find the actual row.

Part 2: Learn Single-Column Indexes

You will see how a simple index helps queries like:

SQL
SELECT * FROM users WHERE email = 'nazar@example.com';

You will also see when a single-column index is not enough.

Part 3: Learn Composite Indexes

Composite indexes are where most real performance work happens.

You will learn why this index:

SQL
CREATE INDEX idx_orders_user_status_created
ON orders (user_id, status, created_at);

can help some queries very well, but not others.

You will learn the leftmost-prefix rule and the common "equality columns first, range column later" strategy.

Part 4: Understand Cardinality and Selectivity

Not all indexed columns are equally useful.

An index on email is usually highly selective.

An index on status may be weak if most rows have the same value.

You will learn how to think about cardinality and why low-cardinality columns are not always useless, but often need to be combined with other columns.

Part 5: Practice With EXPLAIN (and EXPLAIN ANALYZE)

You will learn how to inspect query plans using EXPLAIN, EXPLAIN FORMAT=TREE, and EXPLAIN ANALYZE (MySQL 8.0.18+), and the meaning of the columns that matter most: type, possible_keys, key, rows, filtered, and Extra.

Part 6: Apply Real Indexing Patterns

You will learn practical indexing patterns for login lookups, order history, admin filters, soft deletes, queues, pagination, joins, reporting queries, and covering indexes.

Part 7: Use Modern MySQL 8.x Features

You will learn when to reach for:

  • functional / expression indexes;
  • invisible indexes;
  • descending indexes;
  • multi-valued indexes on JSON;
  • FULLTEXT indexes;
  • index hints.

Part 8: Avoid Common Mistakes

You will learn why these patterns can break index usage:

  • wrapping indexed columns in functions;
  • leading wildcard LIKE;
  • wrong composite index order;
  • too many indexes;
  • redundant indexes;
  • indexing every column;
  • using OFFSET pagination at scale.

Official MySQL References Used

This article is based on practical experience and the MySQL documentation, especially:


1. What Problem Do Indexes Solve?

Imagine a table with 10 million users.

SQL
CREATE TABLE users (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    email VARCHAR(255) NOT NULL,
    name VARCHAR(255) NOT NULL,
    status VARCHAR(30) NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id)
);

Now imagine this query:

SQL
SELECT *
FROM users
WHERE email = 'nazar@example.com';

Without an index on email, MySQL may need to scan the table row by row.

That means:

  1. read row 1;
  2. check if email = 'nazar@example.com';
  3. read row 2;
  4. check again;
  5. repeat until the match is found or the table ends.

On a small table, this is fine.

On a large table, this is painful.

Now add an index:

SQL
CREATE UNIQUE INDEX users_email_unique
ON users (email);

Now MySQL can use the index to find the matching email much faster.

The index is a separate data structure that stores values in sorted order, together with a way to find the matching row.

A simplified version looks like this:

Text
email                         row pointer / primary key
-------------------------------------------------------
anna@example.com              101
john@example.com              482
nazar@example.com             917
sara@example.com              1021

Instead of scanning the whole table, MySQL can search the sorted index.

That is the core idea.

An index helps MySQL avoid unnecessary work.


2. B-tree Indexes: The Default Mental Model

Most normal MySQL indexes use a B-tree-like structure.

When you create a regular index like this:

SQL
CREATE INDEX idx_users_email
ON users (email);

you are usually creating a B-tree index.

The exact internal structure is more complex than a simple sorted list, but the mental model is simple:

  • values are stored in sorted order;
  • MySQL can jump quickly to a value;
  • MySQL can scan a range efficiently;
  • MySQL can sometimes return rows already ordered;
  • MySQL can use the index for equality, range, join, grouping, and sorting operations.

InnoDB B+tree index diagram for an email column, showing root, intermediate, and leaf nodes with a highlighted equality-lookup path and the doubly-linked leaf chain.

B-tree Example: Equality Lookup

SQL
SELECT *
FROM users
WHERE email = 'nazar@example.com';

Useful index:

SQL
CREATE INDEX idx_users_email
ON users (email);

Why it helps:

Text
Find exact email in sorted index → find matching row → return result

This is usually very fast because email has high uniqueness.

B-tree Example: Range Lookup

SQL
SELECT *
FROM orders
WHERE created_at >= '2026-05-01'
  AND created_at <  '2026-06-01';

Useful index:

SQL
CREATE INDEX idx_orders_created_at
ON orders (created_at);

Why it helps:

Text
Jump to first date >= 2026-05-01
Walk the linked leaves forward
Stop at first date >= 2026-06-01

This is much better than checking every order.

B-tree Example: ORDER BY

SQL
SELECT id, created_at, total
FROM orders
ORDER BY created_at DESC
LIMIT 50;

Useful index:

SQL
CREATE INDEX idx_orders_created_at
ON orders (created_at);

Because the index is ordered, MySQL may be able to read rows in index order instead of sorting all rows manually.

This is one reason indexes can help even when there is no WHERE clause.


3. Primary Key vs Secondary Indexes in InnoDB

In InnoDB, the primary key is special.

The table data is organized around the primary key. This is often called a clustered index.

Example:

SQL
PRIMARY KEY (id)

The actual table rows are stored in primary key order.

A secondary index is any other index:

SQL
CREATE INDEX idx_users_email
ON users (email);

In InnoDB, a secondary index stores the indexed column values plus the primary key value.

A simplified secondary index on email might look like:

Text
email                         primary key
------------------------------------------------
anna@example.com              101
john@example.com              482
nazar@example.com             917

Then MySQL uses the primary key to find the full row.

This means a secondary index lookup may involve two steps:

Text
1. Search secondary index.
2. Use primary key to fetch full row.

That second step matters.

If your query can be answered completely from the secondary index, MySQL may not need to fetch the full row. That is called a covering index.

We will cover that later.

Side-by-side comparison of InnoDB clustered primary-key index (leaves hold full row data) and secondary index (leaves hold email plus PK pointer with a PK-lookup hop arrow).


4. Single-Column Indexes

A single-column index is an index on one column.

SQL
CREATE INDEX idx_users_status
ON users (status);

It helps queries that filter, join, group, or sort by that column.

Example: Login by Email

SQL
SELECT id, email, password_hash
FROM users
WHERE email = 'nazar@example.com'
LIMIT 1;

Good index:

SQL
CREATE UNIQUE INDEX users_email_unique
ON users (email);

Why this is good:

  • email is highly selective;
  • the query expects one row;
  • a unique index tells MySQL that only one matching row can exist;
  • it also protects data integrity.

This is a perfect index use case.

Example: Search by Status

SQL
SELECT *
FROM users
WHERE status = 'active';

Possible index:

SQL
CREATE INDEX idx_users_status
ON users (status);

But this may or may not help.

Why?

Because status often has low cardinality.

Maybe your table has only these values:

Text
active
inactive
banned
pending

If 95% of users are active, then this query returns most of the table:

SQL
WHERE status = 'active'

In that case, using the index may not be much better than scanning the table.

Sometimes it can even be worse because MySQL has to:

  1. scan many entries in the index;
  2. jump back to the table for many rows.

So the rule is not "index every filter column."

The better rule is:

Index columns that help MySQL eliminate a large amount of work.


5. Cardinality and Selectivity

Cardinality means how many distinct values a column has.

Selectivity means how well a condition narrows the result set.

These ideas are related.

High Cardinality

Example: email

Text
10,000,000 rows
9,999,500 distinct emails

This is high cardinality.

A condition like this is highly selective:

SQL
WHERE email = 'nazar@example.com'

It probably returns one row.

Great index candidate.

Low Cardinality

Example: status

Text
10,000,000 rows
4 distinct statuses

This is low cardinality.

A condition like this may not be very selective:

SQL
WHERE status = 'active'

If most users are active, the index may not help much.

Medium Cardinality

Example: country_code

Text
10,000,000 rows
200 distinct country codes

This is medium cardinality.

An index may help if the filtered value is selective enough.

For example:

SQL
WHERE country_code = 'IS'

may return a small number of rows.

But:

SQL
WHERE country_code = 'US'

may return a large number of rows.

The usefulness depends on data distribution, not only the number of distinct values.

How to Estimate Selectivity

You can run:

SQL
SELECT
    COUNT(*) AS total_rows,
    COUNT(DISTINCT email) AS distinct_emails,
    COUNT(DISTINCT status) AS distinct_statuses,
    COUNT(DISTINCT country_code) AS distinct_countries
FROM users;

You can also check distribution:

SQL
SELECT status, COUNT(*) AS total
FROM users
GROUP BY status
ORDER BY total DESC;

Example result:

status total
active 9,500,000
pending 300,000
inactive 150,000
banned 50,000

Now you can reason clearly.

An index on status may be useful for:

SQL
WHERE status = 'banned'

but weak for:

SQL
WHERE status = 'active'

Same column. Different value. Different selectivity.


6. Composite Indexes

A composite index is an index on multiple columns.

SQL
CREATE INDEX idx_orders_user_status_created
ON orders (user_id, status, created_at);

This index is not three separate indexes.

It is one ordered structure.

Think of it as being sorted like this:

Text
user_id → status → created_at

First by user_id.

Inside each user_id, sorted by status.

Inside each (user_id, status) group, sorted by created_at.

This order matters a lot.

Nested block diagram of a composite MySQL index on (user_id, status, created_at), showing how rows are physically sorted and how each equality predicate narrows the scan.

Example Table

SQL orders.sql
CREATE TABLE orders (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id BIGINT UNSIGNED NOT NULL,
    status VARCHAR(30) NOT NULL,
    total DECIMAL(10, 2) NOT NULL,
    created_at DATETIME NOT NULL,
    paid_at DATETIME NULL,
    PRIMARY KEY (id),
    INDEX idx_orders_user_status_created (user_id, status, created_at)
);

Query 1: Excellent Match

SQL
SELECT *
FROM orders
WHERE user_id = 123
  AND status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

Index:

SQL
INDEX idx_orders_user_status_created (user_id, status, created_at)

This is a strong match.

Why?

The query filters by:

SQL
user_id = 123
status = 'paid'

Then orders by:

SQL
created_at DESC

The index is already organized by:

Text
user_id → status → created_at

MySQL can jump to the section for user 123, then to status paid, then scan by created_at.

This is exactly the kind of query composite indexes are designed for.

Query 2: Leftmost Prefix Works

SQL
SELECT *
FROM orders
WHERE user_id = 123;

Same index can help:

SQL
INDEX (user_id, status, created_at)

Because user_id is the leftmost column.

This is called the leftmost-prefix rule.

A composite index on:

SQL
(a, b, c)

can be used for:

Text
(a)
(a, b)
(a, b, c)

But not usually for:

Text
(b)
(c)
(b, c)

because those are not leftmost prefixes.

Query 3: Skipping the First Column Is a Problem

SQL
SELECT *
FROM orders
WHERE status = 'paid';

Index:

SQL
INDEX (user_id, status, created_at)

This index is not ideal.

Why?

Because the index is sorted first by user_id.

Across the whole index, status = 'paid' is scattered inside each user group.

Simplified:

Text
user_id  status
----------------
1        canceled
1        paid
1        pending
2        paid
2        refunded
3        canceled
3        paid

There is no single continuous section for all paid orders.

If you often query by status alone, you may need a different index:

SQL
CREATE INDEX idx_orders_status_created
ON orders (status, created_at);

But do not create it automatically. First confirm that the query is important and selective enough.


7. The Leftmost-Prefix Rule

The leftmost-prefix rule is one of the most important indexing rules in MySQL.

For an index:

SQL
INDEX idx_example (a, b, c)

MySQL can efficiently use:

SQL
WHERE a = ?
SQL
WHERE a = ? AND b = ?
SQL
WHERE a = ? AND b = ? AND c = ?

It can also often use:

SQL
WHERE a = ? AND b > ?

But after a range condition, the remaining columns are less useful for narrowing the index lookup.

Example: Equality + Range

SQL
SELECT *
FROM orders
WHERE user_id = 123
  AND created_at >= '2026-05-01'
  AND created_at <  '2026-06-01';

Good index:

SQL
CREATE INDEX idx_orders_user_created
ON orders (user_id, created_at);

Why?

The index groups rows by user_id, then sorts them by created_at.

MySQL can:

  1. jump to user_id = 123;
  2. scan the date range for that user.

Example: Wrong Order

SQL
CREATE INDEX idx_orders_created_user
ON orders (created_at, user_id);

For this query:

SQL
WHERE user_id = 123
  AND created_at >= '2026-05-01'
  AND created_at <  '2026-06-01'

this index is weaker.

It is sorted first by created_at, then user_id.

MySQL can find the date range, but inside that date range it may need to filter many users.

If the date range includes millions of orders, this is expensive.

Practical Rule

For many OLTP queries, a good composite index order is:

Text
equality filters first → range filter → ordering column if compatible

Example:

SQL
WHERE tenant_id = ?
  AND status = ?
  AND created_at >= ?
ORDER BY created_at DESC
LIMIT 50

Good index:

SQL
CREATE INDEX idx_orders_tenant_status_created
ON orders (tenant_id, status, created_at);

Why?

  • tenant_id = ? narrows to one tenant;
  • status = ? narrows further;
  • created_at >= ? scans a time range;
  • ORDER BY created_at can often follow index order.

8. Composite Index Column Order: Real Examples

Choosing column order is one of the most important indexing decisions.

Let's use this table:

SQL payments.sql
CREATE TABLE payments (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    tenant_id BIGINT UNSIGNED NOT NULL,
    user_id BIGINT UNSIGNED NOT NULL,
    status VARCHAR(30) NOT NULL,
    provider VARCHAR(30) NOT NULL,
    amount DECIMAL(10, 2) NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id)
);

Query A: Tenant Dashboard

SQL
SELECT *
FROM payments
WHERE tenant_id = 10
  AND status = 'failed'
ORDER BY created_at DESC
LIMIT 50;

Good index:

SQL
CREATE INDEX idx_payments_tenant_status_created
ON payments (tenant_id, status, created_at);

Why not this?

SQL
CREATE INDEX idx_payments_status_tenant_created
ON payments (status, tenant_id, created_at);

It might also work for this exact query because both tenant_id and status are equality conditions.

But in multi-tenant systems, tenant_id is often a strong leading column because most queries are tenant-scoped.

It also helps more query shapes:

SQL
WHERE tenant_id = 10
ORDER BY created_at DESC

can use (tenant_id, status, created_at) only partially for filtering by tenant, but not perfectly for ordering across all statuses.

For this query:

SQL
SELECT *
FROM payments
WHERE tenant_id = 10
ORDER BY created_at DESC
LIMIT 50;

this index is better:

SQL
CREATE INDEX idx_payments_tenant_created
ON payments (tenant_id, created_at);

That is an important lesson:

One composite index cannot be perfect for every query.

Design indexes for your most important query patterns.

Query B: User Payment History

SQL
SELECT id, status, amount, created_at
FROM payments
WHERE user_id = 777
ORDER BY created_at DESC
LIMIT 20;

Good index:

SQL
CREATE INDEX idx_payments_user_created
ON payments (user_id, created_at);

This is a classic "history page" query.

The index lets MySQL find all payments for one user and return the newest rows quickly.

Query C: Failed Payments Report

SQL
SELECT id, tenant_id, provider, amount, created_at
FROM payments
WHERE status = 'failed'
  AND created_at >= '2026-05-01'
  AND created_at <  '2026-06-01'
ORDER BY created_at DESC;

Possible index:

SQL
CREATE INDEX idx_payments_status_created
ON payments (status, created_at);

This works if status = 'failed' is selective enough.

If many rows are failed, the query may still scan many rows.

If reports are large, you may need a reporting table, summary table, partitioning strategy, or async analytics pipeline. Indexes help, but they do not make every analytical query cheap.


9. Equality, Range, and Sorting

A common query pattern is:

SQL
WHERE a = ?
  AND b = ?
  AND c BETWEEN ? AND ?
ORDER BY c

Good index:

SQL
(a, b, c)

Why?

  • a equality narrows the index;
  • b equality narrows it more;
  • c range scans within that smaller group;
  • ORDER BY c can use the same index order.

Example: Orders Page

SQL
SELECT id, total, created_at
FROM orders
WHERE user_id = 123
  AND status = 'paid'
  AND created_at >= '2026-01-01'
ORDER BY created_at DESC
LIMIT 50;

Good index:

SQL
CREATE INDEX idx_orders_user_status_created
ON orders (user_id, status, created_at);

This is strong because the sort column is also the range column.

When ORDER BY Cannot Use the Index Well

SQL
SELECT *
FROM orders
WHERE user_id = 123
  AND created_at >= '2026-01-01'
ORDER BY total DESC;

Index:

SQL
(user_id, created_at)

This helps filter by user and date.

But it does not provide rows ordered by total.

MySQL may still need a filesort.

That does not always mean disaster. Sorting 100 rows is fine. Sorting 5 million rows is not.

Always ask:

How many rows are left after filtering before sorting?


10. Covering Indexes

A covering index is an index that contains all columns needed by a query.

Example:

SQL
SELECT id, user_id, status, created_at
FROM orders
WHERE user_id = 123
  AND status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

Covering index:

SQL
CREATE INDEX idx_orders_user_status_created
ON orders (user_id, status, created_at);

The query needs:

Text
id, user_id, status, created_at

All of these are available in the index. Note that id is "free" because in InnoDB the primary key is appended to every secondary index.

So MySQL may not need to fetch the full row from the table.

In EXPLAIN, you may see:

Text
Extra: Using index

That usually means a covering index is being used.

Two-path flow diagram: normal secondary index (Query &gt; Secondary lookup &gt; Full row fetch &gt; Result) vs covering index (Query &gt; Secondary lookup &gt; Result), with the extra hop crossed out.

Covering Index vs Normal Index

Normal index:

SQL
CREATE INDEX idx_orders_user_status_created
ON orders (user_id, status, created_at);

Query:

SQL
SELECT *
FROM orders
WHERE user_id = 123
  AND status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

Because of SELECT *, MySQL needs columns not stored in the index.

So it must fetch full rows.

Covering query:

SQL
SELECT id, status, created_at
FROM orders
WHERE user_id = 123
  AND status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

This may be served mostly from the index.


11. EXPLAIN: How to See What MySQL Is Doing

Use EXPLAIN before your query:

SQL
EXPLAIN
SELECT id, status, created_at
FROM orders
WHERE user_id = 123
  AND status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

Example output may look like this:

Text
id | select_type | table  | type | possible_keys                  | key                            | rows | filtered | Extra
---+-------------+--------+------+--------------------------------+--------------------------------+------+----------+----------------
1  | SIMPLE      | orders | ref  | idx_orders_user_status_created | idx_orders_user_status_created | 120  | 100.00   | Using where

type

This shows the access type. Common values, from best to worst:

type Meaning
system Table has 0 or 1 rows. Trivial.
const At most one row matched, via PRIMARY KEY or UNIQUE on equality.
eq_ref One row from this table for each row from a previous joined table.
ref Non-unique index lookup on equality. Common and good.
range Index range scan (BETWEEN, >, <, IN (...)).
index Full index scan. Cheaper than ALL only if index is much smaller.
ALL Full table scan. Acceptable for small tables; concerning for big ones.

You usually want to avoid ALL on large tables. But context matters, a full scan of a 500-row table is fine.

Horizontal gradient bar showing MySQL EXPLAIN access types from cheap (system, const, eq_ref) to expensive (index, ALL), with one-line descriptions under each segment.

possible_keys

Indexes MySQL could potentially use.

key

The index MySQL actually chose.

If this is NULL, MySQL did not use an index for that table.

rows

Estimated number of rows MySQL expects to examine.

This is an estimate, not an exact count.

If rows is very large, your query may be expensive.

filtered

Estimated percentage of rows that remain after conditions are applied.

Extra

This contains useful notes like:

Text
Using where
Using index
Using filesort
Using temporary
Using index condition
Extra value What it means
Using index Often a covering index, query answered from index without row lookup.
Using where A WHERE clause is applied. Common; not a problem by itself.
Using index condition Index Condition Pushdown is in play (more in Section 21).
Using filesort Extra sorting step. Cheap on small result sets, expensive on large ones.
Using temporary A temporary table is built, usually for GROUP BY or DISTINCT.
Using join buffer The join is using a buffer because no usable index. Often a sign of trouble.

EXPLAIN ANALYZE and FORMAT=TREE (MySQL 8.0.18+)

EXPLAIN shows what MySQL plans to do. Two newer forms show what actually happened, or describe the plan more precisely:

SQL
EXPLAIN FORMAT=TREE
SELECT id, status, created_at
FROM orders
WHERE user_id = 123
  AND status = 'paid';
SQL
EXPLAIN ANALYZE
SELECT id, status, created_at
FROM orders
WHERE user_id = 123
  AND status = 'paid';

Differences:

Form Runs the query? Best for
EXPLAIN No Quick plan check; columns are familiar.
EXPLAIN FORMAT=TREE No Reading plan as a tree of iterators; clearer for joins.
EXPLAIN ANALYZE Yes Comparing estimated vs actual rows and timing.

12. Real EXPLAIN Walkthrough

Suppose we have this table:

SQL audit_logs.sql
CREATE TABLE audit_logs (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    tenant_id BIGINT UNSIGNED NOT NULL,
    actor_id BIGINT UNSIGNED NULL,
    action VARCHAR(100) NOT NULL,
    entity_type VARCHAR(100) NOT NULL,
    entity_id BIGINT UNSIGNED NOT NULL,
    created_at DATETIME NOT NULL,
    metadata JSON NULL,
    PRIMARY KEY (id)
);

Slow query:

SQL
SELECT id, actor_id, action, entity_type, entity_id, created_at
FROM audit_logs
WHERE tenant_id = 42
  AND entity_type = 'Order'
  AND entity_id = 999
ORDER BY created_at DESC
LIMIT 50;

Without a useful index, EXPLAIN may show:

Text
type: ALL
key: NULL
rows: 8000000
Extra: Using where; Using filesort

That means MySQL may scan millions of rows and sort matching rows.

Good index:

SQL
CREATE INDEX idx_audit_tenant_entity_created
ON audit_logs (tenant_id, entity_type, entity_id, created_at);

Now the query can use:

Text
tenant_id = 42
entity_type = 'Order'
entity_id = 999
created_at order

Better EXPLAIN may show:

Text
type: ref
key: idx_audit_tenant_entity_created
rows: 75
Extra: Using where

This is a huge improvement. The index matches the query shape.


13. Optimizer Statistics, ANALYZE TABLE, and Histograms

The optimizer makes plan decisions using statistics about your data. If those statistics are wrong, the plan is wrong, even when the right index exists.

When Statistics Get Stale

InnoDB samples index pages to estimate cardinality. After massive inserts, deletes, or bulk imports, those samples can drift. The query that ran fine yesterday picks the wrong index today, and EXPLAIN's rows estimate is suddenly off by an order of magnitude.

Refresh With ANALYZE TABLE

SQL
ANALYZE TABLE orders;

This re-samples the index and refreshes cardinality estimates. It's cheap, online (with some short metadata locking), and is the first thing to try when an EXPLAIN ANALYZE shows a huge gap between estimated rows and actual rows.

Histograms (MySQL 8.0+)

Plain index statistics know how many distinct values a column has, but not how those values are distributed. That's why WHERE status = 'active' and WHERE status = 'banned' can't be told apart by the optimizer using only cardinality.

Histograms fill that gap. They store an actual distribution snapshot for a column.

SQL
ANALYZE TABLE orders
UPDATE HISTOGRAM ON status, country_code WITH 32 BUCKETS;

After that, the optimizer knows that status = 'active' matches 95% of rows and status = 'banned' matches 0.5%, and can choose plans accordingly.

SQL
-- Drop a histogram you no longer want maintained:
ANALYZE TABLE orders DROP HISTOGRAM ON country_code;

14. Indexes for Joins

Indexes are critical for joins.

Example:

SQL
SELECT o.id, o.total, u.email
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.status = 'paid'
  AND o.created_at >= '2026-05-01'
ORDER BY o.created_at DESC
LIMIT 100;

Tables:

SQL
CREATE TABLE users (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    email VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
);

CREATE TABLE orders (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id BIGINT UNSIGNED NOT NULL,
    status VARCHAR(30) NOT NULL,
    total DECIMAL(10, 2) NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id)
);

Useful indexes:

SQL
CREATE INDEX idx_orders_status_created_user
ON orders (status, created_at, user_id);

users.id is already indexed because it is the primary key.

The orders index helps MySQL find paid orders in the date range and includes user_id for the join.

Important:

Index foreign key columns used in joins.

For example:

SQL
orders.user_id
comments.post_id
line_items.order_id
payments.user_id

In many schemas, foreign keys are frequent join columns and should usually be indexed.


15. Indexes and Soft Deletes

In Laravel and many other applications, tables often have soft deletes:

SQL
deleted_at DATETIME NULL

Common query:

SQL
SELECT *
FROM posts
WHERE tenant_id = 10
  AND deleted_at IS NULL
ORDER BY created_at DESC
LIMIT 20;

Possible index:

SQL
CREATE INDEX idx_posts_tenant_deleted_created
ON posts (tenant_id, deleted_at, created_at);

Why?

  • tenant_id = 10 narrows to one tenant;
  • deleted_at IS NULL filters active rows;
  • created_at supports ordering.

But be careful.

If almost all rows have deleted_at IS NULL, then deleted_at alone is not selective.

This index is useful not because deleted_at is amazing alone, but because it participates in a composite index that matches a common query pattern.


16. Indexes for Queues and Background Jobs

Suppose you have a jobs table:

SQL jobs.sql
CREATE TABLE jobs (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    queue VARCHAR(60) NOT NULL,
    status VARCHAR(30) NOT NULL,
    available_at DATETIME NOT NULL,
    attempts INT NOT NULL DEFAULT 0,
    reserved_at DATETIME NULL,
    payload JSON NOT NULL,
    PRIMARY KEY (id)
);

Worker query:

SQL
SELECT *
FROM jobs
WHERE status = 'pending'
  AND available_at <= NOW()
ORDER BY available_at ASC
LIMIT 1;

Good index:

SQL
CREATE INDEX idx_jobs_status_available
ON jobs (status, available_at);

Why?

  • filter pending jobs;
  • scan jobs that are available now;
  • return oldest available job first.

If your workers also filter by queue name:

SQL
SELECT *
FROM jobs
WHERE queue = 'emails'
  AND status = 'pending'
  AND available_at <= NOW()
ORDER BY available_at ASC
LIMIT 1;

Better index:

SQL
CREATE INDEX idx_jobs_queue_status_available
ON jobs (queue, status, available_at);

This is a classic high-impact index because queue polling happens very frequently.


17. Indexes and Pagination

Offset pagination is common:

SQL
SELECT id, title, created_at
FROM posts
WHERE tenant_id = 10
ORDER BY created_at DESC
LIMIT 20 OFFSET 100000;

Even with an index, large offsets can be expensive.

MySQL still needs to walk past many rows before returning the next 20.

Better pattern: cursor pagination.

SQL
SELECT id, title, created_at
FROM posts
WHERE tenant_id = 10
  AND created_at < '2026-05-08 12:00:00'
ORDER BY created_at DESC
LIMIT 20;

Good index:

SQL
CREATE INDEX idx_posts_tenant_created
ON posts (tenant_id, created_at);

Even better if created_at is not unique:

SQL
SELECT id, title, created_at
FROM posts
WHERE tenant_id = 10
  AND (
      created_at < '2026-05-08 12:00:00'
      OR (created_at = '2026-05-08 12:00:00' AND id < 987654)
  )
ORDER BY created_at DESC, id DESC
LIMIT 20;

Index:

SQL
CREATE INDEX idx_posts_tenant_created_id
ON posts (tenant_id, created_at, id);

Cursor pagination usually scales better than deep offset pagination.


18. Prefix Indexes for Long Strings

For long string columns, you can index only the first N characters.

Example:

SQL
CREATE INDEX idx_users_name_prefix
ON users (name(20));

This can reduce index size.

MySQL requires prefix length when indexing BLOB or TEXT columns.

SQL
CREATE INDEX idx_articles_title_prefix
ON articles (title(100));

But prefix indexes have a tradeoff.

If the prefix is too short, many values may look the same to the index.

Text
"Introduction to MySQL indexes..."
"Introduction to MySQL performance..."
"Introduction to MySQL query plans..."

A short prefix like title(12) may not be selective.

Practical approach:

SQL
SELECT
    COUNT(DISTINCT title) AS full_distinct,
    COUNT(DISTINCT LEFT(title, 20)) AS prefix_20,
    COUNT(DISTINCT LEFT(title, 50)) AS prefix_50,
    COUNT(DISTINCT LEFT(title, 100)) AS prefix_100
FROM articles;

Choose a prefix that keeps good selectivity while reducing index size.


19. LIKE Queries and Indexes

This can use a B-tree index:

SQL
SELECT *
FROM users
WHERE email LIKE 'nazar%';

Useful index:

SQL
CREATE INDEX idx_users_email
ON users (email);

Why?

Because the prefix is known. MySQL can find the range of values starting with nazar.

But this usually cannot use a normal B-tree index efficiently:

SQL
SELECT *
FROM users
WHERE email LIKE '%nazar%';

The index is sorted from the start of the string. If you search for something in the middle, MySQL cannot jump to the right section.

For contains-search, consider:

  • FULLTEXT indexes (next section);
  • external search engines (Elasticsearch, Meilisearch, OpenSearch, Typesense);
  • generated columns for special cases;
  • n-gram indexing strategies;
  • application-specific search tables.

FULLTEXT indexes are a separate index type designed for natural-language and boolean searches over text columns. InnoDB has supported them since MySQL 5.6.

SQL
CREATE TABLE articles (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,
    body TEXT NOT NULL,
    PRIMARY KEY (id),
    FULLTEXT KEY ft_articles_title_body (title, body)
);

Query with natural-language mode:

SQL
SELECT id, title, MATCH(title, body) AGAINST ('mysql indexes' IN NATURAL LANGUAGE MODE) AS score
FROM articles
WHERE MATCH(title, body) AGAINST ('mysql indexes' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC
LIMIT 20;

Or boolean mode for explicit operators (+, -, *, quoted phrases):

SQL
SELECT id, title
FROM articles
WHERE MATCH(title, body) AGAINST ('+mysql +"composite index" -mongodb' IN BOOLEAN MODE);
FULLTEXT good for FULLTEXT bad for
Searching long text columns Exact substring matching that doesn't respect word boundaries
Word-based and phrase queries Languages with no whitespace word breaks (without ngram)
Relevance ranking for free Anything that needs faceting, fuzzy matching, or typo-tolerance
Lightweight in-database search at small/mid scale High-volume search at scale (use a real search engine)

21. Functions on Indexed Columns (and How Functional Indexes Solve It)

This query is common and problematic:

SQL
SELECT *
FROM orders
WHERE DATE(created_at) = '2026-05-08';

Even if you have:

SQL
CREATE INDEX idx_orders_created_at
ON orders (created_at);

MySQL cannot use it efficiently because you wrapped the indexed column in a function.

Better, rewrite to a range:

SQL
SELECT *
FROM orders
WHERE created_at >= '2026-05-08 00:00:00'
  AND created_at <  '2026-05-09 00:00:00';

This allows a range scan on the raw indexed column.

General rule:

Do not transform the indexed column in the WHERE clause if you want a normal B-tree index to be used efficiently.

Avoid:

SQL
WHERE LOWER(email) = 'nazar@example.com'
WHERE DATE(created_at) = '2026-05-08'
WHERE YEAR(created_at) = 2026
WHERE price * 100 > 5000

Functional / Expression Indexes (MySQL 8.0.13+)

When you genuinely need to query on an expression, for example case-insensitive email lookup, or "everything from 2026", MySQL 8.0.13 introduced functional indexes: indexes on the result of an expression rather than a raw column.

SQL
CREATE INDEX idx_users_email_lower
ON users ((LOWER(email)));

Note the double parentheses, that's how MySQL distinguishes a functional key part from a column name.

Now this query can use the index:

SQL
SELECT id, email
FROM users
WHERE LOWER(email) = 'nazar@example.com';

Other useful functional indexes:

SQL
-- Index by date part of a timestamp:
CREATE INDEX idx_orders_created_date
ON orders ((DATE(created_at)));

-- Index by extracted JSON field:
CREATE INDEX idx_events_user
ON events ((CAST(payload->>'$.user_id' AS UNSIGNED)));

If you're on MySQL ≤ 5.7, the alternative is a stored generated column plus an index on it:

SQL
ALTER TABLE users
  ADD COLUMN email_lower VARCHAR(255) GENERATED ALWAYS AS (LOWER(email)) STORED,
  ADD INDEX idx_users_email_lower (email_lower);

22. Index Condition Pushdown

Index Condition Pushdown (ICP) is an optimization where MySQL evaluates parts of the WHERE condition using index columns at the storage engine level, before reading the full row from the clustered index.

Example:

SQL
CREATE INDEX idx_people_zip_last_first
ON people (zipcode, lastname, firstname);

Query:

SQL
SELECT *
FROM people
WHERE zipcode = '55431'
  AND lastname LIKE 'Boy%'
  AND firstname = 'Nazar';

Without ICP, MySQL would seek by zipcode = '55431', fetch every full row in that zipcode, then check lastname and firstname in the server layer.

With ICP, MySQL pushes the lastname LIKE 'Boy%' and firstname = 'Nazar' checks down to the storage engine. Rows that fail those tests in the index are discarded before a full-row read happens. Fewer disk hops, less I/O.

In EXPLAIN, this can appear as:

Text
Using index condition

This does not mean the query is fully covered (that would be Using index). It means MySQL pushed part of the condition down to the storage engine while scanning the index.


23. Index Merge

Sometimes a query has two predicates on two different indexed columns and no single composite index matches. MySQL can sometimes use both single-column indexes at once. This is called index merge.

SQL
SELECT *
FROM users
WHERE email = 'nazar@example.com'
   OR phone = '+15551234567';

With:

SQL
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_users_phone ON users (phone);

MySQL may use both indexes and merge the results.

Three flavours:

Index merge type Used for Extra shows
Union OR between equality predicates on different indexes Using union(idx_a,idx_b); Using where
Intersection AND between equality predicates on different indexes Using intersect(idx_a,idx_b); Using where
Sort-Union OR where at least one side is a range Using sort_union(idx_a,idx_b); Using where

Index merge is real but it isn't free, sorting and deduplicating row-IDs from two indexes has its own cost. A single composite index is usually faster when one matches the access pattern. If EXPLAIN shows Using sort_union on a hot query, that's a hint to consider a different index.


24. NULL and Indexes

NULL semantics trip up a lot of indexing intuition.

  • B-tree indexes do include NULL values. WHERE x IS NULL and WHERE x IS NOT NULL can use an index.
  • WHERE x = NULL is always false, use IS NULL.
  • For unique indexes, MySQL allows multiple NULL values (NULL is not equal to NULL for uniqueness purposes). This catches people writing migrations that "should have prevented duplicates."

Examples:

SQL
-- Uses idx_users_deleted_at:
WHERE deleted_at IS NULL

-- Also uses it:
WHERE deleted_at IS NOT NULL

-- Returns nothing, ever:
WHERE deleted_at = NULL

If you're indexing a nullable column where most rows are NULL and queries always look for IS NOT NULL, that index can be very selective for the non-null minority, useful for soft-delete-style flags inverted (e.g., archived_at).


25. Indexes for GROUP BY and DISTINCT

GROUP BY and DISTINCT can use indexes to avoid building a temporary result.

Tight Index Scan

SQL
SELECT user_id, COUNT(*)
FROM orders
WHERE status = 'paid'
GROUP BY user_id;

With:

SQL
CREATE INDEX idx_orders_status_user
ON orders (status, user_id);

MySQL filters status = 'paid' in the index and reads user_id in already-sorted order, grouping is "free." EXPLAIN's Extra should not show Using temporary or Using filesort.

Loose Index Scan

For queries that pull only the leading column(s) of a composite index, MySQL can sometimes use a loose index scan, jumping between distinct prefixes instead of reading every row.

SQL
SELECT user_id, MAX(created_at)
FROM orders
GROUP BY user_id;

With (user_id, created_at), MySQL can jump per-user-id and read just the last entry of each group.

When DISTINCT Wins

SQL
SELECT DISTINCT status
FROM orders;

With an index on status (or a leftmost prefix that starts with status), this is instant, MySQL walks the unique values in the index without scanning the table.


26. Modern MySQL 8.0 Index Features

Three index features added in MySQL 8.0 that change how you operate indexes in production.

Invisible Indexes (8.0+)

An invisible index exists on disk and is maintained on writes, but the optimizer pretends it doesn't exist when planning queries.

SQL
ALTER TABLE orders ALTER INDEX idx_orders_status INVISIBLE;

Why this matters: before MySQL 8.0, the only way to see whether dropping an index would hurt performance was to drop it and find out. With invisible indexes, you can flip the index off, watch production, and flip it back if anything breaks, without re-creating it from scratch on a large table.

SQL
-- Restore:
ALTER TABLE orders ALTER INDEX idx_orders_status VISIBLE;

Descending Indexes (8.0+)

Pre-8.0, the DESC keyword in CREATE INDEX was parsed but ignored, every index was effectively ascending, and MySQL handled ORDER BY x DESC by reading the index backwards. That worked for one column but broke for mixed-direction sorts:

SQL
SELECT *
FROM orders
WHERE tenant_id = 10
ORDER BY status ASC, created_at DESC
LIMIT 50;

Pre-8.0, no single index could serve this without a filesort. In 8.0+ you can declare directions per column:

SQL
CREATE INDEX idx_orders_tenant_status_created
ON orders (tenant_id, status ASC, created_at DESC);

Now that exact ORDER BY status ASC, created_at DESC reads the index forward and skips the sort.

Multi-Valued Indexes (8.0.17+): JSON Arrays

If you store arrays inside a JSON column, regular indexes can't help membership queries. Multi-valued indexes can.

SQL
CREATE TABLE products (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    data JSON NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_products_tags ((CAST(data->'$.tags' AS CHAR(40) ARRAY)))
);

Query:

SQL
SELECT id
FROM products
WHERE JSON_CONTAINS(data->'$.tags', '"sale"');

-- Or, equivalently and often clearer:
SELECT id
FROM products
WHERE 'sale' MEMBER OF (data->'$.tags');

MySQL stores one index entry per array element, so a single row with tags: ["sale", "new", "featured"] occupies three entries, meaning the index can be much larger than a row count would suggest. Use this for tag-style data where membership lookup is the dominant access pattern, not for everything-JSON.


27. Index Hints (USE / FORCE / IGNORE INDEX)

Sometimes the optimizer chooses the wrong index. Index hints let you suggest, force, or exclude indexes.

SQL
-- Suggest (optimizer may still ignore):
SELECT * FROM orders USE INDEX (idx_orders_user_created)
WHERE user_id = 123 ORDER BY created_at DESC LIMIT 20;

-- Strongly prefer (treats other plans as much more expensive):
SELECT * FROM orders FORCE INDEX (idx_orders_user_created)
WHERE user_id = 123 ORDER BY created_at DESC LIMIT 20;

-- Exclude:
SELECT * FROM orders IGNORE INDEX (idx_orders_status)
WHERE user_id = 123 ORDER BY created_at DESC LIMIT 20;

28. Redundant and Duplicate Indexes

Indexes are not free.

Every index takes storage. Every insert, update, and delete may need to update indexes. Too many indexes slow down writes and increase memory pressure.

Duplicate Index Example

Bad:

SQL
CREATE INDEX idx_orders_user_id
ON orders (user_id);

CREATE INDEX idx_orders_user_id_duplicate
ON orders (user_id);

These are duplicate indexes. One should be removed.

Redundant Index Example

SQL
CREATE INDEX idx_orders_user
ON orders (user_id);

CREATE INDEX idx_orders_user_status
ON orders (user_id, status);

Is idx_orders_user redundant?

Maybe.

Because (user_id, status) can be used for queries filtering by user_id.

But do not blindly drop it.

The shorter index may still be useful because:

  • it is smaller;
  • it may be cheaper for some queries (less data per index entry, more entries per page);
  • it may support constraints or foreign keys;
  • the optimizer may choose it for specific plans.

Still, when you see both, review them, and use invisible indexes to test the drop safely.

A common cleanup task is finding indexes that are:

  • exact duplicates;
  • left-prefix duplicates;
  • never used (sys.schema_unused_indexes is the friendly view for this on MySQL 8.0+);
  • too wide;
  • created for old query patterns that no longer exist.

29. Write Cost of Indexes

Indexes speed up reads, but slow down writes.

When you insert a row into orders, MySQL must update the primary key and every secondary index.

SQL
CREATE TABLE orders (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id BIGINT UNSIGNED NOT NULL,
    status VARCHAR(30) NOT NULL,
    total DECIMAL(10,2) NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_user_id (user_id),
    INDEX idx_status (status),
    INDEX idx_created_at (created_at),
    INDEX idx_user_status_created (user_id, status, created_at),
    INDEX idx_status_created (status, created_at)
);

Every insert must maintain all those indexes. Every update to status must update indexes that contain status.

Indexes are a tradeoff.

For read-heavy tables, more indexes may be acceptable. For write-heavy tables, every index must justify its cost.


30. Practical Index Design Workflow

Here is a practical workflow for indexing a real query.

Step 1: Start From the Query

Do not start from the table. Start from the query.

SQL
SELECT id, total, created_at
FROM orders
WHERE tenant_id = 10
  AND user_id = 123
  AND status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

Step 2: Identify Equality Filters

SQL
tenant_id = 10
user_id = 123
status = 'paid'

Step 3: Identify Range Filters

None here. If we had created_at >= '2026-01-01', that would be a range filter.

Step 4: Identify Sorting

SQL
ORDER BY created_at DESC

Step 5: Build Candidate Index

SQL
CREATE INDEX idx_orders_tenant_user_status_created
ON orders (tenant_id, user_id, status, created_at);

Step 6: Check Selectivity and Query Patterns

Ask:

  • Is tenant_id always present?
  • Is user_id always present?
  • Is status always present?
  • Do we also query only by tenant and created date?
  • Is this query hot enough to deserve this index?
  • Does another existing index already cover it?

Step 7: Run EXPLAIN (and EXPLAIN ANALYZE)

SQL
EXPLAIN
SELECT id, total, created_at
FROM orders
WHERE tenant_id = 10
  AND user_id = 123
  AND status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

Check:

  • Did MySQL choose the index?
  • Are estimated rows low?
  • Is there Using filesort?
  • Is there Using index?
  • Does EXPLAIN ANALYZE confirm low actual rows?

Step 8: Test With Realistic Data

Indexes behave differently on tiny dev databases.

A query that scans 500 rows locally may scan 50 million rows in production.

Use staging data that resembles production when possible.


Example 1: User Login

SQL
SELECT id, email, password_hash
FROM users
WHERE email = 'nazar@example.com'
LIMIT 1;

Index:

SQL
CREATE UNIQUE INDEX users_email_unique
ON users (email);

Reason: high selectivity, uniqueness constraint, common hot query.

Example 2: User Profile by Public ID

SQL
SELECT id, public_id, name
FROM users
WHERE public_id = 'usr_abc123'
LIMIT 1;

Index:

SQL
CREATE UNIQUE INDEX users_public_id_unique
ON users (public_id);

Example 3: User Orders Page

SQL
SELECT id, status, total, created_at
FROM orders
WHERE user_id = 123
ORDER BY created_at DESC
LIMIT 20;

Index:

SQL
CREATE INDEX idx_orders_user_created
ON orders (user_id, created_at);

Example 4: Admin Order Filter

SQL
SELECT id, user_id, status, total, created_at
FROM orders
WHERE status = 'failed'
  AND created_at >= '2026-05-01'
  AND created_at <  '2026-06-01'
ORDER BY created_at DESC
LIMIT 100;

Index:

SQL
CREATE INDEX idx_orders_status_created
ON orders (status, created_at);

But test selectivity. If failed is common, this can still scan many rows.

Example 5: Tenant-Scoped List

SQL
SELECT id, title, created_at
FROM posts
WHERE tenant_id = 10
  AND deleted_at IS NULL
ORDER BY created_at DESC
LIMIT 20;

Index:

SQL
CREATE INDEX idx_posts_tenant_deleted_created
ON posts (tenant_id, deleted_at, created_at);

Example 6: Product Lookup by Slug

SQL
SELECT id, name, price
FROM products
WHERE slug = 'macbook-pro-16'
LIMIT 1;

Index:

SQL
CREATE UNIQUE INDEX products_slug_unique
ON products (slug);

If slugs are tenant-specific:

SQL
CREATE UNIQUE INDEX products_tenant_slug_unique
ON products (tenant_id, slug);

Example 7: Many-to-Many Pivot Table

SQL
CREATE TABLE role_user (
    user_id BIGINT UNSIGNED NOT NULL,
    role_id BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (user_id, role_id)
);

The composite primary key on (user_id, role_id) handles "what roles does user X have?" perfectly. But this query:

SQL
SELECT user_id
FROM role_user
WHERE role_id = 5;

needs another index because it doesn't use the leftmost column:

SQL
CREATE INDEX idx_role_user_role_user
ON role_user (role_id, user_id);

Example 8: Webhook Deduplication

SQL
SELECT id
FROM webhooks
WHERE provider = 'stripe'
  AND external_event_id = 'evt_123'
LIMIT 1;

Index:

SQL
CREATE UNIQUE INDEX webhooks_provider_event_unique
ON webhooks (provider, external_event_id);

Reason: fast lookup, prevents duplicate processing, enforces a real business rule at the database level.

Example 9: Latest Event Per User

SQL
SELECT id, event_type, created_at
FROM events
WHERE user_id = 123
  AND event_type = 'login'
ORDER BY created_at DESC
LIMIT 1;

Index:

SQL
CREATE INDEX idx_events_user_type_created
ON events (user_id, event_type, created_at);

Example 10: API Token Lookup

SQL
SELECT id, user_id, revoked_at
FROM api_tokens
WHERE token_hash = '...'
LIMIT 1;

Index:

SQL
CREATE UNIQUE INDEX api_tokens_token_hash_unique
ON api_tokens (token_hash);

Reason: token hash is unique, lookup must be fast, and you should never store raw tokens, only a secure hash.


32. Common Mistakes

Mistake 1: Indexing Every Column

Bad:

SQL
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_users_name ON users (name);
CREATE INDEX idx_users_status ON users (status);
CREATE INDEX idx_users_created_at ON users (created_at);
CREATE INDEX idx_users_updated_at ON users (updated_at);

This may look safe, but it often creates write overhead without solving real query patterns.

Better:

  • identify slow queries;
  • design indexes for those queries;
  • remove unused or redundant indexes carefully (use invisible indexes to test removal safely).

Mistake 2: Wrong Composite Index Order

Query:

SQL
SELECT *
FROM orders
WHERE user_id = 123
ORDER BY created_at DESC
LIMIT 20;

Weak index:

SQL
CREATE INDEX idx_orders_created_user
ON orders (created_at, user_id);

Better:

SQL
CREATE INDEX idx_orders_user_created
ON orders (user_id, created_at);

Mistake 3: Assuming Low-Cardinality Indexes Are Always Bad

status alone may be weak. But this can be excellent:

SQL
CREATE INDEX idx_orders_tenant_status_created
ON orders (tenant_id, status, created_at);

Low-cardinality columns can be useful inside composite indexes.

Mistake 4: Using SELECT *

Bad on hot paths:

SQL
SELECT *
FROM orders
WHERE user_id = 123
ORDER BY created_at DESC
LIMIT 20;

Better:

SQL
SELECT id, status, total, created_at
FROM orders
WHERE user_id = 123
ORDER BY created_at DESC
LIMIT 20;

This reduces row size and may allow covering indexes.

Mistake 5: Ignoring Sorts

A query can filter quickly but still sort slowly.

SQL
SELECT *
FROM orders
WHERE tenant_id = 10
ORDER BY created_at DESC
LIMIT 100;

Index:

SQL
CREATE INDEX idx_orders_tenant_created
ON orders (tenant_id, created_at);

This helps both filtering and ordering.

Mistake 6: Trusting Dev Data

A query that is fast with 1,000 rows may fail with 100 million rows.

Always test with realistic data volume and distribution.

Mistake 7: Creating Indexes Without Measuring

Before adding an index:

  1. capture the slow query;
  2. run EXPLAIN and EXPLAIN ANALYZE;
  3. check existing indexes;
  4. create candidate index;
  5. test again with realistic data;
  6. monitor write overhead and slow-query logs.

Mistake 8: Forgetting About Statistics

A query that worked yesterday breaks today after a bulk import. Plan flips because cardinality estimates went stale. Run ANALYZE TABLE. If a column is skewed (think status, country_code), build a histogram on it.


33. Practical Checklist

Use this checklist when reviewing a slow MySQL query.

Query Shape

  • What table is large?
  • What are the WHERE conditions?
  • Which conditions are equality?
  • Which conditions are ranges?
  • Is there ORDER BY?
  • Is there GROUP BY or DISTINCT?
  • Is there LIMIT?
  • Is the query using SELECT *?
  • Is the query part of a hot path?

Index Design

  • Does an existing index match this query?
  • Is the leading column useful?
  • Does the index follow the leftmost-prefix rule?
  • Are equality columns before range columns?
  • Can the index help with sorting (including mixed ASC/DESC if needed)?
  • Can the index be covering?
  • Is the index too wide?
  • Is it redundant with another index?

EXPLAIN Review

Check type, key, rows, filtered, Extra.

Warning signs:

Text
type: ALL
key: NULL
rows: huge number
Extra: Using filesort
Extra: Using temporary

These are not always bad, but they deserve attention on large tables.

Data Distribution

SQL
SELECT status, COUNT(*)
FROM orders
GROUP BY status;
SQL
SELECT
    COUNT(*) AS total,
    COUNT(DISTINCT user_id) AS distinct_users,
    COUNT(DISTINCT status) AS distinct_statuses
FROM orders;

Do not assume selectivity. Measure it.


34. Senior-Level Indexing Advice

1. Index Query Patterns, Not Columns

Bad thinking:

We filter by status, so index status.

Better thinking:

Our hot query filters by tenant_id, status, and sorts by created_at, so we need an index that matches that access pattern.

2. Composite Indexes Are Usually More Valuable Than Many Single Indexes

This:

SQL
INDEX (tenant_id, status, created_at)

is often better than:

SQL
INDEX (tenant_id)
INDEX (status)
INDEX (created_at)

because the composite index matches how the query actually narrows and orders data.

3. Low Cardinality Is Not Automatically Bad

A standalone index on status may be weak. But (tenant_id, status, created_at) can be excellent. Context matters.

4. The Best Index Depends on Data Distribution

The same query can behave differently in two databases. If one tenant has 500 rows and another tenant has 50 million rows, the index strategy may need to account for tenant skew. Histograms help the optimizer pick the right plan per value.

5. Watch the Write Path

Indexes are not free. For high-write tables like logs, events, metrics, and jobs, avoid unnecessary indexes.

6. Use Covering Indexes Carefully

Covering indexes are powerful, but making every index wide is dangerous. A wide index uses more disk, more memory, slows writes, and may reduce buffer-pool cache efficiency. Use covering indexes for hot, stable, high-value queries.

7. Be Careful With OR Conditions

SQL
SELECT *
FROM users
WHERE email = 'nazar@example.com'
   OR phone = '+15551234567';

MySQL may use index merge in some cases, but sometimes splitting into two queries can be better:

SQL
SELECT * FROM users WHERE email = 'nazar@example.com'
UNION ALL
SELECT * FROM users WHERE phone = '+15551234567';

Only do this when it's semantically safe and measured.

8. Do Not Ignore Application Behavior

A query called once per day may not need a perfect index. A query called 10,000 times per minute probably does.

Index priority depends on:

  • frequency;
  • latency impact;
  • rows scanned;
  • business importance;
  • write overhead;
  • operational risk.

9. Operate Indexes, Don't Just Create Them

Production index work is a continuous loop, not a one-time design:

  • monitor sys.schema_unused_indexes for indexes that haven't earned their cost;
  • run ANALYZE TABLE after large data changes;
  • mark indexes invisible before dropping them;
  • review EXPLAIN ANALYZE for queries where estimated and actual rows diverge;
  • audit redundant indexes quarterly.

35. Mini Interview Section: Questions and Strong Answers

What is a MySQL index?

An index is a separate data structure that helps MySQL find rows faster without scanning the whole table. Most normal MySQL indexes are B-tree indexes (B+tree variant in InnoDB), which keep values ordered. That makes equality lookups, range scans, joins, sorting, and grouping more efficient when the index matches the query pattern.

What is a composite index?

A composite index is an index on multiple columns, such as (user_id, status, created_at). It is ordered by the first column, then the second, then the third. It is useful when queries filter or sort by those columns together.

What is the leftmost-prefix rule?

For an index (a, b, c), MySQL can efficiently use the index for (a), (a, b), and (a, b, c). It usually cannot use it efficiently for (b) or (c) alone because those columns are not the leftmost part of the index.

What is cardinality?

Cardinality is the number of distinct values in a column. A column like email usually has high cardinality. A column like status usually has low cardinality. High-cardinality columns are often better index candidates for exact lookups.

What is selectivity?

Selectivity describes how much a condition narrows the result set. WHERE email = ? is usually highly selective. WHERE status = 'active' may not be selective if most rows are active.

Why can an index on status be bad?

If status has only a few values and one value matches most rows, MySQL may still need to read a large part of the table. In that case, the index does not eliminate much work. But status can still be useful as part of a composite index, and a histogram can teach the optimizer to behave differently for different status values.

What is a covering index?

A covering index contains all columns needed by a query. MySQL can answer the query from the index without fetching the full table row. In EXPLAIN, this often appears as Using index.

Why should we avoid SELECT *?

SELECT * forces MySQL to read all columns, including columns the application may not need. It can prevent covering index usage and increase I/O, memory, and network cost.

What does EXPLAIN show?

EXPLAIN shows how MySQL plans to execute a query. It tells you which table access method is used, which indexes are possible, which index is selected, how many rows MySQL estimates it will examine, and extra operations such as sorting or temporary tables. EXPLAIN ANALYZE (8.0.18+) actually runs the query and shows real timing and row counts.

What does Using filesort mean?

Using filesort means MySQL needs an extra sorting step instead of returning rows directly in index order. It is not always bad, but it can be expensive for large result sets.

Why do indexes slow down writes?

Every insert, update, or delete must maintain indexes. More indexes mean more work during writes, more storage, and more memory pressure. The change buffer softens this for non-unique secondary indexes when affected pages aren't in the buffer pool, but it doesn't make writes free.

What's the difference between a clustered and a secondary index in InnoDB?

The clustered index is the table, leaf nodes contain full rows, ordered by the primary key. A secondary index is a separate structure whose leaf nodes contain the indexed columns plus the primary key, used as a pointer back into the clustered index. That's why a secondary-index lookup of all columns may need a second hop, and why covering indexes (which avoid the second hop) are valuable.

What's a functional index, and when would you use one?

A functional index (MySQL 8.0.13+) indexes the result of an expression, for example LOWER(email) or DATE(created_at). You use it when the application has to query on a transformed value and you can't change the query to use the raw column. Before 8.0.13, the workaround was a stored generated column with a regular index on it.

What's an invisible index for?

It's an index that exists and is maintained but is hidden from the optimizer. The point is risk-free index removal, flip it invisible, watch production for a day or two, and either drop it or restore it.


36. Final Mental Model

A MySQL index is not just a performance checkbox.

It is a data access path.

When you design an index, ask:

Text
How will MySQL find the first matching row?
How many rows will it scan after that?
Can it stop early because of LIMIT?
Can it return rows already sorted?
Does it need to fetch the full row?
How expensive is this index for writes?

Good indexing is not about adding many indexes.

Good indexing is about matching your most important query patterns with the least amount of extra write and storage cost.

The best index is the one that helps MySQL do less work.


37. Quick Reference

Good Index Candidates

  • primary keys;
  • unique lookup columns;
  • foreign keys used in joins;
  • high-cardinality filter columns;
  • composite query patterns;
  • columns used for sorting after selective filters;
  • hot queries with LIMIT;
  • queue polling queries;
  • cursor pagination queries.

Risky Index Candidates

  • low-cardinality columns alone;
  • columns rarely queried;
  • columns frequently updated;
  • very wide string columns;
  • JSON columns without a specific access strategy (consider multi-valued indexes if the access pattern is array membership);
  • indexes created only because "maybe we need them."

Composite Index Rule of Thumb

For many queries:

Text
equality filters → range filter → order by column

Example:

SQL
WHERE tenant_id = ?
  AND status = ?
  AND created_at >= ?
ORDER BY created_at DESC

Index:

SQL
(tenant_id, status, created_at)

EXPLAIN Red Flags

Text
type = ALL
key = NULL
rows = very large
Using filesort on huge result set
Using temporary on huge result set
estimated rows >> actual rows in EXPLAIN ANALYZE

Practical Rule

Do not ask:

Do we have an index on this column?

Ask:

Do we have an index that matches this query?

That difference is what separates basic indexing from real database performance work.