Most backend engineers do not need to become database optimizer authors.

But they do need to understand why a query became slow, why MySQL ignored an index, why a join suddenly scans millions of rows, and why adding ORDER BY created_at DESC LIMIT 20 can turn a simple endpoint into a production problem.

That is where EXPLAIN becomes one of the most useful tools in your backend toolbox.

EXPLAIN does not execute the query in the normal way and then tell you what happened. It shows the execution plan MySQL expects to use. It tells you how MySQL plans to read tables, which indexes it may use, which index it actually chose, how many rows it expects to examine, and whether extra work like sorting or temporary table creation may happen.

This article is a practical guide to reading EXPLAIN like a backend engineer: not as a database researcher, but as someone responsible for API latency, queue throughput, admin dashboards, reporting queries, and production incidents.


Plan

Before going deep, here is the plan for the article.

  1. Why backend engineers should care about EXPLAIN.
  2. Build a mental model for how MySQL reads data.
  3. How MySQL stores data, clustered index vs secondary index.
  4. The most important EXPLAIN columns.
  5. Access types from best to worst.
  6. Reading possible_keys, key, key_len, and ref.
  7. Why rows and filtered are estimates, not guarantees.
  8. Decoding common Extra values: Using index, Using where, Using filesort, Using temporary.
  9. Realistic query examples from backend applications.
  10. How composite indexes change execution plans.
  11. Why MySQL sometimes ignores your index.
  12. Joins, sorting, pagination, covering indexes, and index merge.
  13. Beyond the basics: histograms, skip scan, invisible indexes, hash join.
  14. A production checklist for investigating slow queries.

1. Why Backend Engineers Should Care About EXPLAIN

A backend engineer usually sees slow queries through symptoms:

  • An API endpoint starts timing out.
  • A queue worker processes fewer jobs per minute.
  • A dashboard page becomes slow as data grows.
  • A database CPU spike appears after a deployment.
  • A query that was fast with 10,000 rows becomes slow with 10 million rows.

The SQL may look harmless:

SQL
SELECT *
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;

But whether this query is fast depends on the execution plan.

If MySQL can use an index like this:

SQL
CREATE INDEX idx_orders_customer_created
ON orders (customer_id, created_at DESC);

then the query can quickly jump to one customer's orders and read the newest rows in order.

Without a useful index, MySQL may scan many rows, filter them, sort them, and then return only 20.

That difference is often the difference between a 5 ms query and a 5 second query.

EXPLAIN helps you see the difference before guessing.


2. The Basic Mental Model

When MySQL executes a query, it has to answer a few questions:

  1. Which table should I read first?
  2. Can I use an index?
  3. If multiple indexes exist, which one is cheapest?
  4. How many rows do I expect to read?
  5. Do I need to sort?
  6. Do I need a temporary table?
  7. Can I satisfy the query from the index alone?
  8. For joins, how do I find matching rows in the next table?

EXPLAIN gives you clues for those decisions.

Horizontal architecture diagram showing how MySQL executes a query: Client, SQL Parser, Query Optimizer, Execution Engine, InnoDB storage engine, Buffer Pool, with EXPLAIN and EXPLAIN ANALYZE taps off the pipeline.

Run it like this:

SQL
EXPLAIN
SELECT *
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;

A traditional tabular EXPLAIN output contains columns like:

Text
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra

The exact output can vary by MySQL version and query shape, but these columns are the main ones backend engineers should understand.

You can also use JSON or tree format:

SQL
EXPLAIN FORMAT=JSON
SELECT * FROM orders WHERE customer_id = 42
ORDER BY created_at DESC LIMIT 20;

EXPLAIN FORMAT=TREE
SELECT * FROM orders WHERE customer_id = 42
ORDER BY created_at DESC LIMIT 20;

3. How MySQL Stores Data: Clustered vs Secondary Indexes

This is the single most important concept missing from most "intro to indexes" articles, and it explains half of what you will see in EXPLAIN.

In InnoDB (the default MySQL storage engine), a table is physically stored as a B-tree organized by the primary key. The leaf pages of that B-tree contain the full row. This is called a clustered index.

Every other index, what people usually call "indexes", is a secondary index. A secondary index is a separate B-tree whose leaves contain the indexed columns plus the primary key value. They do not contain the full row.

So when you query through a secondary index, InnoDB does two lookups:

  1. Walk the secondary index B-tree to find matching primary key values.
  2. Walk the clustered index B-tree using each primary key to fetch the full row.

That second step is sometimes called a bookmark lookup or back to clustered index. It is one random I/O per row.

This is why these things matter:

  • Primary key lookups (type: const) only walk one B-tree. They are the cheapest possible access.
  • Secondary index lookups that need columns not in the index pay the bookmark lookup cost on every matching row.
  • A covering index (Using index) avoids bookmark lookups because every column the query needs is already in the secondary index leaf. That is why Using index in Extra is such a strong signal.
  • Inserting a row updates the clustered index plus every secondary index. That is why "just add another index" has a real cost.
  • The primary key is implicitly appended to every secondary index. That is why a (customer_id, created_at) index can also be used for ordering by (customer_id, created_at, id), id is already there.

Keep this picture in mind for the rest of the article. Almost every optimization decision below, when to add a covering index, why SELECT * makes things harder, why secondary indexes on huge tables are expensive, comes back to this two-tree structure.


4. Example Schema

We will use a simple e-commerce style schema.

SQL
CREATE TABLE customers (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    email VARCHAR(255) NOT NULL,
    status VARCHAR(30) NOT NULL,
    created_at DATETIME NOT NULL,
    UNIQUE KEY uniq_customers_email (email),
    KEY idx_customers_status_created (status, created_at)
);

CREATE TABLE orders (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    customer_id BIGINT NOT NULL,
    status VARCHAR(30) NOT NULL,
    total_cents INT NOT NULL,
    created_at DATETIME NOT NULL,
    paid_at DATETIME NULL,
    KEY idx_orders_customer_created (customer_id, created_at),
    KEY idx_orders_status_created (status, created_at),
    KEY idx_orders_paid_at (paid_at)
);

CREATE TABLE order_items (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    order_id BIGINT NOT NULL,
    product_id BIGINT NOT NULL,
    quantity INT NOT NULL,
    price_cents INT NOT NULL,
    KEY idx_order_items_order_id (order_id),
    KEY idx_order_items_product_order (product_id, order_id)
);

This schema is intentionally normal. It has primary keys, secondary indexes, composite indexes, and common query patterns.


5. The EXPLAIN Columns That Matter Most

Here is the practical meaning of the most important columns.

table

The table MySQL is reading at that step. For joins, each table usually appears as a separate row in the execution plan.

Text
table: customers
table: orders
table: order_items

This tells you the join order MySQL selected. The first table is usually important because it often determines how many rows are passed into later join steps.

type

Despite the name, this is not a data type. It means the access type, how MySQL plans to access rows in this table. This is one of the most important columns.

Common values:

Text
system, const, eq_ref, ref, range, index, ALL

A rough ranking from best to worst is:

Text
system / const → eq_ref → ref → range → index → ALL

This ranking is not absolute, but it is a very useful starting point. We will look at each one in detail in the next section.

possible_keys

Indexes MySQL thinks could be used for this table.

Text
possible_keys: idx_orders_customer_created,idx_orders_status_created

It does not mean MySQL will use them.

key

The index MySQL actually chose.

Text
key: idx_orders_customer_created

If key is NULL, MySQL did not choose an index for that table access. That does not always mean the query is bad, but it is a warning sign when the table is large.

key_len

The number of bytes of the index MySQL plans to use.

Backend engineers often ignore this column, but it is very useful for composite indexes. Suppose you have this index:

SQL
KEY idx_orders_status_created (status, created_at)

For this query MySQL may use only the status part:

SQL
SELECT * FROM orders WHERE status = 'paid';

But for this query MySQL may use both:

SQL
SELECT * FROM orders
WHERE status = 'paid' AND created_at >= '2026-01-01';

key_len helps you infer how much of the composite index is actually used. Byte length depends on data type, character set, and whether the column is nullable, so the absolute number is rarely meaningful, but a change in key_len after you add a predicate is a strong signal that MySQL is using more of the index.

ref

Shows what value or column is compared to the index.

Text
ref: const            -- compared to a constant: WHERE x = 42
ref: app.customers.id -- compared to a column: orders.customer_id = customers.id

rows

The estimated number of rows MySQL expects to examine for this table. This is not the number of rows returned.

Text
rows: 125000

filtered

The estimated percentage of examined rows that will pass the table condition.

Text
rows: 100000
filtered: 10.00

Roughly, MySQL expects to examine 100,000 rows and keep about 10% of them. So the next step might receive around 10,000 rows.

Extra

Additional notes about what MySQL will do.

Text
Using where, Using index, Using filesort,
Using temporary, Using index condition, Using join buffer

This column often contains the biggest clues. Using filesort and Using temporary are not always bad, but they are common signs that a query may become expensive at scale.

partitions

If the table is partitioned, this column lists the partitions MySQL will read. Seeing fewer partitions than the table has means partition pruning worked, the optimizer eliminated whole partitions before scanning.

Text
partitions: p202601,p202602

Most backend applications do not partition tables, so this column is often empty. When it is populated, it is one of the highest-leverage things to check on time-series tables.


6. Access Types Explained

The type column tells you how MySQL accesses rows. Think of it as the query's access strategy.

Vertical ranking infographic of EXPLAIN access types from best to worst: const, eq_ref, ref, range, index, ALL, each with a short example and rows-examined indicator.

system

The table has only one row. Rare in normal application queries.

Text
type: system

const

MySQL can find at most one matching row using a primary key or unique index compared to constants.

SQL
EXPLAIN SELECT * FROM customers WHERE id = 123;

Likely plan:

Text
type: const
key: PRIMARY
rows: 1

This is excellent. MySQL knows there can be only one customer with id = 123.

SQL
EXPLAIN SELECT * FROM customers
WHERE email = 'nazar@example.com';

Because email has a unique index, MySQL can also treat this as a constant lookup.

Text
type: const
key: uniq_customers_email
rows: 1

eq_ref

eq_ref usually appears in joins when MySQL uses a primary key or unique index and expects exactly one matching row for each row from the previous table.

SQL
EXPLAIN
SELECT orders.id, customers.email
FROM orders
JOIN customers ON customers.id = orders.customer_id
WHERE orders.status = 'paid';

Possible plan:

Text
table: orders     type: ref     key: idx_orders_status_created
table: customers  type: eq_ref  key: PRIMARY  ref: app.orders.customer_id  rows: 1

For every matching order, MySQL looks up exactly one customer by primary key.

ref

ref means MySQL uses a non-unique index lookup. It may find multiple rows for each value.

SQL
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
Text
type: ref
key: idx_orders_customer_created
ref: const
rows: 150

The index can quickly find orders for customer 42, but there may be many orders. Usually good if the result set is reasonably selective.

But ref can still be expensive if the matching value is common:

SQL
SELECT * FROM orders WHERE status = 'paid';

If 80% of orders are paid, the index on status may not help much.

range

range means MySQL scans a range of index values.

SQL
SELECT * FROM orders
WHERE created_at >= '2026-01-01' AND created_at < '2026-02-01';

SELECT * FROM orders WHERE id BETWEEN 1000 AND 2000;

SELECT * FROM orders WHERE paid_at IS NOT NULL;
Text
type: range
key: idx_orders_paid_at
rows: 50000

range can be very good or very bad depending on how wide the range is. A range that reads 100 rows is great. A range that reads 10 million rows is not.

index

index means MySQL scans the index. Better than scanning the full table only when the index is smaller than the table or when the index covers the query.

SQL
EXPLAIN SELECT customer_id FROM orders;
Text
type: index
key: idx_orders_customer_created
Extra: Using index

This can be acceptable for analytics-style queries, but dangerous for high-traffic endpoints if the table is large.

ALL

ALL means full table scan.

SQL
EXPLAIN SELECT * FROM orders WHERE total_cents > 10000;

If there is no useful index on total_cents:

Text
type: ALL
key: NULL
rows: 5000000
Extra: Using where

This is the classic red flag. But do not panic automatically.

A full scan of a 50-row lookup table is fine. A full scan of a 50-million-row orders table inside a public API endpoint is not fine.


7. Access Type Cheat Sheet

Access type Meaning Usually good? Backend interpretation
system One-row table Rare; ignore most of the time
const Single row by primary/unique key Ideal lookup
eq_ref One matching row per outer row in a join Healthy join pattern
ref Non-unique index lookup Check row count and selectivity
range Index range scan ⚠️ Good for narrow ranges; bad for huge ranges
index Full index scan ⚠️ Still a scan; okay if covering or small
ALL Full table scan Red flag on large tables

8. Reading possible_keys vs key

A common mistake is to see an index in possible_keys and assume the query is optimized.

SQL
EXPLAIN
SELECT *
FROM orders
WHERE status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

Suppose output says:

Text
possible_keys: idx_orders_status_created
key: idx_orders_status_created

Good. MySQL considered the index and chose it.

But now imagine:

Text
possible_keys: idx_orders_status_created
key: NULL
type: ALL

This means MySQL knew an index might apply but chose not to use it.

Common reasons:

  1. The index is not selective enough.
  2. MySQL estimates a full scan is cheaper.
  3. The query returns too many rows.
  4. The condition uses a function on the indexed column.
  5. The data distribution makes the index unattractive.
  6. Statistics are stale or inaccurate.
  7. Another index is better for sorting or joining.

Example of a function blocking normal index usage:

SQL
SELECT * FROM orders WHERE DATE(created_at) = '2026-01-01';

Even with an index on created_at, wrapping the column in DATE() can prevent MySQL from using the index efficiently. Better:

SQL
SELECT * FROM orders
WHERE created_at >= '2026-01-01 00:00:00'
  AND created_at <  '2026-01-02 00:00:00';

This keeps the indexed column directly comparable to constants.


9. Understanding rows: The Most Important Estimate

The rows column tells you how much work MySQL expects to do.

Text
type: ref
key: idx_orders_customer_created
rows: 25

Likely fine.

Text
type: ref
key: idx_orders_status_created
rows: 1800000

May not be fine. Even though MySQL is using an index, it expects to read 1.8 million rows.

This is why "using an index" is not enough.

If almost every order is paid, then status = 'paid' is not selective. Low-cardinality columns like these often make weak standalone indexes:

Text
status, is_active, is_deleted, type, category, country

They can still be useful as the first column of a composite index when paired with a more selective column or a sorting column:

SQL
CREATE INDEX idx_orders_status_created
ON orders (status, created_at);

Now MySQL can use status for filtering and created_at for ordered access.


10. Understanding filtered

filtered estimates how many examined rows survive the condition.

Text
rows: 100000
filtered: 5.00

Approximate rows passed forward:

Text
100000 * 5% = 5000

For joins, this matters because rows from one step feed into the next step.

Text
table: customers  rows: 10000  filtered: 10.00
table: orders     rows: 50     filtered: 100.00

A rough estimate:

Text
customers output: 10000 * 10%        = 1000 customers
orders lookup:    1000 * 50          = 50000 order rows examined

This is not exact math, but it helps you reason about nested work.


11. The Extra Column: Where The Warnings Live

The Extra column often tells you what additional work MySQL performs.

Using where

MySQL applies a filter condition after reading rows. Common and not automatically bad. It simply means not all filtering was solved purely by index lookup.

SQL
SELECT *
FROM orders
WHERE customer_id = 42 AND total_cents > 5000;

If the index is only on customer_id, MySQL can use the index to find customer orders, then apply total_cents > 5000 as a filter. That may be perfectly fine if the customer has 20 orders. It may be bad if the customer has 500,000 orders.

Using index

MySQL can satisfy the query using only the index, a covering index. No bookmark lookup needed.

SQL
EXPLAIN
SELECT customer_id, created_at
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;

With index idx_orders_customer_created (customer_id, created_at):

Text
key: idx_orders_customer_created
Extra: Using where; Using index

Using index condition

This refers to index condition pushdown (ICP). MySQL pushes part of the WHERE clause down into the storage engine, applying it during the index scan before fetching full rows. This reduces bookmark lookups.

Text
Extra: Using index condition

Usually good. MySQL is using the index more intelligently to reduce row access.

Using filesort

MySQL cannot produce the requested order directly from an index and must perform a separate sort operation.

SQL
EXPLAIN
SELECT *
FROM orders
WHERE status = 'paid'
ORDER BY total_cents DESC
LIMIT 20;

If there is no useful index for (status, total_cents):

Text
Extra: Using where; Using filesort

A better index:

SQL
CREATE INDEX idx_orders_status_total
ON orders (status, total_cents DESC);

Now MySQL can find paid orders already ordered by total_cents.

Using filesort is one of the first things to investigate for slow ORDER BY queries, especially with large filtered result sets. But sorting 100 rows is fine, sorting 5 million rows for every request is not.

Using temporary

MySQL uses a temporary table to process the query. Common with some GROUP BY, DISTINCT, and complex sorting queries.

SQL
EXPLAIN
SELECT customer_id, COUNT(*)
FROM orders
WHERE created_at >= '2026-01-01'
GROUP BY customer_id
ORDER BY COUNT(*) DESC;
Text
Extra: Using where; Using temporary; Using filesort

Using join buffer

MySQL cannot use an index efficiently for a join and falls back to a join buffer strategy (block nested loop, or in MySQL 8.0.18+, hash join).

Text
Extra: Using join buffer

Often appears when joining on unindexed columns:

SQL
SELECT *
FROM orders
JOIN customers ON customers.email = orders.customer_email;

Usually a sign you need an index on the join column.


12. Example 1: Simple Primary Key Lookup

SQL
EXPLAIN SELECT * FROM orders WHERE id = 1001;
Text
table: orders
type: const
possible_keys: PRIMARY
key: PRIMARY
rows: 1
Extra: NULL

MySQL uses the primary key. One row. No sorting. No temporary table. Optimal, nothing to fix.


13. Example 2: Customer Orders Page

Show the latest 20 orders for one customer.

SQL
SELECT *
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;

Without a good composite index:

Text
type: ALL
key: NULL
rows: 5000000
Extra: Using where; Using filesort

MySQL scans the orders table, filters rows for customer 42, sorts by created_at, then returns 20.

Add a composite index:

SQL
CREATE INDEX idx_orders_customer_created
ON orders (customer_id, created_at DESC);
Text
type: ref
key: idx_orders_customer_created
ref: const
rows: 150
Extra: Using where

14. Example 3: Admin Filter By Status And Date

Admin page lists paid orders in a date range.

SQL
SELECT id, customer_id, total_cents, created_at
FROM orders
WHERE status = 'paid'
  AND created_at >= '2026-01-01'
  AND created_at <  '2026-02-01'
ORDER BY created_at DESC
LIMIT 100;

Useful index:

SQL
CREATE INDEX idx_orders_status_created
ON orders (status, created_at DESC);

Possible plan:

Text
type: range
key: idx_orders_status_created
rows: 80000
Extra: Using index condition

This may be good or bad depending on the date range size. One day may be fast. One year may still read millions of rows.


15. Example 4: The Function-On-Column Problem

Bad query:

SQL
SELECT * FROM orders
WHERE DATE(created_at) = '2026-01-01';
Text
type: ALL
key: NULL
rows: 5000000
Extra: Using where

Better query:

SQL
SELECT * FROM orders
WHERE created_at >= '2026-01-01 00:00:00'
  AND created_at <  '2026-01-02 00:00:00';
Text
type: range
key: idx_orders_created_at
rows: 120000
Extra: Using index condition

Keep indexed columns clean in predicates. Move transformations to the constant side when possible.


16. Example 5: Composite Index Order Matters

Suppose you have this index:

SQL
CREATE INDEX idx_orders_status_created
ON orders (status, created_at);

Two-panel diagram of the composite index (status, created_at) leftmost prefix rule: left shows the index used when filtering by status then created_at, right shows the index blocked when only created_at is filtered.

This query can use it well:

SQL
SELECT *
FROM orders
WHERE status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

For one status, MySQL can read rows in created_at order.

But this query is different:

SQL
SELECT * FROM orders
WHERE created_at >= '2026-01-01'
ORDER BY status;

The same index may not help much because the query does not constrain the leftmost column status.

Composite indexes follow the leftmost prefix rule. An index on (a, b, c) can usually help with:

Text
a
a, b
a, b, c

But it usually cannot be used efficiently for b only, c only, or b, c without a.


17. Example 6: Covering Index For A Hot Endpoint

Return the latest order IDs and totals for a customer.

SQL
SELECT id, total_cents, created_at
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;

Index:

SQL
CREATE INDEX idx_orders_customer_created_total
ON orders (customer_id, created_at DESC, id, total_cents);
Text
type: ref
key: idx_orders_customer_created_total
rows: 20
Extra: Using where; Using index

Using index means the query can be answered from the index itself, no clustered index lookup. Excellent for a high-traffic endpoint.

But every extra column in an index makes the index larger, which means:

  • More storage.
  • More memory pressure.
  • Slower inserts.
  • Slower updates to indexed columns.
  • More work for replication and backups.

18. Example 7: Join Plan Reading

SQL
SELECT customers.email, orders.id, orders.total_cents
FROM customers
JOIN orders ON orders.customer_id = customers.id
WHERE customers.email = 'nazar@example.com'
ORDER BY orders.created_at DESC
LIMIT 20;

Useful indexes:

SQL
CREATE UNIQUE INDEX uniq_customers_email ON customers (email);

CREATE INDEX idx_orders_customer_created
ON orders (customer_id, created_at DESC);

Two-panel comparison diagram of a MySQL join: a good plan with customers as driving table and a targeted lookup into orders, vs a bad plan that scans all orders and looks up customers afterwards.

Good plan:

Text
table: customers
type: const
key: uniq_customers_email
rows: 1

table: orders
type: ref
key: idx_orders_customer_created
ref: const
rows: 20
Extra: Using where

MySQL first finds the customer by unique email, that gives one customer row. Then it finds that customer's latest orders using the composite index.

Bad version (same query, no useful customer index):

Text
table: orders
type: ALL
key: NULL
rows: 5000000
Extra: Using filesort

table: customers
type: eq_ref
key: PRIMARY
rows: 1

MySQL scans many orders, sorts them, and joins to customers.


19. Example 8: Using filesort With LIMIT

SQL
SELECT *
FROM orders
WHERE status = 'paid'
ORDER BY total_cents DESC
LIMIT 10;

Bad plan:

Text
type: ref
key: idx_orders_status_created
rows: 2000000
Extra: Using where; Using filesort

This plan uses an index to find paid orders, but then it must sort many paid orders by total_cents.

Better index:

SQL
CREATE INDEX idx_orders_status_total
ON orders (status, total_cents DESC);

Better plan:

Text
type: ref
key: idx_orders_status_total
rows: 10
Extra: Using where

Now MySQL can read paid orders in total_cents order and stop after 10.


20. Example 9: Using temporary With GROUP BY

SQL
SELECT customer_id, COUNT(*) AS order_count
FROM orders
WHERE created_at >= '2026-01-01'
GROUP BY customer_id
ORDER BY order_count DESC
LIMIT 20;
Text
type: range
key: idx_orders_created_at
rows: 1000000
Extra: Using where; Using temporary; Using filesort

MySQL must:

  1. Find orders in the date range.
  2. Group them by customer.
  3. Count each group.
  4. Sort groups by count.
  5. Return the top 20.

An index can help with the date range, but sorting by aggregate count usually requires extra work.

Possible backend solutions:

  1. Accept it if this is an internal report and runs rarely.
  2. Add a narrower date range.
  3. Move it to a background job.
  4. Precompute daily customer order counts.
  5. Use a summary table.
  6. Cache the result.
  7. Move heavy analytics to a reporting database or OLAP store.

21. Example 10: Offset Pagination Problem

SQL
SELECT *
FROM orders
WHERE status = 'paid'
ORDER BY created_at DESC
LIMIT 50 OFFSET 500000;
Text
type: ref
key: idx_orders_status_created
rows: 500050
Extra: Using where

Even with a good index, deep offset pagination is expensive. MySQL still has to walk past many rows before returning the requested page.

Better approach: cursor (keyset) pagination.

SQL
SELECT *
FROM orders
WHERE status = 'paid'
  AND created_at < '2026-05-01 12:00:00'
ORDER BY created_at DESC
LIMIT 50;

22. Why MySQL Ignores Your Index

This happens all the time. You create an index, rerun EXPLAIN, and MySQL does not use it.

Common reasons:

1. The index has low selectivity

If most rows have status = 'paid', an index on status may not help. MySQL may prefer a table scan.

2. The query reads too many rows anyway

SQL
SELECT * FROM orders WHERE created_at >= '2020-01-01';

If this matches 95% of the table, a full scan may be cheaper than bouncing through an index and then doing 4.75 million bookmark lookups.

3. The index does not match the leftmost prefix

Index on (status, created_at), but query only filters by created_at, leftmost rule blocks efficient use.

4. The query uses a function on the column

SQL
WHERE DATE(created_at) = '2026-01-01'  -- bad
SQL
WHERE created_at >= '2026-01-01'
  AND created_at <  '2026-01-02'        -- good

5. Data types do not match cleanly

SQL
WHERE user_id = '123'  -- when user_id is an integer

MySQL can often handle implicit conversion, but implicit conversions can lead to bad plans or confusing behavior. Use correct types in application code.

6. The optimizer estimates another path is cheaper

Sometimes MySQL chooses an unexpected plan because its cost model thinks another path is better. Investigate with:

SQL
EXPLAIN FORMAT=JSON SELECT ...;

Or compare with index hints, carefully:

SQL
SELECT *
FROM orders FORCE INDEX (idx_orders_status_created)
WHERE status = 'paid'
ORDER BY created_at DESC
LIMIT 20;

7. Statistics are stale

MySQL's optimizer uses cardinality statistics to choose plans. If those are stale (often after large bulk loads or schema changes), it may pick poorly. Run ANALYZE TABLE orders; to refresh them.


23. Beyond The Basics

These are the modern MySQL features that show up in real EXPLAIN plans and that interviewers love to ask about.

Histograms (MySQL 8.0+)

By default, the optimizer estimates selectivity from index cardinality only. For non-indexed columns, or columns with skewed data, that is rough. Histograms give the optimizer a real distribution.

SQL
ANALYZE TABLE orders UPDATE HISTOGRAM ON status WITH 16 BUCKETS;

Now WHERE status = 'cancelled' (rare) and WHERE status = 'paid' (common) get different cost estimates without needing an index. Useful for low-cardinality filter columns where you do not want a full secondary index.

Skip scan (MySQL 8.0.13+)

The leftmost prefix rule has a small loophole. With loose index range scan, also called skip scan, MySQL can sometimes use a composite index (a, b) for a query that only filters on b, by jumping over each distinct value of a.

It only works when a has very low cardinality (a handful of distinct values). When it kicks in, you will see Using index for skip scan in Extra. Do not design indexes assuming it; treat it as a bonus when the optimizer finds it.

Invisible indexes (MySQL 8.0+)

You can mark an index invisible to the optimizer without dropping it:

SQL
ALTER TABLE orders ALTER INDEX idx_orders_paid_at INVISIBLE;

The index still exists and is still maintained on writes, but the optimizer ignores it. Useful for testing whether an index is actually doing work, make it invisible, watch performance, decide whether to drop. Reverse with VISIBLE.

Hash join (MySQL 8.0.18+)

For joins where neither side has a usable index, MySQL 8.0.18 introduced hash joins. Older versions fell back to nested loops with a join buffer, which is slower for large unindexed joins. Hash join only shows up in EXPLAIN FORMAT=TREE output:

Text
-> Inner hash join (orders.customer_id = customers.id)
    -> Table scan on orders
    -> Hash
        -> Table scan on customers

It is a fallback, not a goal, proper indexed joins are still cheaper.

Index merge

When two separate indexes each match part of a query, MySQL can sometimes use both and combine the results, Using union, Using intersect, or Using sort_union in Extra.

SQL
SELECT * FROM orders
WHERE customer_id = 42 OR paid_at IS NOT NULL;

Index merge is real and sometimes useful, but a well-designed composite index almost always beats it. If you see Using union regularly on a hot path, that is a hint to design one composite index instead of two.

Optimizer trace

When the plan really does not make sense:

SQL
SET optimizer_trace="enabled=on";
SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 20;
SELECT * FROM information_schema.OPTIMIZER_TRACE\G

You get a verbose JSON dump of every cost estimate the optimizer considered. Heavy artillery, reach for it when you have eliminated the obvious things.


24. EXPLAIN Does Not Tell You Everything

EXPLAIN shows the plan and estimates. It does not always show actual runtime behavior.

For real performance investigation, combine it with:

SQL
EXPLAIN ANALYZE
SELECT ...;

EXPLAIN ANALYZE executes the query and reports actual timing and row counts per iterator. The output looks like this (in tree format):

Text
-> Limit: 20 row(s)  (cost=0.42 rows=20) (actual time=0.085..0.241 rows=20 loops=1)
    -> Index lookup on orders using idx_orders_customer_created (customer_id=42)
       (cost=0.42 rows=150) (actual time=0.078..0.230 rows=20 loops=1)

The two numbers to compare are estimated rows versus actual rows. A large mismatch (an order of magnitude or more) is a sign the optimizer has bad information, usually stale statistics or missing histograms.

Also use:

  • Slow query log.
  • Performance Schema and sys schema views.
  • Query digest tools (pt-query-digest).
  • SHOW INDEX FROM table_name, index cardinality at a glance.
  • SHOW TABLE STATUS, table size and row estimates.
  • Application tracing (request → query correlation).
  • Realistic production-shaped data.

A query can look fine on local data and fail badly in production.


25. Practical Debugging Workflow

When you find a slow query, use this workflow.

Vertical flowchart of a backend engineer&#39;s workflow for debugging a slow MySQL query: capture the SQL, run EXPLAIN, identify the biggest problem, match index design to query shape, re-run EXPLAIN ANALYZE, and check write cost.

Step 1: Capture the exact SQL

Do not optimize a simplified version too early.

Capture:

  • Full SQL with bound parameter values.
  • Endpoint or job name.
  • Frequency.
  • Average and p95/p99 latency.
  • Rows returned.
  • Table sizes.

Parameter values matter. status = 'paid' may match millions of rows. status = 'failed' may match only thousands.

Step 2: Run EXPLAIN

SQL
EXPLAIN
SELECT *
FROM orders
WHERE status = 'paid'
  AND created_at >= '2026-01-01'
ORDER BY created_at DESC
LIMIT 50;

Look first at: type, key, rows, filtered, Extra.

Step 3: Identify the biggest problem

Ask:

  • Is there a full table scan on a large table?
  • Is MySQL using the wrong index?
  • Is rows much larger than expected?
  • Is there Using filesort on a huge result set?
  • Is there Using temporary on a hot endpoint?
  • Is the join order surprising?
  • Is a join using ALL instead of eq_ref or ref?

Do not randomly add indexes. Find the main source of work first.

Step 4: Match index design to query shape

For WHERE customer_id = ? ORDER BY created_at DESC LIMIT 20(customer_id, created_at).

For WHERE status = ? AND created_at >= ? ORDER BY created_at DESC LIMIT 50(status, created_at).

For WHERE tenant_id = ? AND status = ? ORDER BY created_at DESC LIMIT 50(tenant_id, status, created_at).

General rule:

  1. Equality filters first.
  2. Then range or ordering column.
  3. Then optional covering columns if the endpoint is hot and the tradeoff is worth it.

Step 5: Re-run EXPLAIN

After adding or changing an index, check again. You want to see improvements like:

Text
type: ALL → ref/range
key: NULL → your_index
rows: 5000000 → 50
Extra: Using filesort → removed

But also benchmark actual runtime. The optimizer plan is not the same as user-visible latency.

Step 6: Check write cost

Every index has a write cost. Before adding an index, ask:

  • How often is this table written to?
  • Does this index duplicate another index?
  • Can an existing composite index be reused?
  • Is this endpoint hot enough to justify the index?
  • Will the index fit in memory?
  • Does it slow bulk imports or queue workers?

Indexes are not free.


26. Common EXPLAIN Smells

Smell Looks like Likely cause
Full scan on a large table type: ALL, rows: 12000000, key: NULL Missing index, function on column, low selectivity, or query asks for too much
Using filesort with large rows rows: 2500000, Extra: Using filesort ORDER BY not supported by index; index column order wrong; sorting huge intermediate
Using temporary; Using filesort Extra: Using temporary; Using filesort Expensive GROUP BY/DISTINCT/aggregate sort; consider summary table or async report
Join table uses ALL table: order_items, type: ALL, rows: 8000000 Missing index on join column
Index considered but skipped possible_keys: ..., key: NULL Optimizer decided index is not worth it; predicate not selective; statistics stale
Estimated vs actual rows mismatch EXPLAIN ANALYZE shows estimated 100, actual 1,000,000 Stale statistics; missing histogram on skewed column

27. Composite Index Design Patterns

Pattern 1: Tenant-scoped query

SQL
SELECT * FROM invoices
WHERE tenant_id = 100
ORDER BY created_at DESC
LIMIT 50;
SQL
CREATE INDEX idx_invoices_tenant_created
ON invoices (tenant_id, created_at DESC);

tenant_id narrows data, created_at supports ordering, LIMIT allows early stop.

Pattern 2: Tenant + status + latest

SQL
SELECT * FROM invoices
WHERE tenant_id = 100 AND status = 'overdue'
ORDER BY created_at DESC
LIMIT 50;
SQL
CREATE INDEX idx_invoices_tenant_status_created
ON invoices (tenant_id, status, created_at DESC);

Equality columns first (tenant_id, status), ordering column next (created_at).

Pattern 3: Date range report

SQL
SELECT * FROM orders
WHERE created_at >= '2026-01-01' AND created_at < '2026-02-01';
SQL
CREATE INDEX idx_orders_created_at ON orders (created_at);

Range scan over time. But if the query is tenant-scoped, prefer (tenant_id, created_at).

Pattern 4: Covering index for small projection

SQL
SELECT id, created_at, total_cents
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;
SQL
CREATE INDEX idx_orders_customer_created_cover
ON orders (customer_id, created_at DESC, id, total_cents);

Supports filter, supports order, covers selected columns. Tradeoff: larger index, higher write cost.


28. Reading EXPLAIN Like A Backend Engineer

A database specialist may deeply analyze optimizer internals. A backend engineer should first answer practical questions.

Q1: Is this query reading too much data? Look at rows, filtered, type. If rows is huge, the query may be expensive even if it uses an index.

Q2: Is MySQL using the index I expected? Look at possible_keys, key, key_len. If possible_keys includes your index but key chooses another, ask why. If key_len suggests only the first column of a composite index is used, check your predicates.

Q3: Is MySQL sorting too much? Look for Using filesort. Then check ORDER BY columns, index column order, equality filters before ordering, LIMIT usage, number of rows before sorting.

Q4: Is MySQL creating temporary tables? Look for Using temporary. Then check GROUP BY, DISTINCT, aggregate sorting, reporting-style query shape.

Q5: Is the join efficient? For each joined table, check type, key, ref, rows. Healthy joins show eq_ref or ref. Risky joins show ALL and Using join buffer.


29. Production Checklist

Query capture

  • Do I have the exact SQL?
  • Do I have real bound parameter values?
  • Do I know how often this query runs?
  • Do I know the endpoint, job, or report that runs it?
  • Do I know whether it is user-facing or background work?

EXPLAIN reading

  • Is any large table using type: ALL?
  • Is any large intermediate result using Using filesort?
  • Is Using temporary expected or suspicious?
  • Is the chosen key the index I expected?
  • Are rows estimates reasonable (compared to EXPLAIN ANALYZE actuals)?
  • Is filtered very low after reading many rows?
  • Are joins using eq_ref or ref where possible?

Index design

  • Do composite indexes match the query shape?
  • Are equality predicates first?
  • Is the range or ordering column placed correctly?
  • Is the index too wide?
  • Does this new index duplicate an existing one?
  • Is the read improvement worth the write cost?

Query design

  • Am I wrapping indexed columns in functions?
  • Am I using deep offset pagination?
  • Am I selecting too many columns?
  • Can I use a narrower date range?
  • Can I precompute or cache reporting data?
  • Can I move heavy work out of the request path?

Validation

  • Did I rerun EXPLAIN after changes?
  • Did I run EXPLAIN ANALYZE to confirm actual timing?
  • Did I test with production-like data?
  • Did I check p95/p99 latency, not only average?
  • Did I consider write overhead?

30. Final Mental Model

When reading EXPLAIN, do not ask only:

Is MySQL using an index?

Ask better questions:

How many rows will MySQL read?

Is it using the right index for this query shape?

Can it avoid sorting?

Can it avoid temporary tables?

Are joins using indexed lookups?

Does the index match filtering, ordering, and limiting?

Is this query asking for too much data in the first place?

Good MySQL optimization is not about adding indexes everywhere. It is about matching access patterns to indexes, keeping queries predictable as data grows, and understanding when a query needs a different product or architecture decision.

EXPLAIN gives you the map. Your job as a backend engineer is to connect that map to real application behavior.


References