PostgreSQL Materialized Views: Refresh and Stale Data

So you've got a dashboard.

Someone in the business asked for "revenue by region by week, with year-over-year comparison, broken down by product category." You wrote the SQL. It joins orders, order_items, customers, products, and a regions table, aggregates over a year of data, and does a window function for the YoY column. The first time it ran, it took 14 seconds. After you added a few indexes and rewrote one CTE, you got it down to 4. The dashboard reloads every time someone opens it. Eighty people open it a few times a day.

Your CPU graph hates you.

You don't actually need this query to run live. The data underneath it is mostly yesterday's data. Orders close at midnight, the dashboard is a morning ritual. A four-second query running two hundred times a day to answer a question whose answer barely changes is wasted work. You need a cache.

If you've gone down the redis-or-application-layer path here before, you know how that ends. Invalidation logic, serialization, schema drift between cache and DB, a Slack message at 7am from the analyst saying "the numbers don't match prod." There's a quieter answer Postgres has had since 9.3, and most teams don't reach for it until they have to: a materialized view.

A View vs A Materialized View

A regular CREATE VIEW in Postgres is just a saved query. The view itself doesn't store anything. Every time you SELECT from it, Postgres expands the view definition and runs the underlying query against your tables. It's a convenience, not a cache.

SQL

CREATE VIEW weekly_revenue AS
SELECT
  date_trunc('week', o.created_at) AS week,
  r.name AS region,
  sum(oi.price * oi.quantity) AS revenue
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
JOIN customers c ON c.id = o.customer_id
JOIN regions r ON r.id = c.region_id
GROUP BY 1, 2;

SELECT * FROM weekly_revenue WHERE week >= '2025-01-01';

Every SELECT against weekly_revenue re-runs the full join-and-aggregate. The view didn't make anything faster, it just gave the heavy query a name and a stable interface.

A materialized view is the same query, but Postgres actually runs it once and writes the result rows to disk as a real table you can query and index. After that, SELECT against it is just a table scan over a precomputed result. No joins, no group-bys, no recomputation.

SQL

CREATE MATERIALIZED VIEW weekly_revenue_mv AS
SELECT
  date_trunc('week', o.created_at) AS week,
  r.name AS region,
  sum(oi.price * oi.quantity) AS revenue
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
JOIN customers c ON c.id = o.customer_id
JOIN regions r ON r.id = c.region_id
GROUP BY 1, 2;

CREATE INDEX ON weekly_revenue_mv (week, region);

SELECT * FROM weekly_revenue_mv WHERE week >= '2025-01-01';

Same answer, different cost shape. The first query that built the view did all the work. Every subsequent query does almost none.

The catch, and the entire reason this article exists, is that the rows in weekly_revenue_mv are frozen at the moment the view was built. If a new order comes in five minutes later, the materialized view doesn't know. Postgres won't keep it up to date for you. You have to refresh it yourself.

That's the whole mental model. A materialized view is a query result you saved as a table, plus a button labeled REFRESH that you have to remember to press.

When You Actually Want One

Materialized views are not a general-purpose cache. They're a specific tool for a specific shape of problem. If your workload doesn't look like one of these, reach for something else.

The clearest cases are reporting and analytics. A query that aggregates a lot of rows into a small result, runs often, doesn't need second-by-second freshness, and is cheap to recompute on a schedule. Daily revenue rollups, weekly cohort retention, monthly funnel summaries, top-N most-purchased products in the last 30 days. These are the textbook uses, and they're textbook for a reason. The math behind the report is expensive, the input data changes slowly relative to how often the report is read, and a freshness window of "as of last night" or "as of an hour ago" is fine.

You also see them used for expensive joins that show up in a lot of places. If your application has one canonical "user with their org and their plan and their billing status" join that ten different endpoints reuse, materializing that join once and pointing the endpoints at the materialized view can simplify a surprising amount of code, even if the freshness story has to be carefully thought through. This is closer to a denormalized read model than a true report, but the mechanism is the same.

The cases where they're a bad fit are more interesting:

Read-after-write requirements. A user just placed an order. They open their order history. The materialized view hasn't refreshed yet. The order isn't there. The user files a bug. Don't materialize anything that participates in a flow where the user expects to see their own writes.
High write volume on the source tables. If orders takes a thousand inserts a minute and your refresh strategy is "every five minutes," you're paying the full cost of recomputing the aggregate every five minutes for what amounts to a slightly less stale snapshot. At some point a real streaming aggregation, or a maintained summary table you update inside the same transaction as the write, beats this.
Tiny base tables. If your underlying query is fast (under 50ms), there's nothing to cache. Materialized views are for queries whose runtime is measured in seconds.
Queries you only run once a day anyway. If the dashboard's already a scheduled job that writes a CSV at 6am, you don't need a materialized view. You need the CSV.

The mental check before reaching for one is: is the recompute cost paid by the refresh dramatically less than the cost of running the query on every read? If yes, materialize. If no, you're moving cost around, not eliminating it.

REFRESH MATERIALIZED VIEW: The Two Modes That Matter

Once you have a materialized view, you keep it current with REFRESH MATERIALIZED VIEW. There are two flavours, and the difference between them is the difference between a usable production feature and a 3am incident.

The default refresh:

SQL

REFRESH MATERIALIZED VIEW weekly_revenue_mv;

This re-runs the underlying query and replaces the contents. While it's running, it takes an ACCESS EXCLUSIVE lock on the materialized view. That means no one can read from it until the refresh finishes. If your refresh takes 90 seconds and a dashboard request comes in during that window, the dashboard waits 90 seconds. If twenty dashboard requests come in, they all wait, and then they all run at the same instant the refresh ends and your DB load spikes. This is fine for a view that refreshes in 200ms during a quiet window. It is not fine for the kind of view you actually want to materialize, which is the slow one.

The other mode, the one you almost always want:

SQL

REFRESH MATERIALIZED VIEW CONCURRENTLY weekly_revenue_mv;

CONCURRENTLY builds the new contents in a temporary location, computes the diff against the existing view, and applies the diff with a much weaker lock that lets readers keep reading the old contents the whole time. The dashboard sees yesterday's numbers right up until the moment the diff applies, then it sees today's. No 90-second pause.

This sounds like a free upgrade. It is not. CONCURRENTLY has two costs.

First, it requires a unique index on the materialized view. Without one, Postgres will refuse:

SQL

REFRESH MATERIALIZED VIEW CONCURRENTLY weekly_revenue_mv;
-- ERROR:  cannot refresh materialized view "public.weekly_revenue_mv" concurrently
-- HINT:  Create a unique index with no WHERE clause on one or more columns of the materialized view.

The unique index is how Postgres knows which row in the new result corresponds to which row in the old one, so it can compute the delete/insert/update set instead of replacing wholesale. For a rollup like weekly_revenue_mv, the natural unique key is (week, region):

SQL

CREATE UNIQUE INDEX weekly_revenue_mv_pk
  ON weekly_revenue_mv (week, region);

If your aggregation doesn't have a natural unique key, that's a sign that either you need to add one (an artificial row identifier from a window function) or you're materializing the wrong query.

Second, CONCURRENTLY is slower than the non-concurrent refresh, sometimes much slower, because it has to compute the symmetric difference between the old and new result sets. For a view with millions of rows, the diff itself can take longer than a wholesale rebuild. You're trading total refresh time for refresh availability. Readers don't wait, but the database does more work. For most workloads this is exactly the trade you want. For a small view that almost never gets read, the plain REFRESH is fine and faster.

When To Refresh: Picking A Strategy

The refresh decision is the part most teams underthink. "We'll refresh it every hour" is a sentence people say without examining whether hourly is right or even what hourly means in this context. There's no single correct strategy. There's a small menu of patterns and a question you ask the table to choose between them.

The question is: what is the longest acceptable lag between a write to the underlying tables and that write being visible in the materialized view? Not what's nice. What the business will accept. Sometimes that's "a day is fine, this is a monthly report." Sometimes it's "five minutes max, the sales team uses this on calls." Once you have that number, the strategy follows.

Scheduled refresh

The simplest pattern. A cron job, a pg_cron extension entry, or a scheduled task in your application calls REFRESH MATERIALIZED VIEW CONCURRENTLY ... at a fixed cadence. Every night, every hour, every five minutes, whatever your acceptable lag allows.

SQL

-- Using pg_cron, scheduled inside the database
SELECT cron.schedule(
  'refresh-weekly-revenue',
  '0 4 * * *',  -- 4am every day
  $$ REFRESH MATERIALIZED VIEW CONCURRENTLY weekly_revenue_mv $$
);

This is the right default for reporting use cases. The data underneath is changing constantly, but the report doesn't need to keep up. Pick a refresh time when load is low (early morning) and a cadence that stays well under the acceptable lag.

The thing to watch out for is refresh duration creeping past the cadence. If your hourly refresh starts taking 70 minutes, you eventually have two refresh jobs running simultaneously, fighting each other, and you find out about it because one of them errors with "could not obtain lock on materialized view". Always log the refresh duration, alert when it gets above some fraction of the cadence (say, 50%), and either move to a less frequent cadence or break the materialized view into smaller pieces before it becomes a problem.

After-the-fact refresh

For pipelines that have a clear "the data is now ready" moment, schedule the refresh after that moment instead of on a clock. If your ETL job loads yesterday's orders into the warehouse and finishes around 3am, the right time to refresh weekly_revenue_mv is when the ETL job tells you it's done, not at 4am because you guessed it would be.

SQL

-- Final step of the ETL job, in the same transaction or right after:
REFRESH MATERIALIZED VIEW CONCURRENTLY weekly_revenue_mv;

This couples the refresh to the data lifecycle and removes a class of "the report shows yesterday's data but the ETL hadn't finished yet" bugs.

On-demand refresh

A user clicks "refresh" in the UI, an admin triggers it from a button in an internal tool, an API endpoint accepts a request to refresh and returns when it's done. Useful for reports that are read rarely but need to be exact when they are.

Don't expose this to end users on a hot path. REFRESH MATERIALIZED VIEW CONCURRENTLY on a 50M-row view is a real workload. The database will absolutely refresh it, and absolutely chew on its CPU while doing so. Treat the on-demand button as an admin tool, throttle it (refuse to refresh more than once every N minutes), and run it asynchronously so the request doesn't time out.

Trigger-based refresh

It's tempting to put a trigger on the source table that refreshes the materialized view every time a row is inserted. Don't. Each refresh is the full cost of the underlying query. You will burn the database to the ground.

The only version of this that ever makes sense is a trigger that enqueues a refresh request (sets a flag, writes to a queue table) and a separate worker that batches and debounces the refreshes. At which point you've reinvented "refresh on a schedule, but only if there's been a change," which is rarely worth the complexity over the simple scheduled version.

Refresh-strategy comparison across four lanes on a shared time axis: Scheduled (cyan ticks at 04:00 every day), After ETL (refresh fires immediately after each ETL load completes), On-demand (sparse irregular refreshes tied to user click cursors), and Trigger (a dense orange wall of refresh symbols with an AVOID warning sticker, every write fires a full recompute).

Stale Data Is A Feature, Not A Bug

The hardest part of running materialized views in production isn't the SQL. It's setting expectations with the humans who read the data.

A regular query is "live". Whatever the database says right now is what the user sees. A materialized view is, by definition, behind. Sometimes by minutes, sometimes by a day. If you don't make that explicit, you'll spend the rest of your career answering some version of "why does the dashboard say we're at $1.2M but I just looked at the orders table and we're at $1.3M?"

There are two parts to handling this well, and they're both unglamorous.

The first is to expose the freshness of the data alongside the data. Anywhere a materialized view's contents are shown (internal dashboard, customer-facing report, exported PDF), show the timestamp of the last refresh.

SQL

SELECT
  matviewname,
  last_refresh
FROM pg_matviews
LEFT JOIN (
  SELECT
    oid,
    -- pg_stat_user_tables.last_analyze is a reasonable proxy on most installs;
    -- if you want exact refresh time, store it yourself (see below).
    NULL::timestamp AS last_refresh
  FROM pg_class
) c ON c.oid = (matviewname::regclass)
WHERE schemaname = 'public';

Postgres doesn't actually expose "the last time this materialized view was refreshed" through a built-in column. The fields in pg_matviews and pg_stat_*_tables are close approximations and often good enough, but the cleanest pattern is to track refresh time yourself: have your refresh job write the now() of when it finished into a small mat_view_refreshes table, and join against that table when you display the data.

SQL

CREATE TABLE mat_view_refreshes (
  view_name text PRIMARY KEY,
  last_refreshed_at timestamptz NOT NULL,
  duration_ms int NOT NULL
);

-- The refresh wrapper your scheduler calls
CREATE OR REPLACE FUNCTION refresh_with_log(name text) RETURNS void AS $$
DECLARE
  started timestamptz := clock_timestamp();
BEGIN
  EXECUTE format('REFRESH MATERIALIZED VIEW CONCURRENTLY %I', name);

  INSERT INTO mat_view_refreshes(view_name, last_refreshed_at, duration_ms)
  VALUES (
    name,
    clock_timestamp(),
    extract(milliseconds FROM clock_timestamp() - started)::int
  )
  ON CONFLICT (view_name)
  DO UPDATE SET
    last_refreshed_at = EXCLUDED.last_refreshed_at,
    duration_ms = EXCLUDED.duration_ms;
END;
$$ LANGUAGE plpgsql;

Now you have an authoritative record of when each view was last refreshed and how long it took. The dashboard can render "data as of 04:12 today" in the corner, support engineers can debug "is the view stale or is the math wrong?" by reading one row, and you can graph refresh duration over time to spot the slow creep before it crosses your cadence.

The second part is picking a freshness window the business has agreed to and sticking to it. "As of an hour ago, give or take" is a perfectly acceptable contract for almost every reporting use case. "Live" is a contract you can't honour with materialized views, and you shouldn't pretend you can.

When Refresh Becomes The Bottleneck

If your strategy is working (you've materialized the right query, you're using CONCURRENTLY, you're refreshing on a schedule that matches the freshness window), there's a single failure mode that eventually catches up with everyone: the refresh itself becomes too slow.

A view that refreshed in 4 seconds when you launched it refreshes in 4 minutes a year later. The data didn't get more interesting, there's just more of it. At some point a 4-minute refresh, run every five minutes, is a database that's spending most of its time refreshing and the rest of its time barely keeping up. The diff that CONCURRENTLY computes is itself a join over the old and new contents, and as both grow, that join grows.

The conventional moves, in roughly increasing order of effort:

Add the right indexes to the underlying tables. The refresh runs the materialized view's defining query, and that query is subject to the same EXPLAIN ANALYZE you'd run on any other slow query. If the plan is doing a sequential scan over orders because there's no index on created_at, fix that first. The fix is usually not in the materialized view at all.
Narrow the view's scope. A common pattern is a materialized view that aggregates all historical data when the dashboard only ever shows the last two years. Add a WHERE created_at >= now() - interval '2 years' to the view definition and watch the refresh time fall by an order of magnitude. The history before that point isn't going to change, so anyone who needs it can read it from a separate, less frequently refreshed view.
Split one big view into several smaller ones. If your weekly revenue rollup also computes monthly totals and yearly totals, that's three different aggregations crammed into one materialized view, refreshing all three every time any of them is needed. Splitting them lets each one refresh on its own cadence. The yearly view doesn't need to update every five minutes.
Move to a maintained summary table. A materialized view's defining query is the contract. Postgres has to recompute it from scratch (or diff the recomputation, with CONCURRENTLY). A regular table you maintain yourself with triggers or with the application layer can be incrementally updated: when a new order arrives, increment the right week-region cell in the summary table by the order's value. This is more code, more invariants to protect, and more places for bugs to hide. But it scales further than a materialized view ever will, because the cost of an update is proportional to the size of the change, not the size of the result.

The order matters. Most teams jump straight to step 4 because it sounds more sophisticated, then spend six months chasing summary-table invariants when narrowing the view's scope (step 2) would have bought them another two years. Materialized views are not the most powerful tool in the box, but they're often enough, and the moment they stop being enough is also the moment you've earned the right to write something more complex.

A Few Things The Docs Don't Emphasize

A handful of practical edges that don't get headline space in the official documentation but bite people in production.

Materialized views aren't replicated to logical replicas the way you might expect. The defining query is replicated, but the stored data isn't always. Depending on your replication setup, the replica may have an empty matview that needs its own REFRESH to populate, and refreshing on the primary doesn't always propagate. Test this on your topology before assuming it works.

Indexes on the materialized view stick around across refreshes, which is exactly what you want. You don't have to recreate them after every REFRESH. The data in the table is replaced (or diffed, with CONCURRENTLY); the indexes are maintained as part of the operation.

REFRESH MATERIALIZED VIEW is transactional. You can wrap it in a BEGIN ... COMMIT and roll it back, and other transactions won't see the new contents until the commit. Useful when you want a refresh to happen as part of a larger atomic operation, like "finish loading today's data and refresh the rollup as one logical step."

A materialized view participates in pg_dump and backups like a regular table. Its contents go in the dump, not just the definition. If you're paying attention to dump size and one of your matviews holds 40GB, that's why your nightly backup got slower.

The Decision

The tool itself isn't complicated. A materialized view is a query result you stored as a table, plus a refresh button you remember to press. The interesting decisions are the ones around it: is the freshness window honest about what the business needs, is the refresh cheap enough to be sustainable a year from now, are you exposing the staleness clearly enough that no one is surprised by it.

If you can answer those three questions, you've got the right cache for reporting. If you can't, if the freshness window is "as live as possible," if the refresh is already slow at launch, if no one in the room is comfortable saying "this is data from an hour ago", you don't have a materialized view problem. You have a different problem, and a materialized view is going to make it worse before it makes it better.

Postgres gave you the button. Pressing it on the right schedule, for the right query, on the right report. That part is on you.

PostgreSQL Materialized Views: The Reporting Cache You Have To Refresh Yourself

A View vs A Materialized View

When You Actually Want One

REFRESH MATERIALIZED VIEW: The Two Modes That Matter

When To Refresh: Picking A Strategy

Scheduled refresh

After-the-fact refresh

On-demand refresh

Trigger-based refresh

Stale Data Is A Feature, Not A Bug

When Refresh Becomes The Bottleneck

A Few Things The Docs Don't Emphasize

The Decision

Let’s make something great together

Links

Contacts

A View vs A Materialized View

When You Actually Want One

REFRESH MATERIALIZED VIEW: The Two Modes That Matter

When To Refresh: Picking A Strategy

Scheduled refresh

After-the-fact refresh

On-demand refresh

Trigger-based refresh

Stale Data Is A Feature, Not A Bug

When Refresh Becomes The Bottleneck

A Few Things The Docs Don't Emphasize

The Decision

You might also like

MySQL Query Optimization: Reading EXPLAIN Like A Backend Engineer

MySQL Indexes Explained Through Real Query Examples

MySQL Transactions And Isolation Levels Explained With Real Sessions

Let’s make something great together