Advanced Caching for Django: Redis, Per-View, Fragment, and Invalidation Patterns That Actually Work

A pragmatic guide to caching in Django: choose the right level (site, view, fragment, low-level), avoid the stampede, and solve the only hard problem — invalidation — with versioned keys and signal-driven busts.

DjangoZen Team, Apr 25, 2026

Caching is the cheapest performance win there is — and the easiest to get subtly, dangerously wrong. The fastest query is the one you never run, but a cache that serves stale or, worse, another user's data turns a speedup into an incident. This tutorial covers Django caching the way it works in production: the layers from per-view to fragment to low-level, a Redis backend, and the invalidation patterns that are the actual hard part of caching.

Why caching is the highest-leverage optimization

Most expensive work in a web app is repeated unnecessarily — the same homepage rendered thousands of times a second, the same permission set computed on every request, the same aggregate recalculated for every visitor. Caching stores the result of expensive work and serves it again without redoing it, and because so much work is identical across requests, the payoff is enormous. Often a day spent caching the right things buys more headroom than a month of database scaling, because it removes load rather than spreading it. The art is identifying what is expensive, repeated, and tolerably stable, then caching it at the right layer with a sound invalidation story.

Choosing a cache backend

Django's cache framework is pluggable, and the backend choice matters. The local-memory backend is per-process, so with multiple workers each one holds its own private cache and nothing is shared: fine for development, misleading in production. Redis is the right production backend: it is shared across every worker and server, fast, and supports the TTLs and atomic operations caching needs. The built-in backend shown here ships with Django 4.0 and later and requires the redis Python package.

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
    }
}

Using a shared backend is not optional at scale — a per-worker cache means inconsistent results depending on which worker serves you, and a cache hit rate that quietly collapses as you add workers.
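A slightly fuller configuration might look like the sketch below. KEY_PREFIX, TIMEOUT, and VERSION are standard Django cache settings; the prefix string, the 300-second default, and the Redis URL are placeholder values chosen for illustration, not recommendations.

```python
# settings.py (sketch; the concrete values are illustrative assumptions)
CACHES = {
    "default": {
        # Built-in Redis backend (Django 4.0+, needs the redis package).
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
        # Prepended to every key so several apps can share one Redis instance.
        "KEY_PREFIX": "myshop",
        # Default TTL in seconds for entries written without an explicit timeout.
        "TIMEOUT": 300,
        # Default key version; changing it makes existing entries unreachable.
        "VERSION": 1,
    }
}
```

Setting an explicit default TIMEOUT keeps forgotten entries from living longer than intended, and a KEY_PREFIX avoids collisions when staging and production, or two projects, point at the same server.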

The caching layers

Django offers caching at several granularities, and choosing the right layer is half the skill. Per-view caching stores an entire rendered response, ideal for pages that are identical for many users and change slowly (the per-site cache, enabled through middleware, applies the same idea to every page at once). Template fragment caching caches expensive pieces of a page (a sidebar, a navigation menu, a rendered list) while the rest stays dynamic, perfect when most of a page is cheap but one part is costly. Low-level caching with cache.get/cache.set caches arbitrary values (a queryset result, a computed number, an API response), giving you full control over exactly what is cached and for how long. Production apps typically combine per-view, fragment, and low-level caching, each where it fits.

Per-view caching and its trap

Caching a whole view is powerful but carries the most dangerous trap in caching: serving one user's page to another. A view cached naively stores the first user's personalized response and hands it to everyone, leaking data. Per-view caching is only safe for genuinely public, identical-for-all pages, or when the cache key includes everything that varies the response: the user, their language, their permissions. Django's cache_page decorator builds its key from the URL plus the request headers named in the response's Vary header, and the vary_on_cookie and vary_on_headers decorators are how you add to that list. The discipline is to ask, for every cached view, "could this response differ between two users?" If yes, the user must be in the key, or you must not cache the whole view at all.
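As a rough illustration of the safe and the risky case, the fragment below caches one public view and one personalized view. The view names and response bodies are made up for the example; cache_page and vary_on_cookie are Django's own decorators. It is a wiring sketch that assumes a configured Django project with authentication middleware, not a standalone script.

```python
from django.http import HttpResponse
from django.views.decorators.cache import cache_page
from django.views.decorators.vary import vary_on_cookie


# Public page: identical for every visitor, so a single cached copy is safe.
@cache_page(60 * 15)
def pricing(request):
    return HttpResponse("Public pricing table")


# Personalized page: vary_on_cookie adds "Vary: Cookie" to the response,
# so cache_page keeps a separate copy per distinct Cookie header.
@cache_page(60 * 5)
@vary_on_cookie
def dashboard(request):
    return HttpResponse(f"Hello, {request.user}")
```

Varying on the cookie keeps users apart, but it also means any change to any cookie produces a new cache entry, so the hit rate for personalized views is usually low. For pages that should never be stored at all, Django's never_cache decorator is the more conservative choice.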

Fragment caching

Most pages are a mix of cheap dynamic content and a few expensive pieces, which is exactly what fragment caching addresses. You wrap the costly part of a template — a heavy rendered list, an aggregated widget — in a cache tag with a key and TTL, leaving the rest of the page dynamic:

{% load cache %}
{% cache 600 sidebar_popular request.LANGUAGE_CODE %}
    {# expensive query and rendering here #}
{% endcache %}

Note the language code in the key — anything that changes the fragment's output must be part of its key, or you serve the wrong variant. Fragment caching is the pragmatic middle ground: it removes the expensive work without forcing the whole page to be cacheable, and without the all-or-nothing risk of per-view caching.

Low-level caching with get_or_set

For precise control, cache individual values directly. The get_or_set pattern is the workhorse: return the cached value if present, otherwise compute it, store it, and return it.

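from django.core.cache import cache

from .models import Product  # assumed location of the Product model


def popular_products():
    # The callable runs only on a cache miss; its result is stored for 10 minutes.
    return cache.get_or_set(
        "popular_products",
        lambda: list(Product.objects.expensive_ranking()[:20]),
        timeout=600,
    )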

This caches exactly the expensive computation, keyed exactly as you choose, with full control over the TTL and invalidation. Low-level caching is where you handle the cases the higher layers cannot — caching a specific costly queryset, an external API response, or a computed value reused across views — and it composes with the others freely.

Invalidation: the actual hard part

The old joke, usually attributed to Phil Karlton, says there are only two hard things in computer science: cache invalidation and naming things. The cache half is no joke. Storing a value is trivial; knowing when it has become stale and removing it is where caching gets genuinely difficult. A cache that never invalidates serves wrong data; one that invalidates too eagerly provides no benefit. Every cached value needs an answer to "when does this stop being correct?", and getting that answer right, consistently, across a codebase is the real engineering challenge of caching. The remaining patterns in this tutorial exist to manage invalidation rather than to store data.

TTL-based expiry

The simplest invalidation strategy is a time-to-live: the cached value expires after a set duration, and the next request recomputes it. This is ideal when slight staleness is acceptable and the data changes on a roughly predictable cadence. A "popular this week" list that can be a few minutes out of date is a perfect fit. TTLs require no invalidation logic at all, which makes them robust and easy to reason about, at the cost of serving data that may be as old as the TTL. For the large category of data where "fresh within a few minutes" is fine, TTL expiry is the right default; reach for more complex invalidation only when staleness genuinely matters.

Event-based invalidation

When data must be fresh immediately after it changes, TTLs are not enough — you invalidate on the event that changes the data. When a product is updated, you delete its cached entries so the next read recomputes. Django signals are a natural hook: a post_save handler clears the relevant keys. The challenge is knowing which keys a change affects — a product update might invalidate its detail cache, several listing caches, and a category aggregate. This is why cache keys must be designed deliberately, so you can find and clear everything a change touches. Event-based invalidation gives immediacy at the cost of this bookkeeping, and getting the "which keys" question wrong is how stale data slips through.
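The "which keys" bookkeeping can be kept in one place, as in the sketch below. The key names and the idea that a product belongs to a single category are assumptions made for the example; the cache argument is any object exposing Django's cache API, such as django.core.cache.cache.

```python
def product_cache_keys(product_id, category_id):
    """Return every cache key that a change to this product can make stale."""
    return [
        f"product:{product_id}:detail",
        f"category:{category_id}:listing",
        f"category:{category_id}:aggregate",
        "popular_products",
    ]


def invalidate_product(cache, product_id, category_id):
    """Delete all entries affected by a product change; returns the keys cleared."""
    keys = product_cache_keys(product_id, category_id)
    cache.delete_many(keys)  # a single call rather than one delete per key
    return keys
```

In a Django project this function would typically be called from post_save and post_delete receivers for the model, passing the instance's primary key and category id. Keep in mind that QuerySet.update() and bulk_create() do not send those signals, so code paths that use them need to call the invalidation explicitly.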

Key design and namespacing

Good cache keys are the foundation that makes invalidation tractable. Keys must encode everything that varies the value — user, language, version, relevant parameters — so two different results never collide on one key. Namespacing keys by entity (product:42:detail) makes it possible to find and clear all entries related to a thing. A powerful trick is version-based invalidation: include a version number in keys and bump it to invalidate a whole class of entries at once, sidestepping the need to enumerate them. Thoughtful key design up front is what turns invalidation from an intractable guessing game into a systematic operation.
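Both ideas, structured keys and version bumping, fit in a few small helpers. This is a minimal sketch: the function names are hypothetical, and cache stands for any object with Django's cache API (get, add, incr).

```python
def make_key(entity, entity_id, kind, **variants):
    """Build a namespaced key such as 'product:42:detail:lang=en'."""
    parts = [str(entity), str(entity_id), str(kind)]
    # Sort variants so the same inputs always produce the same key.
    parts += [f"{name}={value}" for name, value in sorted(variants.items())]
    return ":".join(parts)


def namespace_version(cache, namespace):
    """Return the current version of a namespace, creating it at 1 if absent."""
    version_key = f"{namespace}:version"
    # timeout=None means no expiry; add() is a no-op if the key already exists.
    cache.add(version_key, 1, timeout=None)
    return cache.get(version_key)


def versioned_key(cache, namespace, suffix):
    """Key that changes automatically whenever the namespace version is bumped."""
    return f"{namespace}:v{namespace_version(cache, namespace)}:{suffix}"


def bump_namespace(cache, namespace):
    """Invalidate every key in the namespace at once by incrementing its version."""
    namespace_version(cache, namespace)  # make sure the counter exists
    return cache.incr(f"{namespace}:version")
```

After bump_namespace(cache, "products"), every key built with versioned_key for that namespace points somewhere new. The old entries are not deleted; they simply become unreachable, so they should carry a TTL to age out. The price is one extra cache read per lookup to fetch the version.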

Cache stampedes

A subtle production failure: a popular cached value expires, and in the moment before it is recomputed, hundreds of simultaneous requests all miss the cache and all recompute the expensive value at once — a stampede that can overwhelm the database precisely because the item was popular. Defenses include recomputing slightly before expiry, using a lock so only one request regenerates while others briefly serve the old value, or adding jitter to TTLs so many keys do not expire at the same instant. Awareness of stampedes matters because they strike hardest at exactly the hot keys caching was meant to protect, turning a cache miss into an outage if unhandled.
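One way to combine two of these defenses, a regeneration lock and TTL jitter, is sketched below. It relies on cache.add being atomic on a shared backend such as Redis, so only one caller wins the lock. The key suffixes, default durations, and function names are assumptions for the example. Note that get_or_set on its own does not coordinate concurrent misses: every request that misses will run the callable.

```python
import random


def jittered_ttl(ttl, spread=0.1):
    """Return ttl varied by up to +/- spread so hot keys do not expire together."""
    return max(1, round(ttl * (1 + random.uniform(-spread, spread))))


def get_with_lock(cache, key, compute, ttl=600, stale_for=300, lock_ttl=30):
    """Serve from cache; on a miss let one caller recompute while others get stale data.

    None is treated as a miss, so this sketch cannot cache None values.
    """
    value = cache.get(key)
    if value is not None:
        return value

    lock_key = f"{key}:lock"
    stale_key = f"{key}:stale"
    if cache.add(lock_key, 1, lock_ttl):  # only the first caller gets True
        try:
            value = compute()
            cache.set(key, value, jittered_ttl(ttl))
            # Longer-lived copy served to other callers during the next refresh.
            cache.set(stale_key, value, ttl + stale_for)
        finally:
            cache.delete(lock_key)
        return value

    stale = cache.get(stale_key)
    if stale is not None:
        return stale
    return compute()  # nothing cached yet: fall back to computing directly
```

The lock has its own short TTL so that a worker that crashes mid-computation cannot block regeneration forever. The very first cold miss is still unprotected in this sketch, because no stale copy exists yet.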

Caching in multi-user and multi-tenant apps

Caching and per-user or per-tenant data are a dangerous combination, because the worst caching bug is serving one user's or tenant's data to another. The rule is absolute: any value that differs by user or tenant must include that identity in its cache key. A key like dashboard_stats will leak across tenants; tenant:7:dashboard_stats will not. Audit every cache write in a multi-tenant app for this, because a single un-namespaced key is a data breach waiting to be discovered. The convenience of caching must never override the isolation guarantees your application promises — when in doubt, include the identity in the key.
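A small wrapper can make the rule hard to break by accident. The class below is a hypothetical helper, not part of Django: it prefixes every key with the tenant id and refuses to be constructed without one. The wrapped cache is any object with Django's cache API.

```python
class TenantCache:
    """Wrap a cache so every key is forced under one tenant's namespace."""

    def __init__(self, cache, tenant_id):
        if tenant_id is None:
            # Fail loudly rather than fall back to a shared, leak-prone key.
            raise ValueError("tenant_id is required for tenant-scoped caching")
        self._cache = cache
        self._prefix = f"tenant:{tenant_id}:"

    def _key(self, key):
        return self._prefix + key

    def get(self, key, default=None):
        return self._cache.get(self._key(key), default)

    def set(self, key, value, timeout=300):
        self._cache.set(self._key(key), value, timeout)

    def delete(self, key):
        self._cache.delete(self._key(key))
```

With this in place, TenantCache(cache, 7).set("dashboard_stats", stats) writes to tenant:7:dashboard_stats, and tenant 8 can never read it through its own wrapper. The same approach works for per-user data by scoping on the user id.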

Cache-aside and other strategies

There is more than one way to relate a cache to its underlying data, and the pattern you choose shapes behavior. The most common is cache-aside (lazy loading): the application checks the cache, and on a miss loads from the database and populates the cache — simple and robust, which is what get_or_set implements. Write-through updates the cache whenever the database is written, keeping it fresh at the cost of write overhead. Each strategy trades freshness, complexity, and write cost differently. Knowing the named patterns helps you reason about your caching deliberately rather than ad hoc, choosing the relationship between cache and source that fits each piece of data's freshness needs.
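The two named patterns differ mainly in who populates the cache and when. The sketch below shows both side by side; load_from_db and save_to_db are hypothetical callables standing in for real ORM calls, and cache is any object with Django's cache API.

```python
def read_cache_aside(cache, key, load_from_db, ttl=600):
    """Cache-aside: check the cache first, load and populate only on a miss."""
    missing = object()  # sentinel so a cached None still counts as a hit
    value = cache.get(key, missing)
    if value is missing:
        value = load_from_db()
        cache.set(key, value, ttl)
    return value


def write_through(cache, key, value, save_to_db, ttl=600):
    """Write-through: persist first, then refresh the cache in the same code path."""
    save_to_db(value)  # if this raises, the cache is left untouched
    cache.set(key, value, ttl)
    return value
```

Writing to the database before the cache means a failed write never leaves a value in the cache that was not persisted. Inside a database transaction, the cache update is better deferred until the commit succeeds, for example with Django's transaction.on_commit.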

Cache warming

A cold cache — just after a deploy, a restart, or a flush — means the first requests all miss and hit the database at once, which can cause a latency spike or even a stampede precisely when the system is most fragile. Cache warming pre-populates important entries before they are needed, often via a background task that computes and stores the hot values proactively. For predictable, expensive, frequently-accessed data, warming the cache ahead of traffic smooths the post-deploy experience and prevents the thundering-herd effect of many simultaneous cold misses. It is the proactive complement to lazy loading: rather than waiting for the first user to pay the cost, you pay it in advance, off the request path.
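A warming routine can be as simple as a mapping from hot keys to the functions that compute them. The helper below is a sketch with hypothetical names; cache is any object with Django's cache API, and the loaders would be the same functions the request path uses on a miss.

```python
def warm_cache(cache, loaders, ttl=600):
    """Compute and store each hot value; returns the keys that were warmed."""
    warmed = []
    for key, loader in loaders.items():
        try:
            cache.set(key, loader(), ttl)
            warmed.append(key)
        except Exception:
            # One failing loader should not stop the rest; the key stays cold
            # and is filled lazily by the first request instead.
            continue
    return warmed
```

A natural place to call this is a management command run as a deploy step, or a periodic background task that refreshes entries shortly before their TTL runs out.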

Monitoring cache effectiveness

A cache is only worth its complexity if it is actually hit, and the way to know is to measure the hit ratio: the fraction of lookups served from cache rather than recomputed. A low hit ratio means the cache is providing little benefit while adding complexity and a staleness risk, signalling that your keys are too granular, your TTLs too short, or you are caching the wrong things. Monitoring the hit ratio for each cached value or use case, along with the load you are removing from the database, tells you whether your caching is paying off and where to tune it. Caching without measuring is faith; measuring turns it into engineering you can verify and improve.
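For a quick per-use measurement, a thin counting wrapper is often enough. The class below is a hypothetical helper that wraps any object with Django's cache API and tallies hits and misses on get; the counters live in process memory, so in production they would be exported to a metrics system rather than read directly.

```python
class CountingCache:
    """Wrap a cache and count hits and misses on get()."""

    _MISSING = object()  # sentinel so a cached None still counts as a hit

    def __init__(self, cache):
        self._cache = cache
        self.hits = 0
        self.misses = 0

    def get(self, key, default=None):
        value = self._cache.get(key, self._MISSING)
        if value is self._MISSING:
            self.misses += 1
            return default
        self.hits += 1
        return value

    def set(self, key, value, timeout=300):
        self._cache.set(key, value, timeout)

    def hit_ratio(self):
        """Fraction of get() calls served from cache; 0.0 before any lookups."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

For a server-wide view, Redis itself reports keyspace_hits and keyspace_misses in the output of its INFO command, which gives the overall ratio without any application changes.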

Distributed caching concerns

Once your cache is shared across many workers and servers via Redis, distributed-systems concerns appear. Invalidating an entry must propagate to all readers, which a shared store handles, but race conditions between readers and writers need thought: two requests can recompute the same expired value simultaneously, and a slow reader can write back a value that a concurrent update has already made stale. Redis itself becomes a critical dependency whose availability and capacity you must manage. A cache that grows unbounded will evict unpredictably or run out of memory, so set a memory limit and an eviction policy deliberately (in Redis, the maxmemory and maxmemory-policy settings). Treating the shared cache as the piece of distributed infrastructure it is (monitored, capacity-planned, with sensible eviction) keeps it a reliable accelerator rather than a new single point of failure.

Summary

Caching is the cheapest, highest-leverage performance work available, because so much expensive work is repeated identically across requests. Use a shared Redis backend, never a per-worker local cache, and apply the right layer for each case: per-view for public pages, fragment caching for expensive page pieces, and low-level get_or_set for specific costly values. But storing data is the easy half — invalidation is the real engineering. Lean on TTLs where slight staleness is fine, invalidate on events where freshness matters, and design keys deliberately so you can find and clear what a change affects. Defend hot keys against stampedes, and above all, namespace every per-user and per-tenant value by identity, because the one caching bug you cannot afford is serving the wrong person's data. Get invalidation right and caching is a superpower; get it wrong and it is an incident — the difference is entirely in the discipline.