Kubernetes for Django: Deployments, HPA, Ingress, ConfigMaps, Secrets, and PgBouncer

A pragmatic Kubernetes setup for a real Django app: Deployment + Service + Ingress, ConfigMaps and Secrets done right, liveness/readiness probes that work, HPA on the right metric, and PgBouncer in front of PostgreSQL.

DjangoZen Team, Apr 25, 2026

Kubernetes is the industry standard for running containers at scale, and it brings real benefits to a Django app — self-healing, horizontal autoscaling, zero-downtime rollouts, and declarative infrastructure. It also brings real complexity, and deploying Django on it well means understanding a handful of core objects and a few Django-specific concerns. This tutorial covers the path from a containerized Django app to a production Kubernetes deployment: Deployments, Services, Ingress, autoscaling, configuration, secrets, and connection pooling.

When Kubernetes is worth it

Kubernetes is powerful but not free: it adds a substantial operational layer that a single-server deployment does not need. It earns its complexity when you are running at scale, need high availability across machines, want automated scaling and self-healing, run many services, or have a team that already operates it. For a small app, a managed platform or a couple of well-configured servers is simpler and perfectly adequate. Be honest about which situation you are in: adopting Kubernetes for a low-traffic app means paying its complexity tax for benefits you do not yet need. This tutorial assumes you have a genuine reason (scale, availability, or an organizational standard) to run Django on it.

Containerizing Django first

Kubernetes runs containers, so everything starts with a solid container image of your Django app. The image should be lean (a multi-stage build keeps it small), serve the app through a production WSGI or ASGI server such as gunicorn or uvicorn, run as a non-root user, and contain no secrets. Collect static files at build time and treat the image as immutable: configuration comes in at runtime and is never baked into the image. A clean, well-built container is the foundation; Kubernetes only orchestrates what you give it, so a bloated or insecure image undermines everything above it. Get the container right before worrying about cluster objects.

Pods and Deployments

The smallest deployable unit in Kubernetes is a Pod — one or more containers running together — but you rarely manage Pods directly. Instead you declare a Deployment, which describes the desired state: which image to run, how many replicas, and how to update them. Kubernetes continuously reconciles reality to that desired state, replacing failed Pods and maintaining the replica count automatically.

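apiVersion: apps/v1
kind: Deployment
metadata: {name: djzen-web}
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the Pod template labels;
  # the app: djzen-web label is an illustrative choice
  selector:
    matchLabels: {app: djzen-web}
  template:
    metadata:
      labels: {app: djzen-web}
    spec:
      containers:
      - name: web
        image: registry/djzen:1.4.2
        ports: [{containerPort: 8000}]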

This declarative model — you describe what you want, Kubernetes makes it so — is the heart of how the platform self-heals and scales.

Services: stable networking

Pods are ephemeral — they come and go, each with a new IP — so you never address them directly. A Service provides a stable virtual IP and DNS name that load-balances across the healthy Pods behind it, so the rest of your system talks to "djzen-web" without caring which Pods exist right now. As Pods are replaced or scaled, the Service keeps routing to whatever is currently healthy. This decoupling of a stable address from the churning set of Pods behind it is what makes Kubernetes networking work; your app and your Ingress target the Service, and the Service handles the moving parts.
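As a minimal sketch, a Service for the web Deployment might look like the following. It assumes the Pods carry an app: djzen-web label (a label chosen here for illustration) and that the Django server listens on port 8000.

apiVersion: v1
kind: Service
metadata: {name: djzen-web}
spec:
  # Assumed label; it must match the labels on the Deployment's Pod template
  selector: {app: djzen-web}
  ports:
  - name: http
    port: 80          # port other workloads in the cluster connect to
    targetPort: 8000  # containerPort the Django server listens on

Inside the same namespace, other workloads can then reach the app at http://djzen-web without knowing any Pod IP.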

Ingress: getting traffic in

A Service load-balances inside the cluster, but external HTTP traffic enters through an Ingress, which routes requests by hostname and path to the right Service and typically terminates TLS. The Ingress is where you configure your domain, your certificates (often automated with cert-manager and Let's Encrypt), and routing rules. It is the front door of your cluster. Together, Ingress for entry, Service for stable routing, and Deployment for the running Pods form the core request path: traffic hits the Ingress, is routed to a Service, and load-balanced across the Pods of your Deployment. Understanding this chain is most of understanding how a request reaches your Django code.
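The request path can be sketched with an Ingress like the one below. An Ingress object only takes effect when an ingress controller is running in the cluster. This example assumes an NGINX controller, cert-manager with a ClusterIssuer named letsencrypt-prod, the placeholder hostname djzen.example.com, and a Service named djzen-web that exposes port 80. All of those names are illustrative.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: djzen-web
  annotations:
    # Assumes cert-manager is installed with this (hypothetical) ClusterIssuer
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx        # assumes an NGINX ingress controller
  tls:
  - hosts: [djzen.example.com]
    secretName: djzen-web-tls    # Secret where the certificate is stored
  rules:
  - host: djzen.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: djzen-web
            port: {number: 80}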

Health checks: liveness and readiness

Kubernetes needs to know whether a Pod is healthy, and it asks through two probes. A readiness probe tells it whether a Pod is ready to receive traffic. Kubernetes only routes to ready Pods, which is how it avoids sending requests to a container that is still starting up or temporarily overloaded. A liveness probe tells it whether a container is alive or stuck; a failing liveness probe triggers a restart of that container. For Django, expose a lightweight health endpoint and wire both probes to it. Keep the liveness check free of database or cache lookups, otherwise an outage in a dependency restarts every Pod at once. Getting probes right is what enables self-healing and zero-downtime rollouts: without a readiness probe, a deploy routes traffic to Pods that are not ready yet, and users see errors during every release.
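A probe configuration along these lines is one way to wire this up. The /healthz/ path is a hypothetical Django view that returns HTTP 200, and the timings are illustrative starting points rather than recommendations. One Django-specific detail: the kubelet sends the Pod IP as the Host header by default, which ALLOWED_HOSTS will typically reject, so the sketch sets an explicit Host header (djzen.example.com is a placeholder).

# Fragment of the web container in the Deployment's Pod template
readinessProbe:
  httpGet:
    path: /healthz/              # hypothetical health view
    port: 8000
    httpHeaders:
    - {name: Host, value: djzen.example.com}   # must be listed in ALLOWED_HOSTS
  periodSeconds: 5
  failureThreshold: 3            # removed from the Service after 3 failures
livenessProbe:
  httpGet:
    path: /healthz/
    port: 8000
    httpHeaders:
    - {name: Host, value: djzen.example.com}
  initialDelaySeconds: 10        # give gunicorn time to boot
  periodSeconds: 10
  failureThreshold: 3            # container restarted after 3 failures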

Horizontal Pod Autoscaling

One of Kubernetes' headline features is automatic scaling. The HorizontalPodAutoscaler watches a metric — typically CPU, or a custom metric like request rate — and adds or removes Pod replicas to keep it near a target:

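apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata: {name: djzen-web}
spec:
  # apiVersion lets the autoscaler resolve the Deployment in the apps group
  scaleTargetRef: {apiVersion: apps/v1, kind: Deployment, name: djzen-web}
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    # target 70% of the CPU request, averaged across Pods
    resource: {name: cpu, target: {type: Utilization, averageUtilization: 70}}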

Traffic spikes scale you up, quiet periods scale you down, all automatically. For this to work, the cluster needs a metrics source such as metrics-server, your Pods must declare CPU requests (utilization is measured as a percentage of the request), and your app must handle Pods appearing and disappearing gracefully.

Resource requests and limits

Every container should declare how much CPU and memory it requests (what it needs to be scheduled) and its limits (the ceiling it cannot exceed). Requests let Kubernetes pack Pods onto nodes sensibly and underpin autoscaling decisions; limits prevent one container from starving its neighbors or running away with memory. Set these deliberately based on real measurements — too low and Pods get throttled or killed, too high and you waste capacity and pack poorly. Resource configuration is unglamorous but central: it is what lets the scheduler do its job, keeps the cluster stable under load, and makes autoscaling meaningful. Skipping it leads to noisy-neighbor problems and unpredictable behavior under pressure.
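In a manifest, requests and limits sit on each container. The numbers below are placeholders to show the shape, not measured values for any real app.

# Fragment of the web container spec; values are placeholders
resources:
  requests:
    cpu: 250m        # a quarter of a core; used for scheduling and HPA utilization
    memory: 256Mi
  limits:
    cpu: "1"         # usage above the CPU limit is throttled
    memory: 512Mi    # exceeding the memory limit gets the container OOM-killed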

Configuration with ConfigMaps

A core principle is keeping configuration out of the image so the same image runs in every environment, configured at runtime. ConfigMaps hold non-secret configuration (feature flags, hostnames, tuning values) and inject it into Pods as environment variables or files. This means you build one immutable image and configure it per environment with different ConfigMaps, rather than baking settings in or building separate images. It follows the twelve-factor principle of strictly separating config from code, and it makes promotions between environments clean: the same artifact moves from staging to production, and only its configuration changes. ConfigMaps are how Django settings driven by environment variables come together in Kubernetes.
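A small sketch of the pattern follows. The ConfigMap name and the variable names are hypothetical; they stand in for whatever your settings.py reads from the environment.

apiVersion: v1
kind: ConfigMap
metadata: {name: djzen-config}
data:
  # Hypothetical variable names read by settings.py via os.environ
  DJANGO_ALLOWED_HOSTS: djzen.example.com
  DJANGO_DEBUG: "false"          # ConfigMap values are always strings
  DATABASE_HOST: pgbouncer
---
# Fragment of the web container spec in the Deployment
envFrom:
- configMapRef: {name: djzen-config}   # every key becomes an environment variable

Promoting to another environment then means applying a different ConfigMap with the same keys while the image tag stays untouched.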

Secrets management

Sensitive configuration — the SECRET_KEY, database passwords, API keys — goes in Secrets rather than ConfigMaps. Secrets are injected the same way (env vars or files) but are handled with more care: kept out of your image and your git repository, and ideally encrypted at rest and backed by an external secrets manager for rotation and audit. Note that base64-encoded Kubernetes Secrets are not encrypted by default, so for real security enable encryption at rest or integrate a dedicated secrets store. The discipline is the same as anywhere: secrets never live in images or source control, they are injected at runtime, and they are rotatable. Treat them as the crown jewels, because a leaked database password defeats everything else.
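For illustration, a Secret and its injection might look like this. The names are hypothetical and the values are placeholders. A manifest holding real values should not be committed to git; in practice it would be created by a secrets operator or the deploy pipeline.

apiVersion: v1
kind: Secret
metadata: {name: djzen-secrets}
type: Opaque
stringData:                      # written as plain text, stored base64-encoded
  DJANGO_SECRET_KEY: replace-me  # placeholder, never a real key
  DATABASE_PASSWORD: replace-me
---
# Fragment of the web container spec
env:
- name: DJANGO_SECRET_KEY
  valueFrom:
    secretKeyRef: {name: djzen-secrets, key: DJANGO_SECRET_KEY}
- name: DATABASE_PASSWORD
  valueFrom:
    secretKeyRef: {name: djzen-secrets, key: DATABASE_PASSWORD}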

Running migrations safely

Database migrations need special handling in Kubernetes because multiple replicas roll out together and you must not run migrations from every Pod simultaneously. The clean pattern is to run migrations as a separate step before the new Pods start serving, typically a Kubernetes Job triggered by the deploy pipeline, so the schema is updated exactly once, in a controlled way. An init container is not a substitute here: it runs once per Pod, so every replica would attempt the migration. Combined with backward-compatible migrations (a discipline of its own), this lets you deploy without downtime. Running migrate inside every Pod's startup is a recipe for race conditions and conflicts; isolate it into a single, deliberate step that the rollout waits on.
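A Job along these lines is one way to express the migration step. The Job name is hypothetical, and the ConfigMap and Secret names (djzen-config, djzen-secrets) are assumed to hold the database settings.

apiVersion: batch/v1
kind: Job
metadata: {name: djzen-migrate-1-4-2}   # hypothetical: one Job per release
spec:
  backoffLimit: 0                # do not retry a failed migration automatically
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: registry/djzen:1.4.2     # same image as the web Pods
        command: ["python", "manage.py", "migrate", "--noinput"]
        envFrom:
        - configMapRef: {name: djzen-config}   # assumed ConfigMap
        - secretRef: {name: djzen-secrets}     # assumed Secret

A pipeline can block on the result with kubectl wait --for=condition=complete job/djzen-migrate-1-4-2 and only then update the Deployment's image.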

PgBouncer for connection pooling

Kubernetes makes a database connection problem worse: as you scale Pods, each gunicorn worker in each Pod opens its own connections, and because PostgreSQL dedicates a server process to every connection, a few dozen Pods can exhaust the database's connection limit. PgBouncer solves this by multiplexing many client connections onto a small pool of real ones. Run it as a sidecar in each Pod or as a shared Deployment, and point Django at it instead of directly at PostgreSQL. For transaction pooling, set CONN_MAX_AGE = 0 and DISABLE_SERVER_SIDE_CURSORS = True in the database settings, because server-side cursors do not survive across pooled transactions. Without pooling, autoscaling Django on Kubernetes will eventually take down the database not through query load but through sheer connection count. PgBouncer is what makes horizontal scaling and a single database coexist.
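As a sketch of the PgBouncer side, its configuration can live in a ConfigMap that is mounted at /etc/pgbouncer in the PgBouncer container. The database name, the PostgreSQL host, and the pool sizes below are assumptions for illustration; the container image and the userlist.txt credentials file are left out because they depend on your setup.

apiVersion: v1
kind: ConfigMap
metadata: {name: pgbouncer-config}
data:
  pgbouncer.ini: |
    [databases]
    ; hypothetical database name and PostgreSQL host
    djzen = host=postgres.example.internal port=5432 dbname=djzen

    [pgbouncer]
    listen_addr = 0.0.0.0
    listen_port = 6432
    auth_type = scram-sha-256
    auth_file = /etc/pgbouncer/userlist.txt
    ; return the server connection to the pool after each transaction
    pool_mode = transaction
    ; many cheap client connections share a small number of real ones
    max_client_conn = 1000
    default_pool_size = 20

Django's database HOST and PORT would then point at the PgBouncer Service on port 6432 instead of at PostgreSQL.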

Zero-downtime rollouts

Deployments support rolling updates by default: new Pods are started and verified ready before old ones are terminated, so there is always capacity serving traffic and users never see an outage during a release. This works only if your readiness probes are correct and your app shuts down gracefully — finishing in-flight requests when it receives the termination signal rather than dropping them. Configure a sensible termination grace period and handle SIGTERM to drain cleanly. With probes and graceful shutdown in place, you get the holy grail of continuous deployment: ship new versions any time, with no maintenance window and no dropped requests, because Kubernetes orchestrates the handover Pod by Pod.
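The rollout and shutdown settings described above could be expressed roughly as follows. The values are illustrative. The preStop sleep is a common workaround that gives the Service time to stop routing to the Pod before SIGTERM arrives, and it assumes the image contains a sleep binary.

# Fragments of the Deployment spec; values are illustrative
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                # start one extra Pod at a time
      maxUnavailable: 0          # never drop below the desired replica count
  template:
    spec:
      # must cover the preStop pause plus gunicorn's graceful timeout (30s by default)
      terminationGracePeriodSeconds: 45
      containers:
      - name: web
        lifecycle:
          preStop:
            exec:
              command: ["sleep", "5"]   # let endpoints update before SIGTERM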

Designing stateless Pods

Kubernetes assumes your Pods are disposable — it will kill and recreate them freely for scaling, rebalancing, or node maintenance — so your Django app must be stateless. Store nothing important on a Pod's local disk: sessions belong in a shared store like Redis or the database, uploaded files in object storage, and caches in a shared backend. Anything written to a Pod's filesystem vanishes when the Pod is replaced, which happens routinely and without warning. Designing for statelessness is not a constraint Kubernetes imposes arbitrarily; it is what makes self-healing, scaling, and rolling updates possible. A Pod that hoards local state breaks the moment Kubernetes does the very things you adopted it for.

Logging and observability in the cluster

In Kubernetes, Pods are ephemeral and numerous, so logging to a Pod's local file is useless — it disappears with the Pod. The convention is to log to stdout/stderr, where the cluster's logging stack collects, aggregates, and ships logs to a central system you can search across all Pods. The structured-logging and observability practices that matter anywhere matter more here, because there is no single server to SSH into and tail a log. Centralized log aggregation, metrics, and tracing are not optional niceties in a cluster — they are how you can see what is happening at all when your app is spread across dozens of disposable Pods on multiple nodes.

Declarative infrastructure and GitOps

Everything in Kubernetes is declared in YAML, which means your entire infrastructure can live in version control and be applied automatically — the practice known as GitOps. Your Deployments, Services, Ingress, and autoscalers are reviewed, versioned, and deployed like code, giving you auditability, easy rollback, and reproducible environments. Tools like Helm or Kustomize template these manifests so you can manage variation across environments without copy-paste. This declarative, version-controlled approach is one of Kubernetes' real strengths over imperative server management: the cluster's desired state is a reviewable artifact, and reconciling reality to it is the platform's job. Infrastructure becomes something you change through pull requests, not ad-hoc commands.
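With Kustomize, for example, a per-environment overlay might look like the sketch below. The directory layout, the namespace, and the patch file name are hypothetical.

# overlays/production/kustomization.yaml (hypothetical layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: djzen-production
resources:
- ../../base                     # shared Deployment, Service, Ingress, HPA
images:
- name: registry/djzen
  newTag: 1.4.2                  # a release is a one-line pull request here
patches:
- path: replicas-patch.yaml      # production-only overrides

Running kubectl apply -k overlays/production, or pointing a GitOps controller at that directory, renders and applies the result.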

Common pitfalls

Teams new to Django on Kubernetes hit predictable problems. Missing readiness probes cause errors during every deploy. No connection pooling exhausts the database as Pods scale. Running migrations from every Pod causes races. Storing state on local disk breaks under normal Pod churn. Unset resource requests wreck scheduling and autoscaling. Secrets baked into images leak. Each of these stems from not respecting how Kubernetes actually works — that Pods are disposable, scaled, and replaced constantly. Internalize that model, and the pitfalls become obvious to avoid; ignore it, and you fight the platform at every turn.

Namespaces and organizing the cluster

As a cluster hosts more than one application or environment, namespaces provide logical separation — grouping related resources, scoping names, and enabling per-namespace resource quotas and access control. You might run staging and production in separate namespaces, or isolate different teams' workloads. Namespaces keep a shared cluster organized and prevent one workload's resources from colliding with another's, while resource quotas stop a single namespace from consuming the whole cluster. Understanding namespaces is part of operating Kubernetes beyond a single app: they are how you bring order and boundaries to a cluster that grows to host many things, keeping it manageable and preventing noisy-neighbor and naming problems.
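A namespace with a quota can be sketched as below. The names and limits are placeholders.

apiVersion: v1
kind: Namespace
metadata: {name: djzen-staging}
---
apiVersion: v1
kind: ResourceQuota
metadata: {name: djzen-staging-quota, namespace: djzen-staging}
spec:
  hard:                          # totals across all Pods in the namespace
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "30"

Once a quota covers CPU and memory, Pods in that namespace are rejected unless they declare requests and limits for those resources.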

Jobs and CronJobs for batch work

Not all work is a long-running web server. Kubernetes Jobs run a task to completion — perfect for a one-off data migration or batch process — and CronJobs run them on a schedule, the cluster-native equivalent of cron for recurring tasks like nightly cleanups or report generation. For a Django app, these handle the periodic and one-off work that does not belong in your web Pods, running in the same cluster with the same image and configuration. Using Jobs and CronJobs for batch and scheduled work keeps it properly isolated from request-serving Pods while still benefiting from Kubernetes' scheduling, retries, and resource management, rather than awkwardly cramming scheduled tasks into your web deployment.
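As an example, Django's built-in clearsessions command could run nightly as a CronJob. The schedule and the object name are illustrative, and the container would need the same configuration and secrets as the web Pods, which are omitted here for brevity.

apiVersion: batch/v1
kind: CronJob
metadata: {name: djzen-clearsessions}
spec:
  schedule: "0 3 * * *"          # daily at 03:00 in the controller's time zone
  concurrencyPolicy: Forbid      # skip a run if the previous one is still going
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: clearsessions
            image: registry/djzen:1.4.2   # same image as the web Pods
            command: ["python", "manage.py", "clearsessions"]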

Persistent storage when you need it

Although Django Pods should be stateless, some workloads in a cluster genuinely need persistent storage, and Kubernetes provides it through persistent volumes that survive Pod restarts. You would not typically run your primary database inside the cluster on such volumes — managed database services are usually wiser — but understanding persistent volumes matters for the cases that need durable storage. The key distinction is between your stateless application Pods, which must not rely on local disk, and the deliberately stateful components that use managed persistent storage. Knowing where state belongs — externalized to managed services or persistent volumes, never on an ephemeral Pod's local filesystem — is central to designing a Kubernetes deployment correctly.
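For the components that do need durable storage, a claim like the following requests a volume. The name and size are hypothetical, and it assumes the cluster has a default StorageClass.

apiVersion: v1
kind: PersistentVolumeClaim
metadata: {name: search-index-data}   # hypothetical stateful component
spec:
  accessModes: [ReadWriteOnce]   # mounted read-write by a single node
  resources:
    requests:
      storage: 20Gi
  # storageClassName omitted, so the cluster's default StorageClass is used

A Pod mounts it through a volume that references the claim name, and the data survives when that Pod is replaced.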

Access control and cluster security

A production cluster needs its own security posture beyond the application. Role-based access control governs who and what can do what in the cluster, network policies restrict which Pods can talk to each other, and securing the cluster's control plane and secrets is essential. A Django deployment inherits the cluster's security, so these concerns are part of running it safely — a misconfigured cluster can expose your application and its secrets regardless of how well the app itself is written. Treating cluster security as a first-class concern, with least-privilege access control and network policies limiting Pod-to-Pod communication, is part of the operational responsibility that comes with adopting Kubernetes for production workloads.
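A network policy restricting who can reach the web Pods might be sketched as follows. It assumes the Pods are labeled app: djzen-web and that the ingress controller runs in a namespace named ingress-nginx. Both are illustrative, and the policy is only enforced when the cluster's network plugin supports NetworkPolicy.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: {name: djzen-web-ingress-only}
spec:
  podSelector:
    matchLabels: {app: djzen-web}      # assumed Pod label
  policyTypes: [Ingress]
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          # label Kubernetes sets on every namespace; the value is an assumption
          kubernetes.io/metadata.name: ingress-nginx
    ports:
    - {protocol: TCP, port: 8000}

With this in place, inbound traffic to the web Pods from any other source is dropped.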

Summary

Running Django on Kubernetes means mastering a core chain of objects — Deployments managing disposable Pods, Services giving them stable addresses, and Ingress routing external traffic in — plus the Django-specific concerns the platform demands. Wire readiness and liveness probes so self-healing and zero-downtime rollouts work, set resource requests so scheduling and the HorizontalPodAutoscaler function, and keep configuration in ConfigMaps and secrets in Secrets so one immutable image runs everywhere. Run migrations once as a deliberate step, pool connections through PgBouncer so autoscaling does not exhaust the database, and design Pods to be stateless because Kubernetes will replace them freely. Log to stdout for central aggregation, manage your manifests as version-controlled GitOps, and respect the disposable-Pod model that underlies it all. Adopt Kubernetes when scale and availability genuinely justify its complexity, and these patterns will give you a self-healing, autoscaling, continuously-deployable Django platform.