Stop guessing why your app is slow. Profile a live Django process without restarting it using py-spy, trace per-request queries and timings with django-silk, read flame graphs, and turn findings into concrete fixes.
"The site feels slow" is not a bug report you can act on. Profiling turns a vague complaint into a precise answer: this function, on this query, for 800ms. The good news — you can profile a running production process without restarting it or adding a single line of code. Here's how.
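py-spy is a sampling profiler that attaches to a live Python process by PID and reads its stacks from outside the process. That means no code changes and no restart, and the overhead is low enough that it is generally considered safe on production. Two caveats: attaching usually needs elevated privileges (sudo on Linux, or the SYS_PTRACE capability inside a container), and py-spy briefly pauses the target for each sample unless you pass --nonblocking, which trades some accuracy for even less impact.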
pip install py-spy
# live top-like view of where a worker spends time
py-spy top --pid 2372612
# record 30s and emit an interactive flame graph
py-spy record -o profile.svg --pid 2372612 --duration 30
Use py-spy dump --pid <PID> to instantly see the stack of a hung worker — invaluable for diagnosing a request stuck on a lock or a slow external call.
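To rehearse these commands before pointing them at production, it can help to have a disposable target. The script below is a minimal sketch and not part of py-spy: the names hot_loop and serve and the TOY_SECONDS environment variable are made up for illustration. It prints its PID and alternates a CPU-bound phase with a short sleep, so hot_loop should show up as the dominant frame in py-spy top.

```python
import os
import time


def hot_loop(seconds):
    """Burn CPU for roughly `seconds`; returns how many inner batches ran."""
    deadline = time.perf_counter() + seconds
    batches = 0
    while time.perf_counter() < deadline:
        sum(i * i for i in range(1000))  # deliberately wasteful work
        batches += 1
    return batches


def serve(total_seconds):
    """Alternate busy and sleeping phases until total_seconds has elapsed."""
    pid = os.getpid()
    print(f"PID {pid}: try `py-spy top --pid {pid}`", flush=True)
    end = time.perf_counter() + total_seconds
    batches = 0
    while time.perf_counter() < end:
        batches += hot_loop(0.2)  # CPU-bound phase, visible to the profiler
        time.sleep(0.05)          # idle phase, like waiting on I/O
    return batches


if __name__ == "__main__":
    # TOY_SECONDS is a hypothetical variable for this sketch; the default of 0
    # means the script prints its PID and exits straight away.
    serve(float(os.environ.get("TOY_SECONDS", "0")))
```

Run it with TOY_SECONDS=60 python toy_worker.py (the filename is arbitrary), then attach py-spy to the printed PID from a second terminal.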
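A flame graph stacks call frames. The x-axis is the share of samples in which a frame appeared (wider = more time), not chronological order; the other axis is call depth. py-spy's SVG is typically drawn inverted, icicle style, with the entry point at the top and the deepest calls at the bottom, so check which way your graph grows before reading it. Either way, scan for the widest bars: those are your hot paths. A wide plateau at the leaf end of a stack is a single function eating CPU; a wide stack that ends in database driver code suggests you're query-bound, not CPU-bound. By default py-spy skips idle threads, so time spent blocked waiting on the database can be underrepresented; add --idle to include those waits.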
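The widths in a flame graph are just sample counts, and the same numbers can be tallied by hand. The sketch below assumes the folded-stack text that py-spy record --format raw writes: one line per unique stack, frames separated by semicolons from root to leaf, followed by a sample count. The function name summarize_folded and the sample lines are invented for illustration.

```python
from collections import Counter


def summarize_folded(lines):
    """Tally samples per frame from folded stacks ("a;b;c 42" per line).

    Returns (self_counts, inclusive_counts, total_samples), where self counts
    credit only the leaf frame and inclusive counts credit every frame on the
    stack, which is what bar width in a flame graph represents.
    """
    self_counts = Counter()
    inclusive_counts = Counter()
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        stack, _, count_text = line.rpartition(" ")
        count = int(count_text)
        frames = stack.split(";")
        total += count
        self_counts[frames[-1]] += count   # leaf frame: where time was spent
        for frame in set(frames):          # set() avoids double-counting recursion
            inclusive_counts[frame] += count
    return self_counts, inclusive_counts, total


# Hypothetical sample data in the folded format.
SAMPLE = [
    "handle (views.py:10);render (template.py:5) 30",
    "handle (views.py:10);execute (db.py:80) 60",
    "handle (views.py:10) 10",
]

self_counts, inclusive_counts, total = summarize_folded(SAMPLE)
for frame, count in self_counts.most_common():
    print(f"{count / total:6.1%}  {frame}")
```

In this made-up profile the database frame is the leaf in 60 of 100 samples, so it would be the widest leaf bar in the graph, while handle spans the full width because every sample passes through it.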
Where py-spy shows where the process spends its time, django-silk records the request lifecycle: every SQL query a request ran, how long each took, and the runs of near-identical queries that give away an N+1 problem.
pip install django-silk
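# settings.py
INSTALLED_APPS += ["silk"]
MIDDLEWARE = ["silk.middleware.SilkyMiddleware"] + MIDDLEWARE
# urls.py (exposes the Silk UI at /silk/)
from django.urls import include, path
urlpatterns += [path("silk/", include("silk.urls", namespace="silk"))]
python manage.py migrate  # creates Silk's tables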
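Suppose Silk's request detail shows 23 queries on a page that should run two, and most of them are the same SELECT repeated with a different ID: that is an N+1. Fix it with select_related (foreign keys and one-to-one relations, fetched through a JOIN) or prefetch_related (many-to-many and reverse relations, fetched with one extra query per relation) and watch the count collapse. Run Silk on staging or behind admin-only access, for example with its SILKY_AUTHENTICATION and SILKY_AUTHORISATION settings; it adds overhead and stores request data.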
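To make the query counts concrete, here is a minimal sketch of the N+1 pattern and its JOIN-based fix. It uses the standard library's sqlite3 instead of the Django ORM so that it runs on its own; the author and book tables, the CountingConnection wrapper, and the function names are all hypothetical.

```python
import sqlite3


def build_db():
    """Create an in-memory database with 5 authors and 20 books."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE book (
            id INTEGER PRIMARY KEY,
            title TEXT,
            author_id INTEGER REFERENCES author(id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO author VALUES (?, ?)",
        [(i, f"Author {i}") for i in range(1, 6)],
    )
    conn.executemany(
        "INSERT INTO book VALUES (?, ?, ?)",
        [(i, f"Book {i}", (i % 5) + 1) for i in range(1, 21)],
    )
    return conn


class CountingConnection:
    """Thin wrapper that counts queries, roughly what a query profiler reports."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def titles_n_plus_one(db):
    """One query for the books, then one more query per book for its author."""
    rows = db.query("SELECT title, author_id FROM book ORDER BY id")
    result = []
    for title, author_id in rows:
        name = db.query("SELECT name FROM author WHERE id = ?", (author_id,))[0][0]
        result.append((title, name))
    return result


def titles_joined(db):
    """A single JOIN, the shape of query that select_related produces."""
    return db.query(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    )


conn = build_db()
slow, fast = CountingConnection(conn), CountingConnection(conn)
slow_rows = titles_n_plus_one(slow)
fast_rows = titles_joined(fast)
print(f"N+1: {slow.count} queries, JOIN: {fast.count} query")
```

Both functions return the same rows, but the first issues 21 queries and the second issues one. In the ORM, the rough equivalent of titles_joined for a hypothetical Book model with an author foreign key would be Book.objects.select_related("author").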
Profile, don't guess. py-spy attaches to live production processes for CPU and hung-worker diagnosis with no code changes; django-silk exposes per-request queries and the repeated-query N+1 patterns that are a common cause of Django slowness; flame graphs point you straight at the widest, hottest path. Change one thing at a time and always measure the delta: that discipline is what separates real optimization from superstition.