Master async programming in Python. Learn event loops, coroutines, tasks, aiohttp for concurrent HTTP requests, and when async actually makes sense vs. threads or multiprocessing.
Async Python confuses people because it looks like threading but works nothing like it. There is no parallelism, no second thread, no preemption — just a single thread that suspends a task whenever it would wait and runs another in the meantime. Once that mental model clicks, asyncio stops being mysterious. This tutorial builds that model from the ground up, then covers the patterns, pitfalls, and real-world uses that make async genuinely faster — and the cases where it does not help at all.
Most web work is not CPU-bound; it is spent waiting — for a database, an external API, a disk, a network socket. A synchronous program waiting on a slow API call sits idle, doing nothing, until the response arrives. If you are making fifty such calls one after another, you wait for the sum of all fifty. Async lets that single thread start all fifty, and while each is waiting, work on the others, so the total time is roughly the slowest single call rather than the sum. This is the entire value proposition: async makes I/O-bound work concurrent without threads. For CPU-bound work — heavy computation — it does nothing, because there is no waiting to overlap.
At the heart of asyncio is the event loop: a single-threaded scheduler that runs tasks until each one hits a point where it must wait, at which moment the task yields control back to the loop, which runs another ready task. When a task's awaited operation completes, the loop resumes it. There is only ever one thing executing at a time — concurrency comes from interleaving many tasks during their wait periods, not from running them simultaneously. Internalizing that "one thread, cooperative switching at await points" is the single insight that makes all of async make sense; everything else is mechanics built on top of it.
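The scheduling idea can be sketched without asyncio at all. The toy below is an illustration only, not how asyncio is implemented: plain generators stand in for tasks, each `yield` stands in for an await point, and the hypothetical `run_round_robin` function plays the role of the loop.

```python
from collections import deque

def worker(name, steps, log):
    """A generator standing in for a task; each `yield` models an await point."""
    for i in range(steps):
        log.append(f"{name} step {i}")
        yield  # hand control back to the scheduler

def run_round_robin(tasks):
    """Run generator-based tasks one at a time until all are finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)        # run until the task's next yield
        except StopIteration:
            continue          # task finished; drop it
        ready.append(task)    # not finished; requeue behind the others

log = []
run_round_robin([worker("A", 2, log), worker("B", 3, log)])
print(log)
# ['A step 0', 'B step 0', 'A step 1', 'B step 1', 'B step 2']
```

Only one worker runs at any moment, yet their steps interleave. The real event loop adds I/O readiness and timers on top of the same cooperative handoff.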
A function defined with async def is a coroutine function: calling it does not run the body, it returns a coroutine object that runs only when awaited or scheduled on the loop. The await keyword is where control changes hands: it suspends the current coroutine until the awaited thing completes, handing control back to the event loop so other tasks can run in the meantime.
import asyncio
import aiohttp

async def fetch(session, url):
    # Suspends at each network wait so the loop can run other tasks.
    async with session.get(url) as resp:
        return await resp.text()
Every await is a potential switch point. Between awaits, your code runs uninterrupted; at an await, the loop may run something else. That cooperative model is why async is predictable in a way threads are not.
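A small experiment makes the "calling does not run it" point concrete. This is a minimal sketch using only the standard library; `greet` and the `calls` list are hypothetical names introduced for the demonstration.

```python
import asyncio

calls = []

async def greet(name):
    calls.append(name)        # runs only once the coroutine is driven
    await asyncio.sleep(0)    # a switch point: the loop may run other tasks here
    return f"hello, {name}"

async def main():
    coro = greet("world")     # creates a coroutine object; the body has not started
    started_before_await = list(calls)
    result = await coro       # now the body runs to completion
    return started_before_await, result

started_before_await, result = asyncio.run(main())
print(started_before_await, result)  # [] hello, world
```

The empty list shows that nothing inside `greet` executed until the coroutine object was awaited.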
Awaiting coroutines one by one is still sequential — you only get concurrency when multiple operations are in flight at once. asyncio.gather schedules many coroutines together and waits for them all, which is where the speedup appears:
async def fetch_all(urls):
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, u) for u in urls))
Fifty URLs that would take fifty sequential round trips now overlap, completing in about the time of the slowest one. This is the canonical async win: fan out independent I/O operations, then collect the results together. For finer control over many tasks, asyncio.TaskGroup (Python 3.11 and later) offers structured concurrency with cleaner error handling.
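The difference is easy to measure without touching the network. In this sketch the hypothetical `fake_request` sleeps instead of doing real I/O, so the numbers only illustrate the shape of the effect, not real HTTP timings.

```python
import asyncio
import time

async def fake_request(delay):
    """Hypothetical stand-in for a network call; sleeps instead of doing I/O."""
    await asyncio.sleep(delay)
    return delay

async def sequential(delays):
    # One await at a time: total time is the sum of the delays.
    return [await fake_request(d) for d in delays]

async def concurrent(delays):
    # All requests in flight together: total time is about the longest delay.
    return await asyncio.gather(*(fake_request(d) for d in delays))

def timed(coro):
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start

delays = [0.05] * 5
seq_result, seq_time = timed(sequential(delays))
con_result, con_time = timed(concurrent(delays))
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Both versions return the same results in the same order; only the elapsed time differs.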
Because there is one thread and switching is cooperative, any code that blocks without awaiting freezes everything. A synchronous database call, a time.sleep, a CPU-heavy loop, or a blocking library call holds the single thread and stalls every other task until it returns. This is the number-one async bug, and it is insidious because the program still works; it is just mysteriously not concurrent. The rule is absolute: never call blocking code directly in a coroutine. Use the async equivalent (asyncio.sleep, an async HTTP client, an async database driver), or push the blocking work to a thread pool with loop.run_in_executor or asyncio.to_thread (Python 3.9+) so it does not hold the loop.
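One way to see the stall is to run a heartbeat task next to a blocking call and count how often the heartbeat gets to run. The sketch below assumes Python 3.9+ for `asyncio.to_thread`; `heartbeat`, `blocking_work`, and `run_case` are hypothetical names, and `time.sleep` stands in for any synchronous library call.

```python
import asyncio
import time

async def heartbeat(ticks, stop):
    """Records a tick every time the loop gets a chance to run this task."""
    while not stop.is_set():
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.01)

def blocking_work():
    time.sleep(0.2)  # stands in for a synchronous driver or library call
    return "done"

async def run_case(offload):
    ticks, stop = [], asyncio.Event()
    beat = asyncio.create_task(heartbeat(ticks, stop))
    await asyncio.sleep(0)          # let the heartbeat record its first tick
    before = len(ticks)
    if offload:
        result = await asyncio.to_thread(blocking_work)  # loop stays free
    else:
        result = blocking_work()    # holds the loop for the full 0.2 s
    during = len(ticks) - before    # ticks recorded while the work ran
    stop.set()
    await beat
    return result, during

blocked = asyncio.run(run_case(offload=False))
offloaded = asyncio.run(run_case(offload=True))
print("direct call:", blocked)
print("to_thread:  ", offloaded)
```

With the direct call the heartbeat records no ticks while the work runs; with `to_thread` it keeps ticking roughly every 10 ms.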
A frequent misunderstanding is expecting async to speed up computation. It cannot. Async overlaps waiting, and a CPU-bound task does not wait — it occupies the single thread fully while it computes, blocking every other task exactly like any other non-awaiting code. For parallel computation you need multiple cores, which means multiprocessing or a process pool, not asyncio. The clean division: I/O-bound and high-concurrency work is for async; CPU-bound work is for processes. Mixing them up — reaching for async to make a number-crunching job faster — leads to disappointment and confusing code that is no quicker than the synchronous version.
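The lack of overlap is visible in the order of events. In this sketch the hypothetical `crunch` coroutine does pure computation with no await inside, so even under `asyncio.gather` the two calls run back to back.

```python
import asyncio

async def crunch(name, log):
    log.append(f"{name} start")
    total = sum(i * i for i in range(200_000))  # pure computation, no await
    log.append(f"{name} end")
    return total

async def main():
    log = []
    await asyncio.gather(crunch("a", log), crunch("b", log))
    return log

log = asyncio.run(main())
print(log)  # ['a start', 'a end', 'b start', 'b end']
```

The second task cannot start until the first finishes, because nothing ever yields to the loop. Getting real parallelism for this kind of work usually means handing the function to a `concurrent.futures.ProcessPoolExecutor`, for example through `loop.run_in_executor`.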
The most common real-world use of async is making many outbound HTTP calls, and a synchronous client used in async code is the classic loop-blocking mistake. Use an async client — aiohttp or httpx in async mode — which awaits at every network operation so the loop stays free. Reuse a single session across requests so connections are pooled rather than reopened, and bound concurrency with a semaphore so you do not open a thousand sockets at once and overwhelm the target or yourself. Aggregating data from many APIs, calling microservices, scraping — these are where async pays off most visibly, turning minutes of sequential waiting into seconds of overlapped calls.
Unbounded concurrency is its own problem: firing ten thousand requests at once will exhaust file descriptors, trip rate limits, or knock over the service you are calling. An asyncio.Semaphore caps how many operations run simultaneously, letting you say "at most twenty in flight" while still processing thousands in total:
sem = asyncio.Semaphore(20)  # at most 20 requests in flight

async def fetch_limited(session, url):
    async with sem:  # waits here while 20 others hold the semaphore
        return await fetch(session, url)
This is essential, real-world hygiene. Concurrency is a resource you must bound, not maximize blindly: the goal is enough concurrency to overlap waits, not so much that you cause failures elsewhere.
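The cap can be checked directly by counting how many jobs are inside the semaphore at once. This is a sketch with hypothetical names (`limited_job`, `state`); `asyncio.sleep` stands in for the real request.

```python
import asyncio

async def limited_job(sem, state):
    async with sem:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)   # stands in for the real I/O
        state["active"] -= 1

async def main(total, limit):
    sem = asyncio.Semaphore(limit)  # created inside the running loop
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*(limited_job(sem, state) for _ in range(total)))
    return state

state = asyncio.run(main(total=100, limit=20))
print(state)  # {'active': 0, 'peak': 20}
```

All one hundred jobs complete, but never more than twenty are in flight together. Creating the semaphore inside `main` rather than at import time is a cautious choice here: older Python versions tied such objects to whichever loop existed when they were created.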
When many tasks run together, failures need deliberate handling. By default gather propagates the first exception to the caller; the other awaitables are not cancelled and keep running, but the caller never sees their results. Pass return_exceptions=True to collect results and exceptions together so one failure does not lose every other result. TaskGroup takes a stricter stance, cancelling siblings when one fails and re-raising the failures as an ExceptionGroup, which is often what you want for an all-or-nothing operation. Decide per use case: a batch where partial success is fine wants gathered exceptions; a transaction-like group wants cancellation. Either way, never let an exception in one concurrent task silently vanish, because unhandled task exceptions are a common, hard-to-debug async failure.
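For the partial-success case, a minimal sketch looks like this. The `job` coroutine and its failure on input 2 are invented for the demonstration.

```python
import asyncio

async def job(n):
    await asyncio.sleep(0)
    if n == 2:
        raise ValueError(f"job {n} failed")
    return n * 10

async def main():
    outcomes = await asyncio.gather(
        *(job(n) for n in range(4)),
        return_exceptions=True,   # exceptions come back as values, in order
    )
    results = [o for o in outcomes if not isinstance(o, BaseException)]
    errors = [o for o in outcomes if isinstance(o, BaseException)]
    return results, errors

results, errors = asyncio.run(main())
print(results)  # [0, 10, 30]
print(errors)   # [ValueError('job 2 failed')]
```

Splitting the outcomes afterwards keeps the three successful results while still making the failure visible for logging or retry.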
Django supports async views, middleware, and an async ORM interface, but the integration has sharp edges worth understanding. Calling the synchronous ORM directly from an async view does not quietly block the loop: Django detects the running event loop and raises SynchronousOnlyOperation. Wrap such calls in sync_to_async, which runs them in a thread, or use the ORM's async methods (the a-prefixed variants such as aget and acreate, and async for over a queryset, available since Django 4.1); conversely, calling async code from a sync context needs async_to_sync. Async views shine for I/O-bound work such as calling several external services or handling many slow connections, but offer nothing for ordinary database-bound views, which are usually faster left synchronous. Adopt async in Django where the work is genuinely I/O-concurrent, and keep the rest synchronous; forcing everything async adds complexity without benefit and puts a sync/async crossing at every ORM call.
Async generators (async def with yield, consumed with async for) let you stream data as it arrives rather than buffering it all first. This is the backbone of streaming responses — paginating through an API, processing a large download, or streaming tokens from an LLM to a browser as they generate. Each yielded item is produced and consumed without waiting for the whole sequence, and awaits inside the generator keep the loop free between items. Streaming with async generators is how you handle data that is too large or too slow to materialize at once, delivering it incrementally and responsively.
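A paginated source is a natural fit for this pattern. In the sketch below, `fetch_page` is a hypothetical stand-in for an API that returns an empty page when the data runs out; the two pages of data are made up for the example.

```python
import asyncio

async def fetch_page(page):
    """Hypothetical paginated API; returns an empty list after the last page."""
    await asyncio.sleep(0)            # stands in for the network round trip
    data = {1: ["a", "b"], 2: ["c"]}
    return data.get(page, [])

async def stream_items():
    """Yield items one at a time, fetching the next page only when needed."""
    page = 1
    while True:
        items = await fetch_page(page)
        if not items:
            return
        for item in items:
            yield item
        page += 1

async def main():
    seen = []
    async for item in stream_items():
        seen.append(item)             # each item is available as soon as it arrives
    return seen

seen = asyncio.run(main())
print(seen)  # ['a', 'b', 'c']
```

The consumer never holds more than one page in memory, and the loop is free to run other tasks during every page fetch.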
Async bugs have their own flavor. A program that runs but is slow often has a hidden blocking call holding the loop; enable asyncio's debug mode (asyncio.run(main(), debug=True) or the PYTHONASYNCIODEBUG=1 environment variable) to log any step that holds the loop longer than the slow-callback threshold, 100 ms by default. "Coroutine was never awaited" warnings mean you called an async function without awaiting or scheduling it, so it never ran. Forgotten await keywords are the most common mistake, producing code that silently does nothing. And unhandled exceptions in fire-and-forget tasks disappear unless you await the task or attach a callback. Knowing these signatures turns baffling async behavior into recognizable, fixable patterns.
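For fire-and-forget tasks, one possible pattern is to keep a reference to each task and attach a done callback that records any failure. The names `background_tasks`, `failures`, `report_failure`, and `flaky` are hypothetical; a real service would log instead of appending to a list.

```python
import asyncio

background_tasks = set()
failures = []

def report_failure(task):
    """Done callback: surface the exception instead of letting it vanish."""
    background_tasks.discard(task)
    if not task.cancelled() and task.exception() is not None:
        failures.append(repr(task.exception()))

async def flaky():
    await asyncio.sleep(0)
    raise RuntimeError("boom")

async def main():
    task = asyncio.create_task(flaky())
    background_tasks.add(task)             # keep a strong reference to the task
    task.add_done_callback(report_failure)
    await asyncio.sleep(0.01)              # other work happens here

asyncio.run(main())
print(failures)  # ["RuntimeError('boom')"]
```

Holding the task in a set matters because the event loop keeps only weak references to tasks, so an unreferenced background task can be garbage-collected before it finishes.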
Async is a tool with a specific purpose, not a universal upgrade. Use it when you have many concurrent I/O-bound operations — numerous outbound calls, many simultaneous connections, streaming. Skip it for simple request-response views, CPU-bound work, and code where the added complexity buys no concurrency. A synchronous Django app serving database-backed pages is perfectly fast and far simpler than an async rewrite that gains nothing. The mark of good judgment here is reaching for async precisely where overlapping waits matters, and leaving the rest synchronous. The complexity of async is only worth paying when the concurrency it buys is real.
A distinction that trips people up: awaiting a coroutine directly runs it to completion before moving on, while wrapping it in a task with asyncio.create_task schedules it to run concurrently with other code. If you want several things happening at once, you need tasks (or gather, which creates them for you), not a sequence of direct awaits. A common bug is expecting concurrency from code that simply awaits coroutines one after another — it runs sequentially, no faster than synchronous code. Understanding that concurrency comes from scheduling multiple tasks on the loop, not from the async keyword alone, is essential to actually getting the speedup async promises.
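The two styles can be compared by logging when each step starts and ends. This is a small sketch with a hypothetical `step` coroutine that sleeps briefly in place of real I/O.

```python
import asyncio

async def step(name, log):
    log.append(f"{name} start")
    await asyncio.sleep(0.01)
    log.append(f"{name} end")

async def one_after_another():
    log = []
    await step("a", log)   # runs to completion before the next line
    await step("b", log)
    return log

async def as_tasks():
    log = []
    task_a = asyncio.create_task(step("a", log))  # scheduled, not yet awaited
    task_b = asyncio.create_task(step("b", log))
    await task_a
    await task_b
    return log

sequential_log = asyncio.run(one_after_another())
task_log = asyncio.run(as_tasks())
print(sequential_log)
print(task_log)
```

With direct awaits, "b" does not start until "a" has ended. With tasks, both start before either ends, which is the overlap that produces the speedup.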
Async code can be cancelled, and handling it well matters for robustness. A task can be cancelled, for example by a timeout or by a parent shutting down, which raises asyncio.CancelledError inside it at the next await point; your code should clean up resources and then let the exception propagate. asyncio.timeout (Python 3.11+) or asyncio.wait_for bounds how long an operation may take, cancelling it if it overruns, which is essential for not hanging forever on a slow external call. Treating cancellation as a normal part of an async task's lifecycle, rather than an edge case, is what makes long-running async services shut down cleanly and recover from stuck operations.
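A timeout and the matching cleanup can be sketched as follows. The example uses `asyncio.wait_for` because it is available on older Python versions as well; `slow_operation` and the `state` dictionary are hypothetical.

```python
import asyncio

async def slow_operation(state):
    try:
        await asyncio.sleep(10)        # stands in for a stuck external call
        return "finished"
    except asyncio.CancelledError:
        state["cleaned_up"] = True     # release resources here
        raise                          # re-raise so the cancellation propagates

async def main():
    state = {"cleaned_up": False}
    try:
        result = await asyncio.wait_for(slow_operation(state), timeout=0.05)
    except asyncio.TimeoutError:
        result = "timed out"
    return result, state

result, state = asyncio.run(main())
print(result, state)  # timed out {'cleaned_up': True}
```

Re-raising `CancelledError` after cleanup is the important detail: swallowing it would leave the caller waiting on a task that refuses to stop.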
Resources that must be opened and closed — connections, sessions, locks — use async context managers with async with, ensuring cleanup happens even when an await inside raises. This is how you manage an HTTP session, a database connection, or a pool correctly in async code, guaranteeing the resource is released on every path. Forgetting to use proper async resource management leads to leaked connections and exhausted pools, a common source of async production issues. The async with construct is the async counterpart to the ordinary context manager, and using it consistently for every acquirable resource is what keeps an async application from slowly leaking the very resources it depends on.
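As an illustration, an async context manager can be written with `contextlib.asynccontextmanager`. The `connection` function below is a hypothetical resource that only records events; a real one would open and close a socket or pool.

```python
import asyncio
from contextlib import asynccontextmanager

events = []

@asynccontextmanager
async def connection(name):
    events.append(f"open {name}")
    await asyncio.sleep(0)            # stands in for an async handshake
    try:
        yield name
    finally:
        await asyncio.sleep(0)        # stands in for an async close
        events.append(f"close {name}")

async def main():
    try:
        async with connection("db") as conn:
            events.append(f"use {conn}")
            raise RuntimeError("query failed")
    except RuntimeError:
        events.append("error handled")

asyncio.run(main())
print(events)
# ['open db', 'use db', 'close db', 'error handled']
```

The close step runs before the exception reaches the handler, so the resource is released even on the failure path.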
Real applications are rarely all-async, and the boundaries between async and synchronous code need care. Calling blocking synchronous code from async without offloading it to a thread blocks the loop; calling async code from sync requires running it on an event loop properly. Tools exist on both sides — thread executors for running sync work from async, and helpers for running async from sync — but the key is to be deliberate about every crossing. Many async bugs live exactly at these boundaries, where a synchronous call sneaks into an async path and silently destroys the concurrency. Mapping where your code crosses between the two worlds, and handling each crossing correctly, keeps the async parts actually asynchronous.
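Both crossings can be shown in a few lines. In this sketch `legacy_lookup` is a hypothetical blocking function; the async side offloads it to the default thread pool, and the sync side enters the async world through `asyncio.run`.

```python
import asyncio
import time

def legacy_lookup(key):
    """Hypothetical blocking function, e.g. a synchronous SDK call."""
    time.sleep(0.01)
    return key.upper()

async def lookup_async(key):
    # async -> sync: run the blocking call in the default thread pool
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, legacy_lookup, key)

async def lookup_many(keys):
    return await asyncio.gather(*(lookup_async(k) for k in keys))

def lookup_many_sync(keys):
    # sync -> async: start an event loop, run to completion, close it
    return asyncio.run(lookup_many(keys))

found = lookup_many_sync(["a", "b", "c"])
print(found)  # ['A', 'B', 'C']
```

Note that `asyncio.run` raises an error if a loop is already running in the current thread, so this entry point belongs at the outer synchronous edge of a program, not inside a coroutine.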
Async Python is one thread cooperatively switching between tasks at every await, which makes I/O-bound work concurrent without threads — and does nothing for CPU-bound work, which belongs in processes. The event loop runs tasks until they wait, then runs others, so the cardinal rule is to never block it: use async libraries and push blocking calls to a thread pool. Get concurrency from gather or TaskGroup, bound it with a semaphore, handle errors deliberately so failures in one task do not vanish or sink the rest, and stream with async generators. In Django, adopt async only for genuinely I/O-concurrent views and wrap every ORM call across the sync boundary. Used where overlapping waits actually matters, async turns minutes of sequential I/O into seconds; used everywhere by reflex, it just adds complexity. The skill is knowing which situation you are in.