Rate Limiting (Flask-Limiter)
Rate limiting caps how many requests a single client may make in a time window, rejecting the excess with a 429 Too Many Requests response to protect your API from abuse.
Part of the free Flask course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.
In this lesson you'll build a fixed-window limiter and a token bucket from scratch, return proper 429 responses, key limits on IP or user, and see how Flask-Limiter does it declaratively.
A public API is a target. A buggy client, a scraper, or an attacker can hammer an endpoint thousands of times a second and exhaust your database or bandwidth. Rate limiting caps how many requests one client may make in a window and rejects the rest.
The simplest strategy is a fixed window: allow N requests per window, then reset the counter when the window expires. When a client goes over, you return HTTP 429 Too Many Requests, the standard "slow down" status.
The runnable example below allows 3 requests, then returns 429 for the 4th and 5th. The limit is keyed on the client's IP, so different clients get independent allowances.
Requests 1-3 return 200 with pong; requests 4 and 5 return 429 with an error. The window holds the line.
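The behaviour described above can be sketched in plain Python. This is a minimal illustration, not the lesson's original example: the FixedWindowLimiter class, the handle_ping function (a stand-in for a Flask view), the 60-second window, and the injectable now argument are all assumptions made so the sketch runs without a server or real waiting.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each `window`-second window."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.counters = {}  # key -> (window_start, count)

    def allow(self, key, now=None):
        # `now` is injectable so the sketch can be tested without sleeping.
        now = time.monotonic() if now is None else now
        start, count = self.counters.get(key, (now, 0))
        if now - start >= self.window:
            # The window has expired: start a fresh one with a zero count.
            start, count = now, 0
        if count >= self.limit:
            self.counters[key] = (start, count)
            return False
        self.counters[key] = (start, count + 1)
        return True


limiter = FixedWindowLimiter(limit=3, window=60)


def handle_ping(client_ip, now=None):
    """Hypothetical stand-in for a Flask view: returns (body, status)."""
    if limiter.allow(client_ip, now):
        return {"message": "pong"}, 200
    return {"error": "rate limit exceeded"}, 429


# Five requests from one IP inside the same window.
statuses = [handle_ping("203.0.113.5", now=0.0)[1] for _ in range(5)]
print(statuses)  # [200, 200, 200, 429, 429]

# A different IP has its own independent counter.
other = handle_ping("198.51.100.9", now=0.0)[1]
print(other)  # 200
```

In this sketch the window starts at a client's first request rather than on a clock boundary; either choice is still a fixed window.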
A fixed window has a weakness: a client can fire its whole quota at the very end of one window and again at the start of the next, sending up to 2N requests in a short span. A token bucket bounds this. The bucket holds up to capacity tokens, refills at a steady rate, and each request must take one token, so a burst can never exceed capacity and sustained traffic is held to the refill rate.
The runnable class below starts with 2 tokens and refills 1 per second. The first two requests succeed, the next two fail (the bucket is empty), and after waiting just over a second one token has refilled, so the next request succeeds again.
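A token bucket with those numbers might look like the sketch below. The TokenBucket class name and the injectable now argument are assumptions for illustration; passing explicit timestamps stands in for actually waiting a second.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens and refills `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, now=None):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # start full
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=2, refill_rate=1.0, now=0.0)

# Four back-to-back requests at t=0: two tokens, then empty.
results = [bucket.allow(now=0.0) for _ in range(4)]
print(results)  # [True, True, False, False]

# Just over a second later, one token has refilled.
after_wait = bucket.allow(now=1.1)
print(after_wait)  # True
```

Refilling lazily on each call, instead of running a background timer, keeps the bucket to two numbers per client.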
The key decides who a limit applies to. Keying on IP protects anonymous endpoints, but for an authenticated API you usually key on the user or API key so each account gets a fair, independent allowance. A well-behaved limiter also sets a Retry-After header telling the client how long to wait.
The runnable example keys on an X-User-Id header when present. alice uses up her allowance and gets a 429 with Retry-After, while bob, a different key, is still allowed.
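One way to sketch per-key limits with a Retry-After header in plain Python follows. The rate_limit_key and handle functions, the allowance of 2 requests per 60 seconds, and the explicit now timestamps are hypothetical choices for illustration; in a real Flask view the headers and address would come from request.headers and request.remote_addr.

```python
import math

LIMIT = 2     # assumed allowance per key
WINDOW = 60   # assumed window length in seconds
counters = {}  # key -> (window_start, count)


def rate_limit_key(headers, remote_addr):
    """Prefer the authenticated user id; fall back to the client IP."""
    user_id = headers.get("X-User-Id")
    return f"user:{user_id}" if user_id else f"ip:{remote_addr}"


def handle(headers, remote_addr, now):
    """Hypothetical stand-in for a view: returns (status, headers, body)."""
    key = rate_limit_key(headers, remote_addr)
    start, count = counters.get(key, (now, 0))
    if now - start >= WINDOW:
        start, count = now, 0
    if count >= LIMIT:
        # Tell the client how many seconds remain until the window resets.
        retry_after = math.ceil(start + WINDOW - now)
        return 429, {"Retry-After": str(retry_after)}, {"error": "rate limit exceeded"}
    counters[key] = (start, count + 1)
    return 200, {}, {"message": "ok"}


alice = {"X-User-Id": "alice"}
bob = {"X-User-Id": "bob"}

r1 = handle(alice, "203.0.113.5", now=0)
r2 = handle(alice, "203.0.113.5", now=10)
r3 = handle(alice, "203.0.113.5", now=20)  # alice is over her allowance
r4 = handle(bob, "203.0.113.5", now=20)    # same IP, different key

print(r3[0], r3[1])  # 429 {'Retry-After': '40'}
print(r4[0])         # 200
```

Note that bob shares alice's IP here yet is unaffected, because the key is the user id rather than the address.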
In Flask-Limiter you swap the key_func to limit per user instead of per IP: for example, lambda: current_user.id, where current_user comes from a login extension such as Flask-Login.
Complete the limiter below. Replace each ___ so the route allows 2 requests per IP and returns the right status when exceeded.
An in-memory dict (or Flask-Limiter's default memory storage) is per-process, so each worker keeps its own count and 4 workers can allow up to 4× the limit. Use a shared storage_uri="redis://..." so every worker counts against one store.
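As a configuration sketch, pointing Flask-Limiter at a shared store could look like this. The address redis://localhost:6379 is an assumed local Redis instance used purely for illustration, and the redis Python package must be installed for this storage backend to load.

```python
from flask import Flask
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

app = Flask(__name__)

limiter = Limiter(
    key_func=get_remote_address,
    app=app,
    # Assumed address of a Redis server reachable by every worker process;
    # replace with your own deployment's URI.
    storage_uri="redis://localhost:6379",
)
```

Because every worker reads and increments the same counters, the configured limit applies to the application as a whole rather than to each process.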
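Behind a proxy, request.remote_addr is the proxy's IP, so all users share one bucket. Configure ProxyFix / trusted proxies so the real client IP (from X-Forwarded-For) is used as the key. Trust only as many proxy hops as you actually run, because clients can set X-Forwarded-For themselves.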
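A minimal sketch of that configuration uses Werkzeug's ProxyFix middleware. It assumes exactly one trusted proxy sits in front of the app; the /whoami route is hypothetical and exists only to show which address Flask sees.

```python
from flask import Flask, request
from werkzeug.middleware.proxy_fix import ProxyFix

app = Flask(__name__)

# Trust one proxy hop: take the client address from the last entry
# of X-Forwarded-For, which that proxy appended.
app.wsgi_app = ProxyFix(app.wsgi_app, x_for=1)


@app.route("/whoami")
def whoami():
    # After ProxyFix, remote_addr reflects the forwarded client IP,
    # so an IP-keyed limiter buckets real clients separately.
    return {"ip": request.remote_addr}
```

Setting x_for higher than the number of proxies you control would let a client spoof its address, and with it its rate-limit key.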
Build one limit helper and apply different limits to two routes.
Lesson complete — your API can defend itself!
You built a fixed-window limiter and a token bucket, returned 429 with Retry-After, keyed limits on IP and user, and saw the declarative Flask-Limiter equivalent.
🚀 Up next: Background Tasks — move slow work off the request thread with Celery and RQ.
Practice quiz
Which HTTP status means the client has sent too many requests?
- 429 Too Many Requests
- 403 Forbidden
- 401 Unauthorized
- 404 Not Found
Answer: 429 Too Many Requests. 429 Too Many Requests is the standard rate-limit rejection status.
Which response header tells the client how long to wait before retrying?
- Cache-Control
- Retry-After
- X-Rate-Remaining
- Location
Answer: Retry-After. Retry-After tells the client how long to wait before sending another request, either as a number of seconds or as an HTTP date.
What weakness does a fixed-window limiter have?
- It needs Redis
- It cannot return 429
- Bursts can sneak through at window edges
- It blocks all requests
Answer: Bursts can sneak through at window edges. A client can spend its quota at the end of one window and again at the start of the next.
How does a token bucket smooth bursts?
- It blocks every other request
- It resets nightly
- It caches responses
- Tokens refill at a steady rate and each request takes one
Answer: Tokens refill at a steady rate and each request takes one. A request passes only if a token is available, so bursts are capped at the bucket's capacity and sustained traffic is held to the refill rate.
Which Flask-Limiter decorator applies a per-route limit?
- @limiter.limit("5/minute")
- @app.rate(5)
- @limit_route(5)
- @throttle(5)
Answer: @limiter.limit("5/minute"). @limiter.limit("5/minute") declares the allowance for that route.
What does key_func=get_remote_address do in Flask-Limiter?
- Caches the response
- Keys limits on the client IP address
- Sets the storage backend
- Returns the 429 body
Answer: Keys limits on the client IP address. get_remote_address makes the limit count per client IP.
Why use Redis (storage_uri) instead of in-memory counts?
- It is required by Flask
- It encrypts requests
- So all worker processes share one count
- To speed up the database
Answer: So all worker processes share one count. In-memory counts are per-process, so multiple workers would each allow the full limit.
For an authenticated API, what is usually the best key for limits?
- The request path
- The User-Agent
- A random value
- The user or API key
Answer: The user or API key. Keying on the user or API key gives each account its own fair allowance.
Why might every user appear rate-limited as one client behind a proxy?
- request.remote_addr is the proxy's IP
- Redis is offline
- The limit is too high
- 429 is disabled
Answer: request.remote_addr is the proxy's IP. Behind a proxy remote_addr is the proxy IP unless ProxyFix / X-Forwarded-For is configured.
Which Flask-Limiter decorator removes limiting from a route?
- @limiter.skip
- @limiter.exempt
- @limiter.off
- @limiter.free
Answer: @limiter.exempt. @limiter.exempt exempts a route from rate limiting.