Checkpoint: Production Backend
This checkpoint pulls the advanced track together by having you build one small service that rate-limits requests and serves reads from a cache — the two layers that keep real production backends fast and resilient under load.
Part of the free Node.js course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.
First you'll recap the whole section, then take on a multi-step build challenge with a starter and a full solution, and finally test yourself with a short checkpoint quiz before moving on to the capstone.
What You've Learned So Far
🔍 Warm-Up: the Two Layers, Running
Before the full build, here are the two ideas at the heart of the service — a token-bucket rate limiter and a cache-aside lookup — as a small runnable program with real output. Notice the database is touched only once even across multiple reads:
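A minimal sketch of those two ideas, reconstructed from the descriptions in this lesson rather than taken from the original listing. The names buckets, readDb and getUser, and the limit of 3, are assumptions for illustration. Following the lesson's simplified model, the bucket here never refills; a production token bucket would add tokens back over time.

```javascript
// Simplified token bucket: a new client starts with `limit` tokens (assumed value).
const limit = 3;
const buckets = new Map();

function allow(clientId) {
  if (!buckets.has(clientId)) buckets.set(clientId, limit);
  const tokens = buckets.get(clientId);
  if (tokens === 0) return false;      // bucket empty: reject
  buckets.set(clientId, tokens - 1);   // otherwise spend one token
  return true;
}

// Cache-aside: look in the cache first, fall back to the database on a miss.
const cache = new Map();
let dbHits = 0;

function readDb(id) {
  dbHits += 1;                         // stands in for a slow query
  return { id, name: `user-${id}` };
}

function getUser(id) {
  if (cache.has(id)) return cache.get(id); // hit: no database work
  const user = readDb(id);                 // miss: read once...
  cache.set(id, user);                     // ...and remember the result
  return user;
}

const decisions = [1, 2, 3, 4].map(() => allow("client-a"));
console.log(decisions); // [ true, true, true, false ]

getUser(42);
getUser(42);
getUser(42);
console.log(dbHits); // 1
```

Three reads of the same id cost one database call, and the fourth request from the same client is refused because its three tokens are spent.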
🛠️ Build Challenge: A Cached, Rate-Limited Service
Here's a complete, working implementation. The comment at the bottom shows the exact output it prints — one database read, two cache hits, one rate-limited rejection, and a single total DB hit:
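One way the service could look, sketched from the function names the quiz refers to (allow, readCache, slowDb, handle, dbHits, ttlMs). The values of limit and ttlMs, the 20 ms delay and the shape of the returned objects are assumptions, not the course's original solution.

```javascript
const limit = 3;      // requests allowed per client (assumed value)
const ttlMs = 5000;   // cache lifetime in milliseconds (assumed value)
const tokens = new Map();
const cache = new Map();
let dbHits = 0;

// A new client gets `limit` tokens; each allowed request spends one.
function allow(clientId) {
  if (!tokens.has(clientId)) tokens.set(clientId, limit);
  const left = tokens.get(clientId);
  if (left === 0) return false;
  tokens.set(clientId, left - 1);
  return true;
}

// Returns the cached user, or null when the entry is missing or expired.
function readCache(id) {
  const entry = cache.get(id);
  if (!entry) return null;
  if (Date.now() > entry.expiresAt) {
    cache.delete(id);
    return null;
  }
  return entry.user;
}

// Simulated slow database read.
function slowDb(id) {
  dbHits += 1;
  return new Promise((resolve) => {
    setTimeout(() => resolve({ id, name: `user-${id}` }), 20);
  });
}

async function handle(clientId, id) {
  // 1. Rate limit first, so a blocked client costs nothing downstream.
  if (!allow(clientId)) return { status: 429, body: "Too Many Requests" };
  // 2. Cache-aside lookup.
  const cached = readCache(id);
  if (cached) return { status: 200, source: "cache", user: cached };
  // 3. Miss: read the database and store the result with an expiry.
  const user = await slowDb(id);
  cache.set(id, { user, expiresAt: Date.now() + ttlMs });
  return { status: 200, source: "db", user };
}

async function main() {
  const results = [];
  for (let i = 0; i < 4; i += 1) {
    results.push(await handle("client-a", 7));
  }
  for (const r of results) console.log(r.status, r.source ?? "rate-limited");
  console.log("dbHits:", dbHits);
  return results;
}

const done = main();
// 200 db
// 200 cache
// 200 cache
// 429 rate-limited
// dbHits: 1
```

The order of the three steps in handle is the design choice that matters: the limiter runs before any cache or database access.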
📝 Checkpoint Quiz
Test yourself. Think through each answer first, then read the answer to check.
Why must rate limiting run before the cache lookup?
So that an over-limit client is rejected with a 429 before the service does any work. Checking the cache (or database) first would let blocked clients still consume resources, defeating the purpose of throttling.
Why is the database hit only once across repeated reads of the same id?
Cache-aside: the first read for an id misses and hits the database, later reads of the same id come from the cache, and the over-limit request never reaches the database at all.
In production, what replaces the in-memory cache and the token bucket?
Redis (via ioredis) for the shared cache with TTLs, and express-rate-limit (backed by Redis) for the limiter, so both are consistent across all instances.
How would you run this service across all CPU cores?
With the cluster module (one worker per core) or, more practically, pm2 start app.js -i max. Keep shared state (cache, limits) in Redis so workers agree.
What does a TTL give you, and what else should invalidate a key?
A TTL bounds how long stale data can survive and caps memory. On a write/update you should also explicitly invalidate (delete) the key so the next read repopulates fresh data.
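The TTL-plus-invalidation rule can be sketched with two in-memory Maps. Here db stands in for a real database, and getUser, updateUser and the 60-second TTL are hypothetical names and values chosen for the example.

```javascript
const ttlMs = 60_000;                               // assumed cache lifetime
const cache = new Map();
const db = new Map([[1, { id: 1, name: "Ada" }]]);  // stand-in for a real database

// Cache-aside read: serve a live entry, otherwise reload and cache with an expiry.
function getUser(id, now = Date.now()) {
  const entry = cache.get(id);
  if (entry && now <= entry.expiresAt) return entry.user;
  const user = db.get(id);
  cache.set(id, { user, expiresAt: now + ttlMs });
  return user;
}

// Write path: update the source of truth, then delete the cached copy.
function updateUser(id, changes) {
  const updated = { ...db.get(id), ...changes };
  db.set(id, updated);
  cache.delete(id);   // explicit invalidation: the next read repopulates
  return updated;
}

console.log(getUser(1).name); // Ada
updateUser(1, { name: "Grace" });
console.log(getUser(1).name); // Grace
```

Without the cache.delete call, the second read would keep returning the old record until the TTL ran out.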
Which tools help you measure and profile the service?
console.time for a quick block measurement, node --prof or clinic.js to profile CPU, and heap snapshots in chrome://inspect (watching process.memoryUsage().heapUsed grow) for leaks.
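The two quickest of these measurements need no extra tooling. The sketch below times a block with console.time and samples process.memoryUsage().heapUsed around an allocation; busyWork and retained are hypothetical names, and the printed numbers vary by machine and run.

```javascript
// A CPU-bound function to measure (hypothetical workload).
function busyWork(n) {
  let total = 0;
  for (let i = 0; i < n; i += 1) total += Math.sqrt(i);
  return total;
}

console.time("busyWork");
const result = busyWork(1_000_000);
console.timeEnd("busyWork"); // prints the label and elapsed time in ms

// Sample the heap before and after holding on to many objects.
const before = process.memoryUsage().heapUsed;
const retained = [];
for (let i = 0; i < 100_000; i += 1) retained.push({ i });
const after = process.memoryUsage().heapUsed;

const deltaMb = (after - before) / 1024 / 1024;
console.log(`heapUsed changed by about ${deltaMb.toFixed(1)} MB`);
```

For a fuller CPU picture, node --prof app.js writes an isolate log that node --prof-process can summarize. A heapUsed figure that keeps climbing across repeated samples is the cue to take heap snapshots.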
Practice quiz
In this service, why must rate limiting run BEFORE the cache lookup?
- Because the cache is slower than the limiter
- Because the cache cannot run without a token
- So an over-limit client gets a 429 before doing any work, never touching the cache or DB
- It does not matter which order they run in
Answer: So an over-limit client gets a 429 before doing any work, never touching the cache or DB. Putting the limiter first means a blocked request returns immediately and never exercises the cache or database, which is the whole point of throttling.
Which status code does the handler return when a client is over its limit?
- 429
- 200
- 401
- 503
Answer: 429. 429 Too Many Requests is the standard code for a client that has exceeded its rate limit.
In the demo, three reads of the same id happen but dbHits is only 1. Why?
- The database deduplicates queries
- The rate limiter blocks the extra DB calls
- slowDb() only runs once per process by design
- Cache-aside: the first read misses and hits the DB; later reads of that id come from the cache
Answer: Cache-aside: the first read misses and hits the DB; later reads of that id come from the cache. The first request for an id is a cache miss that calls slowDb (dbHits becomes 1); subsequent reads of the same id are served from the cache.
How does the token-bucket allow() decide to reject a client?
- It checks the system clock against a window
- It gives a new client limit tokens, returns false when the count hits zero, else spends one
- It compares the client IP against a blocklist
- It rejects every other request
Answer: It gives a new client limit tokens, returns false when the count hits zero, else spends one. A new client starts with limit tokens; each allowed request spends one, and once the bucket is empty allow() returns false.
What does readCache(id) return when the entry exists but is past its expiry time?
- null
- The stale cached user
- undefined
- It throws an error
Answer: null. readCache compares Date.now() to expiresAt and returns null for an expired entry, forcing a fresh slowDb read.
In production, which libraries replace the in-memory cache Map and the token bucket?
- localStorage and setInterval
- A plain JSON file and a for-loop
- Redis (e.g. via ioredis) for the cache, and express-rate-limit (backed by Redis) for the limiter
- MongoDB for both
Answer: Redis (e.g. via ioredis) for the cache, and express-rate-limit (backed by Redis) for the limiter. Redis gives a shared cache with TTLs and a shared rate-limit store, so both stay consistent across every instance.
How would you run this service across all CPU cores?
- Call setMaxListeners(0)
- Use the cluster module or pm2 start app.js -i max, keeping shared state in Redis
- Increase the Node heap size
- Run multiple copies on different ports manually
Answer: Use the cluster module or pm2 start app.js -i max, keeping shared state in Redis. cluster (one worker per core) or PM2 cluster mode scales across cores; shared cache and limit state must live in Redis so workers agree.
Cache entries carry a TTL. Besides expiry, what else should invalidate a key?
- Nothing — TTL is enough
- Restarting the whole process
- A second read of the same key
- An explicit delete on write/update so the next read repopulates fresh data
Answer: An explicit delete on write/update so the next read repopulates fresh data. A TTL bounds staleness and caps memory, but on a write you should also delete (invalidate) the key so the next read fetches current data.
Which tool would you reach for to profile a CPU-bound slow endpoint here?
- npm audit
- node --prof or clinic.js
- eslint
- nodemon
Answer: node --prof or clinic.js. node --prof and clinic.js profile CPU usage to locate hot paths; heap snapshots in chrome://inspect are for tracking memory leaks.
On a successful database read, what does handle() do before returning the user?
- Nothing — it returns immediately
- It refills the token bucket
- It stores the result in the cache with an expiresAt of Date.now() + ttlMs
- It logs the user to Redis
Answer: It stores the result in the cache with an expiresAt of Date.now() + ttlMs. After a cache miss, handle stores the fresh user with an expiry so subsequent reads within the TTL are served from the cache.