Checkpoint: Production Backend

This checkpoint pulls the advanced track together by having you build one small service that rate-limits requests and serves reads from a cache — the two layers that keep real production backends fast and resilient under load.


Part of the free Node.js course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

First you'll recap the whole section, then take on a multi-step build challenge with a starter and a full solution, and finally test yourself with a short checkpoint quiz before moving on to the capstone.

What You've Learned So Far

🔍 Warm-Up: the Two Layers, Running

Before the full build, here are the two ideas at the heart of the service — a token-bucket rate limiter and a cache-aside lookup — as a small runnable program with real output. Notice the database is touched only once even across multiple reads:
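A minimal sketch of those two ideas, assuming a limit of 3 tokens per client and a made-up user shape. The limiter never refills and the cache has no expiry, which keeps the warm-up short; `getUser` and `decisions` are illustrative names.

```javascript
// Warm-up sketch: a per-client token bucket plus a cache-aside lookup.
// LIMIT and the user shape are illustrative assumptions.
const LIMIT = 3;
const buckets = new Map(); // clientId -> tokens left
const cache = new Map();   // userId -> user
let dbHits = 0;

// A new client starts with LIMIT tokens; each allowed request spends one.
function allow(clientId) {
  const tokens = buckets.has(clientId) ? buckets.get(clientId) : LIMIT;
  if (tokens === 0) return false; // bucket empty: reject
  buckets.set(clientId, tokens - 1);
  return true;
}

// Stand-in for a slow database query; counts how often it is called.
function slowDb(id) {
  dbHits += 1;
  return { id, name: `user-${id}` };
}

// Cache-aside: check the cache, fall back to the database, then remember the result.
function getUser(id) {
  if (cache.has(id)) return cache.get(id);
  const user = slowDb(id);
  cache.set(id, user);
  return user;
}

getUser(1);
getUser(1);
getUser(1);
console.log("dbHits:", dbHits); // dbHits: 1

const decisions = [1, 2, 3, 4].map(() => allow("client-a"));
console.log(decisions); // [ true, true, true, false ]
```

Three reads of the same id cost one database call, and the fourth request from the same client is refused because its bucket is empty.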

🛠️ Build Challenge: A Cached, Rate-Limited Service

Here's a complete, working implementation. The comment at the bottom shows the exact output it prints — one database read, two cache hits, one rate-limited rejection, and a single total DB hit:
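One way such a service could look, sketched with the names the quiz refers to (`allow`, `readCache`, `slowDb`, `handle`, `dbHits`, `ttlMs`). The limit of 3, the 60-second TTL, the client id, and the `{ status, source, user }` response shape are assumptions, and `slowDb` is a synchronous stand-in for a real query.

```javascript
// Sketch of the service: limiter first, then cache-aside with a TTL.
// limit, ttlMs and the response shape are assumptions for illustration.
const limit = 3;
const ttlMs = 60_000;
const buckets = new Map(); // clientId -> tokens left
const cache = new Map();   // userId -> { user, expiresAt }
let dbHits = 0;

// A new client gets `limit` tokens; each allowed request spends one.
function allow(clientId) {
  const tokens = buckets.has(clientId) ? buckets.get(clientId) : limit;
  if (tokens === 0) return false;
  buckets.set(clientId, tokens - 1);
  return true;
}

// Stand-in for a slow database read; counts calls in dbHits.
function slowDb(id) {
  dbHits += 1;
  return { id, name: `user-${id}` };
}

// Returns the cached user, or null when the entry is missing or expired.
function readCache(id) {
  const entry = cache.get(id);
  if (!entry || Date.now() > entry.expiresAt) return null;
  return entry.user;
}

function handle(clientId, id) {
  // 1. Throttle first, so a blocked client does no further work.
  if (!allow(clientId)) return { status: 429, source: "limiter" };
  // 2. Serve from the cache when possible.
  const cached = readCache(id);
  if (cached) return { status: 200, source: "cache", user: cached };
  // 3. Miss: read the database and store the result with an expiry.
  const user = slowDb(id);
  cache.set(id, { user, expiresAt: Date.now() + ttlMs });
  return { status: 200, source: "db", user };
}

const results = [1, 2, 3, 4].map(() => handle("client-a", 42));
for (const r of results) console.log(r.status, r.source);
console.log("dbHits:", dbHits);
// 200 db
// 200 cache
// 200 cache
// 429 limiter
// dbHits: 1
```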

📝 Checkpoint Quiz

Test yourself. Think through each point first, then read the explanation to check.

Rate limiting runs before the cache lookup so that an over-limit client is rejected with a 429 before doing any work. Checking the cache (or database) first would let blocked clients still consume resources, defeating the purpose of throttling.

The database is hit only once because of cache-aside: the first read for an id misses and hits the database; the next reads of the same id come from the cache; and the over-limit request never reaches the database at all.

In production, use Redis (via ioredis) for the shared cache with TTLs, and express-rate-limit (backed by Redis) for the limiter, so both are consistent across all instances.

To use every CPU core, run the service with the cluster module (one worker per core) or, more practically, with pm2 start app.js -i max, and keep shared state (cache, limits) in Redis so workers agree.

A TTL bounds how long stale data can survive and helps cap memory. On a write/update you should also explicitly invalidate (delete) the key so the next read repopulates fresh data.

For performance work, use console.time for a quick block measurement, node --prof or clinic.js to profile CPU, and heap snapshots in chrome://inspect (watching process.memoryUsage().heapUsed grow) for leaks.

Practice quiz

In this service, why must rate limiting run BEFORE the cache lookup?

Because the cache is slower than the limiter
Because the cache cannot run without a token
So an over-limit client gets a 429 before doing any work, never touching the cache or DB
It does not matter which order they run in

Answer: So an over-limit client gets a 429 before doing any work, never touching the cache or DB. Putting the limiter first means a blocked request returns immediately and never exercises the cache or database, which is the whole point of throttling.

Which status code does the handler return when a client is over its limit?

Answer: 429. 429 Too Many Requests is the standard code for a client that has exceeded its rate limit.

In the demo, three reads of the same id happen but dbHits is only 1. Why?

The database deduplicates queries
The rate limiter blocks the extra DB calls
slowDb() only runs once per process by design
Cache-aside: the first read misses and hits the DB; later reads of that id come from the cache

Answer: Cache-aside: the first read misses and hits the DB; later reads of that id come from the cache. The first request for an id is a cache miss that calls slowDb (dbHits becomes 1); subsequent reads of the same id are served from the cache.

How does the token-bucket allow() decide to reject a client?

It checks the system clock against a window
It gives a new client limit tokens, returns false when the count hits zero, else spends one
It compares the client IP against a blocklist
It rejects every other request

Answer: It gives a new client limit tokens, returns false when the count hits zero, else spends one. A new client starts with limit tokens; each allowed request spends one, and once the bucket is empty allow() returns false.

What does readCache(id) return when the entry exists but is past its expiry time?

null
The stale cached user
undefined
It throws an error

Answer: null. readCache compares Date.now() to expiresAt and returns null for an expired entry, forcing a fresh slowDb read.
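The expiry check can be sketched in isolation. The `now` parameter and the `writeCache` helper are assumptions added so the behaviour can be shown without waiting for real time to pass; deleting the stale entry on read is also an assumption, not something the lesson states.

```javascript
// Sketch: expiry check with an injectable clock (the `now` parameter is hypothetical).
const cache = new Map(); // id -> { user, expiresAt }

// Hypothetical helper: store a user with an absolute expiry time.
function writeCache(id, user, ttlMs, now = Date.now()) {
  cache.set(id, { user, expiresAt: now + ttlMs });
}

// Returns the user while the entry is fresh, otherwise null.
function readCache(id, now = Date.now()) {
  const entry = cache.get(id);
  if (!entry) return null;
  if (now > entry.expiresAt) {
    cache.delete(id); // assumption: drop the stale entry so it does not linger
    return null;
  }
  return entry.user;
}

writeCache(7, { id: 7, name: "Ada" }, 1000, 0); // expires at t = 1000 ms
const fresh = readCache(7, 500);    // inside the TTL
const expired = readCache(7, 1500); // past the TTL
console.log(fresh.name, expired); // Ada null
```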

In production, which libraries replace the in-memory cache Map and the token bucket?

localStorage and setInterval
A plain JSON file and a for-loop
Redis (e.g. via ioredis) for the cache, and express-rate-limit (backed by Redis) for the limiter
MongoDB for both

Answer: Redis (e.g. via ioredis) for the cache, and express-rate-limit (backed by Redis) for the limiter. Redis gives a shared cache with TTLs and a shared rate-limit store, so both stay consistent across every instance.

How would you run this service across all CPU cores?

Call setMaxListeners(0)
Use the cluster module or pm2 start app.js -i max, keeping shared state in Redis
Increase the Node heap size
Run multiple copies on different ports manually

Answer: Use the cluster module or pm2 start app.js -i max, keeping shared state in Redis. cluster (one worker per core) or PM2 cluster mode scales across cores; shared cache and limit state must live in Redis so workers agree.

Cache entries carry a TTL. Besides expiry, what else should invalidate a key?

Nothing — TTL is enough
Restarting the whole process
A second read of the same key
An explicit delete on write/update so the next read repopulates fresh data

Answer: An explicit delete on write/update so the next read repopulates fresh data. A TTL bounds staleness and caps memory, but on a write you should also delete (invalidate) the key so the next read fetches current data.
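Invalidation on write can be sketched with two in-memory Maps. The `db` Map, `getUser`, and `updateUser` are hypothetical names, and the TTL is left out to keep the focus on the delete.

```javascript
// Sketch: delete the cache key on update so the next read repopulates it.
const db = new Map([[1, { id: 1, name: "Ada" }]]); // stand-in for a real database
const cache = new Map();

// Cache-aside read.
function getUser(id) {
  if (cache.has(id)) return cache.get(id);
  const user = db.get(id);
  cache.set(id, user);
  return user;
}

// Write path: update the source of truth, then invalidate the cached copy.
function updateUser(id, changes) {
  const updated = { ...db.get(id), ...changes };
  db.set(id, updated);
  cache.delete(id);
  return updated;
}

const before = getUser(1).name;   // populates the cache
updateUser(1, { name: "Grace" }); // invalidates key 1
const after = getUser(1).name;    // misses, then reads the fresh row
console.log(before, after); // Ada Grace
```

Without the `cache.delete(id)` line, the second read would keep returning the old name until the entry expired.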

Which tool would you reach for to profile a CPU-bound slow endpoint here?

npm audit
node --prof or clinic.js
eslint
nodemon

Answer: node --prof or clinic.js. node --prof and clinic.js profile CPU usage to locate hot paths; heap snapshots in chrome://inspect are for tracking memory leaks.
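Before reaching for a full profiler, a quick measurement often narrows things down. The sketch below times a hypothetical CPU-bound function (`hotPath`) with `console.time` and `process.hrtime.bigint()`, then reads `process.memoryUsage().heapUsed`; the workload itself is made up for illustration.

```javascript
// Sketch: quick timing and memory checks for a suspected hot path.
// hotPath is a made-up CPU-bound workload.
function hotPath(n) {
  let total = 0;
  for (let i = 0; i < n; i++) total += Math.sqrt(i);
  return total;
}

// console.time / console.timeEnd print the elapsed time under the given label.
console.time("hotPath");
const result = hotPath(1e6);
console.timeEnd("hotPath");

// process.hrtime.bigint() returns nanoseconds, useful when you need the number itself.
const start = process.hrtime.bigint();
hotPath(1e6);
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

// heapUsed is in bytes; sample it repeatedly over time to spot steady growth.
const { heapUsed } = process.memoryUsage();
console.log(
  `elapsed ${elapsedMs.toFixed(2)} ms, heapUsed ${(heapUsed / 1024 / 1024).toFixed(1)} MiB`
);
```

If the numbers point at CPU time, running the app under `node --prof` and post-processing the generated log with `node --prof-process` gives a per-function breakdown.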

On a successful database read, what does handle() do before returning the user?

Nothing — it returns immediately
It refills the token bucket
It stores the result in the cache with an expiresAt of Date.now() + ttlMs
It logs the user to Redis

Answer: It stores the result in the cache with an expiresAt of Date.now() + ttlMs. After a cache miss, handle stores the fresh user with an expiry so subsequent reads within the TTL are served from the cache.