Checkpoint: Production Backend

This checkpoint pulls the advanced track together by having you build one small service that rate-limits requests and serves reads from a cache — the two layers that keep real production backends fast and resilient under load.

Part of the free Node.js course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

First you'll recap the whole section, then take on a multi-step build challenge with a starter and a full solution, and finally test yourself with a short checkpoint quiz before moving on to the capstone.

What You've Learned So Far

🔍 Warm-Up: the Two Layers, Running

Before the full build, here are the two ideas at the heart of the service — a token-bucket rate limiter and a cache-aside lookup — as a small runnable program with real output. Notice the database is touched only once even across multiple reads:
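A minimal sketch of those two ideas might look like the following. The names createLimiter, getUser and fakeDb are illustrative assumptions, and the bucket is simplified so that it never refills.

```javascript
// Warm-up sketch: a simplified token bucket plus a cache-aside lookup.
// createLimiter, getUser and fakeDb are illustrative names, not a library API.

function createLimiter(limit) {
  const tokens = new Map(); // clientId -> tokens remaining
  return function allow(clientId) {
    if (!tokens.has(clientId)) tokens.set(clientId, limit); // new client gets a full bucket
    const left = tokens.get(clientId);
    if (left === 0) return false; // bucket empty: reject
    tokens.set(clientId, left - 1); // spend one token
    return true;
  };
}

const cache = new Map();
let dbHits = 0;

function fakeDb(id) {
  dbHits += 1; // count every trip to the "database"
  return { id, name: `user-${id}` };
}

function getUser(id) {
  if (cache.has(id)) return cache.get(id); // cache hit
  const user = fakeDb(id); // cache miss: read from the database
  cache.set(id, user); // populate the cache for later reads
  return user;
}

const allow = createLimiter(2);
console.log(allow("alice"), allow("alice"), allow("alice")); // true true false

getUser(1);
getUser(1);
getUser(1);
console.log("dbHits:", dbHits); // dbHits: 1
```

Three reads of the same id cost one database call, and the third request from the same client is refused once its two tokens are spent.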

🛠️ Build Challenge: A Cached, Rate-Limited Service

Here's a complete, working implementation. The comment at the bottom shows the exact output it prints — one database read, two cache hits, one rate-limited rejection, and a single total DB hit:
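One way such a service could be written is sketched below. The function names allow, readCache, slowDb and handle come from the quiz in this lesson; their signatures, the response shape, and the values chosen for limit and ttlMs are assumptions. slowDb is synchronous here only to keep the sketch short, whereas a real database call would be asynchronous.

```javascript
// Sketch of the checkpoint service: limiter first, then cache-aside.
// Signatures and the { status, source, user } response shape are assumptions.

const limit = 3; // requests allowed per client (simplified: the bucket never refills)
const ttlMs = 60_000; // how long a cached entry stays fresh

const buckets = new Map(); // clientId -> tokens left
const cache = new Map(); // id -> { user, expiresAt }
let dbHits = 0;

function allow(clientId) {
  if (!buckets.has(clientId)) buckets.set(clientId, limit);
  const left = buckets.get(clientId);
  if (left === 0) return false; // out of tokens
  buckets.set(clientId, left - 1);
  return true;
}

function readCache(id) {
  const entry = cache.get(id);
  if (!entry) return null; // never cached
  if (Date.now() > entry.expiresAt) {
    cache.delete(id); // expired: drop it and report a miss
    return null;
  }
  return entry.user;
}

// Stand-in for a slow database query.
function slowDb(id) {
  dbHits += 1;
  return { id, name: `user-${id}` };
}

function handle(clientId, id) {
  // 1. Rate limit first, so blocked clients never touch the cache or database.
  if (!allow(clientId)) return { status: 429, source: "limiter" };

  // 2. Cache-aside read.
  const cached = readCache(id);
  if (cached) return { status: 200, source: "cache", user: cached };

  // 3. Miss: read the database, then store the result with an expiry.
  const user = slowDb(id);
  cache.set(id, { user, expiresAt: Date.now() + ttlMs });
  return { status: 200, source: "db", user };
}

const results = [1, 2, 3, 4].map(() => handle("alice", 42));
for (const r of results) console.log(r.status, r.source);
console.log("dbHits:", dbHits);
// 200 db
// 200 cache
// 200 cache
// 429 limiter
// dbHits: 1
```

With a limit of three, the fourth request is rejected before readCache or slowDb is ever called, which is why dbHits stays at 1.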

📝 Checkpoint Quiz

Test yourself. Think through each question first, then check your reasoning against the answers below.

Rate limiting runs before the cache lookup so that an over-limit client is rejected with a 429 before the service does any work. Checking the cache (or database) first would let blocked clients still consume resources, defeating the purpose of throttling.

The database is hit only once because of cache-aside: the first read for an id misses and hits the database, later reads of the same id come from the cache, and the over-limit request never reaches the database at all.

In production, use Redis (via ioredis) for the shared cache with TTLs and express-rate-limit (backed by a Redis store) for the limiter, so both stay consistent across all instances.

To use every CPU core, run the service with the cluster module (one worker per core) or, more practically, pm2 start app.js -i max, and keep shared state (cache, limits) in Redis so workers agree.

A TTL bounds how long stale data can survive and caps memory. On a write/update you should also explicitly invalidate (delete) the key so the next read repopulates fresh data.
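The delete-on-write step can be sketched with an in-memory Map standing in for both the database and the cache. The names db, getUser and updateUser are hypothetical, and in production a shared cache such as Redis would take the place of the Map.

```javascript
// Sketch: TTL plus explicit invalidation on write.
// db, getUser and updateUser are hypothetical names used for illustration.

const ttlMs = 60_000;
const db = new Map([[1, { id: 1, name: "Ada" }]]); // stand-in for a real database
const cache = new Map(); // id -> { user, expiresAt }

function getUser(id) {
  const entry = cache.get(id);
  if (entry && Date.now() <= entry.expiresAt) return entry.user; // fresh hit
  const user = db.get(id); // miss or expired: go to the database
  cache.set(id, { user, expiresAt: Date.now() + ttlMs });
  return user;
}

function updateUser(id, changes) {
  const updated = { ...db.get(id), ...changes };
  db.set(id, updated); // write to the source of truth first
  cache.delete(id); // then invalidate so the next read repopulates
  return updated;
}

console.log(getUser(1).name); // Ada (now cached)
updateUser(1, { name: "Grace" });
console.log(getUser(1).name); // Grace (the stale entry was deleted)
```

Without the cache.delete call, the second read would keep returning the old name until the TTL expired.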

For measurement, use console.time for a quick block timing, node --prof or clinic.js to profile CPU, and heap snapshots in chrome://inspect (while watching whether process.memoryUsage().heapUsed keeps growing) to find leaks.
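As a rough illustration of the quick built-in checks, the sketch below times a block with console.time and compares process.memoryUsage().heapUsed before and after an allocation. busyWork is a hypothetical function, the printed numbers vary from run to run, and a profiler or heap snapshot is still needed to pinpoint a hot path or a leak.

```javascript
// Sketch: quick timing and heap checks using only Node.js built-ins.
// busyWork is a hypothetical CPU-bound function used for illustration.

function busyWork(n) {
  let total = 0;
  for (let i = 0; i < n; i += 1) total += i % 7;
  return total;
}

console.time("busyWork"); // start a labelled timer
const result = busyWork(1_000_000);
console.timeEnd("busyWork"); // prints the elapsed time for that label

const heapBefore = process.memoryUsage().heapUsed;
const retained = Array.from({ length: 100_000 }, (_, i) => ({ i })); // kept alive on purpose
const heapAfter = process.memoryUsage().heapUsed;
console.log("objects retained:", retained.length);
console.log("heap change (bytes):", heapAfter - heapBefore);
```

A heapUsed figure that keeps climbing across many requests, rather than in one snapshot like this, is the signal worth investigating with a heap snapshot.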

Practice quiz

In this service, why must rate limiting run BEFORE the cache lookup?

  • Because the cache is slower than the limiter
  • Because the cache cannot run without a token
  • So an over-limit client gets a 429 before doing any work, never touching the cache or DB
  • It does not matter which order they run in

Answer: So an over-limit client gets a 429 before doing any work, never touching the cache or DB. Putting the limiter first means a blocked request returns immediately and never exercises the cache or database, which is the whole point of throttling.

Which status code does the handler return when a client is over its limit?

  • 429
  • 200
  • 401
  • 503

Answer: 429. The 429 Too Many Requests status is the standard response for a client that has exceeded its rate limit.

In the demo, three reads of the same id happen but dbHits is only 1. Why?

  • The database deduplicates queries
  • The rate limiter blocks the extra DB calls
  • slowDb() only runs once per process by design
  • Cache-aside: the first read misses and hits the DB; later reads of that id come from the cache

Answer: Cache-aside: the first read misses and hits the DB; later reads of that id come from the cache. The first request for an id is a cache miss that calls slowDb (dbHits becomes 1); subsequent reads of the same id are served from the cache.

How does the token-bucket allow() decide to reject a client?

  • It checks the system clock against a window
  • It gives a new client limit tokens, returns false when the count hits zero, else spends one
  • It compares the client IP against a blocklist
  • It rejects every other request

Answer: It gives a new client limit tokens, returns false when the count hits zero, else spends one. A new client starts with limit tokens; each allowed request spends one, and once the bucket is empty allow() returns false.

What does readCache(id) return when the entry exists but is past its expiry time?

  • null
  • The stale cached user
  • undefined
  • It throws an error

Answer: null. readCache compares Date.now() to expiresAt and returns null for an expired entry, forcing a fresh slowDb read.

In production, which libraries replace the in-memory cache Map and the token bucket?

  • localStorage and setInterval
  • A plain JSON file and a for-loop
  • Redis (e.g. via ioredis) for the cache, and express-rate-limit (backed by Redis) for the limiter
  • MongoDB for both

Answer: Redis (e.g. via ioredis) for the cache, and express-rate-limit (backed by Redis) for the limiter. Redis gives a shared cache with TTLs and a shared rate-limit store, so both stay consistent across every instance.

How would you run this service across all CPU cores?

  • Call setMaxListeners(0)
  • Use the cluster module or pm2 start app.js -i max, keeping shared state in Redis
  • Increase the Node heap size
  • Run multiple copies on different ports manually

Answer: Use the cluster module or pm2 start app.js -i max, keeping shared state in Redis. cluster (one worker per core) or PM2 cluster mode scales across cores; shared cache and limit state must live in Redis so workers agree.

Cache entries carry a TTL. Besides expiry, what else should invalidate a key?

  • Nothing — TTL is enough
  • Restarting the whole process
  • A second read of the same key
  • An explicit delete on write/update so the next read repopulates fresh data

Answer: An explicit delete on write/update so the next read repopulates fresh data. A TTL bounds staleness and caps memory, but on a write you should also delete (invalidate) the key so the next read fetches current data.

Which tool would you reach for to profile a CPU-bound slow endpoint here?

  • npm audit
  • node --prof or clinic.js
  • eslint
  • nodemon

Answer: node --prof or clinic.js. node --prof and clinic.js profile CPU usage to locate hot paths; heap snapshots in chrome://inspect are for tracking memory leaks.

On a successful database read, what does handle() do before returning the user?

  • Nothing — it returns immediately
  • It refills the token bucket
  • It stores the result in the cache with an expiresAt of Date.now() + ttlMs
  • It logs the user to Redis

Answer: It stores the result in the cache with an expiresAt of Date.now() + ttlMs. After a cache miss, handle stores the fresh user with an expiry so subsequent reads within the TTL are served from the cache.