Profiling and Benchmarking with pprof

Stop guessing where your program spends its time. You'll write benchmarks with testing.B, measure allocations with -benchmem, capture CPU, heap, and goroutine profiles with runtime/pprof and net/http/pprof, and read them with go tool pprof, plus a peek at runtime/trace.

Part of the free Go course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

What You'll Learn in This Lesson

1️⃣ Writing a benchmark

A benchmark is a function named BenchmarkX that takes *testing.B and loops b.N times. The framework chooses b.N to get a stable measurement. Add -benchmem to also see allocations per operation.
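A minimal sketch of such a benchmark follows. The function under test, joinWords, and the sink variable are hypothetical names chosen for illustration. In a real project BenchmarkJoinWords would live in a file ending in _test.go and run under go test -bench=. -benchmem; here main calls testing.Benchmark so the sketch also runs on its own with go run.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// joinWords is a hypothetical function under test.
func joinWords(words []string) string {
	return strings.Join(words, ",")
}

// sink is package-level so the compiler cannot discard the benchmarked result.
var sink string

// BenchmarkJoinWords calls joinWords b.N times; the framework picks b.N.
func BenchmarkJoinWords(b *testing.B) {
	words := []string{"alpha", "beta", "gamma"}
	for i := 0; i < b.N; i++ {
		sink = joinWords(words)
	}
}

func main() {
	// Stand-alone stand-in for: go test -bench=JoinWords -benchmem
	result := testing.Benchmark(BenchmarkJoinWords)
	// String reports iterations and ns/op; MemString reports B/op and allocs/op.
	fmt.Println(result.String(), result.MemString())
}
```

The exact numbers depend on your machine, so compare runs on the same hardware rather than against published figures.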

2️⃣ Capturing and reading a CPU profile

Add -cpuprofile=cpu.out to your benchmark run, then open the file with go tool pprof. The most useful commands are top (hottest functions), list <func> (per-line cost), and web (a call graph; it needs Graphviz installed).

3️⃣ Profiling from your own code with runtime/pprof

Outside of tests, use runtime/pprof directly: StartCPUProfile / StopCPUProfile for CPU and WriteHeapProfile for a heap snapshot. Both write to an io.Writer, typically a file that you then open with go tool pprof.

4️⃣ Live profiling with net/http/pprof

For a running server, a blank import of net/http/pprof registers handlers under /debug/pprof/. You can then pull a CPU, heap, or goroutine profile from a live process without restarting it.

🎯 Your Turn

Complete the benchmark loop. Fill in the blank marked ___ so it iterates the right number of times.

❌ Hard-coding the loop count instead of using b.N — the result is meaningless.

❌ Counting setup in the timing — slow input prep inflates ns/op.

❌ The compiler optimizes the work away because the result is unused.

✅ Assign the result to a package-level sink variable so the call can't be eliminated; a bare _ = assignment is less reliable.

❌ Forgetting -benchmem — you miss the allocation story behind a slow path.

✅ Run go test -bench=. -benchmem for allocs/op and B/op.

The testing framework chooses b.N. It raises b.N until the run is long enough for a stable ns/op; you never set it yourself.

Start with top to find the hottest functions. Then use list <func> for per-line cost and web for a call graph.

Benchmark appending 10,000 ints to a nil slice versus preallocating with make([]int, 0, 10000), and compare allocs/op under -benchmem.

Practice quiz

What is the correct signature for a Go benchmark function?

  • func BenchmarkX(b *testing.B)
  • func BenchmarkX(t *testing.T)
  • func Benchmark(x int) int
  • func BenchX(b testing.B)

Answer: func BenchmarkX(b *testing.B). Benchmarks are named BenchmarkX and take a single *testing.B parameter; the testing tool runs them with go test -bench.

What does b.N represent inside a benchmark loop?

  • the number of CPUs
  • the number of nanoseconds elapsed
  • the iteration count the framework chooses to get a stable timing
  • a fixed value of 1000

Answer: the iteration count the framework chooses to get a stable timing. The testing package adjusts b.N upward until the benchmark runs long enough for a reliable measurement; you loop from 0 to b.N.

Why call b.ResetTimer()?

  • to stop the benchmark early
  • to discard expensive setup time before the measured loop
  • to reset b.N to zero
  • to enable the race detector

Answer: to discard expensive setup time before the measured loop. ResetTimer zeroes the elapsed time and memory counters so costly setup before the loop is not counted in the result.

Which flag adds memory allocation stats to benchmark output?

  • -cover
  • -v
  • -race
  • -benchmem

Answer: -benchmem. go test -bench=. -benchmem reports allocs/op and B/op alongside ns/op.

Which package writes profiles programmatically without an HTTP server?

  • runtime/pprof
  • net/http/pprof
  • fmt
  • encoding/json

Answer: runtime/pprof. It lets you call pprof.StartCPUProfile or pprof.WriteHeapProfile from your own code to write a profile to a file.

What does importing net/http/pprof do?

  • disables the garbage collector
  • registers profiling handlers under /debug/pprof/ on the default mux
  • starts a CPU profile automatically
  • nothing useful

Answer: registers profiling handlers under /debug/pprof/ on the default mux. The blank import _ "net/http/pprof" registers /debug/pprof/ endpoints on http.DefaultServeMux for live profiling.

Which go test flag writes a CPU profile to a file?

  • -trace
  • -blockprofile
  • -memprofile
  • -cpuprofile

Answer: -cpuprofile. go test -bench=. -cpuprofile=cpu.out writes a CPU profile you can open with go tool pprof.

Which go tool pprof command lists the hottest functions?

  • save
  • build
  • top
  • edit

Answer: top. Inside go tool pprof, top shows the functions consuming the most time or memory; list <func> shows annotated source.

Which profile shows where goroutines are currently blocked or stacked?

  • the CPU profile
  • the goroutine profile
  • the mutex profile only
  • the heap profile

Answer: the goroutine profile. The goroutine profile dumps the stack of every live goroutine, including blocked ones, which is ideal for diagnosing leaks and deadlocks.

What does the runtime/trace package capture that pprof does not?

  • a timeline of scheduling, GC, and goroutine events
  • the source code of functions
  • the values of variables
  • only allocation counts

Answer: a timeline of scheduling, GC, and goroutine events. runtime/trace records a detailed execution timeline (scheduler, GC, syscalls, goroutines) viewed with go tool trace.