Histograms

Matplotlib is a Python library for creating charts and visualizations, and the histogram is the tool you reach for when you want to see the shape of a single numeric variable.

Part of the free Matplotlib course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

In this lesson you'll bucket data into bins, tune the bin count, normalize with density, and overlay two distributions to compare them.

A histogram takes one list of numbers, slices the range into equal-width bins, and counts how many values fall in each bin. Pass your data to plt.hist() and add edgecolor so the bars stay distinct.
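Here is a minimal sketch of that first histogram. The scores are simulated, and the sample size (500), the spread, and the bin count (12) are assumptions chosen for illustration, not values taken from the lesson.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 500 simulated exam scores centered near 70.
rng = np.random.default_rng(seed=0)
scores = rng.normal(loc=70, scale=10, size=500)

# plt.hist() does the counting. It returns the per-bin counts,
# the bin edges, and the drawn bars.
counts, edges, bars = plt.hist(scores, bins=12, edgecolor="black")

plt.title("Exam scores")
plt.xlabel("Score")
plt.ylabel("Number of students")
plt.show()
```

The returned counts and edges are handy for checking the plot: there is always one more edge than there are bins, and the counts add up to the number of values you passed in.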

What you'll see: a bell-shaped cluster of blue bars, tallest near a score of 70 and tapering off toward 40 and 100. Each bar is outlined in black, and the y-axis tells you how many students landed in each score range.

The bins argument controls resolution: fewer bins give a smooth summary, more bins reveal fine detail. Setting density=True rescales the y-axis so the bars represent probability density (total area equals 1) instead of raw counts.
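The sketch below tries both arguments together. The sample is simulated (2,000 draws from a standard normal distribution, an assumption for illustration), and the area check at the end is just one way to confirm what density=True does.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 2000 simulated values centered on 0.
rng = np.random.default_rng(seed=1)
values = rng.normal(loc=0, scale=1, size=2000)

# 40 bins for fine detail; density=True rescales the bar heights.
heights, edges, _ = plt.hist(
    values, bins=40, density=True, color="green", edgecolor="black"
)

# Each bar's area is height * width; with density=True the areas sum to 1.
total_area = np.sum(heights * np.diff(edges))
print(f"Total area under the bars: {total_area:.3f}")

plt.xlabel("Value")
plt.ylabel("Probability density")
plt.show()
```

Note that it is the total area, not the sum of the bar heights, that equals 1, so individual heights can exceed 1 when the bins are very narrow.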

What you'll see: a finer, smoother bell curve with 40 narrow green bars peaking near 0. Because density=True is on, the y-axis now reads as probability density rather than counts, so the values are small decimals.

To compare two groups, draw two histograms on the same axes and set alpha below 1 so the overlap is visible through the transparency. A legend tells the two groups apart.
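A possible version of the overlay is sketched here. Both classes are simulated (300 scores each, an assumed size), and the shared bin edges built with np.linspace are a choice made for this example so the two sets of bars line up.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: two simulated classes with different centers.
rng = np.random.default_rng(seed=2)
class_a = rng.normal(loc=65, scale=8, size=300)
class_b = rng.normal(loc=75, scale=8, size=300)

# One set of 30 bins spanning both groups, so the bars are comparable.
combined = np.concatenate([class_a, class_b])
shared_bins = np.linspace(combined.min(), combined.max(), 31)

# alpha below 1 lets the overlapping region show through.
counts_a, edges_a, _ = plt.hist(
    class_a, bins=shared_bins, alpha=0.5, edgecolor="black", label="Class A"
)
counts_b, edges_b, _ = plt.hist(
    class_b, bins=shared_bins, alpha=0.5, edgecolor="black", label="Class B"
)

plt.xlabel("Score")
plt.ylabel("Number of students")
plt.legend()
plt.show()
```

If each call picked its own bins instead, the bar widths would differ between the groups and the heights would no longer be directly comparable.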

What you'll see: two semi-transparent bell shapes. Class A peaks around 65 and Class B around 75, and where they overlap in the middle the colors blend, making it obvious that Class B generally scored higher.

Replace each ___ to bucket the data into 15 outlined bins.

If only one giant bar appears, you probably passed already-counted data. A histogram needs the raw values; it counts them for you.
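To make the difference concrete, the sketch below plots the same made-up ages two ways. The ages, the decade labels, and the tallies are all hypothetical. Raw values go to hist(), which does the counting; tallies that are already counted go to bar(), which draws them as given.

```python
import matplotlib.pyplot as plt

# Hypothetical raw values: one number per person.
raw_ages = [22, 25, 25, 31, 34, 34, 34, 41, 47, 52]

# The same data, already counted by decade.
labels = ["20s", "30s", "40s", "50s"]
tallies = [3, 4, 2, 1]

fig, (ax_raw, ax_counted) = plt.subplots(1, 2, figsize=(9, 3.5))

# Raw values: hist() bins and counts them (4 bins from 20 to 60).
counts, edges, _ = ax_raw.hist(raw_ages, bins=4, range=(20, 60), edgecolor="black")
ax_raw.set_title("Raw values with hist()")

# Pre-counted values: bar() plots the tallies without counting again.
ax_counted.bar(labels, tallies, edgecolor="black")
ax_counted.set_title("Pre-counted values with bar()")

plt.tight_layout()
plt.show()
```

Passing tallies to hist() would bin the tallies themselves, which describes how often each count occurs rather than how the ages are distributed.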

Add edgecolor="black" so each bin gets an outline, and lower alpha when overlaying.

Plot the distribution of 600 commute times across 25 bins with a title and labels.
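One possible solution is sketched below. The commute times are simulated from a right-skewed gamma distribution, which is an assumption made only to have 600 values to plot; the title and label wording are likewise illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 600 simulated commute times in minutes (right-skewed).
rng = np.random.default_rng(seed=3)
commute_minutes = rng.gamma(shape=4.0, scale=8.0, size=600)

# 25 outlined bins, as the exercise asks.
counts, edges, _ = plt.hist(commute_minutes, bins=25, edgecolor="black")

plt.title("Distribution of commute times")
plt.xlabel("Commute time (minutes)")
plt.ylabel("Number of commuters")
plt.show()
```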

Lesson 8 complete — you can read a distribution's shape!

You built histograms, tuned the bin count, normalized with density, and overlaid two groups to compare them side by side.

🚀 Up next: Scatter Plots — reveal the relationship between two variables, point by point.

Practice quiz

What does a histogram show?

  • The relationship between two variables
  • The distribution of a single numeric variable
  • A proportion as slices
  • Change over exact time

Answer: The distribution of a single numeric variable. A histogram groups one numeric variable into bins to show its distribution.

Which function draws a histogram in pyplot?

  • plt.bar()
  • plt.hist()
  • plt.histogram()
  • plt.dist()

Answer: plt.hist(). plt.hist(data) slices the range into bins and counts values in each.

What does the bins argument control?

  • The bar color
  • The number of buckets the range is sliced into
  • The figure size
  • The transparency

Answer: The number of buckets the range is sliced into. bins sets how many equal-width buckets the data range is divided into.

Why pass raw values rather than pre-counted categories to plt.hist()?

  • It is faster
  • A histogram counts the values for you
  • Counts are required
  • Raw values are ignored

Answer: A histogram counts the values for you. A histogram does the counting itself; pre-counted data yields one giant bar.

What does density=True do?

  • Adds more bins
  • Doubles the counts
  • Normalizes bars so the total area equals 1
  • Removes the edges

Answer: Normalizes bars so the total area equals 1. density=True rescales the y-axis to a probability density with total area 1.

Why add edgecolor='black' to a histogram?

  • To fill the bars
  • To set the title
  • To normalize the data
  • To outline each bar so bins stay distinct

Answer: To outline each bar so bins stay distinct. Without an edge color the bars blend together; edgecolor outlines each bin.

When overlaying two histograms, which argument reveals the overlap?

  • alpha below 1 (transparency)
  • bins=1
  • density=False
  • color='black'

Answer: alpha below 1 (transparency). Setting alpha below 1 makes the overlapping region visible through transparency.

For a fair comparison of two groups, you should...

  • Use different bin edges
  • Use no bins
  • Use the same bin edges for both
  • Plot them on separate figures

Answer: Use the same bin edges for both. Shared bin edges (e.g. built with np.linspace and passed to both plt.hist() calls) make the two distributions directly comparable.

If only one giant bar appears, the likely cause is...

  • Too many bins
  • You passed already-counted data instead of raw values
  • edgecolor was set
  • density=True

Answer: You passed already-counted data instead of raw values. Pre-counted data collapses into one bar; a histogram needs the raw values.

What is a reasonable starting number of bins to try?

  • 1 to 2
  • 500
  • exactly 1000
  • 20 to 30

Answer: 20 to 30. Around 20-30 bins is a common starting point; too few hide detail, too many add noise.