Histograms
Matplotlib is a Python library for creating charts and visualizations, and the histogram is the tool you reach for when you want to see the shape of a single numeric variable.
Part of the free Matplotlib course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.
In this lesson you'll bucket data into bins, tune the bin count, normalize with density, and overlay two distributions to compare them.
A histogram takes one list of numbers, slices its range into equal-width bins, and counts how many values fall in each bin. Pass your data to plt.hist() and add edgecolor so neighboring bars stay visually distinct.
What you'll see: a bell-shaped cluster of blue bars, tallest near a score of 70 and tapering off toward 40 and 100. Each bar is outlined in black, and the y-axis tells you how many students landed in each score range.
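The plot described above can be reproduced with a minimal sketch like the following. The data is an assumption: 500 simulated exam scores drawn from a normal distribution centered near 70, and the variable names and axis labels are illustrative rather than the lesson's own.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 500 simulated exam scores centered near 70.
rng = np.random.default_rng(seed=0)
scores = rng.normal(loc=70, scale=10, size=500)

# plt.hist() bins the raw values itself and returns the per-bin counts,
# the bin edges, and the drawn bar objects. With no bins argument it
# uses Matplotlib's default of 10 bins.
counts, edges, bars = plt.hist(scores, edgecolor="black")

plt.title("Exam Scores")
plt.xlabel("Score")
plt.ylabel("Number of students")
plt.show()
```

The returned counts always add up to the number of values passed in, which is a quick way to confirm that every score landed in a bin.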
The bins argument controls resolution: fewer bins give a smooth summary, more bins reveal fine detail. Setting density=True rescales the y-axis so the bars represent probability density (total area equals 1) instead of raw counts.
What you'll see: a finer, smoother bell curve with 40 narrow green bars peaking near 0. Because density=True is on, the y-axis now reads as probability density rather than counts, so the values are small decimals.
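A sketch of the bins and density settings, assuming 2,000 simulated values from a standard normal distribution (the sample size and seed are illustrative choices). It also checks the "total area equals 1" property by multiplying each bar's height by its width.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 2,000 values from a standard normal distribution.
rng = np.random.default_rng(seed=1)
values = rng.normal(loc=0, scale=1, size=2000)

# bins=40 gives finer resolution; density=True rescales bar heights
# so that the bars' total area is 1.
heights, edges, bars = plt.hist(
    values, bins=40, density=True, color="green", edgecolor="black"
)

# Area of each bar = height * width; the areas should sum to 1.
area = np.sum(heights * np.diff(edges))
print(f"Total area: {area:.3f}")  # Total area: 1.000

plt.title("Normalized Histogram")
plt.xlabel("Value")
plt.ylabel("Probability density")
plt.show()
```

Note that the bar heights are densities, not probabilities: a single bar can be taller than 1 when the bins are narrow, as long as the total area stays at 1.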
To compare two groups, draw two histograms on the same axes and set alpha below 1 so the overlap is visible through the transparency. A legend tells the two groups apart.
What you'll see: two semi-transparent bell shapes. Class A peaks around 65 and Class B around 75, and where they overlap in the middle the colors blend, making it obvious that Class B generally scored higher.
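One way to draw the overlay is sketched below, assuming two simulated classes of 300 students each. Building one set of shared bin edges with np.linspace is a design choice added here so both groups are counted over identical intervals; the class sizes, spreads, and bin count are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: two classes with different average scores.
rng = np.random.default_rng(seed=2)
class_a = rng.normal(loc=65, scale=8, size=300)
class_b = rng.normal(loc=75, scale=8, size=300)

# One set of bin edges spanning both groups (31 edges = 30 bins),
# so the two histograms line up bar for bar.
low = min(class_a.min(), class_b.min())
high = max(class_a.max(), class_b.max())
shared_bins = np.linspace(low, high, 31)

# alpha below 1 makes each layer semi-transparent so the overlap shows.
a_counts, a_edges, _ = plt.hist(
    class_a, bins=shared_bins, alpha=0.5, edgecolor="black", label="Class A"
)
b_counts, b_edges, _ = plt.hist(
    class_b, bins=shared_bins, alpha=0.5, edgecolor="black", label="Class B"
)

plt.title("Class A vs. Class B")
plt.xlabel("Score")
plt.ylabel("Number of students")
plt.legend()
plt.show()
```

If each call picked its own bins instead, the bars of the two groups could have different widths and positions, which makes a visual comparison harder to trust.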
Replace each ___ to bucket the data into 15 outlined bins.
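The exercise's starter code is not reproduced here, so the following is only one possible completed version. The data array and its name are assumptions; the relevant part is the plt.hist() call with bins=15 and an edgecolor.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for the exercise's data.
rng = np.random.default_rng(seed=3)
data = rng.normal(loc=50, scale=12, size=400)

# bins=15 buckets the data into 15 bins; edgecolor outlines each one.
counts, edges, bars = plt.hist(data, bins=15, edgecolor="black")
plt.show()
```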
If the plot looks wrong (for example, one giant bar), you may have passed already-counted data. A histogram needs the raw values; it counts them for you.
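The difference between raw and already-counted data can be sketched as follows. The ages and bin edges are made-up illustration values. Using plt.bar() for data that is already counted is a suggestion added here, not something the lesson itself prescribes.

```python
import matplotlib.pyplot as plt

# Hypothetical raw measurements: one number per person.
raw_ages = [22, 25, 25, 31, 34, 34, 34, 41, 47, 52]
bin_edges = [20, 30, 40, 50, 60]

# Raw values go to plt.hist(), which does the counting.
plt.figure()
counts, edges, _ = plt.hist(raw_ages, bins=bin_edges, edgecolor="black")
plt.title("Raw values: plt.hist() counts them")
print(counts)  # [3. 4. 2. 1.]

# If all you have is the counts per range, they are already the bar
# heights, so a bar chart is the appropriate call instead.
labels = ["20-29", "30-39", "40-49", "50-59"]
plt.figure()
bar_container = plt.bar(labels, counts, edgecolor="black")
plt.title("Pre-counted data: plt.bar() draws the counts")
plt.show()
```

Passing the counts themselves to plt.hist() would bin them as if they were measurements, so the chart would describe the counts rather than the ages.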
Add edgecolor="black" so each bin gets an outline, and lower alpha when overlaying.
Plot the distribution of 600 commute times across 25 bins with a title and labels.
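A possible solution sketch for this challenge is shown below. The commute times are simulated from a gamma distribution, an assumption chosen only because it yields positive, right-skewed values. The title and axis labels are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 600 simulated commute times in minutes.
rng = np.random.default_rng(seed=4)
commute_minutes = rng.gamma(shape=4.0, scale=8.0, size=600)

# 25 outlined bins, as the exercise asks.
counts, edges, bars = plt.hist(commute_minutes, bins=25, edgecolor="black")

plt.title("Commute Times")
plt.xlabel("Minutes")
plt.ylabel("Number of commuters")
plt.show()
```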
Lesson 8 complete — you can read a distribution's shape!
You built histograms, tuned the bin count, normalized with density, and overlaid two groups to compare them side by side.
🚀 Up next: Scatter Plots — reveal the relationship between two variables, point by point.
Practice quiz
What does a histogram show?
- The relationship between two variables
- The distribution of a single numeric variable
- A proportion as slices
- Change over exact time
Answer: The distribution of a single numeric variable. A histogram groups one numeric variable into bins to show its distribution.
Which function draws a histogram in pyplot?
- plt.bar()
- plt.hist()
- plt.histogram()
- plt.dist()
Answer: plt.hist(). plt.hist(data) slices the range into bins and counts values in each.
What does the bins argument control?
- The bar color
- The number of buckets the range is sliced into
- The figure size
- The transparency
Answer: The number of buckets the range is sliced into. When bins is an integer, it sets how many equal-width buckets the data range is divided into; bins can also be a sequence of explicit bin edges.
Why pass raw values rather than pre-counted categories to plt.hist()?
- It is faster
- A histogram counts the values for you
- Counts are required
- Raw values are ignored
Answer: A histogram counts the values for you. A histogram does the counting itself; pre-counted data gets binned as if the counts were measurements, which gives a misleading plot (sometimes a single giant bar).
What does density=True do?
- Adds more bins
- Doubles the counts
- Normalizes bars so the total area equals 1
- Removes the edges
Answer: Normalizes bars so the total area equals 1. density=True rescales the y-axis to a probability density with total area 1.
Why add edgecolor='black' to a histogram?
- To fill the bars
- To set the title
- To normalize the data
- To outline each bar so bins stay distinct
Answer: To outline each bar so bins stay distinct. Without an edge color the bars blend together; edgecolor outlines each bin.
When overlaying two histograms, which argument reveals the overlap?
- alpha below 1 (transparency)
- bins=1
- density=False
- color='black'
Answer: alpha below 1 (transparency). Setting alpha below 1 makes the overlapping region visible through transparency.
For a fair comparison of two groups, you should...
- Use different bin edges
- Use no bins
- Use the same bin edges for both
- Plot them on separate figures
Answer: Use the same bin edges for both. Shared bin edges (e.g. np.linspace) make the two distributions directly comparable.
If only one giant bar appears, the likely cause is...
- Too many bins
- You passed already-counted data instead of raw values
- edgecolor was set
- density=True
Answer: You passed already-counted data instead of raw values. Pre-counted data collapses into one bar; a histogram needs the raw values.
What is a reasonable starting number of bins to try?
- 1 to 2
- 500
- exactly 1000
- 20 to 30
Answer: 20 to 30. Around 20-30 bins is a common starting point; too few hide detail, too many add noise.