Interpolation & Filling Gaps

Interpolation estimates missing values by drawing a line between the known points around them, so a gap between 1 and 4 becomes 2 and 3 instead of staying blank.


Part of the free Pandas course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

You'll learn df.interpolate() with the linear and time methods, the limit and limit_direction controls, and how it differs from plain forward and backward filling with ffill() / bfill().

The default interpolate() uses linear interpolation: it draws a straight line between each pair of known values and reads off the missing points along that line. A gap between 1.0 and 4.0 spread over two missing rows becomes 2.0 and 3.0 — evenly spaced steps.
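A minimal sketch of the default behaviour, using a small made-up Series (the values are illustrative, not taken from the lesson's dataset):

```python
import numpy as np
import pandas as pd

# Two missing rows sit between the known values 1.0 and 4.0.
s = pd.Series([1.0, np.nan, np.nan, 4.0, 5.0])

# With no arguments, interpolate() uses method="linear":
# the gap is split into evenly spaced steps.
filled = s.interpolate()
print(filled.tolist())  # [1.0, 2.0, 3.0, 4.0, 5.0]
```

The same call works on a DataFrame, where each numeric column is interpolated independently.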

When the index is a DatetimeIndex with uneven spacing, linear interpolation can give a misleading estimate because it ignores the index and treats every row as equally spaced. method="time" weights the estimate by the real elapsed time, so a missing value that sits closer to a later reading lands closer to that reading's value.
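The difference is easiest to see on a tiny example. The sketch below assumes three hypothetical readings where the missing one falls three days after the first reading and one day before the last:

```python
import numpy as np
import pandas as pd

# Hypothetical readings with uneven spacing: Jan 1, Jan 4, Jan 5.
idx = pd.to_datetime(["2024-01-01", "2024-01-04", "2024-01-05"])
s = pd.Series([10.0, np.nan, 18.0], index=idx)

# "linear" ignores the index and treats rows as equally spaced.
linear = s.interpolate()
# "time" weights by elapsed time between the known readings.
timed = s.interpolate(method="time")

print(linear.iloc[1])                   # 14.0 (midpoint by row position)
print(round(float(timed.iloc[1]), 6))   # 16.0 (three quarters of the way)
```

Because the missing reading is much closer in time to the 18.0 reading, the time-weighted estimate lands closer to 18.0 than the row-based midpoint does.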

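Two controls keep interpolation honest, and one comparison keeps it in context: limit caps how many consecutive NaNs in a run get filled, limit_direction chooses whether fills run forward, backward, or both ways, and ffill() / bfill() copy the last or next known value flat across a gap instead of estimating a slope.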
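As a rough illustration of limit, limit_direction and the ffill() comparison, here is a sketch on a made-up Series with a three-row gap:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

# limit=1: fill at most one NaN per run, working forward by default.
capped = s.interpolate(limit=1)
print(capped.tolist())  # [1.0, 2.0, nan, nan, 5.0]

# limit=1 with limit_direction="both": one fill from each side of the gap.
both = s.interpolate(limit=1, limit_direction="both")
print(both.tolist())    # [1.0, 2.0, nan, 4.0, 5.0]

# ffill copies the last known value flat across the gap (no slope).
flat = s.ffill()
print(flat.tolist())    # [1.0, 1.0, 1.0, 1.0, 5.0]
```

Capping with limit leaves long gaps partly blank, which can be preferable to inventing values far from any real reading.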

A gap at the very start has nothing earlier to interpolate from, so a default interpolate() leaves it as NaN.
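A short sketch of the leading-gap behaviour, on an illustrative Series that starts with two missing values:

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, np.nan, 3.0, np.nan, 5.0])

# Default direction is forward, so the two leading NaNs stay NaN.
default = s.interpolate()
print(default.tolist())  # [nan, nan, 3.0, 4.0, 5.0]

# limit_direction="both" also fills backward; with the linear method
# the leading gap takes the first known value.
filled = s.interpolate(limit_direction="both")
print(filled.tolist())   # [3.0, 3.0, 3.0, 4.0, 5.0]
```

Note that the leading fill is a flat copy of the nearest known value rather than an extrapolated trend.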

✅ Fix for the ValueError that method="time" raises on a non-datetime index: give the data a DatetimeIndex first, or use method="linear".
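The sketch below reproduces the error and shows both fixes. It assumes a hypothetical frame whose timestamps live in an ordinary column named "when", so the index is a plain RangeIndex:

```python
import numpy as np
import pandas as pd

# Hypothetical data: column names "when" and "temp" are illustrative.
df = pd.DataFrame({
    "when": ["2024-01-01", "2024-01-02", "2024-01-05"],
    "temp": [10.0, np.nan, 18.0],
})

# method="time" on a RangeIndex raises a ValueError.
try:
    df["temp"].interpolate(method="time")
except ValueError as err:
    print("ValueError:", err)

# Fix 1: promote the timestamps to a DatetimeIndex, then use method="time".
indexed = df.set_index(pd.to_datetime(df["when"]))
by_time = indexed["temp"].interpolate(method="time")

# Fix 2: keep the RangeIndex and fall back to method="linear".
by_row = df["temp"].interpolate(method="linear")
```

Fix 1 is the better choice when the spacing between readings matters; Fix 2 is enough when rows are evenly spaced anyway.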

A weather log has irregular daily gaps. Fill the same series three ways and compare.
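One possible way to set up the comparison is sketched here. The dates and temperatures are invented for illustration, and the choice of linear, time and ffill as the "three ways" is an assumption:

```python
import numpy as np
import pandas as pd

# Hypothetical weather log: readings on Mar 1, 2, 3 and 7, two of them missing.
idx = pd.to_datetime(["2024-03-01", "2024-03-02", "2024-03-03", "2024-03-07"])
temps = pd.Series([12.0, np.nan, np.nan, 18.0], index=idx)

compare = pd.DataFrame({
    "raw": temps,
    "linear": temps.interpolate(),             # even steps by row: 14.0, 16.0
    "time": temps.interpolate(method="time"),  # by elapsed days: about 13.0, 14.0
    "ffill": temps.ffill(),                    # flat copy: 12.0, 12.0
})
print(compare)
```

Putting the results side by side makes it clear how much the long gap before Mar 7 pulls the time-weighted estimates away from the row-based ones.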

Lesson complete — your gaps are filled intelligently!

You can estimate missing values with interpolate() (linear and time), cap fills with limit and limit_direction, and pick between estimating a trend and a flat ffill / bfill.

🚀 Up next: Exploding Lists into Rows — turn a column of lists into one row per item.

Practice quiz

What does interpolate() do to missing values?

  • Deletes the rows
  • Estimates them by drawing a line between known points
  • Replaces them with 0
  • Leaves them as NaN

Answer: Estimates them by drawing a line between known points. interpolate estimates gaps from the surrounding known values.

What is the default method used by interpolate()?

  • time
  • polynomial
  • nearest
  • linear

Answer: linear. The default is linear interpolation: a straight line between known points.

For pd.Series([1.0, NaN, NaN, 4.0, 5.0]).interpolate(), what fills the two gaps?

  • 2.0 and 3.0
  • 1.0 and 1.0
  • 0.0 and 0.0
  • 4.0 and 4.0

Answer: 2.0 and 3.0. Linear interpolation splits the gap into even steps: 2.0 and 3.0.

What does method='time' weight the estimate by?

  • The row count
  • Alphabetical order
  • The actual elapsed time between known points
  • The column width

Answer: The actual elapsed time between known points. method='time' weights by real elapsed time, so it needs a datetime index.

Why are leading NaNs still NaN after a default interpolate()?

  • A bug in pandas
  • There is no earlier value to interpolate from
  • Leading NaNs are always deleted
  • interpolate ignores floats

Answer: There is no earlier value to interpolate from. A gap at the very start has nothing earlier to interpolate from, so it stays NaN.

How do you also fill leading and trailing gaps?

  • limit=0
  • method='edge'
  • fill_both=True
  • limit_direction='both'

Answer: limit_direction='both'. limit_direction='both' extends the nearest value to fill leading/trailing NaNs.

What does the limit parameter control?

  • How many consecutive NaNs in a run get filled
  • The maximum value allowed
  • The number of decimal places
  • The index length

Answer: How many consecutive NaNs in a run get filled. limit caps how many NaNs in a row are filled; longer gaps stay partly blank.

How do ffill() and bfill() differ from interpolate()?

  • They are faster only
  • They copy the last/next known value instead of estimating a trend
  • They only work on dates
  • They sort the data first

Answer: They copy the last/next known value instead of estimating a trend. ffill/bfill carry a value flat across the gap rather than estimating a slope.

Calling interpolate(method='time') on a plain RangeIndex causes what?

  • It silently uses linear
  • It returns the input unchanged
  • A ValueError
  • It fills with zeros

Answer: A ValueError. method='time' requires a DatetimeIndex; on a plain RangeIndex it raises a ValueError.

Which tool fits step-like data such as a status that stays the same until it changes?

  • interpolate(method='time')
  • interpolate(method='linear')
  • describe()
  • ffill() / bfill()

Answer: ffill() / bfill(). Use ffill/bfill for step-like data; interpolate is for smoothly changing measurements.