Handling Missing Data (NaN, fillna, dropna)

Missing data in pandas is represented by NaN ("Not a Number"), and handling it means detecting those gaps and deciding whether to remove the affected rows or fill them with a sensible value — a step almost every real dataset requires before analysis.

Part of the free Pandas course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

Learn to detect NaN with isna()/notna(), count gaps, drop incomplete rows with dropna(), and fill holes with fillna() — including filling with a column's mean.

Pandas marks missing values with NaN. To create one in code, use numpy.nan. Because NaN == NaN is False, you can't find gaps with ==. Instead, use df.isna() (identical to df.isnull()), which returns a True/False mask, and df.notna() for the opposite.
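A minimal sketch of detection, assuming a small made-up DataFrame (the column names and values are illustrative, not from the lesson). It contrasts the == comparison with isna() and notna(), then counts gaps per column:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one gap in "age", two gaps in "score".
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cal", "Dee"],
    "age": [25.0, np.nan, 31.0, 40.0],
    "score": [88.0, 92.0, np.nan, np.nan],
})

# Equality never matches NaN, so this mask is False everywhere.
eq_mask = df["score"] == np.nan

# isna() is True where a value is missing; notna() is the inverse.
na_mask = df["score"].isna()
ok_mask = df["score"].notna()

# Summing the boolean mask counts the gaps in each column.
gaps_per_column = df.isna().sum()
print(gaps_per_column)
```

Summing works because True counts as 1 and False as 0, so each column's total is its number of missing values.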

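df.dropna() deletes rows that contain any NaN and returns a new DataFrame. Its arguments fine-tune it: how="all" drops only rows where every value is NaN, subset=[...] limits the NaN check to the named columns, and axis=1 drops columns instead of rows.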
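As a rough illustration of dropna(), the sketch below uses a made-up DataFrame (names and values are assumptions) and shows the default behavior next to the how and subset arguments that the quiz later covers:

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 is complete, row 2 is entirely missing.
df = pd.DataFrame({
    "name": ["Ana", "Ben", np.nan, "Dee"],
    "age": [25.0, np.nan, np.nan, 40.0],
    "score": [88.0, 92.0, np.nan, np.nan],
})

# Default how="any": a row is dropped if it has at least one NaN.
any_dropped = df.dropna()

# how="all": a row is dropped only if every value in it is NaN.
all_dropped = df.dropna(how="all")

# subset: only the listed columns are checked for NaN.
score_only = df.dropna(subset=["score"])

print(len(df), len(any_dropped), len(all_dropped), len(score_only))
```

Each call returns a new DataFrame, so df itself still has all four rows afterwards.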

Often you want to keep every row and replace gaps instead. df.fillna(value) substitutes a constant; method="ffill" forward-fills the last valid value down (recent pandas releases deprecate this argument in favor of the equivalent df.ffill()); and a very common technique is filling a numeric column with its own mean.
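The three filling strategies might look like the sketch below, on made-up data (the column names are assumptions). Forward-fill is written with the ffill() method, which does the same job as method="ffill" and avoids the deprecation warning in newer pandas versions:

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in both columns.
df = pd.DataFrame({
    "city": ["Oslo", np.nan, "Rome", np.nan],
    "score": [80.0, np.nan, 90.0, np.nan],
})

# Constant fill: every NaN becomes 0.
zero_filled = df["score"].fillna(0)

# Mean fill: mean() skips NaN, so the mean of 80 and 90 is used.
mean_filled = df["score"].fillna(df["score"].mean())

# Forward fill: each gap takes the last valid value above it.
forward_filled = df["city"].ffill()

print(mean_filled.tolist())
print(forward_filled.tolist())
```

Forward-fill has nothing to copy if the first value is missing, so a leading NaN stays NaN.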

❌ Forgetting that dropna/fillna return copies
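A short sketch of this mistake, using a one-column DataFrame invented for illustration. Calling fillna() without keeping the result leaves the original unchanged; assigning the result back is one way to keep the change:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a single gap.
df = pd.DataFrame({"score": [88.0, np.nan, 75.0]})

# The filled copy is discarded here, so df still contains its NaN.
df.fillna(0)
still_missing = int(df["score"].isna().sum())

# Assigning the result back keeps the filled values.
df = df.fillna(0)
now_missing = int(df["score"].isna().sum())

print(still_missing, now_missing)
```

The same applies to dropna(): the cleaned rows exist only in the returned DataFrame.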

A survey DataFrame has gaps. Clean it step by step.
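One possible way to work through such a cleanup is sketched below. The survey columns, values, and the choice of which gaps to drop versus fill are all assumptions made for illustration, not the lesson's own dataset or solution:

```python
import numpy as np
import pandas as pd

# Hypothetical survey responses with gaps in "age" and "rating".
survey = pd.DataFrame({
    "respondent": ["r1", "r2", "r3", "r4", "r5"],
    "age": [34.0, np.nan, 29.0, 41.0, np.nan],
    "rating": [4.0, 5.0, np.nan, 3.0, np.nan],
})

# Step 1: count the gaps in each column.
print(survey.isna().sum())

# Step 2: drop respondents who gave no rating.
clean = survey.dropna(subset=["rating"])

# Step 3: fill the remaining age gaps with the mean age of the kept rows.
clean = clean.assign(age=clean["age"].fillna(clean["age"].mean()))

print(clean)
```

Dropping first and filling second means the mean is computed only from respondents who stay in the data; the opposite order would give a different fill value.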

You can detect NaN with isna(), tally gaps with isna().sum(), remove incomplete rows with dropna(), and fill holes with fillna(), including with a column's mean and forward-fill.

🚀 Up next: Sorting and Ranking — order your rows and assign ranks.

Practice quiz

How does pandas mark a missing value?

  • As an empty string
  • As NaN (Not a Number)
  • As the integer 0
  • As the text 'null'

Answer: As NaN (Not a Number). Pandas uses NaN, a special float, to mark missing values.

Why does df['x'] == np.nan never find missing values?

  • == is not supported on Series
  • NaN equals only itself
  • NaN is never equal to anything, even itself
  • It needs quotes

Answer: NaN is never equal to anything, even itself. NaN compares unequal to everything, so == always returns False.

Which method returns a True/False mask of missing values?

  • df.isna()
  • df.dropna()
  • df.fillna()
  • df.mean()

Answer: df.isna(). df.isna() (same as isnull()) is True where a value is missing.

What is the most useful one-liner to count gaps per column?

  • df.notna()
  • df.dropna().sum()
  • df.count()
  • df.isna().sum()

Answer: df.isna().sum(). df.isna().sum() tallies missing values column by column.

What does df.dropna() do by default (how='any')?

  • Drops a row if ANY value is NaN
  • Drops only fully-empty rows
  • Fills NaN with 0
  • Drops all columns

Answer: Drops a row if ANY value is NaN. The default how='any' removes any row containing at least one NaN.

What does dropna(how='all') drop?

  • Every row
  • Only rows where EVERY value is NaN
  • Columns with NaN
  • Nothing

Answer: Only rows where EVERY value is NaN. how='all' removes only rows that are entirely missing.

How do you only drop rows missing the 'score' column?

  • dropna(axis=1)
  • dropna(how='all')
  • dropna(subset=['score'])

Answer: dropna(subset=['score']). subset=['score'] limits the NaN check to that one column.

What does df.fillna(0) do?

  • Drops NaN rows
  • Replaces every NaN with 0
  • Counts NaN
  • Renames columns

Answer: Replaces every NaN with 0. fillna(value) substitutes the given value for every NaN.

When filling a numeric column with its own mean, what does .mean() do with NaN?

  • Treats NaN as 0
  • Raises an error
  • Returns NaN
  • Ignores NaN automatically

Answer: Ignores NaN automatically. .mean() skips NaN, so the average isn't skewed by the gaps.

What does method='ffill' do?

  • Carries the previous valid value forward
  • Fills with the column max
  • Drops the row
  • Fills with zeros

Answer: Carries the previous valid value forward. Forward-fill copies the last valid value down into the gaps.