Masked Arrays (np.ma)

A masked array pairs your data with a boolean mask that flags invalid or missing values, so NumPy automatically excludes them from means, sums, and other reductions without you deleting or overwriting anything.


Part of the free NumPy course at LearnCodingFast: hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

You'll learn to build masked arrays, hide sentinel values like -999, run reductions that skip the masked entries, and convert back to a plain array with filled(). You'll also see when a mask beats a NaN.

Real-world data is full of placeholders: a failed sensor records -999, and a survey stores a blank answer as -1. A masked array lets you mark those entries as invalid while keeping the original numbers intact. NumPy prints masked elements as -- and skips them in calculations. The fastest way to mask a sentinel is np.ma.masked_equal(data, value).
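As a minimal sketch of that call, the snippet below masks -999 in a short array. The readings are hypothetical values made up for illustration.

```python
import numpy as np

# Hypothetical sensor readings; -999 marks a failed measurement.
readings = np.array([21.5, 22.0, -999.0, 23.5, -999.0])

masked = np.ma.masked_equal(readings, -999)

print(masked)         # masked entries are displayed as --
print(masked.mask)    # True where a value is hidden
print(masked.mean())  # average of the three valid readings only
```

The original readings array is left untouched; only the mask records which entries to ignore.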

You can also supply the mask directly with np.ma.masked_array(data, mask=[...]) when you already know which positions are bad. .compressed() returns just the valid values as a plain 1D array.
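A short sketch of the explicit-mask form, assuming you already know that positions 2 and 4 are bad. The data and the mask are hypothetical.

```python
import numpy as np

data = np.array([3, 7, 0, 9, 4])
bad = [False, False, True, False, True]  # hypothetical: True marks a bad position

m = np.ma.masked_array(data, mask=bad)
valid = m.compressed()  # plain 1D ndarray holding only the unmasked values

print(m)      # [3 7 -- 9 --]
print(valid)  # [3 7 9]
```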

Once values are masked, every reduction (mean, sum, max, min, std) ignores them. np.ma also gives you helpers that build the mask from a condition: masked_greater, masked_less, masked_outside, and masked_invalid (for NaN/inf). .count() tells you how many valid values remain.
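The condition helpers can be tried on a small array. This is an illustrative sketch with hypothetical temperatures, where anything outside 0 to 50 is treated as a glitch.

```python
import numpy as np

temps = np.array([18.0, 21.0, 250.0, 19.5, -40.0, 20.5])  # hypothetical readings

# masked_outside hides values outside the interval [0, 50]
ok = np.ma.masked_outside(temps, 0, 50)
print(ok.count())  # 4 valid values remain
print(ok.mean())   # 19.75

hot = np.ma.masked_greater(temps, 100)  # hides 250.0 only
cold = np.ma.masked_less(temps, 0)      # hides -40.0 only

# masked_invalid hides NaN and inf
with_nan = np.array([1.0, np.nan, 3.0, np.inf])
clean = np.ma.masked_invalid(with_nan)
print(clean.sum())  # 4.0
```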

When you are done, filled(value) substitutes a chosen value for every masked element and hands back an ordinary array — useful for exporting or plotting. This is where masked arrays beat NaN: a plain arr.mean() on data containing np.nan returns nan (the NaN "poisons" the result), but a masked array's mean() just works. Masks also handle integer arrays, which cannot hold NaN at all.
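The contrast with NaN can be checked directly. The sketch below uses small made-up arrays: one float array containing np.nan and one integer array with -1 as a placeholder.

```python
import numpy as np

with_nan = np.array([10.0, np.nan, 30.0])
print(with_nan.mean())                        # nan: the NaN spreads into the result
print(np.ma.masked_invalid(with_nan).mean())  # 20.0

# Integer arrays cannot store NaN, but they can be masked.
counts = np.array([4, -1, 6, -1])
m = np.ma.masked_equal(counts, -1)

plain = m.filled(0)  # ordinary ndarray with 0 in place of the masked entries
print(plain)         # [4 0 6 0]
```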

Replace each ___ so the program masks the -1 placeholders, averages the rest, then fills the gaps with 0.

Expected output: 11.25 then [ 5  8  0 12  0 20]. (Answers: masked_equal, mean, filled.)
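For reference, here is one completed program that produces the expected output. The input array is an assumption inferred from that output, not the lesson's original starter code.

```python
import numpy as np

# Assumed input: inferred from the expected output, -1 is the placeholder.
data = np.array([5, 8, -1, 12, -1, 20])

masked = np.ma.masked_equal(data, -1)  # hide the -1 placeholders
print(masked.mean())                   # 11.25
print(masked.filled(0))                # [ 5  8  0 12  0 20]
```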

❌ Using np.mean(masked) instead of masked.mean()

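In current NumPy releases np.mean(masked) hands off to the masked array's own mean() method, but not every top-level NumPy function is mask-aware, so relying on them can silently pull masked values into the result.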

✅ Fix: call the method on the masked array, masked.mean(), or use np.ma.mean(masked); both honor the mask.
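A small comparison, using a hypothetical four-element array, of the two mask-aware calls named in the fix and of what an average over the raw data would give if the mask were ignored:

```python
import numpy as np

masked = np.ma.masked_equal(np.array([10, 20, -999, 40]), -999)

print(masked.mean())       # mask-aware method: averages 10, 20, 40
print(np.ma.mean(masked))  # mask-aware function: same result

# For contrast, an average over the raw data includes the sentinel.
print(masked.data.mean())  # -232.25
```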

masked.data still contains the original sentinel values; only the mask hides them.

✅ Fix: use filled(value) to overwrite masked entries or compressed() to drop them before exporting.
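To see the difference before exporting, the sketch below prints the raw data next to the two safe alternatives. The three-element array is hypothetical.

```python
import numpy as np

m = np.ma.masked_equal(np.array([7, -999, 9]), -999)

print(m.data)          # the sentinel -999 is still stored here
print(m.filled(0))     # [7 0 9]
print(m.compressed())  # [7 9]
```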

A sales table uses -1 for days a store was closed. Mask them, then compute each column's mean ignoring the closures, and fill any all-masked result with 0.
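One possible approach is sketched below with a hypothetical 3x3 table (rows are days, columns are stores) in which the middle store was closed every day. The numbers are invented for illustration.

```python
import numpy as np

# Hypothetical sales table; -1 means the store was closed that day.
sales = np.array([
    [120, -1, 90],
    [100, -1, 80],
    [-1, -1, 70],
])

masked = np.ma.masked_equal(sales, -1)

# Column means skip the closures; the middle column has no valid
# values, so its mean is itself masked.
col_means = masked.mean(axis=0)

result = col_means.filled(0)  # replace the all-masked mean with 0
print(result)
```

With this data the first and last columns average to 110 and 80, and the all-masked middle column becomes 0 after filled(0).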


You can now flag invalid data with np.ma.masked_array and helpers like masked_equal, run reductions that skip masked elements, recover plain arrays with filled() and compressed(), and explain why a mask sidesteps the NaN-poisoning problem.

🚀 Up next: Fourier Transforms (np.fft) — turn signals into their frequency components.

Practice quiz

What does a masked array carry alongside its data?

A boolean mask marking invalid elements
A second copy of the data
A list of column names
A timestamp

Answer: A boolean mask marking invalid elements. A masked array pairs the data with a boolean mask flagging invalid entries.

Which function masks every occurrence of a sentinel value like -999?

np.ma.compressed
np.ma.masked_equal
np.ma.filled
np.ma.average

Answer: np.ma.masked_equal. np.ma.masked_equal(data, -999) masks each element equal to the sentinel.

How does NumPy print a masked element?

NaN
0
--
null

Answer: --. Masked elements are displayed as -- in the array output.

For np.ma.masked_equal([10,20,-999,40,-999,60], -999), what is .mean()?

32.5
0.0
-999.0
21.5

Answer: 32.5. Masking the two -999 values leaves 10,20,40,60 with a mean of 32.5.

What does .compressed() return from a masked array?

The mask only
The fill value
Only the valid values as a plain 1D array
A 2D copy

Answer: Only the valid values as a plain 1D array. .compressed() drops the masked entries and returns the valid values.

What does filled(0) do to a masked array?

Deletes the array
Replaces masked elements with 0 and returns a plain array
Masks every zero
Adds zeros to the end

Answer: Replaces masked elements with 0 and returns a plain array. filled(value) substitutes the value for masked entries, returning an ndarray.

Why does a mask beat NaN for an integer array?

Integers store NaN fine
NaN is faster
Integer arrays cannot hold NaN, but a mask works on any dtype
Masks always use less code

Answer: Integer arrays cannot hold NaN, but a mask works on any dtype. NaN is a float-only value; a mask works with any dtype including integers.

Which helper masks NaN or inf values in an array?

np.ma.masked_invalid
np.ma.masked_equal
np.ma.filled
np.ma.count

Answer: np.ma.masked_invalid. np.ma.masked_invalid masks NaN and inf entries automatically.

What does .count() report on a masked array?

The number of masked elements
The fill value
The array size in bytes
The number of valid (unmasked) values

Answer: The number of valid (unmasked) values. .count() returns how many valid, unmasked values remain.

Why does a plain arr.mean() return nan when arr contains np.nan?

NaN poisons the reduction
mean is undefined
It rounds down
It ignores NaN by default

Answer: NaN poisons the reduction. A single NaN poisons a normal reduction, so the result is nan; a masked mean avoids this.