Filtering with Boolean Masks

A boolean mask is a Series of True/False values that pandas uses to keep only the rows where the condition is True — it's how you ask questions like "show me everyone older than 30" and get back exactly those rows.


Part of the free Pandas course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

Learn to write single conditions, combine them safely with & and |, and use the powerful helpers .isin(), .between(), and ~.

Filtering happens in two conceptual steps. First, a comparison like df["age"] > 30 produces a boolean mask: a Series of True/False values, one per row. Second, you pass that mask back into the DataFrame with df[mask], and pandas keeps only the True rows.
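The two steps can be sketched as follows. This is a minimal example that assumes a small made-up DataFrame; the names and ages are illustrative and not taken from the lesson.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "age": [25, 35, 30, 45],
})

# Step 1: the comparison produces a boolean Series, one value per row.
mask = df["age"] > 30
print(mask.tolist())  # [False, True, False, True]

# Step 2: indexing with the mask keeps only the rows marked True.
older = df[mask]
print(older["name"].tolist())  # ['Ben', 'Dev']
```

Writing df[df["age"] > 30] on one line does the same thing; naming the mask first just makes the intermediate step visible.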

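To combine conditions, use the bitwise operators & for AND and | for OR. Each individual condition must be wrapped in parentheses, because & and | bind tighter than comparison operators such as > and ==. The Python keywords and/or do not work on Series; they raise a "truth value of a Series is ambiguous" error.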
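As a rough illustration of combining conditions, the sketch below uses a hypothetical DataFrame with age and city columns (both invented for this example) and wraps every comparison in its own parentheses.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration.
df = pd.DataFrame({
    "age": [25, 35, 30, 45],
    "city": ["NYC", "LA", "NYC", "SF"],
})

# AND: both conditions must be True for a row to be kept.
both = df[(df["age"] > 28) & (df["city"] == "NYC")]
print(both["age"].tolist())  # [30]

# OR: a row is kept if at least one condition is True.
either = df[(df["age"] > 40) | (df["city"] == "LA")]
print(either["age"].tolist())  # [35, 45]
```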

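Three helpers make common filters short and readable:

  • .isin([...]) keeps rows whose value matches any item in the list, which is cleaner than chaining ORs.
  • .between(low, high) keeps rows whose value falls in a range; both endpoints are inclusive by default.
  • ~ inverts a mask (NOT), turning every True into False and every False into True.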
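Here is a minimal sketch of the three helpers, .isin(), .between(), and ~, in use. The ages match the list used in the practice quiz; the city column is an assumption added for illustration.

```python
import pandas as pd

# Ages follow the quiz example; the cities are hypothetical.
df = pd.DataFrame({
    "age": [25, 35, 30, 45, 28],
    "city": ["NYC", "LA", "SF", "Boston", "NYC"],
})

# .isin(): keep rows whose city is any value in the list.
in_cities = df[df["city"].isin(["NYC", "LA"])]
print(in_cities["age"].tolist())  # [25, 35, 28]

# .between(): keep rows with 28 <= age <= 35 (inclusive by default).
in_range = df[df["age"].between(28, 35)]
print(in_range["age"].tolist())  # [35, 30, 28]

# ~ : invert a mask to keep everything that is NOT in NYC.
not_nyc = df[~(df["city"] == "NYC")]
print(not_nyc["city"].tolist())  # ['LA', 'SF', 'Boston']
```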

Build a sales DataFrame, then answer three questions with masks.
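The lesson's own exercise data and questions are not reproduced here, so the following is only one possible version. The sales table, its column names (region, product, amount), and the three questions are all assumptions made up for this sketch.

```python
import pandas as pd

# Hypothetical sales data; not the lesson's actual exercise data.
sales = pd.DataFrame({
    "region": ["East", "West", "East", "South", "West"],
    "product": ["pen", "ink", "pad", "pen", "pad"],
    "amount": [120, 340, 90, 210, 150],
})

# Question 1 (assumed): which sales are larger than 200?
big = sales[sales["amount"] > 200]
print(big["amount"].tolist())  # [340, 210]

# Question 2 (assumed): which East sales are under 100?
small_east = sales[(sales["region"] == "East") & (sales["amount"] < 100)]
print(small_east["amount"].tolist())  # [90]

# Question 3 (assumed): which sales outside East and West
# fall between 100 and 250 inclusive?
other = sales[
    (~sales["region"].isin(["East", "West"]))
    & (sales["amount"].between(100, 250))
]
print(other["region"].tolist())  # ['South']
```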

Lesson complete — you can ask your data questions!

You can build boolean masks, combine them safely with & and | (parentheses always!), and reach for .isin(), ~, and .between() when they fit.

🚀 Up next: loc vs iloc — precise label-based and position-based selection.

Practice quiz

What is a boolean mask in pandas?

  • A list of column names
  • A Series of True/False values, one per row
  • A way to hide columns
  • A type of index

Answer: A Series of True/False values, one per row. A boolean mask is a True/False Series that selects which rows to keep.

What does df[df['age'] > 30] return?

  • The age column only
  • A count of rows
  • Only the rows where age is greater than 30
  • The whole DataFrame

Answer: Only the rows where age is greater than 30. The mask keeps only rows where the condition is True.

Which operators combine two boolean conditions in pandas?

  • and / or
  • && / ||
  • + / -
  • & / |

Answer: & / |. Use the bitwise & (and) and | (or); the Python keywords and/or do not work on Series.

Why must each condition be wrapped in its own parentheses?

  • Because & binds tighter than comparison operators
  • To make it run faster
  • Pandas requires double quotes
  • It is purely stylistic

Answer: Because & binds tighter than comparison operators. & has higher precedence than > or ==, so parentheses force the comparisons first.

What does .isin(['NYC','LA']) do?

  • Sorts those cities first
  • Renames the column
  • Keeps rows whose value is in the list
  • Counts the cities

Answer: Keeps rows whose value is in the list. .isin(list) keeps rows matching any value in the list, cleaner than chaining ORs.

What does the ~ (tilde) operator do to a mask?

  • Doubles it
  • Inverts it (NOT)
  • Sorts it
  • Sums it

Answer: Inverts it (NOT). ~ flips every True to False and every False to True, giving you the negation.

By default, .between(30, 40) includes which endpoints?

  • Neither endpoint
  • Only the lower one
  • Only the upper one
  • Both endpoints (inclusive)

Answer: Both endpoints (inclusive). By default both ends are inclusive: 30 <= age <= 40.

Using and/or on Series raises which error?

  • Truth value of a Series is ambiguous
  • SyntaxError
  • KeyError
  • IndexError

Answer: Truth value of a Series is ambiguous. Python's and/or need a single True/False, so a Series raises the 'ambiguous truth value' error.
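To see that error in practice, here is a small sketch that reuses the ages from the quiz. It catches the exception so the script keeps running, then shows the element-wise version with &.

```python
import pandas as pd

ages = pd.Series([25, 35, 30, 45, 28])

try:
    # `and` asks Python for a single True/False from the left-hand
    # Series, which pandas refuses to provide.
    result = (ages > 28) and (ages < 40)
except ValueError as err:
    message = str(err)
    print(message)

# The bitwise & compares element by element instead.
ok = (ages > 28) & (ages < 40)
print(ok.tolist())  # [False, True, True, False, False]
```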

For ages [25,35,30,45,28], how many rows satisfy .between(28, 35)?

  • 1
  • 2
  • 3
  • 5

Answer: 3. 28, 30 and 35 fall in the inclusive range, so 3 rows match.
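The count can be checked directly. A minimal sketch, relying on the fact that True counts as 1 when a boolean mask is summed:

```python
import pandas as pd

ages = pd.Series([25, 35, 30, 45, 28])

# Both endpoints are included by default, so 28 and 35 both match.
mask = ages.between(28, 35)
print(mask.tolist())  # [False, True, True, False, True]

# Summing a boolean mask counts the True values.
matches = int(mask.sum())
print(matches)  # 3
```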

Which is the cleanest way to keep rows where city is NYC, LA, or SF?

  • Chaining three == with |
  • df[df['city'].isin(['NYC', 'LA', 'SF'])]

Answer: df[df['city'].isin(['NYC', 'LA', 'SF'])]. .isin([...]) is far more readable and scalable than chaining many == comparisons.