Filtering with Boolean Masks
A boolean mask is a Series of True/False values that pandas uses to keep only the rows where the condition is True — it's how you ask questions like "show me everyone older than 30" and get back exactly those rows.
Part of the free Pandas course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.
Learn to write single conditions, combine them safely with & and |, and use the powerful helpers .isin(), .between(), and ~.
Filtering happens in two conceptual steps. First, a comparison like df["age"] > 30 produces a boolean mask: a Series of True/False values, one per row. Second, you pass that mask back into the DataFrame with df[mask], and pandas keeps only the True rows.
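The two steps can be sketched as follows. This is a minimal illustration, assuming a small made-up DataFrame; the names and ages are hypothetical and not part of the lesson's data.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "age": [25, 35, 30, 45],
})

# Step 1: the comparison yields a boolean Series, one value per row.
mask = df["age"] > 30
print(mask.tolist())  # [False, True, False, True]

# Step 2: indexing with the mask keeps only the rows marked True.
older = df[mask]
print(older["name"].tolist())  # ['Ben', 'Dev']
```

Writing df[df["age"] > 30] does both steps in one expression; naming the mask first is mainly useful when you want to inspect or reuse it.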
To combine conditions, use the bitwise operators & for AND and | for OR. Each individual condition must be wrapped in parentheses, because & and | bind more tightly than comparison operators such as > and ==.
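A short sketch of combining conditions, again assuming a hypothetical DataFrame whose columns (name, age, city) are invented for the example:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "age": [25, 35, 30, 45],
    "city": ["NYC", "LA", "NYC", "SF"],
})

# AND: both conditions must hold for a row to be kept.
older_in_nyc = df[(df["age"] >= 30) & (df["city"] == "NYC")]
print(older_in_nyc["name"].tolist())  # ['Cara']

# OR: a row is kept if either condition holds.
young_or_sf = df[(df["age"] < 30) | (df["city"] == "SF")]
print(young_or_sf["name"].tolist())  # ['Ana', 'Dev']
```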
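Three helpers make common filters short and readable:
- .isin(list) keeps rows whose value matches any item in the list, replacing a chain of == comparisons joined with |.
- .between(low, high) keeps rows whose value falls in the range, with both endpoints included by default.
- ~ inverts a mask, turning every True into False and every False into True.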
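The helpers .isin(), .between(), and ~ might be used like this. The DataFrame and its values are hypothetical, chosen only to make each result easy to check by eye.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "age": [25, 35, 30, 45],
    "city": ["NYC", "LA", "NYC", "SF"],
})

# .isin(): keep rows whose value appears in the given list.
coastal = df[df["city"].isin(["NYC", "LA"])]
print(coastal["name"].tolist())  # ['Ana', 'Ben', 'Cara']

# .between(): keep rows inside a range, inclusive on both ends by default.
thirties = df[df["age"].between(30, 40)]
print(thirties["name"].tolist())  # ['Ben', 'Cara']

# ~: invert a mask, here "everyone not in NYC".
not_nyc = df[~(df["city"] == "NYC")]
print(not_nyc["name"].tolist())  # ['Ben', 'Dev']
```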
Build a sales DataFrame, then answer three questions with masks.
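One possible shape for this exercise is sketched below. The lesson's actual data and questions are not shown here, so the column names (region, product, units), the values, and the three questions are all assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sales data; columns and values are invented for this sketch.
sales = pd.DataFrame({
    "region": ["East", "West", "East", "North", "West"],
    "product": ["pen", "ink", "pad", "pen", "pad"],
    "units": [10, 3, 7, 12, 5],
})

# Question 1 (assumed): which sales moved more than 6 units?
big = sales[sales["units"] > 6]

# Question 2 (assumed): which East-region sales moved at least 10 units?
east_big = sales[(sales["region"] == "East") & (sales["units"] >= 10)]

# Question 3 (assumed): which sales were outside the East and West regions?
elsewhere = sales[~sales["region"].isin(["East", "West"])]

print(len(big), len(east_big), len(elsewhere))  # 3 1 1
```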
Lesson complete — you can ask your data questions!
You can build boolean masks, combine them safely with & and | (parentheses always!), and reach for .isin(), ~, and .between() when they fit.
🚀 Up next: loc vs iloc — precise label-based and position-based selection.
Practice quiz
What is a boolean mask in pandas?
- A list of column names
- A Series of True/False values, one per row
- A way to hide columns
- A type of index
Answer: A Series of True/False values, one per row. A boolean mask is a True/False Series that selects which rows to keep.
What does df[df['age'] > 30] return?
- The age column only
- A count of rows
- Only the rows where age is greater than 30
- The whole DataFrame
Answer: Only the rows where age is greater than 30. The mask keeps only rows where the condition is True.
Which operators combine two boolean conditions in pandas?
- and / or
- && / ||
- + / -
- & / |
Answer: & / |. Use the bitwise & (and) and | (or); the Python keywords and/or do not work on Series.
Why must each condition be wrapped in its own parentheses?
- Because & binds tighter than comparison operators
- To make it run faster
- Pandas requires double quotes
- It is purely stylistic
Answer: Because & binds tighter than comparison operators. & has higher precedence than > or ==, so parentheses force the comparisons first.
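The precedence rule can be seen directly in a small sketch. The Series values are hypothetical, and the raised flag is only a helper for this demonstration.

```python
import pandas as pd

age = pd.Series([25, 35, 30, 45])

# With parentheses, each comparison is evaluated before & combines them.
ok = (age > 28) & (age < 40)
print(ok.tolist())  # [False, True, True, False]

# Without parentheses, Python groups the expression as age > (28 & age) < 40.
# That is a chained comparison, which needs a single truth value from a
# Series, so pandas raises a ValueError.
raised = False
try:
    age > 28 & age < 40
except ValueError:
    raised = True
print(raised)  # True
```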
What does .isin(['NYC','LA']) do?
- Sorts those cities first
- Renames the column
- Keeps rows whose value is in the list
- Counts the cities
Answer: Keeps rows whose value is in the list. .isin(list) keeps rows matching any value in the list, cleaner than chaining ORs.
What does the ~ (tilde) operator do to a mask?
- Doubles it
- Inverts it (NOT)
- Sorts it
- Sums it
Answer: Inverts it (NOT). ~ flips every True to False, giving you the negation.
By default, .between(30, 40) includes which endpoints?
- Neither endpoint
- Only the lower one
- Only the upper one
- Both endpoints (inclusive)
Answer: Both endpoints (inclusive). By default both ends are inclusive: 30 <= age <= 40.
Using and/or on Series raises which error?
- Truth value of a Series is ambiguous
- SyntaxError
- KeyError
- IndexError
Answer: Truth value of a Series is ambiguous. Python's and/or need a single True/False, so a Series raises the 'ambiguous truth value' error.
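A minimal way to reproduce the error and compare it with the element-wise form, assuming a hypothetical Series of ages:

```python
import pandas as pd

age = pd.Series([25, 35, 30, 45])

message = ""
try:
    # `and` asks each operand for one True/False, which a Series cannot give.
    both = (age > 28) and (age < 40)
except ValueError as err:
    message = str(err)
print(message)

# The element-wise version uses & and returns one result per row.
both = (age > 28) & (age < 40)
print(both.tolist())  # [False, True, True, False]
```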
For ages [25,35,30,45,28], how many rows satisfy .between(28, 35)?
- 1
- 2
- 3
- 5
Answer: 3. 28, 30 and 35 fall in the inclusive range, so 3 rows match.
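The count can be checked with a few lines. Summing a boolean mask counts its True values, since each True counts as 1.

```python
import pandas as pd

ages = pd.Series([25, 35, 30, 45, 28])

# Both endpoints are included by default, so 28 and 35 both match.
in_range = ages.between(28, 35)
print(in_range.tolist())  # [False, True, True, False, True]

matches = int(in_range.sum())
print(matches)  # 3
```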
Which is the cleanest way to keep rows where city is NYC, LA, or SF?
- Chaining three == with |
- df['city'].isin(['NYC', 'LA', 'SF'])
Answer: df['city'].isin(['NYC', 'LA', 'SF']). .isin([...]) is far more readable and scalable than chaining many == comparisons.