MultiIndex (Hierarchical Indexing)

A MultiIndex in pandas is an index with two or more levels — for example labelling each row by a (country, city) pair — which lets a flat two-dimensional DataFrame represent higher-dimensional, hierarchical data and select it with tuple-based lookups.

Part of the free Pandas course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

The most common way to get a MultiIndex is set_index(["a", "b"]), which promotes two columns into a two-level row index. You can also build one explicitly with pd.MultiIndex.from_tuples (one tuple per label) or pd.MultiIndex.from_product (every combination of the given level values).
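The three construction routes above can be sketched as follows. This is a minimal illustration, not the lesson's own code: the frame, the column names, and the population figures are placeholder assumptions.

```python
import pandas as pd

# Hypothetical sample data; the population figures are placeholders,
# not real statistics.
df = pd.DataFrame(
    {
        "country": ["UK", "UK", "France", "France"],
        "city": ["London", "Leeds", "Paris", "Lyon"],
        "population": [9.0, 0.8, 2.1, 0.5],
    }
)

# Promote two columns into a two-level row index.
indexed = df.set_index(["country", "city"])

# Build a MultiIndex explicitly from (outer, inner) tuples.
from_tuples = pd.MultiIndex.from_tuples(
    [("UK", "London"), ("France", "Paris")],
    names=["country", "city"],
)

# Build every combination of the given level values (Cartesian product).
from_product = pd.MultiIndex.from_product(
    [["2023", "2024"], ["Q1", "Q2"]],
    names=["year", "quarter"],
)

print(indexed.index.nlevels)  # number of index levels
print(list(from_product))     # all (year, quarter) pairs
```

from_product is convenient when every combination should exist (for example, every quarter of every year), while from_tuples suits an irregular set of pairs.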

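Select an exact row with a tuple in .loc, for example df.loc[("UK", "London")]. Select everything under an outer label by passing just that label, as in df.loc["UK"]. To filter on an inner level while ignoring the outer one, use .xs (cross-section), as in df.xs("London", level="city"). Call sort_index() first: slicing an unsorted MultiIndex can raise UnsortedIndexError, and lookups on it can emit a PerformanceWarning.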
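A short sketch of the three selection styles, assuming a small (country, city) frame with placeholder population values; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical frame with a two-level (country, city) index;
# the values are placeholders.
df = pd.DataFrame(
    {"population": [9.0, 0.8, 2.1, 0.5]},
    index=pd.MultiIndex.from_tuples(
        [
            ("UK", "London"),
            ("UK", "Leeds"),
            ("France", "Paris"),
            ("France", "Lyon"),
        ],
        names=["country", "city"],
    ),
)

# Sort before selecting so partial slices behave predictably.
df = df.sort_index()

# Exact row: pass the full (outer, inner) tuple, plus a column label.
london = df.loc[("UK", "London"), "population"]

# Everything under one outer label: the result is indexed by city.
uk_rows = df.loc["UK"]

# Cross-section on the inner level: the result is indexed by country.
london_xs = df.xs("London", level="city")

print(london)
print(uk_rows)
print(london_xs)
```

Note that df.loc["UK"] and .xs both drop the level that was matched, so the remaining level becomes the index of the result.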

Grouping by two or more keys returns a result with a MultiIndex. When you want plain columns back — for merging, exporting, or plotting — call reset_index() to flatten the index into ordinary columns.
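As a rough illustration of that round trip, the sketch below groups a made-up sales table by two keys and then flattens the result. The column names and amounts are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical sales records; names and amounts are placeholders.
sales = pd.DataFrame(
    {
        "region": ["North", "North", "South", "South", "North"],
        "product": ["Pen", "Ink", "Pen", "Ink", "Pen"],
        "amount": [10, 5, 7, 3, 4],
    }
)

# Grouping by two keys gives a Series indexed by (region, product).
totals = sales.groupby(["region", "product"])["amount"].sum()

# reset_index() moves both index levels back into ordinary columns.
flat = totals.reset_index()

print(totals)
print(flat)
```

The flattened frame has one row per (region, product) pair and plain region, product, and amount columns, which is usually the easier shape for merge, to_csv, or plotting calls.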

Select the population of Lyon, France. Fill in the tuple you pass to .loc.
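One possible solution is sketched below. The exercise's actual DataFrame is not shown here, so the frame, its name (cities), and the population figures are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the exercise data; figures are placeholders.
cities = (
    pd.DataFrame(
        {
            "country": ["France", "France", "UK"],
            "city": ["Paris", "Lyon", "London"],
            "population": [2.1, 0.5, 9.0],
        }
    )
    .set_index(["country", "city"])
    .sort_index()
)

# The tuple follows the order of the index levels: outer label first
# (country), then inner label (city).
lyon_population = cities.loc[("France", "Lyon"), "population"]

print(lyon_population)
```

The key point is the order inside the tuple: it must match the index levels, so ("France", "Lyon") works on a (country, city) index while ("Lyon", "France") would raise a KeyError.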

Group sales by region and product, sort the index, then drill into one region and one product across all regions.
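A possible walk-through of this exercise, assuming a small invented sales table (the region names, product names, and amounts are placeholders, not the lesson's data):

```python
import pandas as pd

# Hypothetical sales records for the exercise.
sales = pd.DataFrame(
    {
        "region": ["East", "West", "East", "West", "East"],
        "product": ["Pen", "Pen", "Pad", "Pad", "Pen"],
        "amount": [12, 8, 6, 9, 3],
    }
)

# Group by both keys, total the amounts, and sort the resulting MultiIndex.
by_key = sales.groupby(["region", "product"])["amount"].sum().sort_index()

# Drill into one region: pass only the outer label.
east = by_key.loc["East"]

# One product across all regions: cross-section on the inner level.
pens = by_key.xs("Pen", level="product")

print(east)
print(pens)
```

groupby already sorts its keys by default, so the explicit sort_index() is a safeguard here rather than a requirement.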

Lesson complete — hierarchical data holds no fear!

You can create MultiIndexes, select with tuples and cross-sections, sort for correct fast lookups, and flatten back to columns. This unlocks grouped, panel, and time-series data that lives in more than two dimensions.

Practice quiz

What is a MultiIndex?

  • A single-level index
  • A row or column index with more than one level
  • A duplicate column
  • A type of join

Answer: A row or column index with more than one level. A MultiIndex labels each row (or column) with two or more levels, like (country, city).

Which call promotes two columns into a MultiIndex?

  • df.reset_index()
  • df.stack()
  • df.set_index(["country", "city"])

Answer: df.set_index(["country", "city"]). set_index with a list of columns builds a multi-level row index.

What does pd.MultiIndex.from_product([...]) create?

  • Every combination of the given levels
  • Only matching pairs
  • A single tuple
  • A flat index

Answer: Every combination of the given levels. from_product yields the Cartesian product — every combination of the levels.

How do you select the exact row for (UK, London)?

  • df.loc[("UK", "London")]

Answer: df.loc[("UK", "London")]. Pass a tuple to .loc to select an exact MultiIndex row.

What does df.xs('London', level='city') do?

  • A cross-section on the inner 'city' level
  • Drops London
  • Sorts by city
  • Renames the level

Answer: A cross-section on the inner 'city' level. .xs takes a cross-section on a single level, ignoring the others.

Why should you sort_index() after building a MultiIndex?

  • To rename levels
  • Partial slicing is reliable and fast only when sorted
  • To drop duplicates
  • It is never needed

Answer: Partial slicing is reliable and fast only when sorted. Slicing an unsorted MultiIndex can raise UnsortedIndexError, and lookups on it can emit a PerformanceWarning.

Grouping by two keys (groupby(['region','product'])) produces what?

  • A flat index
  • A single column
  • A scalar
  • A result with a MultiIndex

Answer: A result with a MultiIndex. Grouping by multiple keys yields a MultiIndex on the result.

Which method flattens a MultiIndex back into plain columns?

  • reset_index()
  • set_index()
  • stack()
  • xs()

Answer: reset_index(). reset_index() turns the index levels back into ordinary columns.

What does df.loc['UK'] return on a (country, city) index?

  • Only the first UK row
  • An error
  • All rows under the UK outer label
  • The whole frame

Answer: All rows under the UK outer label. Passing just the outer label selects every row under it.

For sales grouped by (region, product), what does .xs('Pen', level='product') give?

  • Pen sales across all regions
  • Only the first Pen row
  • All products in one region
  • A sum of everything

Answer: Pen sales across all regions. The cross-section pulls every Pen entry regardless of region.