Pivot Tables & Crosstab

A pivot table in pandas is a summary grid built with df.pivot_table() that takes long, row-by-row data and reshapes it so one categorical column becomes the rows, another becomes the columns, and a numeric column is aggregated into each cell — exactly like a spreadsheet PivotTable.

Learn Pivot Tables & Crosstab in our free Pandas course — a beginner-friendly interactive lesson with worked examples, a practice exercise and a quick…

Part of the free Pandas course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

Start with long-format sales data — one row per transaction. pivot_table turns it into a grid: index picks the rows, columns picks the columns, values is the number to summarise, and aggfunc says how (sum, mean, count, ...).

Add margins=True for an "All" row and column of grand totals. Pass a list to aggfunc to compute several summaries at once.

df.pivot() only reshapes and errors if any cell would receive more than one value . pd.crosstab() is a shortcut for frequency tables — counting how often category combinations occur, optionally as proportions with normalize= .

Build a pivot table of the average score for each class. Which aggfunc gives the average?

From transaction-level data, build a store-by-month revenue grid with grand totals, then find the best month overall.

Lesson complete — you can summarise data like a spreadsheet!

You can build pivot tables with custom aggregations and totals, choose between pivot and pivot_table, and count category combinations with crosstab. These are the everyday reporting tools of a data analyst.

🚀 Up next: Reshaping with melt, stack, unstack — convert between wide and long layouts.

Practice quiz

In pivot_table, which argument picks the rows of the grid?

  • columns
  • index
  • values
  • aggfunc

Answer: index. index chooses which column's categories become the rows.

What does the values argument specify?

  • The row labels
  • The column labels
  • The numeric column to aggregate into each cell
  • The sort order

Answer: The numeric column to aggregate into each cell. values is the numeric column summarised in each cell.

What is the default aggfunc of pivot_table?

  • mean
  • sum
  • count
  • max

Answer: mean. pivot_table aggregates with mean by default.

Which aggfunc gives the average score per group?

  • 'sum'
  • 'count'
  • 'max'
  • 'mean'

Answer: 'mean'. aggfunc='mean' computes the average of the values in each cell.

What does margins=True add?

  • An 'All' row and column of grand totals
  • A sort
  • A new index
  • Random noise

Answer: An 'All' row and column of grand totals. margins=True appends grand-total row and column (labelled 'All' by default).

How do you compute several statistics at once in pivot_table?

  • Call it many times
  • sum
  • mean
  • count

Answer: sum. A list of functions in aggfunc returns multiple summary columns.

Why does df.pivot raise an error on duplicate index/column pairs?

  • It only reshapes and can't decide which value to keep
  • It needs an index
  • It needs more memory
  • It is deprecated

Answer: It only reshapes and can't decide which value to keep. pivot can't aggregate, so duplicate cells are ambiguous and raise ValueError.

Which function is the shortcut for frequency (count) tables?

  • df.pivot()
  • pd.crosstab()
  • df.merge()
  • df.stack()

Answer: pd.crosstab(). pd.crosstab counts how often combinations of categories co-occur.

What does crosstab(a, b, normalize='index') produce?

  • Raw counts
  • A sum
  • Each row as proportions that sum to 1
  • A pivot with totals

Answer: Each row as proportions that sum to 1. normalize='index' converts each row's counts into proportions.

If df.pivot raises 'Index contains duplicate entries', what should you switch to?

  • crosstab without values
  • merge
  • concat
  • pivot_table with an aggfunc

Answer: pivot_table with an aggfunc. pivot_table plus an aggfunc decides how to combine the duplicate cells.