Reshaping: pivot_longer & pivot_wider

Reshaping is the act of moving data between a wide layout (a variable spread across many columns) and a long, tidy layout (one row per observation) — the form that dplyr and ggplot2 expect.

Learn Reshaping: pivot_longer & pivot_wider in our free R course — a beginner-friendly interactive lesson with worked examples, a practice exercise and a…

Part of the free R course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

In this lesson you'll use tidyr::pivot_longer() to gather wide columns into tidy long form, pivot_wider() to spread it back out, and you'll see exactly why long data unlocks grouped summaries and clean plots.

What You'll Learn in This Lesson

1️⃣ Wide Data & pivot_longer()

Look at a typical wide table: a variable — here, the year — is hidden in the column headers. That's fine to read, but you can't group_by(year) a value that isn't a column.

pivot_longer() fixes this by stacking the chosen columns into two new ones: names_to captures the old headers, values_to captures the cells.

Listing columns by hand gets tedious, so use selection helpers and let names_transform fix the type of the gathered names in one step.

2️⃣ pivot_wider(): Long → Wide

The inverse operation spreads a long table back out. You tell it which column supplies the new headers ( names_from ) and which fills the cells ( values_from ). This is how you build human-friendly report tables, or place two measures side by side to compute a ratio.

3️⃣ Why Tidy/Long Data Wins

The payoff for reshaping is that the rest of the tidyverse just works. Grouped summaries with dplyr and layered plots with ggplot2 both assume every variable is its own column — which is precisely what long form guarantees.

A useful rule of thumb: pivot longer to compute and plot; pivot wider to present. Most analyses spend their middle in long form and only widen at the very end for a readable table.

Your turn. Fill in the # TODO blanks and confirm you can round-trip the data.

This one chains reshaping with a grouped summary — the everyday pattern in real analysis. Tidy the data, summarise it, then optionally widen the result for presentation.

📋 Quick Reference — Reshaping

Practice quiz

Which tidyr function turns wide data into long (tidy) form?

  • pivot_wider()
  • pivot_longer()
  • spread()
  • separate()

Answer: pivot_longer(). pivot_longer() gathers many columns into name/value pairs, producing long data.

In pivot_longer(), which argument names the new column that holds the old column HEADERS?

  • names_to
  • values_to
  • names_from
  • cols

Answer: names_to. names_to receives the old header names; values_to receives the cell values.

In pivot_longer(), which argument names the column that holds the CELL values?

  • names_to
  • cols
  • values_to
  • values_from

Answer: values_to. values_to is the new column that captures the gathered cell values.

Which pivot_wider() argument supplies the values that fill the new cells?

  • names_from
  • values_to
  • cols
  • values_from

Answer: values_from. values_from names the column whose values populate the widened cells.

Which pivot_wider() argument supplies the new column HEADERS?

  • names_from
  • names_to
  • values_from
  • id_cols

Answer: names_from. names_from gives the column whose values become the new column names.

To gather every column except the id column 'company', what cols selection works?

  • cols = company
  • cols = -company
  • cols = all()
  • cols = only(company)

Answer: cols = -company. cols = -company selects everything except company.

Which selection helper gathers all columns whose names start with '20'?

  • ends_with("20")
  • contains("20")
  • starts_with("20")
  • matches(20)

Answer: starts_with("20"). starts_with("20") matches column names beginning with 20.

Why might pivot_wider() return a list-column and warn?

  • The data has too few rows
  • Each cell is not uniquely identified, so duplicates collide
  • names_from was numeric
  • You forgot library(tidyr)

Answer: Each cell is not uniquely identified, so duplicates collide. Duplicate id combinations map to the same cell; summarise() first to get one value per cell.

Which older tidyr functions did pivot_longer() and pivot_wider() supersede?

  • melt() and dcast()
  • stack() and unstack()
  • gather() and spread()
  • cast() and reshape()

Answer: gather() and spread(). pivot_longer/pivot_wider replaced the retired gather()/spread() pair.

What does names_transform = list(year = as.integer) do during pivot_longer()?

  • Renames the year column
  • Converts the gathered year names to integers
  • Drops missing years
  • Sorts by year

Answer: Converts the gathered year names to integers. names_transform applies a function to convert the gathered name column's type.