tidyr & Pipes

tidyr is a tidyverse package for reshaping data into "tidy" form (one variable per column, one observation per row), which is the shape that dplyr, ggplot2, and most modeling functions expect.

Part of the free R course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

By the end of this lesson you'll reshape data between wide and long with pivot_longer() and pivot_wider(), clean it with separate() and drop_na(), and chain tidyr and dplyr together with the pipe.

What You'll Learn in This Lesson

1️⃣ Wide to Long with pivot_longer()

When several columns really represent the same variable (subjects, months, quarters), pivot_longer() stacks them into two tidy columns: one for the old names, one for the values.
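
As a minimal sketch of the wide-to-long step, assume a small table with one column per subject. The data frame `scores_wide` and its column names are hypothetical, made up for illustration only.

```r
library(tidyr)

# Hypothetical wide data: one column per subject
scores_wide <- tibble::tibble(
  student = c("Ana", "Ben"),
  math    = c(90, 72),
  art     = c(85, 88)
)

scores_long <- scores_wide %>%
  pivot_longer(
    cols      = c(math, art),  # the columns to stack
    names_to  = "subject",     # new column that holds the old column names
    values_to = "score"        # new column that holds the cell values
  )

print(scores_long)
```

Each student now appears once per subject, so the two-row table becomes four rows with the columns `student`, `subject` and `score`.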

2️⃣ Long to Wide with pivot_wider()

pivot_wider() is the inverse: it spreads a name column and a value column out into multiple columns. It is handy for building presentation tables from tidy data.
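
The reverse step can be sketched the same way. The long table `scores_long` below is hypothetical example data, not part of the lesson's own dataset.

```r
library(tidyr)

# Hypothetical long data: one row per student and subject
scores_long <- tibble::tibble(
  student = c("Ana", "Ana", "Ben", "Ben"),
  subject = c("math", "art", "math", "art"),
  score   = c(90, 85, 72, 88)
)

scores_wide <- scores_long %>%
  pivot_wider(
    names_from  = subject,  # values here become the new column names
    values_from = score     # values here fill the new columns
  )

print(scores_wide)
```

The result has one row per student and one column for each distinct value of `subject`.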

3️⃣ Cleaning in a Pipeline

Because tidyr and dplyr functions both take a data frame as their first argument and return one, you can clean and reshape in a single piped flow. drop_na() removes rows that contain missing values, and separate() splits a packed column into several.
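
A cleaning pipeline along these lines might look like the sketch below. The `cities` table and its values are hypothetical. Note that recent tidyr releases mark separate() as superseded in favour of functions such as separate_wider_delim(), although separate() still works and is the function this lesson covers.

```r
library(tidyr)
library(dplyr)

# Hypothetical data: a packed "city_country" column and some missing values
cities <- tibble::tibble(
  place = c("Paris_FR", "Lima_PE", NA, "Oslo_NO"),
  temp  = c(18, NA, 25, 9)
)

clean <- cities %>%
  drop_na() %>%                                                 # keep only complete rows
  separate(place, into = c("city", "country"), sep = "_") %>%  # split on the underscore
  arrange(temp)                                                 # dplyr step: sort by temperature

print(clean)
```

Only the Paris and Oslo rows survive drop_na(), because the other two rows each contain a missing value.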

Write the pipeline from the outline, run it, and check it against the example output. Reshaping then filtering is the everyday tidyverse rhythm.
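
One possible shape for a reshape-then-filter pipeline is sketched below. It is an illustration under assumed data, not the exercise's official solution: the `sales_wide` table, its column names and the threshold of 100 are all hypothetical.

```r
library(tidyr)
library(dplyr)

# Hypothetical wide data: one column per month
sales_wide <- tibble::tibble(
  store = c("North", "South"),
  jan   = c(120, 80),
  feb   = c(95, 140)
)

big_months <- sales_wide %>%
  pivot_longer(
    cols      = c(jan, feb),
    names_to  = "month",
    values_to = "sales"
  ) %>%
  filter(sales > 100)  # keep only store-months above the assumed threshold

print(big_months)
```

Filtering after the reshape means a single condition on `sales` covers every month, instead of one condition per month column.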

📋 Quick Reference — tidyr

Practice quiz

Which package provides pivot_longer() and pivot_wider()?

base R
tidyr
ggplot2
stringr

Answer: tidyr. pivot_longer() and pivot_wider() come from the tidyr package.

What does pivot_longer() do?

Deletes columns
Sorts rows
Stacks several columns into longer, tidy form
Spreads one column into many

Answer: Stacks several columns into longer, tidy form. pivot_longer() reshapes wide data into long, tidy form.

In pivot_longer(), what does names_to specify?

The values column
The rows to keep
The file name
The new column that holds the old column names

Answer: The new column that holds the old column names. names_to names the new column receiving the former column names.

What does pivot_wider() do?

Spreads a name column and value column into multiple columns
Stacks columns into two
Filters rows
Removes NA

Answer: Spreads a name column and value column into multiple columns. pivot_wider() is the inverse of pivot_longer(): long to wide.

Which arguments does pivot_wider() use to spread data?

cols and into
names_from and values_from
sep and remove
by and on

Answer: names_from and values_from. pivot_wider() takes names_from and values_from.

What does separate() do?

Drops rows
Renames a frame
Splits one column into several using a separator
Joins two frames

Answer: Splits one column into several using a separator. separate() splits a packed column (e.g. 'Paris_FR') into multiple columns.

What does drop_na() do?

Fills NA with 0
Replaces NA with the mean
Counts NA values
Removes rows that contain missing values

Answer: Removes rows that contain missing values. drop_na() removes rows with NA, optionally only in named columns.

What does 'tidy data' mean?

Each variable a column, each observation a row
All numeric columns
No missing values allowed
Data stored in wide form

Answer: Each variable a column, each observation a row. Tidy data: one variable per column, one observation per row.

How do tidyr and dplyr work together cleanly?

They cannot be combined
Only via for-loops
Through the pipe, passing a data frame between steps
By exporting to CSV

Answer: Through the pipe, passing a data frame between steps. Each function takes a data frame as its first argument and returns one, so the pipe can chain them.

tidyr functions like pivot_longer() return what kind of object?

A matrix
A tibble
A plain vector
A list of lists

Answer: A tibble. pivot_longer() returns a tibble, which prints with each column's type shown under its name.