tidyr & Pipes
tidyr is a tidyverse package for reshaping data into "tidy" form (one variable per column, one observation per row), the shape that dplyr, ggplot2, and most modeling functions expect.
Part of the free R course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.
By the end of this lesson you'll reshape data between wide and long with pivot_longer() and pivot_wider(), clean it with separate() and drop_na(), and chain tidyr and dplyr together with the pipe.
What You'll Learn in This Lesson
1️⃣ Wide to Long with pivot_longer()
When several columns really represent the same variable (subjects, months, quarters), pivot_longer() stacks them into two tidy columns: one for the old names, one for the values.
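As a minimal sketch of that stacking step, the example below uses a made-up `scores` data frame (the data and the column names `subject` and `score` are assumptions chosen for illustration, not part of the lesson's own dataset).

```r
library(tidyr)

# Hypothetical wide data: one column per subject
scores <- data.frame(
  student = c("Ana", "Ben"),
  math    = c(90, 72),
  art     = c(85, 95)
)

long <- pivot_longer(
  scores,
  cols      = c(math, art),  # the columns to stack
  names_to  = "subject",     # new column that holds the old column names
  values_to = "score"        # new column that holds the cell values
)

# One row per student-subject pair: 2 students x 2 subjects = 4 rows
print(long)
```

Columns not listed in `cols` (here `student`) are kept and repeated for each stacked row.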
2️⃣ Long to Wide with pivot_wider()
pivot_wider() is the inverse: it spreads a name column and a value column out into multiple columns, which is handy for building presentation tables from tidy data.
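The reverse direction might look like the sketch below. The `long` data frame is hypothetical, built by hand so the example runs on its own.

```r
library(tidyr)

# Hypothetical long (tidy) data: one row per student-subject pair
long <- data.frame(
  student = c("Ana", "Ana", "Ben", "Ben"),
  subject = c("math", "art", "math", "art"),
  score   = c(90, 85, 72, 95)
)

wide <- pivot_wider(
  long,
  names_from  = subject,  # values of this column become new column names
  values_from = score     # values of this column fill the new columns
)

# One row per student, with a math column and an art column
print(wide)
```

If a student had no row for some subject, the corresponding cell would be filled with NA.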
3️⃣ Cleaning in a Pipeline
Because tidyr and dplyr functions both take a data frame as their first argument and return one, the pipe lets you clean and reshape in one flow. drop_na() removes rows that contain missing values, and separate() splits a packed column into several.
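One way such a pipeline could look is sketched below. The `visits` data, its column names, and the threshold of 75 are all assumptions made up for this illustration.

```r
library(tidyr)
library(dplyr)

# Hypothetical data: city and country packed into one column, one NA in q1
visits <- data.frame(
  place = c("Paris_FR", "Lyon_FR", "Rome_IT", "Milan_IT"),
  q1    = c(120, NA, 80, 60),
  q2    = c(150, 40, 95, 70)
)

result <- visits %>%
  drop_na() %>%                                                # drop rows with any NA (Lyon)
  separate(place, into = c("city", "country"), sep = "_") %>%  # "Paris_FR" -> "Paris", "FR"
  pivot_longer(c(q1, q2),
               names_to = "quarter", values_to = "visitors") %>%  # wide -> long
  filter(visitors > 75) %>%                                    # dplyr: keep busy quarters
  arrange(desc(visitors))                                      # dplyr: largest first

print(result)
```

To drop rows only when a particular column is missing, name it, as in `drop_na(q1)`. In R 4.1 and later the native pipe `|>` can stand in for `%>%` here. Note that tidyr 1.3.0 marks separate() as superseded in favor of separate_wider_delim(); it still works, which is why the sketch keeps the function this lesson teaches.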
Your turn. Fill in the # TODO blank, run it, and compare with the expected output.
Write the pipeline from the outline, run it, and check it against the example output. Reshaping then filtering is the everyday tidyverse rhythm.
📋 Quick Reference — tidyr
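The functions covered in this lesson, sketched as one-liners. The tiny `df` data frame and all of its column names are hypothetical, chosen only so each call runs.

```r
library(tidyr)

# Hypothetical data: a packed id column and one missing value
df <- data.frame(id = c("a_1", "b_2"), x = c(1, NA), y = c(3, 4))

# Wide -> long: stack x and y into name/value columns
long <- pivot_longer(df, cols = c(x, y), names_to = "name", values_to = "value")

# Long -> wide: spread name/value back out into columns
wide <- pivot_wider(long, names_from = name, values_from = value)

# Split one column into several on a separator
parts <- separate(df, id, into = c("letter", "number"), sep = "_")

# Remove rows with NA in any column, or only in the named column
full <- drop_na(df)
some <- drop_na(df, x)
```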
Practice quiz
Which package provides pivot_longer() and pivot_wider()?
- base R
- tidyr
- ggplot2
- stringr
Answer: tidyr. pivot_longer() and pivot_wider() come from the tidyr package.
What does pivot_longer() do?
- Deletes columns
- Sorts rows
- Stacks several columns into longer, tidy form
- Spreads one column into many
Answer: Stacks several columns into longer, tidy form. pivot_longer() reshapes wide data into long, tidy form.
In pivot_longer(), what does names_to specify?
- The values column
- The rows to keep
- The file name
- The new column that holds the old column names
Answer: The new column that holds the old column names. names_to names the new column receiving the former column names.
What does pivot_wider() do?
- Spreads a name column and value column into multiple columns
- Stacks columns into two
- Filters rows
- Removes NA
Answer: Spreads a name column and value column into multiple columns. pivot_wider() is the inverse of pivot_longer(): long to wide.
Which arguments does pivot_wider() use to spread data?
- cols and into
- names_from and values_from
- sep and remove
- by and on
Answer: names_from and values_from. pivot_wider() takes names_from and values_from.
What does separate() do?
- Drops rows
- Renames a frame
- Splits one column into several using a separator
- Joins two frames
Answer: Splits one column into several using a separator. separate() splits a packed column (e.g. 'Paris_FR') into multiple columns.
What does drop_na() do?
- Fills NA with 0
- Replaces NA with the mean
- Counts NA values
- Removes rows that contain missing values
Answer: Removes rows that contain missing values. drop_na() removes rows with NA, optionally only in named columns.
What does 'tidy data' mean?
- Each variable a column, each observation a row
- All numeric columns
- No missing values allowed
- Data stored in wide form
Answer: Each variable a column, each observation a row. Tidy data: one variable per column, one observation per row.
How do tidyr and dplyr work together cleanly?
- They cannot be combined
- Only via for-loops
- Through the pipe, passing a data frame between steps
- By exporting to CSV
Answer: Through the pipe, passing a data frame between steps. Each takes a data frame first and returns one, so the pipe chains them.
tidyr functions like pivot_longer() return what kind of object?
- A matrix
- A tibble
- A plain vector
- A list of lists
Answer: A tibble. pivot_longer() returns a tibble, which prints each column's type beneath its name.