dplyr Basics

dplyr is a tidyverse package for transforming data frames with a small set of readable verbs — filter, select, mutate, and arrange — that you chain together with the pipe.

Part of the free R course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

By the end of this lesson you'll filter rows, select and add columns, sort data, and combine these verbs into a single readable pipeline with the pipe operator.

What You'll Learn in This Lesson

1️⃣ filter() and select()

filter() keeps rows matching a condition; select() keeps columns. Inside dplyr verbs you refer to columns by bare name — no df$ needed.
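As a minimal sketch of these two verbs, the example below uses a small made-up data frame. The name staff, its columns and its values are assumptions for illustration and are not part of the lesson's data.

```r
library(dplyr)

# Hypothetical example data (not from the lesson)
staff <- data.frame(
  name   = c("Ana", "Ben", "Cal", "Dee"),
  dept   = c("eng", "ops", "eng", "ops"),
  salary = c(120, 90, 105, 95)
)

# filter(): keep only the rows where dept equals "eng"
eng_staff <- filter(staff, dept == "eng")

# select(): keep only the name and salary columns
pay <- select(staff, name, salary)

print(eng_staff)
print(pay)
```

Note that dept, name and salary are written as bare names inside the verbs; staff$dept is not needed.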

2️⃣ mutate() and arrange()

mutate() adds new columns computed from existing ones; arrange() sorts rows, with desc() for descending order.
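A short sketch of both verbs, again on a made-up staff data frame. The data, the column name monthly and the divide-by-12 calculation are assumptions chosen only to illustrate the idea.

```r
library(dplyr)

# Hypothetical example data (not from the lesson)
staff <- data.frame(
  name   = c("Ana", "Ben", "Cal", "Dee"),
  dept   = c("eng", "ops", "eng", "ops"),
  salary = c(120, 90, 105, 95)
)

# mutate(): add a column computed from an existing one
with_monthly <- mutate(staff, monthly = salary / 12)

# arrange(): sort rows by salary, highest first
by_pay <- arrange(staff, desc(salary))

print(with_monthly)
print(by_pay)
```

Neither call changes staff itself; each verb returns a new data frame, so assign the result if you want to keep it.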

3️⃣ Chaining with the Pipe

The real power comes from chaining. The pipe |> passes each step's result into the next step as its first argument, so a multi-step transformation reads top to bottom like a sentence.
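One way such a pipeline might look, assuming R 4.1 or later for the native pipe and the same made-up staff data as a stand-in for real data. The bonus column and its 10% rate are hypothetical.

```r
library(dplyr)

# Hypothetical example data (not from the lesson)
staff <- data.frame(
  name   = c("Ana", "Ben", "Cal", "Dee"),
  dept   = c("eng", "ops", "eng", "ops"),
  salary = c(120, 90, 105, 95)
)

result <- staff |>
  filter(dept == "eng") |>          # keep engineering rows
  mutate(bonus = salary * 0.1) |>   # add an illustrative bonus column
  select(name, salary, bonus) |>    # keep three columns
  arrange(desc(salary))             # highest salary first

print(result)
```

Without the pipe, the same steps would be nested inside one another and read inside out, which is the main reason the chained form is usually preferred.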

📋 Quick Reference — dplyr Verbs
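The verbs covered in this lesson can be summarised in one runnable sketch. The data frame df and its columns x and y are placeholders invented for this summary, not the lesson's own reference table.

```r
library(dplyr)

# Hypothetical placeholder data
df <- data.frame(x = c(3, 1, 2), y = c("a", "b", "c"))

filter(df, x > 1)       # keep rows where the condition is TRUE
select(df, x)           # keep the named columns
mutate(df, z = x * 2)   # add or change a column
arrange(df, x)          # sort rows in ascending order
arrange(df, desc(x))    # sort rows in descending order
df |> filter(x > 1)     # pipe: same as filter(df, x > 1)
```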

Practice quiz

Which dplyr verb keeps only the ROWS that match a condition?

  • select()
  • filter()
  • mutate()
  • arrange()

Answer: filter(). filter() keeps rows matching a condition; select() chooses columns.

Which verb chooses (keeps or drops) COLUMNS?

  • select()
  • filter()
  • summarise()
  • arrange()

Answer: select(). select() picks columns; filter() picks rows.

Which verb ADDS or changes a column computed from existing ones?

  • arrange()
  • filter()
  • mutate()
  • select()

Answer: mutate(). mutate() creates or modifies columns.

Which verb SORTS the rows of a data frame?

  • group_by()
  • select()
  • filter()
  • arrange()

Answer: arrange(). arrange() sorts rows; wrap a column in desc() for descending order.

How do you sort in DESCENDING order with arrange()?

  • arrange(desc(x))
  • arrange(-sort(x))
  • arrange(x, rev = TRUE)
  • arrange(descending(x))

Answer: arrange(desc(x)). arrange(df, desc(x)) sorts by x from high to low.

What does the pipe in df |> filter(x > 1) do?

  • Comments out the filter call
  • Passes df as the first argument to filter()
  • Runs filter() before df is created
  • Compares df with the filter result

Answer: Passes df as the first argument to filter(). The pipe feeds the left-hand value as the first argument to the right-hand function.

Inside a dplyr verb, how do you refer to a column named salary?

  • By the bare name salary
  • Always as df$salary
  • As "salary" in quotes
  • As .salary

Answer: By the bare name salary. dplyr verbs let you use bare column names; no df$ prefix is needed.

Which function loads dplyr at the start of a session?

  • import(dplyr)
  • require.dplyr()
  • library(dplyr)
  • use(dplyr)

Answer: library(dplyr). library(dplyr) attaches the package so its verbs are available.

Inside filter(), how do you test that dept equals "eng"?

  • dept = "eng"
  • dept == "eng"
  • dept -> "eng"
  • dept := "eng"

Answer: dept == "eng". Use == to test equality; a single = inside filter() is read as a named argument rather than a comparison, so dplyr reports an error.

Which native pipe operator works in base R 4.1+ without any package?

  • %>%
  • ->>
  • |>
  • %|%

Answer: |>. |> is the native pipe in R 4.1+; %>% comes from the magrittr package, which dplyr re-exports.