Factors

A factor is R's data structure for categorical data — values drawn from a fixed set of categories called levels, like sizes, survey answers, or country names.

Part of the free R course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

By the end of this lesson you'll create factors, inspect and count their levels with table(), build ordered factors for ranked categories, and avoid the classic mistake of converting a factor straight to numbers.

What You'll Learn in This Lesson

1️⃣ Creating and Counting Factors

Wrap a character vector in factor() to turn it into categorical data. R finds the distinct values (the levels) and stores each entry as a reference to a level. table() gives you an instant frequency count.
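A minimal sketch of these steps, assuming a made-up vector of T-shirt sizes (the names sizes and f are illustrative only):

```r
# Hypothetical data: a character vector of T-shirt sizes
sizes <- c("small", "large", "medium", "small", "large", "small")

# Convert to a factor; R collects the distinct values as the levels
f <- factor(sizes)

levels(f)    # "large" "medium" "small"
nlevels(f)   # 3
table(f)     # large 2, medium 1, small 3
```

The levels come back sorted rather than in the order they were typed, which is why "large" is listed first.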

2️⃣ Ordered Factors

When categories have a natural ranking, pass ordered = TRUE and list the levels from lowest to highest. Now comparisons and min() / max() respect that order.

Notice the printed levels now show small < medium < large. R understands the ranking, so a comparison such as sf > "small" (with sf being the ordered factor) is meaningful.
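One way this could look, as a sketch with made-up size data. The variable name sf follows the lesson text; the values are assumptions:

```r
# Hypothetical data, with the ranking listed from lowest to highest
sizes <- c("medium", "small", "large", "small")
sf <- factor(sizes, levels = c("small", "medium", "large"), ordered = TRUE)

sf              # Levels: small < medium < large
sf > "small"    # TRUE FALSE TRUE FALSE
min(sf)         # small
max(sf)         # large
```

Without ordered = TRUE, the comparison would return NA with a warning and min() / max() would stop with an error, because R has no ranking to work from.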

3️⃣ Factors Are Integers Underneath

Internally a factor maps each level to an integer code. This makes factors efficient but causes a famous trap: converting a factor of number-like labels with as.integer() gives you the codes, not the labels. Always go via as.character() first.
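The trap is easiest to see side by side. A small sketch, assuming a made-up factor whose labels happen to look like numbers:

```r
# Hypothetical factor with number-like labels
f <- factor(c("10", "30", "20", "10"))

typeof(f)                      # "integer": the codes are what is stored
levels(f)                      # "10" "20" "30"
as.integer(f)                  # 1 3 2 1      (level codes, not the labels)
as.numeric(as.character(f))    # 10 30 20 10  (the values you wanted)
```

Each code is simply the position of that entry's label within levels(f), which is why "30" comes back as 3.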

Ordered factors make "agree or stronger" a one-line comparison.
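As an illustration of that one-liner, here is a sketch that assumes a five-point survey scale. The labels, the data, and the names scale_levels, answers and agree_or_stronger are all made up for this example:

```r
# Hypothetical survey scale, listed from weakest to strongest agreement
scale_levels <- c("strongly disagree", "disagree", "neutral",
                  "agree", "strongly agree")

# Hypothetical responses stored as an ordered factor
answers <- factor(
  c("agree", "neutral", "strongly agree", "disagree", "agree"),
  levels = scale_levels,
  ordered = TRUE
)

# The one-line comparison: TRUE wherever the answer is "agree" or stronger
agree_or_stronger <- answers >= "agree"

agree_or_stronger        # TRUE FALSE TRUE FALSE TRUE
sum(agree_or_stronger)   # 3
```

Because the ranking lives in the levels, the same comparison keeps working if more responses are added later.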

📋 Quick Reference — Factors

Practice quiz

What kind of data is a factor designed to store?

  • Categorical data from a fixed set of levels
  • Only continuous numbers
  • Dates and times
  • Raw binary data

Answer: Categorical data from a fixed set of levels. A factor stores categorical data drawn from a fixed set of categories called levels.

Which function creates a factor from a character vector?

  • as.category()
  • factor()
  • levels()
  • table()

Answer: factor(). factor() turns a vector into categorical data.

Which function shows the distinct categories of a factor?

  • table()
  • summary()
  • levels()
  • names()

Answer: levels(). levels(f) returns the factor's distinct categories.

Which function gives a frequency COUNT of each level?

  • count()
  • length()
  • factor()
  • table()

Answer: table(). table(f) counts how many entries fall in each level.

How do you create an ORDERED factor?

  • factor(x, ordered = TRUE)
  • ordered.factor(x)
  • factor(x, rank = TRUE)
  • sort(factor(x))

Answer: factor(x, ordered = TRUE). Pass ordered = TRUE (and set levels low to high) to make an ordered factor.

By default, in what order are a factor's levels arranged?

  • Order of first appearance
  • Alphabetical order
  • Reverse order
  • Random order

Answer: Alphabetical order. Without an explicit levels argument, the levels are the sorted unique values, which for character data means alphabetical order.
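A quick check of this default, using made-up fruit names. Passing levels = unique(x) is one way to get order of first appearance instead:

```r
x <- c("pear", "apple", "fig", "apple")   # made-up data

levels(factor(x))                         # "apple" "fig" "pear"
levels(factor(x, levels = unique(x)))     # "pear" "apple" "fig"
```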

How are factors stored internally?

  • As plain character strings
  • As dates
  • As integer codes plus a label per level
  • As raw bytes

Answer: As integer codes plus a label per level. A factor maps each level to an integer code, with the labels stored once.

Why can as.integer() on a factor of "10","20","30" give 1,2,3 instead of 10,20,30?

  • It rounds the values
  • It is a bug in R
  • It sorts them first
  • It returns the level codes, not the labels

Answer: It returns the level codes, not the labels. as.integer() returns the underlying level codes, not the numeric labels.

What is the safe way to convert a number-like factor back to numbers?

  • as.numeric(as.character(f))
  • as.numeric(f)
  • as.integer(f)
  • unclass(f)

Answer: as.numeric(as.character(f)). Go via character first: as.numeric(as.character(f)) recovers the original numbers.

Which function removes levels that have no data after filtering?

  • clean()
  • droplevels()
  • trim()
  • compact()

Answer: droplevels(). droplevels(f) drops categories that no longer appear in the data.
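To see why this matters, here is a short sketch with made-up colour data (the names f, kept and cleaned are illustrative):

```r
f <- factor(c("red", "green", "blue", "green"))   # made-up data

kept <- f[f != "blue"]    # filter out one category
levels(kept)              # "blue" "green" "red"  (the empty level is still listed)
table(kept)               # blue 0, green 2, red 1

cleaned <- droplevels(kept)
levels(cleaned)           # "green" "red"
```

Filtering keeps the full set of levels, so table() still reports a zero count for "blue" until droplevels() removes it.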