Statistical Functions

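Statistical functions are R's built-in toolkit for describing and testing data: means, spread, correlation, distributions, and hypothesis tests. All of them ship with a standard R installation (most live in the stats package, which loads automatically), so there is nothing extra to install.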

Part of the free R course at LearnCodingFast — hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.

By the end of this lesson you'll compute summary statistics, measure correlation between variables, run a t-test and read its p-value, and generate reproducible random numbers.

What You'll Learn in This Lesson

1️⃣ Summary Statistics

The core descriptors: mean() and median() for the center, sd() and var() for spread, and quantile() for percentiles.
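
As a minimal sketch of these descriptors in action, the snippet below applies each one to a small vector. The `scores` values are made up for illustration and are not part of the lesson's data.

```r
# Illustrative data: seven made-up exam scores
scores <- c(62, 70, 74, 78, 81, 85, 96)

mean(scores)                     # arithmetic average: 78
median(scores)                   # middle value once sorted: 78
var(scores)                      # sample variance (divides by n - 1)
sd(scores)                       # sample standard deviation, the square root of var()
quantile(scores, c(0.25, 0.75))  # 25th and 75th percentiles: 72 and 83
```

The mean and median agree here because the scores are roughly symmetric; with a skewed vector they would drift apart.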

2️⃣ Correlation, summary(), and Categories

cor() measures how strongly two variables move together, summary() prints a six-number overview (minimum, first quartile, median, mean, third quartile, maximum), and table() / prop.table() count categories and convert those counts to proportions.
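
The following sketch shows the three tools side by side. The vectors `hours`, `score`, and `grade` are invented for illustration only.

```r
# Illustrative data: hours studied and exam score for six made-up students
hours <- c(1, 2, 3, 4, 5, 6)
score <- c(52, 58, 61, 70, 74, 83)

cor(hours, score)   # Pearson correlation by default; close to 1 for this data
summary(score)      # Min., 1st Qu., Median, Mean, 3rd Qu., Max.

# Categorical data: count each level, then convert the counts to proportions
grade <- c("A", "B", "B", "C", "B", "A")
counts <- table(grade)
counts              # A: 2, B: 3, C: 1
prop.table(counts)  # shares of the total, which sum to 1
```

Note that prop.table() takes the table of counts, not the raw vector, so the two calls are usually chained in that order.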

3️⃣ Hypothesis Tests and Random Numbers

t.test() compares the means of two groups and reports a p-value; by convention, a value below 0.05 is called statistically significant. rnorm() draws random values from a normal distribution, and set.seed() makes those draws reproducible.
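
Here is a short sketch of both ideas. The two groups are made-up measurements, and the seed value 42 is an arbitrary choice; any fixed integer would work the same way.

```r
# Two small made-up groups (illustrative values only)
group_a <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4)
group_b <- c(6.2, 6.8, 6.5, 7.1, 6.4, 6.9)

result <- t.test(group_a, group_b)  # Welch two-sample t-test by default
result$p.value                      # well below 0.05 for these values

# Resetting the same seed replays the same "random" draws
set.seed(42)
first_draw <- rnorm(5, mean = 100, sd = 15)
set.seed(42)
second_draw <- rnorm(5, mean = 100, sd = 15)
identical(first_draw, second_draw)  # TRUE
```

Printing `result` on its own shows the full report (t statistic, degrees of freedom, confidence interval); `$p.value` extracts just the number.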

Your turn. Fill in the # TODO blank, run it, and compare with the expected output.

Write it from the outline, run it, and check it against the example output. Computing several descriptors at once is exactly how you'd profile a new variable.
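
One possible way to compute several descriptors at once is a small helper function. This is only a sketch and not the lesson's official solution: `profile_variable()` is a hypothetical name, and the input values are made up.

```r
# profile_variable() is a hypothetical helper, not a base R function
profile_variable <- function(x) {
  c(
    n      = sum(!is.na(x)),           # number of non-missing values
    mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    min    = min(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE)
  )
}

# Made-up values with one missing entry
profile <- profile_variable(c(12, 15, 11, 19, 14, NA, 13))
profile  # n = 6, mean = 14, median = 13.5, min = 11, max = 19, plus the sd
```

Returning a named vector keeps the output compact and lets you pull out a single value with, for example, `profile[["mean"]]`.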

📋 Quick Reference — Statistics

Practice quiz

Which function computes the average of a numeric vector?

  • median()
  • mean()
  • sum()
  • mode()

Answer: mean(). mean() returns the arithmetic average. Note that R's mode() reports an object's storage type, not its most frequent value.

Which function measures spread as the standard deviation?

  • sd()
  • range()
  • var()
  • IQR()

Answer: sd(). sd() returns the standard deviation; var() returns the variance.
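
The relationship between the two can be checked directly, as in this brief sketch with made-up values.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)  # made-up values

var(x)                          # sample variance, in squared units
sd(x)                           # standard deviation, in the original units
all.equal(sd(x), sqrt(var(x)))  # TRUE: sd() is the square root of var()
```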

What does cor(x, y) return for two numeric vectors?

  • Their covariance
  • Their sum
  • A correlation between -1 and 1
  • A p-value

Answer: A correlation between -1 and 1. cor() measures linear association on a -1 to 1 scale.
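
To see why correlation and covariance are different answers, the sketch below computes both on made-up data. The rescaling by 10 is only an illustration of a change of units.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)  # made-up values

cov(x, y)  # covariance: 1.5, and it depends on the units of x and y
cor(x, y)  # correlation: unit-free, always between -1 and 1

# Correlation is covariance divided by both standard deviations
all.equal(cor(x, y), cov(x, y) / (sd(x) * sd(y)))  # TRUE

cov(x, 10 * y)  # rescaling y multiplies the covariance by 10
cor(x, 10 * y)  # but leaves the correlation unchanged
```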

Which function counts occurrences of each category?

  • mean()
  • summary()
  • quantile()
  • table()

Answer: table(). table() tallies counts; prop.table() turns them into proportions.

In a t.test() result, what does a p-value below 0.05 conventionally indicate?

  • Statistically significant evidence of a difference
  • No difference at all
  • A coding error
  • A large effect size

Answer: Statistically significant evidence of a difference. Below 0.05 is treated as significant, though it says nothing about effect size.
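
A simulation can illustrate the gap between significance and effect size. This is a sketch under stated assumptions: the data are simulated, the seed is arbitrary, and the standardized difference (Cohen's d) is introduced here only as one common way to express effect size; it is not covered in the lesson.

```r
set.seed(1)  # arbitrary seed so the simulation repeats
n <- 100000
a <- rnorm(n, mean = 100.0, sd = 10)
b <- rnorm(n, mean = 100.5, sd = 10)  # true gap: 0.05 standard deviations

p_value <- t.test(a, b)$p.value

# Standardized mean difference (Cohen's d) using the pooled standard deviation
cohens_d <- (mean(b) - mean(a)) / sqrt((var(a) + var(b)) / 2)

p_value < 0.05  # TRUE is expected with this much data
cohens_d        # roughly 0.05, far below the 0.2 often labeled "small"
```

With very large samples even a tiny difference produces a small p-value, which is why the two numbers are best reported together.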

Which function makes random number generation reproducible?

  • rnorm()
  • set.seed()
  • runif()
  • sample()

Answer: set.seed(). set.seed() fixes the random sequence so results repeat.

How do you compute the mean when the vector contains NA?

  • mean(x, skip = TRUE)
  • mean(na.omit)
  • mean(x, na.rm = TRUE)
  • mean(x, drop = NA)

Answer: mean(x, na.rm = TRUE). na.rm = TRUE drops NA values before computing the mean.
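
The effect of na.rm is easy to see on a small made-up vector with one missing entry.

```r
x <- c(4, 8, NA, 6)  # made-up values with one missing entry

mean(x)                # NA: a single missing value makes the result unknown
mean(x, na.rm = TRUE)  # 6: NA is dropped first, leaving (4 + 8 + 6) / 3
sd(x, na.rm = TRUE)    # 2: median(), var(), and sum() accept na.rm as well
```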

Which function returns the value at a given percentile?

  • quantile()
  • median()
  • var()
  • cor()

Answer: quantile(). quantile(x, 0.25) returns the 25th percentile.

What does summary() print for a numeric vector?

  • Only the mean
  • The correlation matrix
  • Min, quartiles, median, mean, and max
  • A p-value

Answer: Min, quartiles, median, mean, and max. summary() gives a six-number overview of the distribution.

Which function draws random values from a normal distribution?

  • sample()
  • runif()
  • table()
  • rnorm()

Answer: rnorm(). rnorm() draws from a normal distribution given a mean and sd.