Regular Expressions (preg)
A regular expression is a tiny pattern language for finding, extracting, and replacing text. PHP exposes it through the preg_* family. Learn the core syntax and four key functions, and validating an email or pulling every link from a page becomes a one-liner.
Part of the free PHP course at LearnCodingFast: hands-on lessons with examples you run in your browser, plus practice exercises and a quick quiz.
What You'll Learn in This Lesson
1️⃣ Testing with preg_match
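preg_match($pattern, $subject, $matches) returns 1 if the pattern is found and 0 if not; it returns false only when an error occurs, such as an invalid pattern. The pattern is a string wrapped in delimiters, usually /.../. The building blocks: \d a digit, \w a word character (letter, digit, or underscore), \s whitespace, + one or more of the preceding item, ^ start of the string, $ end of the string, \b word boundary. A trailing i after the closing delimiter makes the match case-insensitive.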
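A minimal sketch of these building blocks in use. The sample strings and variable names are assumptions picked for illustration, not taken from the lesson's own example.

```php
<?php
// ^ and $ anchor the pattern, so the whole string must be one or more digits.
$pattern = '/^\d+$/';

$allDigits = preg_match($pattern, '2026');  // 1: every character is a digit
$mixed     = preg_match($pattern, '12a');   // 0: the trailing letter breaks the match

// \b marks a word boundary; the trailing i ignores case.
$found = preg_match('/\bphp\b/i', 'I like PHP a lot');  // 1

// Compare against 1 explicitly, because preg_match can also return false on error.
if ($allDigits === 1) {
    echo "digits only\n";
}
```

Writing patterns in single-quoted strings keeps backslashes such as \d intact without extra escaping.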
2️⃣ Capture Groups
Wrap part of a pattern in parentheses ( ... ) to capture it, that is, to pull that piece out of the match. Captures land in the matches array: index 0 is the whole match, and 1, 2, … are the groups, numbered left to right. Even better, name a group with (?<name>...) and read it as $m['name'], which is far clearer than counting parentheses.
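As a rough illustration, the sketch below captures the parts of a date first by number and then by name. The date string and the group names year and month are hypothetical.

```php
<?php
$date = '2026-05-17';  // hypothetical sample input

// Numbered groups: $m[0] is the whole match, $m[1]..$m[3] are the groups.
preg_match('/(\d{4})-(\d{2})-(\d{2})/', $date, $m);
echo $m[0], "\n";  // the full date
echo $m[1], "\n";  // the year

// Named groups: the same captures, read by name instead of by position.
preg_match('/(?<year>\d{4})-(?<month>\d{2})/', $date, $named);
echo $named['year'], '/', $named['month'], "\n";
```

Named groups are also stored under their numeric index, so $named[1] and $named['year'] hold the same value.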
3️⃣ Finding Every Match
preg_match_all doesn't stop at the first hit; it collects them all and returns the number of matches found. By default its results array is organised by group, so $m[0] holds every full match and $m[1] is an array of every first-group capture. When matching tags, prefer .*? over .*: the ? makes the quantifier lazy, matching as little as possible so each <a> tag is captured separately rather than as one giant greedy span.
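The difference between lazy and greedy matching is easiest to see side by side. This is a sketch under assumed input: the HTML snippet is made up, and the pattern expects href to be the only attribute.

```php
<?php
// Hypothetical snippet with two links.
$html = '<a href="/home">Home</a> <a href="/about">About</a>';

// Lazy: each (.*?) stops at the first place the rest of the pattern can match.
$count = preg_match_all('/<a href="(.*?)">(.*?)<\/a>/', $html, $m);
print_r($m[1]);  // every href, one entry per link
print_r($m[2]);  // every link text

// Greedy: (.*) runs on to the LAST "> it can reach, swallowing both tags.
preg_match_all('/<a href="(.*)">/', $html, $greedy);
echo $greedy[1][0], "\n";
```

Here the lazy version finds two matches, while the greedy version finds a single match that spans from the first href to the second.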
4️⃣ Replacing & Splitting
preg_replace($pattern, $replacement, $subject) swaps every match. In the replacement string, $1, $2, … are back-references to your capture groups, which is perfect for reformatting. And preg_split breaks a string wherever the pattern matches, ideal for splitting on "any run of spaces or commas".
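A short sketch of both functions. The name swap reuses the 'Smith, John' input from the quiz further down; the colour list is a made-up input.

```php
<?php
// Back-references: $2 and $1 swap the two captured words.
$swapped = preg_replace('/(\w+), (\w+)/', '$2 $1', 'Smith, John');
echo $swapped, "\n";

// Split on any run of whitespace and/or commas.
$parts = preg_split('/[\s,]+/', 'red, green  blue,yellow');
print_r($parts);
```

If the input might start or end with a separator, passing the PREG_SPLIT_NO_EMPTY flag as the fourth argument drops the empty pieces that would otherwise appear.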
These lines extract and print the year from a date, but they're scrambled. Put them in the order that prints 2026.
Set the subject string (B), optionally initialise $m (D), run preg_match, which fills $m (C), then echo the first capture group (A). You can't read $m[1] before the match populates it.
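The exercise's original lines are not reproduced here, so the following is only a plausible reconstruction of the correct order. The date string and the pattern are assumptions.

```php
<?php
$date = 'Released on 2026-01-15';     // B: set the subject string (hypothetical value)
$m = [];                              // D: optional, preg_match creates $m anyway
preg_match('/(\d{4})/', $date, $m);   // C: run the match, which fills $m
echo $m[1];                           // A: first capture group, prints 2026
```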
📋 Quick Reference — preg & Syntax
No code is filled in this time — just a brief and an outline. Write it yourself, run it on onecompiler.com/php or your own machine, then check your result against the expected output in the comments.
Practice quiz
What does echo preg_match('/^\d+$/', '12a'); print?
- 1
- false
- 0
- 3
Answer: 0. The ^ and $ anchors require the whole string to be digits, but '12a' ends in a letter, so there is no match (0).
What does echo preg_replace('/[aeiou]/', '*', 'regex'); print?
- r*g*x
- *****
- regex
- rgx
Answer: r*g*x. Each character matched by the class [aeiou] (here, the two e's) is replaced with an asterisk.
After preg_match_all('/\d+/', 'a1b22c333', $m); what is count($m[0])?
- 6
- 1
- 0
- 3
Answer: 3. There are three separate runs of digits (1, 22, 333), so preg_match_all finds three matches.
In a capture-group match, what does index 0 of the matches array hold?
- The first capture group
- The whole matched text
- The last capture group
- Always null
Answer: The whole matched text. $m[0] is the entire match; $m[1], $m[2], ... are the capture groups left to right.
What does the i flag at the end of a pattern like /php/i do?
- Makes matching case-insensitive
- Treats the pattern as Unicode
- Makes . match newlines
- Anchors to the line start
Answer: Makes matching case-insensitive. The i modifier makes the pattern case-insensitive, so /php/i matches 'PHP'.
What makes .*? different from .* in a pattern?
- The ? makes it match more
- The ? makes it case-insensitive
- The ? makes it lazy — matches as little as possible
- It is a syntax error
Answer: The ? makes it lazy — matches as little as possible. Appending ? makes a quantifier lazy, grabbing as little as possible instead of being greedy.
What does preg_replace('/(\w+), (\w+)/', '$2 $1', 'Smith, John') return?
- Smith John
- John Smith
- Smith, John
- $2 $1
Answer: John Smith. $1 and $2 are back-references to the capture groups, so the names are swapped to 'John Smith'.
Which function should you use to split a string on a pattern?
- preg_match
- preg_replace
- preg_match_all
- preg_split
Answer: preg_split. preg_split breaks a string wherever the pattern matches and returns the pieces as an array.
Why prefer a named group like (?<year>\d{4}) over a numbered one?
- It matches faster
- It is self-documenting and survives reordering
- It allows more characters
- Numbered groups are not supported in PHP
Answer: It is self-documenting and survives reordering. Named groups read clearly ($m['year']) and don't break when you insert or reorder groups.
When should you NOT use a regular expression?
- To validate a fixed pattern shape
- To extract a list of matches
- To parse structured formats like HTML, XML, or JSON
- To find and replace with variation
Answer: To parse structured formats like HTML, XML, or JSON. Use a real parser (DOMDocument, json_decode) for structured formats; nesting and edge cases defeat regex.
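For contrast, a small sketch of reaching for a parser instead of a pattern. The JSON payload is invented for illustration; json_decode handles the nesting that a regex would struggle with.

```php
<?php
// Hypothetical nested payload.
$json = '{"user": {"name": "Ada", "tags": ["php", "regex"]}}';

// true asks for associative arrays rather than objects.
$data = json_decode($json, true);

// null plus a non-zero error code means the input was not valid JSON.
if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
    echo 'Invalid JSON: ', json_last_error_msg(), "\n";
} else {
    echo $data['user']['name'], "\n";
    echo $data['user']['tags'][1], "\n";
}
```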