Regex for Beginners - Understanding Pattern Matching from Scratch
Regular expressions (regex) are the Swiss Army knife of text processing. This guide takes you from zero to understanding.
What is Regex?
A regex is a mini-language for describing string patterns. With one pattern string, you can match, find, and replace text that follows a rule.
Pattern: \d{4}-\d{2}-\d{2}
Matches: 2025-01-15, 2024-12-31
Does not match: 25-1-1, 2025/01/15
Core Metacharacters
| Metachar | Meaning | Example |
|---|---|---|
. |
Any character (except newline) | a.c matches abc, a1c |
\d |
Digit 0-9 | \d{3} matches 123 |
\w |
Word char (letter, digit, underscore) | \w+ matches hello_1 |
\s |
Whitespace | |
^ |
Start of line | ^Hello |
$ |
End of line | world$ |
\b |
Word boundary | \bcat\b doesn't match category |
Uppercase versions are negated: \D = non-digit, \W = non-word, \S = non-whitespace.
Quantifiers
| Quantifier | Meaning |
|---|---|
* |
0 or more times |
+ |
1 or more times |
? |
0 or 1 time |
{n} |
Exactly n times |
{n,} |
At least n times |
{n,m} |
Between n and m times |
Default is greedy (matches as much as possible). Add ? for lazy matching: .*?.
Character Classes
[abc]: a or b or c[^abc]: NOT a, b, or c[a-z]: a through z[0-9]=\d
Groups and Backreferences
(abc): Capturing group(?:abc): Non-capturing group\1: Reference to first group(?<name>abc): Named group
Example: Match repeated words \b(\w+)\s+\1\b
Lookarounds (Zero-width assertions)
| Assertion | Meaning |
|---|---|
(?=abc) |
Followed by abc |
(?!abc) |
Not followed by abc |
(?<=abc) |
Preceded by abc |
(?<!abc) |
Not preceded by abc |
Common Use Cases
Email (simplified)
[\w.+-]+@[\w-]+\.[\w.-]+
URL
https?://[\w.-]+\.[a-z]{2,}(/[\w./-]*)?
Phone (US)
\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}
Date (YYYY-MM-DD)
\d{4}-\d{2}-\d{2}
Using Regex in JavaScript
// Test if matches
/^1[3-9]\d{9}$/.test("13800138000") // true
// Extract matches
"2025-01-15".match(/\d{4}-\d{2}-\d{2}/) // ["2025-01-15"]
// Replace
"hello world".replace(/\s/g, "-") // "hello-world"
// Extract all matches
"a1b2c3".match(/\d/g) // ["1","2","3"]
Learning Tips
- Use regex101.com for real-time testing with explanations
- Start simple, add complexity gradually
- Add comments to complex regex
- Don't try to solve everything with one regex โ break it into steps
Common Pitfalls
- Greedy matching:
<.*>matches entire HTML โ use<.*?>instead - Unescaped special chars: To match
., write\. - Performance: Nested quantifiers like
(a+)+can cause catastrophic backtracking