Regex for Beginners - Understanding Pattern Matching from Scratch

Published 2025-03-22 ยท ToolNest

Regular expressions (regex) are the Swiss Army knife of text processing. This guide takes you from zero to understanding.

What is Regex?

A regex is a mini-language for describing string patterns. With one pattern string, you can match, find, and replace text that follows a rule.

Pattern: \d{4}-\d{2}-\d{2}
Matches: 2025-01-15, 2024-12-31
Does not match: 25-1-1, 2025/01/15

Core Metacharacters

Metachar Meaning Example
. Any character (except newline) a.c matches abc, a1c
\d Digit 0-9 \d{3} matches 123
\w Word char (letter, digit, underscore) \w+ matches hello_1
\s Whitespace
^ Start of line ^Hello
$ End of line world$
\b Word boundary \bcat\b doesn't match category

Uppercase versions are negated: \D = non-digit, \W = non-word, \S = non-whitespace.

Quantifiers

Quantifier Meaning
* 0 or more times
+ 1 or more times
? 0 or 1 time
{n} Exactly n times
{n,} At least n times
{n,m} Between n and m times

Default is greedy (matches as much as possible). Add ? for lazy matching: .*?.

Character Classes

  • [abc]: a or b or c
  • [^abc]: NOT a, b, or c
  • [a-z]: a through z
  • [0-9] = \d

Groups and Backreferences

  • (abc): Capturing group
  • (?:abc): Non-capturing group
  • \1: Reference to first group
  • (?<name>abc): Named group

Example: Match repeated words \b(\w+)\s+\1\b

Lookarounds (Zero-width assertions)

Assertion Meaning
(?=abc) Followed by abc
(?!abc) Not followed by abc
(?<=abc) Preceded by abc
(?<!abc) Not preceded by abc

Common Use Cases

Email (simplified)

[\w.+-]+@[\w-]+\.[\w.-]+

URL

https?://[\w.-]+\.[a-z]{2,}(/[\w./-]*)?

Phone (US)

\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}

Date (YYYY-MM-DD)

\d{4}-\d{2}-\d{2}

Using Regex in JavaScript

// Test if matches
/^1[3-9]\d{9}$/.test("13800138000")  // true

// Extract matches
"2025-01-15".match(/\d{4}-\d{2}-\d{2}/)  // ["2025-01-15"]

// Replace
"hello world".replace(/\s/g, "-")  // "hello-world"

// Extract all matches
"a1b2c3".match(/\d/g)  // ["1","2","3"]

Learning Tips

  1. Use regex101.com for real-time testing with explanations
  2. Start simple, add complexity gradually
  3. Add comments to complex regex
  4. Don't try to solve everything with one regex โ€” break it into steps

Common Pitfalls

  1. Greedy matching: <.*> matches entire HTML โ€” use <.*?> instead
  2. Unescaped special chars: To match ., write \.
  3. Performance: Nested quantifiers like (a+)+ can cause catastrophic backtracking

โ† Back to Articles