Regular Expressions for Beginners: Pattern Matching Basics

Learn the fundamentals of regular expressions with simple examples you can use right away.

The Quick Answer

Regular expressions (regex) are patterns used to match text. Here are the most useful patterns:

Pattern   Matches                         Example
.         Any character except newline    c.t matches "cat", "cut"
*         Zero or more of the previous    ab*c matches "ac", "abc", "abbc"
+         One or more of the previous     ab+c matches "abc", "abbc"
?         Zero or one of the previous     colou?r matches "color", "colour"
\d        Any digit                       \d+ matches "123"
\w        Word character                  \w+ matches "hello"

Why Learn Regex?

Regular expressions help you:

  • Find text: Search for patterns in files
  • Validate input: Check whether an email address or phone number is formatted correctly
  • Extract data: Pull out specific parts of text
  • Replace text: Find and replace with patterns

Basic Building Blocks

Literal Characters

Most characters match themselves:

Pattern: cat
Matches: "cat" in "The cat sat"

Special Characters

These have special meanings and need escaping (\) to match literally:

. * + ? ^ $ { } [ ] \ | ( )

To match a literal period: \.
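
As a small illustration of escaping, the sketch below compares an unescaped and an escaped dot in Python. The sample text and variable names are made up for the example; re.escape is the standard-library helper that escapes a literal string for you.

```python
import re

text = "File a.txt and file aXtxt"

# An unescaped dot matches any character, so "aXtxt" is found too.
loose = re.findall(r"a.txt", text)

# An escaped dot matches only a literal period.
strict = re.findall(r"a\.txt", text)

# re.escape() escapes the special characters in a literal string.
literal = "1+1=2"
safe_pattern = re.escape(literal)
matches_itself = re.fullmatch(safe_pattern, literal) is not None

print(loose)           # ['a.txt', 'aXtxt']
print(strict)          # ['a.txt']
print(matches_itself)  # True
```

Used as a pattern without escaping, "1+1=2" would not match itself, because + would be read as a quantifier.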

Character Classes

Match any character in a set:

[aeiou]     → Any vowel
[0-9]       → Any digit
[a-zA-Z]    → Any letter
[^0-9]      → NOT a digit (^ negates)
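
A quick way to see these classes in action is to run them over a short string. This is a minimal Python sketch; the sample text is invented for the example.

```python
import re

text = "Room 42, Floor 7"

vowels = re.findall(r"[aeiou]", text)      # each lowercase vowel
digits = re.findall(r"[0-9]", text)        # each digit
words = re.findall(r"[a-zA-Z]+", text)     # runs of letters
digits_only = re.sub(r"[^0-9]", "", text)  # delete everything that is not a digit

print(vowels)       # ['o', 'o', 'o', 'o']
print(digits)       # ['4', '2', '7']
print(words)        # ['Room', 'Floor']
print(digits_only)  # 427
```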

Shorthand Classes

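Shorthand   Equivalent (ASCII)    Meaning
\d          [0-9]                 Digit
\D          [^0-9]                Not a digit
\w          [a-zA-Z0-9_]          Word character
\W          [^a-zA-Z0-9_]         Not a word character
\s          [ \t\n\r\f\v]         Whitespace
\S          [^ \t\n\r\f\v]        Not whitespace

In Unicode-aware engines such as Python 3, \d, \w, and \s also match non-ASCII digits, letters, and whitespace unless an ASCII-only flag is set.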
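
The shorthand classes can be tried out with a few lines of Python. The sample string is invented; the last part assumes Python 3 string patterns, where \d is Unicode-aware unless re.ASCII is passed.

```python
import re

text = "id_7 costs\t$15"

numbers = re.findall(r"\d+", text)  # runs of digits
words = re.findall(r"\w+", text)    # runs of word characters
fields = re.split(r"\s+", text)     # split on any run of whitespace

print(numbers)  # ['7', '15']
print(words)    # ['id_7', 'costs', '15']
print(fields)   # ['id_7', 'costs', '$15']

# In Python 3, \d also matches non-ASCII digits unless re.ASCII is set.
arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE
unicode_hit = re.fullmatch(r"\d", arabic_three) is not None
ascii_hit = re.fullmatch(r"\d", arabic_three, re.ASCII) is not None

print(unicode_hit)  # True
print(ascii_hit)    # False
```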

Quantifiers

Specify how many times to match:

Quantifier   Meaning      Example
*            0 or more    a* matches "", "a", "aaa"
+            1 or more    a+ matches "a", "aaa"
?            0 or 1       a? matches "", "a"
{3}          Exactly 3    a{3} matches "aaa"
{2,4}        2 to 4       a{2,4} matches "aa", "aaa", "aaaa"
{2,}         2 or more    a{2,} matches "aa", "aaa", ...
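
To check the table above, the following Python sketch tests each quantifier against strings of zero to five "a" characters. The helper name matching is hypothetical; re.fullmatch requires the whole string to match.

```python
import re

def matching(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

samples = ["", "a", "aa", "aaa", "aaaa", "aaaaa"]

print(matching(r"a*", samples))      # ['', 'a', 'aa', 'aaa', 'aaaa', 'aaaaa']
print(matching(r"a+", samples))      # ['a', 'aa', 'aaa', 'aaaa', 'aaaaa']
print(matching(r"a?", samples))      # ['', 'a']
print(matching(r"a{3}", samples))    # ['aaa']
print(matching(r"a{2,4}", samples))  # ['aa', 'aaa', 'aaaa']
print(matching(r"a{2,}", samples))   # ['aa', 'aaa', 'aaaa', 'aaaaa']
```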

Anchors

Match positions, not characters:

^    → Start of string (or of each line in multiline mode)
$    → End of string (or of each line in multiline mode)
\b   → Word boundary

Examples:

^Hello     → "Hello" at start only
world$     → "world" at end only
\bcat\b    → "cat" as whole word (not "cats")
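
The anchor examples can be verified in Python as follows. The sample strings are invented; re.MULTILINE is the flag that makes ^ and $ apply to each line.

```python
import re

at_start = re.search(r"^Hello", "Hello world") is not None   # True
not_start = re.search(r"^Hello", "Say Hello") is not None    # False
at_end = re.search(r"world$", "Hello world") is not None     # True

# \b keeps "cats" and "concat" out of the results.
whole_words = re.findall(r"\bcat\b", "cat cats concat cat.")

# Without MULTILINE, ^ matches only at the start of the string.
first_only = re.findall(r"^\w+", "one\ntwo")
each_line = re.findall(r"^\w+", "one\ntwo", re.MULTILINE)

print(whole_words)  # ['cat', 'cat']
print(first_only)   # ['one']
print(each_line)    # ['one', 'two']
```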

Practical Examples

Email (Simplified)

\w+@\w+\.\w+

Matches: an address such as user@example.com

Breakdown:

  • \w+ one or more word characters
  • @ literal @
  • \w+ domain name
  • \. literal dot
  • \w+ extension
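
The sketch below shows the simplified pattern at work in Python, along with one of its limits. The addresses are hypothetical examples. Because \w does not include dots or hyphens, an address with a dot before the @ is only partially matched.

```python
import re

EMAIL = re.compile(r"\w+@\w+\.\w+")

simple_ok = EMAIL.fullmatch("user@example.com") is not None

# A dot in the part before @ is not a word character,
# so the whole address does not match ...
dotted_ok = EMAIL.fullmatch("first.last@example.com") is not None

# ... and a search finds only the tail of it.
partial = EMAIL.search("first.last@example.com").group()

print(simple_ok)  # True
print(dotted_ok)  # False
print(partial)    # last@example.com
```

For real validation, a stricter pattern or a dedicated library is usually a safer choice than this simplified form.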

Phone Number (US)

\d{3}-\d{3}-\d{4}

Matches: 555-123-4567
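
Here is a minimal Python sketch of the phone pattern, using made-up numbers. It also shows that, without word boundaries, the pattern can match inside a longer run of digits.

```python
import re

PHONE = r"\d{3}-\d{3}-\d{4}"

found = re.findall(PHONE, "Call 555-123-4567 or 555-987-6543.")

# Without boundaries, the pattern matches part of a longer number.
inside = re.search(PHONE, "1555-123-45678").group()

# Adding \b on both sides rejects that input.
bounded = re.search(r"\b" + PHONE + r"\b", "1555-123-45678")

print(found)    # ['555-123-4567', '555-987-6543']
print(inside)   # 555-123-4567
print(bounded)  # None
```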

URL

https?://\S+

Matches: http://example.com or https://example.com/page
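
One thing to watch with \S+ is that it also swallows punctuation that follows the URL. The Python sketch below shows this on an invented sentence and strips the trailing punctuation afterwards. That cleanup step is one possible approach, not part of the pattern itself.

```python
import re

text = "Docs: http://example.com and https://example.com/page."

urls = re.findall(r"https?://\S+", text)

# The final period of the sentence is captured as part of the second URL,
# so strip common trailing punctuation in a separate step.
cleaned = [url.rstrip(".,;:!?") for url in urls]

print(urls)     # ['http://example.com', 'https://example.com/page.']
print(cleaned)  # ['http://example.com', 'https://example.com/page']
```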

Zip Code

\d{5}(-\d{4})?

Matches: 12345 or 12345-6789
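
In Python, the parentheses in this pattern change what re.findall returns: when a pattern contains a group, findall returns the group's text rather than the whole match. The sketch below, on an invented string, uses re.finditer to get the full zip codes instead.

```python
import re

ZIP = r"\d{5}(-\d{4})?"
text = "Ship to 12345 or 12345-6789"

# findall returns only what the group captured.
group_only = re.findall(ZIP, text)

# finditer gives match objects; group(0) is the whole match.
whole = [m.group(0) for m in re.finditer(ZIP, text)]

print(group_only)  # ['', '-6789']
print(whole)       # ['12345', '12345-6789']
```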

Groups and Capturing

Parentheses create groups:

(ab)+        → Matches "ab", "abab", "ababab"
(\d{3})-(\d{4})  → Captures the three-digit and four-digit parts separately

Use (?:...) for non-capturing groups:

(?:ab)+      → Groups but doesn't capture
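
The difference between capturing and non-capturing groups shows up in the match object. This is a short Python sketch with invented sample strings.

```python
import re

m = re.search(r"(\d{3})-(\d{4})", "Call 123-4567")
print(m.group(0))  # 123-4567  (the whole match)
print(m.group(1))  # 123       (first group)
print(m.group(2))  # 4567      (second group)

# A repeated capturing group keeps only its last repetition.
repeated = re.search(r"(ab)+", "xababy")
print(repeated.group(0))  # abab
print(repeated.groups())  # ('ab',)

# A non-capturing group matches the same text but stores nothing.
plain = re.search(r"(?:ab)+", "xababy")
print(plain.group(0))  # abab
print(plain.groups())  # ()
```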

Alternation (OR)

Use | for alternatives:

cat|dog      → Matches "cat" or "dog"
(red|blue) car → Matches "red car" or "blue car"
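
The parentheses matter because | has very low precedence: without them, the alternatives are everything to the left and everything to the right. A small Python sketch with invented strings:

```python
import re

pets = re.findall(r"cat|dog", "cat, dog, bird")
color = re.search(r"(red|blue) car", "a blue car").group(1)

# Without parentheses the alternatives are "red" and "blue car".
ungrouped_red = re.fullmatch(r"red|blue car", "red") is not None
ungrouped_red_car = re.fullmatch(r"red|blue car", "red car") is not None
grouped_red_car = re.fullmatch(r"(red|blue) car", "red car") is not None

print(pets)               # ['cat', 'dog']
print(color)              # blue
print(ungrouped_red)      # True
print(ungrouped_red_car)  # False
print(grouped_red_car)    # True
```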

Common Regex Patterns

Numbers Only

^\d+$

Letters Only

^[a-zA-Z]+$

Alphanumeric

^[a-zA-Z0-9]+$
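
When these anchored patterns are used for validation in Python, one detail is worth knowing: $ also matches just before a newline at the very end of the string. The sketch below shows the behavior and uses re.fullmatch as a stricter alternative.

```python
import re

digits_ok = re.search(r"^\d+$", "123") is not None
mixed_ok = re.search(r"^\d+$", "123abc") is not None

# $ matches before a trailing newline, so this input is accepted.
newline_ok = re.search(r"^\d+$", "123\n") is not None

# fullmatch requires the entire string to match, newline included.
strict_ok = re.fullmatch(r"\d+", "123\n") is not None

print(digits_ok)   # True
print(mixed_ok)    # False
print(newline_ok)  # True
print(strict_ok)   # False
```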

Date (YYYY-MM-DD)

\d{4}-\d{2}-\d{2}
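
This pattern checks only the shape of the text, so it accepts impossible dates such as 2024-02-30. One way to handle that, sketched below in Python, is to let the regex check the format and datetime.strptime check the calendar. The function name is_real_date is hypothetical.

```python
import re
from datetime import datetime

DATE = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)

def is_real_date(text):
    """Return True if text is YYYY-MM-DD and names a real calendar date."""
    if not DATE.fullmatch(text):
        return False  # wrong shape
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False  # right shape, impossible date
    return True

print(is_real_date("2024-02-29"))  # True
print(is_real_date("2024-02-30"))  # False
print(is_real_date("2024-2-9"))    # False
```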

Time (HH:MM)

\d{2}:\d{2}

IPv4 Address

\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
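
Note that this pattern accepts any one to three digits per part, so it also matches 999.999.999.999. A possible fix, sketched in Python below, is to keep the simple pattern and check the numeric range in code. The function name is_ipv4 is hypothetical.

```python
import re

IPV4 = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", re.ASCII)

def is_ipv4(text):
    """Return True if text has four dot-separated parts, each 0 to 255."""
    if not IPV4.fullmatch(text):
        return False
    return all(int(part) <= 255 for part in text.split("."))

print(is_ipv4("192.168.0.1"))      # True
print(is_ipv4("999.999.999.999"))  # False
print(is_ipv4("1.2.3"))            # False
```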

Hex Color

#[0-9a-fA-F]{6}

Tips for Writing Regex

  1. Start simple: Build patterns incrementally
  2. Test as you go: Use a regex tester
  3. Be specific: \d{5} is better than \d+ for zip codes
  4. Consider edge cases: What if input has extra spaces?
  5. Keep it readable: Complex regex is hard to maintain

Regex in Different Languages

JavaScript

const pattern = /\d+/;
const text = "Order 123";
text.match(pattern);  // ["123"]

Python

import re
pattern = r'\d+'
text = "Order 123"
re.findall(pattern, text)  # ['123']
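
Beyond findall, Python's re module has a function for each of the tasks listed earlier: finding, validating, extracting, and replacing. A brief sketch with an invented string:

```python
import re

text = "Order 123, Order 456"

first = re.search(r"\d+", text).group()         # find the first match
all_numbers = re.findall(r"\d+", text)          # extract every match
masked = re.sub(r"\d+", "#", text)              # replace every match
is_number = re.fullmatch(r"\d+", "123") is not None  # validate a whole string

print(first)        # 123
print(all_numbers)  # ['123', '456']
print(masked)       # Order #, Order #
print(is_number)    # True
```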

Common Differences

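  • JavaScript: /pattern/flags literal
  • Python: pattern in a string, usually a raw string such as r'pattern'
  • PHP: pattern in a string with delimiters, such as '/pattern/'
  • Some engines support \d; others, such as POSIX basic and extended regular expressions (used by grep and sed), need [0-9] or [[:digit:]]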

FAQ

What is regex in simple terms?

Regex is a pattern language for matching text. You describe the shape of the text (like \d{3} for three digits) rather than spelling out exact strings.

Is regex case-sensitive by default?

In most engines, yes. Add a case-insensitive flag (commonly i) when you want cat, Cat, and CAT to match the same pattern.
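
In Python, for example, the flag is re.IGNORECASE, and the inline form (?i) at the start of the pattern has the same effect. A minimal sketch:

```python
import re

text = "cat Cat CAT"

default = re.findall(r"cat", text)                  # case-sensitive
with_flag = re.findall(r"cat", text, re.IGNORECASE) # flag argument
inline = re.findall(r"(?i)cat", text)               # inline flag

print(default)    # ['cat']
print(with_flag)  # ['cat', 'Cat', 'CAT']
print(inline)     # ['cat', 'Cat', 'CAT']
```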

What is the difference between * and +?

* means zero or more. + means one or more. So a* matches an empty string, but a+ does not.

Why does my regex match too much text?

Greedy quantifiers like .* consume as much text as possible. Fix this by using tighter character classes, anchors (^ and $), or non-greedy variants like .*? if your engine supports them.
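
The Python sketch below shows the problem and two of the fixes on an invented snippet of tagged text.

```python
import re

text = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the last </b>, giving one oversized match.
greedy = re.findall(r"<b>.*</b>", text)

# Non-greedy: .*? stops at the first </b> it can.
lazy = re.findall(r"<b>.*?</b>", text)

# Tighter class: [^<]* cannot run past the next "<".
tight = re.findall(r"<b>[^<]*</b>", text)

print(greedy)  # ['<b>one</b> and <b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']
print(tight)   # ['<b>one</b>', '<b>two</b>']
```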

Can I use the same regex in JavaScript, Python, and other languages?

Core syntax is similar, but engines differ in flags and advanced features. Treat each language as a separate test target and verify matches before shipping.

When should I avoid regex?

Regex is great for pattern matching, but not ideal for fully parsing nested formats. For complex HTML, JSON, or language grammars, use a dedicated parser.
