The Quick Answer
Regular expressions (regex) are patterns used to match text. Here are the most useful patterns:
| Pattern | Matches | Example |
|---|---|---|
| . | Any character | c.t matches "cat", "cut" |
| * | Zero or more | ab*c matches "ac", "abc", "abbc" |
| + | One or more | ab+c matches "abc", "abbc" |
| ? | Zero or one | colou?r matches "color", "colour" |
| \d | Any digit | \d+ matches "123" |
| \w | Word character | \w+ matches "hello" |
Why Learn Regex?
Regular expressions help you:
- Find text: Search for patterns in files
- Validate input: Check if email/phone is formatted correctly
- Extract data: Pull out specific parts of text
- Replace text: Find and replace with patterns
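As a rough illustration of these four tasks, here is a minimal sketch using Python's built-in re module. The sample text and variable names are made up for the example.

```python
import re

text = "Order 123 shipped to 555-123-4567 on 2024-01-15"

# Find text: locate the first run of digits
first_number = re.search(r"\d+", text).group()

# Validate input: the whole string must fit the pattern
is_valid_phone = re.fullmatch(r"\d{3}-\d{3}-\d{4}", "555-123-4567") is not None

# Extract data: pull out every date-shaped substring
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# Replace text: mask every digit with '#'
masked = re.sub(r"\d", "#", text)
```

Each task maps to a different function: search finds, fullmatch validates, findall extracts, and sub replaces.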
Basic Building Blocks
Literal Characters
Most characters match themselves:
Pattern: cat
Matches: "cat" in "The cat sat"
Special Characters
These have special meanings and need escaping (\) to match literally:
. * + ? ^ $ { } [ ] \ | ( )
To match a literal period: \.
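The difference between an escaped and an unescaped period can be sketched in Python as follows. The sample string is invented for the example; re.escape is the standard-library helper that adds the backslashes for you.

```python
import re

text = "version 3.14 and 3x14"

# Unescaped: . matches any character, so "3x14" matches too
unescaped = re.findall(r"3.14", text)

# Escaped: \. matches only a literal period
escaped = re.findall(r"3\.14", text)

# re.escape escapes special characters in arbitrary input
pattern_from_input = re.escape("3.14")
from_input = re.findall(pattern_from_input, text)
```

Escaping user-supplied text with re.escape is usually safer than adding backslashes by hand.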
Character Classes
Match any character in a set:
[aeiou] → Any vowel
[0-9] → Any digit
[a-zA-Z] → Any letter
[^0-9] → NOT a digit (^ negates)
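To see these classes in action, here is a small Python sketch. The sample text is an assumption made for the example.

```python
import re

text = "Room 42, Floor 7B"

vowels = re.findall(r"[aeiou]", text)      # lowercase vowels only
digits = re.findall(r"[0-9]", text)        # each digit separately
letters = re.findall(r"[a-zA-Z]", text)    # upper- and lowercase letters
non_digits = re.findall(r"[^0-9]+", text)  # runs of anything except digits
```

Note that [aeiou] is case-sensitive: it would not match an uppercase "A" unless you add it to the set or use a case-insensitive flag.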
Shorthand Classes
| Shorthand | Equivalent | Meaning |
|---|---|---|
| \d | [0-9] | Digit |
| \D | [^0-9] | Not a digit |
| \w | [a-zA-Z0-9_] | Word character |
| \W | [^a-zA-Z0-9_] | Not a word character |
| \s | [ \t\n\r\f\v] | Whitespace |
| \S | [^ \t\n\r\f\v] | Not whitespace |
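The equivalents listed here describe ASCII text. Some engines are Unicode-aware by default, so the shorthands can match more than the bracket form. The sketch below shows this for Python 3 string patterns, where the re.ASCII flag restores the narrower behavior. The sample strings are chosen for the example.

```python
import re

arabic_digits = "\u0661\u0662\u0663"  # Arabic-Indic digits 1, 2, 3
accented_word = "caf\u00e9"           # "cafe" with an accented e

# Default (Unicode) mode: \d and \w accept non-ASCII digits and letters
unicode_digits = re.fullmatch(r"\d+", arabic_digits)
unicode_word = re.fullmatch(r"\w+", accented_word)

# re.ASCII restricts \d to [0-9] and \w to [a-zA-Z0-9_]
ascii_digits = re.fullmatch(r"\d+", arabic_digits, flags=re.ASCII)
ascii_word = re.fullmatch(r"\w+", accented_word, flags=re.ASCII)
```

If input must be plain ASCII digits, either pass re.ASCII or write [0-9] explicitly.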
Quantifiers
Specify how many times to match:
| Quantifier | Meaning | Example |
|---|---|---|
| * | 0 or more | a* matches "", "a", "aaa" |
| + | 1 or more | a+ matches "a", "aaa" |
| ? | 0 or 1 | a? matches "", "a" |
| {3} | Exactly 3 | a{3} matches "aaa" |
| {2,4} | 2 to 4 | a{2,4} matches "aa", "aaa", "aaaa" |
| {2,} | 2 or more | a{2,} matches "aa", "aaa", ... |
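One way to check the quantifier table is to run each pattern against strings of increasing length. This is a minimal Python sketch; the samples list and the accepted dictionary are names invented for the example.

```python
import re

samples = ["", "a", "aa", "aaa", "aaaa", "aaaaa"]
patterns = ["a*", "a+", "a?", "a{3}", "a{2,4}", "a{2,}"]

# For each pattern, keep the samples that it matches in full
accepted = {
    p: [s for s in samples if re.fullmatch(p, s) is not None]
    for p in patterns
}
```

Using fullmatch matters here: with search, a pattern like a{3} would also be found inside "aaaaa".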
Anchors
Match positions, not characters:
^ → Start of string/line
$ → End of string/line
\b → Word boundary
Examples:
^Hello → "Hello" at start only
world$ → "world" at end only
\bcat\b → "cat" as whole word (not "cats")
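Whether ^ and $ mean "string" or "line" depends on the engine and its flags. The Python sketch below shows the default behavior and the effect of re.MULTILINE, along with the word boundary example. The two-line sample text is an assumption for the example.

```python
import re

text = "Hello world\nHello cats and a cat"

# Default: ^ matches only at the start of the whole string
start_default = re.findall(r"^Hello", text)

# re.MULTILINE: ^ and $ also match at the start and end of each line
start_multiline = re.findall(r"^Hello", text, flags=re.MULTILINE)

# "world" ends the first line, not the string, so $ needs MULTILINE here
end_default = re.search(r"world$", text)
end_multiline = re.search(r"world$", text, flags=re.MULTILINE)

# \b keeps "cat" from matching inside "cats"
whole_word = re.findall(r"\bcat\b", text)
```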
Practical Examples
Email (Simplified)
\w+@\w+\.\w+
Matches: addresses such as user@example.com
Breakdown:
- \w+: one or more word characters
- @: literal @
- \w+: domain name
- \.: literal dot
- \w+: extension
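Because this pattern is simplified, it helps to see where it stops working. The Python sketch below runs it on two invented addresses: since \w does not match a period, dotted local parts and multi-part domains are only partially matched.

```python
import re

EMAIL = re.compile(r"\w+@\w+\.\w+")

text = "Contact alice@example.com or first.last@mail.example.org"

# The second address is cut off on both sides
found = EMAIL.findall(text)

# As a validator, the pattern rejects the dotted address outright
simple_ok = EMAIL.fullmatch("alice@example.com") is not None
dotted_ok = EMAIL.fullmatch("first.last@mail.example.org") is not None
```

For real validation, a dedicated library or a confirmation email is usually more reliable than a longer regex.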
Phone Number (US)
\d{3}-\d{3}-\d{4}
Matches: 555-123-4567
URL
https?://\S+
Matches: http://example.com or https://example.com/page
Zip Code
\d{5}(-\d{4})?
Matches: 12345 or 12345-6789
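The phone, URL, and zip code patterns can be tried together in Python. This is a sketch with an invented sample sentence; note how \S+ in the URL pattern also swallows the sentence-ending period.

```python
import re

PHONE = r"\d{3}-\d{3}-\d{4}"
URL = r"https?://\S+"
ZIP = r"\d{5}(-\d{4})?"

text = "Call 555-123-4567 or visit https://example.com/page. ZIP: 12345-6789"

phone = re.search(PHONE, text).group()

# \S+ runs until the next whitespace, so trailing punctuation is included
url = re.search(URL, text).group()

# fullmatch rejects zip codes that are too short or half-finished
candidates = ["12345", "12345-6789", "1234", "12345-67"]
valid_zips = [z for z in candidates if re.fullmatch(ZIP, z)]
```

If trailing punctuation is a problem, one option is to strip it afterwards, for example with url.rstrip(".,").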
Groups and Capturing
Parentheses create groups:
(ab)+ → Matches "ab", "abab", "ababab"
(\d{3})-(\d{4}) → Captures the two digit groups separately ("123" and "4567" from "123-4567")
Use (?:...) for non-capturing groups:
(?:ab)+ → Groups but doesn't capture
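In Python, captured groups are read from the match object. The following sketch uses an invented sample string to show what capturing and non-capturing groups return.

```python
import re

m = re.search(r"(\d{3})-(\d{4})", "Call 123-4567 today")
whole = m.group(0)   # the entire match
first = m.group(1)   # first parenthesized group
second = m.group(2)  # second parenthesized group

# A repeated capturing group keeps only its last repetition
repeated = re.fullmatch(r"(ab)+", "ababab")
last_repetition = repeated.group(1)

# A non-capturing group matches the same text but stores nothing
non_capturing = re.fullmatch(r"(?:ab)+", "ababab")
stored_groups = non_capturing.groups()
```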
Alternation (OR)
Use | for alternatives:
cat|dog → Matches "cat" or "dog"
(red|blue) car → Matches "red car" or "blue car"
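The parentheses matter because | has very low precedence: without them, the alternation splits the whole pattern. A short Python sketch with invented test strings shows the difference.

```python
import re

texts = ["red car", "blue car", "red", "green car"]

# Grouped: "red" or "blue", followed by " car"
grouped = [t for t in texts if re.fullmatch(r"(red|blue) car", t)]

# Ungrouped: either "red" alone, or "blue car"
ungrouped = [t for t in texts if re.fullmatch(r"red|blue car", t)]
```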
Common Regex Patterns
Numbers Only
^\d+$
Letters Only
^[a-zA-Z]+$
Alphanumeric
^[a-zA-Z0-9]+$
Date (YYYY-MM-DD)
\d{4}-\d{2}-\d{2}
Time (HH:MM)
\d{2}:\d{2}
IPv4 Address
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
Hex Color
#[0-9a-fA-F]{6}
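These patterns check the shape of the input, not whether the value makes sense: the date pattern accepts "2024-13-45" and the IPv4 pattern accepts "999.1.1.1". One possible approach, sketched below in Python, is to use the regex for the format and ordinary code for the range checks. The helper names is_real_date and is_real_ipv4 are hypothetical.

```python
import re
from datetime import datetime

# re.ASCII keeps \d limited to 0-9
DATE = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)
IPV4 = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", re.ASCII)

def is_real_date(text: str) -> bool:
    """Format check with the regex, calendar check with datetime."""
    if DATE.fullmatch(text) is None:
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True

def is_real_ipv4(text: str) -> bool:
    """Format check with the regex, then require each octet to be 0-255."""
    if IPV4.fullmatch(text) is None:
        return False
    return all(int(octet) <= 255 for octet in text.split("."))
```

For example, is_real_date("2024-13-45") returns False even though the regex alone accepts that string.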
Tips for Writing Regex
- Start simple: Build patterns incrementally
- Test as you go: Use a regex tester
- Be specific: \d{5} is better than \d+ for zip codes
- Consider edge cases: What if input has extra spaces?
- Keep it readable: Complex regex is hard to maintain
Regex in Different Languages
JavaScript
const pattern = /\d+/;
const text = "Order 123";
text.match(pattern); // ["123"]
Python
import re
pattern = r'\d+'
text = "Order 123"
re.findall(pattern, text) # ['123']
Common Differences
- JavaScript: /pattern/flags
- Python/PHP: 'pattern' string
- Some engines support \d, others need [0-9]
FAQ
What is regex in simple terms?
Regex is a pattern language for matching text. You describe the shape of the text (like \d{3} for three digits) instead of typing out exact strings.
Is regex case-sensitive by default?
In most engines, yes. Add a case-insensitive flag (commonly i) when you want cat, Cat, and CAT to match the same pattern.
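In Python, the case-insensitive flag is re.IGNORECASE (also available as re.I). A minimal sketch with an invented word list:

```python
import re

words = ["cat", "Cat", "CAT", "cot"]

# Default matching is case-sensitive
sensitive = [w for w in words if re.fullmatch(r"cat", w)]

# re.IGNORECASE treats upper- and lowercase letters as equal
insensitive = [w for w in words if re.fullmatch(r"cat", w, flags=re.IGNORECASE)]
```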
What is the difference between * and +?
* means zero or more. + means one or more. So a* matches an empty string, but a+ does not.
Why does my regex match too much text?
Greedy quantifiers like .* consume as much text as possible. Fix this by using tighter character classes, anchors (^ and $), or non-greedy variants like .*? if your engine supports them.
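The difference is easiest to see side by side. This Python sketch compares a greedy pattern, its non-greedy variant, and a tighter character class on an invented snippet of markup.

```python
import re

html = "<b>bold</b> and <b>more</b>"

# Greedy: .* runs to the last </b>, producing one oversized match
greedy = re.findall(r"<b>.*</b>", html)

# Non-greedy: .*? stops at the first </b> it can
lazy = re.findall(r"<b>.*?</b>", html)

# Tighter class: [^<]* cannot cross into the next tag
tight = re.findall(r"<b>[^<]*</b>", html)
```

The tighter class is often the more portable fix, since it does not rely on non-greedy support.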
Can I use the same regex in JavaScript, Python, and other languages?
Core syntax is similar, but engines differ in flags and advanced features. Treat each language as a separate test target and verify matches before shipping.
When should I avoid regex?
Regex is great for pattern matching, but not ideal for fully parsing nested formats. For complex HTML, JSON, or language grammars, use a dedicated parser.
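As a small illustration of why nesting is a problem, the Python sketch below compares a regex with the standard json module on an invented document. The regex finds the first "name" in the text, which is the nested one, while the parser lets you ask for a value by its position in the structure.

```python
import json
import re

raw = '{"user": {"name": "Ada", "tags": ["a", "b"]}, "name": "outer"}'

# The regex has no notion of nesting: it returns the first textual hit
regex_guess = re.search(r'"name": "(\w+)"', raw).group(1)

# A parser understands the structure
parsed = json.loads(raw)
top_level_name = parsed["name"]
nested_name = parsed["user"]["name"]
```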