Why Use Regex in Java?
Regular expressions (regex) are powerful patterns for searching, validating, and manipulating text. Common use cases:
- Validating emails/phone numbers.
- Extracting data (e.g., dates from logs).
- Replacing complex text patterns.
1. Regex Basics: Pattern and Matcher
Step 1: Compile a Pattern
import java.util.regex.*;
Pattern pattern = Pattern.compile("\\d+"); // Matches one or more digits
Step 2: Create a Matcher
Matcher matcher = pattern.matcher("Order 123: 5 items");
2. Key Regex Methods
matches(): Full String Match
Checks if the entire string matches the regex:
String email = "user@example.com";
boolean isValid = email.matches("^[\\w.-]+@[\\w.-]+\\.\\w+$"); // true
find(): Partial Matches
Searches for occurrences in the string:
while (matcher.find()) {
    System.out.println("Found: " + matcher.group()); // "123", then "5"
}
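Putting the two steps and the find() loop together, a minimal self-contained sketch might look like this. The class name FindDemo and the helper findAll are illustrative names, not part of java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // Compiled once and reused; matches one or more digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: collects every non-overlapping match, in order.
    static List<String> findAll(String input) {
        List<String> results = new ArrayList<>();
        Matcher matcher = DIGITS.matcher(input);
        while (matcher.find()) {
            results.add(matcher.group());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(findAll("Order 123: 5 items")); // [123, 5]
    }
}
```

Each call to find() resumes where the previous match ended, which is why the loop visits "123" and then "5" without any manual index bookkeeping.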
group(): Extract Subgroups
Use parentheses () to capture groups:
Pattern datePattern = Pattern.compile("(\\d{2})/(\\d{2})/(\\d{4})");
Matcher dateMatcher = datePattern.matcher("Due: 12/31/2023");
if (dateMatcher.find()) {
    String month = dateMatcher.group(1); // "12"
    String year = dateMatcher.group(3); // "2023"
}
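As a runnable sketch of the same idea, the snippet below wraps the date pattern in a small helper. DateGroupsDemo and extractDate are hypothetical names chosen for illustration, and returning null for "no match" is just one possible design.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateGroupsDemo {
    // MM/DD/YYYY with one capturing group per component.
    private static final Pattern DATE = Pattern.compile("(\\d{2})/(\\d{2})/(\\d{4})");

    // Hypothetical helper: returns {month, day, year}, or null if no date is found.
    static String[] extractDate(String text) {
        Matcher m = DATE.matcher(text);
        if (!m.find()) {
            return null;
        }
        // group(0) is the whole match; groups 1..3 are the parenthesized parts.
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = extractDate("Due: 12/31/2023");
        System.out.println(String.join("-", parts)); // 12-31-2023
    }
}
```

Calling group(n) before a successful find() or matches() throws IllegalStateException, so the check on find() is not optional.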
4. Regex Syntax Cheat Sheet
Symbol | Meaning | Example |
---|---|---|
\d | Digit ([0-9]) | \d{3} → “123” |
\w | Word character ([a-zA-Z0-9_]) | \w+ → “Hello_1” |
^ | Start of string | ^Java |
$ | End of string | end$ |
* | Zero or more | A* → “”, “A”, “AA” |
+ | One or more | \d+ → “1”, “123” |
[] | Character set | [A-Za-z] |
5. Flags for Case Insensitivity & More
Modify regex behavior with flags:
Pattern.compile("java", Pattern.CASE_INSENSITIVE); // Matches "JAVA", "Java", etc.
Common Flags:
- Pattern.CASE_INSENSITIVE
- Pattern.MULTILINE (^ and $ match at the start and end of each line)
- Pattern.DOTALL (. matches newlines)
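The effect of these flags is easiest to see side by side. The following is a small sketch; FlagsDemo and countMatches are illustrative names, while the flags themselves are the standard constants on Pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagsDemo {
    // Hypothetical helper: counts how many times the pattern is found in the input.
    static int countMatches(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "java\nJava\nJAVA";

        // Without flags, ^ and $ anchor to the whole input, so nothing matches.
        System.out.println(countMatches(Pattern.compile("^java$"), text)); // 0

        // Flags are combined with the bitwise OR operator.
        Pattern perLine = Pattern.compile("^java$", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        System.out.println(countMatches(perLine, text)); // 3

        // By default '.' does not match a line terminator; DOTALL lets it.
        System.out.println(Pattern.compile("a.b").matcher("a\nb").matches());                 // false
        System.out.println(Pattern.compile("a.b", Pattern.DOTALL).matcher("a\nb").matches()); // true
    }
}
```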
6. Performance Tips
- Precompile Patterns: Avoid recompiling regex in loops.
// Good:
private static final Pattern EMAIL_PATTERN = Pattern.compile("...");
// Bad:
for (...) { Pattern.compile("..."); }
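- Avoid Greedy Quantifiers: Prefer reluctant quantifiers (*?, +?) when a match should end at the first possible delimiter; greedy ones consume as much as they can and then backtrack.
- Test Regex: Use tools like Regex101 to debug.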
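To illustrate the greedy versus reluctant tip, here is a short sketch. QuantifierDemo and firstMatch are hypothetical names, and the helper compiles its regex on every call only to keep the demo compact.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: returns the first match of regex in input, or null.
    // Compiles per call for brevity; real code would reuse a compiled Pattern.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>one</b> and <b>two</b>";

        // Greedy .* runs to the end of the input, then backtracks to the last </b>.
        System.out.println(firstMatch("<b>.*</b>", html));  // <b>one</b> and <b>two</b>

        // Reluctant .*? grows one character at a time and stops at the first </b>.
        System.out.println(firstMatch("<b>.*?</b>", html)); // <b>one</b>
    }
}
```

The two quantifiers differ mainly in what they match; whether the reluctant form is also faster depends on the input, so it is worth measuring rather than assuming.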
Common Mistakes
❌ Unescaped Backslashes:
// Wrong (Java string):
Pattern.compile("\d+"); // Compile error: illegal escape character
// Correct:
Pattern.compile("\\d+");
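The doubling rule applies to every regex escape, not just \d. The sketch below shows a literal dot and a literal backslash; EscapeDemo and its helpers are illustrative names, and Pattern.quote is the standard library method for escaping an arbitrary literal.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Regex \. (a literal dot) is written "\\." in Java source.
    private static final Pattern VERSION = Pattern.compile("\\d+\\.\\d+");
    // Regex \\ (one literal backslash) needs four backslashes in Java source.
    private static final Pattern BACKSLASH = Pattern.compile("\\\\");

    static boolean isVersion(String s) { return VERSION.matcher(s).matches(); }
    static boolean hasBackslash(String s) { return BACKSLASH.matcher(s).find(); }

    public static void main(String[] args) {
        System.out.println(isVersion("1.25"));        // true
        System.out.println(isVersion("1x25"));        // false
        System.out.println(hasBackslash("C:\\temp")); // true

        // Pattern.quote turns any string into a regex that matches it literally.
        System.out.println("a.b".matches(Pattern.quote("a.b"))); // true
        System.out.println("axb".matches(Pattern.quote("a.b"))); // false
    }
}
```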
❌ Overly Broad Patterns:
// Weak email validation:
".+@.+" // Allows "a@b"
FAQ
Why does matches() return false even if part of the string matches?
matches() requires the entire string to match. Use find() for partial matches.
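A quick sketch of the difference, using the digit pattern from earlier. MatchesVsFindDemo and its two helpers are hypothetical names.

```java
import java.util.regex.Pattern;

class MatchesVsFindDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // True only if the whole input consists of digits.
    static boolean wholeInputMatches(String s) { return DIGITS.matcher(s).matches(); }

    // True if digits occur anywhere in the input.
    static boolean containsMatch(String s) { return DIGITS.matcher(s).find(); }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("Order 123")); // false
        System.out.println(containsMatch("Order 123"));     // true
        System.out.println(wholeInputMatches("123"));       // true
    }
}
```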
How to split a string using regex?
Use split() (covered in Manipulation and Translation):
String[] words = "a,b;c".split("[,;]"); // ["a", "b", "c"]
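For repeated splitting, a precompiled Pattern offers the same operation through Pattern.split. The sketch below assumes a delimiter of a comma or semicolon followed by optional whitespace; SplitDemo and tokens are illustrative names.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Precompiled delimiter: a comma or semicolon, then optional whitespace.
    private static final Pattern DELIMITER = Pattern.compile("[,;]\\s*");

    static String[] tokens(String input) {
        return DELIMITER.split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("a, b;c"))); // [a, b, c]

        // Trailing empty strings are dropped by default.
        System.out.println(Arrays.toString(tokens("a,b,,")));  // [a, b]

        // A negative limit keeps trailing empty strings.
        System.out.println(Arrays.toString("a,b,,".split(",", -1))); // [a, b, , ]
    }
}
```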
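What’s the difference between Pattern.matches() and String.matches()?
They behave identically: String.matches(regex) delegates to Pattern.matches(regex, input), so both compile the regex on every call. For repeated checks, compile the Pattern once and reuse it.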
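One way to avoid per-call compilation is to hold the compiled Pattern in a constant and create only a Matcher for each input. This is a sketch under that assumption; PrecompiledDemo and countValid are hypothetical names, and the email regex is the simplified one used earlier, not a complete validator.

```java
import java.util.List;
import java.util.regex.Pattern;

class PrecompiledDemo {
    // Compiled once at class-load time; Pattern instances are immutable and thread-safe.
    private static final Pattern EMAIL = Pattern.compile("[\\w.-]+@[\\w.-]+\\.\\w+");

    // Reuses the compiled pattern; only a Matcher is created per candidate.
    static long countValid(List<String> candidates) {
        return candidates.stream()
                .filter(s -> EMAIL.matcher(s).matches())
                .count();
    }

    public static void main(String[] args) {
        List<String> inputs = List.of("user@example.com", "a@b", "x.y@site.org");
        System.out.println(countValid(inputs)); // 2
    }
}
```

Matcher objects, unlike Pattern, are not thread-safe, so each thread or call should create its own.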