Why Use Regex in Java?
Regular expressions (regex) are powerful patterns for searching, validating, and manipulating text. Common use cases:
- Validating emails/phone numbers.
- Extracting data (e.g., dates from logs).
- Replacing complex text patterns.
1. Regex Basics: Pattern and Matcher
Step 1: Compile a Pattern
import java.util.regex.*;
Pattern pattern = Pattern.compile("\\d+"); // Matches one or more digits
Step 2: Create a Matcher
Matcher matcher = pattern.matcher("Order 123: 5 items");
2. Key Regex Methods
matches(): Full String Match
Checks if the entire string matches the regex:
String email = "user@example.com";
boolean isValid = email.matches("^[\\w.-]+@[\\w.-]+\\.\\w+$"); // true
find(): Partial Matches
Searches for occurrences in the string:
while (matcher.find()) {
    System.out.println("Found: " + matcher.group()); // "123", then "5"
}
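The loop above can be wrapped into a small, self-contained helper. This is only a sketch: the class name FindAllDigits and the method findAll are made up for illustration, and the pattern is the same \\d+ used earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class FindAllDigits {
    // Compiled once and reused; matches one or more digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns every run of digits in the input, in order of appearance.
    static List<String> findAll(String input) {
        List<String> results = new ArrayList<>();
        Matcher matcher = DIGITS.matcher(input);
        while (matcher.find()) {          // advances to the next match each call
            results.add(matcher.group()); // text of the current match
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(findAll("Order 123: 5 items")); // [123, 5]
    }
}
```

Each call to find() resumes where the previous match ended, so the loop ends once no further match exists.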
group(): Extract Subgroups
Use parentheses () to capture groups:
Pattern datePattern = Pattern.compile("(\\d{2})/(\\d{2})/(\\d{4})");
Matcher dateMatcher = datePattern.matcher("Due: 12/31/2023");
if (dateMatcher.find()) {
    String month = dateMatcher.group(1); // "12"
    String year = dateMatcher.group(3); // "2023"
}
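As a runnable sketch of the same idea, the helper below returns all three captured groups at once. The names DateGroups and parse are illustrative assumptions; the pattern is the one shown above. Group 0 (not used here) would be the whole match, and groups are numbered from 1 by the position of their opening parenthesis.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class DateGroups {
    private static final Pattern DATE = Pattern.compile("(\\d{2})/(\\d{2})/(\\d{4})");

    // Returns {month, day, year}, or null when the text contains no date.
    static String[] parse(String text) {
        Matcher m = DATE.matcher(text);
        if (!m.find()) {
            return null; // group() must only be called after a successful match
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("Due: 12/31/2023"))); // [12, 31, 2023]
    }
}
```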
4. Regex Syntax Cheat Sheet
| Symbol | Meaning | Example |
|---|---|---|
| \d | Digit ([0-9]) | \d{3} → “123” |
| \w | Word character ([a-zA-Z0-9_]) | \w+ → “Hello_1” |
| ^ | Start of string | ^Java |
| $ | End of string | end$ |
| * | Zero or more | A* → “”, “A”, “AA” |
| + | One or more | \d+ → “1”, “123” |
| [] | Character set | [A-Za-z] |
5. Flags for Case Insensitivity & More
Modify regex behavior with flags:
Pattern.compile("java", Pattern.CASE_INSENSITIVE); // Matches "JAVA", "Java", etc.
Common Flags:
- Pattern.CASE_INSENSITIVE
- Pattern.MULTILINE (treat ^ and $ per line)
- Pattern.DOTALL (. matches newlines)
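To see the flags in action, here is a minimal sketch. The class FlagsDemo, its method names, and the sample strings are assumptions made for illustration. Flags are combined with the bitwise OR operator.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for illustration.
class FlagsDemo {
    // Counts lines that begin with "java" in any letter case.
    static int countLinesStartingWithJava(String text) {
        Pattern p = Pattern.compile("^java", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher m = p.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Looks for "start", then anything, then "end".
    // With dotAll=true, '.' may also cross line breaks.
    static boolean spansStartToEnd(String text, boolean dotAll) {
        int flags = dotAll ? Pattern.DOTALL : 0;
        return Pattern.compile("start.*end", flags).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(countLinesStartingWithJava("Java is fun\nI like it\nJAVA rocks")); // 2
        System.out.println(spansStartToEnd("start\nend", false)); // false
        System.out.println(spansStartToEnd("start\nend", true));  // true
    }
}
```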
6. Performance Tips
- Precompile Patterns: Avoid recompiling regex in loops.
// Good:
private static final Pattern EMAIL_PATTERN = Pattern.compile("...");
// Bad:
for (...) { Pattern.compile("..."); }
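- Be Careful with Greedy Quantifiers: Greedy .* and .+ grab as much text as possible and then backtrack. Prefer reluctant forms (*?, +?) or a narrower character class when matching delimited text.
- Test Regex: Use tools like Regex101 to debug.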
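The difference between greedy and reluctant quantifiers is easiest to see on text with repeated delimiters. The sketch below uses a made-up helper, firstMatch, and an invented sample string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for illustration.
class GreedyVsReluctant {
    // Returns the first match of the regex in the input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>one</b> and <b>two</b>";
        // Greedy: .* runs to the last </b> in the string.
        System.out.println(firstMatch("<b>.*</b>", html));  // <b>one</b> and <b>two</b>
        // Reluctant: .*? stops at the first </b>.
        System.out.println(firstMatch("<b>.*?</b>", html)); // <b>one</b>
    }
}
```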
Common Mistakes
❌ Unescaped Backslashes:
// Wrong (Java string):
Pattern.compile("\d+"); // Compile error: illegal escape character
// Correct:
Pattern.compile("\\d+");
❌ Overly Broad Patterns:
// Weak email validation:
".+@.+" // Allows "a@b"
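A quick comparison shows how much the broad pattern lets through. The sketch below contrasts it with the stricter pattern from the matches() example earlier; the class and method names are hypothetical, and the stricter pattern is still a simplification, not full email validation.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for illustration.
class EmailPatternComparison {
    // Broad: anything, an @, then anything.
    private static final Pattern WEAK = Pattern.compile(".+@.+");
    // Stricter: requires a dot followed by word characters after the domain.
    private static final Pattern STRICTER = Pattern.compile("[\\w.-]+@[\\w.-]+\\.\\w+");

    static boolean weak(String s) {
        return WEAK.matcher(s).matches();
    }

    static boolean stricter(String s) {
        return STRICTER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(weak("a@b"));                  // true
        System.out.println(stricter("a@b"));              // false
        System.out.println(stricter("user@example.com")); // true
    }
}
```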
FAQ
Why does matches() return false even if part of the string matches?
matches() requires the entire string to match. Use find() for partial matches.
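A minimal sketch of the difference, using a hypothetical class MatchesVsFind and the \\d+ pattern from earlier:

```java
import java.util.regex.Pattern;

// Hypothetical demo class for illustration.
class MatchesVsFind {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // True only when the whole string consists of digits.
    static boolean wholeStringIsDigits(String s) {
        return DIGITS.matcher(s).matches();
    }

    // True when digits appear anywhere in the string.
    static boolean containsDigits(String s) {
        return DIGITS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeStringIsDigits("Order 123")); // false
        System.out.println(containsDigits("Order 123"));      // true
        System.out.println(wholeStringIsDigits("123"));       // true
    }
}
```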
How to split a string using regex?
Use split() (covered in Manipulation and Translation):
String[] words = "a,b;c".split("[,;]"); // ["a", "b", "c"]
What’s the difference between Pattern.matches() and String.matches()?
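They behave identically: str.matches(regex) gives the same result as Pattern.matches(regex, str), and both compile the regex on every call. For repeated checks, compile a Pattern once and reuse it.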
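The three styles can be compared side by side. This is a sketch only: the class PrecompiledReuse, the method countNumeric, and the sample inputs are assumptions for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class for illustration.
class PrecompiledReuse {
    // Compiled once when the class loads, then shared by every call.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Counts the inputs that consist entirely of digits.
    static long countNumeric(List<String> inputs) {
        long count = 0;
        for (String s : inputs) {
            if (DIGITS.matcher(s).matches()) { // no recompilation inside the loop
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Convenience forms: each call compiles the regex again.
        System.out.println("123".matches("\\d+"));         // true
        System.out.println(Pattern.matches("\\d+", "123")); // true
        // Precompiled form: preferable for repeated checks.
        System.out.println(countNumeric(List.of("123", "abc", "42", "4 2"))); // 2
    }
}
```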