In this Java regex word-matching example, we will learn to match a specific word or match a word that contains a specific substring.
Regex | Input Strings | Matches | Does Not Match |
---|---|---|---|
\\bcat\\b | “The cat is cute”, “The category is empty”, “The noncategory is also empty” | cat | category, noncategory |
\\b\\w*cat\\w*\\b | “The cat is cute”, “The category is empty”, “The noncategory is also empty” | cat, category, noncategory | None |
1. Using Regex to Match an Exact Word Only
To match an exact word in a text using regex, we use the word boundary \b (written as \\b inside a Java string literal, because the backslash itself must be escaped). This ensures that the pattern matches the word as a whole, rather than as part of a longer word.
Solution Regex : \\bword\\b
Strictly speaking, “\b” matches in these three positions:
- Before the first character in the data, if the first character is a word character
- After the last character in the data, if the last character is a word character
- Between two characters in the data, where one is a word character and the other is not a word character
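The three positions listed above can be made visible with a small sketch that inserts a marker wherever \b matches. The class name WordBoundaryDemo and the helper markBoundaries are hypothetical names chosen for this illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.regex.Pattern;

public class WordBoundaryDemo {

    // Hypothetical helper: inserts '|' at every position where \b matches,
    // so the zero-width word boundaries become visible in the output.
    static String markBoundaries(String text) {
        return Pattern.compile("\\b").matcher(text).replaceAll("|");
    }

    public static void main(String[] args) {
        // Boundaries at the start and end of the data, and around each word
        System.out.println(markBoundaries("cat, dog")); // |cat|, |dog|
        // No boundary at the very start or end here, because '!' is not a word character
        System.out.println(markBoundaries("!cat!"));    // !|cat|!
    }
}
```

Note that \b consumes no characters: it only asserts a position, which is why the original text is left intact between the markers.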
Regex Pattern | Description | Example Text | Matches | Does Not Match |
---|---|---|---|---|
\\bcat\\b | Matches the word “cat” as a whole word | “The cat is cute” | “cat” | “catch”, “category”, “cats” |
To run a “specific word only” search using a regular expression, simply place the word between two word boundaries.
import java.util.List;
import java.util.regex.*;
public class StartsWithEndsWith {
    public static void main(String[] args) {
        List<String> lines = List.of(
            "The cat is cute",
            "The category is empty",
            "The non-category is also empty");
        // \b on both sides: "cat" must appear as a whole word
        Pattern pattern = Pattern.compile("\\bcat\\b");
        for (String line : lines) {
            Matcher matcher = pattern.matcher(line);
            // find() advances through every match in the current line
            while (matcher.find()) {
                // Plain concatenation; STR templates were a preview feature later withdrawn from the JDK
                System.out.println("Match found: " + matcher.group());
            }
        }
    }
}
The program output:
Match found: cat
You can see that it matches the exact word “cat”, but does not match longer words that contain “cat”, such as “category” and “non-category”.
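When the word to search for is only known at runtime, the same idea can be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: the class WholeWordSearch and the method containsWholeWord are hypothetical names, and the helper assumes the word starts and ends with a word character (letter, digit, or underscore), since \b is defined in terms of word characters.

```java
import java.util.regex.Pattern;

public class WholeWordSearch {

    // Hypothetical helper: returns true if 'word' occurs in 'text' as a whole word.
    // Pattern.quote() treats any regex metacharacters in 'word' as literal text.
    // Assumption: 'word' begins and ends with a word character, otherwise \b
    // will not line up with the edges of the word.
    static boolean containsWholeWord(String text, String word) {
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWholeWord("The cat is cute", "cat"));                // true
        System.out.println(containsWholeWord("The category is empty", "cat"));          // false
        System.out.println(containsWholeWord("The non-category is also empty", "cat")); // false
    }
}
```

Using find() rather than matches() matters here: matches() would require the entire input to be the word, while find() looks for the word anywhere in the text.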
2. Using Regex to Match a Word that Contains a Specific Substring
Suppose you want to match “java” in such a way that words like “javap”, “myjava” or “myjavaprogram” also match, i.e. “java” can lie anywhere in the word. It could be at the start of a word with additional characters after it, at the end of a word with additional characters before it, or in the middle of a longer word.
To match a word that contains a specific substring, we can use the following regex. It matches any word containing the substring “word”, where the substring may appear at the beginning, middle, or end of the word.
Solution Regex : \\b\\w*word\\w*\\b
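List<String> lines = List.of(
    "The cat is cute",
    "The category is empty",
    "The noncategory is also empty");
// \w* on each side lets "cat" sit anywhere inside a word; CASE_INSENSITIVE also matches "Cat", "CAT", etc.
Pattern pattern = Pattern.compile("\\b\\w*cat\\w*\\b", Pattern.CASE_INSENSITIVE);
for (String line : lines) {
    Matcher matcher = pattern.matcher(line);
    while (matcher.find()) {
        // Plain concatenation; STR templates were a preview feature later withdrawn from the JDK
        System.out.println("Match found: " + matcher.group());
    }
}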
The program output:
Match found: cat
Match found: category
Match found: noncategory
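The “java” scenario described earlier in this section can be sketched the same way, this time collecting the matched words into a list instead of printing them. The class WordsContaining and the method wordsContaining are hypothetical names for this illustration; Matcher.results() requires Java 9 or later.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class WordsContaining {

    // Hypothetical helper: returns every word in 'text' that contains 'substring',
    // ignoring case. Pattern.quote() keeps the substring literal.
    static List<String> wordsContaining(String text, String substring) {
        Pattern pattern = Pattern.compile(
            "\\b\\w*" + Pattern.quote(substring) + "\\w*\\b",
            Pattern.CASE_INSENSITIVE);
        return pattern.matcher(text)
            .results()                   // stream of all matches (Java 9+)
            .map(MatchResult::group)     // the matched word itself
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "javap runs myjava and myjavaprogram, not jav";
        System.out.println(wordsContaining(text, "java")); // [javap, myjava, myjavaprogram]
    }
}
```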
That’s all for these Java regex examples on matching a specific word, either as a whole word using word boundaries or as a substring inside a longer word.
Happy Learning !!
References: Java regex docs