In this Java regex tutorial, we will learn to test whether the number of words in input text is within a certain minimum and maximum limit.
1. Regular Expression
The following regex is very similar to the one in the previous tutorial on limiting the number of non-whitespace characters, except that each repetition matches an entire word rather than a single non-whitespace character. It matches between 2 and 10 words, skipping past any non-word characters, including punctuation and whitespace. The regex is shown as it appears inside a Java string literal, so each backslash is doubled:
^\\W*(?:\\w+\\b\\W*){2,10}$
The regex matches a string that:
- Starts with any number of non-word characters (or no non-word characters at all).
- Contains between 2 and 10 words (each word consisting of one or more word characters).
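- Each word is followed by a word boundary (\\b), which prevents the regex engine from splitting a single word into several shorter "words" while backtracking.
- After each word, there may be any number of non-word characters (including spaces, punctuation, etc.).
- The string must end after the last word and any trailing non-word characters, so an 11th word causes the match to fail.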
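To see what the regex treats as a "word", it can help to count the runs of word characters directly. The following is a minimal sketch, not part of the original example; the class name WordCounter and the method countWords are hypothetical names chosen for illustration. It counts matches of \\w+, which is the same unit the regex above repeats.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only
class WordCounter {
    // One "word" is a maximal run of word characters [a-zA-Z0-9_]
    private static final Pattern WORD = Pattern.compile("\\w+");

    static int countWords(String text) {
        Matcher matcher = WORD.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords("one-two-three 4 5")); // 5
        System.out.println(countWords("don't stop"));        // 3, the apostrophe splits "don't"
    }
}
```

Note that hyphens and apostrophes are non-word characters, so "one-two-three" counts as three words and "don't" as two.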
Example Matches:
- "Hello, world!" (2 words)
- "one-two-three 4 5" (5 words)
- " test , input " (2 words, with spaces and commas)
Example Non-Matches:
- "One" (only 1 word, which is fewer than 2)
- "This is a really long sentence with far too many words in it" (13 words, which is more than 10)
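The examples above can be checked with a short program. This is a sketch under a few assumptions: the class name WordLimitExamples and the method isWithinLimit are hypothetical, and the last sample is an 11-word string added here to illustrate the upper limit.

```java
import java.util.regex.Pattern;

// Hypothetical class for illustration only
class WordLimitExamples {
    // Same regex as above: between 2 and 10 words
    static final Pattern WORD_LIMIT =
            Pattern.compile("^\\W*(?:\\w+\\b\\W*){2,10}$");

    static boolean isWithinLimit(String text) {
        // matches() requires the whole input to match the pattern
        return WORD_LIMIT.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Hello, world!",       // 2 words
            "one-two-three 4 5",   // 5 words
            " test , input ",      // 2 words
            "One",                 // 1 word
            "one two three four five six seven eight nine ten eleven" // 11 words
        };
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" -> " + isWithinLimit(sample));
        }
    }
}
```

The first three samples should print true and the last two false.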
2. Java Example
The following Java program demonstrates the usage of Pattern and Matcher classes for compiling and executing a regex.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordLimitExample {
    public static void main(String[] args) {
        // Regex to limit the input to between 2 and 10 words
        String regex = "^\\W*(?:\\w+\\b\\W*){2,10}$";
        Pattern pattern = Pattern.compile(regex);
        // Test input
        String input = "Hello World Java";
        // Check if the entire input matches the regex
        Matcher matcher = pattern.matcher(input);
        if (matcher.matches()) {
            System.out.println("Valid input: " + input); // Prints this
        } else {
            System.out.println("Invalid input: " + input);
        }
    }
}
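If the limits need to vary, the quantifier can be built from parameters instead of being hard-coded. The following is one possible sketch, not the only way to do it; the class WordCountValidator and its constructor arguments minWords and maxWords are hypothetical names introduced for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only
class WordCountValidator {
    private final Pattern pattern;

    WordCountValidator(int minWords, int maxWords) {
        if (minWords < 0 || maxWords < minWords) {
            throw new IllegalArgumentException("Require 0 <= minWords <= maxWords");
        }
        // Insert the limits into the {min,max} quantifier of the same regex
        this.pattern = Pattern.compile(
                "^\\W*(?:\\w+\\b\\W*){" + minWords + "," + maxWords + "}$");
    }

    boolean isValid(String input) {
        // Treat null as invalid rather than throwing
        return input != null && pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        WordCountValidator validator = new WordCountValidator(2, 10);
        System.out.println(validator.isValid("Hello World Java")); // true
        System.out.println(validator.isValid("One"));              // false
    }
}
```

Compiling the pattern once in the constructor avoids recompiling it for every input that is checked.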
I recommend experimenting with the above regular expression and trying more variations, such as different minimum and maximum limits.
Happy Learning !!