This post originated from an RSS feed registered with Java Buzz
by Joey Gibson.
Original Post: Java Regex APIs and Quoting
Feed Title: Joey Gibson's Blog
Feed URL: http://www.jobkabob.com/index.html
Feed Description: Thoughts, musings, ramblings and rants on Java, Ruby, Python, obscure languages, politics and other exciting topics
I've just been digging through the J2SE 1.4 regex stuff, and every
time I have to do regex work in Java I keep thinking how much easier
it is in other languages. Specifically I'm talking about the
clunkiness of the various regexen APIs in Java and the requirement to
double-backslash regex operators. We need a better way.
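To make the double-backslash complaint concrete, here is a minimal sketch of why the doubling is needed. The class and helper names (BackslashDemo, isDigits, isSingleBackslash) are made up for illustration; only Pattern.matches comes from java.util.regex. The Java compiler consumes one level of backslashes when it reads the string literal, and the regex engine consumes another when it reads the pattern.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // Hypothetical helper: true when the whole input is one or more digits.
    static boolean isDigits(String input) {
        // The literal "\\d+" reaches the regex engine as the three
        // characters \d+, which it reads as "one or more digits".
        return Pattern.matches("\\d+", input);
    }

    // Hypothetical helper: true when the input is exactly one backslash.
    static boolean isSingleBackslash(String input) {
        // Four backslashes in the source: the compiler turns them into two,
        // and the regex engine treats those two as one literal backslash.
        return Pattern.matches("\\\\", input);
    }

    public static void main(String[] args) {
        System.out.println("\\d+".length());          // prints 3
        System.out.println(isDigits("12345"));        // prints true
        System.out.println(isSingleBackslash("\\"));  // prints true
    }
}
```

So every backslash the regex engine should see costs two in the source, and a literal backslash costs four.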
Ruby and Perl both have native regex support built into the language,
so the backslashes are just fine. Python, which doesn't have native
regex support (it's in the library), does have "raw" string quoting,
which allows you not to double up the backslashes. So what I have to
write like this in Java:
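import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Optional "(digits)" prefix, then two 3-digit groups separated by a hyphen.
Pattern p = Pattern.compile("(\\(\\d+\\))?\\s*(\\d{3})\\s*\\-\\s*(\\d{3})");
Matcher m = p.matcher(my_string);
if (m.matches())
{
    ...;
}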
or
import java.util.regex.Pattern;

if (Pattern.matches("(\\(\\d+\\))?\\s*(\\d{3})\\s*\\-\\s*(\\d{3})", my_string))
{
    ...;
}
looks like this in Python:
import re

# The r prefix makes this a raw string, so the backslashes stay single.
if re.match(r"(\(\d+\))?\s*(\d{3})\s*\-\s*(\d{3})", my_string):
    ...
and could be even more easily written in Ruby thus:
if my_string =~ /(\(\d+\))?\s*(\d{3})\s*\-\s*(\d{3})/
  ...
end
See the difference? The built-in regex support is really nice and the
ease of quoting is a beautiful thing. I doubt that we'll ever see
either of these in Java since they would certainly be considered
non-trivial to add.