How to Find Valid Email Addresses with Regular Expressions in Java?



By Markus Sprunck; Revision: 1.2; Status: final; Last Content Change: Jan 28, 2013; 
 
Validating email addresses can be a tricky task. The picture shows a more complex regex for validating email addresses. If your preferred operating system is Unix, you would usually run the regex with the grep program, but sometimes the job has to be done within a Java program.

Basic Regular Expression for Finding Valid Email Addresses

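The regular expression used in the example code:

[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}

is one of the simplest possible. In many cases this simple expression is good enough. Because its character classes list only uppercase letters, it has to be applied case-insensitively to match lowercase addresses. It consists of five parts: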

 1.   [A-Z0-9._%+-]+ the local part (before the @) may contain letters, digits, dots, underscores, percent signs, plus signs and hyphens.
 2.   @ the @ character is mandatory.
 3.   [A-Z0-9.-]+ the domain part (after the @) may contain letters, digits, dots and hyphens.
 4.   \\. the dot is mandatory. The backslash is doubled because the expression is written as a Java string literal; the regex itself is \.
 5.   [A-Z]{2,4} the top-level domain may contain letters only. Its length is limited to between 2 and 4 characters, so longer top-level domains are rejected.
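
The five parts can be tried out directly. The following is a minimal sketch, not the article's original listing: the class name `SimpleEmailPattern` and the method `isValid` are hypothetical, and the `Pattern.CASE_INSENSITIVE` flag is an assumption made so that lowercase addresses match.

```java
import java.util.regex.Pattern;

class SimpleEmailPattern {

    // The simple expression from the text. CASE_INSENSITIVE is an assumption:
    // the character classes themselves only list A-Z.
    static final Pattern EMAIL = Pattern.compile(
            "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}",
            Pattern.CASE_INSENSITIVE);

    // matches() requires the whole string to fit the pattern.
    static boolean isValid(String candidate) {
        return EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("john.doe@example.com"));  // true
        System.out.println(isValid("john.doe@example"));      // false: no dot and top-level domain
        System.out.println(isValid("info@example.museum"));   // false: top-level domain longer than 4
    }
}
```

The last call shows the main limitation of part 5: valid addresses with a long top-level domain are rejected.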

To get a better impression of how regular expressions work, you may visit the Regular Expressions - User Guide.

Complex Regular Expression for Finding Valid Email Addresses

To get deeper into the topic, you may visit a specialized page. There you will find different expressions of varying complexity, like the following:

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

a very complex expression from [2].

Java Example Code to Call Regular Expression to Validate Email Addresses 

// File #1: RegularExpression.java

// File #2: Test.txt

// Expected output
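
A minimal sketch of such a program, assuming an input file named Test.txt in the working directory. The class name `EmailFinder` and the method `findAll` are hypothetical and not taken from the original listing, and the case-insensitive flag is again an assumption.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailFinder {

    // The simple expression from the text, compiled once and reused.
    static final Pattern EMAIL = Pattern.compile(
            "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}",
            Pattern.CASE_INSENSITIVE);

    // Collects every substring of the text that fits the pattern.
    static List<String> findAll(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // File name from the command line, falling back to Test.txt.
        String fileName = args.length > 0 ? args[0] : "Test.txt";
        // FileReader decodes the file with the platform default encoding.
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (String address : findAll(line)) {
                    System.out.println(address);
                }
            }
        }
    }
}
```

Note that `find()` searches for matching substrings, so an address with a long top-level domain is not skipped but truncated: for `info@example.museum` it reports `info@example.muse`.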

If the correct encoding is important, the Scanner class can be used, as in the following code snippet:

// Better way to read a file with correct encoding
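
One way this could look is sketched below. The class name `ReadWithEncoding`, the method `readAll` and the choice of UTF-8 are assumptions for illustration; the `Scanner(File, String)` constructor takes the charset name as its second argument.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class ReadWithEncoding {

    // Reads the whole file line by line, decoding it with the given charset.
    static String readAll(File file, String charsetName) throws FileNotFoundException {
        StringBuilder text = new StringBuilder();
        try (Scanner scanner = new Scanner(file, charsetName)) {
            while (scanner.hasNextLine()) {
                text.append(scanner.nextLine()).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws FileNotFoundException {
        // File name from the command line, falling back to Test.txt.
        String fileName = args.length > 0 ? args[0] : "Test.txt";
        System.out.print(readAll(new File(fileName), "UTF-8"));
    }
}
```

Naming the charset explicitly makes the result independent of the platform default, which otherwise differs between operating systems and locales.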

References

[1] Regular expression, Wikipedia; http://en.wikipedia.org/wiki/Regular_expression
[2] How to Find or Validate an Email Address, Regular-Expressions.info; http://www.regular-expressions.info/email.html

Change History

Revision  Date  Author  Description
 1.0  May 18, 2012  Markus Sprunck   first version
 1.1  Aug 18, 2012  Markus Sprunck  improved layout for tablets
 1.2  Jan 28, 2013  Markus Sprunck  improved introduction, better links and picture
