How to Find Valid Email Addresses with Regular Expressions in Java?

BY MARKUS SPRUNCK
 
Validating email addresses can be a tricky task. On Unix you would usually run the regular expression with the grep program, but sometimes the job has to be done within a Java program.

Regular Expression for Email Addresses

The regular expression used in the example code: 


  [A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}

is one of the simplest possible. In many cases this simple expression is good enough. It consists of five parts:

 
  [A-Z0-9._%+-]+ the local part of the address may contain letters, digits, dots, underscores, percent, plus and minus signs. Only upper-case letters are listed, so the example code converts each line to upper case before matching.
  @ the @ character is mandatory.
  [A-Z0-9.-]+ the domain part of the address may contain letters, digits, dots and hyphens.
  \\. the dot is mandatory (the doubled backslash is the Java string literal form of the regex \.).
  [A-Z]{2,4} the top-level domain may contain letters only. Its length is limited to between 2 and 4 characters.
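
The five parts can be tried out on a few strings with a minimal sketch like the one below. The class name EmailPatternDemo and the method isValid are hypothetical names chosen for illustration, and the CASE_INSENSITIVE flag is an assumption used here in place of upper-casing the input.

```java
import java.util.regex.Pattern;

final class EmailPatternDemo {

    // The simple expression from this article; the flag makes [A-Z] accept lower case too.
    static final Pattern EMAIL = Pattern.compile(
            "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}", Pattern.CASE_INSENSITIVE);

    // Returns true only if the whole candidate string is matched by the expression.
    static boolean isValid(String candidate) {
        return EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "markus.sprunck@online.de",  // all five parts present
            "markus.sprunck@online",     // no dot and top-level domain
            "@online.de",                // empty local part
            "user@example.museum"        // top-level domain longer than 4 letters
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```

Only the first sample is accepted. The last one shows the cost of the {2,4} limit: real addresses with longer top-level domains are rejected by this simple expression.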

For a better impression of how regular expressions work, you may visit Regular Expressions - User Guide.

More Complex Regular Expression

To get deeper into the topic, you may visit a specialized page such as [1]. There you will find several expressions of varying complexity, like the following:

 
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|
"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")
@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\
[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|
[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

This is a very complex expression taken from [1].

Java Example Code to Call Regular Expression

The class RegularExpression reads the file test.txt line by line and prints every line that consists of a valid email address.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegularExpression {
    public static void main(String[] args) throws IOException {

        // Simple expression to check whether a line is a valid e-mail address
        Pattern pattern = Pattern.compile("[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}");

        // Read the file line by line, test each line and print the matches
        File file = new File("test.txt");
        int lines = 0;
        int matches = 0;
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new FileReader(file))) {
            for (String line = in.readLine(); line != null; line = in.readLine()) {
                lines++;
                // The pattern lists only upper-case letters, so normalize the line first
                Matcher matcher = pattern.matcher(line.toUpperCase());
                // matches() succeeds only if the whole line is an address
                if (matcher.matches()) {
                    System.out.println(lines + ": '" + line + "'");
                    matches++;
                }
            }
        }

        // Print the summary
        if (matches == 0) {
            System.out.println("No matches in " + lines + " lines");
        } else {
            System.out.println("\n" + matches + " matches in " + lines + " lines");
        }
    }
}

The program was run with the following test.txt file:

markus.sprunck@
markus.sprunck@online.de
markus.sprunck@sampledomain.eu
markus.sprunck@online
@online.de

The expected output for this input file is:

2: 'markus.sprunck@online.de'
3: 'markus.sprunck@sampledomain.eu'
 
2 matches in 5 lines
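
The program above uses matches(), so a line counts only if it consists entirely of an address. To pull addresses out of running text, Matcher.find() can be used instead. The following is a minimal sketch under that assumption; the class name EmailFinder and the method findAll are hypothetical names, not part of the original example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EmailFinder {

    // Same simple expression as in the article, matched case-insensitively.
    static final Pattern EMAIL = Pattern.compile(
            "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}", Pattern.CASE_INSENSITIVE);

    // Collects every substring of the text that the expression matches.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Contact markus.sprunck@online.de or info@sampledomain.eu today";
        for (String address : findAll(text)) {
            System.out.println(address);
        }
    }
}
```

Because find() accepts partial matches, a top-level domain longer than four letters is not rejected but truncated: for user@example.museum this sketch returns user@example.muse. Whether that is acceptable depends on the use case.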

If the correct encoding is important, the Scanner class can be used to read the file with an explicit encoding, as in the following code snippet:

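import java.io.File;
import java.util.Scanner;

// filePath and encoding (e.g. "UTF-8") are supplied by the caller
try (Scanner scan = new Scanner(new File(filePath), encoding)) {
    // hasNextLine() is checked before every read, so the last line is processed too
    while (scan.hasNextLine()) {
        String line = scan.nextLine();
        // insert here the code for e-mail scan
    }
}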
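
Put together with the simple expression, the Scanner loop might look like the following sketch. The class name EncodedEmailScan, the method validLines and the default of UTF-8 in main are assumptions made for illustration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

final class EncodedEmailScan {

    // Same simple expression as in the article, matched case-insensitively.
    static final Pattern EMAIL = Pattern.compile(
            "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}", Pattern.CASE_INSENSITIVE);

    // Reads the file with the given charset name and returns the lines
    // that consist entirely of a valid address.
    static List<String> validLines(File file, String encoding) throws FileNotFoundException {
        List<String> valid = new ArrayList<>();
        try (Scanner scan = new Scanner(file, encoding)) {
            while (scan.hasNextLine()) {
                String line = scan.nextLine();
                if (EMAIL.matcher(line).matches()) {
                    valid.add(line);
                }
            }
        }
        return valid;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // Assumed defaults: test.txt in the working directory, encoded as UTF-8.
        String path = args.length > 0 ? args[0] : "test.txt";
        for (String line : validLines(new File(path), "UTF-8")) {
            System.out.println(line);
        }
    }
}
```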

References

[1] How to Find or Validate an Email Address, Regular-Expressions.info; http://www.regular-expressions.info/email.html

Change History

Revision  Date          Author          Description
1.0       May 18, 2012  Markus Sprunck  first version
1.1       Aug 18, 2012  Markus Sprunck  improved layout for tablets
1.2       Jan 28, 2013  Markus Sprunck  improved introduction, better links and picture
1.3       Dec 17, 2014  Markus Sprunck  improved structure
