Ever have a week where multiple requests converged? I had two friends that needed to send emails to their customers. Both thought there were problems with their mailing lists and wanted to know if Microsoft Excel could help. In this tutorial, I’ll show how to find invalid email addresses in Excel using filters and data validation.
Both individuals wanted to find email addresses using improper formats before sending out the emails. They weren’t looking for Excel to show that the firstname.lastname@example.org account existed. Instead, they wanted to make sure the data record had the correct email address format. You might think of this as data cleansing.
Excel offers two ways to check for invalid emails, but it doesn’t use a predefined function like ISEMAIL, which is found in Google Sheets. I looked through the online Excel functions list from Microsoft. The first solution is easy and filters for various characters. Like the Google function, it is easy to do, but not 100% reliable. The problem is email formats have some complex rules that may require another Excel feature called Data Validation.
Email Address Format Rules
I’m sure there may be a way someone could write Excel code to validate email addresses using regular expressions or syntax rules. I didn’t want to take that path, as I’m not a coder. In doing some research, I also noticed more possibilities for valid email addresses than I realized. Here are a few examples:
- They can contain the “+” sign even though many systems disallow this character. For example, Gmail allows you to use the plus sign for their version of disposable email addresses.
- They can contain multiple periods.
- The top-level domains are not always 3 letters like .com, .net or .org. All country codes are two letters, and then there are TLDs like .travel or .museum.
- They can contain hyphens.
- Email addresses are case-insensitive. It doesn’t matter if you use upper or lower case letters.
If you want to get the specifics, there is a standard RFC 822 document for internet messaging.
After scrolling through these Excel address lists, I settled on two rules, identifying most of the bad email address entries. These aren’t email addressing laws or absolutes. Instead, these are simple rules to eliminate 95% of the badly formatted email addresses.
- The email address had to contain the “@” sign
- The email address had to contain at least 1 period.
The nice thing about these two rules is I could have Excel find the invalid emails using the Filter command. Below you will see a screen snap of the type of entries that were on these Excel spreadsheets. You can also see where Excel will create hyperlinks on improper entries.
Method 1: Invalid Email Address Filter
- Click the column header that contains your email addresses.
- From the Data menu, click Filter. This will add a drop-down icon to your column header.
- Click the down arrow control in the top right of the column containing your email addresses.
- From the drop-down menu, select Text Filters.
- From the sub-menu select Custom Filter…
- In the Custom AutoFilter dialog, enter in the two filter criteria using the “does not contain” selection. Make certain to use the Or radio button option.
- Click OK.
Once you click OK, Excel will show the emails that don’t have either the @ sign or a period. These are the problem records. You can either delete them or try to make corrections. Unfortunately, the rules aren’t perfect as you may still get some incorrect ones that slip through, such as anne.h@examplecom. In this example, both the “." and “@" are present, but it is still invalid since a second period is needed between “example” and “com.”
After doing this exercise, I did see some other patterns. For example, some people would type some combination of “do not send email.” This tends to happen if your user input form doesn’t do field validation. Other times, the reverse would happen where people used “shift 2” or [at] instead of the “@ sign.” This is a tactic some people use online to prevent email bots from scraping their addresses.
Method 2: Use Excel Data Validation
The second solution is more advanced but will find more invalid entries. It also has the benefit of flagging emails if you’re typing entries into the Excel sheet. A reader pointed me to this solution from Syntax Byte Tutorials, and it has worked in my email tests. I’ve included a practice spreadsheet at the bottom with my invalid email addresses for testing.
- Open your spreadsheet. In the sample sheet at the end, I’ve put my test email addresses in Column A.
- Click Column A header to highlight the entire column of email addresses.
- Click Data from the Excel Ribbon.
- From the Data Tools group, click the down triangle next to Data Validation.
- From the drop-down menu, select Data Validation…
- The Data Validation dialog opens.
- In the Allow: drop-down menu, select Custom.
- In the Formula: field, copy and paste the code below.
- Click OK.
- Click the small triangle to the right of Data Validation again. Your email address column should show as highlighted.
- Select Circle Invalid Data from the menu.
After a few moments, you should see red circles around the invalid entries. I also applied a yellow background to make it easier to test.
Another benefit to using Excel Data Validation is it will prevent bad email addresses from being accepted if you highlighted the entire column.
Troubleshooting Method 2
If you don’t see your expected results, it could be for several reasons.
- You didn’t highlight the entire column of email addresses.
- Your email addresses weren’t in Column A.
- You inadvertently edited the formula when pasting. This is easy to do if you try to use your directional keys to go back and forth. If an Excel formula error occurs, you can try using Excel Formula Auditing.
Other Steps You Might Take
I also saw several emails with spaces, such as anne email@example.com. I’m not certain if spaces are legitimate, but I seem to recall AOL using them at some point. However, many email systems will not accept spaces in an email address.
The other item I noticed was many duplicate email addresses. Either people had entered the information multiple times, or perhaps the address was shared by two or more people. You should take steps to remove these duplicates, as the recipients do not want multiple copies. Some may even get mad enough to flag your email as spam.
The above methods for finding bad email addresses aren’t perfect. However, they offer you an easy method to validate an email in Excel so you can act on them in a timely way. In addition, removing the bad ones prevents someone from getting those annoying bounce-back messages from ISPs telling you the user doesn’t exist.