And the 'Email Validation' SQL Puzzle Winner Is... - InformationWeek

InformationWeek is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

IoT
IoT
Software // Information Management
Commentary
9/27/2007
08:59 AM
Joe Celko
Joe Celko
Commentary
50%
50%

And the 'Email Validation' SQL Puzzle Winner Is...

The winner of last week's 'Email Address Validation' SQL puzzle is "Guest" (see comments), because he/she bothered to do the research and come up with an answer that is generic enough to port to any SQL dialect with a SIMILAR TO or a regexp() function. So, "Guest" please email me with your snail mail address (and some attempt to validate your SQL mastery/identity) and I'll send you one of my books. Here's my answer to last week's puzzle...

The winner of last week's 'Email Address Validation' SQL puzzle is "Guest" (see comments), because he/she bothered to do the research and come up with an answer that is generic enough to port to any SQL dialect with a SIMILAR TO or a regexp() function. So, "Guest" please email me with your snail mail address (and some attempt to validate your SQL mastery/identity) and I'll send you one of my books.

My answer to last week's puzzle is as follows:You can find regular expressions at http://regexlib.com/ and copy them into your code, making changes for your dialect. For example, this regular expression checks an email format against the RFC 3696 Standard and was written by David Thompson

^[a-z0-9!$'*+\-_]+ (\.[a-z0-9!$'*+\-_]+)* @ ([a-z0-9]+(-+[a-z0-9]+)*\.)+ ([a-z]{2} |aero|arpa|biz|cat|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|net|org|pro|travel)$

If you do not read regular expressions, this says that a valid email address is one or more groups of strings of alphanumeric characters and some limited punctuation marks, optionally separated by a period. Then there is one "little snail" or "at-sign" in the middle. This followed by more groups of strings of alphanumeric characters and a more limited set of punctuation marks separated by periods. The string finally ends with either a two-letter country code or one of several explicit domain codes.

The problem is that the "[a-z]{2}" pattern matches any two letters even when they are not a valid country code.

If you do not have a SIMILAR TO predicate in your SQL, there is another approach. Set up a CREATE TRANSLATION declaration that maps the legal postfixes into a single unique token not used in an email address. This result is then passed on to another TRANSLATE () expression which reduces the alphanumeric and punctuation characters to a second unique token. Eventually, you wind up with a reduced pattern made up of the two tokens and the at-sign, say '#@#?' since neither '#' nor '?' appear in an email address.

A third approach is to use the TRIM( FROM ) function to reduce the suspect email address to a single at-sign or empty string.

The same effect can be had with nested REPLACE expressions in some dialects but the nesting can be pretty deep. If you are a LISP programmer, you will not mind a bit.

Are these good methods to use in place of using an external call to an external procedure in a 3GL language or SQL/PSM? Baroque as these suggestions are, they often run much faster than the external call and they are portable. But they are ugly to maintain.

Joe Celko is an independent consultant in Austin, Texas, and the author of SQL Puzzles and Answers (2006), Joe Celko's SQL for Smarties: Advanced SQL Programming (2005), and Joe Celko's Trees and Hierarchies in SQL for Smarties (2004).The winner of last week's 'Email Address Validation' SQL puzzle is "Guest" (see comments), because he/she bothered to do the research and come up with an answer that is generic enough to port to any SQL dialect with a SIMILAR TO or a regexp() function. So, "Guest" please email me with your snail mail address (and some attempt to validate your SQL mastery/identity) and I'll send you one of my books. Here's my answer to last week's puzzle...

We welcome your comments on this topic on our social media channels, or [contact us directly] with questions about the site.
Comment  | 
Print  | 
More Insights
News
Top 10 Data and Analytics Trends for 2021
Jessica Davis, Senior Editor, Enterprise Apps,  11/13/2020
Commentary
Where Cloud Spending Might Grow in 2021 and Post-Pandemic
Joao-Pierre S. Ruth, Senior Writer,  11/19/2020
Slideshows
The Ever-Expanding List of C-Level Technology Positions
Cynthia Harvey, Freelance Journalist, InformationWeek,  11/10/2020
White Papers
Register for InformationWeek Newsletters
Video
Current Issue
Why Chatbots Are So Popular Right Now
In this IT Trend Report, you will learn more about why chatbots are gaining traction within businesses, particularly while a pandemic is impacting the world.
Slideshows
Flash Poll