Popular blogs and most websites offer an email newsletter subscription. While the big sites maintain teams to monitor the quality of the content and the communication, the average blogger cannot afford to hire employees. Offering a newsletter subscription can significantly increase traffic to your blog, but it also comes with the need to filter out bogus email addresses. Using built-in HTML and JavaScript tools, we can perform a basic syntax check of an address. By querying the mail server named in the address, we can also check whether the address exists.

Syntax check for email address

Embedding an email field in an HTML form is easy. It is enough to add an “input” element with the type attribute set to “email”.

<form>
    <!-- The name attribute is needed for the value to be sent with the form -->
    <input type="email" name="email" required>
    <input type="submit" value="Subscribe">
</form>

The value of the “input” element is automatically validated against email syntax. The check is built around the “@” sign: roughly speaking, it is enough that permitted characters appear on both sides of the “@” separator.
For example, the address “user@hello”, which has no top-level domain, passes validation. For a newsletter form, such a result is not acceptable.
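
The built-in check can be approximated in plain JavaScript. The sketch below uses the regular expression that the HTML Standard publishes for email fields; the helper name passesBuiltInCheck is only an illustration, and real browsers may differ in details such as whitespace handling.

```javascript
// Regular expression published in the HTML Standard for type="email" fields.
// The domain part does not need a dot, so a top-level domain is not required.
const HTML_EMAIL_RE =
  /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

// Hypothetical helper: approximates what the browser accepts for one address.
function passesBuiltInCheck(value) {
  return HTML_EMAIL_RE.test(value);
}

console.log(passesBuiltInCheck('user@hello'));       // true: no top-level domain needed
console.log(passesBuiltInCheck('user@example.com')); // true
console.log(passesBuiltInCheck('user@'));            // false: nothing after "@"
console.log(passesBuiltInCheck('@example.com'));     // false: nothing before "@"
```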

Using patterns to validate the email address

The pattern attribute, when specified, is a regular expression that the input’s value must match in order for the value to pass the validation. It must be a valid JavaScript regular expression.

If the pattern is not specified or is invalid, no regular expression is applied and this attribute is ignored completely.
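
These two rules can be sketched in plain JavaScript. The helper matchesPattern below is hypothetical and only approximates the browser: it assumes the pattern must match the whole value, and it compiles with the "u" flag for wide runtime support, whereas current browsers use the stricter "v" flag.

```javascript
// Hypothetical helper that approximates how a browser applies the pattern attribute.
function matchesPattern(pattern, value) {
  let re;
  try {
    // The whole value must match, so the pattern is wrapped in ^(?: ... )$.
    // Assumption: "u" flag here; current browsers compile with "v".
    re = new RegExp('^(?:' + pattern + ')$', 'u');
  } catch (err) {
    // An invalid pattern is ignored, so the value counts as matching.
    return true;
  }
  return re.test(value);
}

console.log(matchesPattern('[a-z]{3}', 'abc'));  // true
console.log(matchesPattern('[a-z]{3}', 'abcd')); // false: the whole value must match
console.log(matchesPattern('[a-z', 'anything')); // true: invalid pattern is ignored
```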

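<form>
    <!-- The dash is escaped as \- in each character class: browsers that compile the pattern with the "v" flag treat an unescaped dash as an error and then ignore the pattern -->
    <input type="email" name="email" pattern="[a-zA-Z0-9._\-]+@[a-zA-Z.\-]{2,}\.[a-zA-Z]{2,}" required>
    <input type="submit" value="Subscribe">
</form>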

In this example, the pattern is set to a regular expression. Let’s take a closer look at it.
A full email address has two parts: address@server.

Address: the following characters are allowed:

  • a-z;
  • A-Z;
  • 0-9;
  • . dot;
  • - dash;
  • _ underscore.

Mail server address: the following characters are allowed:

  • Domain
    • a-z;
    • A-Z;
    • . dot;
    • - dash.
  • Domain delimiter
    • . dot.
  • Top-level domain
    • a-z;
    • A-Z.
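
The breakdown above can be written out as separate fragments and joined back together. This is a minimal sketch; the constant and function names are illustrative, and the "u" flag is an assumption made so that the snippet runs in most JavaScript runtimes.

```javascript
// Character sets from the lists above, written as regular expression fragments.
const LOCAL_PART = '[a-zA-Z0-9._\\-]+'; // address: letters, digits, dot, dash, underscore
const DOMAIN = '[a-zA-Z.\\-]{2,}';      // domain: letters, dot, dash (two or more characters)
const DELIMITER = '\\.';                // domain delimiter: a single dot
const TOP_LEVEL = '[a-zA-Z]{2,}';       // top-level domain: two or more letters

const EMAIL_PATTERN = LOCAL_PART + '@' + DOMAIN + DELIMITER + TOP_LEVEL;

// Anchored so that the whole value has to match, as with the pattern attribute.
const EMAIL_RE = new RegExp('^(?:' + EMAIL_PATTERN + ')$', 'u');

// Hypothetical helper for trying out addresses against the pattern.
function isAllowedAddress(value) {
  return EMAIL_RE.test(value);
}

console.log(isAllowedAddress('john.doe@example.com'));   // true
console.log(isAllowedAddress('user@hello'));             // false: no top-level domain
console.log(isAllowedAddress('user@mail1.example.com')); // false: digits are not in the domain set
```

The last call shows a side effect of the chosen sets: because digits are not listed for the domain, addresses on domains that contain digits are rejected.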

You can change the set of allowed characters with the help of the ASCII table below.

ASCII (1977/1986)
     0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
0x   NUL  SOH  STX  ETX  EOT  ENQ  ACK  BEL  BS   HT   LF   VT   FF   CR   SO   SI
1x   DLE  DC1  DC2  DC3  DC4  NAK  SYN  ETB  CAN  EM   SUB  ESC  FS   GS   RS   US
2x   SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3x   0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4x   @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5x   P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6x   `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7x   p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~    DEL

Original ASCII table
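
One way to use the table is to build a character class directly from code ranges, which avoids having to remember which symbols need escaping. The helper charClassFromRanges is hypothetical, and the chosen ranges (including the added "+") are only an example.

```javascript
// Hypothetical helper: turns ASCII code ranges from the table into a character class.
function charClassFromRanges(ranges) {
  // Write every code as a two-digit \xHH escape, e.g. 0x2d becomes \x2d.
  const hex = (code) => '\\x' + code.toString(16).padStart(2, '0');
  const body = ranges
    .map(([from, to]) => (from === to ? hex(from) : hex(from) + '-' + hex(to)))
    .join('');
  return '[' + body + ']';
}

// 0x30-0x39 are digits, 0x41-0x5A upper-case and 0x61-0x7A lower-case letters;
// 0x2B is "+", 0x2D is "-", 0x2E is "." and 0x5F is "_".
const localPartClass = charClassFromRanges([
  [0x30, 0x39], [0x41, 0x5a], [0x61, 0x7a],
  [0x2b, 0x2b], [0x2d, 0x2d], [0x2e, 0x2e], [0x5f, 0x5f],
]);

console.log(localPartClass);
// [\x30-\x39\x41-\x5a\x61-\x7a\x2b\x2d\x2e\x5f]

const localPartRe = new RegExp('^' + localPartClass + '+$', 'u');
console.log(localPartRe.test('john+news')); // true
console.log(localPartRe.test('john news')); // false: space (0x20) is not in the set
```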

Conclusion

Nowadays it is possible to register domains in other character sets (Cyrillic or ideographic scripts such as Chinese, for example). A mail server may well sit on such a domain, and the pattern shown in the example will then reject its addresses. For more fine-tuning, see the full Unicode table. Of course, checking the integrity and validity of an email address should not be done only on the front end. You must also validate the address on the server side.
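
As a rough illustration of both points, the sketch below repeats the check on the server with a pattern that accepts letters and digits from any script. The function name isPlausibleEmail, the choice of Unicode property escapes and the length limit are assumptions for this example, not a complete validator.

```javascript
// Assumption: letters and digits of any script are acceptable in both parts.
// \p{L} matches any Unicode letter and \p{N} any Unicode digit; both need the "u" flag.
const UNICODE_EMAIL_RE = /^[\p{L}\p{N}._\-]+@[\p{L}\p{N}.\-]+\.[\p{L}]{2,}$/u;

// Hypothetical server-side check: the value is validated again,
// because a request can reach the server without passing through the form.
function isPlausibleEmail(value) {
  if (typeof value !== 'string') return false;
  const trimmed = value.trim();
  // 254 characters is a commonly cited upper bound for an address.
  if (trimmed.length === 0 || trimmed.length > 254) return false;
  return UNICODE_EMAIL_RE.test(trimmed);
}

console.log(isPlausibleEmail('user@example.com'));       // true
console.log(isPlausibleEmail('пользователь@пример.рф')); // true: Cyrillic address and domain
console.log(isPlausibleEmail('user@hello'));             // false: no top-level domain
console.log(isPlausibleEmail(undefined));                // false: not a string
```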

Last modified: August 9, 2022
