This repository has been archived by the owner on Aug 21, 2024. It is now read-only.

Update email regex to be more liberal to check TLDs #10973

Closed
wants to merge 1 commit into from

Conversation

CITIZENDOT
Collaborator

Summary

Previously, EMAIL_REGEX contained a hard-coded list of the most commonly used top-level domains. That regex was taken from here: https://fightingforalostcause.net/content/misc/2006/compare-email-regex.php

But many other top-level domains exist, and it is not viable to put all of them into a regex. Here is the list of all allowed top-level domains: https://data.iana.org/TLD/tlds-alpha-by-domain.txt. There are currently 1447 registered top-level domains.

Now, I'm updating the regex to allow any top-level domain that meets these criteria:

  • minimum length: 2
  • maximum length: 10 (this is quite long compared with commonly used TLDs like .com and .io, but it was chosen to allow as many TLDs as possible while still excluding obscure ones)
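
As a rough illustration of the two length criteria above, here is a minimal sketch in Python. It is not the regex from this PR: the name `EMAIL_SKETCH`, the helper `looks_like_email`, and the simplified local-part and domain-label rules are all assumptions made for the example. Only the final `{2,10}` TLD rule mirrors the criteria listed here.

```python
import re

# Hypothetical pattern, NOT the EMAIL_REGEX from the codebase. The local part
# and domain labels are deliberately simplified; only the TLD rule matters here.
TLD_RULE = r"[A-Za-z]{2,10}"  # TLD: minimum length 2, maximum length 10

EMAIL_SKETCH = re.compile(
    r"[A-Za-z0-9._%+-]+"       # simplified local part
    r"@"
    r"(?:[A-Za-z0-9-]+\.)+"    # one or more dot-terminated domain labels
    + TLD_RULE
)


def looks_like_email(address: str) -> bool:
    """Return True if the whole string matches the sketch pattern."""
    return EMAIL_SKETCH.fullmatch(address) is not None
```

With this sketch, `user@example.technology` (a 10-letter TLD) is accepted, while `user@example.photography` (11 letters) and `user@example.c` (1 letter) are rejected.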

Of the 1447 TLDs, 1305 (90%) meet these criteria.
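
A coverage figure like this could be checked with a small script along the following lines. This is a sketch under assumptions: the function name `tld_coverage` is made up, the sample text is a tiny illustrative excerpt rather than the real IANA file, and it counts entries purely by character length (including punycode `XN--` entries), which may or may not be how the numbers above were derived.

```python
def tld_coverage(tld_list_text: str, min_len: int = 2, max_len: int = 10) -> tuple[int, int]:
    """Return (matching, total) for text in tlds-alpha-by-domain.txt format.

    Lines starting with '#' (the version header) and blank lines are skipped.
    """
    tlds = [
        line.strip()
        for line in tld_list_text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    matching = sum(1 for tld in tlds if min_len <= len(tld) <= max_len)
    return matching, len(tlds)


# Illustrative sample only: a header line followed by one TLD per line.
SAMPLE = """# Version 0000000000, illustrative sample only
COM
IO
TECHNOLOGY
PHOTOGRAPHY
XN--VERMGENSBERATER-CTB
"""

matching, total = tld_coverage(SAMPLE)
print(f"{matching} of {total} sample TLDs are 2 to 10 characters long")
```

To run it against the real list, the contents of the IANA file linked above would be passed in place of `SAMPLE`.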

The new pattern is more liberal than the previous one, so it can produce false positives. But I think false positives are better than false negatives in the case of email validation.
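
The trade-off can be seen by comparing an allow-list pattern with a length-based one. Both patterns below are hypothetical stand-ins written for this example, not the old or new EMAIL_REGEX, and `notatld` is assumed not to be a registered TLD.

```python
import re

# Hypothetical stand-ins; neither pattern is copied from the codebase.
ALLOW_LIST = re.compile(
    r"[^@\s]+@(?:[A-Za-z0-9-]+\.)+(?:com|org|net|io)", re.IGNORECASE
)
LENGTH_BASED = re.compile(r"[^@\s]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,10}")


def accepts(pattern: re.Pattern, address: str) -> bool:
    """Return True if the whole address matches the given pattern."""
    return pattern.fullmatch(address) is not None


# A real TLD that is missing from the short allow-list is rejected by it
# (a false negative), but accepted by the length-based pattern.
real_but_unlisted = "user@example.technology"
print(accepts(ALLOW_LIST, real_but_unlisted), accepts(LENGTH_BASED, real_but_unlisted))

# A made-up TLD of acceptable length is accepted by the length-based
# pattern (a false positive), but rejected by the allow-list.
made_up = "user@example.notatld"
print(accepts(ALLOW_LIST, made_up), accepts(LENGTH_BASED, made_up))
```

A false negative blocks a legitimate user from signing up, whereas a false positive only lets through an address that will fail later, for example when a confirmation email bounces.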

Subtasks Checklist

Breaking Changes

References

closes #insert number here

QA Steps

@CITIZENDOT CITIZENDOT removed the request for review from SamMazerIR August 16, 2024 09:00
@CITIZENDOT
Collaborator Author

sorry for the ping guys, it was a mistake 🥲

@CITIZENDOT
Collaborator Author

closing in favour of this: #10983

@CITIZENDOT CITIZENDOT closed this Aug 16, 2024