Back in 2018, investigative journalist Brian Krebs warned about the risks of internationalized domain names (IDNs). These domains contain non-Latin characters but can appear not to, so they can be used to create visual confusion that is particularly handy for running credible Punycode phishing campaigns.
In 2020, hundreds of IDNs continue to be registered, but they are detectable using a new and soon-to-be-released version of our Typosquatting Data Feed. We started studying these domain names closely and want to highlight some common instances we found, along with related cybersecurity best practices.
What Are IDN or Punycode Phishing Attacks?
IDNs paved the way for the use of characters that do not belong to the American Standard Code for Information Interchange (ASCII) in domain names. IDNs help non-English speakers create domain names in their local language using their alphabet.
Registrants in countries such as Japan, China, Germany, and Poland, to name a few, can register domain names using their local, non-English script. However, since the Domain Name System (DNS) can’t handle such characters, the domain names are converted into Punycode, an ASCII-only encoding. As such, they bear the standard prefix “xn--.” Here are a few examples:
- ọffice365[.]com (xn--ffice365-x80d[.]com)
- offĭce365[.]com (xn--offce365-ujb[.]com)
- offìce365[.]com (xn--offce365-41a[.]com)
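The conversion between the two forms can be sketched in a few lines of Python. This is a minimal illustration that relies on the standard library's built-in "idna" codec, which follows the older IDNA 2003 rules; registries and browsers may apply newer rules (the third-party idna package implements IDNA 2008). The function names to_punycode and to_unicode are our own, chosen for illustration.

```python
def to_punycode(domain: str) -> str:
    """Encode each label of an IDN into its ASCII ("xn--") form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Decode an ASCII ("xn--") domain back into its IDN form."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    # The three lookalikes listed above, written with escapes so the
    # exact code points are unambiguous.
    examples = (
        "\u1ecdffice365.com",   # o with dot below
        "off\u012dce365.com",   # i with breve
        "off\u00ecce365.com",   # i with grave
    )
    for name in examples:
        # Only the ASCII form is printed, so the output is safe on any terminal.
        print(to_punycode(name))
```

Decoding any domain that starts with “xn--” before showing it to an analyst, and encoding any IDN before comparing it with log data, keeps both views of the same name in sync.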
To end users, however, the domain names would appear in their IDN format. And since some non-ASCII characters look very similar to ASCII ones, it’s easy to misjudge the domains and think they’re legitimate. For this reason, IDNs can be used effectively in Punycode phishing attacks and business email compromise (BEC) scams.
Commonly Used Non-ASCII Characters
Alternatives to the Letter “o”
It’s easy to distinguish between microsoft[.]com and micr0soft[.]com. The first “o” was obviously replaced with a zero (0). But when the following similar characters are used, you can hardly notice the difference:
- ọ (ọffice365[.]com, micrọsoft[.]com)
- ȯ (microsȯft[.]com)
- ö (microsöft[.]com)
Alternatives to the Letter “i”
We detected several Instagram lookalike domains. Among them are those that use non-ASCII characters that closely resemble the letter “i.” Microsoft and Office 365 typosquatting domains make use of several alternative characters to “i” as well.
- ḭ (ḭnstagram[.]com)
- í (ínstagram[.]com)
- ı (mıcrosoft[.]net, ınstagram[.]xyz)
- ĩ (mĩcrosoft[.]com)
- ì (offìce365[.]com)
- ĭ (offĭce365[.]com)
Alternatives to the Letter “a”
Below are four variations of the letter “a” that threat actors could use to register typosquatting domains:
- ə (instagrəm[.]com)
- à (instagràm[.]com)
- ą (instągram[.]com)
- ā (lloydsbānk[.]com)
Alternatives to the Letter “m”
Two non-ASCII characters that could replace the letter “m” were also detected. They were used to mimic Microsoft and Instagram.
- ṃ (ṃicrosoft[.]com, instagraṃ[.]com)
- ʍ (ʍicrosoft[.]com, instagraʍ[.]com)
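The substitutions listed in this section can be checked programmatically. The sketch below is one possible approach, not the method used by our data feed: it strips combining accents through Unicode NFKD decomposition and then applies a small hand-made table (EXTRA_LOOKALIKES, an assumption of ours) for the three characters above that have no decomposition. The names ascii_skeleton and looks_like are hypothetical.

```python
import unicodedata

# Characters from the lists above that do not decompose into a base letter
# plus an accent. This table is an illustrative assumption, not a standard.
EXTRA_LOOKALIKES = {
    "\u0131": "i",  # dotless i
    "\u0259": "a",  # schwa, used in place of "a"
    "\u028d": "m",  # turned w, used in place of "m"
}


def ascii_skeleton(domain: str) -> str:
    """Reduce a domain to the plain letters it visually resembles."""
    result = []
    for ch in unicodedata.normalize("NFKD", domain.lower()):
        if unicodedata.combining(ch):
            continue  # drop accents such as grave, breve, or dot below
        result.append(EXTRA_LOOKALIKES.get(ch, ch))
    return "".join(result)


def looks_like(domain: str, brand_domain: str) -> bool:
    """True if the domain differs from the brand domain but resembles it."""
    return domain != brand_domain and ascii_skeleton(domain) == brand_domain
```

For example, looks_like("off\u00ecce365.com", "office365.com") returns True, while the genuine domain compared with itself returns False. A production check would need a much larger mapping, such as the confusables data published by the Unicode Consortium.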
Protection against IDN or Punycode Typosquatting
These are just four of the 26 letters of the alphabet that can be imitated in IDN-based attacks. Any brand whose name contains the letters “m,” “a,” “i,” or “o” can easily be mimicked using these special characters.
Most typosquatting domains can go unnoticed by even the most vigilant eyes. For instance, you won’t notice anything suspicious about the domain offìce365[.]com until you look closely at the “ì” with a grave accent.
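Where the eye fails, a script can spell out what is actually in the name. The following minimal sketch lists every non-ASCII character in a domain together with its official Unicode name; the function name non_ascii_report is our own.

```python
import unicodedata


def non_ascii_report(domain: str) -> list:
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (index, ch, unicodedata.name(ch, "UNKNOWN"))
        for index, ch in enumerate(domain)
        if ord(ch) > 127
    ]
```

Called on "off\u00ecce365.com", it reports a single entry at position 3 named LATIN SMALL LETTER I WITH GRAVE, and it returns an empty list for a plain ASCII domain.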
Here are some ways organizations can protect themselves from such abuse.
Early Typosquatting Detection with Typosquatting Data Feed – Our Typosquatting Data Feed will shortly be able to detect typosquatting IDNs, especially those that are registered on the same day as other lookalike domains. The suspicious domains are also reported a day after they are recorded in the DNS, allowing security teams to act immediately. Brand owners can also use the database to see how others are using their brand.
Compare WHOIS Records with Those of the Legitimate Websites – Confirming whether a domain is indeed a typosquatting domain is easy with the help of WHOIS Lookup. Once a typosquatting domain is detected, it can be run through a WHOIS lookup to gain domain intelligence.
Take, for example, offìce365[.]com. The legitimate domain office[.]com has these registrant details:
On the other hand, the typosquatting domain was registered in Japan while all of its other registrant details have been redacted for privacy.
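A comparison like this can be automated once both records are available as key-value pairs. The sketch below assumes the WHOIS data has already been retrieved and parsed into dictionaries; the field names and the values for the legitimate record are placeholders of ours, not actual WHOIS output.

```python
def diff_whois(legitimate: dict, suspect: dict) -> dict:
    """Map each differing field to a (legitimate value, suspect value) pair."""
    fields = sorted(set(legitimate) | set(suspect))
    return {
        field: (legitimate.get(field), suspect.get(field))
        for field in fields
        if legitimate.get(field) != suspect.get(field)
    }


# Placeholder records for illustration only; field names are hypothetical.
LEGITIMATE_RECORD = {
    "registrant_organization": "EXAMPLE-ORG",
    "registrant_country": "XX",
    "registrar": "EXAMPLE-REGISTRAR",
}
SUSPECT_RECORD = {
    "registrant_organization": "REDACTED FOR PRIVACY",
    "registrant_country": "JP",
    "registrar": "EXAMPLE-REGISTRAR",
}
```

With these placeholders, diff_whois(LEGITIMATE_RECORD, SUSPECT_RECORD) flags the registrant organization and country and leaves out the matching registrar field. A mismatch is a signal for further review, not proof of abuse.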
Map Out Domain Infrastructure – For the utmost security, it’s best to investigate typosquatting domains and see other associated domains. This can be done by mapping out the domain’s infrastructure with the help of DNS records.
To illustrate, let’s go back to the WHOIS record of offìce365[.]com. WHOIS Lookup reveals that the domain uses these nameservers:
Running these nameservers through Reverse NS API would return all domain names that use them. For each nameserver, the tool detected over 300 associated domains. Although we didn’t see other domains that mimic famous brands, some of the associated domains could belong to legitimate small businesses that might be sharing their infrastructure with potential phishers.
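The idea behind a reverse nameserver lookup can be shown with a small local index. The sketch below does not call Reverse NS API; it assumes you already hold a mapping of domains to their nameservers, and every name in SAMPLE_NS_RECORDS is a made-up example.

```python
from collections import defaultdict


def normalize_ns(nameserver: str) -> str:
    """Lowercase a nameserver and drop any trailing dot."""
    return nameserver.lower().rstrip(".")


def build_reverse_ns_index(ns_records: dict) -> dict:
    """Invert {domain: [nameservers]} into {nameserver: {domains}}."""
    index = defaultdict(set)
    for domain, nameservers in ns_records.items():
        for ns in nameservers:
            index[normalize_ns(ns)].add(domain)
    return index


def co_hosted_domains(index: dict, domain: str, ns_records: dict) -> set:
    """Return other domains that share at least one nameserver with the domain."""
    related = set()
    for ns in ns_records.get(domain, []):
        related |= index.get(normalize_ns(ns), set())
    related.discard(domain)
    return related


# Made-up sample data for illustration only.
SAMPLE_NS_RECORDS = {
    "suspicious.example": ["ns1.host.example", "ns2.host.example"],
    "shop.example": ["NS1.host.example."],
    "blog.example": ["ns2.host.example"],
    "unrelated.example": ["ns1.other.example"],
}
```

Building the index from SAMPLE_NS_RECORDS and querying it for suspicious.example returns shop.example and blog.example. As noted above, sharing a nameserver does not by itself make a domain malicious.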
IDN-based typosquatting can make BEC scams, Punycode phishing attacks, and other cybercrime more successful since users can hardly distinguish the lookalike domains from the legitimate ones. Organizations can therefore step up their cybersecurity efforts by detecting typosquatting domains as early as possible. They can also gain more domain intelligence on each typosquatting domain by using WHOIS Lookup, Reverse NS API, and other domain intelligence tools.