The use of embedded HTML documents in phishing e-mails is a standard technique employed by cybercriminals. It does away with the need to put links in the e-mail body, which antispam engines and e-mail antiviruses usually detect with ease. HTML offers more possibilities than e-mail for camouflaging phishing content.
There are two main types of HTML attachments that cybercriminals use: HTML files with a link to a fake website or a full-fledged phishing page. In the first case, the attackers can not only hide a link in the file, but also automatically redirect the user to the fraudulent site when they open this file. The second type of HTML attachment makes it possible to skip creating the website altogether and save on hosting costs: the phishing form and the script that harvests the data are embedded directly in the attachment. In addition, an HTML file, like an e-mail, can be modified according to the intended victim and attack vector, allowing for more personalized phishing content.
Fig.1. Example e-mail with an HTML attachment
Structure of phishing HTML attachments
Phishing elements in HTML attachments are usually implemented using JavaScript, which handles redirecting the user to a phishing site or collecting and sending credentials to scammers.
Fig. 2. Phishing HTML page and its source code
Typically, the HTML page sends data to a malicious URL specified in the script. Some attachments consist entirely (or mostly) of a JS script.
In the e-mail source code, the HTML attachment looks like plain text, usually Base64-encoded.
Fig. 3. HTML attachment in e-mail source code
If a file contains malicious scripts or links in plaintext, the security software can quickly parse and block it. To avoid this, cybercriminals resort to various tricks.
JavaScript obfuscation
JavaScript obfuscation is one of the most common techniques used to disguise HTML attachments. To prevent the URL in the file from being quickly spotted and blocked, phishers obfuscate either the phishing link itself or the entire script, and sometimes the whole HTML file. In some cases, cybercriminals obfuscate the code manually, but often they use ready-made tools, of which many are freely available, such as JavaScript Obfuscator.
For example, opening the HTML attachment in the phishing e-mail supposedly from HSBC Bank (see Fig. 1) in a text editor, we see some pretty confusing JS code, which, it would seem, hints neither at opening a link nor at any other meaningful action.
Fig. 4. Example of obfuscation in an HTML attachment
However, it actually is an obfuscated script that redirects the user to a phishing site. To disguise the phishing link, the attackers used a ready-made tool, allowing us to easily deobfuscate the script.
Fig. 5. Deobfuscated script from an attachment in an e-mail seemingly from HSBC Bank: link for redirecting the user
If a script, link, or HTML page is obfuscated manually, it is much harder to restore the original code. To detect phishing content in such a file, dynamic analysis may be required, which involves running and debugging the code.
Encoding
Sometimes attackers use more interesting methods. In one phishing e-mail, for instance, we found an unusual HTML attachment. As in the example above, it contained JavaScript. Because the code was so compact, one might think it was doing the same as the code in the fake HSBC e-mail — that is, redirecting the user to a phishing site. But upon running it, we found a full-fledged phishing page encoded in this small script.
Fig. 6. HTML file using the unescape() method — the source code of the file contains only five lines, one of which is empty
Fig. 7. Phishing page in the HTML attachment
The cybercriminals used an interesting trick that involves the deprecated JS method unescape(). This method substitutes the “%xx” character sequences with their ASCII equivalents in the string that is passed to it. Running the script and viewing the source code of the resulting page, we see plain HTML.
Fig. 8. The resulting HTML file
Instead of unescape(), JavaScript now uses the decodeURI() and decodeURIComponent() methods, yet most modern browsers still support unescape(). We cannot say for sure why the attackers chose a deprecated method, but it could be because modern methods are more likely to be interpreted and detected by antispam engines.
Statistics
In the first four months of 2022, Kaspersky security solutions detected nearly 2 million e-mails containing malicious HTML attachments. Nearly half of them (851,328) were detected and blocked in March. January was the calmest month, with our antispam solutions detecting 299,859 e-mails with phishing HTML attachments.
Number of detected e-mails with malicious HTML attachments, January–April 2022 (download)
Conclusion
Phishers deploy a variety of tricks to bypass e-mail blocking and lure as many users as possible to their fraudulent sites. A common technique is HTML attachments with partially or fully obfuscated code. HTML files allow attackers to use scripts, obfuscate malicious content to make it harder to detect, and send phishing pages as attachments instead of links.
Kaspersky security solutions detect HTML attachments containing scripts regardless of obfuscation.
HTML attachments in phishing e-mails