Spammers have jumped on the little-used soft hyphen (or SHY character) to fool URL filtering devices. According to researchers at Symantec Corp., spammers are larding up URLs for sites they promote with the soft hyphen character, which many browsers ignore.
Spammers aren’t shy about exploiting humans’ flexible cognitive abilities to slip their messages past spam filters (H3rb41 V14gr4, anyone?). They’re also ever-alert to flaws or inconsistencies in the way that browsers render text, which let them slip the URLs in their pitches past programs designed to spot unwanted solicitations, phishing attempts and more.
The latest trend, according to researchers at Symantec Corp., involves the use of a little-known character called the soft hyphen or “SHY” character to disguise malicious URLs in spam messages. Writing on the Symantec Connect blog, researcher Samir Patil said that the company has seen recent spam messages that insert the HTML symbol for the soft hyphen to obfuscate URLs for Web pages promoted by the spammers.
Soft hyphens are represented in HTML by the character entity “&shy;” and, where they are displayed, rendered by a graphic symbol that’s identical to a standard hyphen (-). Unlike standard hyphens, though, soft hyphens are only used to mark the places where a word may be broken across lines, say within a Microsoft Word document. However, common Web browsers, including Mozilla’s Firefox, don’t render the soft hyphen. That has enabled spammers to lard up the URLs of Web sites they’re promoting with soft hyphen characters, ensuring that users will see a properly formatted URL, while URL filters that rely on text matching will be fooled, Patil wrote.
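To illustrate the mismatch Patil describes, here is a minimal Python sketch of a text-matching URL filter that is fooled by soft hyphens, alongside one that normalizes the text first. It is not Symantec's method. The blocklist, the domain `spam-example.test` and the function names are all hypothetical, chosen only for illustration.

```python
import html

SOFT_HYPHEN = "\u00ad"  # Unicode code point for the soft hyphen (SHY)

# Hypothetical blocklist entry, for illustration only.
BLOCKED_DOMAINS = ["spam-example.test"]


def naive_match(text):
    """Plain substring matching, as a simple URL filter might do."""
    return any(domain in text for domain in BLOCKED_DOMAINS)


def normalize(text):
    """Decode HTML entities (&shy;, &#173;), then drop soft hyphens."""
    return html.unescape(text).replace(SOFT_HYPHEN, "")


def normalized_match(text):
    """Match against the text with soft hyphens stripped out."""
    return naive_match(normalize(text))


# A URL padded with soft hyphen entities, as it might appear in message HTML.
obfuscated = "http://spam-exa&shy;mple.te&shy;st/offer"

print(naive_match(obfuscated))       # the padded string does not match
print(normalized_match(obfuscated))  # the cleaned string does
```

Stripping the character before matching is only one possible countermeasure, and it assumes the filter sees the same markup the browser does.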
More advanced content analysis technologies that don’t rely on URL matching can spot the obfuscation and block the messages anyway, Patil wrote, but e-mail users still need to be on guard and have anti-malware and anti-spam products running on their systems.
Inconsistent rendering of standard HTML elements has been a major sticking point for Internet security advocates. It has also been a major loophole for spammers and phishers, who take advantage of irregularities in the rendering of HTML content to trick users into clicking on innocuous-seeming links that deliver malicious content. The advent of HTML 5 within the next couple of years, along with browsers that support it, is expected to solve many of these problems, because that specification finally standardizes how HTML code should be parsed by Web browsers, rather than leaving it up to individual platform vendors to develop their own interpretations.