Spammers have jumped on the little-used soft hyphen (or SHY character) to fool URL filtering devices. According to researchers at Symantec Corp., spammers are larding up URLs for sites they promote with the soft hyphen character, which many browsers ignore.

Spammers aren’t shy about exploiting humans’ flexible cognitive abilities to slip past the notice of spam filters (H3rb41 V14gr4, anyone?). They’re also ever-alert to flaws or inconsistencies in the way that browsers render text, which let them slip the URLs in their pitches past programs designed to spot unwanted solicitations, phishing attempts and more.

The latest trend, according to researchers at Symantec Corp., involves the use of a little-known character called the soft hyphen, or “SHY” character, to obscure malicious URLs in spam messages. Writing on the Symantec Connect blog, researcher Samir Patil said that the company has seen recent spam messages that insert the HTML symbol for the soft hyphen to obfuscate URLs for Web pages promoted by the spammers.

Soft hyphens are represented by the HTML character entity “&shy;” and, when displayed, look identical to a standard hyphen (-). Unlike standard hyphens, though, soft hyphens are only used to mark where a word may be broken across lines, say within a Microsoft Word document. Common Web browsers, including Mozilla’s Firefox, don’t render the soft hyphen, however. That has enabled spammers to lard up the URLs of Web sites they’re promoting with soft hyphen characters, ensuring that users will see a properly formatted URL while URL filters that rely on text matching will be fooled, Patil wrote.
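To make the mechanism concrete, here is a minimal Python sketch of why plain text matching misses a URL padded with soft hyphens, and how decoding entities and stripping the character (U+00AD) before matching closes the gap. It is only an illustration, not a description of how Symantec's or any other vendor's filter works. The domain `spam-example.test`, the blocklist and all function names are hypothetical.

```python
import html

SOFT_HYPHEN = "\u00ad"  # the character that the &shy; entity decodes to

# Hypothetical blocklist entry, used purely for illustration.
BLOCKLIST = {"spam-example.test"}


def naive_match(text: str) -> bool:
    """Plain substring matching against the blocklist, with no normalization."""
    return any(domain in text for domain in BLOCKLIST)


def normalize(text: str) -> str:
    """Decode HTML entities, then remove every soft hyphen."""
    return html.unescape(text).replace(SOFT_HYPHEN, "")


def normalized_match(text: str) -> bool:
    """Match against the blocklist after normalizing the text."""
    return naive_match(normalize(text))


def contains_soft_hyphen(text: str) -> bool:
    """Report whether the text carries a soft hyphen, raw or as an entity."""
    return SOFT_HYPHEN in html.unescape(text)


# A URL as it might appear in the HTML source of a message.
obfuscated = "http://spam&shy;-exam&shy;ple.te&shy;st/offer"

print(naive_match(obfuscated))           # False: the raw text never contains the domain
print(normalized_match(obfuscated))      # True: the domain appears once soft hyphens are gone
print(contains_soft_hyphen(obfuscated))  # True: could serve as a spam signal on its own
```

The last helper reflects a simpler heuristic: a soft hyphen has no obvious legitimate use inside a URL, so its mere presence could be treated as suspicious.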

More advanced content analysis technologies that don’t rely on URL matching can spot the obfuscation and block the messages anyway, he wrote, but e-mail users still need to be on guard and keep anti-malware and anti-spam products running on their systems.

Inconsistent rendering of standard HTML elements has been a major sticking point for Internet security advocates and a major loophole for spammers and phishers, who take advantage of irregularities in the rendering of HTML content to trick users into clicking on innocuous-seeming links that deliver malicious content. The advent of HTML 5 within the next couple of years, along with browsers that support it, is expected to solve many of these problems, because that specification finally standardizes how HTML code should be parsed by Web browsers rather than leaving it up to individual platform vendors to develop their own interpretations.

Categories: Malware

Comments (15)

  1. Anonymous

    Hah, don’t even need IDN for this to work. Though I’d be surprised if that won’t be exploited in the not so terribly far future too. Oh, and something about forgetting the terminating semicolon.

  2. ghettohacker inc

    this doesn’t seem to parse as an ignored character in the latest versions of IE, Firefox, or Chrome.

  3. Jeremy

    Oh h3ll to the yes to AC at 6:27pm. Death to people who send us unwanted email. Meanwhile torture victims in the US are conspiracy theorists or delusional.

  4. Huckle

    Of course they’re delusional! The torture victims are either dead or still being tortured, so if someone tells you they’re a torture victim then they’re either conspiracy theorists or delusional.

  5. rob

    Maybe this would not be such a problem if people would not confuse email with the www and use stupid html-crap in emails in the first place. Serves them right. Maybe learning through pain is the only way.

  6. FlyLikeAG6

    “What in the hell is strelaoz talking about?”

    Not quite sure, but I give her props for quoting Warren Buffett in a revenge post.

  7. Anonymous

    Show up for a security warning and get entertained by guerrilla theater. This truly is a full-service site.

  8. Grundibular

    Search on “David Freer symantec”. She’s popping up all over.

    Also, why the Times New Roman?!

  9. Anonymous Coward

    “Soft” hyphens are used as conditional hyphens within a word, a job that is now handled correctly by the built-in hyphenators of word processors. Seldom used!

    There is no legitimate reason to use a soft hyphen in any URL … so looking for it and declaring any message with a URL that contains a soft hyphen to be spam seems pretty simple.

  10. DuffManLight

    Finally something that is not dull and boring in the AV world! Who the hell wants to hear about the SHY character anyway?

  11. Jeffrey A. Williams

    This is not all that new of a development, but it is good that it is getting some press/exposure so that browser developers can make the necessary adjustments. What this article doesn’t make very clear IMO is whether or not the perps of the misuse of ‘shy’ are the registrants, malicious spammers, or the registrars turning a blind eye at registration time. In any event these domain names once recognized would seem to me to be good candidates for DMCA takedowns.
