Almost no one on the Unicode List seems to "get" phishing

Sorting it all Out
Michael Kaplan's random stuff of dubious value
Be sure to read the disclaimer here first!

The Unicode List is up to its old fun and games again (well, actually it's the participants, not the list itself), and this time it is not about the Unicode BOM.

I talked a little about this problem in an earlier post, "International Domain Names? The sign on the door says 'Gone Phishing'...."

Then some people started really getting into it because a bunch of hackers "found" a homograph spoofing issue. They even registered an evil URL (www.pаypal.com -- the first "a" is U+0430, a CYRILLIC SMALL LETTER A) which in browsers that support the new IDN/punycode stuff becomes www.xn--pypal-4ve.com.
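The conversion described above can be reproduced with Python's built-in `idna` codec. This is only a minimal sketch: the codec follows the older IDNA 2003 rules, which may differ in detail from what any particular browser does, and the variable names are illustrative.

```python
# The first "a" is U+0430 (CYRILLIC SMALL LETTER A); everything else is ASCII.
spoofed = "www.p\u0430ypal.com"
genuine = "www.paypal.com"

# The "idna" codec turns every label that contains non-ASCII characters
# into its ASCII form: the prefix "xn--" followed by the punycode encoding.
spoofed_ascii = spoofed.encode("idna").decode("ascii")
genuine_ascii = genuine.encode("idna").decode("ascii")

print(spoofed_ascii)       # www.xn--pypal-4ve.com
print(genuine_ascii)       # www.paypal.com
print(spoofed == genuine)  # False, even though the two look alike on screen
```

The two strings only look the same; on the wire they are entirely different names.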

Then those folks at the Unicode List weighed in (in a thread with 116 posts the last time I looked)....

The "solution" that many people have touted involves a list of common cross-script combinations that might be expected (like Kana and Kanji), and then showing the actual punycode names, since that way people could tell they were being spoofed.

Anyone else see the flaw here?

The feature is for international domain names. If it were just ASCII then a confusing string would indeed warn users that bad things were going to happen. But if we were all using ASCII we wouldn't need IDN in the first place, now would we?

Doesn't it make the whole feature suck just a little bit for its target users if they are left seeing weird crap every time they go to a site that uses their native language for the URL?

I almost weighed into the thread to point out the obvious problems in approach but I did not want to add to the noise (and most likely be drowned out by the people who point out that there is no way to make it secure and how IDN will bring down the internet). So I did not become post #117.

Oops, a few more while I was typing this; mine would have been #120. Sometimes in this post-Kitty Genovese era in which we all live, it is better to not get involved....

 

This post brought to you by "а" (U+0430, a.k.a. CYRILLIC SMALL LETTER A)
A letter that is feeling quite popular these days and which would like to point out that this site is not ВӀоgs.Мsdn.соm/miсhкар no matter what the URL looks like...

Comment on the blather
  • I have absolutely no desire to receive email from or do business with anyone outside of the US. Give me the ability to filter out anything that does not originate from a specific country or region and I'll be happy. This includes 3rd party web sites and visitors/traffic to my web site.

    The average person simply does not need to be able to receive email from someone in Russia or Venezuela. By filtering everything that does not originate in the US, you greatly reduce your attack surface.

    Give me a RELIABLE way to do this and the whole IDN thing becomes a non-issue.
  • Well, that is great for you. But how about the rest of the world?

    "We are not alone anymore" (and I do not mean that in an X-Files sense!)
  • That fixes nothing Matt; the fake paypal address could just as easily have been sent from inside the US.

    All this shows is that IDN is poorly thought out and you need to turn off support for it in your browser for the foreseeable future.
  • Hm, AFAIK the entered IDN is normalized before being converted to Punycode (e.g. ß becomes ss)... couldn't mappings be added that map identical-looking characters to the same code point (e.g. the small cyrillic a would get mapped to the plain old ASCII small a), for anti-phishing purposes?
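The normalization step mentioned in this comment can be observed with Python's built-in `idna` codec, which applies the IDNA 2003 "nameprep" mappings. A small sketch, with illustrative variable names, showing that ß is folded while the Cyrillic а is not:

```python
# Nameprep folds U+00DF (ß) to "ss" before the punycode step, so the
# resulting label is plain ASCII and needs no "xn--" prefix.
folded = "stra\u00dfe".encode("idna")
print(folded)    # b'strasse'

# U+0430 (Cyrillic а) is a letter in its own right, so no mapping applies:
# it survives normalization and forces the punycode form.
unfolded = "p\u0430ypal".encode("idna")
print(unfolded)  # b'xn--pypal-4ve'
```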
  • I think we have two problems here:

    a) "Cross-scripting", when individual letters/symbols are replaced by their homographs in another script to fit in. The current paypal address is an example of that.

    b) When the complete domain name is represented in another script, while maintaining homography.

    Something like the cross item list you mention could solve a -- limit which character groups may appear in one domain. www.<strangestuff>.com should be allowed, while www.pa<strangestuff>pal.com should not.

    This would alleviate the problem a little. It does certainly not solve it, because of b and also because of
    c) íìïîı, or for example â in languages where you're used to seeing å (and have a shitty non-ClearType, low resolution, small size display and simply don't look closely)

    I don't see any easy ways out of this. One could speculate about a highlighting in the browser, in addition to the site security, of whether the site is in the browsing history. Hopefully, you're more careful if you log in for the first time and create your account or are setting up a new machine, while a phishing attack trying to get you to key in your account info would give a somehow distinctly different "experience" to the user.

    After all, highlighting that a site has already been visited has been around for ages for LINKS to said sites; adding it prominently in the browser UI could improve things.

    Just some stray thoughts, I know this is far from both the Unicode list and your field of direct interest.
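The restriction suggested for case (a) in the comment above can be sketched roughly as follows. Python's standard library does not expose the Unicode Script property, so this sketch assumes the first word of each character's Unicode name is a good enough stand-in for its script; the function names are made up for the example.

```python
import unicodedata

def script_of(ch):
    """Rough script guess: the first word of the character's Unicode name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def scripts_in_label(label):
    """Scripts used by the letters of one label (digits and hyphens ignored)."""
    return {script_of(ch) for ch in label if ch.isalpha()}

def is_mixed(label):
    """True when a single label draws its letters from more than one script."""
    return len(scripts_in_label(label)) > 1

hosts = [
    "www.paypal.com",                   # all Latin
    "www.p\u0430ypal.com",              # case (a): one Cyrillic letter slipped in
    "www.\u0440\u0435\u043a\u0430.com", # case (b): label entirely Cyrillic
]
for host in hosts:
    flagged = [label for label in host.split(".") if is_mixed(label)]
    print(host.encode("idna").decode("ascii"), "->", flagged or "nothing flagged")
```

Only the second host is flagged. A label written entirely in one script passes, which is exactly why case (b) is left unsolved by this kind of check.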
  • "e.g. the small cyrillic a would get mapped to the plain old ASCII small a), for anti-phishing purposes?"

    That would mean that I can't go and register река.com (река (pronounced re-'kah) means river in Russian) since peka.com (with all Latin characters) is already taken. That would create inconvenience for anybody trying to register an IDN, and also open up new possibilities for squatting (what if I register my IDN first, before you get the chance to register the all-Latin domain name?)
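The collision described in this reply is easy to demonstrate. The fold table below is hypothetical (it is not part of any IDN standard) and covers only the four letters needed for the example:

```python
# Hypothetical "anti-phishing" fold table: Cyrillic letters mapped to the
# Latin letters they resemble. For illustration only.
FOLD = str.maketrans({
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043a": "k",  # CYRILLIC SMALL LETTER KA
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
})

def fold(label):
    """Replace each listed Cyrillic letter with its Latin look-alike."""
    return label.translate(FOLD)

reka = "\u0440\u0435\u043a\u0430"  # the Russian word for "river"
print(fold(reka))                  # peka
print(fold(reka) == "peka")        # True: the Russian word now collides
```

Under such a mapping the two registrations become the same name, so whoever registers first locks the other out.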
  • For a concrete example of point b, "paypal" can more or less convincingly be represented as "раура1" - that's five Cyrillic letters and a digit 'one'. With some fonts, you'll have a hard time telling the difference.

    Here is the list of Cyrillic letters that look similar to some Latin ones - have fun finding sites that could be spoofed:
    а е з к о р с у х
    In addition to these (that look similar in both upper- and lowercase), a few letters only look right when capitalized:
    В М Н Т Ь
    Well, the last one is a bit of a stretch.
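As a rough sketch of how one might go looking for spoofable names with the lowercase list above: the pairing of each Cyrillic letter with a Latin one is an assumption made for this example, "з" is left out because it resembles the digit 3 rather than a Latin letter, and the function name is made up.

```python
# Latin letter -> Cyrillic look-alike, based on the lowercase list above.
LOOKALIKE = {
    "a": "\u0430", "e": "\u0435", "k": "\u043a", "o": "\u043e",
    "p": "\u0440", "c": "\u0441", "y": "\u0443", "x": "\u0445",
}
# The extra trick from the "paypal" example: digit one standing in for "l".
DIGITS = {"l": "1"}

def spoof(name, allow_digits=False):
    """Return a look-alike spelling of name, or None if some letter has none."""
    table = dict(LOOKALIKE, **DIGITS) if allow_digits else LOOKALIKE
    out = []
    for ch in name:
        if ch not in table:
            return None
        out.append(table[ch])
    return "".join(out)

print(spoof("paypal"))                     # None: no Cyrillic stand-in for "l"
print(spoof("paypal", allow_digits=True))  # five Cyrillic letters and a "1"
print(spoof("msdn"))                       # None: m, s, d, n have no lowercase look-alike
```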
  • I am somewhat of a linguist but know nothing about character sets, Unicode etc, beyond the fact it exists.

    That said, there is next to no reason for having different language character sets in the same URL.

    A warning on the Firefox/Internet Explorer gold bar when a URL has mixed alphabets, and also prompting I guess, would solve 99.9% of the problem.
  • Well, there are many cases that are ok:

    1) Japanese could reasonably have Hiragana, Katakana, and/or Kanji and could reasonably mix the first or the second with the third (or with each other in some cases).

    2) A site about the Talmud with a URL of www.תלמוד-on-the-web.com or Ελλας-fans.org (both examples from a post by Mark Shoulson). Note that the internet requires some things be in Latin, and the other Latin additions are not unreasonable.
  • OK, point taken.

    But still, what is wrong with a gold bar warning in all such cases, with prompting by default with information (which can be turned off by advanced users)?
  • Sorry, obviously prompting is only necessary at secure sites.
  • A potential approach?
  • Gary, this is the kind of cool, well thought out, reasonable approach to the problem which gets buried in with all of the dumbass arguments that people have about the problem on big mailing lists.

    Thanks for pointing it out, it's a very cool approach!
  • Breaking news.

    Firefox will break IDN support by default. Those who want it will be able to easily enable it.

    see feedhouse.mozillazine.org
  • Hmmm.... this is in its own way just as bad though, isn't it? Going from one extreme to the other.

    Unless this is a temporary thing until they enable functionality to do IDN right. Otherwise Firefox is just saying that if you want IDN you will be screwed over by phishing.

    So Firefox either gets one thumb up for admitting they were overeager and working to slow down and do it right (to get two thumbs up you have to do all that the first time!) or two thumbs down if they are just panicking and turning off the functionality....