<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>I'm not a Klingon (&lt;span style="font-family:pIqaD,code2000"&gt; &lt;/span&gt;) : IDN (Internationalized Domain Names)</title><link>http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx</link><description>Tags: IDN (Internationalized Domain Names)</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>The FUD of IDN and Homographs</title><link>http://blogs.msdn.com/shawnste/archive/2009/11/23/the-fud-of-idn-and-homographs.aspx</link><pubDate>Tue, 24 Nov 2009 00:13:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9927640</guid><dc:creator>shawnste</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/9927640.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=9927640</wfw:commentRss><description>&lt;P&gt;I was pointed to this article &lt;A href="http://www.microsofttranslator.com/BV.aspx?ref=Internal&amp;amp;a=http%3a%2f%2fwww.bortzmeyer.org%2fidn-et-phishing.html"&gt;http://www.microsofttranslator.com/BV.aspx?ref=Internal&amp;amp;a=http%3a%2f%2fwww.bortzmeyer.org%2fidn-et-phishing.html&lt;/A&gt;&amp;nbsp;about IDN and homographs, which points out that most of the fear around IDN and phishing is unfounded.&amp;nbsp; Seemed like a good reference (thanks, Mark), so I'm forwarding.&amp;nbsp; (For some reason Mark used a different translation engine though).&lt;/P&gt;
&lt;P&gt;Cross-tagged with EAI since the same concerns about homographs and phishing apply to email.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9927640" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category><category domain="http://blogs.msdn.com/shawnste/archive/tags/eMail+Address+Internationalization/default.aspx">eMail Address Internationalization</category></item><item><title>IDNA2008 / IDNAbis on Windows 7, Vista, Net, etc.</title><link>http://blogs.msdn.com/shawnste/archive/2009/10/27/idna2008-idnabis-on-windows-7-vista-net-etc.aspx</link><pubDate>Tue, 27 Oct 2009 18:04:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9913614</guid><dc:creator>shawnste</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/9913614.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=9913614</wfw:commentRss><description>&lt;P&gt;Some people have asked what they should do to support IDNA2008 on Microsoft platforms.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;We provide IdnToAscii() and related functions in the Windows SDK.&amp;nbsp; That's available natively on Vista+, and through idndl.dll on earlier platforms.&amp;nbsp; Idndl is shipped with IE 7, or through &lt;A href="http://www.microsoft.com/downloads/details.aspx?familyid=AD6158D7-DDBA-416A-9109-07607425A815&amp;amp;displaylang=en" mce_href="http://www.microsoft.com/downloads/details.aspx?familyid=AD6158D7-DDBA-416A-9109-07607425A815&amp;amp;displaylang=en"&gt;"Microsoft Internationalized Domain Names (IDN) Mitigation APIs"&lt;/A&gt; at the Microsoft Download Center.&lt;/P&gt;
&lt;P&gt;For .Net, the IdnMapping class provides IDNA2003 conversion since .Net V2.&lt;/P&gt;
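&lt;P&gt;As a rough illustration of what this kind of conversion does (this is not the Windows or .Net API), Python's built-in "idna" codec also implements IDNA2003. The helper names and the host name below are made up for the example.&lt;/P&gt;
&lt;PRE&gt;
```python
# Minimal sketch of IDNA2003 conversion using Python's built-in "idna" codec.
# to_ascii / to_unicode are illustrative names, not part of any Microsoft API.

def to_ascii(host: str) -> str:
    """Convert a Unicode host name to its ASCII (punycode) form."""
    return host.encode("idna").decode("ascii")


def to_unicode(host: str) -> str:
    """Convert an ASCII (punycode) host name back to Unicode."""
    return host.encode("ascii").decode("idna")


ascii_host = to_ascii("bücher.example")
print(ascii_host)              # xn--bcher-kva.example
print(to_unicode(ascii_host))  # bücher.example
```
&lt;/PRE&gt;
&lt;P&gt;An application that always goes through one conversion entry point like this picks up whatever the platform behind it implements, which is the point of sticking to the exposed APIs.&lt;/P&gt;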
&lt;P&gt;Obviously these APIs currently only support IDNA2003, however the interfaces won't change for IDNA2008 support.&amp;nbsp; I don't know what mechanism will be used to update for IDNA2008 support, but applications should continue to use the exposed APIs for consistent support across the platform.&amp;nbsp; That way if the system gets updated to IDNA2008, the application will be able to take advantage of the updated support.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9913614" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category></item><item><title>Oversimplification of EAI/IMA (International eMail Addresses)</title><link>http://blogs.msdn.com/shawnste/archive/2009/08/18/oversimplification-of-eai-ima-international-email-addresses.aspx</link><pubDate>Wed, 19 Aug 2009 00:58:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9874662</guid><dc:creator>shawnste</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/9874662.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=9874662</wfw:commentRss><description>&lt;P&gt;A couple months ago I blogged about EAI &lt;A href="http://blogs.msdn.com/shawnste/archive/2009/06/04/email-address-internationalization-internationalized-email-addresses-eai-ima.aspx" mce_href="http://blogs.msdn.com/shawnste/archive/2009/06/04/email-address-internationalization-internationalized-email-addresses-eai-ima.aspx"&gt;Email Address Internationalization/Internationalized Email Addresses (EAI/IMA)&lt;/A&gt;&amp;nbsp;and felt like blogging again.&lt;/P&gt;
&lt;P&gt;China's been very interested in non-ASCII email addresses for some time, and is working hard to adopt the EAI standard.&amp;nbsp; I've heard a target of November 2009 for that standard.&amp;nbsp; &lt;A href="http://www.china.org.cn/china/sci_tech/2008-09/27/content_16544162.htm"&gt;http://www.china.org.cn/china/sci_tech/2008-09/27/content_16544162.htm&lt;/A&gt;&amp;nbsp;briefly addresses EAI.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Oversimplification of EAI&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The basic concept of EAI is "just" to use UTF-8 for email.&amp;nbsp; Most software can comply just by allowing Unicode in its email addresses.&amp;nbsp;&amp;nbsp;Using UTF-8 is&amp;nbsp;reasonably straightforward, and most of the details are about compatibility with existing mail standards.&amp;nbsp; The IETF working group has a page at &lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;A href="http://www.ietf.org/dyn/wg/charter/eai-charter.html"&gt;http://www.ietf.org/dyn/wg/charter/eai-charter.html&lt;/A&gt;.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;STRONG&gt;Local Part of the Email Address&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;The local part of an email address is the user account part.&amp;nbsp; Often times servers allow it to be case-insensitive, however it can also be case-sensitive.&amp;nbsp; Similarly EAI allows the servers to define any mappings of the local part that are appropriate for that organization.&amp;nbsp; Some may choose to do case mapping similar to existing case-insensitive servers.&amp;nbsp; A different mapping, like Turkish behavior for i and I is possible.&amp;nbsp; Another option would be to perform normalization like NFC or NFKC on the name.&amp;nbsp; Width mapping and aliases are possible.&amp;nbsp; Just like now, clients would just use the names given and let the recipient's mail server figure it out.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
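&lt;P&gt;A minimal sketch of what such a server-side mapping could look like. The function name and the particular policy (NFKC, optional Turkish casing, then case folding) are assumptions chosen for illustration; EAI leaves the actual choice to each mail server.&lt;/P&gt;
&lt;PRE&gt;
```python
import unicodedata


def canonical_local_part(local: str, turkish: bool = False) -> str:
    """Map a mailbox name to the form a hypothetical server stores internally."""
    # NFKC composes accents and folds width variants such as fullwidth letters.
    mapped = unicodedata.normalize("NFKC", local)
    if turkish:
        # Turkish casing: capital I pairs with dotless i, dotted capital I with i.
        mapped = mapped.replace("I", "\u0131").replace("\u0130", "i")
    # Case folding makes the comparison case-insensitive.
    return mapped.casefold()


print(canonical_local_part("Shawn") == canonical_local_part("SHAWN"))   # True
print(canonical_local_part("ISIK") == canonical_local_part("ISIK", turkish=True))  # False
```
&lt;/PRE&gt;
&lt;P&gt;Two servers with different policies can disagree about whether two names are the same mailbox, which is why clients should send the name as given and not try to apply a mapping themselves.&lt;/P&gt;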
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;STRONG&gt;Domain Part&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;EAI allows Unicode (UTF-8) for the entire address, so special mapping isn't necessary.&amp;nbsp; Of course if the domain doesn't have a valid registration, eg: isn't valid IDN, then it won't work, but that's not really an email protocol issue.&amp;nbsp; EAI uses UTF-8 instead of "punycode" for domain names.&amp;nbsp; Punycode only happens when "downgrading."&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;STRONG&gt;Negotiation&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;Mostly, "just using UTF-8" is pretty simple, but for backward compatibility, EAI aware servers and clients will need to negotiate their protocols.&amp;nbsp; For SMTP, the UTF8SMTP extension does this.&amp;nbsp; EAI aware servers advertise the UTF8SMTP extension and agree to communicate in UTF-8.&amp;nbsp; If the server doesn't provide that flag, then the clients have to use a different mechanism.&amp;nbsp; The other protocols have similar handshaking.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
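&lt;P&gt;The SMTP side of that handshake can be sketched as a check of the server's EHLO reply for the extension keyword. The reply texts and host names below are hypothetical, and the keyword is a parameter because the later standards-track version of the extension (RFC 6531) is named SMTPUTF8 instead of UTF8SMTP.&lt;/P&gt;
&lt;PRE&gt;
```python
def advertised_extensions(ehlo_reply: str) -> set:
    """Collect the extension keywords from a multi-line SMTP EHLO reply."""
    keywords = set()
    # The first reply line is the server greeting; extensions follow it.
    for line in ehlo_reply.splitlines()[1:]:
        # Each line looks like "250-KEYWORD params" or "250 KEYWORD params".
        fields = line[4:].split()
        if fields:
            keywords.add(fields[0].upper())
    return keywords


def can_use_utf8(ehlo_reply: str, keyword: str = "UTF8SMTP") -> bool:
    """True if the server advertised the given UTF-8 extension keyword."""
    return keyword in advertised_extensions(ehlo_reply)


# Hypothetical replies from an EAI aware server and a legacy one.
EAI_SERVER = "250-mail.example.net\n250-8BITMIME\n250-UTF8SMTP\n250 PIPELINING"
LEGACY_SERVER = "250-mail.example.org\n250-8BITMIME\n250 PIPELINING"

print(can_use_utf8(EAI_SERVER))     # True
print(can_use_utf8(LEGACY_SERVER))  # False
```
&lt;/PRE&gt;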
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;STRONG&gt;Downgrading&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;All email clients and servers aren't going to instantly become Unicode aware, so there is a downgrading concept for compatibility.&amp;nbsp; Downgrade is the area with the most churn in the experimental standards, but the basic concept remains the same.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;If you have an EAI aware server and you try to talk to an unaware system, you'll need to fall back to the legacy protocols and encoding mechanisms.&amp;nbsp; Effectively this means that EAI accounts will need an ASCII alias so that if an EAI mail fails, it can be resent using the ASCII alias and MIME encodings.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;To a legacy recipient, such a mail would appear as any other legacy email, and replies would go to the sender's ASCII alias.&amp;nbsp; The receiving server would need to recognize that the ASCII and Unicode EAI aliases were for the same account and route the mail appropriately.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
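&lt;P&gt;The alias idea can be sketched as follows. The Mailbox type, the function names and the addresses are all hypothetical, and real downgrading also involves re-encoding headers, which is left out here.&lt;/P&gt;
&lt;PRE&gt;
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mailbox:
    unicode_address: str  # the EAI address
    ascii_alias: str      # the legacy fallback for the same account


def sender_address(mailbox: Mailbox, peer_supports_utf8: bool) -> str:
    """Pick the address to present, downgrading to the alias for legacy peers."""
    return mailbox.unicode_address if peer_supports_utf8 else mailbox.ascii_alias


def build_routing_table(mailboxes) -> dict:
    """Map both forms of every address to the same account on the receiving server."""
    table = {}
    for mailbox in mailboxes:
        table[mailbox.unicode_address] = mailbox
        table[mailbox.ascii_alias] = mailbox
    return table


box = Mailbox("用户@example.com", "yonghu@example.com")
routes = build_routing_table([box])

print(sender_address(box, peer_supports_utf8=False))  # yonghu@example.com
print(routes["yonghu@example.com"] is routes["用户@example.com"])  # True
```
&lt;/PRE&gt;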
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;There was some discussion of providing additional data that allows reconstructing a downgraded mail, but most of those techniques seem to break at least some legacy clients and have additional problems.&amp;nbsp; My feeling is also that if a client knows how to reconstruct a downgraded mail, it also knows EAI anyway, so likely the mail would never be downgraded, so the additional complexity is unnecessary.&amp;nbsp; I think it's likely that the initial standards will only specify minimal downgrading and not the ability to reconstruct a downgraded message.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;STRONG&gt;Status&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: 'MS Mincho'; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Mangal; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: JA; mso-bidi-language: AR-SA"&gt;Of course the IETF RFCs are still experimental and China hasn't published their standards yet, but my oversimplification probably won't change much in the final version.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9874662" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category><category domain="http://blogs.msdn.com/shawnste/archive/tags/eMail+Address+Internationalization/default.aspx">eMail Address Internationalization</category></item><item><title>Unicode, IDN (IDNA), EAI (IMA) and Homograph Security</title><link>http://blogs.msdn.com/shawnste/archive/2009/07/07/unicode-idn-idna-eai-ima-and-homograph-security.aspx</link><pubDate>Tue, 07 Jul 2009 22:33:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9823241</guid><dc:creator>shawnste</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/9823241.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=9823241</wfw:commentRss><description>&lt;P&gt;I wrote about IDN &amp;amp; Security before &lt;A 
href="http://blogs.msdn.com/shawnste/archive/2005/03/03/384692.aspx"&gt;http://blogs.msdn.com/shawnste/archive/2005/03/03/384692.aspx&lt;/A&gt; but thought I'd share some of my more&amp;nbsp;updated views about security of URLs/IDN/Unicode/Email addresses.&lt;/P&gt;
&lt;P&gt;People haven't really bothered much with DNS&amp;nbsp;or character based&amp;nbsp;security when it was limited to ASCII.&amp;nbsp; I'm not sure if this is because&amp;nbsp;people just&amp;nbsp;didn't think about it, or if they thought there wasn't a problem or whatever.&amp;nbsp; What security attacks happen have been regarded more as "oh, that's curious" rather than a real concern.&amp;nbsp; Basically there seems to be a presumption that a script, like&amp;nbsp;the ASCII subset of Latin,&amp;nbsp;is inherently secure.&amp;nbsp; Therefore it would seem reasonable that if ASCII Latin can be secure, and other scripts or mixed script environments have homographs, then those scenarios must be insecure and are therefore broken.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Latin and ASCII aren't Secure&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The problem with that logic is that it's flawed.&amp;nbsp; Homographs exist in Latin/ASCII, however &lt;A href="http://rnicrosoft.com/"&gt;http://rnicrosoft.com&lt;/A&gt; tends to be regarded as "quaint and amusing" rather than a security problem.&amp;nbsp; (There used to be a web page there, dunno what happened).&amp;nbsp; Similarly g00gle or MlCROSOFT or whatnot can all happen in ASCII.&amp;nbsp; Some things can be done to ASCII to limit the risk, such as choosing fonts or making things lowercase, but that's not always possible.&amp;nbsp; &lt;/P&gt;
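&lt;P&gt;To make the ASCII-only case concrete, here is a toy "skeleton" function that maps a few look-alike sequences to one representative. The four-entry table is an assumption made up for this example and is nowhere near a complete confusables list.&lt;/P&gt;
&lt;PRE&gt;
```python
# Tiny, illustrative table of ASCII look-alikes: (what is typed, what it resembles).
LOOKALIKES = (("rn", "m"), ("0", "o"), ("1", "l"), ("I", "l"))


def skeleton(text: str) -> str:
    """Replace each look-alike with its representative so that similar strings collide."""
    for lookalike, representative in LOOKALIKES:
        text = text.replace(lookalike, representative)
    return text


pairs = [("rnicrosoft", "microsoft"), ("g00gle", "google"), ("MlCROSOFT", "MICROSOFT")]
for spoof, real in pairs:
    # Every pair collides even though no character is outside ASCII.
    print(spoof, real, skeleton(spoof) == skeleton(real))
```
&lt;/PRE&gt;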
&lt;P&gt;&lt;STRONG&gt;Strings are Typed and Read by Humans&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Even if the scripts themselves are perfect, the strings we use with the scripts are not.&amp;nbsp; For example, users have to type them in, and they may or may not use upper or lower case (in cased scripts).&amp;nbsp; I heard one computer expert indicate that users should just figure out how to enter URLs in lower case, in Unicode Normalization Form C.&amp;nbsp; (Instead of addressing the problem we should educate all the users).&amp;nbsp; I wish he were joking.&lt;/P&gt;
&lt;P mce_keep="true"&gt;Depending on the context, there are things you can do to ASCII only strings that can confuse users.&amp;nbsp; For example &lt;A href="http://microsoft.secure.com/"&gt;http://microsoft.secure.com&lt;/A&gt; isn't going to necessarily go to a Microsoft site.&amp;nbsp; &lt;A href="http://secure.com/microsoft.com"&gt;http://secure.com/microsoft.com&lt;/A&gt; is a similar trick.&lt;/P&gt;
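&lt;P&gt;A short sketch of why both tricks work: the host a URL actually goes to is only part of the string. The helper name is made up, and taking the last two labels as the "site" is a deliberate simplification that is wrong for suffixes such as co.uk.&lt;/P&gt;
&lt;PRE&gt;
```python
from urllib.parse import urlsplit


def naive_site(url: str) -> str:
    """Return the last two labels of the URL's host (a crude stand-in for the owner)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


for url in ("http://microsoft.secure.com", "http://secure.com/microsoft.com"):
    # Both print secure.com: "microsoft" is only a subdomain or a path.
    print(url, naive_site(url))
```
&lt;/PRE&gt;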
&lt;P mce_keep="true"&gt;DNS isn't the only subject of these problems.&amp;nbsp; I get mail all the time in the form &lt;A href="mailto:company@mail-servicing.com"&gt;company@mail-servicing.com&lt;/A&gt; where "company" is a legitimate company and "mail-servicing" is the people they've contracted to send their bulk mail.&amp;nbsp; So it's impossible for me to determine if that's actually a good address for the company.&amp;nbsp; Even worse is when the mail contains a link.&amp;nbsp; "Provide feedback about your recent warranty support to&amp;nbsp;&lt;A href="http://feedback-surveys.com/OEMsupport"&gt;http://feedback-surveys.com/OEMsupport&lt;/A&gt;"&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Strings aren't Even Strings&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Sometimes what we click on isn't even related to where we end up going.&amp;nbsp; We've all seen phishing attacks that look like &lt;A href="http://207.46.232.182/" mce_href="http://207.46.232.182"&gt;mybank.com&lt;/A&gt; but go to an IP address that no one can tell if it's real or not.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Strings aren't Always Specific&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;In some environments strings often aren't even very specific.&amp;nbsp; I'm pretty certain that if I want a live.com account that I won't get shawn or shawns or even shawnsteele.&amp;nbsp; Instead I'll be shawn7935 or something.&amp;nbsp; There's another Shawn here at work that gets some of my mail from simple typos, let alone malicious intent.&amp;nbsp; There's a pretty good chance that&amp;nbsp;Fred8374&amp;nbsp;could pass himself off as Fred8347 if he really wanted to.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P mce_keep="true"&gt;We've even&amp;nbsp;been trained that strings&amp;nbsp;don't even have to be close.&amp;nbsp;&amp;nbsp;If I buy something on eBay from "JoesBestStuff", it takes some faith for me to pay SallySewing7@live.com (apologies if those are real accounts).&amp;nbsp; I've been quite amused at the variation between "seller's name" and the email sometimes.&lt;/P&gt;
&lt;P mce_keep="true"&gt;Even when we expect them to be the same, there are many spellings for some words.&amp;nbsp; "Mohammed" is often transliterated differently to Latin.&amp;nbsp; Unless you deal with one quite often, you're likely to assume most spellings are the same.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Globalization of Strings&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Now we've figured out that strings aren't secure, and we'll get tricked even if they were secure.&amp;nbsp; How does that change in a global environment, such as with IDNA or EAI/IMA strings?&amp;nbsp; Not much.&lt;/P&gt;
&lt;P mce_keep="true"&gt;Sticking to Latin, you suddenly gain a bunch of look-alikes (homographs) by allowing non-ASCII values.&amp;nbsp; Strings like mícrosoft, mïcrosoft and mıcrosoft are all “close enough” to be confused, particularly at a quick glance, even more so if the user is conditioned to expect the "real" string.&amp;nbsp; E.g:&amp;nbsp; "Important security update for windows, go download it from Mícrosoft.com"&amp;nbsp; We're already expecting to see microsoft, so the few different pixels are easily missed.&lt;/P&gt;
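&lt;P&gt;One tempting check is to strip accents and compare the result with the expected name. The sketch below (the function name is an assumption) shows that this catches the first two strings above but not the third, because the dotless ı is a separate letter, not an i with a mark on it.&lt;/P&gt;
&lt;PRE&gt;
```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose the text and drop combining marks such as accents and diaereses."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# i with acute, i with diaeresis, dotless i
for candidate in ("m\u00edcrosoft", "m\u00efcrosoft", "m\u0131crosoft"):
    print(candidate, strip_marks(candidate) == "microsoft")
```
&lt;/PRE&gt;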
&lt;P mce_keep="true"&gt;For other scripts the problem can be much more severe.&amp;nbsp; Complex scripts can have similar appearing strings, and many include numerous characters.&amp;nbsp; Chinese for example has enough characters available that it can be fairly easy in some cases to find a rare character that is similar in appearance to a common character which people have been preconditioned to expect.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;"I Solved Homographs"&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;This leads to a&amp;nbsp;typical problem for developers, particularly "Western" Latin-script based developers.&amp;nbsp; We tend to expect that if we solve script mixing so that we can't mix up Cyrillic and Latin, that we've solved the homograph problem.&amp;nbsp; Instead, we've barely scratched the surface and effectively buried our heads in the sand.&lt;/P&gt;
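&lt;P&gt;A rough sketch of such a script-mixing check, and of what it misses. Python's standard library has no script property, so the first word of each character's Unicode name is used as a crude stand-in; that shortcut and the function names are assumptions for illustration only.&lt;/P&gt;
&lt;PRE&gt;
```python
import unicodedata


def scripts_used(label: str) -> set:
    """Approximate the scripts in a label from the first word of each letter's name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    return len(scripts_used(label)) > 1


print(is_mixed_script("micr\u043esoft"))   # Cyrillic o among Latin letters: True
print(is_mixed_script("m\u00edcrosoft"))   # Latin i with acute: False
print(is_mixed_script("rnicrosoft"))       # plain ASCII look-alike: False
```
&lt;/PRE&gt;
&lt;P&gt;Only the first spoof is flagged; the other two are entirely Latin and pass the check.&lt;/P&gt;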
&lt;P mce_keep="true"&gt;In some cases the "solution" can be worse than the problem.&amp;nbsp; For example, some browsers decide that I don't understand Cyrillic since my user locale is en-US (or Klingon), and then print out punycode.&amp;nbsp; That's mildly useful to me as a warning, however it does the same thing for Chinese.&amp;nbsp; It's very unlikely that I'm going to confuse Chinese with Latin, but I'll get Punycode in the address bar anyway.&amp;nbsp; Now I have no chance of finding out what the actual URL is supposed to look like.&amp;nbsp; Punycode is all gibberish, but I could probably decipher a Chinese glyph enough to see if it looked similar to what I expected.&amp;nbsp; With any punycode strings, I don't even need homographs to confuse me, any Chinese would look the same.&amp;nbsp; For that matter I could be expecting Chinese, but it could actually be Japanese or Korean, or Cyrillic for that matter.&amp;nbsp; I'm not trying to say that the browsers' approach is "wrong", just that&amp;nbsp;while this approach&amp;nbsp;may address some problems,&amp;nbsp;it can also cause new ones.&lt;/P&gt;
&lt;P mce_keep="true"&gt;Most of the "solutions" to Homographs that I've seen are similar in my opinion.&amp;nbsp; They may address a specific issue, but don't solve the entire problem globally.&amp;nbsp; I also think some approaches are unnecessarily limiting.&amp;nbsp; Mitigations that reduce the surface area for an attack are useful, however developers should recognize the limitations of those approaches and make sure they aren't spending tons of effort "shutting the window, but leaving the front door wide open."&amp;nbsp; That only provides a false sense of security, which can be far worse than the original problem.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Comprehensive Solutions&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;So instead of thinking that strings like URLs are inherently secure somehow if they're ASCII, and focusing on the differences from ASCII, like Cyrillic homographs, we should rather assume that ANY URL might not take us to a place we want to go.&amp;nbsp; Even an ASCII one.&lt;/P&gt;
&lt;P mce_keep="true"&gt;A much better solution to URL security is one that addresses the entire system rather than focusing on Homographs.&amp;nbsp; IE, for example, detects malicious web sites (I don't know exactly how it works, but I gather there's blacklisting and bad&amp;nbsp;behavior detection, kinda like virus checking for web sites).&amp;nbsp; This is far more effective than preventing mixed scripts, and has the advantage of working with ASCII only URLs.&amp;nbsp; It also does a good job against homographs, pretty much making the punycode-in-the-address-bar irrelevant.&amp;nbsp; It also works with many forms of attack, even non-obvious ones.&amp;nbsp; &lt;/P&gt;
&lt;P mce_keep="true"&gt;My opinion is that if you do a "good job" of detecting any phishing/spoofing type web site, even ASCII-only, then the need for Homograph detection is much reduced.&amp;nbsp; And if you can't do that, then the attackers will merely add an extra label or something to get around your homograph detection.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Mitigation by Protocol&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;For things like IDN, it is interesting to consider how the protocol itself approaches security.&amp;nbsp; Some things are "obvious" as not being interesting for a name.&amp;nbsp; Compatibility characters, control characters, etc. could somewhat readily be excluded.&amp;nbsp; Some things are generally considered technically "obvious" to some users, but may frustrate others.&amp;nbsp; It is generally considered that lower casing the DNS name causes less confusion (can't mix up lower case l with capital I), but I doubt that AAA.com prefers lower casing.&amp;nbsp; Similarly IDNA2003 allows Unicode "symbols,"&amp;nbsp;which are widely regarded as being useless, particularly since they're hard to type, but I suspect that someone would like I♥NY.&amp;nbsp; So there's a gray area that gets a bit confusing.&lt;/P&gt;
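&lt;P&gt;A sketch of how a protocol-level filter for those "obvious" exclusions might be written. The three rules and the function name are assumptions for illustration and do not reproduce the actual IDNA2003 or IDNA2008 tables.&lt;/P&gt;
&lt;PRE&gt;
```python
import unicodedata


def protocol_objections(label: str) -> list:
    """List (character, reason) pairs that a strict protocol rule might reject."""
    objections = []
    for ch in label:
        category = unicodedata.category(ch)
        if category.startswith("C"):
            objections.append((ch, "control, format or unassigned character"))
        elif unicodedata.normalize("NFKC", ch) != unicodedata.normalize("NFC", ch):
            # Changed by compatibility normalization only, e.g. a fullwidth letter.
            objections.append((ch, "compatibility character"))
        elif category.startswith("S"):
            objections.append((ch, "symbol"))
    return objections


print(protocol_objections("aaa"))          # []
print(protocol_objections("I\u2665NY"))    # the heart is reported as a symbol
```
&lt;/PRE&gt;
&lt;P&gt;The filter is easy to write; deciding whether the owner of I♥NY should be turned away is the gray area.&lt;/P&gt;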
&lt;P mce_keep="true"&gt;Consideration for other protocols is similar.&amp;nbsp; EAI (email) is interesting because it basically defers "correctness" to the registrar (whoever runs the mail server).&amp;nbsp; IDN provides some restriction by protocol and more at the registrar level.&lt;/P&gt;
&lt;P mce_keep="true"&gt;One problem with restricting valid characters at the protocol level is that it works OK in a small set, but once you get to a global audience the rules get very complicated.&amp;nbsp; Domain names allowed (most) English names when they were restricted to ASCII, but German and French had difficulties.&amp;nbsp; With IDN additional languages are supported, but perhaps the needs of an English registrar and a German one differ.&amp;nbsp; A complete set of rules applicable world-wide for all strings in all languages may not be possible (eg: Turkish i), but even if it were, the rules would be very complex and difficult to implement for every application adopting a protocol.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Mitigation by Registrar&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Restriction at the registrar can be more effective, though perhaps less consistent.&amp;nbsp; A registrar could be like a domain name registrar, but for these purposes you could also think of the person that assigns user accounts at a business, or&amp;nbsp;email address registration from your ISP.&lt;/P&gt;
&lt;P mce_keep="true"&gt;Registrars can restrict languages to those used in the country they support.&amp;nbsp; They can bundle or block&amp;nbsp;homographs or alternate spellings (like Traditional and Simplified Chinese spellings of the same word).&amp;nbsp; In a business they could have certain rules (first name, last initial, or first initial, last name is common for user accounts in many companies, at least until they get too many employees).&lt;/P&gt;
&lt;P mce_keep="true"&gt;IDN has some restrictions by protocol, but allows much tighter restriction at the registrar level.&amp;nbsp; Ironically, a label at a lower level could then have different "rules" than at the higher level.&amp;nbsp; EAI allows the local part to be determined entirely by the provider/registrar rather than the protocol.&lt;/P&gt;
&lt;P mce_keep="true"&gt;A complete set of rules at the "registrar" level can still be very complex.&amp;nbsp; However, cases with conceptual differences can be handled as applicable for the registrar's environment, whereas a protocol-level rule has to either be too flexible, or disallow one registrar's legitimate scenario.&amp;nbsp; Rules at the registrar level can also be adjusted more readily than at the protocol level.&lt;/P&gt;
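&lt;P mce_keep="true"&gt;A registrar-level restriction could be as simple as a table of permitted characters.&amp;nbsp; The sketch below assumes a hypothetical German-language registry; the character list and the function name are invented for illustration and do not describe any real registry's policy.&lt;/P&gt;

```python
# Hypothetical policy for a German-language registry: ASCII letters, digits,
# hyphen, plus the umlauts and sharp s. The character list is illustrative.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789-äöüß")


def registrable(label):
    """Accept a label only if every character is in the registry's table."""
    label = label.lower()
    if not label or label.startswith("-") or label.endswith("-"):
        return False
    return all(ch in ALLOWED for ch in label)


for label in ["köln", "nixieröhre", "кремль", "игроshop"]:
    print(label, registrable(label))
```

&lt;P mce_keep="true"&gt;A different registry would simply ship a different table, which is much easier to adjust than a rule baked into the protocol.&lt;/P&gt;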
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Mitigation by Application&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;An application can also decide to be more comprehensive than the protocol.&amp;nbsp; An application may also have more information,&amp;nbsp;such as blacklists or user settings.&amp;nbsp; They can make choices for some users like "they only read English, so don't bother with Cyrillic then," and a different choice for a different user.&amp;nbsp; Applications can also potentially be grayer in their behavior.&amp;nbsp; Instead of "allowing" and "disallowing" strings, they can say "gee, I'm not so sure, you really want to do this?", or flag it and continue.&amp;nbsp; They can also be dynamic, such as when you add a sender to a junk mail filter.&lt;/P&gt;
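&lt;P mce_keep="true"&gt;Here is a rough sketch of that "grayer" application behavior: three outcomes instead of two, driven by a blocklist and by which scripts a particular user reads.&amp;nbsp; Everything here (the blocklist entry, the script tags taken from Unicode character names, the function names) is an assumption for the example, not a description of any shipping product.&lt;/P&gt;

```python
import unicodedata

BLOCKLIST = {"bad.example"}  # hypothetical, locally maintained list


def label_scripts(text):
    """Rough script tags for the letters in a name (first word of char name)."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in text if ch.isalpha()}


def assess(host, user_scripts, blocklist=BLOCKLIST):
    """Return "block", "warn" or "allow" for a host name and a given user."""
    if host.lower() in blocklist:
        return "block"
    unfamiliar = label_scripts(host) - set(user_scripts)
    if unfamiliar:
        return "warn"  # ask "you really want to do this?" instead of refusing
    return "allow"


print(assess("кремль.ru", {"LATIN"}))
print(assess("кремль.ru", {"LATIN", "CYRILLIC"}))
print(assess("bad.example", {"LATIN"}))
```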
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;IDN vs EAI/IMA vs Unicode&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Pretty much this entire "strings aren't secure"&amp;nbsp;concept applies to any Unicode (or for that matter any other code page) string.&amp;nbsp; That could be an IDN domain name, an EAI mail address, a user account name, etc.&amp;nbsp; Some environments may be more amenable to certain solutions than others, but the types of attacks that impact a Unicode&amp;nbsp;IDN label could also succeed with the local (user name) part of a Unicode&amp;nbsp;EAI&amp;nbsp;email address.&amp;nbsp; The general concepts are portable.&lt;/P&gt;
&lt;P mce_keep="true"&gt;I used IDN heavily as an example, but the same things happen to EAI addresses, user account names, logon credentials, etc.&amp;nbsp; Anything that uses Unicode, or strings, needs to realize that strings can't be expected to be inherently "secure."&lt;/P&gt;
&lt;P mce_keep="true"&gt;There's more info on some thinking about Unicode Security in Unicode TR#39 &lt;A href="http://www.unicode.org/draft/reports/tr39/tr39.html"&gt;http://www.unicode.org/draft/reports/tr39/tr39.html&lt;/A&gt;.&amp;nbsp; TR39 addresses the appropriate use of Unicode characters and homographs, but this is at best a mitigation of the more general security concerns of identifier strings.&amp;nbsp; Phishing and spoofing would still happen even in plain ASCII.&lt;/P&gt;
&lt;P mce_keep="true"&gt;Hope this was helpful, or at least interesting,&lt;/P&gt;
&lt;P mce_keep="true"&gt;Shawn&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9823241" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category><category domain="http://blogs.msdn.com/shawnste/archive/tags/Unicode+and+Code+Pages_2F00_Encodings/default.aspx">Unicode and Code Pages/Encodings</category><category domain="http://blogs.msdn.com/shawnste/archive/tags/eMail+Address+Internationalization/default.aspx">eMail Address Internationalization</category></item><item><title>Email Address Internationalization / Internationalized eMail Addresses (EAI/IMA)</title><link>http://blogs.msdn.com/shawnste/archive/2009/06/04/email-address-internationalization-internationalized-email-addresses-eai-ima.aspx</link><pubDate>Fri, 05 Jun 2009 03:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9700586</guid><dc:creator>shawnste</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/9700586.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=9700586</wfw:commentRss><description>&lt;P&gt;With the &lt;A href="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx" mce_href="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx"&gt;IDN&lt;/A&gt;&amp;nbsp;work for Internationalized Domain Names using characters beyond ASCII, it is only natural to tackle the problem of Internationalized Internet eMail.&lt;/P&gt;
&lt;P&gt;Some smart people have been working on an IETF working group to figure out how non-ASCII email would work, and I encourage people to take a look: &lt;A href="http://www.ietf.org/html.charters/eai-charter.html"&gt;http://www.ietf.org/html.charters/eai-charter.html&lt;/A&gt;.&amp;nbsp; That page has the charter, a list of drafts and RFCs that have already been produced, and links to the IMA working group mailing list.&lt;/P&gt;
&lt;P&gt;Assuming you're an ASCII/Latin character user, imagine having to type all your URL's in Chinese, or Cyrillic (or if you know those, imagine typing everything in Klingon, eg: &lt;SPAN style="FONT-FAMILY: pIqaD, Code2000; FONT-SIZE: 11pt"&gt; &lt;/SPAN&gt;)&amp;nbsp; In many cultures, that's what it's like to use the web.&amp;nbsp; Some users may not be literate in Latin letters, or may have to do a lot of hunt-n-pecking.&amp;nbsp; EAI should help address that problem.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;How EAI/IMA Works&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The basic idea of the EAI working group is to stick email in UTF-8 instead of ASCII.&amp;nbsp; UTF-8 works pretty well in many systems, and many mailers already handle 8 bit encodings, so this is a pretty "simple" solution.&amp;nbsp; Unfortunately email touches a lot of places, so there're a lot of protocols that need updates (eg: SMTP, POP, mailto:, etc.)&amp;nbsp; Additionally everyone knows that UTF-8 email can't happen instantly, so there needs to be a system for existing servers to talk to UTF-8 aware ones, which leads to a few more RFCs.&lt;/P&gt;
&lt;P&gt;UTF8SMTP allows the servers to make decisions about the "local" part of the email address, which allows for groups to fit their own needs.&amp;nbsp; The backwards compatibility means that users also need ASCII addresses, as they do today.&amp;nbsp; The server would alias from one address to another so mail to &lt;SPAN style="FONT-FAMILY: pIqaD, Code2000; FONT-SIZE: 11pt"&gt;&lt;/SPAN&gt;@microsoft.com could map to my normal mailbox, and I'd only have one mail.&amp;nbsp; Unfortunately that simple concept means that places that didn't have to worry about aliasing before may now have to consider aliases and fallback addresses.&amp;nbsp; Contact lists may need to have both forms, etc.&lt;/P&gt;
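&lt;P&gt;The aliasing idea can be sketched in a few lines of Python.&amp;nbsp; The addresses, the alias table and the function names below are all hypothetical; a real mail server would keep this mapping in its own configuration, and the drafts define the actual protocol behavior.&lt;/P&gt;

```python
# Hypothetical alias table a mail server might keep: each internationalized
# address maps to the ASCII fallback address for the same mailbox.
ALIASES = {
    "jürgen@example.com": "juergen@example.com",
}


def mailbox_for(address):
    """Resolve either form of an address to the one canonical mailbox."""
    return ALIASES.get(address, address)


def wire_bytes(address):
    """An EAI-style address travels as UTF-8 rather than ASCII."""
    return address.encode("utf-8")


print(mailbox_for("jürgen@example.com"))
print(mailbox_for("juergen@example.com"))
print(wire_bytes("jürgen@example.com"))
```

&lt;P&gt;Both spellings land in the same mailbox, which is also why contact lists may end up storing both forms.&lt;/P&gt;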
&lt;P&gt;&lt;STRONG&gt;Current Status of EAI/IMA&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Currently there are several experimental RFCs, and several people&amp;nbsp;have&amp;nbsp;created&amp;nbsp;interoperating systems that work with each other to demonstrate the feasibility of UTF8SMTP.&amp;nbsp;&amp;nbsp; The next step is to move towards a standards track process, which could happen "reasonably quickly".&amp;nbsp; I'm optimistic that the standards will move quickly, but sometimes these things take a while.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;So Who's Gonna Use It?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;There are a lot of markets where ASCII doesn't work very well for various reasons.&amp;nbsp; Even when people have ASCII aliases, it may seem artificial, and there may be a desire for an email that reflects them or their country.&amp;nbsp; There are many ISPs in countries like Korea, China, &amp;amp; Japan that are very eager to be able to send email in a native script.&amp;nbsp; Some governments like Russia and China are weighing in on the importance of being able to send mail and use the Internet&amp;nbsp;in their script.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;What's&amp;nbsp;IMA Mean To Me As&amp;nbsp;a Software Developer?&lt;/STRONG&gt;&amp;nbsp;(who cares?)&lt;/P&gt;
&lt;P&gt;If you are a developer, then you may run into IMA addresses.&amp;nbsp; Even if your app doesn't explicitly deal with mail, there may be a place for email to sneak into your app.&amp;nbsp; For example, IDN and domain names don't really have much to do with Word or PowerPoint, yet they often show up in documents and presentations.&amp;nbsp; I could imagine an author address in metadata, such as a photographer contact in a photo's metadata.&amp;nbsp; Many apps probably will run into IMA addresses whether they realize it or not.&lt;/P&gt;
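&lt;P&gt;For an app that merely bumps into such an address, one cautious approach is to split it at the last "@", leave the local part alone (its rules belong to the mail provider), and convert only the domain with an IDN library.&amp;nbsp; The sketch below uses Python's built-in IDNA2003 codec; the function names are mine and the approach is a suggestion under those assumptions, not something the EAI drafts mandate.&lt;/P&gt;

```python
def split_address(address):
    """Split at the last "@" into (local part, domain)."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an email address: %r" % address)
    return local, domain


def ascii_domain(address):
    """Convert only the domain to its xn-- form; keep the local part as is."""
    local, domain = split_address(address)
    return local, domain.encode("idna").decode("ascii")


print(ascii_domain("info@köln.example"))
print(ascii_domain("jürgen@example.com"))
```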
&lt;P&gt;Anyway, I have been thinking about this space for a while and thought I'd share my observations.&amp;nbsp; It's worth considering what impact IMA will have on your application (while you're at it, how's IDN behave?)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;-Shawn&lt;/P&gt;
&lt;P mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9700586" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category><category domain="http://blogs.msdn.com/shawnste/archive/tags/eMail+Address+Internationalization/default.aspx">eMail Address Internationalization</category></item><item><title>Rambling about RFC 4690 and IDN</title><link>http://blogs.msdn.com/shawnste/archive/2006/10/10/Rambling-about-RFC-4690-and-IDN.aspx</link><pubDate>Wed, 11 Oct 2006 04:52:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:815413</guid><dc:creator>shawnste</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/815413.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=815413</wfw:commentRss><description>&lt;P&gt;There's a reasonably new RFC 4690 (&amp;nbsp;&lt;A href="http://www.ietf.org/rfc/rfc4690.txt"&gt;http://www.ietf.org/rfc/rfc4690.txt&lt;/A&gt;&amp;nbsp;) that raises a bunch of questions about IDN names and Unicode regarding such things as confusable characters and other issues.&amp;nbsp; Some of those are also discussed in Unicode TR36 "Unicode Security Considerations" &lt;A href="http://www.unicode.org/reports/tr36/"&gt;http://www.unicode.org/reports/tr36/&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;One thing that confuses me about the discussion regarding IDN's weaknesses is that most of these issues wouldn't be a problem if the registrars didn't register the names.&amp;nbsp; So from the client application's perspective, it doesn't matter much if a particular IDN name is legal or not, so long as it does the appropriate mapping.&amp;nbsp; If it's not a legal IDN name, then web browsers or other apps won't respond to it because the DNS system won't return any records.&amp;nbsp; So from the client perspective I'm not sure what all the fuss is about.&lt;/P&gt;
&lt;P&gt;Obviously the registrars need to be able to disallow bad names, and guidelines such as TR36 help.&amp;nbsp; Registrars also have the opportunity to be pickier than the standards.&amp;nbsp; Many only allow certain scripts or combinations relevant for their TLD.&lt;/P&gt;
&lt;P&gt;For similar reasons the migration of IDN to Unicode 5 or newer doesn't bother me.&amp;nbsp; Nobody's going to allow unassigned code points in their domain name.&amp;nbsp; If a Unicode 5 code point appears in a future IDN name, and the DNS system resolves it, it would follow that it is a valid code point, even if some client app only understood Unicode 3.2.&amp;nbsp; That doesn't help case mapping, but&amp;nbsp;a pure Unicode 5&amp;nbsp;name should be easily understandable and resolvable even by downlevel clients.&lt;/P&gt;
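&lt;P&gt;One way to see why a downlevel client can still cope: the Punycode step is pure arithmetic on code point values, so it does not care whether the client's Unicode tables know the character.&amp;nbsp; The sketch below uses Python's "punycode" codec and picks U+0378 on the assumption that it is unassigned in the Unicode version your Python ships with (it is in the versions I'm aware of).&amp;nbsp; As noted above, this says nothing about case mapping, which does need the tables.&lt;/P&gt;

```python
import unicodedata


def unassigned(label):
    """Code points that this Python's Unicode database does not know about."""
    return [ch for ch in label if unicodedata.category(ch) == "Cn"]


# Assumption: U+0378 is unassigned in the local Unicode database.
label = "ab\u0378"

encoded = label.encode("punycode")    # arithmetic on code point values only
decoded = encoded.decode("punycode")  # round-trips without any table lookup

print(unicodedata.unidata_version)
print([hex(ord(ch)) for ch in unassigned(label)])
print(decoded == label)
```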
&lt;P&gt;Anyway, it seems to me that there's more fuss about these issues than there needs to be.&amp;nbsp; Probably if the IETF IDN folks and the Unicode folks worked with each other to resolve these issues then IDN would get updated faster.&amp;nbsp; Right now it seems like each group is reacting on its own.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=815413" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category></item><item><title>IDN Test URLs</title><link>http://blogs.msdn.com/shawnste/archive/2006/09/14/754882.aspx</link><pubDate>Fri, 15 Sep 2006 00:57:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:754882</guid><dc:creator>shawnste</dc:creator><slash:comments>4</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/754882.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=754882</wfw:commentRss><description>&lt;P&gt;A colleague gave me some URLs to use for testing IDN.&amp;nbsp; I don't vouch for any of these sites, and some are obvious ads and some may be inappropriate (since I can't read them), but maybe they'll be helpful for testing.&lt;/P&gt;
&lt;P&gt;Additionally some sites could be phishing, spoofing or have malicious content, so please use this list carefully.&lt;/P&gt;
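&lt;P&gt;If you want to cross-check the Unicode and xn-- forms in this list without visiting any of the sites, a few lines of Python will do it offline.&amp;nbsp; This is a sketch using Python's built-in IDNA2003 codec, with a few pairs copied from the list; the helper name is mine.&lt;/P&gt;

```python
# A few (Unicode, xn--) pairs copied from the list in this post.
PAIRS = [
    ("köln.studenten-wohnung.de", "xn--kln-sna.studenten-wohnung.de"),
    ("www.möhr.de", "www.xn--mhr-sna.de"),
    ("visegrád.com", "xn--visegrd-mwa.com"),
]


def to_ascii(name):
    """IDNA2003 ToASCII on every label, returned as a plain string."""
    return name.encode("idna").decode("ascii")


for unicode_form, ascii_form in PAIRS:
    # Prints whether the computed xn-- form matches the listed one.
    print(unicode_form, to_ascii(unicode_form) == ascii_form)
```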
&lt;TABLE style="FONT-SIZE: smaller"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://äöüss.com/"&gt;äöüß.com&lt;/A&gt;&lt;/TD&gt; 
&lt;TD&gt;&lt;A href="http://äöüss.com/"&gt;xn--ss-uia6e4a.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;IDN Tool in German&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://visegrád.com/"&gt;visegrád.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://visegrád.com/"&gt;xn--visegrd-mwa.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;Hungarian&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://házipatika.com/"&gt;házipatika.com/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://házipatika.com/"&gt;xn--hzipatika-01a.com/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;Hungarian&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://айкидо.com/"&gt;айкидо.com/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://айкидо.com/"&gt;xn--80aildf0a.com/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;Bulgarian&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.خداوند.com/"&gt;www.خداوند.com/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.خداوند.com/"&gt;www.xn--mgbndb8il.com/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;Farsi&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.çukurova.com/"&gt;www.çukurova.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.çukurova.com/"&gt;www.xn--ukurova-txa.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;Turkish&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://nixieröhre.nixieclock-tube.com/"&gt;nixieröhre.nixieclock-tube.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://nixieröhre.nixieclock-tube.com/"&gt;xn--nixierhre-57a.nixieclock-tube.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;German&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://caxap.ru/"&gt;caxap.ru&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://caxap.ru/"&gt;caxap.ru&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;homograph in dotRu&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://игроshop.com/"&gt;игроshop.com/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://игроshop.com/"&gt;xn--shop-k4d3a7bp.com/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;mixed script in dotCom&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://北京2008旅行.com/"&gt;北京2008旅行.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://北京2008旅行.com/"&gt;xn--2008-4x5f27utu0bmz1d.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://רישוםב99סנט.com/"&gt;רישוםב99סנט.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://רישוםב99סנט.com/"&gt;xn--99-xldpsb3a1ai7dm.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://महरोत्रा.com/"&gt;महरोत्रा.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://महरोत्रा.com/"&gt;xn--h2btgb6b7a7fra.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://தேடல்.com/"&gt;தேடல்.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://தேடல்.com/"&gt;xn--mlcj7bwe6a.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://தேடு.com/"&gt;தேடு.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://தேடு.com/"&gt;xn--mlcj2gwa.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://தமிழ்நாடு.com/"&gt;தமிழ்நாடு.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://தமிழ்நாடு.com/"&gt;xn--mlcjmx4a2deu8h.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://வலைப்பூ.com/"&gt;வலைப்பூ.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://வலைப்பூ.com/"&gt;xn--xlcawl2e7azb.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://வலைப்பதிவு.com/"&gt;வலைப்பதிவு.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://வலைப்பதிவு.com/"&gt;xn--rlcla4aoe3er1d9b.com&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://österreich.ac/"&gt;österreich.ac/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://österreich.ac/"&gt;xn--sterreich-z7a.ac/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.münchhausen.at/"&gt;www.münchhausen.at/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.münchhausen.at/"&gt;www.xn--mnchhausen-9db.at/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://kernöl.cc/"&gt;kernöl.cc/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://kernöl.cc/"&gt;xn--kernl-mua.cc/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.rêve.ch/"&gt;www.rêve.ch/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.rêve.ch/"&gt;www.xn--rve-fma.ch/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.hypermarchés.ch/"&gt;www.hypermarchés.ch/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.hypermarchés.ch/"&gt;www.xn--hypermarchs-kbb.ch/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.rüüsbier.ch/"&gt;www.rüüsbier.ch/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.rüüsbier.ch/"&gt;www.xn--rsbier-3yaa.ch/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://山东大学.cn/"&gt;山东大学.cn&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://山东大学.cn/"&gt;xn--xhq02ykwbp4a.cn&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://天蓝蓝.cn/"&gt;天蓝蓝.cn&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://天蓝蓝.cn/"&gt;xn--rssy03ha.cn&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://腊味.cn/"&gt;腊味.cn/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://腊味.cn/"&gt;xn--btr765h.cn/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://无线扩音机.cn/"&gt;无线扩音机.cn/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://无线扩音机.cn/"&gt;xn--fquz0fy3ac29c041a.cn/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://时代互联.cn/"&gt;时代互联.cn/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://时代互联.cn/"&gt;xn--blq9g996dr9x.cn/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.bärmasens.de/"&gt;www.bärmasens.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.bärmasens.de/"&gt;www.xn--brmasens-0za.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.querschnittlähmung.de/"&gt;www.querschnittlähmung.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.querschnittlähmung.de/"&gt;www.xn--querschnittlhmung-1qb.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.düsseldorf-airport.de/"&gt;www.düsseldorf-airport.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.düsseldorf-airport.de/"&gt;www.xn--dsseldorf-airport-22b.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.möhr.de/"&gt;www.möhr.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.möhr.de/"&gt;www.xn--mhr-sna.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.travemünde.de/"&gt;www.travemünde.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.travemünde.de/"&gt;www.xn--travemnde-v9a.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://golf.ausrüstung.de/"&gt;golf.ausrüstung.de&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://golf.ausrüstung.de/"&gt;golf.xn--ausrstung-t9a.de&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.hügö.de/"&gt;www.hügö.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.hügö.de/"&gt;www.xn--hg-gkaw.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://gratis-klingeltöne.2t2.de/"&gt;gratis-klingeltöne.2t2.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://gratis-klingeltöne.2t2.de/"&gt;xn--gratis-klingeltne-e0b.2t2.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://köln.studenten-wohnung.de/"&gt;köln.studenten-wohnung.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://köln.studenten-wohnung.de/"&gt;xn--kln-sna.studenten-wohnung.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.polizeipuppenbühne.de/"&gt;www.polizeipuppenbühne.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.polizeipuppenbühne.de/"&gt;www.xn--polizeipuppenbhne-g3b.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.vermögensberatung-berlin.de/"&gt;www.vermögensberatung-berlin.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.vermögensberatung-berlin.de/"&gt;www.xn--vermgensberatung-berlin-blc.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://berufsunfähigkeitsversicherung.easy-worxx.de/"&gt;berufsunfähigkeitsversicherung.easy-worxx.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://berufsunfähigkeitsversicherung.easy-worxx.de/"&gt;xn--berufsunfhigkeitsversicherung-8pc.easy-worxx.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.printout-für-experten.de/"&gt;www.printout-für-experten.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.printout-für-experten.de/"&gt;www.xn--printout-fr-experten-yec.de/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://hansjørn.dk/"&gt;hansjørn.dk/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://hansjørn.dk/"&gt;xn--hansjrn-u1a.dk/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://æøå.dk-hostmaster.dk/"&gt;æøå.dk-hostmaster.dk/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://æøå.dk-hostmaster.dk/"&gt;xn--5cab8c.dk-hostmaster.dk/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.møgelbjerg.dk/"&gt;www.møgelbjerg.dk/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.møgelbjerg.dk/"&gt;www.xn--mgelbjerg-l8a.dk/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.lyø-ølejr.dk/"&gt;www.lyø-ølejr.dk/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.lyø-ølejr.dk/"&gt;www.xn--ly-lejr-r1ab.dk/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.sää.fi/"&gt;www.sää.fi/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.sää.fi/"&gt;www.xn--s-0faa.fi/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://γλωσσολογια.gr/"&gt;γλωσσολογια.gr/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://γλωσσολογια.gr/"&gt;xn--mxadayia1ab7aa8d.gr/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.biergläser.info/"&gt;www.biergläser.info/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.biergläser.info/"&gt;www.xn--bierglser-02a.info/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.dachgepäckträger.info/"&gt;www.dachgepäckträger.info/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.dachgepäckträger.info/"&gt;www.xn--dachgepcktrger-cibe.info/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://€.linux.it/"&gt;€.linux.it&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://€.linux.it/"&gt;xn--lzg.linux.it&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://味の素.jp/"&gt;味の素.jp/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://味の素.jp/"&gt;xn--u9j479hu21a.jp/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://セガ.jp/"&gt;セガ.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://セガ.jp/"&gt;xn--mck3a.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://倉敷.jp/"&gt;倉敷.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://倉敷.jp/"&gt;xn--0vq553c.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://フジテレビ.jp/"&gt;フジテレビ.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://フジテレビ.jp/"&gt;xn--yck2a1bf1j.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://角川書店.jp/"&gt;角川書店.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://角川書店.jp/"&gt;xn--5rt1pr1svu1b.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://東京理科大学.jp/"&gt;東京理科大学.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://東京理科大学.jp/"&gt;xn--1lq68wkwbj6ugkpigi.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://産経新聞.jp/"&gt;産経新聞.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://産経新聞.jp/"&gt;xn--efvu75ac8f49c.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://じゃらん.jp/"&gt;じゃらん.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://じゃらん.jp/"&gt;xn--78j9dsa4b.jp&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://정통부.kr/"&gt;정통부.kr&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://정통부.kr/"&gt;xn--or3bn7qhwh.kr&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://한글.kr/"&gt;한글.kr&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://한글.kr/"&gt;xn--bj0bj06e.kr&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://이효리.kr/"&gt;이효리.kr&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://이효리.kr/"&gt;xn--oy2b15wvzl.kr&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://정보통신부.kr/"&gt;정보통신부.kr&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://정보통신부.kr/"&gt;xn--on3b2jr1lk6fb1n.kr&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://alėja.lt/"&gt;alėja.lt/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://alėja.lt/"&gt;xn--alja-wva.lt/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://tūdaliņ.lv/"&gt;tūdaliņ.lv/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://tūdaliņ.lv/"&gt;xn--tdali-d8a8w.lv/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://blåbärsbröd.idn.museum/"&gt;blåbärsbröd.idn.museum/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://blåbärsbröd.idn.museum/"&gt;xn--blbrsbrd-2zai7q.idn.museum/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.nårsk.no/"&gt;www.nårsk.no/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.nårsk.no/"&gt;www.xn--nrsk-qoa.no/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.dødeladen.no/"&gt;www.dødeladen.no/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.dødeladen.no/"&gt;www.xn--ddeladen-54a.no/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.blæst.no/"&gt;www.blæst.no/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.blæst.no/"&gt;www.xn--blst-woa.no/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.blåbærmuffin.no/"&gt;www.blåbærmuffin.no/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.blåbærmuffin.no/"&gt;www.xn--blbrmuffin-25an.no/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.hovedløpet2006.no/"&gt;www.hovedløpet2006.no&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.hovedløpet2006.no/"&gt;www.xn--hovedlpet2006-gnb.no&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.futtitromsø.no/"&gt;www.futtitromsø.no/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.futtitromsø.no/"&gt;www.xn--futtitroms-9cb.no/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://alliancefrançaise.nu/"&gt;alliancefrançaise.nu&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://alliancefrançaise.nu/"&gt;xn--alliancefranaise-npb.nu&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://tobiasvärk.nu/"&gt;tobiasvärk.nu/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://tobiasvärk.nu/"&gt;xn--tobiasvrk-12a.nu/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.färgbolaget.nu/"&gt;www.färgbolaget.nu/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.färgbolaget.nu/"&gt;www.xn--frgbolaget-q5a.nu/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://högkyrklig.nu/"&gt;högkyrklig.nu/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://högkyrklig.nu/"&gt;xn--hgkyrklig-07a.nu/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://żółć.pl/"&gt;żółć.pl&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://żółć.pl/"&gt;xn--kda4b0koi.pl&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.noże.pl/"&gt;www.noże.pl/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.noże.pl/"&gt;www.xn--noe-42a.pl/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.szkołagłównahandlowa.pl/"&gt;www.szkołagłównahandlowa.pl/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.szkołagłównahandlowa.pl/"&gt;www.xn--szkoagwnahandlowa-lyb21mca.pl/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://кремль.ru/"&gt;кремль.ru/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://кремль.ru/"&gt;xn--e1ajeds9e.ru/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://президент.ru/"&gt;президент.ru/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://президент.ru/"&gt;xn--d1abbgf6aiiy.ru/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.företagsbörsen.se/"&gt;www.företagsbörsen.se/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.företagsbörsen.se/"&gt;www.xn--fretagsbrsen-4ibh.se/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://blåbärsmjölk.webway.se/"&gt;blåbärsmjölk.webway.se/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://blåbärsmjölk.webway.se/"&gt;xn--blbrsmjlk-x2aj4s.webway.se/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.uplandskonstförening.se/"&gt;www.uplandskonstförening.se/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.uplandskonstförening.se/"&gt;www.xn--uplandskonstfrening-26b.se/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://ไดอารี่.th/"&gt;ไดอารี่.th/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://ไดอารี่.th/"&gt;xn--l3c4a3auq9f7a.th/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://www.türkçe.gen.tr/"&gt;www.türkçe.gen.tr&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://www.türkçe.gen.tr/"&gt;www.xn--trke-2oa7j.gen.tr&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://jæger.tv/"&gt;jæger.tv/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://jæger.tv/"&gt;xn--jger-voa.tv/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://中央大学.tw/"&gt;中央大学.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://中央大学.tw/"&gt;xn--fiq80yua78t.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://台網中心.tw/"&gt;台網中心.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://台網中心.tw/"&gt;xn--fiq43lrrlz83a.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://中文.tw/"&gt;中文.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://中文.tw/"&gt;xn--fiq228c.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://大大.tw/"&gt;大大.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://大大.tw/"&gt;xn--pssa.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://清華大學.tw/"&gt;清華大學.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://清華大學.tw/"&gt;xn--pssu7c921afvu.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://網域名稱.tw/"&gt;網域名稱.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://網域名稱.tw/"&gt;xn--eqrt2ge74bp6c.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://東華大學.tw/"&gt;東華大學.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://東華大學.tw/"&gt;xn--pssu7cxyrvv2a.tw&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;&lt;A href="http://zürich.ws/"&gt;zürich.ws/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;A href="http://zürich.ws/"&gt;xn--zrich-kva.ws/&lt;/A&gt;&lt;/TD&gt;
&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=754882" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category></item><item><title>Does your web site die when it sees unexpected languages?</title><link>http://blogs.msdn.com/shawnste/archive/2006/09/01/732231.aspx</link><pubDate>Fri, 01 Sep 2006 11:00:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:732231</guid><dc:creator>shawnste</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/732231.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=732231</wfw:commentRss><description>&lt;P&gt;Recently I've seen a spate of bugs involving client-server web applications related to the user locale.&lt;/P&gt;
&lt;P&gt;Windows Vista has added a ton of locales that weren't there in XP or Server 2003 or Microsoft.Net v2.0.&amp;nbsp; Additionally users can create and use their own custom locales.&amp;nbsp; Lastly, they've always been able to edit their language preference(s) in IE.&lt;/P&gt;
&lt;P&gt;It seems that some web sites crash when the HTTP_ACCEPT_LANGUAGE header contains a language/region pair that their server's OS doesn't support.&amp;nbsp; The way to solve this is to make sure that your application has a fallback path for unknown languages (for example to en-US, the invariant locale, or some other appropriate locale).&lt;/P&gt;
&lt;P&gt;To test that your web site can handle unexpected client locales,&amp;nbsp;you can try some of the new languages in Vista, such as Greenlandic (Greenland), or make your own custom locale, like tlh-US or haw-US or fj-FJ.&amp;nbsp; Even editing IE's language preferences may provide test data to your servers.&lt;/P&gt;
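&lt;P&gt;A minimal sketch of such a fallback path, in Python purely for illustration.&amp;nbsp; The supported-locale set, the en-us default and the function names are all made up for this example; a real site would plug in whatever locales its own platform can actually load.&lt;/P&gt;
&lt;PRE&gt;
```python
# Hypothetical set of locales this server can actually render.
SUPPORTED = {"en-us", "de-de", "ja-jp"}
FALLBACK = "en-us"


def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, best first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        pieces = [p.strip() for p in part.split(";")]
        tag = pieces[0].lower()
        if not tag:
            continue
        quality = 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # malformed weight: treat as unwanted
        if quality > 0:
            ranked.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(ranked)]


def choose_locale(header):
    """Pick a supported locale; unknown tags never raise, they fall back."""
    for tag in parse_accept_language(header or ""):
        if tag in SUPPORTED:
            return tag
        # Try another region of the same language (de-AT -> de-de).
        language = tag.split("-")[0]
        for candidate in sorted(SUPPORTED):
            if candidate.split("-")[0] == language:
                return candidate
    return FALLBACK
```
&lt;/PRE&gt;
&lt;P&gt;With this shape, a request that only lists tags the server has never heard of (tlh-US, haw-US, fj-FJ and so on) ends up on the default locale instead of throwing.&lt;/P&gt;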
&lt;P&gt;Currently users rarely change IE's language preference, but we expect that with Vista and in the future, more users will be choosing locales (cultures) that are more appropriate for them.&amp;nbsp; So it will only become more likely that your server will encounter language tags it doesn't recognize.&lt;/P&gt;
&lt;P&gt;- Shawn&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=732231" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category><category domain="http://blogs.msdn.com/shawnste/archive/tags/Custom+Cultures+_2F00_+Locales+_2F00_+CultureInfo/default.aspx">Custom Cultures / Locales / CultureInfo</category></item><item><title>Does "Everyone" Use IDN?</title><link>http://blogs.msdn.com/shawnste/archive/2006/08/25/718715.aspx</link><pubDate>Fri, 25 Aug 2006 21:40:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:718715</guid><dc:creator>shawnste</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/718715.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=718715</wfw:commentRss><description>&lt;P&gt;Recently it came to my attention that arbitrary implementation of IDN can break existing networks.&lt;/P&gt;
&lt;P&gt;Particularly on intranets, machine names have not always been restricted to ASCII.&amp;nbsp; In many of our markets it is quite common to have non-ASCII machine names.&lt;/P&gt;
&lt;P&gt;So in some cases there are existing deployments that use ANSI or UTF-8 machine names directly without relying on Punycode.&amp;nbsp; Internal DNS services often return these names.&amp;nbsp; It is also possible (but not of much use and very much not recommended) to have sub-domains (like xxx in xxx.example.com) that respond to 8-bit encoded domain names on the Internet itself.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=718715" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category></item><item><title>How Come IdnToAscii Fails For String XXXX?</title><link>http://blogs.msdn.com/shawnste/archive/2006/08/24/718700.aspx</link><pubDate>Thu, 24 Aug 2006 21:11:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:718700</guid><dc:creator>shawnste</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/718700.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=718700</wfw:commentRss><description>&lt;P&gt;I'm often asked why particular strings fail the IdnToAscii function.&amp;nbsp; The answer is "because it's not a legal IDN name, that's why" :-)&amp;nbsp; But why aren't some strings legal IDN names?&lt;/P&gt;
&lt;P&gt;Some rules act on the "label" level.&amp;nbsp; In &lt;A href="http://www.microsoft.com/"&gt;www.microsoft.com&lt;/A&gt;, the parts www, microsoft &amp;amp; com are each separate labels.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;IDN prohibits some characters, such as space characters, control characters, etc.&lt;/LI&gt;
&lt;LI&gt;IDN is currently at Unicode 3.2, so characters added after that point are prohibited.&amp;nbsp; Eventually that'll probably be upgraded to Unicode 5 or later, but currently many issues regarding security, etc. are preventing it.&lt;/LI&gt;
&lt;LI&gt;Our APIs also disallow the ASCII control and Backspace characters even if the IDN_USE_STD3_ASCII_RULES flag is not set, since control characters don't make sense in domain names.&amp;nbsp; The NUL character (0) in particular is bad in strings.&lt;/LI&gt;
&lt;LI&gt;There are rules regarding mixing bidirectional characters.&amp;nbsp; Basically if there are any right to left (RTL) characters, then there may not be any left to right (LTR) characters in the same label.&amp;nbsp; Those characters that aren't specifically RTL or LTR can be mixed with RTL or LTR characters.&lt;/LI&gt;
&lt;LI&gt;If the label has RTL characters, the first and last characters must be RTL as well.&amp;nbsp; Other&amp;nbsp;characters that aren't explicitly RTL or LTR may be in the string, but not in the first or last positions.&lt;/LI&gt;
&lt;LI&gt;Punycode labels are longer than their source Unicode strings.&amp;nbsp; So some Unicode strings will convert to Punycode labels that exceed the allowable DNS label or name&amp;nbsp;length (63 octets per label, 255 octets for the whole name).&amp;nbsp; Those will therefore be illegal as well.&lt;/LI&gt;&lt;/UL&gt;
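&lt;P&gt;Three of the label rules above can be sketched with Python's standard stringprep tables and punycode codec.&amp;nbsp; This is only an illustration of the rules, not how IdnToAscii is implemented and not a complete Nameprep; the function name and the reason strings are made up for the example.&lt;/P&gt;
&lt;PRE&gt;
```python
import stringprep

MAX_LABEL_OCTETS = 63  # DNS limit for a single label (RFC 1035)


def label_problem(label):
    """Return a short reason why `label` is rejected, or None if it passes
    the three checks sketched here (this is not a full Nameprep)."""
    # Characters that were unassigned in Unicode 3.2 (stringprep table A.1).
    if any(stringprep.in_table_a1(ch) for ch in label):
        return "unassigned in Unicode 3.2"
    # Bidi rules: table D.1 holds the RTL characters, D.2 the LTR ones.
    rtl = [stringprep.in_table_d1(ch) for ch in label]
    if any(rtl):
        if any(stringprep.in_table_d2(ch) for ch in label):
            return "mixes RTL and LTR characters"
        if not (rtl[0] and rtl[-1]):
            return "RTL label must start and end with an RTL character"
    # The ASCII form is the "xn--" prefix plus the Punycode text.
    if label.isascii():
        encoded_length = len(label)
    else:
        encoded_length = 4 + len(label.encode("punycode"))
    if encoded_length > MAX_LABEL_OCTETS:
        return "encoded label is too long"
    return None
```
&lt;/PRE&gt;
&lt;P&gt;For instance, a Hebrew label followed by a digit fails the first/last position rule, and sixty copies of U+00FC fit in 63 characters as Unicode but not once encoded.&lt;/P&gt;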
&lt;P&gt;For more information about IDN, you can look at the RFCs and sources:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="http://www.ietf.org/rfc/rfc3490.txt"&gt;http://www.ietf.org/rfc/rfc3490.txt&lt;/A&gt;&amp;nbsp;Internationalizing Domain Names in Applications (IDNA)&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://www.ietf.org/rfc/rfc3491.txt"&gt;http://www.ietf.org/rfc/rfc3491.txt&lt;/A&gt;&amp;nbsp;Nameprep: A Stringprep Profile for&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Internationalized Domain Names (IDN)&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://www.ietf.org/rfc/rfc3454.txt"&gt;http://www.ietf.org/rfc/rfc3454.txt&lt;/A&gt;&amp;nbsp;Preparation of Internationalized Strings ("stringprep")&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://www.ietf.org/rfc/rfc3492.txt"&gt;http://www.ietf.org/rfc/rfc3492.txt&lt;/A&gt;&amp;nbsp;Punycode: A Bootstring encoding of Unicode&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; for Internationalized Domain Names in Applications (IDNA)&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://en.wikipedia.org/wiki/Internationalizing_Domain_Names_in_Applications"&gt;http://en.wikipedia.org/wiki/Internationalizing_Domain_Names_in_Applications&lt;/A&gt;&amp;nbsp;Wikipedia's entry on IDN.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=718700" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category></item><item><title>IDN Mitigation API dll needs to be explicitly linked</title><link>http://blogs.msdn.com/shawnste/archive/2006/01/05/idn-mitigation-api-dll-needs-to-be-explicitly-linked.aspx</link><pubDate>Thu, 05 Jan 2006 20:52:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:509746</guid><dc:creator>shawnste</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/509746.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=509746</wfw:commentRss><description>&lt;P&gt;Since we (I) didn't provide a .lib file for the "Microsoft Internationalized Domain Name (IDN) Mitigation APIs 1.0", you'll have to explicitly link to the IDN Mitigation API dll, which means you'll have to make your own header &amp;amp; code something like:&lt;/P&gt;
&lt;P&gt;[5 May 2009 - updated with Siva's comment, but didn't actually try to compile it]&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;// Declare the stuff to use (repeat for each function/dll to import)&lt;BR&gt;typedef int (__stdcall *PFN_DOWNLEVELGETLOCALESCRIPTS)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; (LPCWSTR lpLocaleName, LPWSTR lpLCData, int cchData);&lt;BR&gt;&lt;BR&gt;&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;PFN_DOWNLEVELGETLOCALESCRIPTSm_pfnGetLocaleScripts = NULL;&lt;BR&gt;&lt;FONT face="Courier New"&gt;HMODULE m_hDownlevelDll = NULL;&lt;BR&gt;&lt;BR&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;...&lt;BR&gt;// Load the library and get our function pointer&lt;BR&gt;&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;m_hDownlevelDll = LoadLibrary(L"idndl.dll");&lt;BR&gt;if (m_hDownlevelDll != NULL)&lt;BR&gt;{&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; FARPROC pfn = GetProcAddress(m_hDownlevelDll, "DownlevelGetLocaleScripts");&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (pfn != NULL)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; m_pfnGetLocaleScripts = (PFN_DOWNLEVELGETLOCALESCRIPTS)pfn;&lt;BR&gt;&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;BR&gt;&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; // else error&lt;BR&gt;}&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;&lt;BR&gt;// else error&lt;BR&gt;...&lt;BR&gt;&lt;/FONT&gt;&lt;FONT face="Courier New"&gt;&lt;BR&gt;// Call the function using our pointer&lt;BR&gt;int&amp;nbsp;count = (*pfnGetLocaleScripts)( L"en-US", NULL,&amp;nbsp;0 );&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Hope that helps, we'll probably add the .lib in a future update, but that'll probably be a while.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=509746" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category></item><item><title>Microsoft Internationalized Domain Names (IDN) Mitigation APIs 1.0 Released</title><link>http://blogs.msdn.com/shawnste/archive/2005/08/09/449644.aspx</link><pubDate>Wed, 10 Aug 2005 01:02:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:449644</guid><dc:creator>shawnste</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/449644.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=449644</wfw:commentRss><description>&lt;P&gt;Today we posted some APIs that convert to &amp;amp; from IDN Punycode, provide some mitigation functionality for those APIs, and provide Unicode Normalization support.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://www.microsoft.com/downloads/details.aspx?FamilyId=E289BB2C-D111-4331-8FB2-CC6C5A026C93&amp;amp;displaylang=en"&gt;http://www.microsoft.com/downloads/details.aspx?FamilyId=E289BB2C-D111-4331-8FB2-CC6C5A026C93&amp;amp;displaylang=en&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;See the documentation included with the download for details, but this could help apps that need IDN or normalization functionality.&amp;nbsp; Managed apps should of course use the .Net v2.0 String.Normalize()&amp;nbsp;method and the IdnMapping class :-)&lt;/P&gt;
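&lt;P&gt;For readers who just want to see what normalization and Punycode conversion do, here is a rough illustration using Python's standard library as a stand-in.&amp;nbsp; It is not the downloadable API or the .Net classes; Python's built-in idna codec follows RFC 3490, and the helper names below are made up for the example.&lt;/P&gt;
&lt;PRE&gt;
```python
import unicodedata

# Two spellings of the same visible text: precomposed U+00FC versus
# "u" followed by the combining diaeresis U+0308.
composed = "z\u00fcrich.ws"
decomposed = "zu\u0308rich.ws"


def normalize(text, form="NFC"):
    """Return `text` in the requested Unicode normalization form."""
    return unicodedata.normalize(form, text)


def to_ascii(name):
    """Convert a Unicode domain name to its Punycode (ASCII) form."""
    return name.encode("idna").decode("ascii")


def to_unicode(name):
    """Convert a Punycode domain name back to its Unicode form."""
    return name.encode("ascii").decode("idna")
```
&lt;/PRE&gt;
&lt;P&gt;The two spellings compare unequal as raw strings, become equal after normalization, and both convert to the same ASCII name because the IDN conversion normalizes each label first.&lt;/P&gt;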
&lt;P&gt;5 Jan 2006 - See &lt;a href="https://blogs.msdn.com:443/shawnste/archive/2006/01/05/509746.aspx"&gt;http://blogs.msdn.com/shawnste/archive/2006/01/05/509746.aspx&lt;/A&gt;&amp;nbsp;for notes on loading the dll.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=449644" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category></item><item><title>IDN &amp; Homographs</title><link>http://blogs.msdn.com/shawnste/archive/2005/03/03/384692.aspx</link><pubDate>Fri, 04 Mar 2005 00:10:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:384692</guid><dc:creator>shawnste</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/shawnste/comments/384692.aspx</comments><wfw:commentRss>http://blogs.msdn.com/shawnste/commentrss.aspx?PostID=384692</wfw:commentRss><description>&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;I’ve divided this into a few parts:&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l2 level1 lfo1; tab-stops: list .5in"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;About IDN&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l2 level1 lfo1; tab-stops: list .5in"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;IDN &amp;amp; Security&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l2 level1 
lfo1; tab-stops: list .5in"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Homograph Thoughts&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l2 level1 lfo1; tab-stops: list .5in"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Conclusion&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;h3 style="MARGIN: 12pt 0in 3pt"&gt;&lt;font face="Arial"&gt;About IDN:&lt;/font&gt;&lt;/h3&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;My interest in IDN is that I’m the SDE for the System.Globalization.IdnMapping class in &lt;a href="http://lab.msdn.microsoft.com/vs2005/default.aspx"&gt;Whidbey&lt;/a&gt;.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;I also think it’s pretty nifty for the users in countries that use more than the basic Latin letters. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;For those of you that don’t know, IDN/IDNA is trying to solve the problem of international (non-ASCII) characters in domain names. IDN is an “Internationalized Domain Name”. 
RFC 3490 - Internationalizing Domain Names in Applications &lt;a href="http://www.faqs.org/rfcs/rfc3490.html"&gt;http://www.faqs.org/rfcs/rfc3490.html&lt;/a&gt; has the details.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;IDN only addresses domain names; it doesn’t address the email address user name issue or other internationalization issues related to URLs/URIs/IRIs.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Before IDN, domain names were basically restricted to the Latin character set, A-Z, 0-9 and sometimes -. This is fine if your company is Microsoft.com, but not so helpful if your company’s name is written in Chinese or Cyrillic characters. IDN provides a mechanism for encoding additional &lt;A href="http://weblogs.asp.net/shawnste/category/9699.aspx"&gt;Unicode&lt;/a&gt; characters using the allowed a-z, 0-9 and - characters. 
So a name like &lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: 'MS Mincho'; mso-ascii-font-family: Arial; mso-hansi-font-family: 'Times New Roman'; mso-bidi-font-family: Arial"&gt;きくどら&lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;.com (&lt;?xml:namespace prefix = st1 ns = "urn:schemas-microsoft-com:office:smarttags" /&gt;&lt;st1:place w:st="on"&gt;&lt;st1:PlaceName w:st="on"&gt;Kikuna&lt;/st1:PlaceName&gt; &lt;st1:PlaceName w:st="on"&gt;Driving&lt;/st1:PlaceName&gt; &lt;st1:PlaceType w:st="on"&gt;School&lt;/st1:PlaceType&gt;&lt;/st1:place&gt;) or www.mäkitorppa.com (Mäkitorppa mobile store) is represented like xn--w8je2f2f.com (&lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: 'MS Mincho'; mso-ascii-font-family: Arial; mso-hansi-font-family: 'Times New Roman'; mso-bidi-font-family: Arial"&gt;きくどら&lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;) or www.xn--mkitorppa-v2a.com (&lt;a href="http://www.mäkitorppa.com/"&gt;www.mäkitorppa.com&lt;/a&gt;).&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;So IDN doesn’t require any changes to the DNS layers of the Internet, but it does require conversion from the Unicode to the ASCII “Punycode” form of a name at some point. 
A Whidbey .Net application uses the System.Globalization.IdnMapping class to convert between the Unicode and “Punycode” forms.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&amp;nbsp;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;In addition to the punycode conversion, IDN does some &lt;a href="http://www.unicode.org/reports/tr15/tr15-23.html"&gt;normalization&lt;/a&gt; using NFKC and additional mappings such as making the strings all the same case.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Some Unicode characters are considered ambiguous or dangerous and are disallowed in IDN, others are folded into a more common form to prevent some repetition.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;h3 style="MARGIN: 12pt 0in 3pt"&gt;&lt;font face="Arial"&gt;IDN &amp;amp; Security:&lt;/font&gt;&lt;/h3&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;IDN disallows some Unicode characters considered dangerous and “folds” others into a more common form in some cases if they are ambiguous.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Even with these restrictions, it 
was quite obvious that many look-alike characters, or &lt;a href="http://www.unicode.org/reports/tr36/#visual_spoofing"&gt;homographs&lt;/a&gt;, exist in Unicode.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Examples exist even in ASCII, as you can construct MICROSOFT.com as MlCR0S0FT.com by using the lowercase el and zero characters.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;DNS would think this is a different domain name and send a user to a different server, yet, depending on your font, it could be difficult to distinguish from the real domain name.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Unicode has tens of thousands of characters, so when the IDN RFC was created it was clear that the homograph problem becomes even more complicated when Unicode characters are allowed.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;For example, Місrоѕоft.com can be written almost entirely in Cyrillic letters (this example has only the r, f &amp;amp; t in Latin. 
Я just doesn’t look quite the same ;-)).&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Even worse, some scripts have characters that are difficult to distinguish.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Many Chinese characters appear very similar in small fonts.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Other scripts have minor diacritics that could be missing or slightly modified such that the user might not notice.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Due to the complexity of the problem, the IDN RFCs leave the homograph problem to be resolved later, perhaps by the registrars or a future RFC. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Since IDN doesn’t directly address the homograph problem, users could be susceptible to spoofing, phishing and other social engineering attacks.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;This is exactly what happened with the recent paypal attack.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;The IDN name pаypal.com was registered with a Cyrillic a for the first A.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;&lt;span style="COLOR: #333333"&gt;xn--pypal-4ve.com is the punycode version of this name.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 
0pt"&gt;&lt;span style="FONT-SIZE: 10pt; COLOR: #333333; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; COLOR: #333333; FONT-FAMILY: Arial"&gt;A user following such a link in some browsers would see what looked like paypal.com in their address bar, but would actually be a different web site.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;An email or link from another web site could be used to trick a user into providing their paypal information to an attacker.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;This type of attack is similar to the socially engineered emails that have already been used to try to get users to enter personal information by trying to get them to go to https://safe-com.com/ebay or some such URL instead of a real vendor site.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; COLOR: #333333; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; COLOR: #333333; FONT-FAMILY: Arial"&gt;Some people were amused that Mozilla, Firefox and other browsers were susceptible to the &lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;pаypal.com homograph attack,&lt;span style="COLOR: #333333"&gt; but Internet Explorer is not (because it doesn’t do IDN conversion).&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Equally interesting is the browser reaction of removing IDN support and then choosing to display the Punycode name instead.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; COLOR: #333333; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p 
class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; COLOR: #333333; FONT-FAMILY: Arial"&gt;Personally I don’t think this is just an IDN weakness.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Rather IDN merely makes an existing problem with trusting links more obvious.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;http://paypal-safe.com or http://secure.com/paypal would catch many users anyway.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; COLOR: #333333; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;h3 style="MARGIN: 12pt 0in 3pt"&gt;&lt;font face="Arial"&gt;Homograph Thoughts:&lt;/font&gt;&lt;/h3&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; COLOR: #333333; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; COLOR: #333333; FONT-FAMILY: Arial"&gt;My thinking is that basically the IDN &lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;pаypal.com attack where the first A is Cyrillic is a social attack.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;For this attack to succeed, a user must first follow an untrusted link.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;That link could be a web site (please buy my book at Amazon.com) or an email (click here to update your bank information).&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Some users are already wary of following unsolicited links from email, but don’t think twice about a web link.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;In either case, a look-alike name in the address bar of the browser would be 
reassuring.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Several solutions to the homograph problem have been suggested.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;I don’t have a magic bullet, but I think that the root problem is a user education/social engineering problem; remember that in some cases these attacks can happen even with non-IDN DNS names.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;The following are suggestions I’ve seen in various places and my thoughts about them.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Other people and coworkers disagree, so these are merely my thoughts.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Several suggestions seem American-centric, and I’m disappointed that the developer community doesn’t have a broader global perspective.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt 39pt; TEXT-INDENT: -0.25in; mso-list: l1 level1 lfo2; tab-stops: list 39.0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;&lt;span style="mso-list: Ignore"&gt;·&lt;span style="FONT: 7pt 'Times New 
Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span dir="ltr"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Disabling IDN&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt; – This is perhaps the most obvious suggestion, but seems quite short-sighted.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;After all, IDN was created to solve a real problem for DNS names that are not ASCII.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;If your corporate name is &lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: 'MS Mincho'; mso-ascii-font-family: Arial; mso-hansi-font-family: Arial; mso-bidi-font-family: Arial"&gt;きくどら&lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-fareast-font-family: 'MS Mincho'"&gt;, this “solution” doesn’t help at all.&lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt 21pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt 39pt; TEXT-INDENT: -0.25in; mso-list: l1 level1 lfo2; tab-stops: list 39.0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;&lt;span style="mso-list: Ignore"&gt;·&lt;span style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span dir="ltr"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Displaying Punycode&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt; – This is also quite a non-global suggestion.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; 
&lt;/span&gt;While showing Punycode solves the particular pаypal.com problem, it’s even worse for the &lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: 'MS Mincho'; mso-ascii-font-family: Arial; mso-hansi-font-family: Arial; mso-bidi-font-family: Arial"&gt;きくどら&lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial; mso-fareast-font-family: 'MS Mincho'"&gt; user.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;&lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;xn--w8je2f2f.com is the same as xn--gibberish.com to their users (xn--gibberish.com would be &lt;span dir="rtl"&gt;&lt;/span&gt;&lt;span dir="rtl"&gt;&lt;/span&gt;&lt;span lang="AR" dir="rtl"&gt;&lt;span dir="rtl"&gt;&lt;/span&gt;&lt;span dir="rtl"&gt;&lt;/span&gt;ٮ٨٧٩ٯٲٳ&lt;/span&gt;&lt;span dir="ltr"&gt;&lt;/span&gt;&lt;span dir="ltr"&gt;&lt;/span&gt;&lt;span dir="ltr"&gt;&lt;/span&gt;&lt;span dir="ltr"&gt;&lt;/span&gt;.com; cool, it even decodes!)&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Even in the www.mäkitorppa.com case, how is a user supposed to know if it’s www.xn--mkitorppa-v2a.com or www.xn--mkitorppa-01a.com?&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;For non-US users, displaying the Punycode doesn’t solve the problem; it makes it worse.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;There are many other suggestions though.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;My concerns with these suggestions are that either a) they are too restrictive, preventing reasonable names, or b) they aren’t sufficient to catch all of the problems, or 
both.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Registrars Should Prohibit Bad Scripts&lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt; – The idea is that since certain countries would only expect names in their language(s), only those scripts should be allowed.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Some registrars are doing this and it seems reasonable; however, since “everyone” uses .com and the other well-known top-level domains, this solution is pretty incomplete.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;I also wonder if this might be a bit too aggressive.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;What about a Chinese grocery in Germany?&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;It seems reasonable that they might want a Chinese-character .de domain name.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Fortunately for them (but not for .com), they can currently fall back to .com in this case :-)&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p class="MsoNormal" 
style="MARGIN: 0in 0in 0pt 0.25in"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Disallowing Mixed Scripts&lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt; – The thought is that since paypal was spelled with mixed Cyrillic and Latin, disallowing such mixes should prevent such attacks.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;Unfortunately, Latin is often allowed with other scripts, particularly for businesses with tech names, where it is sometimes popular to pick up words in Latin script.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Variations of this suggestion include disallowing mixed scripts, except that Latin could appear with most scripts other than Cyrillic.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;I question, however, whether this is accurate.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;I can easily imagine an import-export company with a multi-script-but-not-Latin name.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;I’ve also heard that even Cyrillic is occasionally used with Latin, although I don’t know how the user’s supposed to type a mixed Cyrillic/Latin name.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Disallow Non-User 
Scripts&lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt; – Check the user locale or keyboard or whatever to see if the URL is in a script the user uses.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Many users, however, have second-language skills that aren’t necessarily represented by their current locale or keyboard choice.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Prohibiting or Normalizing Homographs (applications) &lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;– This is an interesting suggestion, but I have several concerns.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Which form is “correct” if a browser encounters a Cyrillic and a Latin version of the same URL?&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;I’d imagine that this happens often.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Who is going to find out which ones look alike?&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Using which fonts?&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;I cannot imagine that all homographs could be caught.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;There are probably a billion combinations to check, and we (the developer community) would only catch the obvious ones.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;The “bad guys” would undoubtedly catch the one(s) we missed.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;This also doesn’t help with some subtle character differences: even í, i, 
ì, and ï look pretty close, and it’s pretty obvious that í and ì can’t be rejected just because they look like i.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Certificates Are The Solution &lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;– This has the same problem as trying to get the registrars to do it, although since the fees are higher they might have more resources to address the problem.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;Again, only one CA would have to let a bad name pass through.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Additionally, despite the little key icon, I suspect many users don’t realize whether they’re on a secure connection or not.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;IDN APIs Should Filter Bad Names &lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;– If homograph filters are implemented, the RFC, registrars or Unicode should drive them.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Obviously, different software vendors can’t have different filtering or else those names would break, and I have difficulty suggesting that a registered name 
cannot be accessed by a software program.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Users Should Retype URLs &lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;– This seems a bit harsh.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Personally, I only care whether a URL is “safe” if I’m about to enter personal data.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;Many URLs are too long to retype, and this would also be bad for those people who get credit for referrals to Amazon and the like.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Display URLs in Lower Case&lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt; – Some have suggested that lower-case letters are less likely to be confused; however, there are still cases like rn vs. m (r + n vs. m), and í, i, and ì, that look very similar. 
&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;Font choice could help, but probably can’t solve the problem, particularly for some users.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.25in"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Some suggestions seem more reasonable to me.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;(Of course, I’m just me, so other people probably have other ideas).&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Registrars Should Prohibit Homographs &lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;– The UTC is working on guidelines to help registrars with this problem (&lt;a href="http://www.unicode.org/reports/tr36/"&gt;UTR #36 Security Considerations for the Implementation of Unicode and Related Technology&lt;/a&gt;). 
This seems quite reasonable if consistently implemented, but it seems likely that some combinations will still slip through, like using rn instead of m.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt 0.25in"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Whitelisting &lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;– This would be some sort of mechanism for trusting sites. &lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;I like this because it would address IP addresses in phishing mail, URLs with keywords and other attacks.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;The idea is that if the user tries to fill in a form on a new site, they’d be prompted before the action was allowed. &lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;If the site was already whitelisted, no prompt would appear. &lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;So a user who used paypal.com would have some warning if they followed an e-mail link to http://1.2.3.4/paypal.com or http://secure.com/paypal. &lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;The downside is that determining which forms require extra protection could be hard. 
&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;Additionally, kiosks like those in hotels or libraries couldn’t persist the whitelist, or else someone could intentionally whitelist an unsafe site to trap future users.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;ul style="MARGIN-TOP: 0in" type="disc"&gt; &lt;li class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-list: l0 level1 lfo3; tab-stops: list .5in"&gt;&lt;b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;User Education&lt;/span&gt;&lt;/b&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt; – At some level users need to be aware of what behaviors are safe or unsafe. &lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;Unfortunately there are a LOT of users, so even if only a fraction of a percent remain unaware of these issues, many people would still be at risk. &lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;This could also address problems such as users who use the same name and password for every site they register with.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;One bogus contest entry and they could be at risk if the same credentials worked on PayPal. 
&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;h3 style="MARGIN: 12pt 0in 3pt"&gt;&lt;font face="Arial"&gt;Conclusion:&lt;/font&gt;&lt;/h3&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;Various technical problems, including IDN, can be combined with well-designed social attacks to lead a user to trust a web site that shouldn’t really be trusted. &lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;Vigilance by the registrars could prevent many homograph attacks; however, some will undoubtedly still be possible. &lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;Font choices and browser behavior might prevent a few more mistakes, but those can be offset by poor eyesight, dyslexia, monitor differences, color choices (user or application), platform (mobile or PC) and other differences.&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/span&gt;User education can also help catch a percentage of the problems.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;It seems to me that all of these taken together will reduce the available surface for attacks, but there will still be a window for the attackers to attempt their exploits. 
&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/span&gt;Many of those socially engineered attacks don’t even require IDN Homographs to trap some unwary or uneducated users.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;</description><category domain="http://blogs.msdn.com/shawnste/archive/tags/IDN+_2800_Internationalized+Domain+Names_2900_/default.aspx">IDN (Internationalized Domain Names)</category></item></channel></rss>