<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>SIAO to Search engines -- would you please normalize, already?</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx</link><description>Suzanne has been riffing on me in relation to Vietnamese and then she shifted over to talk about Google and other languages, so I thought I would riff off of her a bit. :-) By the way Suzanne -- I did not find your terminology to be inaccurate; it was</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>re: SIAO to Search engines -- would you please normalize, already?</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#492869</link><pubDate>Tue, 15 Nov 2005 13:01:06 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:492869</guid><dc:creator>Andrew West</dc:creator><description>&amp;quot;Andrew West's claim that Google is normalizing appears to be crap&amp;quot;&lt;br&gt;&lt;br&gt;NOT ME! &amp;quot;Andrew C.&amp;quot; is someone else ... I don't make crap claims ;)&lt;br&gt;&lt;br&gt;... and anyway he claimed that Google was *not* normalizing.&lt;br&gt;</description></item><item><title>re: SIAO to Search engines -- would you please normalize, already?</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#492881</link><pubDate>Tue, 15 Nov 2005 14:17:32 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:492881</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Sincerest apologies, Andrew -- I have removed the bogus reference to you. It was Simon who was trying to make the claim (I am not sure who Simon is here).
&lt;br&gt;&lt;br&gt;Correction made, and again I am very sorry.</description></item><item><title>re: SIAO to Search engines -- would you please normalize, already?</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#493173</link><pubDate>Wed, 16 Nov 2005 03:02:52 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:493173</guid><dc:creator>Suzanne McCarthy</dc:creator><description>Isn't Simon's claim abaout Greek correct? Normalization is happening for some sequnces but not others. I get the same results in French and German with and without the precomposed 'vowels plus diacritics'. </description></item><item><title>re: SIAO to Search engines -- would you please normalize, already?</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#493175</link><pubDate>Wed, 16 Nov 2005 03:09:01 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:493175</guid><dc:creator>Suzanne McCarthy</dc:creator><description>BTW I forgot to add that I really appreciate this little experiment that you created here. Thank you, Mike.  </description></item><item><title>re: SIAO to Search engines -- would you please normalize, already?</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#493180</link><pubDate>Wed, 16 Nov 2005 03:30:10 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:493180</guid><dc:creator>Michael S. Kaplan</dc:creator><description>If I had to guess, I would say they are not using Unicode normalization at all -- they are building their own homegrown system that happens to work in some cases but not others....</description></item><item><title>re: SIAO to Search engines -- would you please normalize, already?</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#493192</link><pubDate>Wed, 16 Nov 2005 04:19:26 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:493192</guid><dc:creator>Michael S. 
Kaplan</dc:creator><description>You are very welcome, Suzanne. It was fun. :-)&lt;br&gt;&lt;br&gt;</description></item><item><title>re: SIAO to Search engines -- would you please normalize, already?</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#493830</link><pubDate>Thu, 17 Nov 2005 13:44:40 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:493830</guid><dc:creator>Jim</dc:creator><description>Ahem, a microsoft employee rubbishing google through speculation...</description></item><item><title>re: SIAO to Search engines -- would you please normalize, already?</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#493866</link><pubDate>Thu, 17 Nov 2005 16:31:15 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:493866</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Hi Jim,&lt;br&gt;&lt;br&gt;This is not rubbish -- tet yourself. Canonically equivalent Unicode forms do not find the same pages.&lt;br&gt;&lt;br&gt;If I were trying to diss just Google I would not have pointed out that Microsoft is also guilty though, obviously. Both of them need to do this.</description></item><item><title>re: SIAO to Search engines -- would you please normalize, already?</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#494196</link><pubDate>Fri, 18 Nov 2005 03:46:03 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:494196</guid><dc:creator>Mike</dc:creator><description>Very good info - hopefully this will be a big wakeup call for the big 2.  I hope those that have the power to do something about this have been &amp;quot;educated&amp;quot; now. 
</description></item><item><title>What is equal to some may not be equal to others</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#525510</link><pubDate>Mon, 06 Feb 2006 11:29:41 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:525510</guid><dc:creator>Sorting It All Out</dc:creator><description>Someone going by the handle AC asked me via email:&lt;br&gt;&lt;br&gt;You have mentioned that Google has trouble with...</description></item><item><title>Harder intermediate forms of characters</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#597334</link><pubDate>Sun, 14 May 2006 12:32:33 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:597334</guid><dc:creator>Sorting It All Out</dc:creator><description>In the post Getting intermediate forms, I gave an example of three character sequences that look the same...</description></item><item><title>The search for someone who does Search correctly</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#1761989</link><pubDate>Mon, 26 Feb 2007 12:05:38 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1761989</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;p&gt;Thinking about the issues involved with &amp;#224; ≠ a (unless &amp;#224; = a) made me think back to other posts where&lt;/p&gt;
</description></item><item><title>SIAO is still underwhelmed by search engines (all of them)</title><link>http://blogs.msdn.com/michkap/archive/2005/11/15/492301.aspx#3891389</link><pubDate>Mon, 16 Jul 2007 10:16:01 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:3891389</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;p&gt;You may have read in Arial Unicode MS effectively [bites|sucks|blows] about how Microsoft MVP Omi Azad&lt;/p&gt;
</description></item></channel></rss>