<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx</link><description>Well, Jochen Neyens asked: What's the easiest way to remove diacritic marks from characters using C#? I would like to have following function: string RemoveDiacriticMark(string c) Sample use: RemoveDiacriticMark("é") -&amp;gt; "e" RemoveDiacriticMark("ü")</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#376662</link><pubDate>Sat, 19 Feb 2005 19:29:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:376662</guid><dc:creator>Johan Petersson</dc:creator><description>In what context would such a transformation ever be useful? I can understand the need for lossy remapping, e.g. from a Unicode encoding to ASCII, but not blindly stripping diacritics.&lt;br&gt;&lt;br&gt;Sorry if this is a stupid question; I'm just curious.</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#376685</link><pubDate>Sat, 19 Feb 2005 19:55:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:376685</guid><dc:creator>Michael Kaplan</dc:creator><description>Not stupid at all -- in most contexts you would probably be correct. And as I have said previously (cf: &lt;a target="_new" href="http://blogs.msdn.com/michkap/archive/2005/02/05/367666.aspx"&gt;http://blogs.msdn.com/michkap/archive/2005/02/05/367666.aspx&lt;/a&gt;) sometimes doing so would be destructive of language content.
&lt;br&gt;&lt;br&gt;This was kind of code to spec, based on a question. I had a moment to kill and I realized it would show off new features, so.... :-)</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#376868</link><pubDate>Sun, 20 Feb 2005 08:35:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:376868</guid><dc:creator>Jochen Neyens</dc:creator><description>I've a webform that displays a companylist with A B C .. Z paging instead of 1 2 3 ... N. After adding a new company name, using the admin interface, I would like to jump to the corresponding letter page. Suppose someone enters &amp;quot;&amp;#233;&amp;#233;n&amp;quot; as company name then I'd have to jump to letter page E. So, I need someting like:&lt;br&gt;&lt;br&gt;JumpToPage(RemoveDiacritics(Left(cmpname,1)));&lt;br&gt;&lt;br&gt;The application is written for .NET 1.1 so I cannot use the new Whidbey feature (although it's nice to know of it's existince)!&lt;br&gt;&lt;br&gt;I'll have a go at the &lt;a title="FoldString" href="http://msdn.microsoft.com/library/en-us/winui/winui/windowsuserinterface/resources/strings/stringreference/stringfunctions/foldstring.asp" target="_blank"&gt;FoldString&lt;/a&gt; API and see how far I can get. This would actually be the first time I'd use p/invoke to call a Win32 DLl :-)&lt;br&gt;&lt;br&gt;Thanks for answering my question Michael!&lt;br&gt;&lt;br&gt;PS: If someone reading this blog already has has an .NET 1.1 compliant RemoveDiacritics function it would be nice to have it posted here...
&lt;br&gt;&lt;br&gt;</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#376883</link><pubDate>Sun, 20 Feb 2005 09:23:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:376883</guid><dc:creator>Michael Kaplan</dc:creator><description>Well, you will want to be careful -- since for some people those &amp;quot;letters with diacritics&amp;quot; are actually letters in their own right....</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#376983</link><pubDate>Sun, 20 Feb 2005 18:51:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:376983</guid><dc:creator>Johan Petersson</dc:creator><description>Thanks for explaining Jochen. There are company names starting with non-letters too (like digits or @), which might be a problem. Basing the page index on first characters actually in use for company names might be an alternative.</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#397317</link><pubDate>Thu, 17 Mar 2005 07:07:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:397317</guid><dc:creator>Alejandro Lapeyre</dc:creator><description>I would use two strings:&lt;br&gt;&lt;br&gt;string1=&amp;quot;&amp;#225;&amp;#233;&amp;#237;&amp;#243;&amp;#250;&amp;#241;&amp;quot;&lt;br&gt;string2=&amp;quot;aeioun&amp;quot;&lt;br&gt;&lt;br&gt;and replace the occurrences of the characters in the first string with the characters of the second string.</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#398127</link><pubDate>Thu, 17 Mar 2005 19:28:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:398127</guid><dc:creator>Evan 
Stein</dc:creator><description>First, there are hugely good reasons for stripping diacritics, mainly for the purpose of searching in plain ASCII. This is lossy stuff, but when you're going down to seven bits, that's the idea. The best method is to keep the full string for display and manipulation, and to maintain a search column alongside the original.&lt;br&gt;&lt;br&gt;I have a function (actually, a class) that does this in the current C#. It's about 1000 lines, and a bit inconvenient to put in a blog, but if someone can tell me where to post the thing (or sends me a mail), I'd be delighted to share the code.&lt;br&gt;&lt;br&gt;Here's how I did it (I’m in verbose mode here -- apologies):&lt;br&gt;&lt;br&gt;1) Went to the Unicode database, which is actually a series of text files. Got the code ranges for the Latin set [Scripts.txt]. Got the character values from UnicodeData.txt. There are about 933 characters that qualify as Latin, in all its extensions and modifications.&lt;br&gt;&lt;br&gt;2) Field 6 of UnicodeData.txt (starting from 1) has a decomposition map, and can be recursed until no more decomposition is possible. Wrote a routine to do this and write the results to a file. &lt;br&gt;&lt;br&gt;3) This took care of the vast majority of values, except some of the IPA characters and the really outlandish ones, such as Anglo-Saxon characters descended from runes. There were about 100 cases where the Unicode Consortium played it safe and didn't suggest any decomposition values, and I don't blame them at all. Armed with the descriptions in UnicodeData.txt (&amp;quot;open, upside-down, backwards, small capital O with a ring, 2 tatoos and a piercing&amp;quot;), as well as a PDF showing what the characters look like, I did my best. I'd like to repeat that some of these are pretty outlandish, in case you skipped when I said that the first 4 times. I'll be shocked if anyone even notices what the choices were.
&lt;br&gt;&lt;br&gt;4) Originally I held the Unicode database in Oracle, since putting data into program code goes so much against the grain. I also did the recursion at runtime. But, for speed and distribution, you can't beat code, and you also can't beat pre-finished data. I got the finished data into a file, performed step 3, did some word processing ... and dropped it into C#.&lt;br&gt;&lt;br&gt;5) Created a class with a static constructor, which loads the data only once. The data is in the form of a Hashtable, so that the Unicode character itself is the key to the fully normalized ASCII character. The speed seems reasonably good.&lt;br&gt;&lt;br&gt;6) Wrote the StringStrip( ) function, which copies an input string to an output string character by character in a &amp;quot;for loop&amp;quot;. If it encounters a character it doesn't have (i.e., non-Latin or punctuation), it simply copies that character to the output string without altering it. The one caveat is that some of the characters are diagraphs (e.g., &amp;quot;Dz&amp;quot;), so if you're in a tight spot you'll have to measure the string you get back before using it.&lt;br&gt;&lt;br&gt;The ink is still wet, but reading this blog spurred me to get it done. As I said, I'd be delighted to share this, especially if I get some tips back on my ham-handed coding. If you like, you can reach me at Evan@travelogues/DOT/net.&lt;br&gt;&lt;br&gt;Regards,&lt;br&gt;Evan</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#406255</link><pubDate>Thu, 07 Apr 2005 20:59:27 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:406255</guid><dc:creator>Francois Beauchemin</dc:creator><description>Hi, &lt;br&gt;&lt;br&gt;I'm trying to strip diacritics with the FoldString API. My code seem ok but the regex does not work with the Mn category but with the Sk category. What i'me doing wrong?
&lt;br&gt;&lt;br&gt;		[Flags]&lt;br&gt;		private enum MapFlags &lt;br&gt;		{&lt;br&gt;			MAP_FOLDCZONE   =  0x00000010,// fold compatibility zone chars&lt;br&gt;			MAP_PRECOMPOSED =  0x00000020,// convert to precomposed chars&lt;br&gt;			MAP_COMPOSITE   = 0x00000040, // convert to composite chars&lt;br&gt;			MAP_FOLDDIGITS  = 0x00000080  // all digits to ASCII 0-9&lt;br&gt;		}&lt;br&gt;&lt;br&gt;		[DllImport(&amp;quot;kernel32.dll&amp;quot;, SetLastError=true)]&lt;br&gt;		static extern int FoldString(MapFlags dwMapFlags, string lpSrcStr, int cchSrc,&lt;br&gt;			[Out] StringBuilder lpDestStr, int cchDest);&lt;br&gt;&lt;br&gt;	&lt;br&gt;		public static string RemoveDiacritics(string stIn) {&lt;br&gt;			StringBuilder sb = new StringBuilder();		&lt;br&gt;			int ret = FoldString(MapFlags.MAP_COMPOSITE , stIn, stIn.Length,  sb, stIn.Length * 2);			&lt;br&gt;			return Regex.Replace(sb.ToString(), @&amp;quot;\p{Sk}&amp;quot;, &amp;quot;&amp;quot;);						&lt;br&gt;		}</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#406269</link><pubDate>Thu, 07 Apr 2005 21:51:24 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:406269</guid><dc:creator>Michael S. Kaplan</dc:creator><description>I do not know what charactrs you are referring to, but you can see the code -- it is only stripping UnicodeCategory.NonSpacingMark -- you would have to strip the other category too if you wanted it gone....</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#406288</link><pubDate>Thu, 07 Apr 2005 22:47:08 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:406288</guid><dc:creator>Francois Beauchemin</dc:creator><description>My problem is that FoldString whith MAP_COMPOSITE return a string with UnicodeCategory.ModifierSymbol instead of NonSpacingMark. 
&lt;br&gt;&lt;br&gt;For exemple the character &amp;#251; (0x00FB) is expanded to 0x0015 0x005E instead of 0x0015 0x0302&lt;br&gt;&lt;br&gt;Anyway I think it's ok for my case. (removing accented char from french contry name)</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#406347</link><pubDate>Fri, 08 Apr 2005 02:28:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:406347</guid><dc:creator>Michael S. Kaplan</dc:creator><description>So you could modify the code to look for both UnicodeCategory.ModifierSymbol and UnicodeCategory.NonSpacingMark, rather than just UnicodeCategory.NonSpacingMark as it does now, right?&lt;br&gt;&lt;br&gt;FoldString is of course not based on nomalization, as I explain at &lt;a rel="nofollow" target="_new" href="http://blogs.msdn.com/michkap/archive/2005/01/31/363701.aspx"&gt;http://blogs.msdn.com/michkap/archive/2005/01/31/363701.aspx&lt;/a&gt; . 
:-)</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#408407</link><pubDate>Fri, 15 Apr 2005 09:17:57 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:408407</guid><dc:creator>Quan Nguyen</dc:creator><description>Until Whidbey, another way to strip diacritics is after performing the NFD on the input string (decompose), use RegEx to delete all the Combining Diacritical Marks, such as:&lt;br&gt;&lt;br&gt;Normalizer decomposer = new Normalizer(Normalizer.D, false);&lt;br&gt;string result = decomposer.normalize(inputString);&lt;br&gt;result = Regex.Replace(result, &amp;quot;\\p{IsCombiningDiacriticalMarks}+&amp;quot;, &amp;quot;&amp;quot;);&lt;br&gt;&lt;br&gt;This is how it is done in VietPad.NET (&lt;a rel="nofollow" target="_new" href="http://vietpad.sf.net"&gt;http://vietpad.sf.net&lt;/a&gt;).</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#408410</link><pubDate>Fri, 15 Apr 2005 09:35:59 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:408410</guid><dc:creator>Michael S. Kaplan</dc:creator><description>The only thing I know of called &amp;quot;Normalizer&amp;quot; out of Microsoft is the internal name for the Microsoft Access wizard that is officially called the Table Analyzer. It has nothing to do with Unicode normalization.
&lt;br&gt;&lt;br&gt;Since there is no class that will do normalization in .NET until Whidbey, I am not sure where this code would work....</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#409890</link><pubDate>Wed, 20 Apr 2005 06:32:47 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:409890</guid><dc:creator>Quan Nguyen</dc:creator><description>I'm sorry, Normalizer is one of Unicode (ICU) Java classes that I ported to C#. It performs Unicode Normalization Forms like those that are going to be supported in Whidbey.&lt;br&gt;&lt;br&gt;The point is you can strip the diacritics simply by deleting them using Regexp, rather than checking the UnicodeCategory of every character.</description></item><item><title>Stripping out diacritics, redux</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#446487</link><pubDate>Tue, 02 Aug 2005 09:56:56 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:446487</guid><dc:creator>Sorting It All Out</dc:creator><description>This last week, Dean Harding asked in the suggestion box:&lt;br&gt;&lt;br&gt;Hey Michael, after all these years of reading...</description></item><item><title>
		 
Eliminando acentos con .NET 2</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#575928</link><pubDate>Thu, 13 Apr 2006 21:33:13 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:575928</guid><dc:creator>Eliminando acentos con .NET 2</dc:creator><description>PingBack from &lt;a rel="nofollow" target="_new" href="http://www.buayacorp.com/archivos/eliminando-acentos-con-net-2/"&gt;http://www.buayacorp.com/archivos/eliminando-acentos-con-net-2/&lt;/a&gt;</description></item><item><title>Those letters are stripping off their diacritics in public again, the sluts!</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#766178</link><pubDate>Fri, 22 Sep 2006 17:05:50 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:766178</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;br&gt;Have you ever noticed how the bigger fan someone seems to be of your blog, the more likely they are...</description></item><item><title>Removing diacritics (accents) from strings</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#904534</link><pubDate>Tue, 31 Oct 2006 01:06:17 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:904534</guid><dc:creator>Fabrice's weblog</dc:creator><description>&lt;P&gt;It's often useful to remove diacritic marks (often called accent marks) from characters. You know:&lt;/P&gt;</description></item><item><title>Joe.Blog  &amp;raquo; Making string URL friendly, redux</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#1280323</link><pubDate>Thu, 14 Dec 2006 06:44:22 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1280323</guid><dc:creator>Joe.Blog  » Making string URL friendly, redux</dc:creator><description>&lt;p&gt;PingBack from &lt;a rel="nofollow" target="_new" href="http://joe.hardy.id.au/blog/2006/12/14/making-string-url-friendly-redux/"&gt;http://joe.hardy.id.au/blog/2006/12/14/making-string-url-friendly-redux/&lt;/a&gt;&lt;/p&gt;
</description></item><item><title>The non-ASCII solution to the .NET Unicode Puzzle</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#1802731</link><pubDate>Sun, 04 Mar 2007 11:11:08 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1802731</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;p&gt;So anyway, I was pointed to Chris Mullins' .NET Unicode Puzzle and was struck by the irony of the use&lt;/p&gt;
</description></item><item><title>Stripping is an interesting job (aka On the meaning of meaningless, aka All Mn characters are non-spacing, but some are more non-spacing than others)</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#2629752</link><pubDate>Mon, 14 May 2007 20:51:38 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:2629752</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;p&gt;(apologies to George Orwell, of course!) Val asks: Michael, I've been reading your &amp;quot;Striping Diacritics&amp;quot;&lt;/p&gt;
</description></item><item><title>Removing diacritics (accents) from strings</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#2869798</link><pubDate>Fri, 25 May 2007 16:04:27 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:2869798</guid><dc:creator>Harmjan Menninga</dc:creator><description>&lt;p&gt;It's often useful to remove diacritic marks (often called accent marks) from characters. You know: tilde&lt;/p&gt;
</description></item><item><title>Removing diacritics (accents) from strings </title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#2982826</link><pubDate>Wed, 30 May 2007 09:59:47 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:2982826</guid><dc:creator>Harmjan Menninga</dc:creator><description>&lt;P&gt;It's often useful to remove diacritic marks (often called accent marks) from characters. You know&lt;/P&gt;</description></item><item><title>How to remove accents from strings in .NET</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#3269448</link><pubDate>Wed, 13 Jun 2007 17:03:21 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:3269448</guid><dc:creator>/egilh</dc:creator><description /></item><item><title>Normalize Wide Shut</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#4427190</link><pubDate>Fri, 17 Aug 2007 10:46:36 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:4427190</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;p&gt;(Apologies to Stanley Kubrick, of course!) It was almost the very first blog post I ever wrote, back&lt;/p&gt;
</description></item><item><title>I am not a nudist, but I do support stripping when it is appropriate, part 1</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#4736730</link><pubDate>Tue, 04 Sep 2007 10:31:05 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:4736730</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;p&gt;The title is correct: I am not a nudist. I think people who are nudists are just fine and I enjoyed the&lt;/p&gt;
</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#5488842</link><pubDate>Wed, 17 Oct 2007 13:47:12 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:5488842</guid><dc:creator>Jean L.N. Hofste'</dc:creator><description>&lt;p&gt;Stripping diacritics is tantamount to MURDER.&lt;/p&gt;
&lt;p&gt;It is based on false economics and laziness. I wish you would try to understand Mr. Pi&amp;#235;l being stripped of his diacritic (in Dutch).&lt;/p&gt;</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#6727430</link><pubDate>Tue, 11 Dec 2007 00:11:59 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6727430</guid><dc:creator>WALDO</dc:creator><description>&lt;p&gt;Your sample code fails to strip a diacritic mark in the second block. The fifth character should convert to a 'd', but remains as is. Is there something I'm missing?&lt;/p&gt;
&lt;p&gt;P.S. - &amp;quot;Stripping diacritics is tantamount to MURDER.&amp;quot;&lt;/p&gt;
&lt;p&gt;Stripping diacritics is necessary when developing a URL structure based on user input. For example &lt;a rel="nofollow" target="_new" href="http://foo.bar/mrPi&amp;#235;l/"&gt;http://foo.bar/mrPi&amp;#235;l/&lt;/a&gt; is an ugly URL, but &lt;a rel="nofollow" target="_new" href="http://foo.bar/mrPiel/"&gt;http://foo.bar/mrPiel/&lt;/a&gt; is much friendlier.&lt;/p&gt;</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#6728334</link><pubDate>Tue, 11 Dec 2007 01:58:04 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6728334</guid><dc:creator>Michael S. Kaplan</dc:creator><description>&lt;p&gt;I have no idea what &amp;quot;second block&amp;quot; you are referring to, Waldo.&lt;/p&gt;
&lt;p&gt;Though you may want to look at a few of the trackbacks? And the updated code as the post mentions?&lt;/p&gt;
</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#6735575</link><pubDate>Tue, 11 Dec 2007 17:47:45 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6735575</guid><dc:creator>WALDO</dc:creator><description>&lt;p&gt;Even the updated code fails to change that character.&lt;/p&gt;</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#6735659</link><pubDate>Tue, 11 Dec 2007 17:58:56 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6735659</guid><dc:creator>Michael S. Kaplan</dc:creator><description>&lt;p&gt;WALDO.... &amp;nbsp;*what* character? *What* second block? Please provide the repro as I still have no idea what you are talking about.&lt;/p&gt;
</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#6797692</link><pubDate>Tue, 18 Dec 2007 19:22:02 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6797692</guid><dc:creator>WALDO</dc:creator><description>&lt;p&gt;c:\temp\samples&amp;gt;remove &amp;#226;&amp;#227;&amp;#228;&amp;#229;&amp;#231;&amp;#232;&amp;#233;&amp;#234;&amp;#235; &amp;#236;&amp;#237;&amp;#238;&amp;#239;&amp;#240;&amp;#241;&amp;#242;&amp;#243; &amp;#244;&amp;#245;&amp;#246;&amp;#249;&amp;#250;&amp;#251;&amp;#252;&amp;#253;&lt;/p&gt;
&lt;p&gt;aaaaceeee&lt;/p&gt;
&lt;p&gt;iiiidnoo &amp;nbsp; &amp;nbsp;&amp;lt;-- second block, fifth character&lt;/p&gt;
&lt;p&gt;ooouuuuy&lt;/p&gt;</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#6797748</link><pubDate>Tue, 18 Dec 2007 19:33:05 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6797748</guid><dc:creator>Michael S. Kaplan</dc:creator><description>&lt;p&gt;Sorry, WALDO -- that is by design. &lt;/p&gt;
&lt;p&gt;LATIN SMALL LETTER ETH does not decompose to LATIN SMALL LETTER D. &lt;/p&gt;
&lt;p&gt;It never has and homegrown &amp;quot;ASCIIFICATIONS&amp;quot; are something you are on your own with (the given solution is based on the Unicode Standard's own published composition/decomposition mappings).&lt;/p&gt;
</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#6798110</link><pubDate>Tue, 18 Dec 2007 20:42:07 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6798110</guid><dc:creator>Michael S. Kaplan</dc:creator><description>&lt;P&gt;FYI -- The initial sample was run against an older version of the framework that had an incorrect mapping -- the problem was fixed and we now conform to Unicode here in .NET....&lt;/P&gt;</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#6798725</link><pubDate>Tue, 18 Dec 2007 22:07:34 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6798725</guid><dc:creator>WALDO</dc:creator><description>&lt;p&gt;If that is by design, then cool. It's just that the posting produced something different than what the framework did. I just wanted you to be aware of that.&lt;/p&gt;
&lt;p&gt;I have no concerns about fixing that particular character. I wasn't even convinced that there was a translation for that character. I just wanted to be sure I was actually getting what I was expecting. My concern was whether I was doing something wrong or whether I should change my expectation to differ from the sample provided. It turns out to be the latter.&lt;/p&gt;</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#8671249</link><pubDate>Mon, 30 Jun 2008 15:15:34 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8671249</guid><dc:creator>ojejej</dc:creator><description>&lt;p&gt;There is a tool, wReplace, which removes diacritics:&lt;/p&gt;
&lt;p&gt;&lt;a rel="nofollow" target="_new" href="http://wwidgets.com/us_wReplace.html"&gt;http://wwidgets.com/us_wReplace.html&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;There is also a replacement table available inside, so you can test your solutions.&lt;/p&gt;</description></item><item><title>Speicherung von UTF-8 kodierter Strings | hilpers</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#9347614</link><pubDate>Tue, 20 Jan 2009 17:56:21 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9347614</guid><dc:creator>Speicherung von UTF-8 kodierter Strings | hilpers</dc:creator><description>&lt;p&gt;PingBack from &lt;a rel="nofollow" target="_new" href="http://www.hilpers.com/271727-speicherung-von-utf-8-kodierter"&gt;http://www.hilpers.com/271727-speicherung-von-utf-8-kodierter&lt;/a&gt;&lt;/p&gt;
</description></item><item><title>re: Stripping diacritics....</title><link>http://blogs.msdn.com/michkap/archive/2005/02/19/376617.aspx#9366573</link><pubDate>Thu, 22 Jan 2009 14:22:24 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9366573</guid><dc:creator>Dilip</dc:creator><description>&lt;p&gt;Thanks. This helped a lot.&lt;/p&gt;
&lt;p&gt;My search would have been much easier if a description like European special character alphabets was included. &lt;/p&gt;</description></item></channel></rss>