<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Regular expressions, Unicode style....</title><link>http://blogs.msdn.com/michkap/archive/2005/04/23/411106.aspx</link><description>A few days ago, Scott Hanselman asked in the Suggestion Box: I'm doing an English/Spanish site with ASP.NET using some client side validation with Regular Expressions. I wanted to write a single Regular Expression for most large text fields: ^[\w\d\s-'.,&amp;amp;#@:?!()$\/]+$</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Regular expressions and Unicode</title><link>http://blogs.msdn.com/michkap/archive/2005/04/23/411106.aspx#411109</link><pubDate>Sat, 23 Apr 2005 13:09:25 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:411109</guid><dc:creator>Erno de Weerd</dc:creator><description /></item><item><title>re: Regular expressions, Unicode style....</title><link>http://blogs.msdn.com/michkap/archive/2005/04/23/411106.aspx#411121</link><pubDate>Sat, 23 Apr 2005 14:35:41 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:411121</guid><dc:creator>Jonathan</dc:creator><description>It's כשר לפסח, not קשר לפסח.&lt;br&gt;כשר = Kosher&lt;br&gt;קשר = knot, connection</description></item><item><title>re: Regular expressions, Unicode style....</title><link>http://blogs.msdn.com/michkap/archive/2005/04/23/411106.aspx#411133</link><pubDate>Sat, 23 Apr 2005 15:55:29 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:411133</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Interesting... looking at food containers I actually see both spellings. 
:-(&lt;br&gt;&lt;br&gt;Not sure what to do with that -- good think it's not Passover yet I suppose....</description></item><item><title>Write better Regular expression using unicode character classes</title><link>http://blogs.msdn.com/michkap/archive/2005/04/23/411106.aspx#411151</link><pubDate>Sat, 23 Apr 2005 18:41:48 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:411151</guid><dc:creator>^(?:[^$]*)$ -- Matches everything, captures nothing</dc:creator><description /></item><item><title>re: Regular expressions, Unicode style....</title><link>http://blogs.msdn.com/michkap/archive/2005/04/23/411106.aspx#411312</link><pubDate>Sun, 24 Apr 2005 05:10:50 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:411312</guid><dc:creator>Svend Tofte</dc:creator><description>I think the linked authors problem is that he is using JScript, probably some 5.6 version, which doesn't have the (.NET) features, you list here.&lt;br&gt;&lt;br&gt;Also, I've never really exercised the regex part of .NET, so I'm no expert there, but my copy of &amp;quot;Mastering Regular Expressions&amp;quot; tells me that by default &amp;quot;\w&amp;quot; matches things, such as :Ll, :Lu, and a few others. &lt;br&gt;&lt;br&gt;I guess they could break backward compatability, when there is no existing code-base, which may (what do I know) have been the problem with the regex part of JScript (or whereever the regex code that JScript uses lives), since suddenly allowing much wider input, may not have flown well with alot of existing websites, relying on \w to mean just [A-Za-z], and no more. &lt;br&gt;&lt;br&gt;And thanks for a very cool blog btw. 
I never dabble much in these waters (internationalization, etc.), but when I do, it's always scary.</description></item><item><title>re: Regular expressions, Unicode style....</title><link>http://blogs.msdn.com/michkap/archive/2005/04/23/411106.aspx#411318</link><pubDate>Sun, 24 Apr 2005 05:36:01 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:411318</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Yep, Scott is indeed looking for a JScript solution. I told him in a comment on his post that there may not be an answer there other than hardcoding the desired characters....&lt;br&gt;&lt;br&gt;Glad you like the blog! :-)</description></item><item><title>re: Regular expressions, Unicode style....</title><link>http://blogs.msdn.com/michkap/archive/2005/04/23/411106.aspx#411814</link><pubDate>Mon, 25 Apr 2005 19:20:42 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:411814</guid><dc:creator>Travis Illig</dc:creator><description>JavaScript supports \u0000-\uFFFF style Unicode character expressions, so you could theoretically expand the character classes into the corresponding \uXXXX-\uYYYY style range(s) on the server side to feed to the client.&lt;br&gt;&lt;br&gt;I've posted a code sample here:&lt;br&gt;&lt;a rel="nofollow" target="_new" href="http://www.paraesthesia.com/blog/comments.php?id=809_0_1_0_C"&gt;http://www.paraesthesia.com/blog/comments.php?id=809_0_1_0_C&lt;/a&gt;&lt;br&gt;&lt;br&gt;Obviously something you'd want to cache rather than calculate each time, but it's something to think about.</description></item></channel></rss>