<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx</link><description>The character in question is U+005c , the REVERSE SOLIDUS, also known as the backslash or '\'. It is the path separator for Windows, which is encoded at 0x5c across all of the ANSI code pages. Since path separators are a pretty important requirement,</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#470086</link><pubDate>Sat, 17 Sep 2005 13:07:32 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:470086</guid><dc:creator>Chris Lundie</dc:creator><description>Interesting and confusing! I fired up Word, changed the font to MS Gothic, typed a backslash and indeed it displays a Yen symbol. Changing back to Times, it looks like a backslash again.</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#470255</link><pubDate>Sat, 17 Sep 2005 16:59:47 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:470255</guid><dc:creator>Michael S. 
Kaplan</dc:creator><description>Hi Chris -- well, if nothing else, it helps give confidence in the idea that the actual code point value does not change, even if the representative glyph does....</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#470392</link><pubDate>Sat, 17 Sep 2005 18:50:47 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:470392</guid><dc:creator>Ben Bryant</dc:creator><description>So when you convert a string like &amp;quot;Yen 100&amp;quot; (where Yen is the symbol) from Shift-JIS to Unicode, the Yen becomes a backslash, and the fact that it was a Yen is unknown. But most users are not the wiser because their font shows the backslash as a Yen. This is bad! I suppose most Japanese Unicode databases must have some major backslash Yen confusion where you might have to implement a policy across unrepaired data like if you know the string is a file path treat U+005c as a backslash, otherwise as a Yen U+00a5. And Unicode string functions that search for the Yen probably are sometimes implemented due to user demand to cheat and look for the backslash too. Crazy stuff.&lt;br&gt;Btw, the Korean ks_c_5601-1987 has an encoding for the Yen, but Japanese Shift-JIS doesn't have one for the Won.&lt;br&gt;</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#470419</link><pubDate>Sat, 17 Sep 2005 19:06:22 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:470419</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Hi Ben -- Although I admit the situation is not ideal, I think it is pretty obvious that the situation would be a *lot* worse if there were no path separators since that would pretty much destroy a lot more.
&lt;br&gt;&lt;br&gt;And it was not really up to anyone but JIS to decide what should be done in the Japanese encoding.&lt;br&gt;&lt;br&gt;Though if you are using Unicode then you can use the actual Yen character and call it a day! :-)</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#470461</link><pubDate>Sat, 17 Sep 2005 19:39:29 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:470461</guid><dc:creator>Ben Bryant</dc:creator><description>Thanks for the quick reply. But you almost answer as if I was blaming you or anyone. No, I agree the right choice was made in the JIS to Unicode conversion; I am just chiming in on the fact that it is a problematic situation. I do take comfort in the fact that the code point 5c is not changed. Anyway, encoding systems are full of these ideosyncracies, but the strange thing about this one is that it involves fonts. So many layers of complexity: a character set can be represented in different encodings and displayed differently by different fonts, what fun!</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#470492</link><pubDate>Sat, 17 Sep 2005 19:57:49 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:470492</guid><dc:creator>Michael S. Kaplan</dc:creator><description>No worries, Ben -- I was not taking it as blame or anything. :-)</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#472326</link><pubDate>Wed, 21 Sep 2005 17:54:50 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:472326</guid><dc:creator>Claus Brod</dc:creator><description>Thanks for the interesting post. 
I've come across another slightly surprising conversion - the tilde character (0x7e) is mapped 1:1 when converting from CP932 to, say, UTF16 (MultiByteToWideChar). However, when I use libiconv to convert a tilde character, and tell it that the source encoding is &amp;quot;SJIS&amp;quot;, it will map the tilde to U+203E. I found a few slightly mysterious references to this behavior, but nothing that would really explain the reasoning behind this mapping...&lt;br&gt;&lt;br&gt;Claus&lt;br&gt;</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#481057</link><pubDate>Fri, 14 Oct 2005 17:29:05 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:481057</guid><dc:creator>Steve Loughran</dc:creator><description>This is fascinating. Of course, the whole fact that DOS-derived platforms use \ as a dir separator is itself a bit of a mess - I've always assumed it was because DOS 1.x used / as an argument prefix, so when directories came along in 2.0, they had to use a different char. This is just another unintended consequence of the first, well, error. </description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#481065</link><pubDate>Fri, 14 Oct 2005 17:40:38 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:481065</guid><dc:creator>Rune</dc:creator><description>IIRC, the backslash maps to &amp;quot;&amp;#216;&amp;quot; using a Norwegian codepage with 7-bit ASCII. Later (8-bit codepage 865) the &amp;#216; moved to the same spot normally inhabited by the Yen sign (American cp 437).&lt;br&gt;&lt;br&gt;So a backslash is probably a lot of things around the world, historically speaking.
&lt;br&gt;&lt;br&gt;-- &lt;br&gt;Rune</description></item><item><title>Paths</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#481134</link><pubDate>Fri, 14 Oct 2005 20:14:32 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:481134</guid><dc:creator>Craig Ringer</dc:creator><description>Quite frankly, these days I begin to think that having common characters as path and argument separators is anachronistic and pretty nasty. Any possible choice from the standard character set will be largely arbitrary and will impair &amp;quot;legitimate&amp;quot; use of those characters. It's necessary due to all the legacy code out there and it'd be an unpleasant thing to try to change to dedicated symbols, but that doesn't make it any nicer.&lt;br&gt;&lt;br&gt;Nonetheless, I still find it annoying that  \ (or / on UNIX, or : on Mac OS &amp;lt; 10) are &amp;quot;special&amp;quot; to the system and can't be used in filenames. Similarly the need to quote &amp;quot;multi word arguments&amp;quot; seems silly these days. Dedicated delimeters just for those two purposes, and nothing else, would seem a much nicer way to do it if we got the option to do it all again.&lt;br&gt;&lt;br&gt;Alas, it's unlikely to ever happen. We'll still want to use 7-bit ASCII serial terminals, still want to use ancient systems that don't understand utf-8 or UCS-2, and so on. Anyway, I swear every time I have to use \these\silly\paths , just as I'm sure many folks here find /these/paths/very/annoying ; getting used to someːotherːpathːseparator (obviously not actually ː , I just use that as an example) would be pretty irritating. 
Especially having to use some sort of compose sequence or shortcut to type it on &amp;quot;legacy&amp;quot; keyboards...</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#481161</link><pubDate>Fri, 14 Oct 2005 21:02:52 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:481161</guid><dc:creator>andypennell</dc:creator><description>Why was \ the dir separator in DOS? A friend of a friend once bought a big wooden desk from a sale on Microsoft campus in the early 90s. The desk had not been cleared out: inside it was a piece of paper containing discussion notes on which separator should be used. IBM was mentioned, but I cannot recall the rest of the details.</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#481184</link><pubDate>Fri, 14 Oct 2005 21:47:03 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:481184</guid><dc:creator>mikeb</dc:creator><description>As far as the '\' character being used as the path separator in DOS (starting with DOS 2.0, since 1.x did not support sub directories), I would have to agree with Steve Loughran that the fact that DOS commands used '/' as a command line option 'switch character' is probably the number 1 reason.&lt;br&gt;&lt;br&gt;Remember that internally and at the API level DOS supports using either '/' or '\' as a path separator (I'm not sure if this agnosticism goes all the way back to DOS 2.0) - it's applications that don't like '/'.&lt;br&gt;&lt;br&gt;Also remember that early versions of DOS supported setting the 'switch character' to something other than '/'.  Unfortunately each DOS application is responsible for parsing its command line, and virtually no 3rd party applications supported the switch character setting (Microsoft applications may have been guilty of this, too).  
&lt;br&gt;&lt;br&gt;At some point MS removed the set switch char API - a *very* rare thing for Microsoft to do (just ask Raymond Chen).  I mean, the old CP/M compatibility DOS calls are still supported even in the WinXP VDM.&lt;br&gt;</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#481275</link><pubDate>Sat, 15 Oct 2005 02:06:23 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:481275</guid><dc:creator>Dewi Morgan</dc:creator><description>Mikeb points out that the problem with path parsing is that it depends on the applications to support it.&lt;br&gt;&lt;br&gt;It seems strange that people aren't offered a choice of visual style nowadays, though, since 4c is the character that applications expect.&lt;br&gt;&lt;br&gt;Just need a little checkbox in windows versions affected by the 0x4c issue:&lt;br&gt;&lt;br&gt;[ ] Display path separator as '\'&lt;br&gt;&lt;br&gt;So that wherever 4c was displayed, a '\' would be shown instead of a currency symbol.</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#481283</link><pubDate>Sat, 15 Oct 2005 02:37:21 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:481283</guid><dc:creator>Michael S. Kaplan</dc:creator><description>I think you meant 0x5c/U+005c, right? 
:-)&lt;br&gt;&lt;br&gt;You do have a choice -- this only happens with CJK fonts....</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#481321</link><pubDate>Sat, 15 Oct 2005 06:07:53 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:481321</guid><dc:creator>foxyshadis</dc:creator><description>I suppose you could use a tool (eg, fontographer or fontforge) to replace 0x5c in your favorite fonts (tahoma, ms sans serif, maybe times, arial, and verdana) with your favorite path separator. Oh, and paint over your keyboard's \ key. That would be an amusing extension of 'skinning' the OS. =p</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#481392</link><pubDate>Sat, 15 Oct 2005 15:19:11 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:481392</guid><dc:creator>koji</dc:creator><description>IIRC, if you go back a little further, ASCII defines 0x5c as one of the &amp;quot;localizable&amp;quot; code point, and that is why several countries have several different glyphs here.&lt;br&gt;&lt;br&gt;DOS 2 made a mistake by choosing such a localizable code point as the path separator. Well, I don't think anyone can blame on it though.&lt;br&gt;&lt;br&gt;Whether we should fix this glyph or not is as good open question as whether we should fix the path separator to &amp;quot;/&amp;quot;, which is not a localizable code point in ASCII.
&lt;br&gt;&lt;br&gt;And, although both are good questions, I don't think anyone could fix either.</description></item><item><title>On the fuzzier definition of a 'Unicode application' on Win9x....</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#553691</link><pubDate>Fri, 17 Mar 2006 16:12:31 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:553691</guid><dc:creator>Sorting It All Out</dc:creator><description>Riffing on Raymond here, and his post On the fuzzy definition of a &amp;amp;quot;Unicode application&amp;amp;quot;....&lt;br&gt;The points...</description></item><item><title>Two chickens in every pot, and an ASCII in every code page</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#608236</link><pubDate>Fri, 26 May 2006 22:04:10 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:608236</guid><dc:creator>Sorting It All Out</dc:creator><description>Francisco Moraes asked in the Suggestion Box: &lt;br&gt;&lt;br&gt;Are there any code pages (exception EBCDIC) where the...</description></item><item><title>A yen for Yen may be left unsatiated</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#1972985</link><pubDate>Wed, 28 Mar 2007 10:15:35 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1972985</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;p&gt;(No, this post is not about anything kinky) If you are a regular reader then you know that I have been&lt;/p&gt;
</description></item><item><title>Trying to get people to use Unicode? Lock and load, baby!</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#5512712</link><pubDate>Thu, 18 Oct 2007 22:47:07 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:5512712</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;P&gt;One of the more fascinating conversations I had with a customer at the recent Internationalization and&lt;/P&gt;</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#6119620</link><pubDate>Mon, 12 Nov 2007 02:02:04 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6119620</guid><dc:creator>not given</dc:creator><description>&lt;p&gt;Was a work around ever found for this issue? Is there a way to keep Japanese language support in Windows and have the backslash ( \ ) display correctly in address fields instead of the yen symbol?&lt;/p&gt;
&lt;p&gt;It works correctly when typing in forums, but not in the address fields? Why is this?&lt;/p&gt;</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#6119801</link><pubDate>Mon, 12 Nov 2007 02:09:52 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6119801</guid><dc:creator>Michael S. Kaplan</dc:creator><description>&lt;p&gt;It is all about the font selected and sometimes the technology doing the rendering -- and for every person who considers one behavior to be a bug, there is another who thinks the other is a bug.&lt;/p&gt;
&lt;p&gt;Which essentially makes it unfixable, at least for everyone....&lt;/p&gt;
</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#6997509</link><pubDate>Sun, 06 Jan 2008 02:26:11 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6997509</guid><dc:creator>Mike</dc:creator><description>&lt;p&gt;I got the same thing, after playing Clannad the primary fonts seem to be MS Gothic lol, yet it still types as \ here.&lt;/p&gt;</description></item><item><title>http://en.wikipedia.org/wiki/backslash</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#7695108</link><pubDate>Thu, 14 Feb 2008 18:42:25 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:7695108</guid><dc:creator>TrackBack</dc:creator><description /></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#9372028</link><pubDate>Fri, 23 Jan 2009 06:30:29 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9372028</guid><dc:creator>spitzak</dc:creator><description>&lt;p&gt;The reason 0x5c still prints a Yen is not because of Windows paths, but because of Windows *text* files. There must be vast numbers of Japanese Windows text files where 0x5c is used *as* a Yen symbol. You can't make them all suddenly display backslash. And there is not likely any intelligent way to figure out whether a Yen or backslash is intended and translate the documents.&lt;/p&gt;</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#9563058</link><pubDate>Wed, 22 Apr 2009 23:04:29 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9563058</guid><dc:creator>Anonymous</dc:creator><description>&lt;p&gt;I recently set up Japanese language support on my Vista machine - it was a surprise to see cmd.exe rendering paths wrong. 
&amp;nbsp;My fix? &amp;nbsp;Use xterm from the cygwin toolkit as your terminal. &amp;nbsp;It looks better than the Windows terminal as well.&lt;/p&gt;
&lt;p&gt;Also, &amp;gt;&amp;gt;spitzak :&lt;/p&gt;
&lt;p&gt;It would be an easy enough fix, actually. &amp;nbsp;Just add another backwards compatibility option &amp;quot;Render \ as &amp;#165; (Use for pre-[year] text documents)&amp;quot; to the menu available when alt-clicking a file. &amp;nbsp;That way, people who needed the old style would be able to use it and the general population would be able to see actual '\' characters where they belong.&lt;/p&gt;</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#9718597</link><pubDate>Wed, 10 Jun 2009 01:17:53 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9718597</guid><dc:creator>Russ</dc:creator><description>&lt;p&gt;Since '\'s are contained in text files where the codepage isn't defined one way or the other, the only sane thing to do going forward is to define it as a '\' and not a Won or Yen. The currencies differ by a factor of about 10. Since we now regularly send files around this thing called the &amp;quot;Internet&amp;quot;, it should not be guessed that it is a Won or Yen. Stop the insanity from going any further? Why wasn't this changed when Unicode support was added to Windows?&lt;/p&gt;
&lt;p&gt;If the '\' is contained in a document where the codepage is defined, then sure, make it the appropriate Yen or Won symbol and if converted to Unicode, save it as the appropriate Unicode symbol.&lt;/p&gt;</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#9776348</link><pubDate>Thu, 18 Jun 2009 18:14:39 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9776348</guid><dc:creator>Pat</dc:creator><description>&lt;p&gt;The last post is totally relevant. &amp;nbsp;This really should be fixed. &amp;nbsp;For people in IT who need to make screenshots of file paths and related documentation this is a real pain.&lt;/p&gt;</description></item><item><title>re: When is a backslash not a backslash?</title><link>http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx#9776811</link><pubDate>Thu, 18 Jun 2009 20:33:12 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9776811</guid><dc:creator>Michael S. Kaplan</dc:creator><description>&lt;p&gt;People in IT who need to make screenshots need to be more respectful of decade-long differences in their target markets. And if they want to support customers in other countries they need not insist that the other countries change to make the IT life easier....&lt;/p&gt;
&lt;p&gt;Also, people in IT making documentation for Japan and Korea should have the documentation in Japanese and Korean anyway -- if not then their yen/won/solidus issues are *not* why their docs are not appreciated!&lt;/p&gt;
</description></item></channel></rss>