Fabulous Adventures In Coding
Eric Lippert is a principal developer on the C# compiler team.
I'm working on a prolix essay on the future of declarative programming languages, but I've been sidelined by, horrors, actually having to do lots of real work on the next version of VSTO these last couple weeks. Expect that rant sometime in the next week or so.
In the meantime, a coworker asked me the other day what the //U "use Unicode" switch does in the console version of Windows Script Host. "Is this so that the host can read in script files that are saved in Unicode?" he asked.
Not quite -- actually it is the output that we are worried about, not the input. The script engine (VBScript, JScript) expects the source code to be in a standard UTF-16 BSTR. WSH is responsible for getting the source code off the disk in whatever format and turning it into UTF-16 in memory. WSH does this exactly the way you'd expect -- it calls IsTextUnicode and then does whatever is necessary to get it into UTF-16. (Reverse bytes, strip the order mark, call MultiByteToWideChar, whatever.) The //U flag doesn't affect this at all.
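To make the input side concrete, here's a rough sketch in Python of the kind of sniff-and-convert step described above. It is not WSH's code: it only looks at the byte-order mark, whereas IsTextUnicode applies a battery of statistical tests, and the function name `decode_script_source` and the `cp1252` fallback code page are assumptions made up for illustration.

```python
import codecs


def decode_script_source(raw: bytes, ansi_codepage: str = "cp1252") -> str:
    """Turn raw script-file bytes into an in-memory string.

    Simplified stand-in for "call IsTextUnicode, then do whatever is
    necessary": only the byte-order mark is inspected here.
    The ansi_codepage default is an assumption, not something WSH documents.
    """
    if raw.startswith(codecs.BOM_UTF16_LE):
        # Little-endian UTF-16: strip the order mark and decode as-is.
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        # Big-endian UTF-16: the "reverse bytes" case; the decoder swaps for us.
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No order mark: treat as multi-byte/ANSI text (the MultiByteToWideChar case).
    return raw.decode(ansi_codepage)
```

Whatever is on disk, the script engine ends up seeing the same string, which is why no command-line switch is needed on the input side.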
What we don't know is what output encoding the user wants. In short, the //U flag only causes a change in behaviour when you are running cscript.exe on an NT-based operating system and redirecting the output to a file. To expand on that a bit:
The behaviour of the standard streams is kind of complicated: it depends on whether you're opening WScript.StdErr, StdIn or StdOut, whether you're on an NT-based or 95-based OS, and whether the streams are redirected. Suffice to say that basically it's the same as above.
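Why does the host have to be told what to do on output? Because the same text turns into quite different bytes depending on the encoding, and a redirected file gives the host no hint about which one you want. Here's a small Python sketch of the difference. The helper `encode_for_redirect` is hypothetical, and so is the `cp1252` code page used for the non-Unicode case. This post doesn't say which code page the real host uses or whether it writes a byte-order mark, so treat this purely as an illustration.

```python
def encode_for_redirect(text: str, use_unicode: bool, codepage: str = "cp1252") -> bytes:
    """Hypothetical helper: the bytes a host might write to a redirected file.

    use_unicode=True stands in for running with //U (UTF-16 output);
    use_unicode=False stands in for the default, assumed here to be a
    single-byte code page. Neither choice is taken from WSH's source.
    """
    if use_unicode:
        # Two bytes per character, nothing is lost.
        return text.encode("utf-16-le")
    # Characters the code page can't represent are replaced with "?".
    return text.encode(codepage, errors="replace")


if __name__ == "__main__":
    for sample in ("Hi", "\u03a9"):
        print(repr(sample), encode_for_redirect(sample, True), encode_for_redirect(sample, False))
```

If a script echoes characters outside the code page and you redirect the output to a file, the single-byte path throws them away, which is the sort of situation the switch exists for.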