Fabulous Adventures In Coding

Eric Lippert's Blog

Binary Files and the File System Object Do Not Mix

OK, back to scripting today.

But before I get back to scripting issues, one brief correction. An attentive reader noted that "The Well-Tempered Clavier" was in fact designed to sound good on a "well tempered" instrument, not an "equally tempered" instrument. The difference is that a "well" temperament is designed so that every key sounds good, but is allowed to have some badly-out-of-tune intervals that must be avoided. (Traditionally these are called "wolf intervals".)

There was considerable controversy when equal temperament was introduced in Europe. I suppose it was the "what is the One True Bracing Style?" ridiculous issue of the day.

Another commenter pointed out that you could translate my wav-writing program into VBScript by using the File System Object to write out the bytes. To simplify their code down to a program that writes out individual bytes:

' DO NOT DO THIS
Set FSO=CreateObject("Scripting.FileSystemObject")
Set File=FSO.CreateTextFile("c:\test.bin", True)
For i = 0 to 255
  File.Write Chr(i)
Next
File.Close

And sure enough, this writes out a binary file consisting of those bytes.

Please don't do that. See that line that says "CreateTextFile"? We wrote that method to create a text file, not a binary file. Though this code might appear to work, it actually does not. Text files are more than just binary files that can be interpreted as text. Text files have to conform to certain rules to ensure that they can be sensibly interpreted as text in the local code page. If that's not 100% clear to you, read Joel's article on the subject before we go on.

Let me give you an example that clearly fails. What does this program do?

Set FSO=CreateObject("Scripting.FileSystemObject")
Set File=FSO.CreateTextFile("c:\test.bin", True)
For i = 0 to 255
  File.Write Chr(&hE0)
Next
File.Close

If you said "it writes out a binary file consisting of 256 E0 bytes," bzzt! Sorry, try again. The correct answer is "it writes out a binary file consisting of 256 E0 bytes on any operating system where the user's default ANSI code page does not define E0 as a lead byte in a DBCS encoding, like, say, Japanese, in which case it writes out 256 zeros."

In the Japanese code page, just-plain-chr(E0) is not even a legal character, so Chr will turn it into a zero. 
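
To make the lead-byte point a little more concrete, here is a rough sketch in JScript-compatible JavaScript. It is not how Chr or the FSO are implemented. It only encodes the lead-byte ranges documented for Windows code page 932 (Japanese), and the helper names isLeadByte932 and countLeadBytes are made up for illustration.

```javascript
// Lead-byte ranges documented for Windows code page 932 (Japanese).
// isLeadByte932 and countLeadBytes are hypothetical helper names.
function isLeadByte932(b) {
  return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Count how many of the 256 possible byte values are lead bytes, that is,
// values that only begin a two-byte character and cannot stand alone.
function countLeadBytes() {
  var count = 0;
  for (var b = 0; b <= 0xFF; b++) {
    if (isLeadByte932(b)) {
      count++;
    }
  }
  return count;
}

var e0IsLead = isLeadByte932(0xE0);    // true: 0xE0 only starts a two-byte character
var asciiIsLead = isLeadByte932(0x41); // false: "A" is a complete character by itself
var unsafeValues = countLeadBytes();   // 60 of the 256 byte values are lead bytes
```

Under those assumptions, the 0-to-255 loop earlier in this post has the same problem as the E0 loop: a sizable fraction of the values it passes to Chr are not characters on a Japanese system.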

If I were whipping up a little one-off program on my own to write out a binary file -- well, I'd personally do it in C, but I can see how some people might want to do it in script. But there's a big difference between writing a one-off program that you're going to delete in five minutes, and writing a general-purpose utility program that you expect people around the world will use. That's an entirely different standard of robustness and portability. Do not use the FSO to read or write binary files; you're just asking for a world of hurt as soon as someone in DBCS-land runs your code.

I have been asked many times over the years if I know of a scriptable object that can read-write true binary files in all locales. I do not. Anyone have any suggestions? I would have thought given the number of people that have asked me, that some third party would have come up with something decent by now.

Published Wednesday, April 20, 2005 12:47 PM by Eric Lippert


Comments

 

Alex Papadimoulis said:

Eric, I've had a lot of luck using the ADO Stream object.
April 20, 2005 1:43 PM
 

Curtis Hulett said:

I have used this in the past, but I don't know if it works in all locales. I have never had to deal with that.

Function SaveBinaryData(FileName, ByteArray)
Const adTypeBinary = 1
Const adSaveCreateOverWrite = 2

'Create Stream object
Dim BinaryStream
Set BinaryStream = CreateObject("ADODB.Stream")

'Specify stream type - we want To save binary data.
BinaryStream.Type = adTypeBinary

'Open the stream And write binary data To the object
BinaryStream.Open
BinaryStream.Write ByteArray

'Save binary data To disk
BinaryStream.SaveToFile FileName, adSaveCreateOverWrite
End Function
April 20, 2005 2:51 PM
 

Dave said:

Google says:
http://www.google.com/search?q=vbscript+binary+file

And those paths generally lead to the Adodb.Stream solution.
April 20, 2005 4:18 PM
 

Eric Lippert said:

I've heard that, and I've also heard from people that it doesn't work well from script, so I don't know who to believe. How do you create the binary array? VBScript only supports creation of arrays of variants.
April 20, 2005 4:56 PM
 

hir said:

I found that text mode works OK like so.
Only for writing, though.
-----------------------------------------

//JScript version

var str = WScript.CreateObject("ADODB.Stream");
str.type = 2; //adTypeText
str.charset = "iso-8859-1";
str.open();

for(var i = 0; i < 0x100; i++){
str.writeText(String.fromCharCode(i));
}
str.saveToFile("c:\\temp\\bin.bin", 2);
str.close();
str = null;

'VBScript version

dim str
set str = WScript.CreateObject("adodb.stream")
str.type = 2
str.charset = "iso-8859-1"
str.open

for i = 0 to &hff
str.writeText(ChrW(i)) 'uses ChrW
next
str.saveToFile "c:\temp\bin.bin", 2
str.close

-----------------------------------------

There still is a problem when you try to read some of the byte values 0x80 - 0x9f: when you read them they turn into completely different values. I guess this also relates to encoding.

I heard you could acquire an array of bytes like this (haven't tried myself):

Set DM = CreateObject("Microsoft.XMLDOM")
Set EL = DM.createElement("tmp")
EL.DataType = "bin.hex"
EL.Text = [some text in hex format]
bin = EL.NodeTypedValue
April 20, 2005 7:49 PM
 

Mike Trinder said:

Further to this, is there a reason why binary read/write was left out of the File System Object? I would have thought, given the number of people that have asked you, that Microsoft would have come up with something decent by now :)

Surely it's just a simpler version of the FSO.OpenTextFile code?
April 21, 2005 9:34 AM
 

Eric Lippert said:

We certainly considered it. However, there are two main factors. First, and most important, we decided that the Script Team wanted to be in the business of building the script engines themselves, not the objects that those engines would script. We looked around the company and realized that other teams were working on object models for administration (WMI), email (CDONTS), database access (ADO), web servers (IIS), etc. Our tiny team could never do as good a job as those fully staffed and dedicated teams, and to try would have taken away time from stuff that _wasn't_ a massive duplication of effort. So we finished off the FSO and called it done. (This also explains why we did not add any features to the WScript.Network object, etc, when we inherited the WSH codebase.)

Second, adding binary file reading/writing is not as straightforward as you might think. Exposing a straightforward array of bytes on disk is only the very first step. To do it right and make it usable, we'd want to provide things like default serialization of all simple data types -- strings, ints, doubles, singles, currencies, etc. But once you bite that off -- big endian or little endian? Length prefixed? How do you handle seeking? What if the user is reading a file that has a DBCS string embedded in it and wants to translate it into a Unicode string?

You have to think about the real-world problems that people are going to have to solve with this tool, and there are a LOT of different scenarios for binary files. We didn't want to bite that off. It didn't seem like a very "scripty" scenario.
April 21, 2005 10:23 AM
 

Chris said:

You can use SoftArtisans' FileManager. It's like FSO, but can handle binary files.

http://fileup.softartisans.com/fileup-120.aspx
April 21, 2005 10:32 PM
 

Frederik Slijkerman said:

Even though writing binary files is not a 'scripty' scenario, it is something that people will want to do now and then. I don't see any reason why you couldn't have added simple binary read/write functions so you don't have to muck around with ADO stream objects or CreateTextFile.
April 22, 2005 2:06 AM
 

Marcus Tucker said:

I've been reading & writing binary data with the "ADODB.Stream" object for years without any problems, but then again I haven't been using anything other than the UK codepage. But since it's got native binary handling, surely in this particular case binary is binary is binary?!
April 22, 2005 6:48 AM
 

ptorr said:

ADODB.Recordset is quite popular in... certain communities. Unfortunately it requires that you have created an ADODB.Stream object, and I don't know how you go about populating that with arbitrary content.
April 24, 2005 8:40 PM
 

zwetan said:

JSDB based on spidermonkey
can read/write binary files

see www.jsdb.org

and much more than that: database connection, socket server, E4X etc..

when I feel WSH is limited by something I automatically move to JSDB, both running ECMAScript code, portability made easy :).
April 26, 2005 1:44 PM
 

Randall K. said:

I use Perl right now to read files and don't have any problems at all with binary. It doesn't require any special methods or variable types or even library includes. It's native to the language itself. It's unfortunate that Perl (an ancient scripting language by all comparisons) has always allowed working with binary files, even across different platforms, and yet VBScript and JScript developers are left without access to such basic routines for working with binary file data, even through ActiveX controls (because there apparently aren't any).

Keep in mind that ADODB.Stream is not a solution because it's disabled on most Windows machines now due to the security vulnerabilities it exposed through Internet Explorer. You know, it's always nice when a workaround is suggested and then it's not even really available, which I guess defeats the purpose.

--Randall
May 12, 2005 12:21 AM
 

berniem said:

"However, there are two main factors. First, and most important, we decided that the Script Team wanted to be in the business of building the script engines themselves, not the objects that those engines would script."

Ok, that's reasonable. Add a few functions to the FSO to support binary byte reads and writes and y'all are done. Simple, eh? :-)

"Second, adding binary file reading/writing is not as straightforward as you might think. Exposing a straightforward array of bytes on disk is only the very first step. To do it right and make it usable,... "

Well, that's ONE approach. Another, quite simple approach, is to NOT be the end-all and just support reading and writing a series of bytes. If someone needs to make it more "usable", they can do it themselves - that's the cost of dealing with BLOB data. And that's exactly why y'all won't know whether it's big-endian, Unicode, or dollars. Just let me get at the bytes and I'll do whatever is necessary to interpret/manipulate the data. :-)

As it is now, I seem to be left with a choice of: a) moving to another language, or b) limiting functionality. Neither is a good solution.

Thanks.
February 4, 2006 3:44 AM
 

John said:

Why is this such a big deal, MS?? UN*X systems have been doing this from day 1!! I guess it all stems from MS (or CP/M) making the decision years ago that text files NEED CR/LF pairs to be called text files, rather than the UN*X philosophy that files are a stream of bytes and it's up to the end-user (or script developer) to decide how to interpret the bytes (or words or quadwords ...).
March 5, 2006 4:09 PM
 

Seb said:

How can vb create a byte array? as.

bytes = ChrB(1) & ChrB(1)

..is still a string.
March 23, 2006 11:06 AM
 

Eric Lippert said:

In Visual Basic you create a byte array by, well, creating a byte array.

dim b() as Byte

In VBScript there is no way to create a byte array.  VBScript only supports arrays of variants.

In VBScript if you create a string that contains binary data, and then pass that to an ActiveX object which expects a byte array, the default implementation of IDispatch::Invoke provided by the operating system will turn the binary string into a byte array for you.  So maybe that sneaky trick will work for you.  But my advice would be that if you need to create a byte array, the best thing to do would be to use a language which has byte arrays -- VB, C#, C++, etc.
March 23, 2006 1:55 PM
 

Igor said:

Eric,
you mentioned that creating a string that contains binary data would satisfy a COM method that expects a byte array. That is exactly what I am seeking. Could you please give an example of code that would create such string from a conventional VBScript string, say, "Hello World"? I am not "native" to ASP/VB, so maybe it's common knowledge for those who are in that world; sorry about that.
Thanks!
March 24, 2006 10:59 PM
 

Igor said:

I found out a way to create an array of bytes in VBScript using ADODB.Stream object mentioned above, which resolved the problem I had. Thanks!
March 27, 2006 3:30 PM
 

ATLANTES said:

EricLippert said:
How do you create the binary array?

One way is to create a Text ADODB.Stream and copy it into a Binary ADODB.Stream.
Consider the following snippet I recently wrote to update a database connection UDL file:

  Option Explicit

  Dim sServer, sDatabase, sUsername, sPassword
  sServer   = "servername"
  sDatabase = "database"
  sUsername = "username"
  sPassword = "password"

  Dim UDL
  UDL = ReadBinaryFile("Old.udl")
  UDL = SetValue(UDL, "Data Source", sServer)
  UDL = SetValue(UDL, "Initial Catalog", sDatabase)
  UDL = SetValue(UDL, "User ID", sUsername)
  UDL = SetValue(UDL, "Password", sPassword)
  UDL = SaveBinaryData("New.udl", UDL)

' =================================================================
' Function to read text from a binary (unicode) file
' =================================================================

  Function ReadBinaryFile(FileName)
     Const adTypeBinary = 1

     Dim BinaryStream
     Set BinaryStream = CreateObject("ADODB.Stream")
     BinaryStream.Type = adTypeBinary
     BinaryStream.Open
     BinaryStream.LoadFromFile FileName
     ReadBinaryFile = BinaryStream.Read
     BinaryStream.Close
     Set BinaryStream = Nothing
  End Function

' =================================================================
' Function to write a modified string back to a unicode file
' =================================================================

  Function SaveBinaryData(FileName, Text)
     Const adTypeBinary = 1
     Const adTypeText = 2
     Const adSaveCreateOverWrite = 2

     Dim BinaryStream
     Set BinaryStream = CreateObject("ADODB.Stream")
     BinaryStream.Type = adTypeBinary
     BinaryStream.Open
     With CreateObject("ADODB.Stream")
        .Type = adTypeText
        .Open: .WriteText Text
        .Position = 2
        .CopyTo BinaryStream, Len(Text) * 2
        .Close
     End With
     BinaryStream.SaveToFile FileName, adSaveCreateOverWrite
     BinaryStream.Close
     Set BinaryStream = Nothing
  End Function

' =================================================================
' Function replace semicolon delimited values in a unicode string
' =================================================================

  Function SetValue(Data, Key, Value)
     Dim Text, Prefix, Suffix, i
     If Len(Value) = 0 Then
        SetValue = Data
     Else
        'Drop leading character
        Text = Mid(Data, 2)
        i = InStr(Text, Key)
        Prefix = Left(Text, i + Len(Key))
        Suffix = Mid(Text, i + Len(Key))
        i = InStr(Suffix, ";")
        if i = 0 Then
           Suffix = ""
        Else
           Suffix = Mid(Suffix, i)
        End If
        'Restore leading character and concatenate new value
        SetValue = Left(Data, 1) + Prefix + Value + Suffix
     End If
  End Function
July 11, 2006 7:11 PM
 

LA.NET [EN] said:

I've been playing with SideBar gadgets for some time now. Besides some quirks (ok, they're really bugs

March 14, 2007 12:24 PM
 

Brahim Raddahi said:

Here's how to make a byte array, I extracted this from the example given by Hir

Function VariantArrayToByteArray(arr)
  Dim DM, EL, bin
  Set DM = CreateObject("Microsoft.XMLDOM")
  Set EL = DM.createElement("tmp")
  EL.DataType = "bin.hex"
  EL.Text = ArrayToHexString(arr)
  bin = EL.NodeTypedValue
  VariantArrayToByteArray = bin
End Function

April 24, 2007 9:31 AM
 

Brahim Raddahi said:

You will also need this function:

Function ArrayToHexString(arr)
  Dim I, B
  ReDim B(UBound(arr))
  For I = 0 To UBound(arr)
    B(I) = Right("0" & Hex(arr(I)), 2)
  Next
  ArrayToHexString = Join(B, "")
End Function

April 24, 2007 9:33 AM
 

P.G. said:

I have ie6 on my machine. I tried using adodb.stream from vbs for manipulating binary files. I understand that adodb.stream is not getting recognized and I am unable to save or read binary files through this. any ideas??

May 17, 2007 3:09 AM
 

Blue Streak said:

Try using an HTA or VBS file.

IE6 blocks dynamic content (i.e. scripts) in HTM and HTML files.

May 24, 2007 12:56 PM
 

Ben said:

Surely it is better to call it MBCS-land... and anyway, aren't we all in an MBCS land these days? Everything except the most basic of basic text files (that do not include any currency symbols other than the dollar sign ;-))

October 16, 2007 1:09 PM
 

Ian Freeman said:

I came across newObjects' AXPack1 when I found out that Windows CE doesn't have FSO. It seems to have great support for Binary files. It's free too.

Check out this VBS sample:

Dim fso, file, BD, BoM
Set fso = CreateObject("newObjects.utilctls.SFMain")
Set file = fso.OpenFile(filename)
Set BD = CreateObject("newObjects.utilctls.SFBinaryData")
BD.Value = file.ReadBin(2)   'read first 2 bytes
BoM = BD.Data(0,1)   'convert first 2 bytes to byte array
file.Close()

BoM now contains the byte order mark of filename in a VT_UI1 | VT_ARRAY byte array.

February 11, 2008 11:17 PM
 

William Shakespeare said:

If ADODB.Stream is a solution.

I don't know what "much ado about nothing" means.


April 28, 2008 5:02 AM
 

12demons said:

Hmm. I just tested that FileSystemObject can be used to READ and WRITE binary files, only you have to CHEAT a little. Instead of using ReadAll(), call Read() repeatedly up to the file Size. A little snippet in JScript:

-- Code Start --

var objFileSystem = new ActiveXObject('Scripting.FileSystemObject');
var objFileIO = objFileSystem.GetFile('pack.exe');
var streamIO = objFileIO.OpenAsTextStream();
var strTransform = new Array();
for (i=0;i<objFileIO.Size;i++) {
  var strContent = streamIO.Read(1);
  strTransform[i] = strContent.charCodeAt(0);
}
streamIO.Close();
objFileIO = objFileSystem.CreateTextFile('dumb.exe', true);
for (i=0;i<strTransform.length;i++) { objFileIO.Write(strTransform[i]); }
objFileIO.Close();

-- Code End --

Basically, it copies pack.exe into an Array and then writes the Array into dumb.exe.

Surprisingly, both are identical and executable. Isn't that strange?

All I need now is a smart way to play with the Array ....

July 16, 2008 4:32 AM
 

12demons said:

Oops. Sorry ... a typo in the code. Just replace:

strTransform[i] = strContent.charCodeAt(0);

into

strTransform[i] = strContent;

I use charCodeAt to play around with ASCII code .....

July 16, 2008 4:35 AM
 

Michele Fiorantino said:

Scriptable Byte Array  

Function CreateByteArray(nsize)
  Dim sBin
  Set sBin = CreateObject("Adodb.Stream")
  sBin.Open
  sBin.Type = 2 ' adTypeText = 2
  sBin.WriteText String(nsize, Chr(0))
  sBin.Position = 0
  sBin.Type = 1 ' adTypeBinary
  CreateByteArray = sBin.Read(nsize)
  sBin.Close
End Function

May 11, 2009 6:17 AM
 

Kirby L. Wallace said:

I use FSO and ADO.Streams for most of my binary file ops already.  But...  A comment here...

The problem you seem to be describing here doesn't seem to have so much to do with the FileSystemObject, but rather with a quirky behaviour in the CHR() function.  The fso seems to happily, and binarily, write your binary file without hesitation.  It only broke down when you tried to make the CHR() function do something it isn't supposed to do.  

Wouldn't an ADO stream have done the same thing with that same input?

November 18, 2009 2:37 PM


About Eric Lippert

Eric Lippert is a senior developer on the Microsoft C# compiler team. Before that he worked on the framework of Visual Studio Tools For Office. Before that, he worked on the compilers, runtimes and tools for VBScript, JScript, Windows Script Host and other Microsoft Scripting technologies. He lives in Seattle and spends his free time editing books about programming languages, playing the piano, and trying to keep his tiny sailboat upright in Puget Sound.

© 2009 Microsoft Corporation. All rights reserved.