Fabulous Adventures In Coding

Eric Lippert's Blog

You Can't Convert Data Structures To Strings In VBScript Without Breaking A Few Eggs

Here's a question I get every now and then:

I've written a VBScript program which calls a method on an object that returns an array of bytes containing a GUID.  VBScript only supports arrays of variants.  How can I turn this into a human-readable string?

Good question.  It is doable without writing an object in C++, but it's a little tricky.  The first thing to know is that even though VBScript does not support arrays of anything other than variant, the underlying OLE Automation library supports turning byte arrays into strings.  Therefore you can use CStr to turn the thing into a string, right?

Function GuidToString(ByteArray)
  GuidToString = CStr(ByteArray)
End Function

Print GuidToString(MyObject.GetTheGuid)

Which prints out ~å ??ATErU%'èÅp±

Oops.  We've taken those bytes and interpreted them as Unicode characters in a UTF-16 encoding.  That's not right.  We want to convert the bytes to text, preferably in hex format.  Fortunately we have it in a string now, so we can extract the bytes with the byte-manipulating versions of the string library functions. Let's try that again.

Function GuidToString(ByteArray)
  Dim Binary, S
  Binary = CStr(ByteArray)
  S = "{"
  S = S & Hex(AscB(MidB(Binary, 1, 1)))
  S = S & Hex(AscB(MidB(Binary, 2, 1)))
  S = S & Hex(AscB(MidB(Binary, 3, 1)))
  S = S & Hex(AscB(MidB(Binary, 4, 1)))
  S = S & "-"  
  S = S & Hex(AscB(MidB(Binary, 5, 1)))
  S = S & Hex(AscB(MidB(Binary, 6, 1)))
  S = S & "-"  
  S = S & Hex(AscB(MidB(Binary, 7, 1)))
  S = S & Hex(AscB(MidB(Binary, 8, 1)))
  S = S & "-"  
  S = S & Hex(AscB(MidB(Binary, 9, 1)))
  S = S & Hex(AscB(MidB(Binary, 10, 1)))
  S = S & "-"  
  S = S & Hex(AscB(MidB(Binary, 11, 1)))
  S = S & Hex(AscB(MidB(Binary, 12, 1)))
  S = S & Hex(AscB(MidB(Binary, 13, 1)))
  S = S & Hex(AscB(MidB(Binary, 14, 1)))
  S = S & Hex(AscB(MidB(Binary, 15, 1)))
  S = S & Hex(AscB(MidB(Binary, 16, 1)))
  S = S & "}"
  GuidToString = S
End Function

Which prints out {7E0E50-200-BA25-4026-410540450}

Uh, shouldn't the character counts of each section be 8-4-4-4-12, instead of 6-3-4-4-9?

Oops.  We need the single digit bytes like 0 to go to "00", not "0".  That's easy enough to fix up:

Function HexByte(b)
  HexByte = Right("0" & Hex(b), 2)
End Function

Function GuidToString(ByteArray)
  Dim Binary, S
  Binary = CStr(ByteArray)
  S = "{"
  S = S & HexByte(AscB(MidB(Binary, 1, 1)))
  S = S & HexByte(AscB(MidB(Binary, 2, 1)))
  S = S & HexByte(AscB(MidB(Binary, 3, 1)))
  S = S & HexByte(AscB(MidB(Binary, 4, 1)))
  S = S & "-"  
  S = S & HexByte(AscB(MidB(Binary, 5, 1)))
  S = S & HexByte(AscB(MidB(Binary, 6, 1)))
  S = S & "-"  
  S = S & HexByte(AscB(MidB(Binary, 7, 1)))
  S = S & HexByte(AscB(MidB(Binary, 8, 1)))
  S = S & "-"  
  S = S & HexByte(AscB(MidB(Binary, 9, 1)))
  S = S & HexByte(AscB(MidB(Binary, 10, 1)))
  S = S & "-"  
  S = S & HexByte(AscB(MidB(Binary, 11, 1)))
  S = S & HexByte(AscB(MidB(Binary, 12, 1)))
  S = S & HexByte(AscB(MidB(Binary, 13, 1)))
  S = S & HexByte(AscB(MidB(Binary, 14, 1)))
  S = S & HexByte(AscB(MidB(Binary, 15, 1)))
  S = S & HexByte(AscB(MidB(Binary, 16, 1)))
  S = S & "}"
  GuidToString = S
End Function

Which prints out {7E00E500-2000-BA25-4026-410054004500}

Which is also wrong.  What's wrong this time? 

The logical format of a GUID in memory is not in the same order as the bytes are in the string.   A GUID stored in binary format in memory is a sixteen byte structure in the following format:

DWORD-WORD-WORD-BYTE BYTE-BYTE BYTE BYTE BYTE BYTE BYTE

So what?  Why does that matter?

It matters because a WORD consists of two bytes, but they are stored in memory in order from the least to the most significant on my Intel machine.  Same with the four-byte DWORD.  Intel boxes are "little endian" machines.  Motorolas are "big endian" -- on Macs, the big byte comes first in memory.  Which is the better scheme is one of the great holy wars of information technology.  Apparently some poor deluded people still fail to realize that little-endian architecture is much more sensible than big-endian, or that vi is a much better editor than emacs. :-)

(ASIDE: These whimsical terms were borrowed from Gulliver's Travels, in which Swift satirizes the political parties of his day.  In Lilliput, the Protestant rulers of England are represented by the Little Endians, the oppressed Catholics as the Big Endians.  They disagree on which is the correct way to break an egg.  See the last half of part one, chapter four for details.)
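
To make the byte order concrete, here is a minimal sketch (not from the original post) that rebuilds a WORD from its two bytes as they sit in memory.  LittleEndianValue is a hypothetical helper introduced only for illustration, and WScript.Echo assumes the script runs under Windows Script Host.  It uses the third group of the GUID above, whose bytes are stored as BA 25:

```vbscript
' Illustration only: LittleEndianValue is a hypothetical helper,
' not part of the original post.
' Bytes holds the values in memory order, least significant byte first.
Function LittleEndianValue(Bytes)
  Dim i, Value
  Value = 0
  ' Walk from the last (most significant) byte down to the first.
  For i = UBound(Bytes) To LBound(Bytes) Step -1
    Value = Value * 256 + Bytes(i)
  Next
  LittleEndianValue = Value
End Function

Dim Stored
Stored = Array(&HBA, &H25)                    ' memory order: BA, then 25
WScript.Echo Hex(LittleEndianValue(Stored))   ' prints 25BA
```

Read in memory order the bytes spell BA25, but the value they encode is 25BA, which is the form the GUID string has to show.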

We need to decode that thing into the correct order:

Function GuidToString(ByteArray)
  Dim Binary, S
  Binary = CStr(ByteArray)
  S = "{"
  S = S & HexByte(AscB(MidB(Binary, 4, 1)))
  S = S & HexByte(AscB(MidB(Binary, 3, 1)))
  S = S & HexByte(AscB(MidB(Binary, 2, 1)))
  S = S & HexByte(AscB(MidB(Binary, 1, 1)))
  S = S & "-"  
  S = S & HexByte(AscB(MidB(Binary, 6, 1)))
  S = S & HexByte(AscB(MidB(Binary, 5, 1)))
  S = S & "-"  
  S = S & HexByte(AscB(MidB(Binary, 8, 1)))
  S = S & HexByte(AscB(MidB(Binary, 7, 1)))
  S = S & "-"  
  S = S & HexByte(AscB(MidB(Binary, 9, 1)))
  S = S & HexByte(AscB(MidB(Binary, 10, 1)))
  S = S & "-"  
  S = S & HexByte(AscB(MidB(Binary, 11, 1)))
  S = S & HexByte(AscB(MidB(Binary, 12, 1)))
  S = S & HexByte(AscB(MidB(Binary, 13, 1)))
  S = S & HexByte(AscB(MidB(Binary, 14, 1)))
  S = S & HexByte(AscB(MidB(Binary, 15, 1)))
  S = S & HexByte(AscB(MidB(Binary, 16, 1)))
  S = S & "}"
  GuidToString = S
End Function

Which prints out {00E5007E-0020-25BA-4026-410054004500}, the correct string.
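
The same decoding can be written more compactly.  The following is a sketch under the same assumptions as the function above (the input converts to a sixteen-byte binary string via CStr); HexRun is a hypothetical helper introduced here, not something from the original post:

```vbscript
' HexByte as defined earlier: always two hex digits per byte.
Function HexByte(b)
  HexByte = Right("0" & Hex(b), 2)
End Function

' Hypothetical helper: hex text for Count bytes of Binary, starting at the
' 1-based byte position First.  When Reversed is True the bytes are read
' last to first, which undoes little-endian storage.
Function HexRun(Binary, First, Count, Reversed)
  Dim i, Pos, S
  S = ""
  For i = 0 To Count - 1
    If Reversed Then
      Pos = First + Count - 1 - i
    Else
      Pos = First + i
    End If
    S = S & HexByte(AscB(MidB(Binary, Pos, 1)))
  Next
  HexRun = S
End Function

Function GuidToString(ByteArray)
  Dim Binary
  Binary = CStr(ByteArray)
  ' The DWORD and the two WORDs are byte-swapped; the last eight bytes are not.
  GuidToString = "{" & HexRun(Binary, 1, 4, True) & "-" & _
                 HexRun(Binary, 5, 2, True) & "-" & _
                 HexRun(Binary, 7, 2, True) & "-" & _
                 HexRun(Binary, 9, 2, False) & "-" & _
                 HexRun(Binary, 11, 6, False) & "}"
End Function
```

The five HexRun calls mirror the DWORD-WORD-WORD-bytes layout directly, so the byte order for each group is stated once instead of being spread across sixteen lines.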

The whole point of script programming languages is to abstract away from the underlying details of how the machine works.  Occasionally though these abstractions prove to be leaky. This is one of those times when in order to make sense of something, you need to understand some pretty low-level trivia about how computers work.

Published Tuesday, May 25, 2004 11:06 AM by Eric Lippert


Comments

 

Chris Dickens said:

Did you know that in The Matrix Reloaded's freeway chase there is a truck that says "Big Endian Eggs" on the side? There are pictures here:

http://whatisthematrix.warnerbros.com/rl_cmp/onset_page08.html
May 25, 2004 12:47 PM
 

Eric Lippert said:

I did not know that. That is quite amusing! I'll have to look for that next time I see it.
May 25, 2004 12:57 PM
 

Kyle Lahnakoski said:

Why do we care that "The logical format of a GUID in memory is not in the same order as the bytes are in the string"? GUIDs only need an equality relation defined on them, therefore we should not care how we serialize GUIDs, just as long as we have uniquely (and possibly the process is reversible).
May 25, 2004 11:41 PM
 

Muhammad Ali Shah said:

Lahnakoski: Sometimes we need to convert GUID stored in memory into a human readable string just to show it to the user; then it matters how we are displaying it and how it is stored.
May 26, 2004 12:46 AM
 

Kyle Lahnakoski said:

Muhammad Ali Shah: You are assuming the user is interested in seeing past the vbscript abstraction. In this case the abstraction can not be considered "leaky" because the user is interested in the *implementation* of the abstraction.

May 26, 2004 6:33 AM
 

Eric Lippert said:

Guids need more than an equality relation, they also need a consistent guid-to/from-string operation. Otherwise you can't take a class id and look it up in the registry.
May 26, 2004 8:19 AM
 

Kyle Lahnakoski said:

Eric: The third and fourth definitions of GuidToString() both produce strings that can be recomposed into guids. So both are equally good.
May 26, 2004 10:17 AM
 

Eric Lippert said:

You sure?

Then write me a method which takes as its input a byte array containing a guid, and outputs True if that guid is registered under HKEY_CLASSES_ROOT\CLSID, False if it is not.
May 26, 2004 10:41 AM
 

Kyle Lahnakoski said:

Eric: Good example, that is the type of example I was fishing for. I was not aware of all your requirements. Specifically, I was not aware you had to compare your serialized GUIDs to other systems' serialized GUIDs (like the registry system).
May 27, 2004 2:41 PM
 

Robbo said:

Doing something similar - thought I'd use a timestamp (actually nothing to do with dates, can be considered as an array of 8 bytes or possibly as a bigint).

Gets returned to vbscript from ado as an array of bytes (8 elements, zero to seven).

Having real probs dealing with it in asp - so am doing it in sql ie CAST(CAST(stamp AS BIGINT) AS VARCHAR) AS converted_stamp

Kludgy, but time is tight! All I want to do is to store it in a hidden field so that the update is dependent on no other edits of the record - overwrite or reload.
July 27, 2004 6:41 AM
 

Michiel said:

I noticed that:

S = S & HexByte(AscB(MidB(Binary, 10, 1)))
S = S & HexByte(AscB(MidB(Binary, 9, 1)))

should be swapped.

I built a small script to collect the msExchMailboxGUID from AD, but it only came out right after I swapped the two lines mentioned above.

Any thoughts on that?
November 25, 2005 6:22 AM
 

Eric Lippert said:

Whoops -- yep, that's a typo. Thanks for pointing that out. I've corrected the text.
November 25, 2005 3:32 PM
 

mark a said:

Any tips on how to do this in reverse, i.e. pass an array of bytes from VBScript to .NET?

From VBScript, I am trying to use a .Net class (System.Security.Cryptography.HMACSHA1) method that expects an array of bytes.

It's easy to define an array of bytes in VBScript (Dim foo() as Byte) but internally foo is just a variant that is pointing to an array of variants, if I am correct.

So the question would be: what does the OLE Automation / COM interop layer do when passing a VBScript array defined as above to a .NET class that expects an array of bytes? I am getting an 'Invalid Procedure Call' error from the script engine, and unfortunately do not have a debugger to see what the array looks like on the .NET side.

Thanks for any help,

Mark

December 2, 2006 2:08 PM
 

Cornan the Iowan said:

It's worth noting for those still dealing with VBScript that this routine is not needed for GUIDs returned from SQL Server as a "uniqueidentifier" via ADO.  (Yes, the article starts with a mention of "array of bytes", but some people might say that a uniqueidentifier IS an "array of bytes", too).

Anyway, if the argument to the function is in a form that VBScript / ADO / OLE recognizes as a uniqueidentifier, the statement "Binary = CStr(ByteArray)" will simply do the conversion to string format, complete with braces.

In that case, GuidToString() will be trying to format the text string, not the original binary string.

To make the function more "universal", it could check whether Binary already contains a GUID-formatted string and skip the HexByte calls.  Simplistically:

Binary = CStr(ByteArray)
If Len(Binary) = 38 Then
  If Mid(Binary, 1, 1) = "{" And Mid(Binary, 38, 1) = "}" Then
    GuidToString = Binary
    Exit Function
  End If
End If

October 21, 2009 8:41 AM


About Eric Lippert

Eric Lippert is a senior developer on the Microsoft C# compiler team. Before that he worked on the framework of Visual Studio Tools For Office. Before that, he worked on the compilers, runtimes and tools for VBScript, JScript, Windows Script Host and other Microsoft Scripting technologies. He lives in Seattle and spends his free time editing books about programming languages, playing the piano, and trying to keep his tiny sailboat upright in Puget Sound.
