# March, 2004

• #### How do I convert a SID between binary and string forms?

Of course, if you want to do this programmatically, you would use ConvertSidToStringSid and ConvertStringSidToSid, but often you're studying a memory dump or otherwise need to do the conversion manually.

If you have a SID like S-a-b-c-d-e-f-g-...

Then the bytes are

• a (revision)
• N (number of dashes minus two)
• bbbbbb (six bytes of "b" treated as a 48-bit number in big-endian format)
• cccc (four bytes of "c" treated as a 32-bit number in little-endian format)
• dddd (four bytes of "d" treated as a 32-bit number in little-endian format)
• eeee (four bytes of "e" treated as a 32-bit number in little-endian format)
• ffff (four bytes of "f" treated as a 32-bit number in little-endian format)
• etc.
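If you'd rather let the computer do the byte shuffling, the layout above can be sketched in a few lines of portable C++. This is only an illustration: the name `SidStringToHex` is made up, it assumes every field is written in decimal, and it does no validation beyond the "S-" prefix.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: converts "S-a-b-c-..." into its raw bytes, as uppercase hex.
std::string SidStringToHex(const std::string& sid)
{
    if (sid.compare(0, 2, "S-") != 0)
        throw std::invalid_argument("not a SID string");

    // Everything after "S-" is a dash-separated list: revision, authority, subauthorities.
    std::vector<std::uint64_t> fields;
    std::istringstream in(sid.substr(2));
    for (std::string part; std::getline(in, part, '-'); )
        fields.push_back(std::stoull(part));
    if (fields.size() < 2)
        throw std::invalid_argument("SID needs a revision and an authority");

    std::vector<std::uint8_t> bytes;
    bytes.push_back(static_cast<std::uint8_t>(fields[0]));          // a: revision
    bytes.push_back(static_cast<std::uint8_t>(fields.size() - 2));  // N: dashes minus two
    for (int shift = 40; shift >= 0; shift -= 8)                    // b: 48 bits, big-endian
        bytes.push_back(static_cast<std::uint8_t>(fields[1] >> shift));
    for (std::size_t i = 2; i < fields.size(); ++i)                 // c, d, e...: 32 bits, little-endian
        for (int shift = 0; shift < 32; shift += 8)
            bytes.push_back(static_cast<std::uint8_t>(fields[i] >> shift));

    std::string hex;
    char buf[3];
    for (std::uint8_t b : bytes) {
        std::snprintf(buf, sizeof buf, "%02X", b);
        hex += buf;
    }
    return hex;
}
```

For instance, `SidStringToHex("S-1-5-18")` yields `010100000000000512000000`: revision 1, one subauthority, authority 5 in six big-endian bytes, and then 18 as a little-endian 32-bit value.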

So for example, if your SID is `S-1-5-21-2127521184-1604012920-1887927527-72713`, then your raw hex SID is

010500000000000515000000A065CF7E784B9B5FE77C8770091C0100

This breaks down as follows:

• 01 (S-1)
• 05 (seven dashes, seven minus two = 5)
• 000000000005 (5 = 0x000000000005, big-endian)
• 15000000 (21 = 0x00000015, little-endian)
• A065CF7E (2127521184 = 0x7ECF65A0, little-endian)
• 784B9B5F (1604012920 = 0x5F9B4B78, little-endian)
• E77C8770 (1887927527 = 0x70877CE7, little-endian)
• 091C0100 (72713 = 0x00011C09, little-endian)
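Going the other way (which is what you want when staring at a memory dump) follows the same breakdown in reverse. Here is a rough sketch; `HexToSidString` is a made-up name, and it assumes the dump has already been copied out as one contiguous string of hex digits.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: turns the raw hex bytes of a SID back into "S-a-b-c-..." form.
std::string HexToSidString(const std::string& hex)
{
    if (hex.size() % 2 != 0)
        throw std::invalid_argument("odd number of hex digits");

    std::vector<std::uint8_t> bytes;
    for (std::size_t i = 0; i < hex.size(); i += 2)
        bytes.push_back(static_cast<std::uint8_t>(std::stoul(hex.substr(i, 2), nullptr, 16)));

    // 1 byte revision + 1 byte count + 6 bytes authority, then 4 bytes per subauthority.
    if (bytes.size() < 8 || bytes.size() != 8 + 4u * bytes[1])
        throw std::invalid_argument("length does not match the subauthority count");

    std::uint64_t authority = 0;                     // 48-bit big-endian
    for (std::size_t i = 2; i < 8; ++i)
        authority = (authority << 8) | bytes[i];

    std::string sid = "S-" + std::to_string(bytes[0]) + "-" + std::to_string(authority);
    for (std::size_t offset = 8; offset < bytes.size(); offset += 4) {
        std::uint32_t sub = bytes[offset]            // 32-bit little-endian
                          | (std::uint32_t(bytes[offset + 1]) << 8)
                          | (std::uint32_t(bytes[offset + 2]) << 16)
                          | (std::uint32_t(bytes[offset + 3]) << 24);
        sid += "-" + std::to_string(sub);
    }
    return sid;
}
```

Feeding it the raw hex SID from the example gives back `S-1-5-21-2127521184-1604012920-1887927527-72713`.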

Yeah, that's great, Raymond, but what do all those numbers mean?

• S-1- : version number (SID_REVISION)
• -5- : SECURITY_NT_AUTHORITY
• -21- : SECURITY_NT_NON_UNIQUE
• -...-...-...- : these identify the machine that issued the SID
• 72713 : unique user id on the machine

Each machine generates a unique ID that it uses to stamp all the SIDs it creates (-...-...-...-). The last number is a "relative id (RID)" that represents a user created by that machine. There are a bunch of predefined RIDs; you can see them in the header file ntseapi.h, which is also where I got these names from. The system reserves RIDs up to 999, so the first non-builtin account gets assigned ID number 1000. The number 72713 means that this particular SID is the 71714th SID created by the issuer. (The machine that issued this SID is clearly a domain controller, responsible for creating the accounts of tens of thousands of users.)

(Actually, I lied above when I said that this is the 71714th SID created by the issuer. Large servers can delegate SID creation to helpers, in which case SID issuance is no longer strictly consecutive.)

Security isn't my area of expertise, so it's entirely possible (perhaps even likely) that I got something wrong up above. But it's mostly correct, I think.

• #### Senators are really good at stock-picking

A Georgia State University study shows that U.S. senators have an uncanny knack for picking stocks that outpace the overall market. Professor Alan Ziobrowski's analysis of senators' financial disclosure data found that over a period of six years, the lawmakers outperformed the market by 12 percent.

Professor Ziobrowski seems convinced that this is evidence of unethical behavior.

• #### What is the default security descriptor?

All these functions have an optional LPSECURITY_ATTRIBUTES parameter, for which everybody just passes NULL, thereby obtaining the default security descriptor. But what is the default security descriptor?

Of course, the place to start is MSDN, in the section titled Security Descriptors for New Objects.

It says that the default DACL comes from inheritable ACEs (if the object belongs to a hierarchy, like the filesystem or the registry); otherwise, the default DACL comes from the primary or impersonation token of the creator.

But what is the default primary token?

Gosh, I don't know either. So let's write a program to find out.

```
#include <windows.h>
#include <sddl.h> // ConvertSecurityDescriptorToStringSecurityDescriptor

int WINAPI
WinMain(HINSTANCE, HINSTANCE, LPSTR, int)
{
    HANDLE Token;
    if (OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &Token)) {
        DWORD RequiredSize = 0;
        GetTokenInformation(Token, TokenDefaultDacl, NULL, 0, &RequiredSize);
        TOKEN_DEFAULT_DACL* DefaultDacl =
            reinterpret_cast<TOKEN_DEFAULT_DACL*>(LocalAlloc(LPTR, RequiredSize));
        if (DefaultDacl) {
            SECURITY_DESCRIPTOR Sd;
            LPTSTR StringSd;
            if (GetTokenInformation(Token, TokenDefaultDacl, DefaultDacl,
                                    RequiredSize, &RequiredSize) &&
                InitializeSecurityDescriptor(&Sd, SECURITY_DESCRIPTOR_REVISION) &&
                SetSecurityDescriptorDacl(&Sd, TRUE,
                                          DefaultDacl->DefaultDacl, FALSE) &&
                ConvertSecurityDescriptorToStringSecurityDescriptor(&Sd,
                    SDDL_REVISION_1, DACL_SECURITY_INFORMATION, &StringSd, NULL)) {
                MessageBox(NULL, StringSd, TEXT("Result"), MB_OK);
                LocalFree(StringSd);
            }
            LocalFree(DefaultDacl);
        }
        CloseHandle(Token);
    }
    return 0;
}
```

Okay, I admit it, the whole purpose of this entry is just so I can call the function ConvertSecurityDescriptorToStringSecurityDescriptor, quite possibly the longest function name in the Win32 API. And just for fun, I used the NT variable naming convention instead of Hungarian.

If you run this program you'll get something like this:

```
D:(A;;GA;;;S-1-5-21-1935655697-839522115-854245398-1003)(A;;GA;;;SY)
```

Pull out our handy reference to the Security Descriptor String Format to decode this.

• "D:" - This introduces the DACL.
• "(A;;GA;;;S-...)" - "Allow" "Generic All" access to "S-...", which happens to be me. Every user by default has full access to their own process.
• "(A;;GA;;;SY)" - "Allow" "Generic All" access to "SY", which is Local System.
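If you'd like to pull the pieces apart mechanically before looking them up, here is a rough sketch that splits the DACL into its parenthesized ACEs and each ACE into its semicolon-separated fields. The names `SddlAce` and `SplitDaclAces` are made up for this illustration, and the sketch assumes the string contains only a "D:" section, as the program's output does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical type: one ACE, i.e. the semicolon-separated fields inside "(...)".
// In order: type; flags; rights; object GUID; inherited object GUID; trustee.
struct SddlAce {
    std::vector<std::string> fields;
};

// Hypothetical helper: splits everything after "D:" into its ACEs.
std::vector<SddlAce> SplitDaclAces(const std::string& sddl)
{
    std::vector<SddlAce> aces;
    std::size_t pos = sddl.find("D:");
    if (pos == std::string::npos) return aces;
    pos += 2;

    for (;;) {
        std::size_t open = sddl.find('(', pos);
        if (open == std::string::npos) break;
        std::size_t close = sddl.find(')', open);
        if (close == std::string::npos) break;

        // Split the text between the parentheses on ';', keeping empty fields.
        std::string body = sddl.substr(open + 1, close - open - 1);
        SddlAce ace;
        std::size_t start = 0;
        for (;;) {
            std::size_t semi = body.find(';', start);
            if (semi == std::string::npos) {
                ace.fields.push_back(body.substr(start));
                break;
            }
            ace.fields.push_back(body.substr(start, semi - start));
            start = semi + 1;
        }
        aces.push_back(ace);
        pos = close + 1;
    }
    return aces;
}
```

For the output shown above, this produces two ACEs of six fields each: field 0 is "A" (allow), field 2 is "GA" (generic all), and field 5 is the trustee, either the long SID or "SY".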

Next time, I'll teach you how to decode that S-... thing.

• #### What happens to those "To Any Soldier" care packages

Commentator and novelist Christian Bauman recalls the excitement of receiving mail from anonymous well-wishers back home during his deployment with the U.S. Army in Somalia in the early 1990s.

This was a fascinating listen.

The coup, of course, was getting a letter with a snapshot or two inside. I don't know why, but the further west the return address, the more likely the envelope had a picture. And the more north, the more likely the picture was, shall we say, "revealing". Triangulate this equation, and you discover that the girls in the northwest get a real charge out of showing the troops exactly what it is they are fighting for.

Make sure to stick to the end for the punch line. (Every good story has a punch line.)

• #### Tonya Harding laces up again

The skater you love to hate is back.

Tonya Harding will lace up for a single game tomorrow with the Indianapolis Ice, which coincidentally happens to be "Guaranteed Fight Night". (If there's no fight, you get a free ticket to another game.)

Personally, I don't think it's right when somebody benefits from having done something wrong.

• #### Why are dialog boxes initially created hidden?

You may not have noticed it until you looked closely, but dialog boxes are actually created hidden initially, even if you specify WS_VISIBLE in the template. The reason for this is historical.

Rewind to the old days (we're talking Windows 1.0): graphics cards are slow, CPUs are slow, and memory is slow. You can pick a menu option that displays a dialog and wait a second or two for the dialog to get loaded off the floppy disk. (Hard drives are for the rich kids.) And then you have to wait for the dialog box to paint.

To save valuable seconds, dialog boxes are created initially hidden and all typeahead is processed while the dialog stays hidden. Only after the typeahead is finished is the dialog box finally shown. And if you typed far enough ahead and hit Enter, you might even have been able to finish the entire dialog box without it ever being shown! Now that's efficiency.

Of course, nowadays, programs are stored on hard drives and you can't (normally) out-type a hard drive, so this optimization is largely wasted, but the behavior remains for compatibility reasons.

Actually, this behavior still serves a useful purpose: If the dialog were initially created visible, then the user would be able to see all the controls being created into it, and watch as WM_INITDIALOG ran (changing default values, hiding and showing controls, moving controls around...). This is both ugly and distracting. ("How come the box comes up checked, then suddenly unchecks itself before I can click on it?")

• #### Why do operations on "byte" result in "int"?

(The following discussion applies equally to C/C++/C#, so I'll use C#, since I talk about it so rarely.)

People complain that the following code elicits an error:

```
byte b = 32;
byte c = ~b;
// error CS0029: Cannot implicitly convert type 'int' to 'byte'
```

"The result of an operation on 'byte' should be another 'byte', not an 'int'," they claim.

Be careful what you ask for. You might not like it.

Suppose we lived in a fantasy world where operations on 'byte' resulted in 'byte'.

```
byte b = 32;
byte c = 240;
int i = b + c; // what is i?
```

In this fantasy world, the value of i would be 16! Why? Because the two operands to the + operator are both bytes, so the sum "b+c" is computed as a byte, which results in 16 due to integer overflow. (And, as I noted earlier, integer overflow is the new security attack vector.)

Similarly,

```
int j = -b;
```

would result in j having the value 224 and not -32, for the same reason.
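You can visit the fantasy world from C++ by forcing each result back into a byte by hand. This is just a sketch to make the arithmetic concrete; the function names are made up, and `std::uint8_t` stands in for `byte`.

```cpp
#include <cstdint>

// Fantasy rules: the result of a byte operation is squeezed back into a byte,
// so it wraps around modulo 256.
std::uint8_t FantasyAdd(std::uint8_t b, std::uint8_t c)
{
    return static_cast<std::uint8_t>(b + c);
}

std::uint8_t FantasyNegate(std::uint8_t b)
{
    return static_cast<std::uint8_t>(-b);
}

// Real rules: the operands are promoted to int before the operation.
int RealAdd(std::uint8_t b, std::uint8_t c)
{
    return b + c;
}

int RealNegate(std::uint8_t b)
{
    return -b;
}
```

With b = 32 and c = 240, `FantasyAdd` returns 16 where `RealAdd` returns 272, and `FantasyNegate(32)` returns 224 where `RealNegate(32)` returns -32.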

Is that really what you want?

Consider the following more subtle scenario:

```
struct Results {
    byte Wins;
    byte Games;
};

bool WinningAverage(Results captain, Results cocaptain)
{
    return (captain.Wins + cocaptain.Wins) >=
           (captain.Games + cocaptain.Games) / 2;
}
```

In our imaginary world, this code would return incorrect results once the total number of games played exceeded 255. To fix it, you would have to insert annoying int casts.

```
    return ((int)captain.Wins + cocaptain.Wins) >=
           ((int)captain.Games + cocaptain.Games) / 2;
```

So no matter how you slice it, you're going to have to insert annoying casts. You may as well have the language err on the side of safety (forcing you to insert the casts where you know that overflow is not an issue) rather than on the side of silence (where you may not notice the missing casts until your Payroll department asks you why their books don't add up at the end of the month).
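To watch the imaginary-world version of WinningAverage go wrong, here is a sketch in C++ that truncates each sum back to a byte by hand. `FantasyWinningAverage` is a made-up name, and `std::uint8_t` stands in for `byte`.

```cpp
#include <cstdint>

struct Results {
    std::uint8_t Wins;
    std::uint8_t Games;
};

// Real rules: both sums are computed as int, so nothing overflows.
bool WinningAverage(Results captain, Results cocaptain)
{
    return (captain.Wins + cocaptain.Wins) >=
           (captain.Games + cocaptain.Games) / 2;
}

// Imaginary rules: each sum is squeezed back into a byte before it is used.
bool FantasyWinningAverage(Results captain, Results cocaptain)
{
    std::uint8_t wins = static_cast<std::uint8_t>(captain.Wins + cocaptain.Wins);
    std::uint8_t games = static_cast<std::uint8_t>(captain.Games + cocaptain.Games);
    return wins >= games / 2;
}
```

Take a captain with 20 wins in 200 games and a co-captain with 20 wins in 100 games. The real version compares 40 against 150 and says no. The imaginary version sees 300 games wrap around to 44, compares 40 against 22, and congratulates a team that lost 260 games out of 300.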

• #### Char.IsDigit() matches more than just "0" through "9"

What people might not realize is that Char.IsDigit() matches more than just "0" through "9".

Valid digits are members of the following category in UnicodeCategory: DecimalDigitNumber.

But what exactly is a DecimalDigitNumber?

DecimalDigitNumber
Indicates that the character is a decimal digit; that is, in the range 0 through 9. Signified by the Unicode designation "Nd" (number, decimal digit). The value is 8.

At this point you have to go to the Unicode Standard Committee to see exactly what qualifies as "Nd", and then you get lost in a twisty maze of specifications and documents, all different.

So let's run an experiment.

```
class Program {
    public static void Main(string[] args) {
        System.Console.WriteLine(
            System.Text.RegularExpressions.Regex.Match(
                "\x0661\x0662\x0663", // "١٢٣"
                "^\\d+$").Success);
        System.Console.WriteLine(
            System.Char.IsDigit('\x0661'));
    }
}
```

The characters in the string are Arabic-Indic digits (U+0661 through U+0663), but they are still digits, as evidenced by the program output:

```
True
True
```

Uh-oh. Do you have this bug in your parameter validation? (More examples.) If you use a pattern like `@"^\d+$"` to validate that you receive only digits, and then later use `System.Int32.Parse()` to parse it, then I can hand you some Arabic digits and sit back and watch the fireworks. The Arabic digits will pass your validation expression, but when you get around to using it, boom, you throw a `System.FormatException` and die.
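If what you really mean is "0" through "9" and nothing else, one way out is to say so explicitly. Here is that rule sketched in C++ rather than C#; the name `IsAsciiDigits` is made up, and the sketch assumes the input arrives as UTF-16 code units.

```cpp
#include <string>

// Hypothetical validator: true only if the string is non-empty and every
// UTF-16 code unit is one of the ten ASCII digits.
bool IsAsciiDigits(const std::u16string& text)
{
    if (text.empty()) return false;
    for (char16_t ch : text) {
        if (ch < u'0' || ch > u'9') return false;
    }
    return true;
}
```

With this check, `u"123"` passes, while the Arabic-Indic string `u"\u0661\u0662\u0663"` is rejected up front instead of blowing up later in the parser.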

• #### Returning to Sweden, this time with some actual knowledge of Swedish

I will be in Stockholm from March 24th to April 7th, with an excursion to Göteborg thrown in at some point yet to be determined, probably the 29th to the 1st. So the blog will be on autopilot for a few weeks. (Assuming I can even generate that much advance material. If not, then it will just plain go quiet.)

• #### C++ scoped static initialization is not thread-safe, on purpose!

The rule for static variables at block scope (as opposed to static variables with global scope) is that they are initialized the first time execution reaches their declaration.

Find the race condition:

```
int ComputeSomething()
{
    static int cachedResult = ComputeSomethingSlowly();
    return cachedResult;
}
```

The intent of this code is to compute something expensive the first time the function is called, and then cache the result to be returned by future calls to the function.

A variation on this basic technique is advocated by this web site to avoid the "static initialization order fiasco". (Said fiasco is well-described on that page so I encourage you to read it and understand it.)

The problem is that this code is not thread-safe. Statics with local scope are internally converted by the compiler into something like this:

```
int ComputeSomething()
{
    static bool cachedResult_computed = false;
    static int cachedResult;
    if (!cachedResult_computed) {
        cachedResult_computed = true;
        cachedResult = ComputeSomethingSlowly();
    }
    return cachedResult;
}
```

Now the race condition is easier to see.

Suppose two threads both call this function for the first time. The first thread gets as far as setting cachedResult_computed = true, and then gets pre-empted. The second thread now sees that cachedResult_computed is true and skips over the body of the "if" branch and returns an uninitialized variable.
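Real races are hard to reproduce on demand, so here is a single-threaded sketch that merely replays that interleaving step by step against the lowered form. Everything in it (`LoweredStatic`, `TestAndMarkComputed`, `ValueSeenBySecondThread`) is a made-up name for illustration; it is a model of the scenario, not what any compiler literally emits.

```cpp
// The state behind "static int cachedResult = ..." in the lowered form.
struct LoweredStatic {
    bool computed = false;
    int cachedResult = 0;   // static storage starts out zero-filled
};

// The test and the flag update from the lowered "if".
// Returns true if the caller is the one who must run the initializer.
bool TestAndMarkComputed(LoweredStatic& s)
{
    if (s.computed) return false;
    s.computed = true;
    return true;
}

// Replays the interleaving and returns what thread B's call hands back.
int ValueSeenBySecondThread(int slowResult)
{
    LoweredStatic s;

    // Thread A sets the flag and is pre-empted before storing the result.
    bool aMustInit = TestAndMarkComputed(s);

    // Thread B sees the flag already set, skips the "if" body, and returns
    // whatever is sitting in cachedResult right now.
    bool bMustInit = TestAndMarkComputed(s);
    int seenByB = bMustInit ? slowResult : s.cachedResult;

    // Thread A resumes and finally stores the real value. Too late for B.
    if (aMustInit) s.cachedResult = slowResult;

    return seenByB;
}
```

If the slow computation would have produced 42, `ValueSeenBySecondThread(42)` returns 0: thread B walked away with the zero-filled placeholder.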

What you see here is not a compiler bug. This behavior is what the C++ standard (C++98 and C++03) calls for; that standard has no notion of threads.

You can write variations on this theme to create even worse problems:

```
class Something { ... };
int ComputeSomething()
{
    static Something s;
    return s.ComputeIt();
}
```

This gets rewritten internally as (this time, using pseudo-C++):

```
class Something { ... };
int ComputeSomething()
{
    static bool s_constructed = false;
    static uninitialized Something s;
    if (!s_constructed) {
        s_constructed = true;
        new(&s) Something; // construct it
        atexit(DestructS);
    }
    return s.ComputeIt();
}
// Destruct s at process termination
void DestructS()
{
    ComputeSomething::s.~Something();
}
```

Notice that there are multiple race conditions here. As before, it's possible for one thread to run ahead of the other thread and use "s" before it has been constructed.

Even worse, it's possible for the first thread to get pre-empted immediately after testing s_constructed but before setting it to "true". In this case, the object s gets double-constructed and double-destructed.

That can't be good.

But wait, that's not all. Now look at what happens if you have two runtime-initialized local statics:

```
class Something { ... };
int ComputeSomething()
{
    static Something s(0);
    static Something t(1);
    return s.ComputeIt() + t.ComputeIt();
}
```

This is converted by the compiler into the following pseudo-C++:

```
class Something { ... };
int ComputeSomething()
{
    static char constructed = 0;
    static uninitialized Something s;
    if (!(constructed & 1)) {
        constructed |= 1;
        new(&s) Something(0); // construct it
        atexit(DestructS);
    }
    static uninitialized Something t;
    if (!(constructed & 2)) {
        constructed |= 2;
        new(&t) Something(1); // construct it
        atexit(DestructT);
    }
    return s.ComputeIt() + t.ComputeIt();
}
```

To save space, the compiler placed the two "x_constructed" variables into a bitfield. Now there are multiple non-interlocked read-modify-store operations on the variable "constructed".

Now consider what happens if one thread attempts to execute "constructed |= 1" at the same time another thread attempts to execute "constructed |= 2".

On an x86, the statements likely assemble into

```
    or constructed, 1
    ...
    or constructed, 2
```

without any "lock" prefixes. On multiprocessor machines, it is possible for the two read-modify-store operations both to read the old value and clobber each other with conflicting values.

On ia64 and Alpha, this clobbering is much more obvious since they do not have a single read-modify-store instruction; the three steps must be explicitly coded:

```
    ldl t1,0(a0)     ; load
    bis t1,1,t1      ; modify (set the bit)
    stl t1,0(a0)     ; store
```

If the thread gets pre-empted between the load and the store, the value stored may no longer agree with the value being overwritten.

So now consider the following insane sequence of execution:

• Thread A tests "constructed" and finds it zero and prepares to set the value to 1, but it gets pre-empted.
• Thread B enters the same function, sees "constructed" is zero and proceeds to construct both "s" and "t", leaving "constructed" equal to 3.
• Thread A resumes execution and completes its load-modify-store sequence, setting "constructed" to 1, then constructs "s" (a second time).
• Thread A then proceeds to construct "t" as well (a second time) setting "constructed" (finally) to 3.
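If you don't believe the bookkeeping, here is a single-threaded sketch that replays those four steps in order and counts the constructions. It is a model of the scenario with made-up names (`Outcome`, `ReplayInsaneSequence`), with the load-modify-store of thread A split apart by hand; no actual threads are involved.

```cpp
struct Outcome {
    int sConstructions;   // how many times "s" was constructed
    int tConstructions;   // how many times "t" was constructed
    int constructed;      // final value of the shared flag byte
};

Outcome ReplayInsaneSequence()
{
    int constructed = 0;
    int sCount = 0;
    int tCount = 0;

    // Thread A: tests bit 0, finds it clear, loads the flags into a register
    // in preparation for setting the bit, and gets pre-empted.
    bool aWillConstructS = !(constructed & 1);
    int aRegister = constructed;

    // Thread B: runs the whole function, constructing both objects.
    if (!(constructed & 1)) { constructed |= 1; ++sCount; }
    if (!(constructed & 2)) { constructed |= 2; ++tCount; }

    // Thread A resumes: finishes its modify and store using the stale value,
    // wiping out bit 1, and then constructs "s".
    if (aWillConstructS) { constructed = aRegister | 1; ++sCount; }

    // Thread A continues: bit 1 looks clear again, so it constructs "t" too.
    if (!(constructed & 2)) { constructed |= 2; ++tCount; }

    return Outcome{sCount, tCount, constructed};
}
```

The replay ends with "constructed" equal to 3, which looks perfectly healthy, and with both "s" and "t" constructed twice.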

Now, you might think you can wrap the runtime initialization in a critical section:

```
int ComputeSomething()
{
    EnterCriticalSection(...);
    static int cachedResult = ComputeSomethingSlowly();
    LeaveCriticalSection(...);
    return cachedResult;
}
```

Because now you've placed the one-time initialization inside a critical section and made it thread-safe.

But what if the second call comes from within the same thread? ("We've traced the call; it's coming from inside the thread!") This can happen if ComputeSomethingSlowly() itself calls ComputeSomething(), perhaps indirectly. Since that thread already owns the critical section, the code enters it just fine and you once again end up returning an uninitialized variable.
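This one needs no second thread at all, so it can be demonstrated directly. The sketch below makes two assumptions: it uses `std::recursive_mutex` as a portable stand-in for a critical section (both let the owning thread re-enter), and it spells out the lowered form of the static by hand so that the result doesn't depend on how a particular compiler implements local statics. The variable `g_valueSeenByInnerCall` is made up to record what the re-entrant call got back.

```cpp
#include <mutex>

std::recursive_mutex g_lock;        // stand-in for the critical section
int g_valueSeenByInnerCall = -1;    // what the re-entrant call returned

int ComputeSomething();

int ComputeSomethingSlowly()
{
    // Call back into ComputeSomething() once, as an indirect caller might.
    static bool reentered = false;
    if (!reentered) {
        reentered = true;
        g_valueSeenByInnerCall = ComputeSomething();
    }
    return 42;
}

int ComputeSomething()
{
    // Hand-lowered form of: static int cachedResult = ComputeSomethingSlowly();
    static bool cachedResult_computed = false;
    static int cachedResult;        // zero-filled until assigned

    std::lock_guard<std::recursive_mutex> guard(g_lock);
    if (!cachedResult_computed) {
        cachedResult_computed = true;
        cachedResult = ComputeSomethingSlowly();
    }
    return cachedResult;
}
```

The outer call returns 42 as intended, but the inner call sailed through the lock, saw the flag already set, and returned 0.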

Conclusion: When you see runtime initialization of a local static variable, be very concerned.
