Microsoft can be quite obsessive about instrumentation and metrics. We have a significant body of tools that perform static and dynamic analysis on our operating systems. Some of these tools (for example prefast and FxCop) are public, some are not.
Friday I posted a small snippet of code that showed a couple of the issues using named shared synchronization objects. The example was abstract, but the problems called out is quite real.
If you run the Microsoft prefix tool on the example, it will point out the first problem in the snippet: The calls to CreateEvent are made without specifying a DACL. This is bad, because the objects get a DACL based on the creator's token, which means you don't get to control it, and it's entirely possible that the DACL you get is to permissive.
Even if you're not worried about a bad guy (and you always have to worry about the bad guy), if you're using the shared synchronization object in a client/server scenario, it means that the client might not have rights to access the server because the newly created object will get a security descriptor based on the token of the creator. Even worse, it's possible that the SD in the client grants the server rights but not vice-versa. That means that depending on which side gets to run first, you might get different results. That kind of timing problem can be a nightmare to debug.
If the ACL creates is too permissive, then things get scary. If the DACL grants full access, it means that a bad guy can do ANYTHING to your event - they can change it's state, they can set a new DACL on it (locking your application out), etc. Weak DACLs are a recipe for denial of service attacks (or worse - depending on the access rights granted and the circumstances, they can even enable elevation of privilege attacks).
Of course the fix is simple: Provide a DACL that grants the relevant access rights to the principles that are going to be opening the shared synchronization object.
That's what we did when our source code analysis tools showed the first problem - they correctly detected that we were using a named object and hadn't specified a DACL for the named object.
A couple of weeks later, I got a new bug report, this time from the security guys. It seems that they had ran a second tool which looked for ACLs that were too permissive. In their scan, they found our named shared object, which had granted GENERIC_ALL access to the windows audio service.
I fixed the problem by tightening the ACL on the shared object to only grant SYNCHRONIZE access to the audio service (that's all the audio service needed), and I ran afoul of the second problem.
As the documentation for CreateEvent clearly spells out, if the named synchronization object already existed, CreateEvent API (and CreateMutex and CreateSemaphore) opens the object for EVENT_ALL_ACCESS (or SEMAPHORE_ALL_ACCESS or MUTEX_ALL_ACCESS). That means that your DACL needs to grant EVENT_ALL_ACCESS (or...) or the call to CreateEvent will fail. And I just changed the DACL to only grant SYNCHRONIZE access.
The problem is that we used CreateEvent to prevent from the race condition that occurs if you try to call OpenEvent/CreateEvent - it's possible for the OpenEvent call to fail and have all the consumers of the named event fall through to the CreateEvent API - to avoid this race, the code simply used CreateEvent. I wasn't willing to add a call to OpenEvent because it would leave the race condition still present.
The good news is that the COSD people had identified this problem in the named synchronization object APIs and in Vista the new CreateEventEx was added to enable applications to work around this deficiency in the APIs: The CreateEventEx API allows you to specify an access mask to be used on the object after it is opened. So now it's possible to have accurate DACLs on named synchronization objects AND prevent squatting attacks.
Couldn't you do a CreateEvent, then when you get ERROR_ACCESS_DENIED, try OpenEvent?
Possibly. It assumes that the only reason for getting the access denied error is the DACL, but it might be a viable solution for downlevel OS's.
So just to be clear, this problem is unsolvable given the existing APIs for pre-Vista systems? Unsolvable meaning solved with a good choice of DACL and no race condition for event (or...) creation.
@Dean Harding ...
From this post:
"The problem is that we used CreateEvent to prevent from the race condition that occurs if you try to call OpenEvent/CreateEvent - it's possible for the OpenEvent call to fail and have all the consumers of the named event fall through to the CreateEvent API - to avoid this race, the code simply used CreateEvent. I wasn't willing to add a call to OpenEvent because it would leave the race condition still present."
From Friday under "Things that are not wrong":
"Using CreateEvent in the worker thread instead of OpenEvent - remember, in the original incarnation of this code, the two components that deal with the event run in different processes - using CreateEvent allows you to protect against race conditions creating the event (if you call OpenEvent and follow it with a call to CreateEvent if the OpenEvent fails, there's a window where the other side could also call CreateEvent)."
er ... nevermind the repeating of the open/create items ... apparently *I* was the one not reading thoroughly since I missed that it was offered as the alternative only to ERROR_ACCESS_DENIED
By the way: ACLs for named objects tied to an user session should grant permissions only to SYSTEM, Everyone (possibly) and the current logon id. And none other.
The logon id is a volatile group generated on the fly when a session is created, it can be told apart from the others by the SE_GROUP_LOGON_ID flag, and all processes created in a session by standard means (from another process, Task Manager from CTRL-ALT-DEL, run as another user, and even run as limited user) are members of it. Failure to respect this unwritten rule can be very serious.
For some silly reason, the logon id is not available in SDDL nor can be retrieved by CreateWellKnownSid, so you have to go through the pains of opening the current token, querying for its groups and finding the one with SE_GROUP_LOGON_ID set.
This post (and others) are why I try to get my guys reading Larry's and Raymond's blogs. It is not only to expose them to obscure issues, but to get them realizing that programming is harder than just stringing a bunch of routine calls together and calling it a day. (And I don't want to hear any Unix geeks saying that if we just programmed for Linux, you wouldn't have these problems. RIGHT...)
Of course, I haven't exactly thouroughly thought this though so I might be missing something, but I was asssuming that if you get ERROR_ACCESS_DENIED in the CreateEvent for some *other* reason (e.g. another user already opened one or something) then you'd still get ERROR_ACCESS_DENIED in the OpenEvent as well.
Is Prefix available outside Microsoft?
I searched both google and MSDN, and got only a couple of presentations (out dated) from Microsoft Research and mentioning it only in passing in various articles on security and reliability enhancements in Microsoft products in recent years.
It's edge cases like this that Win32 has traditionally failed to handle as well as the native API. This is the kind of thing that comes to mind when I say that Win32 don't always expose the full underlying functionality.
I am glad that you blogged about this issue and that CreateEventEx makes it easier to handle correctly.
I'm a bit disappointed that the DesiredAccess parameter of NtCreateEvent didn't bubble up to Win32 until the 4th major revision of the OS, though.
Foolhardy, you're 100% right. I was quite surprised when I discovered this issue, and it is a total shame that the guys who defined the Win32 API got it so horribly wrong.
To be honest, I wonder why - this is literally the first time I've run into a Windows API that was introduced with Windows NT that did the wrong thing w.r.t. security functionality.
A bit unfair!
Friday's quiz contained some questionable programming techniques. To be fair, you did point out that they were intentional. But the answer to the quiz is that it didn't pass two internal MS tests, "some of which are not public"?
It's no wonder no one got the full answer. ;-)
Dan, the tools that identified the problem aren't public, it doesn't change the issue. The only difference here is that instead of catching the problem during the FSR (final security revew) as a part of threat model validation, it was caught by an automated tool before then.
Adrian, I just checked appendix C in the linked reference, it does't indicate that the assignment operator functions as a sequence points.
Unless you're creating a securable object while impersonating, you can safely assume that the default DACL you get if you supply NULL security attributes is not too permissive. Yes, it is technically possible to call SetTokenInformation(TokenDefaultDacl) and specify a weak DACL but that's just asking for trouble.
If all involved processes run as the same principal, using NULL security attributes is the right thing to do.
Note also that most unnamed objects (like events and mutexes) don't have security descriptors because they can't be opened by name. In this case it doesn't matter what DACL you specify.
Processes, threads and tokens don't have names but they still have security descriptors because they can be opened through other means. Processes and threads are typically created with NULL security attributes, and this is usually safe even if the caller is impersonating, because processes and threads get their default DACLs from the process token, not impersonation token.
I think the only case where using NULL security attributes is typically wrong is DuplicateTokenEx. This is because tokens should always have security descriptors that grant enough access to the token's principal, otherwise assigning this token to a thread or a process can lead to bizarre errors.
Maybe a LITTLE attention to spelling and grammer would be beneficial to all.