I’ve decided to revert to storytelling mode today. So sit right back and you’ll hear a tale, a tale of my fateful trip (or click something useful- you’re the one deciding, and I’m the one typing to satisfy whatever inner demon has driven me to do this today)…
It began when someone found a bug in the static version of KMDF- the one almost nobody gets to use, and for which we have been trying to discontinue support. But people were still using it, and we had no clear migration path, and there was a bug. So we decided to begin testing it. Poor Neslihan got that task (well, at least it wasn’t me, this time). Soon we had added a static KMDF driver to our list of test drivers used in our daily automated testing.
Then, of course, we got the bugcheck. As often happens, I played “Johnny-on-the-spot” and (virtually speaking) leapt onto the remote debug session accompanying the report. Buffer overflow? In FxDriverEntry? But only on Itanium (I checked the other architectures that ran the same test, of course- no problems)? What in blazes is this about?
Well, the method is well documented, and not all that hard to understand- a “cookie” gets initialized, and a copy of it gets stored in the local stack frame in such a way that code overflowing buffers in the stack frame will overwrite the stored copy. On exit, code gets called that checks the stored copy to see if that happened. It’s a little more complicated than that, and I’m not going to be precise, because part of that complication is related to securing the method against the sort of people who like to exploit buffer overflows [so I’m not going to make the method clear for them- they can do their own research without my assistance]. The bugcheck analysis will even tell you the two values- expected and actual- making it easy for people who really think overflows are a bad idea to make a quick guess as to where the overflow came from.
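For anyone who prefers code to prose, here is a deliberately simplified, portable model of that idea. It is only a sketch: the "frame" is an ordinary byte array, and the names `kCookie`, `kBufferSize`, and `CopyAndCheck` are made up for illustration. The real mechanism is generated by the compiler and differs in the details I am glossing over.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of one stack frame: a 16-byte buffer followed by the
// stored copy of the cookie. Real frames are laid out by the compiler.
constexpr std::size_t kBufferSize = 16;
constexpr std::size_t kFrameSize = kBufferSize + sizeof(std::uint64_t);

// Hypothetical cookie value, invented for this sketch.
constexpr std::uint64_t kCookie = 0x1122334455667788ULL;

// Copies `length` bytes into the modeled frame (clamped to the frame size so
// the model itself stays well defined), then performs the exit check.
// Returns true if the stored cookie survived, false if it was overwritten.
bool CopyAndCheck(const unsigned char* input, std::size_t length)
{
    assert(input != nullptr);

    std::array<unsigned char, kFrameSize> frame{};
    // Entry: store a copy of the cookie just past the buffer.
    std::memcpy(frame.data() + kBufferSize, &kCookie, sizeof(kCookie));

    // The "unchecked" copy: anything beyond kBufferSize lands on the cookie.
    if (length > kFrameSize)
        length = kFrameSize;
    std::memcpy(frame.data(), input, length);

    // Exit: re-read the stored copy and compare it with the expected value.
    std::uint64_t stored = 0;
    std::memcpy(&stored, frame.data() + kBufferSize, sizeof(stored));
    return stored == kCookie;
}
```

A 16-byte input leaves the stored copy alone and the check passes. Anything longer tramples it and the check fails, which is the point at which the real thing bugchecks.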
But the circumstances seemed suspicious, so I went and read the source code- H’mm- there’s a default value the cookie is initialized to… Interesting- that’s the value we have- but it doesn’t match the expected value.
At which time, something I’ve known for years hit me like the proverbial sledgehammer- our entry code calls GsDriverEntry (which supports the stack probes inserted by the GS compiler switch, hence its name) when it finishes. GsDriverEntry initializes the cookie, which means that that was why the values didn’t match. Like the sword of Damocles, this had been hanging over our heads for years- the first time the compiler decided to do stack probes in our entry code, everything would break. Ouch…
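The ordering problem is easy to model in a few lines. This is a sketch under loud assumptions: both cookie values are invented, `InitializeCookie` is a stand-in for what GsDriverEntry does, and `ProtectedEntry` is a hypothetical routine whose "prologue" and "epilogue" are written out by hand rather than generated by the compiler.

```cpp
#include <cstdint>

// Hypothetical values; the real default and the real per-load value are
// chosen by the compiler's runtime support.
constexpr std::uint64_t kDefaultCookie = 0x0123456789ABCDEFULL;
constexpr std::uint64_t kFreshCookie   = 0x0F1E2D3C4B5A6978ULL;

std::uint64_t g_cookie = kDefaultCookie;

// Stand-in for the cookie initialization that GsDriverEntry performs.
void InitializeCookie() { g_cookie = kFreshCookie; }

// Models a protected entry routine. Returns true if the exit check passes.
bool ProtectedEntry(bool initializeBeforeEntry)
{
    g_cookie = kDefaultCookie;              // fresh image: cookie holds its default

    if (initializeBeforeEntry)
        InitializeCookie();                 // the fix: initialize right at the start

    const std::uint64_t stored = g_cookie;  // prologue: copy the cookie into the frame

    if (!initializeBeforeEntry)
        InitializeCookie();                 // the bug: cookie changes while the frame is live

    return stored == g_cookie;              // epilogue: compare the copy with the global
}
```

With late initialization the frame holds the default while the global holds the fresh value, which is exactly the "actual is the default, expected is something else" pattern from the bugcheck.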
I’ll leave a bunch of story out here- things got “interesting” at that point, but the problem eventually got solved, both for us and for other people facing similar issues, as well. Short answer is that everyone now initializes the cookie right at the start, just as should happen…
But a while back I had occasion to tweak the build process for the WdfVerifier WDK applet, and afterwards I ran it briefly to make sure I didn’t break anything in the process. Oh, joy of joys- NONE of the 1.9 client drivers are being properly identified. They are all identified as “Inbox”, which is what I do when the client identification method I described last year fails on me.
Already beginning to panic, I think- did someone change FxDriverEntry, and I didn’t even notice it? So I go to our source control system, and look at the change history. The most recent change is the fix for the problem with the stack probes (yes, it occurred on static, but what scared me then was that it could have happened to any version, because we weren’t doing things properly). But that just calls library routines to initialize said cookie… Oh, blazes, those routines must be calling an import I wasn’t accounting for!!! Why??? Because I NULL the IAT to force access violations, and I handle the exception by giving up on getting the true version, and fall back to calling it “Inbox”- which is, after all, exactly what I am seeing. Oh, well- so much for my having thoroughly considered all the test and product consequences of that change when it was made…
Well, easy enough to find out what that import might be. One nice trick Ilias showed me is that you can open any binary in WinDbg as a crash dump, and happily resolve symbols and disassemble code. So I pick a random driver on my dev box, and do so.
What, no imports? But, but--- hmm- it uses a fixed address (load of a 64-bit immediate value into RAX, since my dev box is an X64, thank you- you x86 dinosaurs can keep your wimpy processors). What’s with this?
The answer lies in wdm.h, of course- KeQueryTickCount turns out to always be a macro- on x86 and IA64, it accesses KeTickCount [which is an import, and if you followed my earlier tale, it’s clear adding an import to my existing hack-o-matic mechanism is a trivial task]- but on X64, it accesses an essentially hardcoded address in a “SharedUserData” area. This snippet is from wdm.h, so you can see for yourself…
#define KI_USER_SHARED_DATA 0xFFFFF78000000000UI64
#define SharedUserData ((KUSER_SHARED_DATA * const)KI_USER_SHARED_DATA)
#define SharedInterruptTime (KI_USER_SHARED_DATA + 0x8)
#define SharedSystemTime (KI_USER_SHARED_DATA + 0x14)
#define SharedTickCount (KI_USER_SHARED_DATA + 0x320)
#define KeQueryInterruptTime() *((volatile ULONG64 *)(SharedInterruptTime))
#define KeQuerySystemTime(CurrentCount) \
*((PULONG64)(CurrentCount)) = *((volatile ULONG64 *)(SharedSystemTime))
#define KeQueryTickCount(CurrentCount) \
*((PULONG64)(CurrentCount)) = *((volatile ULONG64 *)(SharedTickCount))
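As a quick check on the arithmetic in those defines, the tick count works out to a single fixed address. The sketch below restates the constants in portable C++ (the `k...` names and `IsKernelHalf` are mine, not from wdm.h). The kernel-half test assumes the usual 48-bit canonical address layout, where the top 17 bits of an upper-half address are all ones.

```cpp
#include <cstdint>

// Constants restated from the wdm.h snippet; the k-prefixed names are
// hypothetical and exist only for this sketch.
constexpr std::uint64_t kUserSharedData = 0xFFFFF78000000000ULL; // KI_USER_SHARED_DATA
constexpr std::uint64_t kInterruptTime  = kUserSharedData + 0x8;   // SharedInterruptTime
constexpr std::uint64_t kSystemTime     = kUserSharedData + 0x14;  // SharedSystemTime
constexpr std::uint64_t kTickCount      = kUserSharedData + 0x320; // SharedTickCount

static_assert(kTickCount == 0xFFFFF78000000320ULL,
              "the tick count lives at one fixed address");

// Assumes 48-bit canonical addressing: bits 63..47 are all ones for any
// address in the upper (kernel) half of the address space.
constexpr bool IsKernelHalf(std::uint64_t address)
{
    return (address >> 47) == 0x1FFFFULL;
}

static_assert(IsKernelHalf(kTickCount),
              "a user-mode read of this address is going to fault");
```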
My, my, my- this is a problem- no import I can just tweak to point into my code. On the plus side, at least the address can’t change (if it did, existing drivers would, after all, cease to work)- well, at least not easily, so I’ll save any remaining paranoia about that for another time. But I’m going to access violate trying to read that address in user mode (as the reams of experimental evidence I had just accumulated by noticing the problem in the first place will attest)…
So in reading about exception handling, a phrase happens to catch my eye- putting it in my words, it says that when continuing an exception, you can alter the context record supplied to you with the exception (and I know this process well, but that’s perhaps a story for a different time). Whoa- that must mean I can change the contents of the registers programmatically, and then tell it to repeat the failed instruction! Now, I can’t find that illustrated in any of the samples the material I’m reading points me to [nor could I find it in an internet search, although the latter was by no means exhaustive]. But that HAS to be what it means, right?
So I start coding- first the half dozen or so lines needed to handle the KeTickCount import. Then an exception filter [only for AMD64, of course]. The logic is my usual bit of precise work: If it is an access violation, and if it is a read, and it is a read of that exact address, and one of the integer registers in the context record has that same address, then change that register’s entry in the context record to the address of the proxy I’ve decided to keep in WdfVerifier, and tell the OS to repeat the failed instruction. I began with RAX, because after all, that’s what I had seen in my investigation, and it seemed the most likely place for a while, but I added the whole set since under the circumstances, it seemed unlikely to do any harm. Anything that didn’t match that exact pattern, and I gave up- just execute the handler, which does nothing. The attempt to run the driver entry code to extract the client version will fail, but WdfVerifier keeps running, and the machine itself is still quite safe from my hackery.
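That decision logic can be tried out without Windows at all. The sketch below is a portable stand-in, not the real filter: `MockFault`, `MockContext`, `Disposition`, and `RedirectTickCountRead` are hypothetical names, and the sixteen-slot register array merely plays the part of the integer registers in a real context record.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-ins for the parts of the exception record and context
// record that the decision depends on.
struct MockFault {
    bool isAccessViolation;
    bool isRead;
    std::uint64_t faultAddress;
};

struct MockContext {
    std::array<std::uint64_t, 16> integerRegisters; // plays the role of Rax, Rbx, ...
};

enum class Disposition { ExecuteHandler, ContinueExecution };

constexpr std::uint64_t kKnownTickCountAddress = 0xFFFFF78000000320ULL;

// If the fault is a read of the known address and some register holds that
// address, point the register at `proxy` and ask for the instruction to be
// retried. In every other case, give up and let the handler run.
Disposition RedirectTickCountRead(const MockFault& fault, MockContext& context,
                                  const std::int64_t& proxy)
{
    if (!fault.isAccessViolation || !fault.isRead ||
        fault.faultAddress != kKnownTickCountAddress)
        return Disposition::ExecuteHandler;

    for (std::uint64_t& reg : context.integerRegisters) {
        if (reg == kKnownTickCountAddress) {
            reg = static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(&proxy));
            return Disposition::ContinueExecution;
        }
    }
    return Disposition::ExecuteHandler;
}
```

Only the first matching register is rewritten, on the theory that the faulting instruction used one base register. A context where nothing matches is left untouched.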
Worked the first time, it did (not counting my usual compiles to get rid of typos). Problem solved. Total work time from start to finish- something like 5 or 6 hours- good enough- after all, I did have to do some research… Of course, there’s always somebody who can do it better, faster, and quicker- and they usually jump out of the woodwork if I start talking about things in that fashion, but it’s my story, so I’ll brag now and regret it later…
Exceptions do indeed rule! I love it when a plan comes together…
A glimpse now at handling the new import…
// The descriptors are an array of entries per module, each terminated with an all-0 entry.
// Each entry has an RVA for the module image name, and a second for a 0-terminated list of RVAs to the structures used for
// resolving names of exported entries from the module (in loading these get resolved to real addresses and are plugged into
// the loaded image's import address table in the same order). So we now know how to write an image loader if we need to...
// First, find the descriptor for the KMDF loader, and quit if it isn't there.
DWORD StringCopy = (DWORD) -1, WdfBind = (DWORD) -1, InitEvent = (DWORD) -1, TickCount = (DWORD) -1;
// Then look for the Bind and Unbind entry points. Case counts!
enum {OutdatedTechnology, HasBind, HasUnbind, OneCoolDriver};
BYTE TargetIdentification = OutdatedTechnology;
....
else if (IsKernel)
{
    if (0 == strcmp("RtlCopyUnicodeString", (PSTR) NameDescriptor.Name))
        StringCopy = ImportsIndex;
    else if (0 == strcmp("KeInitializeEvent", (PSTR) NameDescriptor.Name))
        InitEvent = ImportsIndex;
    else if (0 == strcmp("KeTickCount", (PSTR) NameDescriptor.Name))
        TickCount = ImportsIndex;
    ImportsIndex++;
}
.....
DWORD IATSize, RelocationsSize;
LONGLONG OurFakeTickCount = 0x8badf00ddeadbeefI64; // ersatz Tick count
PVOID* ImportAddresses = (PVOID*)
    ImageDirectoryEntryToDataEx(DriverImage, TRUE, IMAGE_DIRECTORY_ENTRY_IAT, &IATSize, &Unused);
PIMAGE_BASE_RELOCATION Relocations = (PIMAGE_BASE_RELOCATION)
    ImageDirectoryEntryToDataEx(DriverImage, TRUE, IMAGE_DIRECTORY_ENTRY_BASERELOC, &RelocationsSize, &Unused);
if (NULL == ImportAddresses)
{
    FreeLibrary((HMODULE) DriverImage);
    return true;
}
// Plug them in here, and we are good to go!
DWORD OldProtect;
if (!VirtualProtect(ImportAddresses, IATSize, PAGE_READWRITE, &OldProtect))
{
    FreeLibrary((HMODULE) DriverImage);
    return true;
}
memset(ImportAddresses, 0, IATSize);
ImportAddresses[StringCopy] = FakeOutStringCopy;
ImportAddresses[WdfBind] = CollectVersion;
if (InitEvent != (DWORD) -1)
    ImportAddresses[InitEvent] = FakeEventInit;
if (TickCount != (DWORD) -1)
    ImportAddresses[TickCount] = &OurFakeTickCount;
VirtualProtect(ImportAddresses, IATSize, OldProtect, &OldProtect);
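The find-the-index-then-plug-it-in pattern above can be modeled without a real image. In the excerpt, the InitEvent and TickCount slots are guarded against the "not found" sentinel while StringCopy and WdfBind are not; the elided code may well validate those two earlier, I am only assuming it does. The sketch below applies one guard uniformly. `kNotFound`, `FindImport`, and `PlugIn` are hypothetical helpers, and the vectors stand in for the import name list and the import address table.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical sentinel, playing the role of (DWORD) -1 in the excerpt.
constexpr std::size_t kNotFound = static_cast<std::size_t>(-1);

// Returns the position of `wanted` in the import name list, or kNotFound.
// The comparison is case sensitive, as with strcmp in the excerpt.
std::size_t FindImport(const std::vector<const char*>& names, const char* wanted)
{
    for (std::size_t i = 0; i < names.size(); ++i)
        if (0 == std::strcmp(names[i], wanted))
            return i;
    return kNotFound;
}

// Stores `replacement` in the simulated address table only when the import
// was actually found and the index is in range. Returns whether it did so.
bool PlugIn(std::vector<const void*>& table, std::size_t index, const void* replacement)
{
    if (index == kNotFound || index >= table.size())
        return false;
    table[index] = replacement;
    return true;
}
```

With every slot going through `PlugIn`, a driver that lacks one of the expected imports simply leaves its table entry zeroed instead of writing through a sentinel index.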
So for some other glimpses, this is one part of the other hackery (complete with my elementary-level annotations):
class AllKMDFDrivers
{
    MachineKey& Owner;
    KMDFDriverList& InstalledKMDFDrivers;
    LoadedDrivers& CurrentlyLoadedDrivers;
    LoaderDiagnosticsFlag LoaderFlag;
    String RuntimeVersion;
    DWORD Major, Minor, Build;
    bool ServiceUsesKMDFLoader(__in PCWSTR ServiceName, __in RegistryKey& Service, __out MyBindInfoAlias& Binding);
    static void FakeOutStringCopy(__in PVOID, __in PVOID); // Fake RtlCopyUnicodeString entry
    static void FakeEventInit(__in PVOID, __in ULONG, __in ULONG); // Fake KeInitializeEvent entry
    static ULONG CollectVersion(__out MyBindInfoAlias& BindingOut, __in PVOID, __in MyBindInfoAlias& BindingIn, __in PVOID);
#if defined(_AMD64_)
    static LONG Filterx64Exception(__in EXCEPTION_POINTERS* ExceptionInfo, __in LONGLONG& FakeTickCount);
#endif
While the rest of it looks like this (and again, this is partial)- yes, feel free to hate my stylistic indifference to the herd- perhaps I’ll get fired for this, and all can breathe a vast sigh of relief…
First, the code that calls the filter:
MangledDriverEntry CrashAndBurn = (MangledDriverEntry) ((PBYTE) DriverImage + EntryRva);
__try
{
    CrashAndBurn(Binding, &Binding);
}
#if !defined(_AMD64_)
__except(EXCEPTION_EXECUTE_HANDLER)
#else
__except(Filterx64Exception(GetExceptionInformation(), OurFakeTickCount))
#endif
{
}
and now our filter:
#if defined(_AMD64_)
/**********************************************************************************************************************************
LONG AllKMDFDrivers::Filterx64Exception(__in EXCEPTION_POINTERS* ExceptionInfo, __in LONGLONG& FakeTickCount)
Ahh, the joys of low-level intervention by the truly incorrigible! On AMD64, we will get an exception when the code tries to get
the tick count. This value resides at a known address (which cannot change because if it did, all existing drivers would then fail
to work- although I suppose someone could resort to what I am about to do- bring it on, I'll see if I can keep up with the absurd
arms race).
So, first, verify from the exception record that we are getting an access violation of some sort reading that known address. Then
see if one of the integer registers has that address (start with RAX, since that's where it currently is, and I bet it doesn't
change much). If it does, change it to point to the value given to this function (which resides in WdfVerifier, since I made it
a reference, and will let the compiler play enforcer), and then continue execution.
In all other cases, execute the handler, which will do nothing (meaning the version will continue to be unknown).
**********************************************************************************************************************************/
LONG AllKMDFDrivers::Filterx64Exception(__in EXCEPTION_POINTERS* ExceptionInfo, __in LONGLONG& FakeTickCount)
{
    // cf definition of SharedTickCount in wdm.h
    static const DWORD64 KnownTickCountAddress = 0xFFFFF78000000320UI64;
    if (NULL == ExceptionInfo || NULL == ExceptionInfo->ExceptionRecord || NULL == ExceptionInfo->ContextRecord)
        return EXCEPTION_EXECUTE_HANDLER;
    switch (ExceptionInfo->ExceptionRecord->ExceptionCode)
    {
    case EXCEPTION_ACCESS_VIOLATION:
    case EXCEPTION_IN_PAGE_ERROR: // Unlikely, but what the heck...
        // Need both the access type [0] and the faulting address [1]
        if (2 > ExceptionInfo->ExceptionRecord->NumberParameters)
            return EXCEPTION_EXECUTE_HANDLER;
        break;
    default:
        return EXCEPTION_EXECUTE_HANDLER;
    }
    // ExceptionFlags is a bit mask, so test the bit rather than compare the whole field
    if (0 != (EXCEPTION_NONCONTINUABLE & ExceptionInfo->ExceptionRecord->ExceptionFlags))
        return EXCEPTION_EXECUTE_HANDLER;
    if (EXCEPTION_READ_FAULT != ExceptionInfo->ExceptionRecord->ExceptionInformation[0] ||
        KnownTickCountAddress != ExceptionInfo->ExceptionRecord->ExceptionInformation[1])
        return EXCEPTION_EXECUTE_HANDLER;
    if (KnownTickCountAddress == ExceptionInfo->ContextRecord->Rax)
    {
        // Point the register at our proxy and have the OS retry the failed read
        ExceptionInfo->ContextRecord->Rax = (DWORD64) &FakeTickCount;
        return EXCEPTION_CONTINUE_EXECUTION;
    }
    else if (KnownTickCountAddress == ExceptionInfo->ContextRecord->Rbx)
The rest I leave as an exercise to the reader, having already put all 0.378 of you through the treadmill today…