• #### It rather involved being on the other side of this airtight hatchway: Open access to the application directory

A security vulnerability report arrived claiming that the Program X installer was insecure because it loaded a DLL (let's call it HAHA.DLL) from the current directory, thereby being susceptible to a current directory attack. (Other terms for this type of attack are DLL planting and DLL side-loading.)

The vendors who were responsible for Program X forwarded the report to Microsoft because their program never loaded HAHA.DLL directly; it was being loaded by a system component.

The first order of business was to verify that it was actually a DLL planting vulnerability. And it wasn't. It was an application directory attack, not a current directory attack. It turns out that a lot of purported DLL planting vulnerability reports are actually application directory attacks. DLLs in the application directory take priority over system DLLs because the directory is the Windows equivalent of what on the Mac is called an application bundle.¹ Which only serves to highlight the importance of securing your application directory.

In the original report, Program X was in a directory called something like \\server\software\install, which was filled with setup programs for various applications. As a result, all of the programs were soaking in the same hot-tub.

When this issue was pointed out to the vendors of Program X, they responded, "No, this is still a bug. You need to add HAHA.DLL to the KnownDlls list so that it cannot be overridden by the application directory."

The KnownDlls list is not a security feature. It is a performance feature. The fact that KnownDlls overrides the application directory is a side-effect of its implementation (namely, to avoid directory searching for popular DLLs), and it is arguably a bug, since it breaks contractual behavior: The application directory no longer takes precedence over the system directory. The Application Compatibility folks spend a lot of time studying the KnownDlls list to make sure that the DLLs in there are ones that no properly-functioning application should be trying to override with a local copy.

Even if HAHA.DLL were added to the KnownDlls list, that does not guarantee that it will always be loaded from the system directory. If somebody can attack your application directory, then they can drop a DLL redirection manifest into the directory or use DotLocal DLL redirection, both of which also override KnownDlls. (Observe that both of these attacks require write access to the application directory.)

The application directory is your safety bubble. If you let anybody into your safety bubble, then it isn't very safe any more.

In the parlance of airtight hatchways: Granting open write access to your application directory is equivalent to leaving open the door to your airtight hatchway.

¹ I used to say simply "The directory is the application bundle", but I'm now forced to use the much more awkward formulation because at least one person thought I was talking about Windows Store application bundles.

• #### The case of the auto-hide taskbar

A customer reported that their taskbar would sometimes spontaneously go into auto-hide mode. What made this particularly insidious was that they had deployed a group policy to prevent users from changing the auto-hide state (because they never wanted the taskbar to auto-hide), so when the taskbar went into auto-hide mode, there was no way to get it out of that mode!

The customer's first investigation was to find out where the auto-hide state was recorded. A little bit of registry spelunking (because as far as these people are concerned, everything is in the registry) showed that a single bit in the StuckRects2 registry value controlled the auto-hide setting. They used Process Monitor and observed that it was Explorer that was updating the value. That was as far as they could troubleshoot the problem, so they came to the Windows team for further guidance.

It turns out that watching the registry value get updated doesn't tell you anything interesting. Explorer always writes that value when you log off, and the value written is the taskbar's current auto-hide state. The real culprit is the person who changed the taskbar's state, causing Explorer to save the updated state at logoff. And that culprit is somebody who called SHAppBarMessage with the ABM_SETSTATE parameter, in order to turn on the ABS_AUTOHIDE bit.

I warned you many years ago that the auto-hide and always-on-top states are user settings, and programs should modify them only under the instructions of the user.

The support technician was able to put together an instrumented version of the SHAppBarMessage function that logged any attempt to put the taskbar into auto-hide mode. (This step took a little while because the first attempt wasn't quite right and ended up not working.)

A few days later, the support technician reported back: The culprit was found with his hand in the cookie jar! One of the applications the customer was using was indeed calling SHAppBarMessage with the ABM_SETSTATE parameter, and passing the ABS_AUTOHIDE flag to make the taskbar auto-hide, and the application never called the function again to restore it back to normal. Result: Taskbar goes to auto-hide and stays there.

Mind you, even if the application remembered to set the auto-hide flag back to its original value, that still wouldn't have been the correct solution. Suppose two programs did this.

bool fWasTaskbarAutoHide;

OnStartup()
{
    fWasTaskbarAutoHide = GetTaskbarAutoHideState();
    SetTaskbarAutoHide(true);
}

OnExit()
{
    SetTaskbarAutoHide(fWasTaskbarAutoHide);
}


The user first runs the other program, which remembers that the taskbar is not auto-hide, then sets it to auto-hide. Now the user runs your program, which remembers that the taskbar is auto-hide, and then sets the taskbar (redundantly) to auto-hide. The user exits the first program, which sets the taskbar to normal. Now the taskbar is in normal state even though your program wants it to be auto-hide. Finally, your program exits, and it "restores" the taskbar to auto-hide.

This is just another case of using a global setting to solve a local problem. The local solution is to create a fullscreen window, and the taskbar will get out of the way automatically.

The customer went to the online support forum for the program that was setting the taskbar to auto-hide and forgetting to restore it. And, how about that, there was a thread in that forum called something like "After I run Program X, my taskbar gets set to auto-hide."

Bonus reading: How your taskbar auto-hide settings can keep getting overwritten. It seems this problem happens a lot. This is the sort of problem you get when you decide to expose a user setting programmatically: Applications start messing with the setting when they shouldn't.

• #### Sometimes people can be so helpless: Finding the owner of a Web page

Internal to Microsoft are thousands of Web sites. This is a story about one of them.

On an internal discussion list, somebody asked

We just created a new Flurb. Does anyone know how to get listed on http://internalsite/newflurbs?

I hadn't heard of that site before, but I checked it out. Neat, it's basically a blog which announces new Flurbs. I can see how somebody would want their Flurb to be listed there. I also saw lots of pieces of information on the page which the person appears not to have noticed. I replied,

Um, how about the Email link in the navigation bar? Or did you try that and it didn't work?

Also, each entry has the same name at the bottom. You could try contacting that person.

Just giving you some ideas on problem-solving techniques.

Pre-emptive snarky comment: "Hey, Raymond, where's your contact link?" I was forced to disable the contact link some time ago because it was being used primarily to send spam. Finding my email address is left as an exercise. (It's not hard.)

• #### Why is LOCALE_SDURATION so dorky-looking?

For formatting time spans, you can use the LOCALE_SDURATION format string, but the result is a dorky hh:mm:ss.ffff format. Why isn't there a LOCALE_SLONGDURATION format that is fancier like hh hours, mm minutes, and ss.ffff seconds?

You have the complexities of natural language to thank.

In the general case, there is not enough information to provide the appropriate grammatical context in order to know the correct format. This isn't a big deal in English, since English words typically do not inflect for case (pronouns and genitive being the most commonly-encountered exceptions), but in many other languages, choosing the exact form of the word "hours" depends on grammatical information that cannot be captured in a simple call to GetLocaleInfo.

For example, if you wanted to say "Last modified hh hours, mm minutes, and ss.ffff seconds ago", the word "hours" would need one form, whereas if you had wanted to say "Active for hh hours, mm minutes, and ss.ffff seconds", the word "hours" would need a different form. Some languages have quite a large number of grammatical cases (I'm looking at you, Finnish), and expressing all of this programmatically in a uniform way across all languages is impractical. The preposition since might take the accusative case in one language, but the genitive in another.¹

And we haven't even gotten into the crazy world of singular/plural/dual/paucal, or whether zero is singular or plural.

The language folks may have realized that they didn't want to dig themselves into a hole like they did with genitive months.

¹ And then there's German, where some prepositions take multiple cases depending on context. Consider, for example, the preposition unter, meaning under.

| Sentence | Case | Translation | Context |
|---|---|---|---|
| Wir laufen unter die Brücke. | Accusative | We run under the bridge. | We start outside the bridge, go under it, then stay underneath. [Thanks to Piotr and Axel for the correction.] (The path takes us under a bridge.) |
| Wir laufen unter der Brücke. | Dative | We run under the bridge. | We stay under the bridge the whole time. (It's raining, so we are doing our running exercise under a bridge in order to stay dry.) |

I've internalized the rule for deciding which case to use, so much so that it's hard for me to explain it, but I'll try anyway. If the preposition applies throughout the entire activity, you use the dative. But if the point of the sentence is that the situation changed from "not applicable" to "applicable" (in our example, from "not under" to "under"), then use the accusative. This is usually described in grammar books as change of position or motion toward a goal.

• #### Microspeak: Landing, especially the heated kind

Work on Windows occurs in several different branches of the source code, and changes in one branch propagate to other branches. When talking about a feature or other task becoming visible in a branch, the preferred jargon word at Microsoft is landing. In its purest form:

We expect the feature to land in the trunk early next week.

The term land when used in this way is typically used to describe a feature arriving in a branch different from its home branch.

From this basic meaning, extended usages arise.

The term landing is often accompanied by additional aviation adjectives to describe how smoothly the feature will arrive or the task will be completed. In these extended usages, the location of landing is often the feature's home branch.

We're coming in for a hard landing on bugs.

A hard landing is one that is rather inelegant. An example would be a feature that arrives fully functional but rather unpolished, or in the above example, that the bugs are all resolved, but perhaps with more bugs marked Won't fix or Postponed than management would have liked.

The feature is going to land hot.

A feature with a hot landing barely makes its deadline. You can also say that a feature is coming in hot if it is headed for a hot landing, and a feature is running hot if its current trajectory suggests that it's going to land hot.

The last thing that came in hot was Feature X and we did not land it well.

I like the above citation because it employs both metaphors.

We did not have a good process in place for managing the specs that came in hot.
Several deployments are coming in hot due to other resource commitments.

More generally, something is coming in hot if it is running right up against a deadline and is at risk of being late.

• #### Where is this CRC that is allegedly invalid on my hard drive?

If you're unlucky, your I/O operation will fail with ERROR_CRC, whose description is "Data error (cyclic redundancy check)." Where does NTFS keep this CRC, what is it checking, and how can you access the value to try to repair the data?

Actually, NTFS does none of that stuff. The CRC error you're getting is coming from the hard drive itself. Hard drives nowadays are pretty complicated beasts. They don't just plop data down and suck it back. They have error-checking codes, silent block remapping, on-board caching, sector size virtualization, all sorts of craziness.

What's actually happening is that the file system asks the hard drive to read some data, and instead of handing data back, the hard drive reports, "Sorry, I couldn't read it back because of a CRC error." NTFS itself doesn't do any CRC checking.

"Well, that's awfully misleading. If NTFS is reporting a CRC error, then that makes the user think that NTFS is maintaining CRCs. Shouldn't it just report 'general I/O error' instead of a more specific error?"

NTFS is just bubbling up the BIOS-reported hard drive error codes, and those error codes are returned all the way back to the application. Who knows, maybe the end-user knows enough about drive technology that they can tell the difference between a CRC error and a seek error. (For example, a seek error may be fixed by removing the floppy disk and reinserting it, or by recalibrating.)

What about the converse? If an I/O operation completes successfully, does that provide metaphysical certitude that the data read back exactly matches the data that was originally written?

No. It only provides metaphysical certitude that the hard drive reported that the data read back exactly matches the data that was originally written, as far as it could tell.

Generally speaking, upper layers of a system trust that a lower layer is functioning properly (and often they have no way of detecting a malfunction in the lower layer, anyway). If the hard drive says that it read the data successfully, well, the hard drive is the expert at this sort of thing, so who are we to say, "Nuh uh, I think you're wrong"?

• #### It rather involved being on the other side of this airtight hatchway: Disabling Safe DLL searching

The Microsoft Vulnerability Research team discovered a potential current directory attack in a third party program. The vendor, however, turned around and forwarded the report to the Microsoft Security Response Center:

Our investigation suggests that this issue is due to a bug in Microsoft system DLLs rather than our program. When a process is launched, for example, when the user double-clicks the icon in Explorer, a new process object is created, and the DLLs are loaded by a component known as the Loader. The Loader locates the DLLs, maps them into memory, and then calls the DllMain function for each of the modules. It appears that some Microsoft DLLs obtain DLLs from the current directory and are therefore susceptible to a current directory attack. We created a simple Win32 application which demonstrates the issue:

#include <windows.h>

int __cdecl main(int argc, char **argv)
{
    return MessageBox(NULL, "Test", "Test", MB_OK);
}


If you place a fake copy of DWMAPI.DLL in the same directory as the application, then the Loader will use that fake copy instead of the system one.

This technique can be used to attack many popular programs. For example, placing a fake copy of DWMAPI.DLL in the C:\Program Files\Internet Explorer directory allows it to be injected into Internet Explorer. Placing the file in the C:\Program Files\Adobe\Reader 9.0\Reader directory allows it to be injected into Adobe Reader.

(I like how the report begins with some exposition.)

The vendor appears to have confused two directories, the current directory and the application directory. They start out talking about a current directory attack, but when the money sentence arrives, they talk about placing the rogue DLL "in the same directory as the application," which makes this not a current directory attack but an application directory attack.

We saw some time ago that the directory is the application bundle, and the application bundle can override DLLs in the system directory. Again, this is just another illustration of the importance of securing your application directory.

The specific attacks listed at the end of the report require writing into C:\Program Files, but in order to drop your rogue DWMAPI.DLL file into that directory, you need to have administrative privileges in the first place.

In other words, in order to attack the system, you first need to get on the other side of the airtight hatchway.

There was one final attempt to salvage this bogus vulnerability report:

We can also reproduce the problem without requiring write access to the Program Files directory by disabling Safe DLL searching.

Nice try. In order to disable Safe DLL searching, you need to have administrator privileges, so you're already on the other side of the airtight hatchway. And if you elevate to administrator and disable safe DLL searching, then is it any surprise that you have unsafe DLL searching? This is just another case of If you set up an insecure system, don't be surprised that there's a security vulnerability.

• #### Why don't elevated processes inherit their environment variables from their non-elevated parent?

As a general rule, child processes inherit the environment of their parent. But if the parent is non-elevated and the child is elevated, then this inheritance does not happen. Why not?

There are two answers to this question. For the kernel-colored glasses answer, I defer to Chris Jackson, the App Compat Guy. It's interesting to see how it all works, but it doesn't explain why the mechanism was designed to block environment variable inheritance.

The reason for the design is that allowing an elevated process to inherit the PATH from a non-elevated process creates an attack vector.

The non-elevated process sets its PATH to put some attacker-controlled directories ahead of the directories the elevated application actually expects. For example, suppose the elevated application links to C:\Program Files\Common Files\Contoso\ContosoGridControl.dll. It arranges for this by setting the system PATH to include the C:\Program Files\Common Files\Contoso directory. Or maybe the program calls LoadLibrary on a DLL that might not exist, and it handles the case that the call fails by disabling some optional feature. (Whether this is a good idea or not is beside the point.)

The attacker changes the PATH to read \\rogue\server;C:\Program Files\Common Files\Contoso, so that the library search finds the evil copy on the rogue server before finding the expected version in the Common Files directory (or in the case of a DLL that may not exist, it finds the evil copy on the rogue server instead of failing outright).

Bingo, the attacker has injected arbitrary code into an elevated process. Game over.

If the environment and current directory were inherited, then malware could ask to elevate Program X with a custom current directory or environment. The user will merely be asked if they want to run Program X elevated, unaware that it is being run in a nonstandard manner, using an execution environment that did not receive administrator approval. As a result, the malware would be able to sneak into the administrator account under sheep's clothing (the sheep being Program X).

What if you want to run another program elevated, and with a custom current directory or environment?

Write a wrapper program which sets the current directory and environment, then launches the desired target process. Then ask the user for permission to run the wrapper elevated.

• #### I didn't go to //build/ in San Francisco, but I'll be at RAMP in Budapest

Larry went to //build/, but I didn't. On the other hand, I will be at RAMP in Budapest. I will be presenting (in English) on the evolution of Windows, specifically on the lessons learned over the first two decades of Windows that led to the design of WinRT, the Windows Runtime.

Although the conference has sold out, you can register for free to view the sessions online via live streaming. I'm on at 12:15 (Budapest local time) on July 12. It's the last session before lunch, so everybody will be hungry and anxious for my talk to be over.

To whet your appetite, here's a screen shot from my presentation:

 File Edit View Search Run Watch Options Calls Help
┌─┤∙├──────────────────────────local───────────────────────────┤↑├─┬─┤∙├reg┤↑├─┐
│                                                                  │ AX = 001F │
│                                                                  │ BX = 0949 │
│                                                                  │ CX = 0FC0 │
├─┤∙├──────────────source1 CS:IP hello.c (ACTIVE)──────────────┤↑├─┤ DX = 0A00 │
│1: #include                                                      ↑│ SP = 0FC0 │
│2:                                                               ▒│ BP = 0000 │
│3: int FAR PASCAL WinMain(HINSTANCE hInstance,                   ▒│ SI = 0000 │
│4: HINSTANCE hPrevInstance,                                      ▒│ DI = 0CFB │
│5: LPSTR lpCmdLine,                                              █│ DS = 1827 │
│6: int nCmdShow)                                                 ▒│ ES = 0000 │
│7: {                                                             ▒│ SS = 1827 │
│8: return MessageBox(NULL,                                       ▒│ CS = 1027 │
│9: "Hello, world!",                                              ▒│ IP = 01A8 │
│10: "My first program",                                          ▒│ FL = 2200 │
│←▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒→↓│           │
┌─┤∙├─────────────────────────command──────────────────────────┤↑├─┤NV UP EI PL│
│CV2207 Message: Loaded symbols for C:\PROGRAM\HELLO.EXE           │NZ NA PO NC│
│CV1053 Warning: TOOLS.INI not found                               │           │
│>                                                                 │           │
│                                                                  │           │
└──────────────────────────────────────────────────────────────────┴───────────┘

Although I am familiar with Hungarian notation, I know no Hungarian. I do know a good amount of German, and I hope that plus English will be enough to let me carry out some simple transactions.

Update: I'm told the recording will be available on InfoQ at no charge, but I have no details beyond that.

• #### You can read as well as I can, or maybe not

Occasionally, somebody will ask for help on a distribution list, and it turns into a really annoying case of hand-holding.

From: X

I'm using the XYZ toolset to do some document management, and I want the server to run a script whenever somebody tries to modify the master template, so it can run validations before accepting the update, such as verifying that the person making the change has received the proper approvals. Is that possible?

It turns out that this is something the XYZ toolset already knows how to do.

From: Raymond

You can create a conditions configuration file which adds a condition that validates that the request satisfies whatever conditions you require.

From: X

Yes, that is what I am looking for. Where can I find information on how to write the validation script and how to implement it on the server?

From: Raymond

On http://xyztoolset/, go to Server setup, then Conditions.

Another colleague with a lot of experience with the XYZ toolset stepped in with some more useful advice:

From: A

Instead of developing scripts from scratch, you may want to start with the pre-written scripts that come with the XYZ toolset add-on pack. There are already modules for things like scanning the Description field for approval IDs. Note also that you may want to include some way of changing the rules dynamically as your processes change (for example, maybe one of the approvers goes on vacation and delegates approval authority to somebody else, or maybe your project goes into a "no approval necessary" phase) rather than just hard-coding the rules.

From: X

That's a good idea. Is there a way to easily disable a validation script?

From: Raymond

Um, you can just build this into your validation script.

if (File.Exists(@"\\project\admin\no_validation")) {
    return Validation.Passed;
}


From: A

Or you can have a magic word in the Description that disables validation. Features like this and the one Raymond describes are available in the add-on pack. Look in the Sentinel and DescriptionMatch modules.

From: X

I'm having trouble getting this working. The documentation says I should do something like

<condition file="$\path\to\file.ext" action="C:\path\to\validate.xyz" />

If I use a shared network path for my validation script, I get "access denied".

<condition file="$\Nosebleed\MasterTemplate.xml" action="\\project\admin\validate.xyz" />


If I use an internal path:

<condition file="$\Nosebleed\MasterTemplate.xml" action="$\Scripts\validate.xyz" />


I get "file not found". I added $\Scripts\validate.xyz to the document repository, so the server should be able to see it. Am I missing something obvious here?

(I like how this person just made up a feature, in this case, using a repository path as an action rather than a physical file path.)

From: Raymond

My psychic powers tell me that the account under which the server is running does not have access to \\project\admin\validate.xyz. And adding the validation script to the document repository allows the server to access it only if the server has an active (and up-to-date) workspace joined to the repository. Sure, the server has a copy of validate.xyz, but that copy is in the repository database. (Adding a file to the repository is more than just a "copy" operation.)

I would not be surprised if having the server also maintain a live workspace in itself is not a recommended practice.

From: X

I agree with your assessment of the "access denied" issue, but I really don't want the validate.xyz script to reside on an external share. Where is the best place to put the script? On the server or a share?

From: Raymond

The documentation for the XYZ add-on pack recommends putting the scripts on the server, accessible via a share.

From: X

But that's what I did, and the result was "access denied."

From: Raymond

No, that's not what you did. The recommendation is to put the scripts on the server (C:\Scripts\validate.xyz) and then share out your scripts (net share Scripts=C:\Scripts) so administrators can update them remotely.

From: X

I read the XYZ Toolkit documentation for conditions, and it says "Get an IT-managed share created." I suppose I need to contact the IT department to have that done. Correct?

At this point, I got tired of hand-holding.

From: Raymond

You can read just as well as I can.

Privately, I sent a message to A:

From: Raymond
To: A

You can read just as well as I can.

Actually, that statement is a lie.

My colleague "A" replied, "Yes, I thought that to myself when you used that line last week, too!"

