November, 2007

  • The Old New Thing

    Why are INI files deprecated in favor of the registry?

    • 139 Comments

    Welcome, Slashdot readers. Remember, this Web site is for entertainment purposes only.

    Why are INI files deprecated in favor of the registry? There were many problems with INI files.

    • INI files don't support Unicode. Even though there are Unicode functions of the private profile functions, they end up just writing ANSI text to the INI file. (There is a wacked out way you can create a Unicode INI file, but you have to step outside the API in order to do it.) This wasn't an issue in 16-bit Windows since 16-bit Windows didn't support Unicode either!
    • INI file security is not granular enough. Since it's just a file, any permissions you set are at the file level, not the key level. You can't say, "Anybody can modify this section, but that section can be modified only by administrators." This wasn't an issue in 16-bit Windows since 16-bit Windows didn't do security.
    • Multiple writers to an INI file can result in data loss. Consider two threads that are trying to update an INI file. If they are running simultaneously, you can get this:
      Thread 1Thread 2
      Read INI file
      Read INI file
      Write INI file + X
      Write INI file + Y
      Notice that thread 2's update to the INI file accidentally deleted the change made by thread 1. This wasn't a problem in 16-bit Windows since 16-bit Windows was co-operatively multi-tasked. As long as you didn't yield the CPU between the read and the write, you were safe because nobody else could run until you yielded.
    • INI files can suffer a denial of service. A program can open an INI file in exclusive mode and lock out everybody else. This is bad if the INI file was being used to hold security information, since it prevents anybody from seeing what those security settings are. This was also a problem in 16-bit Windows, but since there was no security in 16-bit Windows, a program that wanted to launch a denial of service attack on an INI file could just delete it!
    • INI files contain only strings. If you wanted to store binary data, you had to encode it somehow as a string.
    • Parsing an INI file is comparatively slow. Each time you read or write a value in an INI file, the file has to be loaded into memory and parsed. If you write three strings to an INI file, that INI file got loaded and parsed three times and got written out to disk three times. In 16-bit Windows, three consecutive INI file operations would result in only one parse and one write, because the operating system was co-operatively multi-tasked. When you accessed an INI file, it was parsed into memory and cached. The cache was flushed when you finally yielded CPU to another process.
    • Many programs open INI files and read them directly. This means that the INI file format is locked and cannot be extended. Even if you wanted to add security to INI files, you can't. What's more, many programs that parsed INI files were buggy, so in practice you couldn't store a string longer than about 70 characters in an INI file or you'd cause some other program to crash.
    • INI files are limited to 32KB in size.
    • The default location for INI files was the Windows directory! This definitely was bad for Windows NT since only administrators have write permission there.
    • INI files contain only two levels of structure. An INI file consists of sections, and each section consists of strings. You can't put sections inside other sections.
    • [Added 9am] Central administration of INI files is difficult. Since they can be anywhere in the system, a network administrator can't write a script that asks, "Is everybody using the latest version of Firefox?" They also can't deploy scripts that say "Set everybody's Firefox settings to XYZ and deny write access so they can't change them."

    The registry tried to address these concerns. You might argue whether these were valid concerns to begin with, but the Windows NT folks sure thought they were.

    Commenter TC notes that the pendulum has swung back to text configuration files, but this time, they're XML. This reopens many of the problems that INI files had, but you have the major advantage that nobody writes to XML configuration files; they only read from them. XML configuration files are not used to store user settings; they just contain information about the program itself. Let's look at those issues again.

    • XML files support Unicode.
    • XML file security is not granular enough. But since the XML configuration file is read-only, the primary objection is sidestepped. (But if you want only administrators to have permission to read specific parts of the XML, then you're in trouble.)
    • Since XML configuration files are read-only, you don't have to worry about multiple writers.
    • XML configuration files files can suffer a denial of service. You can still open them exclusively and lock out other processes.
    • XML files contain only strings. If you want to store binary data, you have to encode it somehow.
    • Parsing an XML file is comparatively slow. But since they're read-only, you can safely cache the parsed result, so you only need to parse once.
    • Programs parse XML files manually, but the XML format is already locked, so you couldn't extend it anyway even if you wanted to. Hopefully, those programs use a standard-conforming XML parser instead of rolling their own, but I wouldn't be surprised if people wrote their own custom XML parser that chokes on, say, processing instructions or strings longer than 70 characters.
    • XML files do not have a size limit.
    • XML files do not have a default location.
    • XML files have complex structure. Elements can contain other elements.

    XML manages to sidestep many of the problems that INI files have, but only if you promise only to read from them (and only if everybody agrees to use a standard-conforming parser), and if you don't require security granularity beyond the file level. Once you write to them, then a lot of the INI file problems return.

  • The Old New Thing

    Hotkeys involving the Windows logo key are reserved by the system

    • 77 Comments

    Hotkeys involving the Windows logo key are reserved by the system. New ones are added over time. For example, Windows 98 added Win+D to show the desktop. Windows 2000 added Win+U to call up the Utility Manager. Windows XP added Win+B to move keyboard focus to the taskbar notification area. And a whole bunch of new hotkeys were added in Windows Vista, such as Ctrl+Win which is used by speech recognition. (It's also a bit of a pun: "Control Windows", get it?)

    The fact that these hotkeys are reserved is hard to find. It's buried in a bullet point a third of the way down the Guidelines for Keyboard User Interface Design document. It's also highlighted in the WHQL keyboard requirements right in the section title: "Windows Logo Key Support (Reserved for Operating System Use)". But even if you didn't find the documentation (and I don't blame you), the history of Windows clearly indicates that new Windows logo hotkeys arrive on a pretty regular basis. If your program uses one of those hotkeys, there's a good chance it'll conflict with a future version of Windows.

    Why isn't this mentioned in the RegisterHotKey documentation? Perhaps it should. But historically, function documentation merely told what a function does, not how it should be used. How a function should be used is a value judgement, and it was traditionally not the role of function documentation to make value judgements or specify policy. If you go to a programming language's specification document, you'll find a discussion of what the language keywords do, but no guidance as to how you should use them. Like a language specification, function documentation historically limited itself to how something is done, not whether it ought to be done. Recommendations traditionally belong in overviews, white papers, and other guidance documents. After all, the CreateFile documentation doesn't say "Do not create application data in the My Documents folder." The CreateFile function doesn't care; its job is to create files. Whether a file is created in the correct location is a guideline or policy decision at a higher level.

    Note: I don't know whether that is still the position of the function documentation team, that it limit itself to the facts and avoid value judgements and guidance. My goal today is to draw attention to the guidance.

  • The Old New Thing

    VirtualLock only locks your memory into the working set

    • 60 Comments

    When you lock memory with VirtualLock it locks the memory into your process's working set. It doesn't mean that the memory will never be paged out. It just means that the memory won't be paged out as long as there is a thread executing in your process, because a process's working set need be present in memory only when the process is actually executing.

    (Earlier versions of the MSDN documentation used to say this more clearly. At some point, the text changed to say that it locked the memory physically rather than locking it into the working set. I don't know who changed it, but it was a step backwards.)

    The working set is the set of pages that the memory manager will keep in memory while your program is running, because it is the set of pages that the memory manager predicts your program accesses frequently, so keeping those pages in memory when your program is executing keeps your program running without taking a ridiculous number of page faults. (Of course, if the prediction is wrong, then you get a ridiculous number of page faults anyway.)

    Now look at the contrapositive: If all the threads in your process are blocked, then the working set rules do not apply since the working set is needed only when your process is executing, which it isn't. If all your threads are blocked, then the entire working set is eligible for being paged out.

    I've seen people use VirtualLock expecting that it prevents the memory from being written to the page file. But as you see from the discussion above, there is no such guarantee. (Besides, if the user hibernates the computer, all the pages that aren't in the page file are going to get written to the hibernation file, so they'll end up on disk one way or another.)

    Even if you've magically managed to prevent the data from being written to the page file, you're still vulnerable to another process calling ReadProcessMemory to suck the data out of your address space.

    If you really want to lock memory, you can grant your process the SeLockMemoryPrivilege privilege and use the AWE functions to allocate non-pageable memory. Mind you, this is generally considered to be an anti-social thing to do on a paging system. The AWE functions were designed for large database programs that want to manage their paging manually. And they still won't prevent somebody from using ReadProcessMemory to suck the data out of your address space.

    If you have relatively small chunks of sensitive data, the solution I've seen recommended is to use CryptProtectData and CryptUnprotectData. The encryption keys used by these functions are generated pseudo-randomly at boot time and are kept in kernel mode. (Therefore, nobody can ReadProcessMemory them, and they won't get captured by a user-mode crash dump.) Indeed, this is the mechanism that many components in Windows 2003 Server to reduce the exposure of sensitive information in the page file and hibernation file.

    Follow-up: I've been informed by the memory manager folks that the working set interpretation was overly conservative and that in practice, the memory that has been virtually locked won't be written to the pagefile. Of course, the other concerns still apply, so you still have to worry about the hibernation file and another process sucking the data out via ReadProcessMemory.

  • The Old New Thing

    What to do when the steering column is stuck and the ignition won't turn

    • 47 Comments

    One evening, I parked pointing downhill and like the book says, I turned my wheels to the right before parking. But I turned it a bit too far, because when I returned to the car and inserted the key into the ignition, the key wouldn't turn.

    The wrong thing to do is to force the key until it breaks.

    Here's the right thing to do:

    Take the steering wheel and turn it left and right. One direction will have a little more play than the other. Grab the wheel with your left hand and apply pressure turning it in the direction with more play while using your right hand to turn the key in the ignition. (Or, if you're like me and can't figure it out, just wobble the wheel back and forth until the key unsticks.)

    Note: I scheduled this item independently of its partner. By an amazing coincidence, both items warn against forcing something until it breaks.

  • The Old New Thing

    Psychic debugging: IP on heap

    • 43 Comments

    Somebody asked the shell team to look at this crash in a context menu shell extension.

    IP_ON_HEAP:  003996d0
    
    ChildEBP RetAddr
    00b2e1d8 68f79ca6 0x3996d0
    00b2e1f4 7713a7bd ATL::CWindowImplBaseT<
                               ATL::CWindow,ATL::CWinTraits<2147483648,0> >
                         ::StartWindowProc+0x43
    00b2e220 77134be0 USER32!InternalCallWinProc+0x23
    00b2e298 7713a967 USER32!UserCallWinProcCheckWow+0xe0
    ...
    
    eax=68f79c63 ebx=00000000 ecx=00cade10 edx=7770df14 esi=002796d0 edi=000603cc 
    eip=002796d0 esp=00cade4c ebp=00cade90 iopl=0         nv up ei pl nz na pe nc 
    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206 
    002796d0 c744240444bafb68 mov     dword ptr [esp+4],68fbba44
    

    You should be able to determine the cause instantly.

    I replied,

    This shell extension is using a non-DEP-aware version of ATL. They need to upgrade to ATL 8 or disable DEP.

    This was totally obvious to me, but the person who asked the question met it with stunned amazement. I guess the person forgot that older versions of ATL are notorious DEP violators. You see a DEP violation, you see that it's coming from ATL, and bingo, you have your answer. When DEP was first introduced, the base team sent out mail to the entire Windows division saying, "Okay, folks, we're turning it on. You're going to see a lot of application compatibility problems, especially this ATL one."

    Psychic powers sometimes just means having a good memory.

    Even if you forgot that information, it's still totally obvious once you look at the scenario and understand what it's trying to do.

    The fault is IP_ON_HEAP which is precisely what DEP protects against. The next question is why IP ended up on the heap. Was it a mistake or intentional?

    Look at the circumstances surrounding the faulting instruction again. The faulting instruction is the window procedure for a window, and the action is storing a constant into the stack. The symbols of the caller tell us that it's some code in ATL, and you can even go look up the source code yourself:

    template <class TBase, class TWinTraits>
    LRESULT CALLBACK CWindowImplBaseT< TBase, TWinTraits >
      ::StartWindowProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam) {
        CWindowImplBaseT< TBase, TWinTraits >* pThis =
                  (CWindowImplBaseT< TBase, TWinTraits >*)
                      _AtlWinModule.ExtractCreateWndData();
        pThis->m_hWnd = hWnd; 
        pThis->m_thunk.Init(pThis->GetWindowProc(), pThis); 
        WNDPROC pProc = pThis->m_thunk.GetWNDPROC(); 
        ::SetWindowLongPtr(hWnd, GWLP_WNDPROC, (LONG_PTR)pProc);
        return pProc(hWnd, uMsg, wParam, lParam);
    } 
    

    Is pProc corrupted and we're jumping to a random address on the heap? Or was this intentional?

    ATL is clearly generating code on the fly (the window procedure thunk), and it is in execution of the thunk that we encounter the DEP exception.

    Now, you didn't need to have the ATL source code to realize that this is what's going on. It is a very common pattern in framework libraries to put a C++ wrapper around window procedures. Since C++ functions have a hidden this parameter, the wrappers need to sneak that parameter in somehow, and one common technique is to generate some code on the fly that sets up the hidden this parameter before calling the C++ function. The value at [esp+4] is the window handle, something that can be recovered from the this pointer, so it's a handly thing to replace with this before jumping to the real C++ function.

    The address being stored as the this parameter is 68fbba44, which is inside the DLL in question. (You can tell this because the return address, which points to the ATL thunk code, is at 68f79ca6 which is in the same neighborhood as the mystery pointer.) Therefore, this is almost certainly an ATL thunk for a static C++ object.

    In other words, this is extremely unlikely be a jump to a random address. The code at the address looks too good. It's probably jumping there intentionally, and the fact that it's coming from a window procedure thunk confirms it.

    But our tale is not over yet. The plot thickens. We'll continue next time.

  • The Old New Thing

    Hidden gotcha: The command processor's AutoRun setting

    • 37 Comments

    If you type cmd /? at a command prompt, the command processor will spit out pages upon pages of strange geeky text. I'm not sure why the command processor folks decided to write documentation this way rather than the more traditional manner of putting it into MSDN or the online help. Maybe because that way they don't have to deal with annoying people like "editors" telling them that their documentation contains grammatical errors or is hard to understand.

    Anyway, buried deep in the text is this little gem:

    If /D was NOT specified on the command line, then when CMD.EXE starts, it
    looks for the following REG_SZ/REG_EXPAND_SZ registry variables, and if
    either or both are present, they are executed first.
    
        HKEY_LOCAL_MACHINE\Software\Microsoft\Command Processor\AutoRun
    
            and/or
    
        HKEY_CURRENT_USER\Software\Microsoft\Command Processor\AutoRun
    

    I sure hope there is some legitimate use for this setting, because the only time I see anybody mention it is when it caused them massive grief.

    I must be losing my mind, but I can't even write a stupid for command to parse the output of a command.

    C:\test>for /f "usebackq delims=" %i in (`dir /ahd/b`) do @echo %i
    

    When I run this command, I get

    RECYCLER
    System Volume Information
    WUTemp
    

    Yet when I type the command manually, I get completely different output!

    C:\test>dir /ahd/b
    test_hidden
    

    Have I gone completely bonkers?

    The original problem was actually much more bizarro because the command whose output the customer was trying to parse merely printed a strange error message, yet running the command manually generated the expected output.

    After an hour and a half of head-scratching, somebody suggested taking a look at the command processor's AutoRun setting, and lo and behold, it was set!

    C:\test>reg query "HKCU\Software\Microsoft\Command Processor" /v AutoRun
    
    ! REG.EXE VERSION 3.0
    
    HKEY_CURRENT_USER\Software\Microsoft\Command Processor
        AutoRun     REG_SZ  cd\
    

    The customer had no idea how that setting got there, but it explained everything. When the command processor ran the dir /ahd/b command as a child process (in order to parse its output), it first ran the AutoRun command, which changed the current directory to the drive's root. As a result, the dir /ahd/b produced a listing of the hidden subdirectories of the root directory rather than the hidden subdirectories of the C:\test directory.

    In the original formulation of the problem, the command the customer was trying to run looked for its configuration files in the current directory, and the cd\ in the AutoRun meant that the program looked for its configuration files in the root directory instead of the C:\test directory. Thus came the error message ("Configuration file not found") and the plea for help that was titled, "Why can't the XYZ command find a configuration file that's right there in front of it?"

    Like I said, I'm sure there must be some valid reason for the AutoRun setting, but I haven't yet found one. All I've seen is the havoc it plays.

  • The Old New Thing

    I'm going to keep trying on size fours until I find one that fits

    • 32 Comments

    Everybody has noticed by now the phenomenon of vanity sizing, wherein the size numbers printed on the tag shrink over time even though the clothes have stayed the same size. It's a phenomenon of our own doing. People want to remember themselves as the size they were when they were younger, and fashion designers are eager to please. I remember reading an article about this phenomenon a few years ago that quoted a woman searching for a dress as saying, "I'm going to keep trying on size fours until I find one that fits." She was a size four, gosh darn it.

    I know people who have to shop in the "Young Misses" section because the clothing in the "Women" section is too big. They haven't gotten any smaller, but the clothes have gotten larger. I used to wear a medium; now I have to get a small, and my bathroom scale will assure you that I'm definitely not getting smaller.

    The article suggests that sizes will stabilize thanks to online shopping. In the store, you can try on an article of clothing to see how it really fits; when buying online, you don't have the luxury. The size number had better be accurate or you're going to be returning it. At least that's the theory. I'm just waiting for the madness to end.

  • The Old New Thing

    Staying on top of things with timely updates in separator pages

    • 32 Comments
    Public Service Announcement
    This weekend marks the end of Daylight Saving Time in most parts of the United States, the first "return to standard time" cutoff under the new transition rules.

    For a month prior to the transition this past spring, the separator pages generated by the shared company printers contained a handy reminder that Daylight Saving Time began early this year.

    The reminder remained on the separator pages as late as the summer solstice.

    Maybe the people who design the separator pages didn't have anything more interesting to say, so they let the old message ride.

  • The Old New Thing

    You just have to accept that the file system can change

    • 31 Comments

    A customer who is writing some sort of code library wants to know how they should implement a function that determines whether a file exists. The usual way of doing this is by calling GetFileAttributes, but what they've found is that sometimes GetFileAttributes will report that a file exists, but when they get around to accessing the file, they get the error ERROR_DELETE_PENDING.

    The lesser question is what ERROR_DELETE_PENDING means. It means that somebody opened the file with FILE_SHARE_DELETE sharing, meaning that they don't mind if somebody deletes the file while they have it open. If the file is indeed deleted, then it goes into "delete pending" mode, at which point the file deletion physically occurs when the last handle is closed. But while it's in the "delete pending" state, you can't do much with it. The file is in limbo.

    You just have to be prepared for this sort of thing to happen. In a pre-emptively multi-tasking operating system, the file system can change at any time. If you want to prevent something from changing in the file system, you have to open a handle that denies whatever operation you want to prevent from happening. (For example, you can prevent a file from being deleted by opening it and not specifying FILE_SHARE_DELETE in your sharing mode.)

    The customer wanted to know how their "Does the file exist?" library function should behave. Should it try to open the file to see if it is in delete-pending state? If so, what should the function return? Should it say that the file exists? That it doesn't exist? Should they have their function return one of three values (Exists, Doesn't Exist, and Is In Funky Delete State) instead of a boolean?

    The answer is that any work you do to try to protect users from this weird state is not going to solve the problem because the file system can change at any time. If a program calls "Does the file exist?" and the file does exist, you will return true, and then during the execution of your return statement, your thread gets pre-empted and somebody else comes in and puts the file into the delete-pending state. Now what? Your library didn't protect the program from anything. It can still get the delete-pending error.

    Trying to do something to avoid the delete-pending state doesn't accomplish anything since the file can get into that state after you returned to the caller saying "It's all clear." In one of my messages, I wrote that it's like fixing a race condition by writing

    // check several times to try to avoid race condition where
    // g_fReady is set before g_Value is set
    if (g_fReady && g_fReady && g_fReady && g_fReady && g_fReady &&
        g_fReady && g_fReady && g_fReady && g_fReady && g_fReady &&
        g_fReady && g_fReady && g_fReady) { return g_Value; }
    

    The compiler folks saw this message and got a good chuckle out of it. One of them facetiously suggested that they add code to the compiler to detect this coding style and not optimize it away.

  • The Old New Thing

    Alternate theories on how Putin can retain power after his second term expires

    • 31 Comments

    Everybody has their own pet theory on how Vladimir Putin can retain power after the end of his second official term. I earlier wrote of one such theory: Change the constitution to permit a third term. In the intervening months, other theories have been put forth. Each one chips away at a different part of the statement that "the president can be elected to at most two consecutive terms":

    • Attack the word the president: Transfer all the powers of the presidency to the prime minister, then have Putin be the next prime minister. There is no limit on how many terms someone can serve as prime minister.
    • Attack the word elected: Since the prime minister is first in line should the president step down, have Putin become the next prime minister and install a crony as president. When the president resigns on his first day, Putin becomes president without being elected.
    • Attack the word two by changing the constitution, as noted already.
    • Attack the word consecutive: Similar to the attack on the word elected, and then point out that since there was an intervening president, Putin's next two terms will not be consecutive with the first two.

    There's also another method that merely plays by a different set of rules: Install a crony as president and pull the strings from backstage.

    As I noted in my earlier entry, I'm much more fascinated by the machinery being put into motion to arrange for Putin to retain power than in whether he actually does.

    On an unrelated note, Mark MacKinnon explains the magic phrase to use as a foreigner when pulled over arbitrarily by the police.

Page 1 of 3 (29 items) 123