January, 2006

Larry Osterman's WebLog

Confessions of an Old Fogey
  • Larry Osterman's WebLog

    Blog Hiccups

    Btw, the entire blogs.msdn.com site was upgraded to a newer version of Community Server last night.  As a consequence, some things are somewhat confused right now.

    In general, the transition has been pretty seamless, but there are some quirks (like the extra blank lines in my top post).

    The good news is that the Telligent guys are working on the issues as they've been reported, and the issues should be resolved really quickly.

     

  • Larry Osterman's WebLog

    What's wrong with this code, part 17 - the answer

    Yesterday's post discussed a hypothetical API to retrieve data from the registry.  The security hole in the original code is that if the value in the registry is exactly 512 bytes long, the buffer isn't null terminated.  That means that the caller, who is expecting a null terminated string, won't always get a null terminated string.  As MSDN says:

    If the data has the REG_SZ, REG_MULTI_SZ or REG_EXPAND_SZ type, the string may not have been stored with the proper null-terminating characters. For example, if the string data is 12 characters and the buffer is larger than that, the function will add the null character and the size of the data returned is 13*sizeof(TCHAR) bytes. However, if the buffer is 12*sizeof(TCHAR) bytes, the data is stored successfully but does not include a terminating null. Therefore, even if the function returns ERROR_SUCCESS, the application should ensure that the string is properly terminated before using it; otherwise, it may overwrite a buffer. (Note that REG_MULTI_SZ strings should have two null-terminating characters, but the function only attempts to add one.)

     

    There's another, more subtle problem: the routine's parameters (in particular the lpszValue parameter) aren't SAL annotated.  This means that static analysis tools like Prefast can't correctly analyze the function.  So the developer fixed the security bug by ensuring that the returned string is null terminated.

    BOOL GetStringValueFromRegistry(HKEY KeyHandle,
                                    LPCWSTR ValueName,
                                    ULONG dwLen,
                                    __out_ecount(dwLen) LPWSTR lpszValue)
    {
        BOOL returnCode;
        WCHAR buffer[512];
        DWORD bufferSize = sizeof(buffer);
        DWORD valueType;
        returnCode = RegQueryValueExW(KeyHandle,
                                     ValueName,
                                     NULL,
                                     &valueType,
                                     (LPBYTE)buffer,
                                     &bufferSize) == ERROR_SUCCESS;

        if (returnCode) {
            /*
             ** Check we got the right type of data and not too much
             */

            if (bufferSize > dwLen * sizeof(WCHAR) ||
                bufferSize % sizeof(WCHAR) != 0 ||
                (valueType != REG_SZ &&
                 valueType != REG_EXPAND_SZ))
            {
                returnCode = FALSE;
            }
            else
            {
                /*
                 ** Copy back the data
                 */
                if (valueType == REG_EXPAND_SZ)
                {
                    DWORD requiredBufferSize;
                    lpszValue[0] = TEXT('\0');
                    requiredBufferSize = ExpandEnvironmentStringsW(buffer,
                        (LPWSTR)lpszValue,
                        dwLen);
                    if ((requiredBufferSize == 0) || (requiredBufferSize > dwLen))
                    {
                        returnCode = FALSE;
                    }
                }
                else
                {
                    DWORD cchString;
                    CopyMemory((PVOID)lpszValue,
                                buffer,
                                bufferSize);
                    cchString = bufferSize/ sizeof(WCHAR);
                    WinAssert(cchString < dwLen);
                    lpszValue[cchString-1] = TEXT('\0');
                }
            }
        }
        return returnCode;
    }

    Mea Culpas:  Raymond caught the fact that the function won't compile if you don't #define UNICODE because it doesn't explicitly call the W version of the RegQueryValueEx API.  He also noticed that the code doesn't check for failure in ExpandEnvironmentStringsW.  Both fixes are applied above. In the "not a bug, but weird" category, he noted that the function will never fill more than 256 characters in the output buffer, which needs to be clearly documented.

    One final mea culpa: When I originally wrote this code, I wanted to show off how the above fix was itself broken.  Unfortunately I can't, because I believe that this code is correct :).  The original code from which this example was taken was  broken but in rewriting this for publication, I inadvertently fixed the 2nd bug (the original code used other APIs than the APIs shown in this example).  I'm going to try to come up with a similar example using other APIs that will show the two step problem.

    Vassili Bourdo also caught the ExpandEnvironmentStrings issue.

    Kudos: Skywing found the root problem - that the length of the returned string isn't correctly checked and the string isn't correctly null terminated.

    Other comments:

    DanT questioned the security hole issue.  This is a security hole, but it requires another piece of code to call strcpy on the returned data.  But this is the root cause of that problem - if it had returned a null terminated string, then the other code wouldn't be a security hole.  That's how root cause analysis works - you find the root of the bug, not just the code that's in error.  In hindsight, the RegQueryValueEx API should have been fixed, but since that function was introduced in NT 3.1, it's too late to make such a sweeping change to the API - stuff WILL break if the fix is applied at that level.  That's why RegGetValue was introduced - it fixes the problem entirely.

  • Larry Osterman's WebLog

    What's wrong with this code, part 17

    Time for another "What's wrong with this code".  This time, it's an exercise in how a fix for a potential security problem has the potential to go horribly wrong.  This is a multi-part bug, so we'll start with the original code.

    We start the exercise with some really old code:

    BOOL GetStringValueFromRegistry(HKEY KeyHandle,
                        LPCWSTR ValueName,
                        ULONG dwLen,
                        LPWSTR lpszValue)
    {
        BOOL returnCode;
        WCHAR buffer[256];
        DWORD bufferSize = sizeof(buffer);
        DWORD valueType;
        returnCode = RegQueryValueEx(KeyHandle,
                                     ValueName,
                                     NULL,
                                     &valueType,
                                     (LPBYTE)buffer,
                                     &bufferSize) == ERROR_SUCCESS;

        if (returnCode) {
            /*
             ** Check we got the right type of data and not too much
             */

            if (bufferSize > dwLen * sizeof(WCHAR) ||
                (valueType != REG_SZ &&
                 valueType != REG_EXPAND_SZ))
            {
                returnCode = FALSE;
            }
            else
            {
                /*
                 ** Copy back the data
                 */
                if (valueType == REG_EXPAND_SZ)
                {
                    lpszValue[0] = TEXT('\0');
                    ExpandEnvironmentStringsW(buffer,
                                            (LPWSTR)lpszValue,
                                            dwLen);
                }
                else
                {
                    CopyMemory((PVOID)lpszValue,
                                (PVOID)buffer,
                                dwLen * sizeof(WCHAR));
                }
            }
        }
        return returnCode;
    }

    There's a security hole in this code, but it's not really obvious.  If you've been paying attention and it's blindingly obvious what's going on, please give others a chance :)

    As always, kudos and mea culpas on each step of the way.

     

  • Larry Osterman's WebLog

    A peek behind the beep

    Or rather, a peek behind system sounds.

    Windows 3.1 (I think - it might have been Win95) introduced the concept of "application events" to Windows (I prefer to call them system sounds).

    But how do these events actually work? It turns out that documentation of how the schema for Windows sounds works is relatively hard to find, but it's not very complicated.  This topic in MSDN covers defining application events (I call them system sounds), but it really doesn't describe the schema used to extend them.  This topic also talks about how to register events for your application.  But I'm not aware of any comprehensive description of what it takes to add a sound alias to Windows.

    There are 5 steps needed to define a new system sound.  You need to:

    1. Tell Windows about your application.
    2. Tell Windows about the application event (alias) for the sound.
    3. Author the .WAV file for your sound.
    4. Associate your WAV file with your application event
    5. Write your app to call PlaySound

     

    I'm not going to go into the authoring thingy, I have no idea how one would go about doing that :).  But I do know how to tell Windows about steps 1, 2, and 4.

    To let Windows know about the sound, you need to do two things.  The first is that you need to be sure that PlaySound can figure out what file is associated with the sound, the second is that you need to be able to tell the control panel the "name" of your sound. 

    In order to tell PlaySound and the control panel applet about your event, you need to define an "alias" for that event.  Once you've done that, simply insert a call to PlaySound specifying the SND_ALIAS flag, and you're done programming the event (that's #5).  But you still have to plumb in the support to map between the alias and the .WAV file that holds the sound, and that means going to the registry.

     

    System sounds live under the "AppEvents\Schemes\Apps" registry key under HKEY_CURRENT_USER.  From there, you can define the names of your sound aliases, set their display name, and otherwise control how the system manages your sounds.

    The Apps key has a pretty straightforward layout:  There are a number of keys under the Apps key, one for each application that's registered to play sounds.

            HKEY_CURRENT_USER\
                AppEvents\
                    Schemes\
                        Apps\
                            .Default\
                            MSNMSGR\

    The ".Default" key is reserved for use by Windows; nobody should ever add an alias to that section.  Instead, you should create a new key under "Apps" with the filename (no extension) of your executable.  So if Outlook wants to add its own sounds, it should create a key named "Outlook".  The default value for the application's key is the name that the sounds control panel applet will use for the application's section.

            HKEY_CURRENT_USER\
                AppEvents\
                    Schemes\
                        Apps\
                            .Default\
                            MSNMSGR\
                            <APPNAME>\ <default Value "Your app name">

    There's a significant drawback to doing this though - you can't support MUI builds (multiple languages on the same machine).  To enable MUI support, you should also create a value with the name DispFileName.  DispFileName should be a REG_SZ formatted as a shell indirect string.  mmsys.cpl will use SHLoadIndirectString to retrieve the actual value in a MUI safe form.  For Vista and beyond, there's a new registry API RegLoadMUIString that lets you do this directly from the registry.  You will also want to specify the SND_APPLICATION flag to PlaySound to tell it to look somewhere other than .Default.  When you specify SND_APPLICATION, PlaySound looks up the currently running executable name and uses that (minus path and extension) for the <appname>.

            HKEY_CURRENT_USER\
                AppEvents\
                    Schemes\
                        Apps\
                            .Default\
                            MSNMSGR\
                            <APPNAME>\ <default Value "Your app name">
                                VALUE: DispFileName, REG_SZ <value <shell path for your app name>>

    Phew, we're almost done.  All we need to do now is to add the actual sounds.  Under the app specific key, add a new key for each of your sounds - the name of the key matches the name of your alias.  Under the key with the alias, create yet another key named ".Current"; the default value of that key should be either a REG_SZ or a REG_EXPAND_SZ that contains the path of your sound.

            HKEY_CURRENT_USER\
                AppEvents\
                    Schemes\
                        Apps\
                            .Default\
                            MSNMSGR\
                            <APPNAME>\ <default Value "Your app name">
                                <Alias Name>\ <default Value: The display name for your alias>
                                    .Current <default Value "Path to your .WAV file">

    The sounds control panel applet will use the display name specified for your alias.
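
    As a rough sketch of steps 1, 2 and 4, the registration could look something like the following.  BuildSoundKeyPath and RegisterSound are hypothetical helper names (not part of any Windows API), and the application and alias names in the usage note are made up.  The Win32 half is only compiled on Windows, and for brevity it sets only the ".Current" path - it leaves out the display names (the default values on the application and alias keys) and DispFileName.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds
 *   AppEvents\Schemes\Apps\<appName>\<aliasName>\.Current
 * into out.  Returns 0 on success, -1 if the path does not fit. */
int BuildSoundKeyPath(const char *appName, const char *aliasName,
                      char *out, size_t cchOut)
{
    int written;

    assert(appName != NULL && aliasName != NULL && out != NULL);
    written = snprintf(out, cchOut,
                       "AppEvents\\Schemes\\Apps\\%s\\%s\\.Current",
                       appName, aliasName);
    return (written < 0 || (size_t)written >= cchOut) ? -1 : 0;
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: creates the .Current key for one alias under
 * HKEY_CURRENT_USER and stores the .WAV path as its default value.
 * RegCreateKeyEx creates any missing intermediate keys. */
BOOL RegisterSound(const char *appName, const char *aliasName,
                   const char *wavPath)
{
    char keyPath[512];
    HKEY key;
    LONG status;

    if (BuildSoundKeyPath(appName, aliasName, keyPath, sizeof(keyPath)) != 0)
    {
        return FALSE;
    }
    status = RegCreateKeyExA(HKEY_CURRENT_USER, keyPath, 0, NULL,
                             REG_OPTION_NON_VOLATILE, KEY_SET_VALUE,
                             NULL, &key, NULL);
    if (status != ERROR_SUCCESS)
    {
        return FALSE;
    }
    /* A NULL value name means the key's default value; the byte count
     * includes the terminating null. */
    status = RegSetValueExA(key, NULL, 0, REG_SZ,
                            (const BYTE *)wavPath,
                            (DWORD)strlen(wavPath) + 1);
    RegCloseKey(key);
    return status == ERROR_SUCCESS;
}
#endif
```

    With something like RegisterSound("MyApp", "MyAlert", "C:\\Sounds\\alert.wav") in place, the app (running as MyApp.exe) would then play the sound with a call along the lines of PlaySound(TEXT("MyAlert"), NULL, SND_ALIAS | SND_APPLICATION | SND_ASYNC).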

    That's about it - it's a smidge more complicated than the MSDN documentation makes it out to be, but...  Enjoy!

     

     

    Btw, if you look at a current XP or Vista machine, you'll find two registry keys under AppEvents, "EventLabels" and "Schemes".  The "EventLabels" key is an alternate mechanism to define the mapping between the event alias and the display name used for your sound in the control panel.

            HKEY_CURRENT_USER\
                AppEvents\
                    EventLabels\
                    Schemes

    Event Labels define the display name of the event.  It provides a flat list that maps between event aliases and their display names (similar to using the default value for the event listed above).  To add a new display name for a given alias, simply create a key with the name of your alias, and set the default value to the display name for the event.  Once again, for MUI builds, you can use the DispFileName trick to specify a resource that will be used for the name.

            HKEY_CURRENT_USER\
                AppEvents\
                    EventLabels\
                        CriticalBatteryAlarm <default value "Critical Battery Alarm">
                            VALUE: DispFileName, REG_SZ <value "@mmsys.cpl,-5827">
                    Schemes

  • Larry Osterman's WebLog

    Sisyphus 'R' US

    Lately I've felt like Sisyphus.  It's a natural part of the process of software engineering, but it doesn't change the feeling.

    In every software project, after all the new code's been written, focus shifts to resolving and removing the bug backlog.  Of course fixing bugs is an ongoing part of the process, but you've always got the temptation of new work to keep you going.  But on a given project, when you're done with the new work, you've got to fix the bugs before you can move on to the next project.  During this period, you live and breathe by your outstanding bug count, looking for the time when you hit the magical ZBB (Zero Bug Bounce).  This is, without a doubt my absolute favorite part of a project - I absolutely LOVE this phase (I think it's a hunter-gatherer thing).  It is unbelievably satisfying to find the 18 month old bug that's been wreaking havoc throughout the system (that happened last week).

    Having said that, while you're trying to get to ZBB, the testers in your organization are constantly finding new bugs.  The good news is that you're not writing new code, so you're hopefully not introducing new bugs (it happens, but bug fixes tend to have fewer bugs than the original code).  But there's a period of time when the incoming bug rate is higher than your bug fix rate.  You work your bottom off trying to reduce your bug backlog.  There are really three ways you can reduce your backlog.  First off, you look at bugs and eliminate the duplicates (I found 4 dups in my bug backlog on Friday, for example).  Second, you can investigate the bug and determine that the bug is actually in a different component and assign it to the new component.  And thirdly you can fix the code.

    All the while, while you're trying to reduce your backlog, the testers are still finding new bugs.  In addition, other developers in the organization are busy investigating their bugs, and as a result, they're reassigning their bugs to you.

    After a while, it all settles down - bugs get triaged appropriately, duplicates get weeded out of the database, the testers start running out of bugs to find (because you've fixed them), and eventually you start heading on the path towards shipping your product.

    But right now, it's Sisyphus time - my incoming rate is as high as or higher than the fix rate.  It doesn't seem to matter how many bugs I resolve per day, my outstanding bug count seems to go up.  The other day, I resolved ten bugs during the course of the day, and at the end of the day I had the same outstanding bug count as I did when I came in.

    I'm working as hard as I can, but against the outstanding bugs metric,  I'm not quite keeping up.  This can be an incredibly poisonous situation to be in, because you can't ever relent on the pace.  If you slack off at all, then your all important bug total will explode and you've got still more work to do.

    The good news is that it's temporary.  As I said, it settles down in a while, and the incoming rate will start to drop below the fix rate.

    Right now, on the other hand, I feel like that guy sitting there rolling the ball up the hill, never to see it hit the top.  I know I'll get it up there, and I'll be able to stand at the top knowing that I've really achieved something.

     

    So I go into the office every single weekday to push that dad-blamed rock up the hill again.  Who knows, maybe I'll make it over the top this time!

    Edit: Changed some text to remove some negativity.

  • Larry Osterman's WebLog

    Riffing on Raymond - Splay Trees

    Raymond's post today on splay trees (brief summary: splay trees are interesting, but when you do an in-order traversal, they degrade to a linked list) reminded me of some "fun" I had with the NT 3.1 browser.

    The browser service is mostly dead these days, but in the early 1990s, the idea of finding the names of all the computers in your workgroup was flat-out cool.  Starting with Lan Manager, there was a mechanism built into the system to allow enumerating all the servers in your domain.  Windows for Workgroups expanded on that capability to allow for browsing not only the servers in your workgroup but also the workgroups on your network.  All of it was very cool.

    The browser architecture was pretty straightforward.  There was a client piece and a server piece.  In the browser architecture, only certain machines functioned as active browser servers, these were the machines that collected server announcements from the network.  Clients talked to the browser servers, and retrieved the list of servers from that service.

    When I implemented the browser for NT 3.1, it became very clear that the list of "servers" was going to be quite large, since every NT 3.1 machine was also a server.  There could literally be thousands and thousands of computers in a given domain, so it was important to store the list of servers in the most efficient fashion possible.

    Since the NT filesystem team had built a robust splay tree package, and since the most common operation performed on the server was processing announcements from servers, it made perfect sense to use this splay tree package - after all, announcements from machines come in random order, so the tree structure should be pretty robust.

    One of the complications of the browser was that the browser had to support multiple network adapters and multiple network protocols.  A typical NT 3.1 client machine had two or three protocols running (typically TCP/IP and one of NetBEUI, IPX/SPX or XNS), the browser client had to retrieve the list of servers from the browser server for each network and merge the lists.  Since I had experience with the splay tree package, I figured it made sense to simply re-use the splay tree for the client.  To merge the lists, all I had to do was to retrieve the lists from each network and insert the entries into the splay tree that would eventually be returned to the user.

    The merge process was essentially:

        while (node = SplayTreeGetNextInorderNode(TransportSpecificSplayTree))
        {
            if (SplayTreeLookupNode(MergedSplayTree, node) == NULL)
            {
                SplayTreeInsertNode(MergedSplayTree, node);
            }
        }

    It sounded great on paper, and worked even better.  I was a totally happy camper...

     

    Until some point later on in the process, as we were digging deep into the browser's performance.  We were running nightly analysis runs to ensure that the browser was functioning correctly and we noticed that sometimes retrieving the list of servers would take a LONG time - sometimes several seconds, in fact, which was totally unacceptable.

    I started digging into the problem, and identified a number of problems, but I was quite surprised to realize that a significant part of the bottleneck was actually on the CLIENT.  I was really mystified about this, until I really started digging into the problem.  It turns out that the client would retrieve the list of servers and insert it into the splay tree.  Since the list of servers was maintained in a splay tree on the server, and retrieved by doing an in-order traversal of the tree, the list coming from the server was sorted alphabetically.  When I inserted all the entries in the tree, it formed a nice balanced tree (one of the characteristics of a splay tree is that it's resilient against in-order insertions).  The problem occurred when I tried to merge the list from other networks.

    If you've been paying attention you see the problem.  Since the transport specific splay tree is in-order, if the server represented by "node" is already on the list, this functions as an in-order traversal of the linked list.  If the lists are disjoint, this isn't a critical problem since we'll be doing a lot of insertions, which (as I mentioned) results in a balanced tree.

    But if the lists of servers are identical (or almost identical), then each insertion degrades the tree to a linear list.  That means that the nice balanced splay tree is a linked list, and the insertion algorithm listed above degrades into an O(n^3) traversal.  It really wasn't pretty.  And, of course, on the Microsoft corporate network we had thousands of computers in the same domain with the exact same set of network transports.  All of a sudden my clever insertion algorithm starts looking like the root cause of a lot of performance issues.

    The good news is that since all the transport specific lists were sorted in-order, I realized I could rely on this behavior and inserted a O(n) insertion sort, which sped up the client-side processing by an order of magnitude.
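
    The replacement code isn't shown here, so the following is only a sketch of the idea, under some assumptions: each transport's list arrives as an array of names that is already sorted and has no duplicates within itself, ordering is plain strcmp (real computer names would likely need a case-insensitive compare), and MergeSortedServerLists is a made-up name.  It walks both lists once, which is where the O(n) comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Merges two ascending, duplicate-free arrays of server names into out,
 * keeping a single copy of any name that appears in both lists.
 * out must have room for countA + countB entries.
 * Returns the number of entries written to out. */
size_t MergeSortedServerLists(const char **listA, size_t countA,
                              const char **listB, size_t countB,
                              const char **out)
{
    size_t a = 0, b = 0, written = 0;

    assert(out != NULL);

    while (a < countA && b < countB)
    {
        int cmp = strcmp(listA[a], listB[b]);
        if (cmp < 0)
        {
            out[written++] = listA[a++];   /* only on transport A so far */
        }
        else if (cmp > 0)
        {
            out[written++] = listB[b++];   /* only on transport B so far */
        }
        else
        {
            out[written++] = listA[a++];   /* same server on both: keep one */
            b++;
        }
    }
    /* At most one of the lists still has entries left; copy them over. */
    while (a < countA)
    {
        out[written++] = listA[a++];
    }
    while (b < countB)
    {
        out[written++] = listB[b++];
    }
    return written;
}
```

    Each entry of each list is visited exactly once, so two nearly identical lists (the case that hurt the splay tree) cost no more than two disjoint ones.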

    The moral of the story: Understand the performance characteristics of your underlying data structures BEFORE you implement them, or you're going to be sorry.

     

    A side-note to readers unfamiliar with "O(<something>)": The O(n) is what's known as "Big-O" notation; it describes how the running time of an algorithm grows with the size of its input (n).

  • Larry Osterman's WebLog

    Nineteen years ago...

    Nineteen years ago today, on a crisp Saturday morning in Scarsdale, NY, Valorie Lynne Holden and Lawrence William Osterman were married.

    I find it astonishing when I realize that it has been nineteen years that we've been married, in all honesty, it doesn't seem like it's been nearly that long.

    Valorie truly is my other half (in truth, she does far more than her share).  She's there when I'm sick, she's there whenever I need her.  She schleps the kids around all the time, without complaint, keeps the household running, etc (all the stuff that I should do but don't).

    All the while maintaining a full course load in her quest to get the stupid piece of paper that will finally let her get a job as a teacher (July!).

    Valorie also doesn't like it when I make a public fuss over her, but on our anniversary, I honestly don't care - we have a very long history of excruciatingly painful public embarrassments of each other, starting way back in 1985 when I hired a madrigal singer to serenade her in one of her classes (and escalating from there - the last time was when Valorie had me arrested at a group meeting back on our 10th anniversary), consider this post as another step in the progression :)

    You stay together with someone for 24 years and you also develop your own little traditions - we stopped giving each other presents for our anniversary many years ago (it got to be a bit much with xmas, our anniversary, my birthday and valentines day in a 6 week period), instead we have "the card hunt".  I started it about 12 years ago or so on Valorie's birthday - instead of taping her cards to her presents, I hid the cards throughout the house.  Her "challenge" was to find the cards.  We've expanded the game to all "card giving" opportunities, so we're doing the card hunt thingy for our anniversary, birthdays, Mother's Day, Father's Day, etc.  Even the kids have gotten into the act, they want card hunts on their birthdays too :).  It's a smidge of a challenge at this point trying to find new places to hide the cards where they'll be discoverable (there's no point in hiding cards where the recipient can't find them), but what the heck.

    I love you dear, Happy Anniversary.

     

    Btw, the cards shouldn't be too hard to find this year, they're all in plain sight :)

  • Larry Osterman's WebLog

    Young Turks

    Ok, this is a bit of a rant.  I recently encountered an email exchange from someone I respect where the person in question asked (more-or-less) "I can't, for the life of me, see why on earth this particular piece of functionality exists in Windows".

    Now this person is somewhat younger than I (ok, most everyone in the industry is somewhat younger than I), but he is a super smart guy.

    The thing is, he has NO CLUE about how the personal computer world operated back in the early 80's when Windows was designed.  Windows was designed to run on machines with 512K of RAM, on machines with a 10M hard disk.  In addition, the CPU on which Windows was intended to run didn't support memory protection, so the concept of "separation of privilege" was meaningless.  MS-DOS (on which Windows 1.0 was built) had a long history of putting critical OS information into an application's data space.  For Windows, things were no different - the line between application and system was often blurred. 

    Whenever there was a possibility of offloading potentially optional functionality onto the running application, Windows took it.  Instead of having a preemptive scheduler, Windows used a cooperative scheduler.  That meant that applications never had to deal with ugly issues like synchronization of data, etc.  The consequence of this cooperation was that a single errant Windows application could hang all the running applications. 

    But that was ok, because the overhead of the infrastructure to FIX the problem (per-application message queues, etc) would have meant that Windows wouldn't be able to run on its target systems.  And adding all that extra stuff really wouldn't make that much of a difference since the applications were all running in the same address space (along with Windows and the operating system).

    So it's not surprising that there were a lot of things present in the early versions of Windows that would make people cringe today.  Sometimes this isn't a problem, but one of the key values of the Windows platform is that Microsoft very rarely intentionally breaks applications.  We'll break applications when the applications depend on a security flaw, and sometimes applications will break when there's a fundamental architectural shift occurring (we already know that some multimedia apps are broken in Vista because they depend on being able to call multimedia APIs during DLL initialization, which only worked by luck in XP).

    But barring that, Microsoft's made a strong commitment to not break customers' applications.  The good thing is that it means that the Windows platform is remarkably stable.  Many applications written for Windows 1.0 still run on Windows Vista.  It means that corporations that have made an investment in technology aren't going to lose that investment by moving to a newer version of Windows.  It also means that every version of Windows carries forward the designs from previous versions.

    If there was any "mistake" made, it was Microsoft's unceasing commitment to backwards compatibility.  And I personally believe that a huge part of the reason for Windows' success in the marketplace IS that commitment.  If we didn't have it, people would have moved onto other platforms long ago.

    So when someone starts questioning why ancient stuff exists in Windows, they really need to understand the environment in which those decisions were made.  Part of the value of being a young turk is that they challenge the decisions that were made by their elders.  But before you decide to challenge an earlier decision, you need to understand the environment in which the decision was made.  Sometimes what no longer makes sense did at one time.

     

     

    Btw, before people start claiming that this was somehow "Microsoft's" fault, the original Mac OS had many of the same issues, it was designed to run on a machine with 128K of RAM and didn't even HAVE a hard disk - it only supported a 400k floppy disk.  The designers of the Mac OS made many of the same decisions that the Windows designers did (Mac OS was also a cooperative multitasking environment), in addition, the Mac designers went even further and put significant parts of the OS into the system ROMs on the Mac, further blurring the lines between application and system.

  • Larry Osterman's WebLog

    My new "favorite" Win32 API

    Every once in a while, you discover a new Win32 API that you've never heard of.  The other day, one of the guys in my group sent an email extolling the virtues of a new Win32 API that was added for Windows Professional X64 edition and Windows Server 2003 SP1 (and of course Windows Vista).

    To read a value from the registry, historically you called RegQueryValueEx.  Unfortunately, the RegQueryValueEx API suffered from a number of fatal problems.  The biggest one was that it didn't adequately type check the data being returned - for example, if the registry contained a string value, it was possible that the data in the registry might not be null terminated, resulting in the following warning in the documentation:

    If the data has the REG_SZ, REG_MULTI_SZ or REG_EXPAND_SZ type, the string may not have been stored with the proper null-terminating characters. For example, if the string data is 12 characters and the buffer is larger than that, the function will add the null character and the size of the data returned is 13*sizeof(TCHAR) bytes. However, if the buffer is 12*sizeof(TCHAR) bytes, the data is stored successfully but does not include a terminating null. Therefore, even if the function returns ERROR_SUCCESS, the application should ensure that the string is properly terminated before using it; otherwise, it may overwrite a buffer. (Note that REG_MULTI_SZ strings should have two null-terminating characters, but the function only attempts to add one.)

    Unfortunately, many people didn't implement this logic correctly (it's quite hard to get this right for all cases).  In addition to the null termination issue, the caller needed to deal with ANY data type being returned - you had to add in checks to ensure that the type of data returned matched the type of data you expected.  The root cause of this is a "leaky abstraction" issue - the NT base registry API simply stores blobs of data with the type information maintained as metadata alongside the data being stored.  Thus when you retrieve a value from the registry, you get the data in the underlying store and the metadata back.  But there's no attempt at ensuring that the metadata matches the intent of the application because the intent of the application isn't known.

    So a new API, RegGetValue, was added to the Windows API set to resolve these issues.  I just converted a 50 line routine to use it; the entire 50 line routine turned into a one line call to RegGetValue.  Using RegGetValue, I was able to remove:

    • The code that checked the type of data in the registry
    • The logic to handle REG_EXPAND_SZ (it's automatically handled by RegGetValue)
    • Code to ensure null termination of the registry string.
    • Code to validate that the length of the registry string was "appropriate" (a multiple of 2).

    The bottom line was that I was able to remove a whole chunk of potentially buggy code and replace it with a single API call.  Heck, I didn't even need to open the registry key, since the RegGetValue API will even open and close the key for you (it opens the key for KEY_QUERY_VALUE if you care).
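
    To give a feel for what those bullets add up to, here's a rough sketch of the by-hand checking that a RegQueryValueEx caller had to do on the bytes that came back.  It's a portable illustration rather than the routine described above: ExtractRegistryString, kRegSz and kRegExpandSz are made-up names (the two constants just mirror the REG_SZ and REG_EXPAND_SZ values from winnt.h), it assumes the string is stored as little-endian UTF-16, and it makes no attempt to expand environment variables in a REG_EXPAND_SZ value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Mirror REG_SZ / REG_EXPAND_SZ from winnt.h so the sketch builds anywhere.
constexpr std::uint32_t kRegSz = 1;
constexpr std::uint32_t kRegExpandSz = 2;

// Hypothetical helper: validates the raw bytes a RegQueryValueEx-style call
// handed back and turns them into a string that is always safely terminated.
// Returns false if the data can't be treated as a registry string.
bool ExtractRegistryString(std::uint32_t type,
                           const std::vector<std::uint8_t>& data,
                           std::u16string& out)
{
    // 1. The stored type has to be one of the string types.
    if (type != kRegSz && type != kRegExpandSz) return false;

    // 2. The byte count has to be "appropriate" (a multiple of 2).
    if (data.size() % sizeof(char16_t) != 0) return false;

    // 3. Don't trust the stored terminator: stop at the first null if there
    //    is one, otherwise take everything that was returned.
    std::u16string text;
    for (std::size_t i = 0; i < data.size(); i += 2) {
        const char16_t ch = static_cast<char16_t>(data[i] | (data[i + 1] << 8));
        if (ch == u'\0') break;
        text.push_back(ch);
    }

    // std::u16string supplies its own terminator, so nothing can run off the end.
    assert(text.find(u'\0') == std::u16string::npos);
    out = text;
    return true;
}
```

    With RegGetValue the same intent is expressed through its flags parameter (RRF_RT_REG_SZ, for example, restricts the call to string data), which is why all of this collapses into a single call.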

  • Larry Osterman's WebLog

    What registry entries are needed to register a COM object.

    • 4 Comments
    So I'm finally done with the "COM activation" posts. One question that was asked during the series was (paraphrased) "Why on earth are you doing this".  The simple answer is: The registry keys used for COM are a great example of Cargo cult programming.  Everyone uses a template that they found in a previous COM object without really understanding what they need.  The thing is, for the vast majority of COM objects, there are only about half a dozen things that they'd ever have to worry about.  This is further complicated by the fact that Visual Studio's templates add a lot of stuff that isn't always required (because VS can't know what the right behavior is).

    A large part of the "cargo cult" nature of this is the fact that there is a bewildering set of registry settings that can be set for COM objects, and it's not clear which, if any, apply.  So I'm attempting to lay out a series of articles that can help people determine what they need to set.

    So we start at the beginning.  You've decided that you are going to author a COM object.  The reasons for it aren't important, you're writing a COM object.

    Well, before you even get started, you need the minimal COM object registration.

    Next, you need to determine if you need an APPID.  If you do, you need to add an APPID registry key to your COM object's registration.

    After thinking about the APPID, you need to determine if the interfaces supported by your object need to be marshaled.  You have two choices when marshaling interfaces:

    1. You can marshal your interface by using a proxy DLL.
    2. You can marshal your interfaces by using a typelib.

    In general, marshaling by a proxy DLL will be somewhat faster (because it doesn't have to parse the typelib to determine the marshaling semantics), and will more accurately match the exact interface semantics.

    On the other hand, marshaling by typelib allows you to easily implement automation and interoperate with VB6 and .Net languages.

    There's no right answer that fits everyone, you need to make the choice that's right for your object.  Once you've made that choice, you can either look at the registry changes needed to register a proxy DLL, or the registry changes needed to register a typelib.

    And finally, you need to decide if you need a progid for your COM object.  If you do, then you need to apply the registry changes for a PROGID.

    Now this article doesn't even come close to covering the myriad options available for COM objects.  But it covers all the cases I've hit over the past few years of (admittedly unsophisticated) COM use.

    Btw, one of the things I noticed while writing this series is that the MSDN samples for ActiveX controls, especially the Hello sample, do a great job of showing all the COM registrations needed for ActiveX controls, including all the registry changes needed for each of the pieces above.

  • Larry Osterman's WebLog

    COM registration of PROGIDs.

    • 4 Comments
    We're almost done with this series (phew). Up until now, everything I've talked about w.r.t. COM registration has been required for one scenario or another.  However, there's one other common aspect of COM registration that is a wonderful convenience (although not necessary except for some relatively rare circumstances).

    Of course, I'm talking about the progid.  The PROGID provides the ability to define a string alias for a particular COM object.  Thus with the PROGID, you can access a COM object without having to know its CLSID.  This can be quite handy, especially when you're working in languages that don't provide easy access to a GUID data type.  A PROGID is simply a string representation of the class. 

    By convention the PROGID has the form: <Program>.<Component>.<Version> and should be less than 39 characters in length.  There are a couple of other restrictions spelled out here.

    So what is the minimal set of registry keys needed for a progid?

    Well, they're:

    Key: HKEY_CLASSES_ROOT\<ProgID>
        Default Value: Friendly name for ProgID.  Should contain the version number.
    Key: HKEY_CLASSES_ROOT\<ProgID>\CLSID
        Default Value: CLSID of the object that matches this progid.

    There's an alternate form of the progid, known as the version independent progid, this is simply a progid without the <Version> part of the name.  It has the same format as the version specific progid:

    Key: HKEY_CLASSES_ROOT\<Version Independent ProgID>
        Default Value: Friendly name for ProgID.
    Key: HKEY_CLASSES_ROOT\<Version Independent ProgID>\CLSID
        Default Value: CLSID of the object that matches this progid.
    Key: HKEY_CLASSES_ROOT\<Version Independent ProgID>\CurVer
        Default Value: Version specific PROGID for this COM object.

    You don't have to specify the CurVer key if you don't have multiple versions, but it's probably a good idea.

    You also need to hook the progids back up to the CLSID key for your COM object:

    Key: HKEY_CLASSES_ROOT\CLSID\<Class ID>\ProgID
        Default Value: <PROGID> for this COM class.
    Key: HKEY_CLASSES_ROOT\CLSID\<Class ID>\VersionIndependentProgID
        Default Value: <Version Independent PROGID> for this COM class.

    Again, PROGIDs are nice-to-have, but in no way are they required.  But they can be quite convenient so...
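
    If you'd rather see the whole set in one place, here's a small sketch that just lists the entries for one class (handy if your installer needs an explicit list of what it writes).  It only builds strings and never touches the registry.  RegEntry and ProgIdEntries are names invented for this example, the same friendly name is reused for both progids to keep it short, and the last entry uses VersionIndependentProgID, which is the key name COM looks for under the CLSID key.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One registry write: a key path plus the data for that key's default value.
struct RegEntry {
    std::string key;
    std::string defaultValue;
};

// Hypothetical helper: lists the progid related entries for one COM class.
// progId is the version specific name, viProgId the version independent one.
std::vector<RegEntry> ProgIdEntries(const std::string& progId,
                                    const std::string& viProgId,
                                    const std::string& clsid,
                                    const std::string& friendlyName)
{
    assert(!progId.empty() && !viProgId.empty() && !clsid.empty());
    const std::string root = "HKEY_CLASSES_ROOT\\";
    return {
        // Version specific progid.
        {root + progId, friendlyName},
        {root + progId + "\\CLSID", clsid},
        // Version independent progid, pointing at the current version.
        {root + viProgId, friendlyName},
        {root + viProgId + "\\CLSID", clsid},
        {root + viProgId + "\\CurVer", progId},
        // Hook both progids back up to the CLSID key.
        {root + "CLSID\\" + clsid + "\\ProgID", progId},
        {root + "CLSID\\" + clsid + "\\VersionIndependentProgID", viProgId},
    };
}
```

    Calling it with a progid of "Contoso.Widget.1" and a version independent progid of "Contoso.Widget" (both made up) gives seven entries, one for each key listed above.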

     

  • Larry Osterman's WebLog

    COM registration if you need a typelib

    • 8 Comments
    The problem with the previous examples I posted on minimal COM object registration is that they don't always work.  As I mentioned, if you follow the rules specified, while your COM object will work just fine from Win32 applications, you'll have problems if you attempt to access it from a managed environment (either an app running under the CLR or another managed environment such as the VB6 runtime or the scripting host).

    For those environments, you need to have a typelib.  Since typelibs were designed primarily for interoperating with Visual Basic, they don't provide full access to the functionality that's available via MIDL (for instance, unnamed unions get turned into named unions, the MIDL boolean type isn't supported, etc), but if you gotta interoperate, you gotta interoperate.

    So you've followed the examples listed here and you've registered your COM object, now how do you hook it up to the system?

    First, you could call the RegisterTypeLib function, which will perform the registration, but that would be cheating :)  More importantly, there are lots of situations where it's inappropriate to use RegisterTypeLib - for instance, if you're building an app that needs to be installed, you need to enumerate all the registry manipulations done by your application so they can be undone.

    So if you want to register a typelib, it's a smidge more complicated than registering a COM component or interface.

    To register a typelib, you need (from here):

    Key: HKEY_CLASSES_ROOT\Typelib\<LibID>\
    Key: HKEY_CLASSES_ROOT\Typelib\<LibID>\<major version>.<minor version>\   
        Default Value: <friendly name for the library> Again, not really required, but nice for oleview
    Key: HKEY_CLASSES_ROOT\Typelib\<LibID>\<major version>.<minor version>\HELPDIR   
        Default Value: <Directory that contains the help file for the type library>
    Key: HKEY_CLASSES_ROOT\Typelib\<LibID>\<major version>.<minor version>\FLAGS   
        Default Value: Flags for the ICreateTypeLib::SetLibFlags call (typically 0)
    Key: HKEY_CLASSES_ROOT\Typelib\<LibID>\<major version>.<minor version>\<LCID for library>
    Key: HKEY_CLASSES_ROOT\Typelib\<LibID>\<major version>.<minor version>\<LCID>\<Platform>
        Default Value: <File name that contains the typelib>

    Notes:

    If your typelib isn't locale-specific, you can specify 0 for the LCID.  Looking at my system, that's typically what most apps do.

    <Platform> can be win32, win64 or win16 depending on the platform of the binary.
     

    But this isn't quite enough to get the typelib hooked up  - the system still doesn't know how to get access to the type library.  To do that, you need to enhance your CLSID registration to let COM know that there's a typelib available.  With the typelib, a managed environment can synthesize all the interfaces associated with a class.  To do that, you enhance the class registration:

    Key: HKEY_CLASSES_ROOT\CLSID\<CLSID>\TypeLib = <LibID>

    But we're still not quite done.  For each of the interfaces in the typelib, you can let the system do the marshaling of the interface for you without having to specify a proxy library.  To do this, you let the universal marshaler do the work.  The universal marshaler has a clsid of {00020424-0000-0000-C000-000000000046}, so instead of using the interface registration mentioned in the last article, you can replace it with:

    Key: HKEY_CLASSES_ROOT\Interface\<IID>\
        Default Value: <friendly name for the interface> Again, not really required, but nice for oleview
    Key: HKEY_CLASSES_ROOT\Interface\<IID>\ProxyStubClsid32\
        Default Value: {00020424-0000-0000-C000-000000000046}
    Key: HKEY_CLASSES_ROOT\Interface\<IID>\TypeLib\
        Default Value: <LibID>

    Now instead of using the proxy code in a proxy DLL, the system will do the marshaling for you.
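
    As a quick illustration of how the pieces above fit together, here's a sketch that builds the key holding the typelib's file name and lists the per-interface entries that point at the universal marshaler.  Nothing here writes to the registry, and RegEntry, TypeLibFileKey and TypeLibMarshaledInterface are names made up for the example.  The version and LCID are passed as ready-made strings, so how you format them is left to the caller.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One registry write: a key path plus the data for that key's default value.
struct RegEntry {
    std::string key;
    std::string defaultValue;
};

// CLSID of the universal marshaler, as quoted in the text.
const char kUniversalMarshaler[] = "{00020424-0000-0000-C000-000000000046}";

// Hypothetical helper: the key whose default value is the typelib's file name.
// version is "<major>.<minor>", lcid is "0" for a library that isn't locale
// specific, and platform is "win32", "win64" or "win16".
std::string TypeLibFileKey(const std::string& libId,
                           const std::string& version,
                           const std::string& lcid,
                           const std::string& platform)
{
    assert(platform == "win32" || platform == "win64" || platform == "win16");
    return "HKEY_CLASSES_ROOT\\Typelib\\" + libId + "\\" + version +
           "\\" + lcid + "\\" + platform;
}

// Hypothetical helper: the entries that let the system marshal one interface
// from its typelib instead of from a proxy DLL.
std::vector<RegEntry> TypeLibMarshaledInterface(const std::string& iid,
                                                const std::string& friendlyName,
                                                const std::string& libId)
{
    const std::string base = "HKEY_CLASSES_ROOT\\Interface\\" + iid;
    return {
        {base, friendlyName},                               // optional, nice for oleview
        {base + "\\ProxyStubClsid32", kUniversalMarshaler}, // who does the marshaling
        {base + "\\TypeLib", libId},                        // where the type info lives
    };
}
```

    You'd call TypeLibMarshaledInterface once for each interface in the typelib; the only thing that changes from interface to interface is the IID and the friendly name.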

    Next: Ok, but what if I don't want to deal with all those ugly GUID thingies?

  • Larry Osterman's WebLog

    COM registration for cross process access

    • 13 Comments

    Yesterday I posted a minimal COM registration.  But it had some serious issues.  Among them: the COM objects couldn't be used cross process, and they couldn't be used from an STA application unless the object aggregated the free threaded marshaller.

    So what if you want to go cross-process?  Well, in order to go cross process, you need to be able to know how to marshal the parameters for your interfaces.  COM knows how to marshal the standard interfaces (like IClassFactory, IUnknown, etc) but most objects need more than just those interfaces.

    There are basically two ways of letting COM know about your interfaces.  The first is by using a typelib, the second is by using a proxy DLL.  If you don't need to worry about COM interop or interaction with older scripting architectures (like VB6 or script), then using the proxy DLL is unquestionably the way to go - when structures are mapped to a typelib, there is a slight loss of fidelity which can cause "issues".  As an example of the loss of fidelity, typelibs can't contain unnamed unions, while C allows them.  Thus if an interface attempts to marshal a structure containing an unnamed union, the typelib will replace the unnamed union with a named union.  Normally this isn't a problem, the actual structure data doesn't change, but it means that the definitions don't round-trip.

    I'm not going to discuss typelibs currently (they're the next post in this mini-series), this time I want to talk about using proxy DLLs for your interface marshaling.

    A proxy DLL is simply a DLL that contains the logic needed to marshal your interfaces.  To build one, follow the examples in MSDN here.  There are lots of options when building proxy DLLs, personally I prefer to merge the proxy DLL with an existing DLL (it just seems cleaner), that's a bit trickier, but not too hard (if you define the REGISTER_PROXY_DLL definition, then the _p file generated uses a hard coded name of DllRegisterServer, you need to define the ENTRY_PREFIX macro to rename the built-in name, etc).  The macros to make that stuff work are described here.

    Once you've gotten the proxy definitions for your interfaces built, you need to let COM know about them.  When COM realizes it has to marshal a COM object, it starts looking for information to let it know how to marshal its interfaces.  First it checks to see if the object supports IMarshal, to let the object do custom marshaling.  If that doesn't work, it starts looking elsewhere.  One of the first places it looks is HKCR\Interface, to see if it's been explicitly told how to marshal the interface with that IID.

    First, you need to come up with a GUID for the proxy DLL, uuidgen can come up with one quickly.  And you need to let COM know about it (using the minimal set of registrations mentioned in the other article):

    Key: HKEY_CLASSES_ROOT\CLSID\<PS Factory GUID>\
        Default Value: <MyInterfaceName>_PSFactory    // Again, not needed, but convenient
    Key: HKEY_CLASSES_ROOT\CLSID\<PS Factory GUID>\InProcServer32\
        Default Value: <Proxy Server DLL>
     

    Next, you want to register the interfaces and let COM know how to find your proxy DLL.  Add the following to the registry for each of your interfaces:

    Key: HKEY_CLASSES_ROOT\Interface\<IID>\
        Default Value: <friendly name for the interface> Again, not really required, but nice for oleview
    Key: HKEY_CLASSES_ROOT\Interface\<IID>\ProxyStubClsid32\
        Default Value: <Proxy Stub CLSID>

    And with that, you're done.  COM can now marshal your custom interfaces across process boundaries (or apartment boundaries).  Once again, 2 keys for each interface, plus 2 keys for the proxy DLL, which is a fair amount less than some of the stuff I've seen in the registry.

    Next, what if you want to interoperate with VB or .Net?

  • Larry Osterman's WebLog

    Minimal COM object registration

    • 14 Comments
    Yesterday I mentioned the APPID registry key.  I also mentioned the effort going on within Microsoft to reduce redundant COM related registry keys.

    We use about a dozen COM objects internally within the audio subsystem of Vista, and after talking to the COM guys about what we REALLY need to specify, we've settled on the following for registration of the objects.  I believe it's the minimal set of registry values you need to have a fully functional COM object:

    Key: HKEY_CLASSES_ROOT\CLSID\<clsid>\
        Default Value: "Friendly Name for the class"         not actually required, but useful for tools like oleview and dcomcnfg
    Key: HKEY_CLASSES_ROOT\CLSID\<clsid>\InprocServer32\
        Default Value: REG_EXPAND_SZ: <DLL Name>
        Value "ThreadingModel" REG_SZ "both"

    That's it - two keys and a single non default value.
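
    Spelled out as data, that registration looks something like the sketch below.  It only describes what would be written, it doesn't call any registry API, and RegValue and MinimalInprocRegistration are names invented for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One value to write.  An empty name means the key's default value.
struct RegValue {
    std::string key;
    std::string name;
    std::string type;   // "REG_SZ" or "REG_EXPAND_SZ"
    std::string data;
};

// Hypothetical helper: the minimal in-proc registration for one COM class.
std::vector<RegValue> MinimalInprocRegistration(const std::string& clsid,
                                                const std::string& friendlyName,
                                                const std::string& dllName)
{
    assert(!clsid.empty() && !dllName.empty());
    const std::string classKey = "HKEY_CLASSES_ROOT\\CLSID\\" + clsid;
    const std::string serverKey = classKey + "\\InprocServer32";
    return {
        // Not actually required, but useful for tools like oleview.
        {classKey, "", "REG_SZ", friendlyName},
        // Where the DLL lives.
        {serverKey, "", "REG_EXPAND_SZ", dllName},
        // The single non default value.
        {serverKey, "ThreadingModel", "REG_SZ", "both"},
    };
}
```

    Three values across two keys; anything beyond that is there to buy you one of the features described next.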

    Of course, by specifying such a restricted set of keys, you lose a lot of functionality.  For instance, since there's no PROGID specified, you can't call CLSIDFromProgID or activate the object via name (such as using the szProgID version of the CComPtrBase<>::CoCreateInstance method).  In addition, without additional information, you can't marshal the object cross process.  It's also not accessible from script, or from the CLR (because there's no typelib registered).  And finally, while the object is marked with a threading model of "both", it can't be used from the STA unless the COM object aggregates the FreeThreadedMarshaller.

    But if all you need is to be able to have an application activate your COM object in-proc, this is all you need.

    Next: What do you need to add if you need your object to work cross-process?

    Edit: Added note about aggregating the free threaded marshaller.

  • Larry Osterman's WebLog

    Silly little bit of Broadway trivia I didn't know

    • 9 Comments

    As I mentioned, my mom, Daniel and I went to see "Dirty Rotten Scoundrels" on B'way the other day.

    A great show, Daniel's first time in a real Broadway theater.  I do have to say that his jaw was slack during some of the jokes - it's a very earthy show.

     

    Anyway, one of the stars of the show is Joanna Gleason (along with Lord John Whorfin and Fiyero).  I've been a fan of Ms. Gleason's work since I saw her in "Into the Woods" (a retelling of Cinderella, Red Riding Hood, Jack and the Beanstalk, Rapunzel by Sondheim - it's out on video, and worth watching).

    Anyway, I was trolling through IMDB the other day and discovered that she's Monty Hall's (as in "Let's Make a Deal"'s Monty Hall) daughter.

    Who knew (ok, I'm sure that she did, but...)?

    Edit: Ok, so I can't punctuate, happy?

     

  • Larry Osterman's WebLog

    When do you need an APPID in your COM registration?

    • 3 Comments

    I recently received this question from a reader:

    When Does ATL component need to register a AppId for it to work( Our component is an IE plug-on)? There is a macro:DECLARE_REGISTRY_APPID_RESOURCEID. It will triger three registry entries for AppId by the default DllRegisterServer implementation from ATLDllModuleT(seems to me). Do we need all of them? I am try to remove all possible unused registry entries from setup file.

    It's actually a good question, because the rules associated with COM activation are not incredibly well documented.  Most people simply copy template examples from another working COM object and use those.  Every once in a while, when I'm bored, I scan through the registry on my machines looking at the various COM registrations - you can often tell which groups own which components simply by looking at the common items (especially misspellings and the like) in their COM registry.

    We've actually had a team of people in-house analyzing the COM registrations of the various COM objects in the system to ensure that there's no redundant or invalid information in them, it's been an eye opener (I was able to remove about 80% of the registry entries for the audio stack after chatting with the guy reporting these issues).

    So when DO you need an appid?  Well, for many cases, you don't, but ATL doesn't know that so by default it assumes that you will need one.

    You need to specify an appid if one of the following is true:

    1. Your COM object is hosted in a service
    2. Your COM object has custom launch permissions
    3. Your COM object has custom access permissions
    4. You want to specify a default bitness for the COM server when launching an out-of-proc COM object. 
    5. Your COM object runs as another user
    6. Your COM object needs to run on another system (unless you specify a COSERVERINFO in your call to CoCreateInstanceEx)
    7. Your COM object needs to set the default authentication level (used if you don't call CoInitializeSecurity directly)
    8. Your COM object needs to specify the RPC endpoints for communication (or the TCP port number, etc)
    9. Your COM object should run on the same machine as the storage (ActivateAtStorage)
    10. Your COM object runs using the DLL surrogate
    11. Your COM object needs to set its software restriction trust level

    If those are true, you need an appid, otherwise it's unnecessary (and takes up space on the disk).

    Essentially, if you ever need to specify one of the appid keys, you need to specify an appid.  If you don't need to specify those keys, then you don't need an appid.  From what I've seen, the vast majority of COM objects don't.

     

    The DECLARE_REGISTRY_APPID_RESOURCEID macro allows you to specify an RGS file that will be used to register the APPID for your COM object if you need one, but if you don't want ATL to register one, you can simply remove the macro.

  • Larry Osterman's WebLog

    User Experiences

    • 27 Comments

    I'm back!

    It was a restful, but uneventful holiday vacation, it was great seeing everyone in the family again, and Christmas wasn't that bad (we opened presents in 4 separate rounds with different groups of friends).

    I got the Deathstar that was my "big" Christmas present, I can't WAIT to start building it.

    I also got an iRiver H10 portable media player.  My first media player was an old Creative Nomad that never really worked, so I had a bit of trepidation about the iRiver.

    So far, I've been really happy with it, the device is a bit big for a 20G device (about the size of a Dell DJ or 1st generation iPod), but the sound quality is pretty good.  I loved the fact that it didn't come with any software, I just plugged it into my PC and it just worked.  It doesn't have the caressibility factor of an iPod (or my wife's Creative Zen), it's a more workmanlike device, but I'm happy with it.

    Having said that, I wanted to write about the OOBE (out-of-box experience).  The iRiver comes in a jet black box with a cut-out that allows you to look at the actual device.  Very nice IMHO, you can clearly see the size of the device, etc.

    My concerns started when I opened the box.

    To the left of the H10 was a plastic bag with the manual etc.  On the top of the plastic bag, plain to see was a bright orange piece of paper with:

    STOP!

    Having Trouble?

    Visit www.iriveramerica.com/support

    Before you return it...

    Contact iriver America.

    We Can Help

    What on EARTH were these guys thinking?  I just opened the box, and you present me with a big warning that tells me that I'm going to have problems with the device!  Stuff like this guarantees that I'm going to have low expectations of the device.  There was also another insert inside the plastic bag that said "Having sync problems, you may need a firmware update, see ....".

    Contrast this with the iPod experience.  You open the box and it unfolds like a flower showing you the device.  You just KNOW when you open it that it's just going to work and you're going to be happy with it.  The Creative Zen is also a pretty good OOBE (not as good as the iPod one but not bad).

    I'm sure that iRiver added this warning because they were concerned about excessive returns from people, but apparently they totally ignored the effect it would have on their OOBE.

    Sometimes I wonder (and despair) about the lack of attention that people spend on their OOBE.  Just like the old cliche "You only get one chance to make a first impression", you only get one opportunity to run the OOBE.  The OOBE sets the tone for the consumer for the life of the product.  If you blow the OOBE, you've just made your life harder, since it will take the user longer to be delighted with your product.

    This applies across the board, btw, not just for software/hardware.  For instance, my mother, Daniel and I went to see Dirty Rotten Scoundrels in NYC (a great show, btw) over the vacation. Since the transit workers were on strike in the City, we walked from her apartment on the Upper West Side to the theater (30ish blocks).  On the way, we stopped off at a Starbucks to get a coffee (and take a quick break).  The Starbucks was crowded (as they often are), but I placed our orders, dropped coats off at the table and went back to get our drinks.  Somehow they'd managed to lose both Daniel's and my drinks (Mom had a drip coffee so there was no opportunity to mess it up).  My guess is that people just took our drinks because they didn't want to wait for theirs.

    Now think about this experience in terms of OOBE (or first-run).  If this had been my first time in a Starbucks, I'd never go back - they would have lost a customer for life because of their horrible service.  The same thing happens with software and hardware.  Your first experiences with a product set the stage for your subsequent uses, so it's utterly critical that you ensure not only that the user's first experience with your product delights them, but also that it leaves them WANTING to use your product.

    What could iRiver have done better?  Well, first off, they should have ensured that the devices sold had the most recent version of their firmware.  My device had version 1.5 of the firmware, their web site had version 2.6 something-or-other on it.  Given that they had several major revisions to the firmware, you'd think that they would be flashing the devices with current firmware.  Secondly, instead of the quick fix of adding paper inserts to the packaging, they should have investigated WHY people were returning the devices - was the problem that WMP was too hard to use?  Was it that the device had bugs?  Were the controls unintuitive?

    I can't stress how important it is to ensure that your customers have a wonderful experience from the instant that they open your box (or store door, or install your product, or...).  IMHO, this is a huge part of the reason that Apple dominates the music player industry - they spent the time to ensure that every aspect of the use of their devices delights, from the second you open the box to when you start listening to their product.

    A large part of what makes a product a great product is the attention to detail on every aspect of the experience, and that starts the very second you open the box (or turn the computer on).  Failure to realize this can cause your otherwise excellent product to fail at the marketplace.

     
