October, 2007

Larry Osterman's WebLog

Confessions of an Old Fogey
  • Larry Osterman's WebLog

    What happens when audio rendering fails?


    Skywing sent me an email earlier today asking me essentially "Why doesn't Windows do a better job of handling the case where the default audio device goes away?"[1]

    It's a good question, and one that we've spent a lot of time working on (this isn't a new issue for Vista; it's been there since day one, not that it actually matters).

    For the vast majority of existing users, this isn't a problem - they only have one audio device on their machine, and thus it's a moot question - their default device almost never goes away.  On the other hand, looking forward, it's going to become a more common scenario because of the prevalence of Bluetooth audio hardware and USB headsets (for example, I'm currently listening to Internet radio over a pair of Bluetooth speakers I purchased the other day).


    The short answer to Skywing's question is: "It's the responsibility of the application to deal with handling errors".  The audio stack bubbles out the error to the application and lets it figure out how to deal with the problem.

    Which raises the next question: "How do you handle errors so that you can recover from them?"  It turns out that the answer to that question requires a bit of digging into the audio stack.


    As I've discussed in the past, there are four major mechanisms for accessing the audio functionality in Windows Vista.  They are:

    1. MME - the legacy MME APIs, including waveOutXxx, waveInXxx, and mixerXxx
    2. DirectSound/DirectShow
    3. Media Foundation (new in Vista)
    4. WASAPI (new in Vista)

    For the MME APIs, an application usually accesses the audio stack using the WAVE_MAPPER (or MIDI_MAPPER) pseudo-device.  The nice thing about the WAVE_MAPPER device is that it doesn't matter which device is the current device - it just uses the one chosen as the user's default device.  Alternatively, you can select a specific device: devices are numbered from 0 to <n>, and for appcompat reasons, starting in Windows Server 2003, device 0 is typically the user's default device (there were applications that hard coded device 0 for their output device, which caused issues when you had more than one audio device).

    For DirectShow and DirectSound, the call to the DirectSoundCreate API takes a GUID which represents the output or input device, or NULL to represent the default device.  In addition, it also provides a mechanism to address the default voice communications device (DSDEVID_DefaultVoicePlayback).

    For Media Foundation and WASAPI, you specify the specific audio endpoint on which you want to render or capture audio in the initialize call (MFCreateAudioRenderer for MF; for WASAPI, you activate an IAudioClient interface on the endpoint).

    In each case, once you start streaming to a device, the only mechanism in place when something goes wrong is that an API call fails.  That means that it's up to the application to figure out how to recover from any failure.

    For the wave APIs and DSound, you really don't have any choice but to detect the failure, close down your local resources, and restart streaming - the legacy APIs don't give you a good mechanism for determining the cause behind a streaming failure.

    For Media Foundation, MF generates events using its event generator mechanism to inform an application of interesting events; among the events that can be received is an event indicating that the audio device was removed.  There are other relevant events generated, including events that are generated when the audio service is stopped (which also stops streaming), when the mix format for the audio endpoint is changed, etc.

    For WASAPI, a WASAPI client can retrieve the IAudioSessionControl service from an IAudioClient object and can use IAudioSessionControl::RegisterAudioSessionNotification to register an IAudioSessionEvents interface.  The audio service will call the IAudioSessionEvents::OnSessionDisconnected method when it tears down an audio stream (all these notifications are also passed through to Media Foundation's event mechanism).

    In Windows Vista, there are six different disconnect events generated, and there's a specific set of recovery steps for each of them:

    Disconnect reason: DisconnectReasonDeviceRemoval
    Meaning: The device used to render the streams on this endpoint has been removed.
    Recovery steps: Stop the stream, re-enumerate the audio endpoints, and choose a new endpoint.  If your application is rendering to the default endpoint, just call IMMDeviceEnumerator::GetDefaultAudioEndpoint to determine the new default endpoint.

    Disconnect reason: DisconnectReasonServerShutdown
    Meaning: The audio service has been shut down.
    Recovery steps: Restart the audio service if possible, and inform the user of the problem - there's no easy way of recovering from this one automatically.

    Disconnect reason: DisconnectReasonFormatChanged
    Meaning: The mix format for the audio engine has changed.
    Recovery steps: Close any existing streams and reopen them in the new format.  Make sure that you rebuild your client-side audio graph to ensure that you are generating output in the new mix format.

    Disconnect reason: DisconnectReasonSessionLogoff
    Meaning: The user has logged off the terminal services session in which the audio session was running.
    Recovery steps: Close any existing streams.  It's highly unlikely that this notification will be seen, since the operating system tears down the processes for a user when that user logs off.

    Disconnect reason: DisconnectReasonSessionDisconnected
    Meaning: The user was streaming audio to the console and a TS client connected, indicating that the server should redirect audio to the client (or vice versa).
    Recovery steps: Treat this event like you would a DisconnectReasonDeviceRemoval event.

    Disconnect reason: DisconnectReasonExclusiveModeOverride
    Meaning: The user has opened an audio stream on the endpoint in exclusive mode, which force-closes all shared mode streams (you can override this behavior in mmsys.cpl).
    Recovery steps: Close any existing streams, then either return the error to the user or poll, waiting for the endpoint to become available in the future.
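    To make these recovery strategies concrete, here's a minimal sketch in portable C++.  The enumeration is redeclared locally so the snippet stands alone (the real values are declared in audiopolicy.h), and the returned strings are labels for this sketch only, not anything the API produces:

```cpp
#include <string>

// Local mirror of the AudioSessionDisconnectReason enumeration from
// audiopolicy.h, redeclared so this sketch is self-contained.
enum DisconnectReason {
    DisconnectReasonDeviceRemoval,
    DisconnectReasonServerShutdown,
    DisconnectReasonFormatChanged,
    DisconnectReasonSessionLogoff,
    DisconnectReasonSessionDisconnected,
    DisconnectReasonExclusiveModeOverride
};

// Map each disconnect reason to the recovery strategy described above.
std::string RecoveryStep(DisconnectReason reason) {
    switch (reason) {
    case DisconnectReasonDeviceRemoval:
    case DisconnectReasonSessionDisconnected:
        // Re-enumerate endpoints (or call GetDefaultAudioEndpoint) and reopen.
        return "reopen on new endpoint";
    case DisconnectReasonServerShutdown:
        return "inform user; restart audio service if possible";
    case DisconnectReasonFormatChanged:
        return "reopen streams in the new mix format";
    case DisconnectReasonSessionLogoff:
        return "close streams";
    case DisconnectReasonExclusiveModeOverride:
        return "close streams; report error or poll for availability";
    }
    return "unknown";
}
```

    Note that two of the six reasons share one strategy - a session disconnect is handled exactly like a device removal.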



    [1] That's not the actual question that he asked, but the answer to his question is included in the answer to my question, so...

  • Larry Osterman's WebLog

    An Overview of Windows Sound and "Glitching" Issues


    Nick White over at the Windows Vista Blog just posted an article written by Steve Ball, the PM in charge of the sounds team.


    It does a pretty good job of covering why my $2000 PC sometimes glitches like crazy, while my $20 CD player works perfectly every single time.


    It's worth a read.

  • Larry Osterman's WebLog

    Why do people think that Windows is "easy"?


    Every once in a while, someone sends me mail (or a pointer to a blog post) and asks "Why can't you guys do something like that?".  The implication seems to be that Windows would be so much better if we simply rewrote the operating system using technology <foo>.

    And maybe they're right.  Maybe Windows would be better if we threw away the current kernel and rewrote it using <pick your favorite operating environment>.  I don't know, and I doubt that I'll ever find out.

    The reason is that making any substantial modifications to an operating system as large and as successful as Windows is hard.  Really, really, really hard.  You can see this with Vista - in the scheme of things, there were relatively few changes made to existing elements of the operating system (as far as I can tell, the biggest one was the conversion from the XP display driver model to the Vista display driver model), but even those changes have caused a non-trivial amount of pain for our customers.

    Even relatively small modifications can cause pain to customers - one of the changes I made to the legacy multimedia APIs was to remove support for NT4 style audio drivers from winmm.  This functionality has been unsupported since 1998, and we were unaware of any applications that actually used it.  Shortly after Beta2 shipped, we started receiving bug reports from the field - people reported that some call center applications had stopped working.  We started digging and discovered that these call centers were using software that depended on the NT4 style audio drivers.  These call centers didn't have the ability to upgrade their software (the vendor had gone out of business, and the application worked just fine for their needs).  So we put the support for NT4 drivers back, because that was what our customers needed to have happen.


    Windows is an extraordinarily complicated environment - as a result, it's extremely unlikely that any changes along the line of "throw away the kernel and replace it with <foo>" are going to happen.   Of course, I've been wrong before :).

  • Larry Osterman's WebLog

    "Memory Leak" when using the Vista Audio API notification routines


    We recently got an internal report from someone using the audio notification APIs that they were leaking memory, and they wanted help from us debugging the problem.

    I took a look and discovered that the problem was a circular reference that was created when they called:


        hr = CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_INPROC_SERVER, __uuidof(IMMDeviceEnumerator), (void**)&m_pEnumerator);
        if (FAILED(hr))
            return hr;

        hr = m_pEnumerator->RegisterEndpointNotificationCallback(this);
        if (FAILED(hr))
            return hr;


    The root cause of the problem is that the IMMDeviceEnumerator::RegisterEndpointNotificationCallback takes a reference to the IMMNotificationClient object passed in.  This shouldn't be a surprise, because the Counted Pointer design pattern requires that every time you save a pointer to an object, you take a reference to that object (and every interface that derives from IUnknown implements the Counted Pointer design pattern).  Since the RegisterEndpointNotificationCallback saves its input pointer for later consumption (when it generates the notification), it has to take a reference to the object.
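    To see why the reference count can never drop to zero, here's a toy model in plain C++.  These are hypothetical stand-ins, not the real COM interfaces, but they reproduce the cycle: the registrar takes a reference to the pointer it saves, so an object that registers itself holds itself alive:

```cpp
// Toy stand-ins, NOT the real MMDevice interfaces: a global "registrar"
// slot models RegisterEndpointNotificationCallback saving a referenced
// pointer, and Client models an object that registers itself.
struct Client;
Client *g_registered = nullptr;

struct Client {
    long refs = 1;                // starts with the creator's reference
    void AddRef()  { ++refs; }
    void Release() { if (--refs == 0) delete this; }

    void Register() {
        g_registered = this;      // the registrar saves the pointer...
        AddRef();                 // ...and, per COM rules, references it
    }
    void Unregister() {           // the fix: an explicit finalizer step
        if (g_registered == this) { g_registered = nullptr; Release(); }
    }
};
```

    After `c->Register()` followed by the external client's final `c->Release()`, the object survives with one reference outstanding (the registrar's) - a leak, unless something calls `Unregister()` first.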

    At the heart of the problem is the fact that the CFoo object only calls UnregisterEndpointNotificationCallback in its destructor (which will never be called).  If the CFoo object had a "Shutdown()" or other form of finalizer, the call to UnregisterEndpointNotificationCallback could be moved to the finalizer, thus removing the circular reference and avoiding the memory leak.  This is by far the best solution - I'm a huge fan of deterministic finalization.


    Unfortunately, sometimes it's not possible to have a "Shutdown()" method (for instance, if you're implementing an interface that doesn't include a finalizer in its design; in fact, this was the case for the person who reported the problem to us).

    In that case, you really want to depend on the fact that the reference count reflects your external references, not your internal references.  Effectively, you want to maintain two separate reference counts, one for external clients, the other for internal usage.

    One way to achieve this is to use a delegator object - instead of handing "this" to the RegisterEndpointNotificationCallback, you pass a small object that implements IMMNotificationClient.  So:

    class CFooDelegator: public IMMNotificationClient
    {
        ULONG   m_cRef;
        CFoo   *m_pFoo;

    public:
        CFooDelegator(CFoo *pFoo) : m_cRef(1), m_pFoo(pFoo) {}

        STDMETHODIMP OnDeviceStateChanged(LPCWSTR pwstrDeviceId, DWORD dwNewState)
        {
            if (m_pFoo)
                m_pFoo->OnDeviceStateChanged(pwstrDeviceId, dwNewState);
            return S_OK;
        }
        STDMETHODIMP OnDeviceAdded(LPCWSTR pwstrDeviceId)
        {
            if (m_pFoo)
                m_pFoo->OnDeviceAdded(pwstrDeviceId);
            return S_OK;
        }
        STDMETHODIMP OnDeviceRemoved(LPCWSTR pwstrDeviceId)
        {
            if (m_pFoo)
                m_pFoo->OnDeviceRemoved(pwstrDeviceId);
            return S_OK;
        }
        STDMETHODIMP OnDefaultDeviceChanged(EDataFlow flow, ERole role, LPCWSTR pwstrDefaultDeviceId)
        {
            if (m_pFoo)
                m_pFoo->OnDefaultDeviceChanged(flow, role, pwstrDefaultDeviceId);
            return S_OK;
        }
        STDMETHODIMP OnPropertyValueChanged(LPCWSTR pwstrDeviceId, const PROPERTYKEY key)
        {
            if (m_pFoo)
                m_pFoo->OnPropertyValueChanged(pwstrDeviceId, key);
            return S_OK;
        }

        void OnPFooFinalRelease()
        {
            m_pFoo = NULL;
        }

        STDMETHOD(QueryInterface)(REFIID riid, LPVOID FAR *ppvObj)
        {
            *ppvObj = NULL;
            if (riid == IID_IUnknown)
                *ppvObj = static_cast<IUnknown *>(this);
            else if (riid == IID_IMMNotificationClient)
                *ppvObj = static_cast<IMMNotificationClient *>(this);
            else
                return E_NOINTERFACE;
            AddRef();
            return S_OK;
        }
        STDMETHOD_(ULONG, AddRef)()
        {
            return InterlockedIncrement((LONG *)&m_cRef);
        }
        STDMETHOD_(ULONG, Release)()
        {
            ULONG lRet = InterlockedDecrement((LONG *)&m_cRef);
            if (lRet == 0)
                delete this;
            return lRet;
        }
    };
    You then have to change the CFoo::Initialize to construct a CFooDelegator object before calling RegisterEndpointNotificationCallback().

    You also need to change the destructor on the CFoo:


    It's important to call UnregisterEndpointNotificationCallback before you call OnPFooFinalRelease - if you don't, there's a possibility that the client's final release of the CFoo might occur while a notification function is being called - if that happens, the destructor might complete and you end up calling back into a partially destructed object.  And that's bad :).  The good news is that the UnregisterEndpointNotificationCallback function guarantees that all notification routines have completed before it returns.
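    Putting the teardown ordering together, here's a compact sketch.  The types are hypothetical mocks (a stand-in for IMMDeviceEnumerator, with simplified reference counting), not the real MMDevice API; the point is the order of the three calls in the destructor:

```cpp
// Hypothetical mocks, not the real MMDevice API; they model only what the
// teardown ordering needs.
struct IMMNotificationClientMock { virtual ~IMMNotificationClientMock() {} };

struct MockEnumerator {                 // stands in for IMMDeviceEnumerator
    IMMNotificationClientMock *registered = nullptr;
    void RegisterEndpointNotificationCallback(IMMNotificationClientMock *p) {
        registered = p;
    }
    // The real Unregister also blocks until in-flight callbacks complete.
    void UnregisterEndpointNotificationCallback(IMMNotificationClientMock *p) {
        if (registered == p) registered = nullptr;
    }
};

struct CFoo;                            // forward declaration

struct CFooDelegator : IMMNotificationClientMock {
    CFoo *m_pFoo;
    explicit CFooDelegator(CFoo *pFoo) : m_pFoo(pFoo) {}
    void OnPFooFinalRelease() { m_pFoo = nullptr; }  // sever the back-pointer
    void Release() { delete this; }     // simplified: single owner
};

struct CFoo {
    MockEnumerator *m_pEnum;
    CFooDelegator  *m_pDelegator;

    explicit CFoo(MockEnumerator *pEnum)
        : m_pEnum(pEnum), m_pDelegator(new CFooDelegator(this)) {
        m_pEnum->RegisterEndpointNotificationCallback(m_pDelegator);
    }
    ~CFoo() {
        // Order matters: (1) unregister, so no further callbacks can arrive;
        // (2) sever the delegator's back-pointer; (3) drop our reference.
        m_pEnum->UnregisterEndpointNotificationCallback(m_pDelegator);
        m_pDelegator->OnPFooFinalRelease();
        m_pDelegator->Release();
    }
};
```

    If you swap steps (1) and (2), a notification that is in flight while the destructor runs could chase the back-pointer into a partially destructed CFoo - exactly the race described above.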

    It's important to realize that this issue occurs with ALL the audio notification callback mechanisms: IAudioEndpointVolume::RegisterControlChangeNotify, IPart::RegisterControlChangeCallback, and IAudioSessionControl::RegisterAudioSessionNotification.

  • Larry Osterman's WebLog

    Every threat model diagram should tell a story.


    Adam Shostack has another threat modeling post up on the SDL blog entitled "Threat Modeling Self Checks and Rules of Thumb".  In it, he talks about threat models and diagrams (and he corrects a mistake in my "rules of thumb" post (thanks Adam)).

    There's one thing he mentions that is really important (and has come up a couple of times as I've been doing threat model reviews of various components).  That a threat model diagram should tell a story.

    Not surprisingly, I love stories.  I really love stories :).  I love telling stories, I love listening to people telling stories.


    I look for stories in the most unlikely of places, including places that you wouldn't necessarily think of.


    So when I'm reviewing a threat model, I want to hear the story of your feature.  If you've done a good job of telling your story, I should be able to see what you've drawn and understand  what you're building - it might take a paragraph or two of text to provide surrounding context, but it should be clear from your diagram.


    What are the things that help to tell a good story?  Well, your story should be coherent.  Stories have beginnings, middles and ends.  Your diagram should also have a beginning (entrypoint), a middle (where the work is done) and an end (the output of the diagram).  For example, in my PlaySound threat model, the beginning is the application calling PlaySound, the end is the audio engine.  Obviously other diagrams have other inputs and outputs, but your code is always invoked by something, and it always does something.  Your diagram should reflect this.

    In addition, it's always a good idea to look for pieces that are missing.  For instance, if you're doing a threat model for a web browser, it's highly likely that the Internet will show up somewhere in the model.  If it doesn't, it just seems "wrong".  Similarly, if your feature name is something like "Mumblefrotz user experience", then I'd hope to find something that looks like a "user" and something that looks like a "Mumblefrotz". 

    Adam's post calls out other inconsistencies that interfere with the storytelling, as does my "rules of thumb" post.


    I really like the storytelling metaphor for threat model diagrams because if I can understand the story, it really helps me find the omissions - there's almost always something missed in the diagram, and a coherent story really helps to understand that.  In many ways, pictures do a far better job of telling stories than words do.

  • Larry Osterman's WebLog

    The evolution of a data structure - the WAVEFORMAT.


    In the beginning, there was a need to be able to describe the format contained in a stream of audio data.

    And thus the WAVEFORMAT structure was born in Windows 3.1.

    typedef struct WAVEFORMAT {
            WORD    wFormatTag;
            WORD    nChannels;
            DWORD   nSamplesPerSec;
            DWORD   nAvgBytesPerSec;
            WORD    nBlockAlign;
    } WAVEFORMAT;

    The problem with the WAVEFORMAT is that it was ok at expressing audio streams that contained samples whose size was a power of 2, but there was no way of representing audio streams that contained samples whose size was something other than that (like 24bit samples).

    So the PCMWAVEFORMAT was born. 

    typedef struct PCMWAVEFORMAT {
            WAVEFORMAT      wf;
            WORD            wBitsPerSample;
    } PCMWAVEFORMAT;

    If the application passed in a WAVEFORMAT with a wFormatTag of WAVE_FORMAT_PCM, it was required to actually pass in a PCMWAVEFORMAT so that the audio infrastructure could determine the number of bits per sample.

    That worked fine and solved that problem, but the powers that be quickly realized that relying on the format tag for extensibility was going to be a problem in the future. 

    So once again, the structure was extended, and for Windows NT 3.5 and Windows 95, we got the WAVEFORMATEX that we know and love:

    typedef struct tWAVEFORMATEX {
        WORD        wFormatTag;         /* format type */
        WORD        nChannels;          /* number of channels (i.e. mono, stereo...) */
        DWORD       nSamplesPerSec;     /* sample rate */
        DWORD       nAvgBytesPerSec;    /* for buffer estimation */
        WORD        nBlockAlign;        /* block size of data */
        WORD        wBitsPerSample;     /* number of bits per sample of mono data */
        WORD        cbSize;             /* the count in bytes of the size of */
                                        /* extra information (after cbSize) */
    } WAVEFORMATEX;

    This solved the problem somewhat.  But an issue remained - while all the APIs were changed to take a WAVEFORMATEX, there were still applications that passed in a WAVEFORMAT to the API (and there were WAV files that had been authored with WAVEFORMAT structures).  The root of the issue is that there was no way of distinguishing between a WAVEFORMAT (which didn't have a cbSize field) and a WAVEFORMATEX (which did).  To resolve this, for WAVEFORMAT structures kept in files, the file metadata provided the size of the structure, so we could use the size of the structure to distinguish the various forms.

    When the structure was passed in as a parameter to a function, there was still a problem.  For that, the code that parses the WAVEFORMATEX structure must rely on the fact that if the wFormatTag field in the WAVEFORMAT structure is WAVE_FORMAT_PCM, then the WAVEFORMAT structure is actually a PCMWAVEFORMAT, which is the same as a WAVEFORMATEX with a cbSize field set to 0.  For all other formats, the code simply assumes that the caller is passing in a WAVEFORMATEX structure.
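    That parsing rule fits in a few lines.  This sketch redeclares the structures locally (field-for-field, but without the packing pragmas the real headers use) so it stands alone:

```cpp
#include <cstdint>

// Local redeclarations of the structures described above, so the sketch
// compiles anywhere (the real definitions live in mmreg.h).
#define WAVE_FORMAT_PCM 1

struct WAVEFORMAT {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
};

struct WAVEFORMATEX {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
};

// The rule from the text: WAVE_FORMAT_PCM means "treat this as a
// PCMWAVEFORMAT", i.e. a WAVEFORMATEX whose cbSize is implicitly 0.
// Any other tag is assumed to be a real WAVEFORMATEX with a valid cbSize.
uint16_t EffectiveCbSize(const WAVEFORMAT *wf) {
    if (wf->wFormatTag == WAVE_FORMAT_PCM)
        return 0;
    return reinterpret_cast<const WAVEFORMATEX *>(wf)->cbSize;
}
```

    The cast is only safe because the caller's contract (per the rule above) guarantees that a non-PCM tag really does mean a WAVEFORMATEX-sized buffer is present.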


    Unfortunately, the introduction of the WAVEFORMATEX wasn't quite enough.  When you're dealing with two channel audio streams, it's easy to simply say that channel 0 is left and channel 1 is right (or whatever).  But when you're dealing with a multichannel audio stream, it's not possible to determine which channel goes with which speaker.  In addition, with a WAVEFORMATEX, there's still a problem with non power-of-2 formats.  This time, the problem happens when you take a 24bit waveformat and try to pack it into 32bit samples - doing this can dramatically speed up any manipulation that needs to be done on the samples, so it's highly desirable.

    So one final enhancement was made to the WAVEFORMAT structure, the WAVEFORMATEXTENSIBLE (introduced in Windows 2000):

    typedef struct {
        WAVEFORMATEX    Format;
        union {
            WORD wValidBitsPerSample;       /* bits of precision  */
            WORD wSamplesPerBlock;          /* valid if wBitsPerSample==0 */
            WORD wReserved;                 /* If neither applies, set to zero. */
        } Samples;
        DWORD           dwChannelMask;      /* which channels are */
                                            /* present in stream  */
        GUID            SubFormat;
    } WAVEFORMATEXTENSIBLE;
    The WAVEFORMATEXTENSIBLE embeds the old WAVEFORMATEX and adds a couple of fields that allow the caller to specify the packing of the samples and to describe which channels in the stream should be directed to which speakers.  For example, if the dwChannelMask is SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT | SPEAKER_LOW_FREQUENCY | SPEAKER_TOP_FRONT_LEFT, then channel 0 is the front left channel, channel 1 is the front right channel, channel 2 is the subwoofer, and channel 3 is the top front left speaker.  The way you identify a WAVEFORMATEXTENSIBLE is that the Format.wFormatTag field is set to WAVE_FORMAT_EXTENSIBLE and the Format.cbSize field is always set to 0x16 (22 bytes).
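    The channel-to-speaker assignment follows the bit order of the mask: channel N of the stream corresponds to the Nth set bit of dwChannelMask, lowest bit first.  A small sketch (the four speaker bit values match their definitions in ksmedia.h):

```cpp
#include <cstdint>

// Speaker-position bits for dwChannelMask (values as defined in ksmedia.h).
const uint32_t SPEAKER_FRONT_LEFT     = 0x1;
const uint32_t SPEAKER_FRONT_RIGHT    = 0x2;
const uint32_t SPEAKER_LOW_FREQUENCY  = 0x8;
const uint32_t SPEAKER_TOP_FRONT_LEFT = 0x1000;

// Channel N of the stream maps to the Nth set bit (lowest first) of the
// mask.  Returns the speaker bit for a channel index, or 0 if the mask
// describes fewer channels than that.
uint32_t SpeakerForChannel(uint32_t dwChannelMask, unsigned channel) {
    for (uint32_t bit = 1; bit != 0; bit <<= 1) {
        if (dwChannelMask & bit) {
            if (channel == 0)
                return bit;
            --channel;
        }
    }
    return 0;
}
```

    With the example mask from the text, channel 2 lands on SPEAKER_LOW_FREQUENCY (the subwoofer) and channel 3 on SPEAKER_TOP_FRONT_LEFT, exactly as described above.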

     That's where things live for now - who knows if there will be another revision in the future.

  • Larry Osterman's WebLog

    Larry and the "Ping of Death"


    Also known as "Larry mounts a DDOS attack against every single machine running Windows NT"

    Or: No stupid mistake goes unremembered.


    I was recently in the office of a very senior person at Microsoft debugging a problem on his machine.  He introduced himself, and commented "We've never met, but I've heard of you.  Something about a ping of death?"

    Oh. My. Word.  People still remember the "ping of death"?  Wow.  I thought I was long past the ping of death (after all, it's been 15 years), but apparently not.  I'm not surprised when people who were involved in the PoD incident remember it (it was pretty spectacular), but to have a very senior person who wasn't even working at the company at the time remember it is not a good thing :).

    So, for the record, here's the story of Larry and the Ping of Death.

    First I need to describe my development environment at the time (actually, it's pretty much the same as my dev environment today).  My primary development machine was running a version of NT; it ran a kernel debugger connected to my test machine over a serial cable.  When my test machine crashed, I would use the kernel debugger on my dev machine to debug it.  There was nothing debugging my dev machine, because NT was pretty darned reliable at that point and I didn't need a kernel debugger 99% of the time.  In addition, the corporate network wasn't a switched network - as a result, each machine received datagram traffic from every other machine on the network.


    Back in that day, I was working on the NT 3.1 browser (I've written about the browser here and here before).  As I was working on some diagnostic tools for the browser, I wrote a tool to manually generate some of the packets used by the browser service.

    One day, as I was adding some functionality to the tool, my dev machine crashed, and my test machine locked up.

    *CRUD*.  I can't debug the problem to see what happened because I lost my kernel debugger.  Ok, I'll reboot my machines, and hopefully whatever happened will hit again.

    The failure didn't hit, so I went back to working on the tool.

    And once again, my machine crashed.

    At this point, everyone in the offices around me started to get noisy - there was a great deal of cursing going on.  What I'd not realized was that every machine had crashed at the same time as my dev machine had crashed.  And I do mean EVERY machine.  Every single machine in the corporation running Windows NT had crashed.  Twice (after allowing just enough time between crashes to allow people to start getting back to work).


    I quickly realized that my test application was the cause of the crash, and I isolated my machines from the network and started digging in.  I quickly root caused the problem - the broadcast that was sent by my test application was malformed and it exposed a bug in the bowser.sys driver.  When the bowser received this packet, it crashed.

    I quickly fixed the problem on my machine and added the change to the checkin queue so that it would be in the next day's build.


    I then walked around the entire building and personally apologized to every single person on the NT team for causing them to lose hours of work.  And 15 years later, I'm still apologizing for that one moment of utter stupidity.

  • Larry Osterman's WebLog

    Sorry about not posting...


    Work got a bit insane last week (work fell on me like a ton of bricks), then on Thursday I left to go on a vacation to visit family back east, so no posts in a while.

    On the other hand, we all had a great time, and got to meet up with the family.

    One of the more unusual places we ended up at was the Gospel Chicken House.  The GCH is a converted chicken house in which they play gospel music every Saturday and Sunday evening.  And it really is a chicken house - they've paved the floor, painted it, put in an AC system, and installed old pews that appear to have been ripped out of a dozen or so churches over the years.

    While we were there, we saw the band "Ronnie Williams and the Carter Family Sound", which is basically a Carter Family tribute band (Lorrie Carter, the granddaughter of Mother Maybelle Carter (and daughter of Anita Carter) is a member of the band). 

    All the performers at the GCH essentially perform for tips - they pass around an offering basket during the performance and the audience gives what their heart tells them to give. On the walls are pictures of the various groups that have played there, it was quite the collection.

    All in all, a place with really amazing character, and I'm really glad I had the opportunity to go (and bummed that I didn't get to stay there longer).


    On Saturday, we went to Kings Dominion, which is about 10 minutes from my mother-in-law's place, and the kids spent all day on the rides - a great time for all :).  I have pictures of that, but all you can see are blurry images of children screaming as they fly past the camera :).


    Now we're back, and I'm hoping to have something more technical to write about in the near future.


  • Larry Osterman's WebLog

    Hold on, I know this "old guy"...


    Last week, I'd noticed headlines about a California appeals court reinstating an age discrimination suit brought by a "Google manager".  I sort-of ignored it, because I thought it didn't affect me in any meaningful way.  Then this morning, I ran into Lauren Weinstein's post about this and realized that the "old guy" was Brian Reid.

    Hold on, I KNOW Brian Reid (or to be more specific, I know of him - he was a grad student at Carnegie-Mellon and had left the university shortly before I arrived there in 1980 (to my knowledge, we've never met)).  Brian's PhD thesis was based on a text processing program he wrote called "Scribe", which was a major text processing system used at Carnegie-Mellon (it has since been supplanted by TeX).  Brian had left Carnegie-Mellon to form a company, "Unilogic", whose purpose was to turn his work on Scribe into a product (the first product I've encountered that was copy protected).  The Wikipedia entry for Scribe has some details on the program and some of the controversy surrounding it[1].

    I still use "Scribe-isms" in my email - instead of saying <flame>(whatever)</flame>, I often write @flame(on)(whatever)@flame(off).


    The thing is, I consider Brian Reid (and others who were at Carnegie-Mellon at more-or-less the same time as I was) as contemporaries - I don't think of Brian as an "old guy" (even though he's almost 10 years older than I am).  Heck, most of the people I hang out with outside of work are in their early 50s.

    And that, in turn makes me feel old :).


    [1] Carnegie-Mellon was given a license to the Scribe program because the work was originally authored at Carnegie-Mellon.  I remember at one point when the license to Scribe on one of the undergraduate computers expired (likely because of the suit mentioned on Dave Touretzky's site here).  It was NOT fun (and may be a part of the reason why I have such a problem with subscription-based software).

  • Larry Osterman's WebLog

    Must my service name have the name of the executable in which it's contained?


    It must be psychic debugging week 'round here.  I received the following email on an internal mailing list earlier today:

    Regarding windows service with ServiceType “SERVICE_WIN32_SHARE_PROCESS”, is there a restriction on the service name such that it can only take the same name as the service executable? I can install a service with share process under different names, but the service cannot be started. If I try to start it manually, it gives the following error, “Error 1083: The executable program that this service is configured to run in does not implement the service.” If I set the service name to be the same as the executable, it will install and start properly.

    My psychic response was:

    What names are you providing in your service call dispatch table?

    In this case, the good news is that the error code mostly gives a reasonable hint as to where to start looking.

    Essentially my thought processes were: "Hmm.  I wonder where the service controller gets the list of services that are implemented in the process?  If I remember correctly, it gets it from the service dispatch table passed into the StartServiceCtrlDispatcher API.  Maybe I'll look there..." 

    That then led me to the SERVICE_TABLE_ENTRY structure passed into StartServiceCtrlDispatcher, which contains an lpServiceName field.  Aha, I knew I was close.  Then I saw the comment that says that the lpServiceName field is ignored if the service is a SERVICE_WIN32_OWN_PROCESS (which means that many services probably don't even bother setting the field to anything other than the empty string), and I knew I'd figured it out.
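    The lookup the service controller effectively performs can be sketched like this.  This is a hypothetical model, not the real implementation - the structure is a simplified stand-in for SERVICE_TABLE_ENTRY, and the error constant matches the error 1083 quoted in the email:

```cpp
#include <cstring>

// Hypothetical sketch of the name lookup the service controller performs
// against the dispatch table passed to StartServiceCtrlDispatcher.
#define ERROR_SUCCESS 0
#define ERROR_SERVICE_NOT_IN_EXE 1083   // the error quoted in the email above

struct SERVICE_TABLE_ENTRY_SKETCH {
    const char *lpServiceName;          // ignored for SERVICE_WIN32_OWN_PROCESS
    void (*lpServiceProc)(void);
};

// For a SERVICE_WIN32_SHARE_PROCESS service, the controller looks the
// service name up in the table; if the name isn't there, starting the
// service fails with error 1083.
int StartShareProcessService(const SERVICE_TABLE_ENTRY_SKETCH *table,
                             unsigned count, const char *serviceName) {
    for (unsigned i = 0; i < count; i++)
        if (strcmp(table[i].lpServiceName, serviceName) == 0)
            return ERROR_SUCCESS;       // would invoke lpServiceProc here
    return ERROR_SERVICE_NOT_IN_EXE;
}
```

    This is why installing the service under a name that doesn't appear in the dispatch table produces error 1083 - the controller simply can't find a matching entry.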


    I'm not surprised that this caused problems, because it's not at all obvious that this random field in a data structure would change meaning depending on a flag in the service's configuration.

  • Larry Osterman's WebLog

    The Windows command line is just a string...


    Yesterday, Richard Gemmell left the following comment on my blog (I've trimmed to the critical part):

    I was referring to the way that IE can be tricked into calling the Firefox command line with multiple parameters instead of the single parameter registered with the URL handler.

    I saw this comment and was really confused for a second, until I realized the disconnect.  The problem is that *nix and Windows handle command line arguments totally differently.  On *nix, you launch a program using the execve API (or its cousins execv, execvp, execl, execlp, and execle).  The interesting thing about these APIs is that they allow the caller to specify each of the command line arguments - the signature for execve is:

    int execve(const char *filename, char *const argv [], char *const envp[]);

    In *nix, the shell is responsible for turning the string provided by the user into the argv parameter to the program[1].


    On Windows, the command line doesn't work that way.  Instead, you launch a new program using the CreateProcess API, which takes the command line as a string (the lpCommandLine parameter to CreateProcess).  It's considered the responsibility of the newly started application to call the GetCommandLine API to retrieve that command line and parse it (possibly using the CommandLineToArgvW helper function).

    So when Richard talked about IE "tricking" Firefox by calling it with multiple parameters, he was apparently thinking about the *nix model where an application launches a new application with multiple command line arguments.  But that model isn't the Windows model - instead, in the Windows model, the application is responsible for parsing its own command line arguments, and thus IE can't "trick" anything - it's just asking the shell to pass a string to the application, and it's the application's job to figure out how to handle that string.

    We can discuss the relative merits of that decision, but it was a decision made nearly 25 years ago (in MS-DOS 2.0).


    [1] Yes, I know that the execl() API lets you pass the arguments as a variadic list rather than an array, but execl() simply collects those arguments into an argv array before calling execve - it never parses a single command line string.

  • Larry Osterman's WebLog

    Some final thoughts on Threat Modeling...


    I want to wrap up the threat modeling posts with a summary and some comments on the entire process.  Yeah, I know I should have done this last week, but I got distracted :). 

    First, a summary of the threat modeling posts:

    Part 1: Threat Modeling, Once again.  In which our narrator introduces the idea of a threat model diagram

    Part 2: Threat Modeling Again. Drawing the Diagram.  In which our narrator introduces the diagram for the PlaySound API

    Part 3: Threat Modeling Again, Stride.  Introducing the various STRIDE categories.

    Part 4: Threat Modeling Again, Stride Mitigations.  Discussing various mitigations for the STRIDE categories.

    Part 5: Threat Modeling Again, What does STRIDE have to do with threat modeling?  The relationship between STRIDE and diagram elements.

    Part 6: Threat Modeling Again, STRIDE per Element.  In which the concept of STRIDE/Element is discussed.

    Part 7: Threat Modeling Again, Threat Modeling PlaySound.  Which enumerates the threats against the PlaySound API.

    Part 8: Threat Modeling Again, Analyzing the threats to PlaySound.  In which the threat modeling analysis work against the threats to PlaySound is performed.

    Part 9: Threat Modeling Again, Pulling the threat model together.  Which describes the narrative structure of a threat model.

    Part 10: Threat Modeling Again, Presenting the PlaySound threat model.  Which doesn't need a pithy summary, because the title describes what it is.

    Part 11: Threat Modeling Again, Threat Modeling in Practice.  Presenting the threat model diagrams for a real-world security problem.[1]

    Part 12: Threat Modeling Again, Threat Modeling and the firefoxurl issue. Analyzing the real-world problem from the standpoint of threat modeling.

    Part 13: Threat Modeling Again, Threat Modeling Rules of Thumb.  A document with some useful rules of thumb to consider when threat modeling.


    Remember that threat modeling is an analysis tool. You threat model to identify threats to your component, which then lets you know where you need to concentrate your resources.  Maybe you need to encrypt a particular data channel to protect it from snooping.  Maybe you need to change the ACLs on a data store to ensure that an attacker can't modify the contents of the store.  Maybe you just need to carefully validate the contents of the store before you read it.  The threat modeling process tells you where to look and gives you suggestions about what to look for, but it doesn't solve the problem.  It might be that the only thing that comes out of your threat modeling process is a document that says "We don't care about any of the threats to this component".  That's OK - at a minimum, it means that you considered the threats and decided that they were acceptable.

    The threat modeling process is also a living process. I'm 100% certain that 2 years from now, we're going to be doing threat modeling differently from the way that we do it today.  Experience has shown that every time we apply threat modeling to a product, we realize new things about the process of performing threat modeling, and find new, more efficient ways of going about the process.  Even now, the various teams involved with threat modeling in my division have proposed new changes to the process based on the experiences of our current round of threat modeling.  Some of them will be adopted as best practices across Microsoft; others will be dropped on the floor.


    What I've described over these posts is the process of threat modeling as it's done today in the Windows division at Microsoft.  Other divisions use threat modeling differently - the threat landscape for Windows is different from the threat landscape for SQL Server and Exchange, which is different from the threat landscape for the various Live products, and it's entirely different for our internal IT processes.  All of these groups use threat modeling, and they use the core mechanisms in similar ways, but because each group that does threat modeling has different threats and different risks, the process plays out differently for each team.

    If your team decides to adopt threat modeling, you need to consider how it applies to your components and adopt the process accordingly.  Threat Modeling is absolutely not a one-size-fits-all process, but it IS an invaluable tool.


    EDIT TO ADD: Adam Shostack on the Threat Modeling Team at Microsoft pointed out that the threat modeling team has a developer position open.  You can find more information about the position at http://members.microsoft.com/careers/search/default.aspx by searching for job #207443.

    [1] Someone posting a comment on Bruce Schneier's blog took me to task for using a browser vulnerability.  I chose that particular vulnerability because it was the first that came to mind.  I could have just as easily picked the DMG loading logic in OSX or the .ANI file code in Windows for examples (actually the DMG file issues are in several ways far more interesting than the firefoxurl issue - the .ANI file issue is actually relatively boring from a threat modeling standpoint).
