Larry Osterman's WebLog

Confessions of an Old Fogey
  • Larry Osterman's WebLog

    What is AUDIODG.EXE?

    • 41 Comments

    One of the new audio components in Vista is a new process named audiodg.exe.

    If you look at it in taskmgr, the description shows "Windows Audio Device Graph Isolation", but that's not really particularly helpful when it comes to figuring out what it does.

    The short answer is that audiodg.exe hosts the audio engine for Vista.  All the DSP and other audio processing is done in audiodg.exe.  There are two reasons it runs outside of the Windows audio service.

    The first is that there's 3rd party code that gets loaded into audiodg.exe.  Audio hardware vendors have the ability to install custom DSPs (called Audio Processing Objects or APOs) into the audio pipeline.  For a number of reasons (reliability, serviceability, others) we're not allowed to load 3rd party code into svchost processes (svchost.exe is a generic host process for services that's used inside Windows). So we need to move all the code that interacts with these 3rd party APOs outside the audio service (that way if an APO crashes, it won't take out some other critical part of the system with it).

    The second reason for using a separate process for the audio engine is DRM.  The DRM system in Vista requires that the audio samples be processed in a protected process, and (for a number of technical reasons that are too obscure to go into) it's not possible for a svchost hosted service to run in a protected process.

     

    So why audiodg?

    As I mentioned in my post "Audio in Vista, The Big Picture", the route that audio samples take through the audio engine can be considered a directed graph.  Internally, we refer to this graph as the "Audio Device Graph" (ok, strictly speaking we call the part to the left of the mixer the local graph, and the part to the right of the mixer the device graph, but when we consider the big picture, we just call it the audio device graph).

    So why AudioDG?

    Originally we called the process DeviceGraph.Exe.  For a number of reasons that are no longer relevant (they're related to the INF based installer technology that was used before Vista), we thought that we needed to limit our binary names to 8.3 (it's a long story - in reality we didn't, but we thought we did).  So the nice long names we had chosen (AudioEngine.Dll, AudioKSEndpoint.Dll, and DeviceGraph.Exe) had to be truncated to 8.3.

    I felt it was critically important that all the audio components had to have the word "Audio" in the beginning to make it clear that they had to do with audio functionality.  Since we thought we were limited to 8.3 names, that meant we had 3 letters to play with in the name.  For AudioEngine.Dll, it was relatively simple - it shortened to AUDIOENG.DLL.  Similarly for AudioKSEndpoint.Dll, it shortened to AUDIOKSE.DLL.

    But DeviceGraph was somewhat more complicated.  I originally went with AudioADG.EXE (audio+ADG for Audio Device Graph), but people thought it was redundant (It expanded to audio audio device graph). 

    Eventually we settled on "AUDIODG.EXE".

    So why the funky description?  Well, because it accurately reflects what audiodg is: it's a part of Windows, so you get "Windows"; it hosts the "Audio Device Graph"; and it isolates that graph from the Windows Audio service, hence "Isolation".

  • Larry Osterman's WebLog

    Why is the DOS path character "\"?

    • 54 Comments
    Many, many months ago, Declan Eardly asked why the \ character was chosen as the path separator.

    The answer's from before my time, but I do remember the original reasons.

    It all stems from Microsoft's relationship with IBM.  For DOS 1.0, DOS only supported floppy disks.

    Many of the DOS utilities (except for command.com) were written by IBM, and they used the "/" character as the "switch" character for their utilities (the "switch" character is the character that's used to introduce command line switches - on *nix, it's the "-" character; on most DEC operating systems (including VMS, the DECSystem-20 and DECSystem-10), it's the "/" character).  (Note: I'm grey on whether the "/" character came from IBM or from Microsoft - several of the original MS-DOS developers were old-hand DEC-20 developers, so it's possible that they carried it forward from their DEC background.)

    The fact that the "/" character conflicted with the path character of another relatively popular operating system wasn't particularly relevant to the original developers - after all, DOS didn't support directories, just files in a single root directory.

    Then along came DOS 2.0.  DOS 2.0 was tied to the PC/XT, whose major feature was a 10M hard disk.  IBM asked Microsoft to add support for hard disks, and the MS-DOS developers took this as an opportunity to add support for modern file APIs - they added a whole series of handle based APIs to the system (DOS 1.0 relied on an application controlled structure called an FCB).  They also had to add support for hierarchical paths.

    Now historically there have been a number of different mechanisms for providing hierarchical paths.  The DECSystem-20, for example, represented directories as Volume:<Directory[.Subdirectory]>FileName.Extension[,Version] ("PS:<SYSTEM>MONITR.EXE,4").   VMS used a similar naming scheme, but instead of < and > characters it used [ and ] (and VMS used ";" to differentiate between versions of files).  *nix defines hierarchical paths with a simple hierarchy rooted at "/" - in *nix's naming hierarchy, there's no way of differentiating between files and directories, etc (this isn't bad, btw, it just is).

    For MS-DOS 2.0, the designers of DOS chose a hybrid version - they already had support for drive letters from DOS 1.0, so they needed to continue using that.  And they chose to use the *nix style method of specifying a hierarchy - instead of calling the directory out in the filename (like VMS and the DEC-20), they simply made the directory and filename indistinguishable parts of the path.

    But there was a problem.  They couldn't use the *nix form of path separator of "/", because the "/" was being used for the switch character.

    So what were they to do?  They could have used the "." character like the DEC machines, but the "." character was being used to differentiate between file and extension.  So they chose the next best thing - the "\" character, which was visually similar to the "/" character.

    And that's how the "\" character was chosen.

    Here's a little known secret about MS-DOS.  The DOS developers weren't particularly happy about this state of affairs - heck, they all used Xenix machines for email and stuff, so they were familiar with the *nix command semantics.  So they coded the OS to accept either "/" or "\" character as the path character (this continues today, btw - try typing "notepad c:/boot.ini"  on an XP machine (if you're an admin)).  And they went one step further.  They added an undocumented system call to change the switch character.  And updated the utilities to respect this flag.
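
    (As a quick illustration of that "this continues today" point: the forgiveness extends through the Win32 APIs as well.  A minimal sketch - the file name here is just an example - showing that CreateFile happily takes forward slashes as path separators:)

    #include <windows.h>

    int main()
    {
        // Forward slashes are accepted as path separators by most Win32 path APIs.
        HANDLE h = CreateFileA("C:/Windows/win.ini", GENERIC_READ,
                               FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
        if (h != INVALID_HANDLE_VALUE)
            CloseHandle(h);
        return 0;
    }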

    And then they went and finished out the scenario:  They added a config.sys option, SWITCHAR= that would let you set the switch character to "-".

    Which flipped MS-DOS into a *nix style system where command lines used "-switch", and paths were / delimited.

    I don't know the fate of the switchar API; it's been gone for many years now.

     

    So that's why the path character is "\".  It's because "/" was taken.

    Edit: Fixed title - it's been bugging me all week.

     

  • Larry Osterman's WebLog

    What’s up with the Beep driver in Windows 7?

    • 93 Comments

    Earlier today, someone asked me why 64bit versions of windows don’t support the internal PC speaker beeps.  The answer is somewhat complicated and ends up being an interesting intersection between a host of conflicting tensions in the PC ecosystem.

     

    Let’s start by talking about how the Beep hardware worked way back in the day[1].  The original IBM PC contained an Intel 8254 programmable interval timer chip to manage the system clock.  Because the IBM engineers felt that the PC needed to be able to play sound (but not particularly high quality sound), they decided that they could use the 8254 as a very primitive square wave generator.  To do this, they programmed the 3rd timer on the chip to operate in Square Wave mode and to count down with the desired output frequency.  This caused the Out2 line on the chip to toggle from high to low every time the clock went to 0.  The hardware designers tied the Out2 line on the chip to the PC speaker and voila – they were able to use the clock chip to program the PC speaker to make a noise (not a very high quality noise but a noise nonetheless).
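
    For the curious, here's a rough sketch of what that programming looked like.  This is DOS-era (or kernel driver) port I/O - the outp/inp helpers below are stand-ins for whatever port-access mechanism the code actually used, so treat it as an illustration rather than something you can run from user mode today:

    // Hypothetical port-I/O helpers; real DOS-era code used compiler intrinsics or raw
    // IN/OUT instructions, and user-mode Windows code can't touch these ports at all.
    extern void outp(unsigned short port, unsigned char value);
    extern unsigned char inp(unsigned short port);

    // Program 8254 channel 2 as a square wave generator and gate it to the speaker.
    // The 8254's input clock is ~1,193,182 Hz; the divisor sets the output frequency.
    void SpeakerTone(unsigned frequencyHz)
    {
        unsigned divisor = 1193182 / frequencyHz;

        outp(0x43, 0xB6);                   // channel 2, lo/hi byte access, mode 3 (square wave)
        outp(0x42, divisor & 0xFF);         // divisor, low byte
        outp(0x42, (divisor >> 8) & 0xFF);  // divisor, high byte

        outp(0x61, inp(0x61) | 0x03);       // enable the timer gate and the speaker output
    }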

    The Beep() Win32 API is basically a thin wrapper around this 8254 timer functionality.  So when you call the Beep() API, you program the 8254 to play sounds on the PC speaker.
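
    The API itself couldn't be much simpler.  A minimal sketch of playing a quick tone - on older systems this is what poked the 8254, and on Windows 7 the same call gets re-routed through the audio stack, as described below:

    #include <windows.h>

    int main()
    {
        // Frequency in hertz, duration in milliseconds.
        Beep(750, 300);
        return 0;
    }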

     

    Fast forward about 25 years…  The PC industry has largely changed and the PC architecture has changed with it.  At this point the 8254 isn’t actually used as the system’s interval timer any more, but it’s still in modern PCs.  And that’s because the 8254 is still used to drive the PC speaker.

    One of the other things that happened in the intervening 25 years was that machines got a whole lot more capable.  Now machines come with capabilities like newfangled hard disk drives (some of which can even hold more than 30 megabytes of storage (but I don’t know why on earth anyone would ever want a hard disk that can hold that much stuff)).  And every non server machine sold today has a PC sound card.  So every single machine sold today has two ways of generating sounds – the PC sound card and the old 8254 which is tied to the internal PC speaker (or to a dedicated input on the sound card – more on this later).

     

    There’s something else that happened in the past 25 years.  PCs became commodity systems.  And that started exerting a huge amount of pressure on PC manufacturers to cut costs.  They looked at the 8254 and asked “why can’t we remove this?”

    It turns out that they couldn’t.  And the answer to why they couldn’t came from a totally unexpected place: the Americans with Disabilities Act.

     

    The ADA?  What on earth could the ADA have to do with a PC making a beep?   Well it turns out that at some point in the intervening 25 years, the Win32 Beep() API was used for assistive technologies – in particular, the sounds made when you enable assistive technologies like StickyKeys were generated using the Beep() API.   There are about 6 different assistive technology (AT) sounds built into Windows; their implementation is plumbed fairly deep inside the win32k.sys driver.

    But why does that matter?  Well it turns out that many enterprises (both governments and corporations) have requirements that prevent them from purchasing equipment that lacks accessible technologies and that meant that you couldn’t sell computers that didn’t have beep hardware to those enterprises.

     

    This issue was first noticed when Microsoft was developing the first 64bit version of Windows.  Because the original 64bit Windows was intended for servers, the hardware requirements for 64bit machines didn’t include support for an 8254 (apparently the AT requirements are relaxed on servers).  But when we started building a client 64bit OS, we had a problem – client OSes had to support AT, so we needed to bring the beep back even on machines that didn’t have beep hardware.

    For Windows XP this was solved with some custom code in winlogon which worked but had some unexpected complications (none of which are relevant to this discussion).  For Windows Vista, I redesigned the mechanism to move the accessibility beep logic to a new “user mode system sounds agent”. 

    Because the only machines with this problem were 64bit machines, this functionality was restricted to 64bit versions of Windows. 

    That in turn meant that PC manufacturers still had to include support for the 8254 hardware – after all if the user chose to buy the machine with a 32bit operating system on it they might want to use the AT functionality.

    For Windows 7, we resolved the issue completely – we moved all the functionality that used to be contained in Beep.Sys into the user mode system sounds agent.  Now when you call the Beep() API, instead of manipulating the 8254 chip, the call is re-routed into a user mode agent which actually plays the sounds.

     

    There was another benefit associated with this plan: Remember above when I mentioned that the 8254 output line was tied to a dedicated input on the sound card?  Because of this input to the sound card, the sound hardware needed to stay powered on at full power all the time because the system couldn’t know when an application might call Beep and thus activate the 8254 (there’s no connection between the 8254 and the power management infrastructure so the system can’t power on the sound hardware when someone programs the 3rd timer on the 8254).  By redirecting the Beep calls through the system audio hardware the system was able to put the sound hardware to sleep until it was needed.

     

    This redirection also had a couple of unexpected benefits.  For instance, when you accidentally type (or grep) through a file containing 0x07 characters (like a .obj file), you can finally turn off the annoying noise – since the beeps are played through the PC speakers, the PC mute key works to shut them up.  It also means that you can now control the volume of the beeps.

    There were also some unexpected consequences.  The biggest was that people started noticing when applications called Beep().  They had placed their PCs far enough away (or there was enough ambient noise) that they had never noticed when their PC was beeping at them until the sounds started coming out of their speakers.

     

     

    [1] Thus providing me with a justification to keep my old Intel component data catalogs from back in the 1980s.

  • Larry Osterman's WebLog

    NextGenHacker101 owes me a new monitor

    • 102 Comments

    Because I just got soda all over my current one…

    One of the funniest things I’ve seen in a while. 

     

    And yes, I know that I’m being cruel here and I shouldn’t make fun of the kid’s ignorance, but he is SO proud of his new discovery and is so wrong in his interpretation of what actually is going on…

     

     

     

    For my non net-savvy readers: The “tracert” command lists the route that packets take from the local computer to a remote computer.  So if I want to find out what path a packet takes from my computer to www.microsoft.com, I would issue “tracert www.microsoft.com”.  This can be extremely helpful when troubleshooting networking problems.  Unfortunately the young man in the video had a rather different opinion of what the command did.

  • Larry Osterman's WebLog

    What's up with Audio in Windows Vista?

    • 43 Comments

    Steve Ball (the GPM for the MediaTech group (of which Windows Audio is a part)) discussed some of these changes in the Windows Audio Channel 9 video, but I'd like to spend a bit more time talking about what we've done.

    A lot of what I'm discussing is on the video, but what the heck - I've got a blog, and I need to have some content to fill in the white space, so...

     

    The Windows audio system debuted in Windows 3.1 with the "Multimedia Extensions for Windows", or MME APIs.  Originally, only one application at a time could play audio; that was because the original infrastructure didn't have support for tracking or mixing audio streams (this is also why old audio apps like sndrec32 pop up an error indicating that another application is using the audio device when they encounter any error).

    When Windows 95 (and NT 3.1) came out, the MME APIs were stretched to 32 bits, but the basic infrastructure didn't change - only one application could play audio at one time.

    For Windows 98, we deployed an entirely new audio architecture, based on the Windows Driver Model, or WDM.  As a part of that architectural change, we added the ability to mix audio streams - finally you could have multiple applications rendering audio at the same time.

    There have been numerous changes to the audio stack over the years, but the core audio architecture has remained the same until Vista.

    Over the years, we've realized that there are three major problem areas with the existing audio infrastructure:

    1. The amount of code that runs in the kernel (coupled with buggy device drivers) causes the audio stack to be one of the leading causes of Windows reliability problems.
    2. It's also become clear that while the audio quality in Windows is just fine for normal users, pro-audio enthusiasts are less than happy with the native audio infrastructure.  We've made a bunch of changes to the infrastructure to support pro-audio apps, but those were mostly focused around providing mechanisms for those apps to bypass the audio infrastructure.
    3. We've also come to realize that the tools for troubleshooting audio problems aren't the greatest - it's just too hard to figure out what's going on, and the UI (much of which comes from Windows 3.1) is flat-out too old to be useful.

    Back in 2002, we decided to make a big bet on Audio for Vista and we committed to fixing all three of the problems listed above.

    The first (and biggest) change we made was to move the entire audio stack out of the kernel and into user mode.  Pre-Vista, the audio stack lived in a bunch of different kernel mode device drivers, including sysaudio.sys, kmixer.sys, wdmaud.sys, redbook.sys, etc.  In Vista and beyond, the only kernel mode drivers for audio are the actual audio drivers (and portcls.sys, the high level audio port driver).

    The second major change we made was a totally revamped UI for audio.  Sndvol32 and mmsys.cpl were completely rewritten (from scratch) to include new, higher quality visuals, and to focus on the common tasks that users actually need to do.  All the old functionality is still there, but for the most part, it's been buried deep below the UI.

    The infrastructure items I mentioned above are present in Vista Beta1, unfortunately the UI improvements won't be seen by non Microsoft people until Vista Beta2.

  • Larry Osterman's WebLog

    What are these "Threading Models" and why do I care?

    • 28 Comments

    Somehow it seems like it’s been “Threading Models” week, another example of “Blogger synergy”.  I wrote this up for internal distribution to my group about a year ago, and I’ve been waiting for a good time to post it.  Since we just hit another instance of the problem in my group yesterday, it seemed like a good time.

     

    So what is this thing called a threading model anyway?

    Ok.  So the COM guys had this problem.  NT supports multiple threads, but most developers, especially the VB developers at whom COM/ActiveX were targeted, are totally terrified by the concept of threading.  In fact, it’s very difficult to make thread-safe VB (or JS) applications, since those languages don’t support any kind of threading concepts.  So the COM guys needed to design an architecture that would allow for supporting these single-threaded objects and host them in a multi-threaded application.

    The solution they came up with was the concept of apartments.  Essentially each application that hosts COM objects holds one or more apartments.  There are two types of apartments, Single Threaded Apartments (STAs) and Multi Threaded Apartments (MTAs).  Within a given process there can be multiple STAs but there is only one MTA.

    When a thread calls CoInitializeEx (or CoInitialize), the thread tells COM which of the two apartment types it’s prepared to host.  To indicate that the thread should live in the MTA, you pass the COINIT_MULTITHREADED flag to CoInitializeEx.  To indicate that the thread should host an STA, either call CoInitialize or pass the COINIT_APARTMENTTHREADED flag to CoInitializeEx.
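
    As a concrete (if minimal) sketch of what that looks like in code - the UI/worker split here is just an illustration, not a requirement:

    #include <windows.h>
    #include <objbase.h>

    // A UI thread typically hosts an STA...
    void UiThreadInit()
    {
        HRESULT hr = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);  // or CoInitialize(NULL)
        if (FAILED(hr)) return;
        // ... create and use apartment-threaded objects, pump messages ...
        CoUninitialize();
    }

    // ...while a worker thread with no UI usually joins the process's single MTA.
    void WorkerThreadInit()
    {
        HRESULT hr = CoInitializeEx(NULL, COINIT_MULTITHREADED);
        if (FAILED(hr)) return;
        // ... create and use free-threaded objects ...
        CoUninitialize();
    }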

    A COM object’s lifetime is limited to the lifetime of the apartment that creates the object.  So if you create an object in an STA, then destroy the apartment (by calling CoUninitialize), all objects created in this apartment will be destroyed.

    Single Threaded Apartment Model Threads

    When a thread indicates that it’s going to be in a single threaded apartment, then the thread indicates to COM that it will host single threaded COM objects.  Part of the contract of being an STA is that the STA thread cannot block without running a windows message pump (at a minimum, if they block they must wait using MsgWaitForMultipleObjects – internally, COM uses windows messages to do inter-thread marshalling).

    The reason for this requirement is that COM guarantees that objects will be executed on the thread in which they were created regardless of the thread in which they’re called (thus the objects don’t have to worry about multi-threading issues, since they can only ever be called from a single thread).  Eric mentions “rental threaded objects”, but I’m not aware of any explicit support in COM for this.
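
    A hypothetical sketch of what "blocking without blocking" looks like on an STA thread - instead of a plain WaitForSingleObject, the thread waits with MsgWaitForMultipleObjects and keeps dispatching messages so COM's inter-thread marshalling can get through:

    #include <windows.h>

    // Wait for hEvent on an STA thread while still pumping window messages.
    void StaFriendlyWait(HANDLE hEvent)
    {
        for (;;)
        {
            DWORD wait = MsgWaitForMultipleObjects(1, &hEvent, FALSE, INFINITE, QS_ALLINPUT);
            if (wait == WAIT_OBJECT_0)
                break;                          // the event we were waiting for was signaled

            // A message arrived - dispatch it so COM's marshalled calls can run.
            MSG msg;
            while (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE))
            {
                TranslateMessage(&msg);
                DispatchMessage(&msg);
            }
        }
    }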

     

    Multi Threaded Apartment Model Threads

    Threads in the multi threaded apartment don’t have any restrictions – they can block using whatever mechanism they want.  If COM needs to execute a method on an object and no thread is blocked, then COM will simply spin up a new thread to execute the code (this is particularly important for out-of-proc server objects – COM will simply create new RPC threads to service the object as more clients call into the server).

    How do COM objects indicate which thread they work with?

    When an in-proc COM object is registered with OLE, the COM object creates the following registry key:

                HKCR\CLSID\{<Object class ID>}\InprocServer32

    The InprocServer32 key tells COM which DLL hosts the object (in the default value for the key), and via the ThreadingModel value it tells COM the threading model for the COM object.
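
    As an illustrative sketch of that registration (the CLSID below is made up, and real servers typically do this from DllRegisterServer or an installer), this writes both the default value and the ThreadingModel value:

    #include <windows.h>
    #include <wchar.h>

    // Hypothetical class ID, for illustration only.
    static const wchar_t kClsidKey[] =
        L"CLSID\\{12345678-1234-1234-1234-123456789ABC}\\InprocServer32";

    HRESULT RegisterInprocServer(const wchar_t *dllPath, const wchar_t *threadingModel)
    {
        HKEY hKey;
        LONG err = RegCreateKeyExW(HKEY_CLASSES_ROOT, kClsidKey, 0, NULL, 0,
                                   KEY_SET_VALUE, NULL, &hKey, NULL);
        if (err != ERROR_SUCCESS)
            return HRESULT_FROM_WIN32(err);

        // Default value: the DLL that hosts the object.
        RegSetValueExW(hKey, NULL, 0, REG_SZ, (const BYTE *)dllPath,
                       (DWORD)((wcslen(dllPath) + 1) * sizeof(wchar_t)));

        // ThreadingModel: "Apartment", "Free", "Both", or "Neutral" (see below).
        RegSetValueExW(hKey, L"ThreadingModel", 0, REG_SZ, (const BYTE *)threadingModel,
                       (DWORD)((wcslen(threadingModel) + 1) * sizeof(wchar_t)));

        RegCloseKey(hKey);
        return S_OK;
    }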

     

    There are essentially four legal values for the ThreadingModel value.  They are:

    Apartment

    Free

    Both

    Neutral

    Apartment Model objects.

    When a COM object is marked as being an “Apartment” threading model object, it means that the object will only run in an STA thread.  All calls into the object will be serialized by the apartment model, and thus it will not have to worry about synchronization.

    Free Model objects.

    When a COM object is marked as being a “Free” threading model object, it means that the object will run in the MTA.  There is no synchronization of the object.  When a thread in an STA wants to call into a free model object, then the STA will marshal the parameters from the STA into the MTA to perform the call. 

    Both Model objects.

    The “Both” threading model is an attempt at providing the best of both worlds.  An object that is marked with a threading model of “Both” takes on the threading model of the thread that created the object. 

    Neutral Model objects.

    With COM+, COM introduced the concept of a “Neutral” threading model.  A “Neutral” threading model object is one that totally ignores the threading model of its caller.

    COM objects declared as out-of-proc (with a LocalServer32 value under the class ID's key) are automatically considered to be in the multi-threaded apartment (more about that below).

    It turns out that COM’s enforcement of the threading model is not consistent.  In particular, when a thread that’s located in an STA calls into an object that was created in the MTA, COM does not enforce the requirement that the parameters be marshaled through a proxy object.   This can be a big deal, because it means that the author of COM objects can be lazy and ignore the threading rules – it’s possible to create a COM object that uses the “Both” threading model and, as long as the object is in-proc, there’s nothing that’ll check to ensure you didn’t violate the threading model.  However the instant you interact with an out-of-proc object (or call into a COM method that enforces apartment model checking), you’ll get the dreaded RPC_E_WRONG_THREAD error return.  The table here describes this in some detail.

    What about Proxy/Stub objects?

    Proxy/Stub objects are objects that are created by COM to handle automatically marshaling the parameters of the various COM methods to other apartments/processes.  The normal mechanism for registering Proxy/Stub objects is to let COM handle the registration by letting MIDL generate a dlldata.c file that is referenced during the proxy DLL’s initialization.

    When COM registers these proxy/stub objects, it registers the proxy/stub objects with a threading model of “Both”.  This threading model is hard-coded and cannot be changed by the application.

    What limitations are there that I need to worry about?

    The problem that we most often see occurs because of the Proxy/Stub objects.  Since the proxy/stub objects are registered with a threading model of “Both”, they take on the threading model of the thread that created the object.  So if a proxy/stub object is created in a single threaded apartment, it can only be executed in the apartment that created it.  The proxy/stub marshaling routines DO enforce the threading restriction I mentioned above, so applications learn about this when they unexpectedly get an RPC_E_WRONG_THREAD error return from one of their calls.  On the server side, the threading model of the object is set by the threading model of the caller of CoRegisterClassObject.  The good news is that the default ATL 7.1 behavior is to specify multi-threaded initialization unless otherwise specified (in other words, the ATL header files define _ATL_FREE_THREADED by default).

    How do I work around these limitations?

    Fortunately, this problem is a common problem, and to solve it COM provides a facility called the “Global Interface Table”.  The GIT is basically a singleton object that allows you to register an object with the GIT and it will then return an object that can be used to perform the call from the current thread.  This object will either be the original object (if you’re in the apartment that created the object) or it will be a proxy object that simply marshals the calls into the thread that created the object.

    If you have a COM proxy/stub object (or you use COM proxy/stub objects in your code), you need to be aware of when you’ll need to use the GIT to hold your object.

    To use the GIT: after you’ve called CoCreateInstance to create your COM object, call IGlobalInterfaceTable::RegisterInterfaceInGlobal to add the object to the global interface table.  This will return a “cookie” to you.  When you want to access the COM object, you first call IGlobalInterfaceTable::GetInterfaceFromGlobal to retrieve the interface.  When you’re done with the object, you call IGlobalInterfaceTable::RevokeInterfaceFromGlobal.
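
    Here's a rough sketch of that dance (error handling trimmed, and the interface being shared is whatever riid you pass in); it isn't production code, just the shape of the calls:

    #include <windows.h>
    #include <objbase.h>

    // On the thread/apartment that created the object: register it and hand the cookie around.
    DWORD ShareObjectAcrossApartments(IUnknown *pObject, REFIID riid)
    {
        IGlobalInterfaceTable *pGIT = NULL;
        CoCreateInstance(CLSID_StdGlobalInterfaceTable, NULL, CLSCTX_INPROC_SERVER,
                         IID_IGlobalInterfaceTable, (void **)&pGIT);

        DWORD cookie = 0;
        pGIT->RegisterInterfaceInGlobal(pObject, riid, &cookie);
        pGIT->Release();
        return cookie;
    }

    // On some other thread/apartment: unwrap the cookie, use the object, revoke when done.
    void UseObjectFromAnotherApartment(DWORD cookie, REFIID riid)
    {
        IGlobalInterfaceTable *pGIT = NULL;
        CoCreateInstance(CLSID_StdGlobalInterfaceTable, NULL, CLSCTX_INPROC_SERVER,
                         IID_IGlobalInterfaceTable, (void **)&pGIT);

        // Returns either the original pointer or a proxy, depending on the caller's apartment.
        IUnknown *pProxy = NULL;
        if (SUCCEEDED(pGIT->GetInterfaceFromGlobal(cookie, riid, (void **)&pProxy)))
        {
            // ... call methods through pProxy ...
            pProxy->Release();
        }

        pGIT->RevokeInterfaceFromGlobal(cookie);  // once nobody needs the object any more
        pGIT->Release();
    }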

    In our case, we didn’t feel that pushing the implementation details of interacting with the global interface table to the user was acceptable, so we actually wrote an in-proc object that wraps our out-of-proc object. 

    Are there other problems I need to worry about?

    Unfortunately, yes.  Since the lifetime of a COM object is scoped to the lifetime of the apartment that created the object, this means that when the apartment goes away, the object will go away.  This will happen even if the object is referenced from another thread.  If the object in question is a local object, this really isn’t that big a deal since the memory backing the object won’t go away.  If, however the object is a proxy/stub object, then the object will be torn down post-haste.  The global interface table will not help this problem, since it will remove all the entries in the table that were created in the apartment that’s going away.

    Additional resources:

    The MSDN article Geek Speak Decoded #7 (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dngeek/html/geekthread.asp) also has some detail on how this stuff works (although it’s somewhat out-of-date).

     

  • Larry Osterman's WebLog

    Units of measurement

    • 17 Comments

     Whenever people find out that I work at Microsoft, invariably the next question they ask is “Have you met Bill?” (The next question is: “So what’s with the stock?” – as if I had a magic 8-ball to tell them).

    I’ve met Bill socially a couple of times (at various company functions); he doesn’t know who I am though :).  But there was one memorable meeting I attended with him.

    It was back in 1986ish; we were presenting the plans for Lan Manager 1.0 to him.  One portion of the meeting was about my component, DOS Lan Manager (basically an enhanced version of the MS-NET redirector, with support for a fair number of the Lan Manager APIs on the client).  My boss and I were given the job of presenting the data for that portion.

    One of the slides (not Powerpoint, it didn’t exist at the time – Lucite slides on an overhead projector) we had covered the memory footprint of the DOS Lan Manager redirector.

    For DOS LM 1.0, the redirector took up 64K of RAM.

    And Bill went ballistic.

    “What do you mean 64K?  When we wrote BASIC, it only took up 8K of RAM.  What the f*k do you idiots think you're doing?  Is this thing REALLY 8 F*ing BASIC’s?”

    The only answer we could give him was “Yes” :).

    To this day, I sometimes wonder if he complains that Windows XP is “16,000 F*ing BASIC’s”.

    Edit: To add what we finally did with DOS Lan Manager's memory footprint. 

    We didn't ignore Bill's comment, btw.  We worked on reducing the footprint of the DOS redirector, first by moving the data into LIM Expanded memory, and then by moving the code into expanded memory as well.  For LAN Manager 2.1, we finally managed to reduce the below-640K footprint of the DOS redirector to 128 bytes.  It took a lot of work, and some truly clever programming, but it did work.

     

  • Larry Osterman's WebLog

    Wait, that was MY bug? Ouch!

    • 100 Comments

    Over the weekend, the wires were full of reports of a speech recognition demo at Microsoft's Financial Analysts Meeting here in Seattle that went horribly wrong.

    Slashdot had it, Neowin had it,  Digg had it, Reuters had it.  It was everywhere.

    And it was all my fault.

     

    Well, mostly.  Rob Chambers on the speech team has already written about this, here's the same problem from my side of the fence.

    About a month ago (more-or-less), we got some reports from an IHV that sometimes when they set the volume on a capture stream the actual volume would go crazy (crazy, for those that don't know, is a technical term).  Since volume is one of the areas in the audio subsystem that I own, the bug landed on my plate.  At the time, I was overloaded with bugs, so another of the developers on the audio team took over the investigation and root caused the bug fairly quickly.  The annoying thing about it was that the bug wasn't reproducible - every time he stepped through the code in the debugger, it worked perfectly, but it kept failing when run without any traces.

     

    If you've worked with analog audio, it's pretty clear what's happening here - there's a timing issue that is causing a positive feedback loop that resulted from a signal being fed back into an amplifier.

    It turns out that one of the common causes of feedback loops in software is a concurrency issue with notifications - a notification is received with new data, which updates a value, updating the value causes a new notification to be generated, which updates a value, updating the value causes a new notification, and so-on...

    The code actually handled most of the feedback cases involving notifications, but there were two lower level bugs that complicated things.  The first bug was that there was an incorrect calculation that occurred when handling one of the values in the notification, and the second was that there was a concurrency issue - a member variable that should have been protected wasn't (I'm simplifying what actually happened, but this suffices). 

     

    As a consequence of these two very subtle low level bugs, the speech recognition engine wasn't able to correctly control the gain on the microphone - when it tried, it hit the notification feedback loop, which caused the microphone to clip, which meant that the samples being received by the speech recognition engine weren't accurate.

    There were other contributing factors to the problem (the bug was fixed on more recent Vista builds than the one they were using for the demo, there were some issues with the way the speech recognition engine had been "trained", etc), but it doesn't matter - without those two bugs, the problem wouldn't have been nearly as significant.

    Mea Culpa.

  • Larry Osterman's WebLog

    Hungarian notation - it's my turn now :)

    • 36 Comments

    Following on the heels of Eric Lippert’s posts on Hungarian and of course Rory Blyth’s classic “Die, Hungarian notation… Just *die*”, I figured I’d toss my hat into the fray (what the heck, I haven’t had a good controversial post in a while).

    One thing to keep in mind about Hungarian is that there are two totally different Hungarian implementations out there.

    The first one, which is the one that most people know about, is “Systems Hungarian”.  Systems Hungarian is also “Hungarian-as-interpreted-by-Scott-Ludwig” (Edit: For Scott's side of this comment, see here - the truth is better than my original post).  In many ways, it’s a bastardization of “real” (or Apps) Hungarian as proposed by Charles Simonyi.

    Both variants of Hungarian have two things in common.  The first is the concept of a type-related prefix, and the second is a suffix (although the Systems Hungarian doesn’t use the suffix much (if at all)).  But that’s where the big difference lies.

    In Systems Hungarian, the prefix for a type is almost always related to the underlying data type.  So a parameter to a Systems Hungarian function might be “dwNumberOfBytes” – the “dw” prefix indicates that the type of the parameter is a DWORD, and the “name” of the parameter is “NumberOfBytes”.  In Apps Hungarian, the prefix is related to the USE of the data.  The same parameter in Apps Hungarian is “cb” – the “c” prefix indicates that the parameter is a count, and the “b” indicates that it’s a count of bytes.

    Now consider what happens if the parameter is the number of characters in a string.  In Systems Hungarian, the parameter might be “iMaxLength”.  It might be “cchWideChar”.  There’s no consistency between different APIs that use Systems Hungarian.  But in Apps Hungarian, there is only one way of representing the parameter; the parameter would be “cch” – the “c” prefix again indicates a count, the “ch” type indicates that it’s a character.

    Now please note that most developers won’t use “cch” or “cb” as parameters to their routines in Apps Hungarian.  Let’s consider the Win32 lstrcpyn function:

    LPTSTR lstrcpyn(
        LPTSTR lpString1,
        LPCTSTR lpString2,
        int iMaxLength
    );

    This is the version in Systems Hungarian.  Now, the same function in Apps Hungarian:

    LPTSTR Szstrcpyn(
        LPTSTR szDest,
        LPCTSTR szSrc,
        int cbLen
    );

    Let’s consider the differences.  First off, the name of the function changed to reflect the type returned by the function – since it returns an LPTSTR, which is a variant of a string, the function name changed to “SzXxx”.  Second, the names of the first two parameters changed.  Instead of “lpString1” and “lpString2”, they became the more descriptive “szDest” and “szSrc”.  The “sz” prefix indicates that the variable is a null terminated string.  The “Src” and “Dest” are standard suffixes, which indicate the “source” and “destination” of the operation.  The iMaxLength parameter which indicates the number of bytes to copy is changed to cbLen – the “cb” prefix indicates that it’s a count of bytes, the standard “Len” suffix indicates that it’s a length to be copied.

    The interesting thing that happens when you convert from Systems Hungarian to Apps Hungarian is that now the usage of all the parameters of the function becomes immediately clear to the user.  Instead of the parameter name indicating the type (which is almost always uninteresting), the parameter name now contains indications of the usage of the parameter.

    The bottom line is that when you’re criticizing Hungarian, you need to understand which Hungarian you’re really complaining about.  Hungarian as defined by Simonyi isn’t nearly as bad as some have made it out to be.

    This is not to say that Apps Hungarian was without issue.  The original Hungarian specification was written by Doug Klunder in 1988.  One of the things that was missing from that document was a discussion about the difference between “type” and “intent” when defining prefixes.  This can be a source of great confusion when defining parameters in Hungarian.  For example, if you have a routine that takes a pointer to a “foo” parameter, and internally the routine treats the parameter as a single pointer to a foo, it’s clear that the parameter name should be “pfoo”.  However, if the routine treats the parameter as an array of foo’s, the original document was not clear about what should happen – should the parameter be “pfoo” or “rgfoo”?  Which wins, intent or type?  To me, there’s no argument, it should be intent, but there have been some heated debates about this over the years.  The current Apps Hungarian document is quite clear about this: intent wins.

    One other issue with the original document was that it predated C++.  So concepts like classes weren’t really covered and everyone had to come up with their own standard.  At this point those issues have been resolved.  Classes don’t have a “C” prefix, since a class is really just a type.  Members have “m_” prefixes before their actual name.  There are a bunch of other standard conventions but they’re relatively unimportant.

    I used Hungarian exclusively when I was in the Exchange team; my boss was rather a Hungarian zealot and he insisted that we code in strict Apps Hungarian.  Originally I chafed at it, having always assumed that Hungarian was stupid, but after using it for a couple of months, I started to see how it worked.  It certainly made more sense than the Hungarian I saw in the Systems division.  I even got to the point where I could understand what an irgch was without even flinching.

    Now, having said all that, I don’t use Hungarian these days.  I’m back in the systems division, and I’m using a home-brewed coding convention that’s based on the CLR standards, with some modifications I came up with myself (local variables are camel cased, parameters are Pascal cased (to allow easy differentiation between parameters and local variables), class members start with _ as a prefix, globals are g_Xxx).  So far, it’s working for me.

    I’ve drunk the kool-aid from both sides of the Hungarian debate though, and I’m perfectly happy working in either camp.

     

  • Larry Osterman's WebLog

    Early Easter Eggs

    • 65 Comments

    Jensen Harris's blog post today talked about an early Easter Egg he found in the Radio Shack TRS-80 Color Computer BASIC interpreter.

     

    What's not widely known is that there were Easter Eggs in MS-DOS.  Not many, but some did slip in.  The earliest one I know of was one in the MS-DOS "Recover" command.

    The "Recover" command was an "interesting" command.

    As it was explained to me, when Microsoft added support for hard disks (and added a hierarchical filesystem to the operating system), the powers that be were worried that people would "lose" their files (by forgetting where they put them).

    The "recover" command was an attempt to solve this.  Of course it "solved" the problem by using the "Take a chainsaw to carve the Sunday roast" technique.

    You see, the "Recover" command flattened your hard disk - it moved all the files from all the subdirectories on your hard disk into the root directory.  And it renamed them to be FILE0001.REC to FILE<n>.REC.

    If someone ever used it, their immediate reaction was "Why on earth did those idiots put such a useless tool in the OS, now I've got to figure out which of these files is my file, and I need to put all my files back where they came from".  Fortunately Microsoft finally removed it from the OS in the MS-DOS 5.0 timeframe.

     

    Before it flattened your hard disk, it helpfully asked you if you wanted to continue (Y/N)?.

    Here's the Easter Egg: On MS-DOS 2.0 (only), if you hit "CTRL-R" at the Y/N prompt, it would print out the string "<developer email alias> helped with the new DOS, Microsoft Rules!"

    To my knowledge, nobody ever figured out how to get access to this particular easter egg, although I do remember Peter Norton writing a column about it in PC-WEEK (he found the text of the easter egg by running "strings" on the recover.com binary).

    Nowadays, adding an easter egg to a Microsoft OS is immediate grounds for termination, so it's highly unlikely you'll ever see another.

     

    Somewhat later:  I dug up the documentation for the "recover" command - the version of the documentation I found indicates that the tool was intended to recover files with bad sectors in them - apparently if you specified a filename, it would create a new file in the current directory that contained all the clusters from the bad file that were readable.  If you specified just a drive, it did the same thing to all the files on the drive - which had the effect of wiping your entire disk.  So the tool isn't TOTALLY stupid, but it still was pretty surprising to me when I stumbled onto it on my test machine one day.

     

  • Larry Osterman's WebLog

    Beware of the dancing bunnies.

    • 57 Comments

    I saw a post the other day (I'm not sure where, otherwise I'd cite it) that proclaimed that a properly designed system didn't need any anti-virus or anti-spyware software.

    Forgive me, but this comment is about as intelligent as "I can see a worldwide market for 10 computers" or "no properly written program should require more than 128K of RAM" or "no properly designed computer should require a fan".

    The reason for this is buried in the subject of this post, it's what I (and others) like to call the "dancing bunnies" problem.

    What's the dancing bunnies problem?

    It's a description of what happens when a user receives an email message that says "click here to see the dancing bunnies".

    The user wants to see the dancing bunnies, so they click there.  It doesn't matter how much you try to dissuade them; if they want to see the dancing bunnies, then by gum, they're going to see the dancing bunnies.  It doesn't matter how many technical hurdles you put in their way; if those hurdles stand between the user and the dancing bunny, the user is going to find a way around them to see the dancing bunny.

    There are lots of techniques for mitigating the dancing bunny problem.  There's strict privilege separation - users don't have access to any locations that can harm them.  You can prevent users from downloading programs.  You can make the user invoke magic commands to make code executable (chmod +x dancingbunnies).  You can force the user to input a password when they want to access resources.  You can block programs at the firewall.  You can turn off scripting.  You can do lots, and lots of things.

    However, at the end of the day, the user still wants to see the dancing bunny, and they'll do whatever's necessary to bypass your carefully constructed barriers in order to see the bunny.

    We know that users will do whatever's necessary.  How do we know that?  Well, because at least one virus (one of the Beagle derivatives) propagated via a password encrypted .zip file.  In order to see the contents, the user had to open the zip file and type in the password that was contained in the email.  Users were more than happy to do that, even after years of education, and dozens of technological hurdles.

    All because they wanted to see the dancing bunny.

    The reason for a platform needing anti-virus and anti-spyware software is that it forms a final line of defense against the dancing bunny problem - at its heart, anti-virus software is software that scans every executable before it's loaded and prevents it from running if it looks like it contains a virus.

    As long as the user can run code or scripts, then viruses will exist, and anti-virus software will need to exist to protect users from them.

     

  • Larry Osterman's WebLog

    Fixing a customer problem: “No Audio Device is Installed” when launching sndvol on Windows Vista

    • 18 Comments

    Yesterday someone forwarded me an email from one of our DirectShow MVPs – he was having problems playing audio on his Windows Vista machine.

     

    Fortunately David (the MVP) had done most of the diagnostic work – the symptoms he saw were that he was receiving a “No Audio Device is Installed” error launching sndvol (and other assorted problems). 

    David tried the usual things (confirming that the driver for his audio solution was correctly installed (this probably fixes 99% of the problems)).  He also tried reinstalling the driver to no avail.

    He next ran the Sysinternals Process Monitor tool to see what was going on.  He very quickly found the following line in the output from process monitor:

    "RegOpenKey", "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\MMDevices\Audio\Render\{e4ee1234-fc70-4925-94e9-4117395f7995}", "ACCESS DENIED", "Desired Access: Write"

    With that information, he looked for the ACL on that registry key:

    [screenshot: the ACL on the MMDevices\Audio\Render registry key]

    He then looked at the configuration for the Windows Audio service:

    [screenshot: the Windows Audio service configuration, showing it running as Local Service]

    Woah – the Windows Audio service doesn’t have access rights to that registry key – the Windows Audio service is running as LocalService and the LocalService account doesn’t have any access to the registry key.

    At this point he decided to contact Microsoft with his problem.

    I looked at his info and quickly realized that the problem was that somehow the ACL on the registry key had been corrupted: something had removed the entries for the audio services.  On a normal Windows Vista installation this registry key’s ACL should look something like:

    [screenshot: the default ACL on the MMDevices registry key on a normal Windows Vista installation]

    Something that ran on David’s machine went in and reset the permissions for this registry key to the ACL that is on the root node of the HKEY_LOCAL_MACHINE\Software registry hive.  I have no idea what did this, but messing with the ACLs on the registry is a known cause of various compatibility problems.  That’s why Microsoft KB 885409 has such strong warnings about why it’s important to not apply blind modifications to files or registry keys in Windows.  It’s unfortunate, but the warnings in the KB articles that say that modifying registry keys or permissions can cause your machine to malfunction are absolutely right – it’s not hard to make modifications to registry keys that can really screw up a machine, if you make the right ones.  From the KB article:

    For example, modifications to registry ACLs affect large parts of the registry hives and may cause systems to no longer function as expected. Modifying the ACLs on single registry keys poses less of a problem to many systems. However, we recommend that you carefully consider and test these changes before you implement them. Again, we can only guarantee that you can return to the recommended out-of-the-box settings if you reformat and reinstall the operating system.

    The good news is that it should be relatively simple to fix David’s problem – As far as I know, he has two options.  The first is to reinstall Windows Vista – that should reset the ACLs on the property key to their default values (because it will recreate the property keys), which should resolve the problem.

    The second solution is to add an ACL to the registry keys under the MMDevices registry key to allow the LocalService account to have permissions to modify this registry key.
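
    For what it’s worth, here’s a minimal sketch of what that second option might look like in code (it isn’t the exact fix we gave David, and it needs to be run elevated).  It reads the existing DACL on the Render key and merges in an entry granting the LocalService account access:

    #include <windows.h>
    #include <aclapi.h>

    int wmain()
    {
        // The key whose DACL lost its entry for the audio service.  SE_REGISTRY_KEY
        // object names use the "MACHINE" prefix for HKEY_LOCAL_MACHINE.
        wchar_t keyName[] =
            L"MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\MMDevices\\Audio\\Render";

        PACL oldDacl = NULL, newDacl = NULL;
        PSECURITY_DESCRIPTOR sd = NULL;

        // Read the existing DACL so we add to it rather than replace it.
        DWORD err = GetNamedSecurityInfoW(keyName, SE_REGISTRY_KEY,
                                          DACL_SECURITY_INFORMATION,
                                          NULL, NULL, &oldDacl, NULL, &sd);
        if (err != ERROR_SUCCESS)
            return (int)err;

        // Build an ACE granting LocalService access to the key and its children.
        EXPLICIT_ACCESSW ea = {};
        ea.grfAccessPermissions = KEY_ALL_ACCESS;
        ea.grfAccessMode = GRANT_ACCESS;
        ea.grfInheritance = CONTAINER_INHERIT_ACE;
        ea.Trustee.TrusteeForm = TRUSTEE_IS_NAME;
        ea.Trustee.TrusteeType = TRUSTEE_IS_USER;
        ea.Trustee.ptstrName = (LPWSTR)L"NT AUTHORITY\\LOCAL SERVICE";

        err = SetEntriesInAclW(1, &ea, oldDacl, &newDacl);
        if (err == ERROR_SUCCESS)
        {
            // Write the merged DACL back onto the key.
            err = SetNamedSecurityInfoW(keyName, SE_REGISTRY_KEY,
                                        DACL_SECURITY_INFORMATION,
                                        NULL, NULL, newDacl, NULL);
        }

        if (newDacl) LocalFree(newDacl);
        if (sd) LocalFree(sd);
        return (int)err;
    }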

  • Larry Osterman's WebLog

    Vista Ship Gift, Part 2

    • 23 Comments

    It's a Microsoft tradition that the people who worked on a project get a copy of the project when it ships.  I've got copies of OS/2 1.1, NT 3.1, Exchange 4.0, 5.0, 5.5 and 2000 on my shelves, for example, all with their shrinkwrap untouched.

    Well, my copy of Vista finally showed up yesterday, and the ship gift people totally outdid themselves this time.

    Here it is.

    Front:

    Back:

     

    Side:

     You probably can't make it out in the pictures, but across the front and back is a subtle wash consisting of code - don't know what code it is, but it's code.

    On the front is the word "HANDCRAFTED" and the Vista logo

    On the back is the text:

    "BY YOU

    We build software line by line, idea by idea, side by side.  Our software is an expression of ourselves, our best moments, our toughest challenges, our greatest hopes.  So it's a strange and beautiful day when this handcrafted product leaves our labs and appears on millions of computers around the globe.  Remember this day.  You have changed the world."

    Inside the fold is a collection of pictures, some from the ship party, some from inside Microsoft:

    I normally don't open the packaging on my ship gifts but in this case I made an exception, because again, this one was special.

    Front:

    Back:

     Side:

     The text on the back reads:

    "FOR ALL THE...

    Delighted customers, great ideas, tough deadlines, clever solutions, lines of code, pages of specs, runs of automation, lines of text, screens of UI, missed dinners, fixed bugs, inspiring teamwork, countless iterations, courage to break the rules, time away from loved ones, times you rose to the occasion, late nights, early mornings, delayed vacations, chances you took, long meetings, short meetings, canceled meetings, killed features, features that wouldn't die, crashed machines, moments of victory, moments of defeat, coffees, doughnuts, pizzas, beers, relentless dedication, blood, sweat, and tears.

    THANK YOU"

    And I want to thank whoever it was on the product team that designed this packaging.  It's absolutely awesome, and I think it totally captures the effort that went into Vista.  I especially love the text on the inside package.

    It's funny - when the commemorative edition started showing up, I noticed something unique. In my 22+ years at Microsoft, I've NEVER seen people take the "thank you" copy of the product out and show it to others.  But when we got this copy, there were lots of people walking around the halls showing the box off to others.  Every one of them called out the text on the package as being meaningful.

  • Larry Osterman's WebLog

    Why no Easter Eggs?

    • 35 Comments

    Yesterday's post caused a bit of a furor in the comments thread.  A large number of people leaving comments (and others) didn't understand why the OS division has a "no Easter Eggs" policy.

    If you think about this, it's not really that surprising.  One of the aspects of Trustworthy Computing is that you can trust what's on your computer.  Part of that means that there's absolutely NOTHING on your computer that isn't planned.  If the manufacturer of the software that's on every desktop in your company can't stop their developers from sneaking undocumented features into the product (even features as relatively benign as an Easter Egg), how can you be sure that they've not snuck some other undocumented feature into the code?

    Even mandating that you have access to the entire source code to the product doesn't guarantee that - first off, nobody in their right mind would audit all 10 million+ lines of code in the product before deployment, and even if you DID have the source code, that doesn't mean anything - Ken Thompson made that quite clear in his Turing Award lecture.  Once you've lost the trust of your customers, they're gone - they're going to find somewhere else to take their business.

    And there are LOTS of businesses and governments that have the sources to Microsoft products.  Imagine how they'd react if (and when) they discovered the code?  Especially when they were told that it was a "Special Surprise" for our users.  Their only reaction would be to wonder what other "Special Surprises" were in the code.

    It's even more than that.  What happens when the Easter Egg has a security bug in it?  It's not that implausible - the NT 3.1 Easter Egg had a bug in it - the easter egg was designed to be triggered when someone typed in I LOVE NT, but apparently it could also be triggered by any anagram of "I LOVE NT" - as a result, "NOT EVIL" was also a trigger.

    Going still further, Easter Eggs are perceived as a symptom of bloat, and lots of people get upset when they find them.  From Adequacy.org:

    Now if you followed the link above and read the article you may be thinking to yourself...
  • Is this what MS developers do when they should be concentrating on security?
  • How often do they audit their code?
  • What's to stop someone from inserting malicious code?
  • Is this why I pay so much for Windows and MS Office?
  • I know other non-MS software contains EEs but this is rediculous.
  • One more reason why peer review is better as EEs and malicious code can be removed quickly.
  • Is this why security patches takes so long to be released?
  • This is TrustWorthy Computing!?!
    From technofile:

    Even more disturbing is the vision of Microsoft as the purveyor of foolishness. Already, the cloying "Easter eggs" that Microsoft hides away in its software -- surprise messages, sounds or images that show off the skill of the programmers but have no usefulness otherwise -- are forcing many users to question the seriousness of Microsoft's management.
       A company whose engineers can spend dozens or even hundreds of hours placing nonsensical "Easter eggs" in various programs would seem to have no excuse for releasing Windows with any bugs at all. Microsoft's priorities are upside down if "Easter egg" frills and other non-essential features are more important than getting the basic software to work right.

    From Agathering.net:

    "and some users might like to know exactly why the company allows such huge eggs to bloat already big applications even further"

    I've been involved in Easter Eggs in the past - the Exchange 5.0 POP3 and NNTP servers had easter eggs in them.  In our case, we actually followed the rules - we filed a bug in the database ("Exchange POP3 server doesn't have an Easter Egg"), we had the PM write up a spec for it, the test lead developed test cases for it.  We even contacted the legal department to determine how we should reference the contingent staff that were included in the Easter Egg. 

    But it didn't matter - we still shouldn't have done it.  Why?  Because it was utterly irresponsible.  We didn't tell the customers about it, and that was unforgivable, ESPECIALLY in a network server.  What would have happened if there had been a buffer overflow or other security bug in the Easter Egg code?  How could we POSSIBLY explain to our customers that the reason we allowed a worm to propagate on the internet was because of the vanity of our developers?  Why on EARTH would they trust us in the future?

    Not to mention that we messed up.  Just like the NT 3.1 Easter Egg, we had a bug in our Easter Egg, and we would send the Easter Egg out in response to protocol elements other than the intended ones.  When I was discussing this topic with Raymond Chen, he pointed out that his real-world IMAP client hit this bug - and he was more than slightly upset at us for it.

     

    It's about trust.  It's about being professional.  Yeah, it's cool seeing your name up there in lights.  It's cool when developers get a chance to let loose and show their creativity.  But it's not cool when doing it costs us the trust of our customers.

     

    Thanks to Raymond, KC and Betsy for their spirited email discussion that inspired this post, and especially to Raymond for the awesome links (and the dirt on my broken Easter Egg).

     

    Edit: Fixed some html weirdness.

    Edit2: s/anacronym/anagram/

  • Larry Osterman's WebLog

    Remember the blibbet

    • 36 Comments

    There was a thread on Channel9 that got me to remember the blibbet.

    "The blibbet?"  What on earth is a blibbet?

    The blibbet is the "O" in the 2nd Microsoft logo:

    Those of us who've been at Microsoft for a long time all have fond memories of the blibbet, which was cruelly killed off in a fit of corporate ire in 1987.  Not only was it our corporate logo, but if you look at Microsoft binders from the time (1982-1987), you'll see that the blibbet was used as a watermark on them.

    At one point, I had a "save the blibbet" button, but unfortunately, I can't seem to find it (otherwise I'd post a photo of it).

    Periodically you find reminders of the blibbet around campus.  For example, the cafeterias used to offer a "Blibbet Burger" - a double cheeseburger with bacon, IIRC.

    I miss the blibbet :)  Somehow, it reminds me of Herbie (or any other VW bug) - cute and friendly :)

    Edit: Added more after a little bird sent me the image..

    When Microsoft announced that they would be retiring the blibbet, a number of employees mounted a fruitless "Save the Blibbet" campaign to retain the corporate icon.

    Unfortunately, the suits won :(

     

  • Larry Osterman's WebLog

    What is this thing called, SID?

    • 24 Comments

    One of the core data structures in the NT security infrastructure is the security identifier, or SID.

    NT uses two data types to represent the SID, a PSID, which is just an alias for VOID *, and a SID, which is a more complicated structure (declared in winnt.h).

    The contents of a SID can actually be rather fascinating.  Here’s the basic SID structure:

    typedef struct _SID {
       BYTE  Revision;
       BYTE  SubAuthorityCount;
       SID_IDENTIFIER_AUTHORITY IdentifierAuthority;
       DWORD SubAuthority[ANYSIZE_ARRAY];
    } SID, *PISID;

    Not a lot there, but some fascinating stuff nonetheless.  First let’s consider the Revision.  That’s always set to 1 for existing versions of NT.  There may be a future version of NT that defines other values, but not yet.

    The next interesting field in a version 1 SID is the IdentifierAuthority.  The IdentifierAuthority is an array of 6 bytes which describes which system “owns” the SID.  Essentially, the IdentifierAuthority defines the meaning of the fields in the SubAuthority array, which is an array of DWORDs that is SubAuthorityCount in length (SubAuthorityCount can be any number between 1 and SID_MAX_SUB_AUTHORITIES, currently 15).  NT’s access check logic and SID validation logic treat the SubAuthority array as an opaque data structure, which allows a resource manager to define its own semantics for the contents of the SubAuthority (this is strongly NOT recommended, btw).

    The “good stuff” in the SID (the stuff that makes a SID unique) lives in the SubAuthority array in the SID.  Each entry in the SubAuthority array is known as a RID (for Relative ID), more on this later. 

    NT defines a string representation of the SID by constructing a string S-<Revision>-<IdentifierAuthority>-<SubAuthority0>-<SubAuthority1>-…-<SubAuthority<SubAuthorityCount-1>>.  For the purposes of constructing a string SID, the IdentifierAuthority is treated as a 48-bit number.  You can convert between the binary and string representations with the ConvertSidToStringSid and ConvertStringSidToSid APIs.
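    Here's a tiny sketch (mine, not from the original post - error checking kept to a minimum, as usual) that round-trips a well-known string SID through its binary form:

    #include <windows.h>
    #include <sddl.h>
    #include <stdio.h>

    int main()
    {
      // "S-1-5-32-544" is the well known SID for BUILTIN\Administrators.
      PSID sid = NULL;
      if (ConvertStringSidToSidA("S-1-5-32-544", &sid))
      {
        LPSTR sidString = NULL;
        if (ConvertSidToStringSidA(sid, &sidString))
        {
          printf("Round-tripped SID: %s\n", sidString);
          LocalFree(sidString);   // both conversion APIs allocate with LocalAlloc
        }
        LocalFree(sid);
      }
      return 0;
    }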

    NT defines seven IdentifierAuthorities:

    #define SECURITY_NULL_SID_AUTHORITY         {0,0,0,0,0,0}
    #define SECURITY_WORLD_SID_AUTHORITY        {0,0,0,0,0,1}
    #define SECURITY_LOCAL_SID_AUTHORITY        {0,0,0,0,0,2}
    #define SECURITY_CREATOR_SID_AUTHORITY      {0,0,0,0,0,3}
    #define SECURITY_NON_UNIQUE_AUTHORITY       {0,0,0,0,0,4}
    #define SECURITY_NT_AUTHORITY               {0,0,0,0,0,5}
    #define SECURITY_RESOURCE_MANAGER_AUTHORITY {0,0,0,0,0,9}

    Taken in turn, they are:

    • SECURITY_NULL_SID_AUTHORITY: The “NULL” Sid authority is used to hold the “null” account SID, or S-1-0-0.

    • SECURITY_WORLD_SID_AUTHORITY: The “World” Sid authority is used for the “Everyone” group; there’s only one SID in that group, S-1-1-0.

    • SECURITY_LOCAL_SID_AUTHORITY: The “Local” Sid authority is used for the “Local” group; again, there’s only one SID in that group, S-1-2-0.

    • SECURITY_CREATOR_SID_AUTHORITY: This Sid authority is responsible for the CREATOR_OWNER, CREATOR_GROUP, CREATOR_OWNER_SERVER and CREATOR_GROUP_SERVER well known SIDs: S-1-3-0, S-1-3-1, S-1-3-2 and S-1-3-3.
    The SIDs under the CREATOR_SID_AUTHORITY are sort-of “meta-SIDs”.  Basically, when ACL inheritance is run, any ACEs that are owned by the SECURITY_CREATOR_SID_AUTHORITY are replaced (duplicated if the ACEs are inheritable) by ACEs that reflect the relevant principal that is performing the inheritance.  So a CREATOR_OWNER ACE will be replaced by the owner SID from the token of the user that’s performing the inheritance.

    • SECURITY_NON_UNIQUE_AUTHORITY: Not used by NT.

    • SECURITY_RESOURCE_MANAGER_AUTHORITY: The “resource manager” authority is a catch-all that’s used for 3rd party resource managers.

    • SECURITY_NT_AUTHORITY: The big kahuna.  This describes accounts that are managed by the NT security subsystem.

    There are literally dozens of well known SIDs under the SECURITY_NT_AUTHORITY identifier authority.  They range from NETWORK (S-1-5-2), a group added to the token of all users connected to the machine via a network, to S-1-5-5-X-Y, the logon session SID (X and Y will be replaced by values specific to your logon session on that machine).

    Each domain controller allocates RIDs for that domain, each principal created gets its own RID.  In general, for NT principals, the SID for each user in a domain will be identical, except for the last RID (that’s why it’s a “relative” ID – the value in SubAuthority[n] is relative to SubAuthority[n-1]).  In Windows NT (before Win2000), RID allocation was trivial – user accounts could only be created at the primary domain controller (there was only one  PDC, with multiple backup domain controllers) so the PDC could manage the list of RIDs that was allocated easily.  For Windows 2000 and later, user accounts can be created on any domain controller, so the RID allocation algorithm is somewhat more complicated.

    Clearly a great deal of effort is made to ensure the uniqueness of SIDs; if SIDs did not uniquely identify a user, then “bad things” would happen.

    If you look in WINNT.H, you can find definitions for many of the RIDs for the built-in NT accounts.  To form a SID for one of those accounts, you’d initialize a SID with the SECURITY_NT_AUTHORITY and set the first SubAuthority to the RID of the desired account.  The good news is that, because this is an extremely tedious process, the NT security guys defined an API (in Windows XP and later) named CreateWellKnownSid which can be used to create any of the “standard” SIDs.
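    For example, here's a rough sketch (mine, not from the original post) of building the LocalSystem SID (S-1-5-18) both ways - once by hand from the RID in WINNT.H, and once with CreateWellKnownSid.  Error checking omitted to save space:

    #include <windows.h>
    #include <stdio.h>

    int main()
    {
      // The tedious way: SECURITY_NT_AUTHORITY plus the RID from winnt.h.
      SID_IDENTIFIER_AUTHORITY ntAuthority = SECURITY_NT_AUTHORITY;
      PSID byHand = NULL;
      AllocateAndInitializeSid(&ntAuthority, 1, SECURITY_LOCAL_SYSTEM_RID,
                               0, 0, 0, 0, 0, 0, 0, &byHand);

      // The easy way (Windows XP and later).
      BYTE wellKnown[SECURITY_MAX_SID_SIZE];
      DWORD sidSize = sizeof(wellKnown);
      CreateWellKnownSid(WinLocalSystemSid, NULL, wellKnown, &sidSize);

      printf("The two SIDs match: %s\n", EqualSid(byHand, wellKnown) ? "yes" : "no");

      FreeSid(byHand);
      return 0;
    }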

    Tomorrow: Some fun things you can do with a SID.

     

  • Larry Osterman's WebLog

    How do I change the master volume in Windows Vista

    • 32 Comments

    It's actually easier in Vista than it was in XP.  For Vista, we recognized that one of the key customer scenarios was going to be setting the master volume, and since we'd removed the old mechanism that was used to set the volume, we knew we had to provide an easier mechanism for Vista.

    Just for grins,  I threw together a tiny app that demonstrates it.  To save space, all error checking was removed.

    #include <stdio.h>
    #include <windows.h>
    #include <tchar.h>          // needed for _tmain, _TCHAR and _tstof
    #include <mmdeviceapi.h>
    #include <endpointvolume.h>

    void Usage()
    {
      printf("Usage: \n");
      printf(" SetVolume [Reports the current volume]\n");
      printf(" SetVolume -d <new volume in decibels> [Sets the current default render device volume to the new volume]\n");
      printf(" SetVolume -f <new volume as an amplitude scalar> [Sets the current default render device volume to the new volume]\n");

    }
    int _tmain(int argc, _TCHAR* argv[])
    {
      HRESULT hr;
      bool decibels = false;
      bool scalar = false;
      double newVolume = 0.0;
      if (argc != 3 && argc != 1)
      {
        Usage();
        return -1;
      }
      if (argc == 3)
      {
        if (argv[1][0] == '-')
        {
          if (argv[1][1] == 'f')
          {
            scalar = true;
          }
          else if (argv[1][1] == 'd')
          {
            decibels = true;
          }
        }
        else
        {
          Usage();
          return -1;
        }

        newVolume = _tstof(argv[2]);
      }

      // ------------------------- Part 2: get the volume control interface on the default render endpoint.
      CoInitialize(NULL);
      IMMDeviceEnumerator *deviceEnumerator = NULL;
      hr = CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_INPROC_SERVER, __uuidof(IMMDeviceEnumerator), (LPVOID *)&deviceEnumerator);
      IMMDevice *defaultDevice = NULL;

      hr = deviceEnumerator->GetDefaultAudioEndpoint(eRender, eConsole, &defaultDevice);
      deviceEnumerator->Release();
      deviceEnumerator = NULL;

      IAudioEndpointVolume *endpointVolume = NULL;
      hr = defaultDevice->Activate(__uuidof(IAudioEndpointVolume), CLSCTX_INPROC_SERVER, NULL, (LPVOID *)&endpointVolume);
      defaultDevice->Release();
      defaultDevice = NULL; 

      // ------------------------- Part 3: read the current volume, then set the new volume if one was requested.
      float currentVolume = 0;
      endpointVolume->GetMasterVolumeLevel(&currentVolume);
      printf("Current volume in dB is: %f\n", currentVolume);

      hr = endpointVolume->GetMasterVolumeLevelScalar(&currentVolume);
      printf("Current volume as a scalar is: %f\n", currentVolume);
      if (decibels)
      {
        hr = endpointVolume->SetMasterVolumeLevel((float)newVolume, NULL);
      }
      else if (scalar)
      {
        hr = endpointVolume->SetMasterVolumeLevelScalar((float)newVolume, NULL);
      }
      endpointVolume->Release();

      CoUninitialize();
      return 0;
    }

    This program has essentially 3 parts.  The first parses the command line, the second retrieves an endpoint volume interface on the default endpoint, the third retrieves the current volume and sets the volume.

    I'm going to ignore the first part; it's the same junk you'll see in any CS 101 class.

    The second part instantiates an MMDeviceEnumerator object which implements the IMMDeviceEnumerator interface.  The IMMDeviceEnumerator interface is the gateway object to the new audio subsystem - it can be used to enumerate audio endpoints and retrieve information about the various endpoints.  In this case, I'm only interested in the GetDefaultAudioEndpoint method, it returns an IMMDevice object that points to the current endpoint.

    Again, there are a bunch of things I can do with an IMMDevice object, but I'm only really interested in the "Activate" method.  The idea is that each MMDevice object supports lots of different interfaces, you "Activate" the interface to access the functionality associated with that object.  Again, in this case, I'm only interested in the IAudioEndpointVolume interface - there are other interfaces, like IDeviceTopology, and IAudioClient that can be activated from the endpoint.

    The IAudioEndpointVolume interface is where the good stuff lives, right now I'm only interested in four methods, which retrieve (and set) the current endpoint volume in either decibels or as a scalar value. 

    The decibels version of the IAudioEndpointVolume interface instructs the driver to set the desired master volume (input or output) to the decibel value specified; it's intended for applications that want exact control over the output dB value of the audio solution.

    The scalar version is a bit more complicated.  It's intended for use in applications that have volume sliders, and provides a linear volume taper (represented as a floating point value between 0.0 and 1.0).  In other words, the perceived volume when you set the scalar version of the API to .5 is twice as loud as when set to .25 and is half as loud as when set to 1.0.
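    For what it's worth, invoking the sample (assuming you build it as SetVolume.exe) looks something like this - the values it reports will obviously depend on your hardware:

    SetVolume            (reports the current volume in dB and as a scalar)
    SetVolume -f 0.5     (sets the default render device to the middle of the scalar range)
    SetVolume -d -10.0   (sets the default render device to -10 dB)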

  • Larry Osterman's WebLog

    We've RI'ed!!!

    • 55 Comments
    We've RI'ed!

    ??  What on earth is he talking about ??

    An RI is a "Reverse Integration".  The NT source system is built as a series of branches off of a main tree, and there are two sets of operations that occur - when a change is made to the trunk, the changes are "forward integrated" into the branches.  New feature development goes on in the branches, and when the feature is ready for "prime time", the work is "reverse integrated" back into the main tree, and those changes are subsequently forward integrated into the various other branches.

    The primary reason for this structure is to ensure that the trunk always has a high level of quality - the branches may be of varying quality levels, but the main trunk always remains defect free.

    Well, yesterday afternoon, our feature RI'ed into the main multimedia branch; this is the first step towards having our code in the main Windows product (which should happen fairly soon).

    When a feature is RI'ed into any of the main Windows branches, the code has to go through a series of what are called "Quality Gates".  The quality gates are in place to ensure a consistent level of engineering quality across the product - among other things, they ensure that the feature has up-to-date test and development specifications, an accurate and complete threat model, and that the tests for the feature have a certain level of code coverage.  There are a bunch of other gates beyond these, but they're related to internal processes that aren't relevant here.

    The quality gates may seem like a huge amount of bureaucracy to go through, and they can be difficult, but their purpose is really worthwhile - the quality gates are what ensures that no code is checked into the trunk that doesn't meet the quality bar for being a part of Windows.

    Our team's been working on this feature (no, I can't say what it is, yet :() for over three years, it's been a truly heroic effort on the part of everyone involved, but especially on the part of the group's development leads, Noel Cross and Alper Selcuk, who were at work at 2AM every day for most of the past three weeks ensuring that all the I's were dotted and the T's were crossed.

    This is SO cool.

    Edit: Cut&Paste error led to typo in Noel's name

     

  • Larry Osterman's WebLog

    Volume control in Vista

    • 25 Comments

    Before Vista, all of the volume controls available to applications were system-wide - when you changed the volume using the wave volume APIs, you changed the hardware volume, thus affecting all the applications in the system.  The problem with this is that for the vast majority of applications, this was exactly the wrong behavior.  This behavior was a legacy of the old Windows 3.1 audio architecture, where you could only have one application playing audio at a time.  In that situation, there was only one hardware volume, so the behavior made sense.

    When the WDM audio drivers were released for Win98, Microsoft added kernel mode audio mixing, but it left the volume control infrastructure alone.  The volume controls available to the Windows APIs remained the hardware volume controls.  The reason for this is pretty simple: volume control really needs to be per-application, but in the Win98 architecture there was no way of associating individual audio streams with a particular application; instead, audio streams were treated independently.

    The thing is, most applications REALLY wanted to just control the volume of their own audio streams.  They didn't want (or need) to mess with other apps' audio streams; that was just an unfortunate side effect of the audio architecture.

    For some applications, there were solutions.  For instance, if you used DirectSound (or DirectShow, which is layered on DirectSound), you could render your audio streams into a secondary buffer; since DSound secondary buffers have their own volume controls, that effectively made the volume control per-application.  But that did nothing to help the applications that didn't use DSound - they were stuck manipulating the hardware volume.

     

    For Vista, one of the things that was deployed as part of the new audio infrastructure was a component called "Audio Policy".  One of the tasks of the policy engine is tracking which audio streams belong to which application.

    For Vista, each audio stream is associated with an "audio session", and the audio session is roughly associated with a process (each process can have more than one audio session, and audio sessions can span multiple processes, but by default each audio session is the collection of audio streams being rendered by the process).

    Each audio session has its own volume control, and WASAPI exposes interfaces that allow applications to control the volume of their audio session.  The volume control API also includes a notification mechanism, so an application that wants to know when its session's volume changes (for instance, because someone else changed it) can find out.
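    To give a flavor of what that looks like, here's a rough sketch (mine, written against the shipping WASAPI headers, so the details may differ slightly from the Beta2 bits) of an application turning down the volume of its own audio session - error checking omitted:

    #include <windows.h>
    #include <mmdeviceapi.h>
    #include <audiopolicy.h>

    int main()
    {
      CoInitialize(NULL);

      IMMDeviceEnumerator *enumerator = NULL;
      CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_INPROC_SERVER, __uuidof(IMMDeviceEnumerator), (LPVOID *)&enumerator);

      IMMDevice *device = NULL;
      enumerator->GetDefaultAudioEndpoint(eRender, eConsole, &device);

      // IAudioSessionManager hands out the per-session (per-application) controls.
      IAudioSessionManager *sessionManager = NULL;
      device->Activate(__uuidof(IAudioSessionManager), CLSCTX_INPROC_SERVER, NULL, (LPVOID *)&sessionManager);

      // A NULL session GUID means "this process's default audio session".
      ISimpleAudioVolume *sessionVolume = NULL;
      sessionManager->GetSimpleAudioVolume(NULL, FALSE, &sessionVolume);

      // Set this application's session to half volume - other apps are unaffected.
      sessionVolume->SetMasterVolume(0.5f, NULL);

      sessionVolume->Release();
      sessionManager->Release();
      device->Release();
      enumerator->Release();
      CoUninitialize();
      return 0;
    }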

    This is all well and good, but how does this solve the problem of existing applications that are using the hardware volume but probably don't want to?

    Remember how I mentioned that all the existing APIs were plumbed to use WASAPI?  Well, we plumbed the volume controls for those APIs to WASAPI's volume control interfaces too. 

    We also plumbed the mixerLine APIs to use WASAPI.  This was slightly more complicated, because the mixerLine API also requires that we define a topology for audio devices, but we've defined a relatively simple topology that should match existing hardware topologies (so appcompat shouldn't be an issue).

    The upshot of this is that by default, for Vista Beta2, we're going to provide per-application volume control for the first time, for all applications.

    There is a very small set of applications that may be broken by this behavior change, but we have a mechanism to ensure that applications that need to manipulate the hardware volume using the existing APIs will be able to work in Vista without rewriting the application (if you've got one of those applications, you should contact me out-of-band and I'll get the right people involved in the discussion).

  • Larry Osterman's WebLog

    Why is Control-Alt-Delete the secure attention sequence (SAS)?

    • 50 Comments

    When we were designing NT 3.1, one of the issues that came up fairly early was the secure attention sequence - we needed to have a keystroke sequence that couldn't be intercepted by any application.

    So the security architect for NT (Jim Kelly) went looking for a keystroke sequence he could use.

     

    It turned out that the only keystroke combination that wasn't already being used by a shipping application was control-alt-del, because that was used to reboot the computer.

    And thus was born the Control-Alt-Del to log in.

    I've got to say that the first time that the logon dialog went into the system, I pressed it with a fair amount of trepidation - I'd been well trained that C-A-D rebooted the computer and....

     

  • Larry Osterman's WebLog

    So where DOES the mass of a tree come from?

    • 51 Comments

    Yesterday, I asked where the mass comes from in a tree.

    The answer is actually really simple: Carbon.  Photosynthesis takes CO2 from the air (and water from the soil) and converts them into sugars, releasing O2 - the carbon stays behind in the plant.

    It turns out that if you ask new Harvard graduates this question, the vast majority of them answer some variant of "It comes from the soil".  When people think of photosynthesis, they don't think about the carbon that's left behind.  They'll usually be puzzled as they answer, because in their hearts, they realize that "the soil" doesn't actually work as an answer, but they can't quite put all the pieces together.

    If you ask 7th graders the same question right after they've finished their photosynthesis unit, they end up coming up with a variant of the same answer.  Or they say it comes from the water the plant absorbs.  Soil and water have mass, and that seems to be the determining factor in their answer.

    You see, 7th graders don't seem to get the idea that air contains mass - it's just the stuff that's around them, it doesn't "weigh" anything.  Since they have always been exposed to the weight of air, it doesn't occur to them that it has any real properties at all.  If you show them a block of dry ice, and ask them what it is, they'll say "It's ice". If you ask them to weigh it, they get that part.  It's not until you follow that up and ask "So what's happening to the ice?" and they realize that the fog it's generating is disappearing into the air that they'll start figuring out what's happening - that the mass of the dry ice is being absorbed into the air.  The very smartest of the kids will then put the pieces together and realize that air DOES have mass.

    Yesterday's "quiz" netted over 70 correct answers and about 15 incorrect answers, I've got to say that I'm pretty impressed. The reality is that since I only let the incorrect answers through, it biased the sample towards correctness (several people mentioned that they'd read the other comments and realized that their first thought was wrong).

     

    Valorie had one more question: Could those of you who got the answer right, and who are under 30, and who were educated under the American education system post a comment below?

     

  • Larry Osterman's WebLog

    Why does Win32 even have Fibers?

    • 23 Comments
    Raymond's had an interesting series on fibers (starting here), and I thought I'd expand on them a bit.

    Fibers were added to Windows (in NT 3.51 SP3, IIRC) because some customers (not just SQL server) believed that they could improve the performance of their server applications if they had more control over their threading environment.

    But why on earth did these customers want fibers?

    Well, it's all about scalability, especially on MP systems.  On a multitasking system, it's easy to forget that a single processor can only do one thing at a time.

    The ramifications of this are actually quite profound.  If you've got two tasks currently running on your system, then your operating system will have to switch between each of them.  That switch is called a context switch, and it can be expensive (just for the sake of argument, let's say a context switch takes 5000 instructions).  In a context switch, the operating system has to (at a minimum):

    1. Enter Kernel Mode
    2. Save all the old thread's registers.
    3. Acquire the dispatch spinlock.
    4. Determine the next thread to run (if the next thread is in another process, this can get expensive)
    5. Leave the dispatch spinlock.
    6. Swap the old thread's kernel state with the new thread's kernel state.
    7. Restore the new thread's registers.
    8. Leave Kernel Mode

    That's a fair amount of work to perform (not outrageous, but not trivial).

    The OS won't do this unless it has to.  In general, there are three things that will cause the OS to perform a context switch (there are others, like page faults, but these are the big ones):

    1. When your thread goes to sleep (either by calling Sleep() or calling WaitFor[Single|Multiple]Object[s])
    2. When your thread calls SwitchToThread() or Sleep(0) (this is a special case of the Sleep() API that is identical to SwitchToThread())
    3. When your thread's quantum elapses.

    A thread's quantum is essentially the amount of time that the OS will dedicate to a thread when there's another thread in the system that can also run.  A quantum is something like 5-10 ticks on a workstation and 10-15 on a server, and each tick is typically somewhere between 10 and 15 milliseconds, depending on the platform.  In general, your thread will get its full quantum unless there is a higher priority runnable thread in the system (please note: this is a grotesque simplification, but it's sufficient for the purposes of this discussion).

    The thing is, for a highly scalable application, context switches are BAD.  They represent CPU time that the application could be spending on working for the customer, but instead is spent doing what is essentially OS bookkeeping.  So a highly scalable application REALLY wants to reduce the number of context switches.  If you ever have a service that's performing poorly, one of the first things to look for is the number of context switches/second - if it's high (for some value of high), then there's invariably a scalability issue in the application that needs to be addressed.

    So why fibers?  Because for highly scalable applications, you want each of your threads to get its full quantum - in other words, you want the only reason for a context switch to be reason #3 above.

    Remember the first cause of context switches: Calling WaitFor*Object.  What that means is that if you call EnterCriticalSection on a critical section with contention, then you're highly likely to cause a context switch. The same thing happens when you wait for an I/O to complete, etc.  You absolutely want to avoid calling any Win32 APIs that might block under the covers.

    So fibers were created to resolve this issue.  A fiber effectively removes steps 1, 3, 5 and 8 from the context switch steps above - switching from one fiber to another just saves the old register state and restores the new register state.  It's up to the application to determine which fiber runs next, etc.  But the application can make its own choices.  As a result, a server application could have a dozen or more "tasks" running on each thread, and they'd radically reduce their context switch overhead, because saving and restoring registers is significantly faster than a full context switch.  The other thing that fibers allow is the ability to avoid the dispatcher spinlock (see John Vert's comment about context switches being serialized across all processors below).  Any global lock hurts your scalability, and fibers allow an application to avoid one of the global locks in the system.
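    To make that concrete, here's a bare-bones sketch (mine, not from the original post) of two fibers cooperatively ping-ponging on a single thread - the application, not the OS dispatcher, decides who runs next, and the "switch" is just a register save and restore:

    #include <windows.h>
    #include <stdio.h>

    LPVOID g_mainFiber = NULL;

    VOID CALLBACK WorkerFiber(PVOID)
    {
      for (int i = 0; i < 3; i++)
      {
        printf("worker: step %d\n", i);
        SwitchToFiber(g_mainFiber);     // cooperatively yield back to "main"
      }
      SwitchToFiber(g_mainFiber);       // returning from a fiber routine exits the thread, so yield instead
    }

    int main()
    {
      // The thread has to become a fiber before it can switch to other fibers.
      g_mainFiber = ConvertThreadToFiber(NULL);
      LPVOID worker = CreateFiber(0, WorkerFiber, NULL);

      for (int i = 0; i < 3; i++)
      {
        SwitchToFiber(worker);          // no kernel transition, no dispatcher lock
        printf("main: worker yielded\n");
      }

      DeleteFiber(worker);
      return 0;
    }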

    Ok, so why have fibers remained obscure?

    They've remained obscure first because of the reasons Raymond mentioned in his fibers caveat here - using fibers is an all-or-nothing thing, and it's not possible to use fibers from a shared library.  As Rob Earhart pointed out in this comment on Raymond's post, some of the idiosyncrasies of the fiber APIs have been resolved in the current versions of Windows.

    They're also HARD to deal with - you essentially have to write your own scheduler.

    Raymond also left off a couple of other gotchas: For example, if you're using fibers to improve your app's scalability, you can't call ANY Win32 APIs that might block (including filesystem APIs), because the Win32 blocking APIs block the entire thread, not just the current fiber (not surprisingly :)).  So if you're running 20 fibers on a single thread, when any of the fibers blocks, your thread blocks (however, the fibers can be run from another thread, because fibers don't have thread affinity - so if you have a spare thread around, that thread can run the fibers).

    The other reason that fibers have remained obscure is more fundamental.  It has to do with Moore's law (there's a reason for the posts yesterday and the day before).

    Back when fibers were first implemented, CPUs were a lot slower.  Those 5000 instructions for the context switch (again, this is just a guess) took .05 milliseconds (assuming one cycle/instruction) to execute on a 100MHz machine (which would be a pretty fast machine in 1995).  Well, on a 2GHz machine, that .05 becomes .0025 milliseconds - it's an order of magnitude smaller.  The raw cost of a context switch has gone down dramatically.  In addition, there has been a significant amount of work in the base operating system to increase the scalability of the dispatcher spinlock - nowadays, the overhead of the dispatcher lock is essentially nonexistent on many MP machines (you start to see contention issues only on machines with a lot of CPUs, for some value of "a lot").

    But there's another aspect of performance that has gone up dramatically, and that's the cost of blowing the CPU cache.

    As processors have gotten smarter, the performance of the CPU cache has become more and more critical to their speed - because main memory is painfully slow compared to the speed of the processor, if you're not getting your data from the CPU's cache, you're paying a huge hidden penalty.  And fibers don't fix this cost - when you switch from one fiber to another, you're going to blow the CPU cache.

    Nowadays, the cost of blowing the cache has leveled the playing field between OS context switches and fibers - these days, you don't get nearly the benefit from fibers that you did ten years ago.

    This isn't to say that fibers won't become useful in the future, they might.  But they're no longer as useful as they were.

    Btw, it's important to note that fibers aren't the ONLY solution to the thread quantization issue mentioned above.  I/O completion ports can also be used to limit context switches - the built-in Win32 thread pool uses them (that's also what I used in my earlier post about thread pools).  In fact, the recommendation is that instead of spending your time rewriting your app to use fibers (and it IS a rewrite), it's better to rearchitect your app to use a "minimal context" model - instead of maintaining the state of your server on the stack, maintain it in a small data structure, and have that structure drive a small one-thread-per-CPU state machine.  You'll still have the issue of unexpected blocking points (you call malloc and malloc blocks accessing the heap critical section), but that issue exists regardless of how your app is architected.
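    Here's a rough sketch (mine, not from the original post) of that "minimal context" shape: one completion port, one worker thread per CPU, and a small per-request structure instead of a per-request thread or fiber.  In a real server the completion packets would come from overlapped I/O; here they're just posted by hand:

    #include <windows.h>
    #include <stdio.h>

    struct Request
    {
      OVERLAPPED overlapped;   // per-request context lives here, not on a thread's stack
      int        requestId;
    };

    DWORD WINAPI Worker(LPVOID parameter)
    {
      HANDLE port = (HANDLE)parameter;
      DWORD bytes;
      ULONG_PTR key;
      LPOVERLAPPED overlapped;
      // Each worker pulls completed work off the port; the port wakes only as many
      // threads as it needs, which keeps context switches to a minimum.
      while (GetQueuedCompletionStatus(port, &bytes, &key, &overlapped, INFINITE))
      {
        if (overlapped == NULL)
        {
          break;               // a NULL packet is our shutdown signal
        }
        Request *request = CONTAINING_RECORD(overlapped, Request, overlapped);
        printf("Processed request %d\n", request->requestId);
      }
      return 0;
    }

    int main()
    {
      SYSTEM_INFO info;
      GetSystemInfo(&info);
      DWORD threadCount = info.dwNumberOfProcessors;
      if (threadCount > MAXIMUM_WAIT_OBJECTS) threadCount = MAXIMUM_WAIT_OBJECTS;

      HANDLE port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, threadCount);
      HANDLE threads[MAXIMUM_WAIT_OBJECTS];
      for (DWORD i = 0; i < threadCount; i++)
      {
        threads[i] = CreateThread(NULL, 0, Worker, port, 0, NULL);
      }

      Request requests[3] = {};
      for (int i = 0; i < 3; i++)
      {
        requests[i].requestId = i;
        PostQueuedCompletionStatus(port, 0, 0, &requests[i].overlapped);
      }

      for (DWORD i = 0; i < threadCount; i++)
      {
        PostQueuedCompletionStatus(port, 0, 0, NULL);   // one shutdown packet per worker
      }
      WaitForMultipleObjects(threadCount, threads, TRUE, INFINITE);

      for (DWORD i = 0; i < threadCount; i++) CloseHandle(threads[i]);
      CloseHandle(port);
      return 0;
    }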

    If you're designing a scalable application, you need to architect your application to minimize the number of context switches, so it's critical that you not add unnecessary context switches to your app (like queuing a request to a worker thread, then blocking on the request - which forces the OS to switch to the worker, then back to the original thread).

    Significant Edit (1/10/2005): Fixed several issues pointed out by the base performance team.

     

  • Larry Osterman's WebLog

    What is localization anyway?

    • 32 Comments
    I may be stomping on Michael Kaplan's toes with this one, but...

    I was reading the February 2005 issue of Dr. Dobbs Journal this morning and I ran into the article "Automating Localization" by Hew Wolff (you may have to subscribe to get access to the article).

    When I was reading the article, I was struck by the following comment:

     I didn't think we could, because the localization process is prety straightforward. By "localization", I mean the same thing as "globalization" (oddly) or "internationalization." You go through the files looking for English text strings, and pull them into a big "language table," assigning each one a unique key

    The first thing I thought was what an utterly wrong statement.  The author of the article is conflating five different concepts and calling them the same thing.  The five concepts are: localizability, translation, localization, internationalization, and globalization.

    What Hew's talking about is "localizability" - the process of making the product localizable.

    Given that caveat, he's totally right in his definition of localizability - localizability is the process of extracting all the language-dependent strings in your binary and putting them in a separate location that can be later modified by a translator.

    But he totally missed the boat on the rest of the concepts.

    The first three (localizability, translation, and localization) are about resources:

    • Localizability is about enabling translation and localization.  It's about ensuring that a translator has the ability to modify your application to work in a new country without recompiling your binary.
    • Translation is about converting the words in his big "language table" from one language to another.  Researchers love this one because they think that they can automate this process (see Google's language tools as an example of this).
    • Localization is the next step past translation.  As Yoshihiko Sakurai mentioned to Michael in a related discussion this morning, "[localization] is a step past translation, taking the certain communication code associated with a certain culture.  There are so many aspects you have to think about such as their moral values, working styles, social structures, etc... in order to get desired (or non-desired) outputs."  This is one of the big reasons that automated translation tools leave so much to be desired - humans know about the cultural issues involved in a language, computers don't.

    Internationalization is about code.  It's about ensuring that the code in your application can handle strings in a language-sensitive manner.  Michael's blog is FULL of examples of internationalization.  Michael's article about Tamil numbers, or deeptanshuv's article about the four versions of "I" in Turkish are great examples of this.  Another example is respecting the date and time format of the user - even though users in the US and the UK both speak English (I know that the Brits reading this take issue with the concept of Americans speaking English, but bear with me here), they use different date formats.  Today is 26/01/2005 in Great Britain, but it's 01/26/2005 here in the US.  If your application displays dates, it should automatically adjust them.
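    On Windows, the simplest way to get this right is to let the system do the formatting for you.  A tiny sketch (mine, not from the article):

    #include <windows.h>
    #include <stdio.h>

    int main()
    {
      SYSTEMTIME now;
      GetLocalTime(&now);

      char date[80];
      // DATE_SHORTDATE picks up the current user's preference: 26/01/2005 for
      // en-GB, 1/26/2005 for en-US, and so on - no hard-coded format strings.
      GetDateFormatA(LOCALE_USER_DEFAULT, DATE_SHORTDATE, &now, NULL, date, sizeof(date));
      printf("Today is %s\n", date);
      return 0;
    }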

    Globalization is about politics.  It's about ensuring that your application doesn't step on the policies of a country - So you don't ever highlight national borders in your graphics, because you might upset your customers living on one side or another of a disputed border. I do want to be clear that this isn't the traditional use of globalization, maybe a better word would be geopoliticization, but that's too many letters to type, even for me, and since globalization was almost always used as a synonym for internationalization, I figured it wouldn't mind being coopted in this manner :)

    Having said that, his article is an interesting discussion about localization and the process of localization.  I think that the process he went through was a fascinating one, with some merit.  But that one phrase REALLY stuck in my craw.

    Edit: Fixed incorrect reference to UK dates - I should have checked first :)  Oh, and it's 2005, not 2004.

    Edit2: Added Sakurai-san's name.

    Edit3: Added comment about the term "globalization"

  • Larry Osterman's WebLog

    Beep Beep

    • 44 Comments
    What's the deal with the Beep() API anyway?

    It's one of the oldest Windows APIs, dating back to Windows 1.0.  It's also one of the few audio APIs that my team doesn't own.  The Beep API actually has its own dedicated driver (beep.sys).  The reason for this is that the Beep() API works totally differently from any other audio API in the system.

    Back when IBM built the first IBM PCs, they realized that they needed to have the ability to do SOME level of audio, even if it wasn't particularly high quality.  So they built a speaker into the original PC hardware.

    But how do you drive the speaker?  It turns out that the original PC hardware used an 8253 programmable interval timer to control the system hardware timer.  The 8253 was a pretty cool little chip - it could operate in six different modes: one-shot timer, interrupt on terminal count, rate generator, square wave generator, software strobe, or hardware strobe.  It also contained three independent counters - counter 0 was used by the operating system, and counter 1 was reserved for the hardware.  The third counter, counter 2, was special.  The IBM hardware engineers tied the OUT2 line from the 8253 to the speaker line, and they programmed the timer to operate in square wave generation mode.

    What that means is that whenever the 2nd counter of the 8253 counted to 0, it would toggle the output of the OUT2 line from the 8253.  This gave the PC a primitive way of generating very simple tones.

    The original Windows Beep() API simply fiddled the controls on the 8253 to cause it to generate a square wave with the appropriate frequency, and that's what Beep.sys continues to do.  Legacy APIs can be hard to remove sometimes :)
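    If you want to hear it for yourself, the API couldn't be much simpler (a two-line sketch of mine, not from the post):

    #include <windows.h>

    int main()
    {
      // Each call has beep.sys program the 8253's counter 2 to produce a square
      // wave at the given frequency (in Hz) for the given duration (in milliseconds).
      Beep(440, 500);   // A above middle C, half a second
      Beep(880, 500);   // an octave up
      return 0;
    }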

    Nowadays, the internal PC speaker is often also connected to the PC's audio solution; that allows the PC to have sound even when there are no external speakers connected to the machine.

    In addition to the simple beep, some very clever people figured out how they could use the 8253 to generate honest-to-goodness audio.  I'm not sure how they succeeded in doing it, but I remember someone had a PC-speaker-based sound driver for DOS available at one point - it totally killed your PC's performance, but it DID play something better than BEEEEEEP.

    Edit: s/interrupt conroller/interval timer/

    Edit2: fixed description of channel 1 (in case someone comes along later and decides to depend on my error).

  • Larry Osterman's WebLog

    It was 20 years ago today...

    • 74 Comments

    Nope, Sgt Pepper didn’t teach the band to play.

    20 years ago today, a kid fresh out of Carnegie-Mellon University showed up at the door of 10700 Northup Way, ready to start his first day at a real job.

    What a long strange trip it’s been.

    Over the past 20 years, I’ve:

    • Worked on two different versions of MS-DOS (4.0, 4.1).

    • Worked on three different versions of Lan Manager (1.0, 1.5, 2.0).

    • Worked on five different releases of Windows NT (3.1, 3.5, XP (SP2), W2K3 (SP1), Longhorn).

    • Worked on four different releases of Exchange (4.0, 5.0, 5.5, and 2000).

    I’ve watched my co-workers move on to become senior VP’s.  I’ve watched my co-workers leave the company.

    I’ve seen the children of my co-workers grow up, go to college, marry, and have kids.

    I’ve watched the younger brother of my kids’ babysitter, whom I met when he was 12, grow up, go to college, and come to work at Microsoft in the office around the corner from mine (that one is REALLY weird btw).

    I’ve seen strategies come and go (Lan Manager as an OEM product, then retail, then integrated with the OS).

    I’ve watched three different paradigm shifts occur in the software industry, and most of a fourth.  The first one was the shift of real computing to “personal” computers.  The second was the GUI revolution, the third was the internet, and now we’re seeing a shift to smaller devices.  We’re still not done with that one.

    I’ve watched Microsoft change from a “small software startup in Seattle” to the 800 pound gorilla everyone hates.

    I’ve watched Microsoft grow from 650ish people to well over 50,000.

    I’ve watched our stock grow and shrink.  I’ve watched co-workers’ fortunes rise and fall.

    I’ve watched governments sue Microsoft.  I’ve watched Governments settle with Microsoft.  I’ve seen Microsoft win court battles.  I’ve seen Microsoft lose court battles.

    I’ve watched the internet bubble start, blossom, and explode.

    I’ve watched cellular phones go from brick-sized lumps to something close to the size of matchbooks.

    I’ve seen the computer on my desktop go from a 4.77MHz 8088 with 512K of RAM and a 10M hard disk to a 3.2GHz hyper-threaded Pentium 4 with 1G of RAM and an 80G hard disk.

    I’ve watched the idea of multimedia on the PC go from squeaky beeps from the speaker to 8-channel surround sound that would rival audiophile quality products.

    I’ve watched video on the PC go from 640x350 Black&White to 32bit color rendered in full 3d with millions of polygons.

    When I started at Microsoft, the computer that they gave me was a 4.77MHz PC/XT, with a 10 megabyte hard disk, and 512K of RAM.  I also had a Microsoft Softcard that increased the RAM to 640K, and it added a clock to the computer, too (they didn’t come with one by default)!  Last month, I bought a new computer for my home (my old one was getting painfully slow).  The new computer is a 3.6GHz Pentium 4, with 2 GIGABYTES(!) of RAM, and a 400 GIGABYTE hard disk.  My new computer cost significantly less than the first one did.  If you index for inflation, the new computer is at least an order of magnitude cheaper.

    I still have the letter that Microsoft sent me confirming my job offer.  It’s dated January 16th, 1984.  It’s formatted in Courier, and the salary and stock option information is written in ink.  It’s signed (in ink) by Steve Ballmer.  The offer letter also specifies the other benefits; it’s not important what they are.  I also have Steve’s business card – his job title?  VP, Corporate Staffs.  Yup, he was head of HR back then (he did lots of other things, but that’s what his title was).  I also have the employee list they gave out for the new hires, as I said before; there are only about 600 people on it.  Of those 600 people, 48 of them are still with Microsoft.  Their job titles range from Executive Assistant, to UK Project Troubleshooter, to Architect, to Director. 

    The only person who I interviewed with when I started is still at Microsoft, Mark Zbikowski.  Mark also has the most seniority of anyone still at Microsoft (except for Steve Ballmer and Bill Gates).

    When I started in the Lan Manager group, Brian Valentine was a new hire.  He was a test lead in the Lan Manager group, having just joined the company from Intel.  He (and Paul Maritz) used to tell us war stories about their time at Intel (I particularly remember the ones about the clean desk patrol).

    In the past twenty years, I’ve had 16 different managers:  Alan Whitney (MS-DOS 4.0); Anthony Short (MS-DOS 4.0); Eric Evans (MS-DOS 4.0, MS-DOS 4.1); Barry Shaw (Lan Manager 1.0); Ken Masden (Lan Manager 1.5, Lan Manager 2.0); Dave Thompson (Lan Manager 2.0, Windows NT 3.1); Chuck Lenzmeier (Windows NT 3.5); Mike Beckerman (Tiger); Rick Rashid (Tiger); Max Benson (Exchange 4.0, 5.0, 5.5); Soner Terek (Exchange 5.5, Exchange 2000); Jon Avner (Exchange 2000); Harry Pyle (SCP); Frank Yerrace (Longhorn); Annette Crowley (Longhorn) and Noel Cross (Longhorn).

    I’ve moved my office 18 different times (the shortest time I’ve spent in an office: 3 weeks).  I’ve lived through countless re-orgs.  On the other hand, I’ve never had a reorg that affected my day-to-day job.

    There have been so many memorable co-workers I’ve known over the years.  I can’t name them all (and I know that I’ve missed some really, really important ones), but I’ll try to hit some highlights.  If you think you should be on the list but aren’t, blame my poor memory, I apologize, and drop me a line!

    Gordon Letwin – Gordon was the OS guru at Microsoft when I started, he was the author of the original H19 terminal ROM before coming to Microsoft.  In many ways, Gordon was my mentor during my early years at Microsoft.

    Ross Garmoe – Ross was the person who truly taught me how to be a software engineer.  His dedication to quality continues to inspire me.  Ross also ran the “Lost Lambs” Thanksgiving Dinner – all of us in Seattle without families were welcome at Ross’s house where his wife Rose and their gaggle of kids always made us feel like we were home.  Ross, if you’re reading this, drop me a line :)

    Danny Glasser – Danny had the office across the hall from me when I was working on DOS Lan Manager.  He’s the guy who gave me the nickname of “DOS Vader”.

    Dave Cutler – Another inspiration.  He has forgotten more about operating systems than I’ll ever know.

    David Thompson – Dave was the singularly most effective manager I’ve ever had.  He was also my least favorite.  He pushed me harder than I’ve ever been pushed before, and taught me more about how to work on large projects than anyone had done before.  Valorie was very happy when I stopped working for him.

    David Weise – David came to Microsoft from Dynamical Systems Research, which I believe was Microsoft’s third acquisition.  He owned the memory management infrastructure for Windows 3.0.

    Aaron Reynolds – Author of the MS-NET redirector, one of the principal DOS developers.

    Ralph Lipe – Ralph designed most (if not all) of the VxD architecture that continued through Win9x.

    David, Aaron, and Ralph formed the core of the Windows 3.0 team; it wouldn’t have been successful without them.  Collectively they’re the three people that I believe are most responsible for the unbelievable success of Windows 3.0.  Aaron retired a couple of years ago; David and Ralph are still here.  I remember David showing me around building 3 showing off the stuff in Windows 3.0.  The only thing that was going through my mind was “SteveB’s going to freak when he sees this stuff – this will blow OS/2 completely out of the water”.

    Paul Butzi – Paul took me for my lunch interview when I interviewed at Microsoft.  He also was in the office next to mine when I started (ok, I was in a lounge, he was in an office).  When I showed up in a suit, he looked at me and started gagging – “You’re wearing a ne-ne-ne-neckt….”  He never did get the word out.

    Speaking of Paul.  There was also the rest of the Xenix team:  Paul Butzi, Dave Perlin, Lee Smith, Eric Chin, Wayne Chapeski, David Byrne, Mark Bebie (RIP), Neil Friedman and many others.  Xenix 386 was the first operating system for the Intel 386 computer (made by Compaq!).  Paul had a prototype in his office, he had a desk fan blowing on it constantly, and kept a can of canned air handy in case it overheated.

    Ken Masden – the man who brought unicycle juggling to Microsoft.

    All of the “core 12”: Dave Cutler (KE), Lou Perazzoli (MM), Mark Lucovsky (Win32), Steve Wood (Win32, OB), Darryl Havens (IO), Chuck Lenzmeier (Net), John Balciunas (Bizdev), Rob Short (Hardware), Gary Kimura (FS), Tom Miller (FS), Ted Kummert (Hardware), Jim Kelly (SE), Helen Custer (Inside Windows NT), and others.  These folks came to Microsoft from Digital Equipment with a vision to create something brand new.  As Tom Miller put it, it was likely to be the last operating system ever built from scratch (and no, Linux doesn’t count – NT was 100% new code (ok, the command interpreter came from OS/2), the Linux kernel is 100% new, but the rest of the system isn’t).  And these guys delivered.  It took longer than anyone had originally planned, but they delivered.  And these guys collectively taught Microsoft a lesson in how to write a REAL operating system, not a toy operating system like we’d been working on before.  Some day I’ll write about Gary Kimura’s coding style.

    Brian Valentine – Brian is without a doubt the most inspirational leader at Microsoft.  His ability to motivate teams through dark times is legendary.  I joined the Exchange team in 1994, the team was the laughing stock at Microsoft for our inability to ship product (Exchange had been in development for almost six years at that point), and we still had another year to go.  Brian led the team throughout this period with his unflagging optimism and in-your-face, just do it attitude.  For those reading this on the NT team: The Weekly World News was the official newspaper of the Exchange team LONG before it was the official newspaper of the Windows team.

    Max Benson – Max was my first manager in Exchange.  He took a wild chance on a potentially burned out engineer (my time in Research was rough) and together we made it work.

    Jaya Matthew – Jaya was the second person I ever had report to me; her pragmatism and talent were wonderful to work with.  She’s also a very good friend.

    Jim Lane, Greg Cox, and Ardis Jakubaitis – Jim, Greg, Ardis, Valorie and I used to play Runequest together weekly.  When I started, they were the old hands at Microsoft, and their perspectives on the internals of the company were invaluable.  They were also very good friends.

    And my list of co-workers would not be complete without one other:  Valorie Holden.  Yes, Valorie was a co-worker.  She started at Microsoft in 1985 as a summer intern working on testing Word and Windows 1.0.  While she was out here, she accepted my marriage proposal, and we set a date in 1987.  She went back to school, finished her degree, and we got married.  After coming out here, she started back working at Microsoft, first as the bug coordinator for OS/2, then as Nathan Myhrvold’s administrative assistant, then as a test lead in the Windows Printing Division, eventually as a program manager over in WPD.  Valorie has stood by my side through my 20 years at Microsoft; I’d never have made it without her unflagging support and advice (ok, the threats to my managers didn’t hurt either).

    There’ve been good times: Getting the first connection from the NT redirector to the NT server; Shipping Exchange 4.0; Shipping Exchange 2000 RC2 (the ship party in the rain).  Business trips to England.  Getting a set of cap guns from Brian Valentine in recognition of the time I spent in Austin for Lan Manager 2.0 (I spent 6 months spending Sunday-Wednesday in Austin, Thursday-Saturday in Redmond).

    There’ve been bad times:  Reorgs that never seemed to end.  Spending four years in ship mode (we were going to ship “6 months from now” during that time) for NT 3.1 (Read Showstopper! for more details).  The browser checker (it took almost ten years to get over that one).  A job decision decided over a coin toss.  Working in Research (I’m NOT cut out to work in research).

    But you know, the good times have far outweighed the bad, it’s been a blast.  My only question is: What’s in store for the next twenty years?

     

    Edit: Forgot some managers in the list :)

    Edit2: Missed Wayne from the Xenix team, there are probably others I also forgot.

    Edit3: Got some more Xenix developers :)

     
