Larry Osterman's WebLog

Confessions of an Old Fogey

Wait, that was MY bug? Ouch!


Over the weekend, the wires were full of reports of a speech recognition demo at Microsoft's Financial Analysts Meeting here in Seattle that went horribly wrong.

Slashdot had it, Neowin had it,  Digg had it, Reuters had it.  It was everywhere.

And it was all my fault.


Well, mostly.  Rob Chambers on the speech team has already written about this; here's the same problem from my side of the fence.

About a month ago (more-or-less), we got some reports from an IHV that sometimes when they set the volume on a capture stream the actual volume would go crazy (crazy, for those that don't know, is a technical term).  Since volume is one of the areas in the audio subsystem that I own, the bug landed on my plate.  At the time, I was overloaded with bugs, so another of the developers on the audio team took over the investigation and root caused the bug fairly quickly.  The annoying thing about it was that the bug wasn't reproducible - every time he stepped through the code in the debugger, it worked perfectly, but it kept failing when run without any traces.


If you've worked with analog audio, it's pretty clear what's happening here - there's a timing issue causing a positive feedback loop, the same kind you get when a signal is fed back into an amplifier.

It turns out that one of the common causes of feedback loops in software is a concurrency issue with notifications - a notification is received with new data, which updates a value, updating the value causes a new notification to be generated, which updates a value, updating the value causes a new notification, and so-on...
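To make the loop concrete, here is a minimal sketch in Python. It is not the Vista audio code: `Volume`, `GainSync`, and the `source` tag are hypothetical names invented for illustration. Two values mirror each other through change notifications; the guard breaks the cycle by recognizing a notification caused by the sync's own write.

```python
class Volume:
    """Hypothetical observable volume value (not a real audio API)."""

    def __init__(self, level=0.0):
        self.level = level
        self.notifications_sent = 0
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def set_level(self, level, source=None):
        # Every write notifies every listener, including writes that were
        # themselves made in response to a notification.
        self.level = level
        self.notifications_sent += 1
        for callback in self._listeners:
            callback(level, source)


class GainSync:
    """Keeps two Volume objects in step (illustrative only)."""

    def __init__(self, a, b, guard=True):
        self._tag = object()   # identifies writes made by this sync
        self._guard = guard
        a.subscribe(lambda level, source: self._mirror(b, level, source))
        b.subscribe(lambda level, source: self._mirror(a, level, source))

    def _mirror(self, target, level, source):
        # With the guard on, a notification triggered by our own write is
        # dropped here, which is what stops the feedback loop.
        if self._guard and source is self._tag:
            return
        target.set_level(level, source=self._tag)


mic, slider = Volume(), Volume()
GainSync(mic, slider)
mic.set_level(0.5)   # slider follows; each side notifies exactly once
```

With `guard=False` the same call never settles: each write triggers a notification that triggers another write, until Python gives up with a `RecursionError`.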

The code actually handled most of the feedback cases involving notifications, but there were two lower level bugs that complicated things.  The first bug was that there was an incorrect calculation that occurred when handling one of the values in the notification, and the second was that there was a concurrency issue - a member variable that should have been protected wasn't (I'm simplifying what actually happened, but this suffices). 
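The second kind of bug, a shared member that should have been protected, can be sketched like this. It is an assumption-laden illustration rather than the actual fix: `CaptureGain`, `adjust`, and `hammer` are made-up names, and the real code was more involved. The point is only that the read-modify-write on the member happens under a lock.

```python
import threading


class CaptureGain:
    """Hypothetical gain holder; names are illustrative, not the real code."""

    def __init__(self):
        self._lock = threading.Lock()
        self._steps = 0  # member touched by several threads

    def adjust(self, delta):
        # The read-modify-write must not interleave with another thread's.
        # Without the lock, two threads can read the same old value and one
        # of the updates is silently lost.
        with self._lock:
            self._steps = self._steps + delta

    def steps(self):
        with self._lock:
            return self._steps


def hammer(gain, threads=4, per_thread=1000):
    """Adjust the gain from several threads at once; return the final value."""
    def worker():
        for _ in range(per_thread):
            gain.adjust(1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return gain.steps()
```

A bug of this kind also tends to vanish under a debugger, since single-stepping changes the timing enough that the threads stop colliding.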


As a consequence of these two very subtle low level bugs, the speech recognition engine wasn't able to correctly control the gain on the microphone.  When it tried, it hit the notification feedback loop, which caused the microphone to clip, which meant that the samples being received by the speech recognition engine weren't accurate.

There were other contributing factors to the problem (the bug was fixed on more recent Vista builds than the one they were using for the demo, there were some issues with the way the speech recognition engine had been "trained", etc), but it doesn't matter - without my bug, the problem wouldn't have been nearly as significant.

Mea Culpa.

  • > and root caused the bug fairly quickly.

    That's what you get for running as root.  If you ran as a limited user then you wouldn't cause bugs.

    > The annoying thing about it was that the bug wasn't
    > reproducible

    So just mark it as resolved, not reproducible, and close it.  That's what Microsoft does with lots of my Vista beta bug reports.

    Unlike ordinary Windows bugs where Microsoft requires fees to be paid before allowing bugs to be reported, with Vista betas it's easy.
    (1) Find a secret link posted by Microsoft to get a bug reporting tool,
    (2) Download, install, and run the tool.
    (3) Get a reply containing some combination of:
    (3a) A request for more information, telling you to view a page on the Connect site but not allowing you to view the page because Microsoft intentionally prohibits you from viewing your own bug reports and Microsoft's replies to them (that is, if you paid for the beta as part of an MSDN subscription),
    (3b) A resolution that Microsoft couldn't reproduce the bug, accompanied by a request to try it again under build 5472, but not accompanied by a link from which build 5472 could be downloaded (MSDN has links for build 5472 in English and German but not Japanese),
    (3c) A bunch of question marks, for which Microsoft's original words can't be guessed, because Microsoft doesn't know how to select an encoding for Microsoft's own e-mail that can send Microsoft's own words to the victim.

    Mr. Osterman, all you have to do is conform to your employer's practices.  (3b) would make things easy for you.
  • It was a very good technical explanation of what happened during the demo. But a bug is a bug and other people just see it that way. I will not be surprised if Vista does not ship by early next year. Anyway, we can all look forward to the Zune instead of Vista :)
  • Everyone is talking about that Windows Vista Speech Recognition demo at the Financial Analysts Meeting....
  • Reading about this one took me back to college in, oh, say, 1982.

    We were doing a speech-recognition project for an engineering class.  We wrote code to interface to the speech recognition boards (for a PDP-11, IIRC).  We built some interfaces to control various 110V items - lights, etc. We practiced the presentation that we were to make before the entire engineering class and various industry visitors multiple times.  We were smoking.

    And it went off well.  Right up until the time the presenter said "Make Daiquiris", and the blender started up.  See, we hadn't actually practiced with the blender plugged in - it was too loud.  While the rest of the audience roared with laughter at the gag, our presenter is walking to the back corner of the stage, bending over the mic trying to shield it from the noise, and finally gets the blender turned off.  Talking to people later, almost no one in the audience noticed the gaffe.

    Sometimes, it's good to be lucky, and not to be a Microsoft representative in front of an audience with recorders.

  • Since Microsoft is so fond of, 'ahem', ripping off Apple, why not steal Steve Jobs's presentation preparation routine?
  • And people wonder why some developers try to deflect any bug from their perfect code - this is the perfect example of what happens when you stand up and say "That was my bug and I squashed it!"  (The fun in the comments, that is.)

    Good job Larry!  It's a beta, you found an error and fixed it for RTM.  It sounds like both jobs were done - both testing in real world situations to find the bug and a solution was found.

    Oh ye who have not released code with bugs - go ahead and admit you haven't written any code.
  • Larry,

    Thanks for the technical explanation of this. It's why I read your blog! Please don't be disheartened by some of the negative comments.
  • There's a reason presenters joke nervously about "the demo gods".  No matter how many times you rehearse something, the odds of something going wrong increase proportionally with the number of people watching.  I recently presented a session at a local code camp.  Rehearsed the demos dozens of times.  I get in the room, connect my laptop to the projector and go to log in and XP does not recognize my password!  I tried it a couple of times slowly, made sure caps lock was off but still no joy.  Unbelievable.  So I restarted the laptop and got lucky - it allowed me to log in!  Next time I will figure out the appropriate sacrifice to "the demo gods" before presenting a session like that.  :-)
  • Bob, I think that Jobs ripped his presentation routine off from us.  There were errors all around here.

    One of these days, I'll write about the Steve Ballmer Windows 1.0 demo at the Winter '85 company meeting (that's the one where he did the "how much would you pay for this" fake ad that's on the internet).

    Stuff happens.  It sucks when it does, but it does happen.  In this case, the problem was caught by our test team (while the original report came from an IHV, the test team had test cases that explicitly caught this defect), we fixed it long before the next release cycle, and the bugs would normally never have seen the light of day.
  • The blog entry of the speech team member Rob Chambers is funny. It cites Wikipedia as source of a definition of a term of digital signal processing (clipping).

    Shouldn't they have more scientific and accurate sources than Wikipedia? I mean, for a college student it would be fine to cite such sources, but for a member of the speech team of the biggest software company in the world?
  • Dude, if this is a timing issue with your traces on, you should try doing what we do - set up a RAM drive on your test machine and send all debug output to it. It reduces the timing issues involved with tracing.

    Alternatively, just ship vista with tracing on, and have a scheduled job that cleans it up once every so often?
  • You are fired!
  • Chris, we actually use WPP (ETW based logging) for our traces, it's essentially overhead-less (not quite, but close).

    Gino, I use the wikipedia on occasion too.  It has its issues (it can have serious accuracy issues; look at the history for "The Overlake School" for an example), but it works, and sometimes there's no more convenient source on the web.