Larry Osterman's WebLog

Confessions of an Old Fogey

    NextGenHacker101 owes me a new monitor


    Because I just got soda all over my current one…

    One of the funniest things I’ve seen in a while. 


    And yes, I know that I’m being cruel here and I shouldn’t make fun of the kid’s ignorance, but he is SO proud of his new discovery and so wrong in his interpretation of what’s actually going on…




    For my non net-savvy readers: The “tracert” command lists the route that packets take from the local computer to a remote computer.  So if I want to find out what path a packet takes from my computer to a remote host, I would issue “tracert <hostname>”.  This can be extremely helpful when troubleshooting networking problems.  Unfortunately, the young man in the video had a rather different opinion of what the command did.
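    For readers who want the mechanics: tools like tracert work by sending probes with an increasing TTL (time-to-live); each router along the path decrements the TTL, and the router where it reaches zero reports itself back.  Here's a toy, network-free Python sketch of that idea (the hop names are invented):

```python
# Toy, network-free sketch of what tracert does: it sends probes with an
# increasing TTL (time-to-live).  Each router decrements the TTL, and the
# router where the TTL hits zero sends back a "time exceeded" message,
# revealing itself as hop number N.  Hop names here are invented.
def trace_route(path, max_hops=30):
    """Return the hops a TTL-limited probe sequence would discover."""
    discovered = []
    for ttl in range(1, max_hops + 1):
        if ttl > len(path):             # ran past the destination
            break
        discovered.append(path[ttl - 1])
        if ttl == len(path):            # probe reached the target host
            break
    return discovered

route = ["gateway.local", "isp-core-1", "example-edge", "server.example.com"]
print(trace_route(route))               # lists all four hops in order
```

    The real tool does this with ICMP or UDP packets; the simulation just captures why the hop list comes back in order.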


    Wait, that was MY bug? Ouch!


    Over the weekend, the wires were full of reports of a speech recognition demo at Microsoft's Financial Analysts Meeting here in Seattle that went horribly wrong. 

    Slashdot had it, Neowin had it,  Digg had it, Reuters had it.  It was everywhere.

    And it was all my fault.


    Well, mostly.  Rob Chambers on the speech team has already written about this, here's the same problem from my side of the fence.

    About a month ago (more-or-less), we got some reports from an IHV that sometimes when they set the volume on a capture stream the actual volume would go crazy (crazy, for those that don't know, is a technical term).  Since volume is one of the areas in the audio subsystem that I own, the bug landed on my plate.  At the time, I was overloaded with bugs, so another of the developers on the audio team took over the investigation and root caused the bug fairly quickly.  The annoying thing about it was that the bug wasn't reproducible - every time he stepped through the code in the debugger, it worked perfectly, but it kept failing when run without any traces.


    If you've worked with analog audio, it's pretty clear what's happening here - a timing issue was causing a positive feedback loop, just like the one you get when a signal is fed back into an amplifier.

    It turns out that one of the common causes of feedback loops in software is a concurrency issue with notifications - a notification is received with new data, which updates a value, updating the value causes a new notification to be generated, which updates a value, updating the value causes a new notification, and so-on...
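    Here's a minimal Python sketch of that pattern (the names are invented; this is not the actual Windows audio code).  A naive change handler reacts to a notification by setting the value again; the usual fix is a re-entrancy guard that drops the notifications we caused ourselves:

```python
# Toy sketch of the notification feedback pattern: setting a value fires a
# change notification, and the handler "corrects" the value in response,
# which would fire another notification, and so on.  The _updating flag is
# the re-entrancy guard that breaks the cycle.
class Control:
    def __init__(self):
        self.value = 0
        self.on_change = None

    def set(self, value):
        if value == self.value:
            return                  # no change, no notification
        self.value = value
        if self.on_change:
            self.on_change(value)

ctrl = Control()
ctrl._updating = False
calls = []

def handler(value):
    calls.append(value)
    if ctrl._updating:              # notification caused by our own update
        return
    ctrl._updating = True
    try:
        ctrl.set(value + 1)         # reacting to the change re-triggers it
    finally:
        ctrl._updating = False

ctrl.on_change = handler
ctrl.set(1)
print(calls)                        # [1, 2]: the guard stopped the cascade
```

    Without the guard, each notification would generate another one and the value would run away, which is exactly the "volume goes crazy" symptom.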

    The code actually handled most of the feedback cases involving notifications, but there were two lower level bugs that complicated things.  The first bug was that there was an incorrect calculation that occurred when handling one of the values in the notification, and the second was that there was a concurrency issue - a member variable that should have been protected wasn't (I'm simplifying what actually happened, but this suffices). 
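    The second bug - an unprotected member variable - is the classic lost-update race.  A generic Python sketch of the standard fix (not the real code; without the lock, concurrent increments can interleave and lose updates):

```python
import threading

# Generic sketch of guarding shared state with a lock.  The increment on
# `count` is a read-modify-write; without the lock, two threads can read
# the same old value and one update gets lost.
class SharedCounter:
    def __init__(self):
        self.count = 0
        self._lock = threading.Lock()

    def bump(self, times):
        for _ in range(times):
            with self._lock:        # serialize the read-modify-write
                self.count += 1

counter = SharedCounter()
threads = [threading.Thread(target=counter.bump, args=(100_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.count)                # 400000, every run, with the lock held
```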


    As a consequence of these two very subtle low-level bugs, the speech recognition engine wasn't able to correctly control the gain on the microphone - when it tried, it hit the notification feedback loop, which caused the microphone to clip, which meant that the samples being received by the speech recognition engine weren't accurate.

    There were other contributing factors to the problem (the bug was fixed in more recent Vista builds than the one they were using for the demo, there were some issues with the way the speech recognition engine had been "trained", etc.), but it doesn't matter - without those two bugs, the problem wouldn't have been nearly as significant.

    Mea Culpa.


    A Tree Grows... How?

    Valorie's currently attending a seminar about teaching science at the middle-school level.

    Yesterday, her instructor asked the following question:

    "I have in my hand a Douglas Fir tree seed that masses 1 gram [I'm making this number up, it's not important].  I plant it in my yard, water it regularly, and wait for 20 years.

    At the end of that time, I have a 50 foot tree that masses 1,000 kilograms [again, I'm making this exact number up, it's not important].


    My question is: Where did the 999,999 grams of mass come from?"


    I'm going to put the question out to the group: Where DID the 999,999 grams of mass come from in the tree?

    The answer surprises a lot of people.  And it calls into question how much we actually know about science.



    I'm going to change my comment moderation policy for this one.  I'm NOT going to approve comments that contain the right answer until I post my follow-up tomorrow, because once the right answer's been given, it's pretty obvious.  But I'll be interested in knowing the percentage of comments that have the right answer vs. the wrong answer.



    What’s up with the Beep driver in Windows 7?


    Earlier today, someone asked me why 64bit versions of Windows don’t support the internal PC speaker beeps.  The answer is somewhat complicated and ends up being an interesting intersection of a host of conflicting tensions in the PC ecosystem.


    Let’s start by talking about how the Beep hardware worked way back in the day[1].  The original IBM PC contained an Intel 8254 programmable interval timer chip to manage the system clock.  Because the IBM engineers felt that the PC needed to be able to play sound (but not particularly high quality sound), they decided that they could use the 8254 as a very primitive square wave generator.  To do this, they programmed the 3rd timer on the chip to operate in Square Wave mode and to count down with the desired output frequency.  This caused the Out2 line on the chip to toggle from high to low every time the clock went to 0.  The hardware designers tied the Out2 line on the chip to the PC speaker and voila – they were able to use the clock chip to program the PC speaker to make a noise (not a very high quality noise but a noise nonetheless).
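    To make the arithmetic concrete: the 8254's input clock on the PC runs at 1,193,182 Hz, and loading timer 2 with a count N in square wave mode produces a tone of clock/N Hz.  A small Python sketch of the calculation (for illustration only - the real programming happens via I/O ports):

```python
# The 8254's input clock on the PC is 1,193,182 Hz (one third of the
# original 4.77 MHz CPU clock).  In square wave mode, loading timer 2
# with a count N toggles the Out2 line at clock/N Hz - that's the tone
# the speaker plays.
PIT_CLOCK_HZ = 1_193_182

def divisor_for(frequency_hz):
    """Count value to load into the 8254 for a desired speaker tone."""
    return PIT_CLOCK_HZ // frequency_hz

def actual_frequency(divisor):
    """Tone the hardware actually produces for a given count value."""
    return PIT_CLOCK_HZ / divisor

d = divisor_for(440)                     # concert A
print(d, round(actual_frequency(d), 1))  # -> 2711 440.1
```

    Note that the integer divisor means the speaker can only approximate most frequencies - one reason the sound quality was never great.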

    The Beep() Win32 API is basically a thin wrapper around this 8254 PIT (programmable interval timer) functionality.  So when you call the Beep() API, you program the 8254 to play sounds on the PC speaker.


    Fast forward about 25 years…  The PC industry has largely changed and the PC architecture has changed with it.  Modern PCs no longer use the 8254 as the system timer, but the chip is still there.  And that’s because the 8254 is still used to drive the PC speaker. 

    One of the other things that happened in the intervening 25 years was that machines got a whole lot more capable.  Now machines come with capabilities like newfangled hard disk drives (some of which can even hold more than 30 megabytes of storage (but I don’t know why on earth anyone would ever want a hard disk that can hold that much stuff)).  And every non-server machine sold today has a PC sound card.  So every single machine sold today has two ways of generating sounds – the PC sound card and the old 8254 which is tied to the internal PC speaker (or to a dedicated input on the sound card – more on this later).


    There’s something else that happened in the past 25 years.  PCs became commodity systems.  And that started exerting a huge amount of pressure on PC manufacturers to cut costs.  They looked at the 8254 and asked “why can’t we remove this?”

    It turns out that they couldn’t.  And the answer to why they couldn’t came from a totally unexpected place: the Americans with Disabilities Act.


    The ADA?  What on earth could the ADA have to do with a PC making a beep?  Well, it turns out that at some point in the intervening 25 years, the Win32 Beep() API came to be used for assistive technologies – in particular, the sounds made when you enable assistive technologies like StickyKeys were generated using the Beep() API.  There are about 6 different assistive technology (AT) sounds built into Windows; their implementation is plumbed fairly deep inside the win32k.sys driver. 

    But why does that matter?  Well, it turns out that many enterprises (both governments and corporations) have requirements that prevent them from purchasing equipment that lacks accessibility technologies, and that meant that you couldn’t sell computers without beep hardware to those enterprises.


    This issue was first noticed when Microsoft was developing the first 64bit version of Windows.  Because the original 64bit Windows was intended for servers, the hardware requirements for 64bit machines didn’t include support for an 8254 (apparently the AT requirements are relaxed on servers).  But when we started building a client 64bit OS, we had a problem – client OSes had to support AT, so we needed to bring the beep back even on machines that didn’t have beep hardware.

    For Windows XP this was solved with some custom code in winlogon which worked but had some unexpected complications (none of which are relevant to this discussion).  For Windows Vista, I redesigned the mechanism to move the accessibility beep logic to a new “user mode system sounds agent”. 

    Because the only machines with this problem were 64bit machines, this functionality was restricted to 64bit versions of Windows. 

    That in turn meant that PC manufacturers still had to include support for the 8254 hardware – after all, if the user chose to buy the machine with a 32bit operating system on it, they might want to use the AT functionality.

    For Windows 7, we resolved the issue completely – we moved all the functionality that used to be contained in Beep.Sys into the user mode system sounds agent.  Now when you call the Beep() API, instead of manipulating the 8254 chip, the call is re-routed to a user mode agent which actually plays the sounds.


    There was another benefit associated with this plan.  Remember above when I mentioned that the 8254 output line was tied to a dedicated input on the sound card?  Because of that input, the sound hardware had to stay powered on at full power all the time – the system couldn’t know when an application might call Beep() and thus activate the 8254 (there’s no connection between the 8254 and the power management infrastructure, so the system can’t power up the sound hardware when someone programs the 3rd timer on the 8254).  By redirecting Beep() calls through the system audio hardware, the system can put the sound hardware to sleep until it’s needed.


    This redirection also had a couple of unexpected benefits.  For instance, when you accidentally type (or grep) through a file containing 0x07 characters (like a .obj file), you can finally turn off the annoying noise – since the beeps are played through the PC speakers, the PC mute key works to shut them up.  It also means that you can now control the volume of the beeps. 

    There were also some unexpected consequences.  The biggest was that people started noticing when applications called Beep().  They had placed their PCs far enough away (or there was enough ambient noise) that they had never noticed their PC beeping at them until the sounds started coming out of their speakers.



    [1] Thus providing me with a justification to keep my old Intel component data catalogs from back in the 1980s.


    Tipping Points


    One of my birthday presents was the book "The Tipping Point" by Malcolm Gladwell.

    In it, he talks about how epidemics and other flash occurrences happen - situations that are stable until some small thing changes, and suddenly the world is different overnight.

    I've been thinking a lot about yesterday's blog post, and I realized that not only is it a story about one of the coolest developers I've ever met, it also describes a tipping point for the entire computer industry.

    Sometimes, it's fun to play the "what if" game, so...

    What if David Weise hadn't gotten Windows applications running in protected mode?  Now, keep in mind, this is just my rampant speculation, not what would have happened.  Think of it kinda like the Marvel Comics "What if..." series (What would have happened if Spiderman had rescued Gwen Stacy, etc [note: the deep link may not work, you may have to navigate directly]).

    "What If David Weise hadn't gotten Windows applications running in protected mode..."[1]

    Well, if Windows 3.0 hadn't had Windows apps running in protected mode, then it likely would not have been successful.  That means that instead of revitalizing Microsoft's interest in the MS-DOS series of operating systems, Microsoft would have continued working on OS/2.  Even though working under the JDA was painful for both Microsoft and IBM, it was the best game in town.

    By 1993, Microsoft and IBM would have debuted OS/2 2.0, which would have supported 32bit applications and had MVDM support built in.

    Somewhere over the next couple of years, the Windows NT kernel would have come out as the bigger, more secure brother of OS/2, and it would have kept the Workplace Shell that IBM wrote (instead of the Windows 3.1 Task Manager).

    Windows 95 would never have existed, since the MS-DOS line would have withered and died off.  Instead, OS/2 would be the 32bit operating system for lower end machines.  And instead of Microsoft driving the UI story for the platform, IBM would have owned it.

    By 2001, most PC class machines would have OS/2 running on them (probably OS/2 2.5) with multimedia support.  NT OS/2 would also be available for business and office class machines.  With IBM's guidance, instead of the PCI bus becoming dominant, MCA would have been the dominant bus form factor.  The nickname for the PC architecture wouldn't have been "Wintel"; instead it would have been "Intos" ("OS2tel" was just too awkward to say).  IBM, Microsoft and Intel would all have worked to drive the hardware platform, and, since IBM was the biggest vendor of PC class hardware, they'd have had a lot to say in the decisions.

    And interestingly enough, when IBM came to the realization that they could make more money selling consulting services than selling hardware, instead of moving to Linux, they'd have stuck with OS/2 - they had a significant ownership stake in the platform, and they'd have pushed it as hard as they could.

    From Microsoft's perspective, the big change would be that instead of Microsoft driving the industry, IBM (as Microsoft's largest OEM, and development partner in OS/2) would be the driving force (at least as far as consumers were concerned).  UI decisions would be made by IBM's engineers, not Microsoft's.

    In my mind, the biggest effect of such a change would be on Linux.  Deprived of the sponsorship of a major enterprise vendor (the other enterprise players having followed IBM's lead and gone with OS/2), Linux would have remained primarily an 'interesting' alternative to Solaris, AIX, and the other *nix based operating systems sold by hardware vendors.  Instead, AIX and Solaris would have become the major players in the *nix OS space, and flourished as alternatives. 


    Anyway, it's all just silly speculation, about what might have happened if the industry hadn't tipped, so take it all with a healthy pinch of salt.

    [1] I'm assuming that all other aspects of the industry remain the same: the internet tidal wave hit in the mid 90s, computers kept getting faster as they always had, etc. - this may not be a valid set of assumptions, but it's my fantasy.  I'm also not touching on what effects the DoJ would have had on the situation.


    What's this untitled slider doing on the Vista volume mixer?


    Someone sent the following screen shot to one of our internal troubleshooting aliases.  They wanted to know what the "Name Not Available" slider meant.



    The audio system on Vista keeps track of the apps that are playing sounds (it has to, to be able to display the information on what apps are playing sounds :)).  It keeps this information around for a period of time after the application has made the sound to enable the scenario where your computer makes a weird sound and you want to find out which application made the noise.

    The system only keeps track of the PID for each application, it's the responsibility of the volume mixer to convert the PID to a reasonable name (the audio service can't track this information because of session 0 isolation).

    This works great, but there's one possible problem: If an application exits between the time when the application made a noise and the system times out the fact that it played the noise, then the volume mixer has no way of knowing what the name of the application that made the noise was. In that case, it uses the "Name Not Available" text to give the user some information.
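    The fallback logic is simple enough to sketch (hypothetical names, Python just for illustration; the real mixer talks to Windows process APIs): the mixer looks the recorded PID up in the table of running processes, and when the lookup fails because the process has already exited, it substitutes the placeholder text:

```python
# Sketch of the mixer's fallback.  The audio system records only the PID
# of each app that played a sound; if the app exits before that record
# times out, the PID no longer resolves to a process name, so the mixer
# shows placeholder text instead.
FALLBACK = "Name Not Available"

def display_name(pid, running_processes):
    """Map a recorded PID to a friendly name, or the fallback text."""
    return running_processes.get(pid, FALLBACK)

running = {1234: "Media Player"}
print(display_name(1234, running))      # app is still running: real name
print(display_name(5678, running))      # already exited: Name Not Available
```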


    It was 20 years ago today...


    Nope, Sgt Pepper didn’t teach the band to play.

    20 years ago today, a kid fresh out of Carnegie-Mellon University showed up at the door of 10700 Northup Way, ready to start his first day at a real job.

    What a long strange trip it’s been.

    Over the past 20 years, I’ve:

    • Worked on two different versions of MS-DOS (4.0, 4.1).

    • Worked on three different versions of Lan Manager (1.0, 1.5, 2.0).

    • Worked on five different releases of Windows NT (3.1, 3.5, XP (SP2), W2K3 (SP1), Longhorn).

    • Worked on four different releases of Exchange (4.0, 5.0, 5.5, and 2000).

    I’ve watched my co-workers move on to become senior VP’s.  I’ve watched my co-workers leave the company.

    I’ve seen the children of my co-workers grow up, go to college, marry, and have kids.

    I’ve watched the younger brother of my kids’ babysitter (whom I first met when he was 12) grow up, go to college, and come to work at Microsoft in the office around the corner from mine (that one is REALLY weird, btw).

    I’ve seen strategies come and go (Lan Manager as an OEM product, then retail, then integrated with the OS).

    I’ve watched three different paradigm shifts occur in the software industry, and most of a fourth.  The first one was the shift of real computing to “personal” computers.  The second was the GUI revolution, the third was the internet, and now we’re seeing a shift to smaller devices.  We’re still not done with that one.

    I’ve watched Microsoft change from a “small software startup in Seattle” to the 800 pound gorilla everyone hates.

    I’ve watched Microsoft grow from 650ish people to well over 50,000.

    I’ve watched our stock grow and shrink.  I’ve watched co-workers’ fortunes rise and fall.

    I’ve watched governments sue Microsoft.  I’ve watched Governments settle with Microsoft.  I’ve seen Microsoft win court battles.  I’ve seen Microsoft lose court battles.

    I’ve watched the internet bubble start, blossom, and explode.

    I’ve watched cellular phones go from brick-sized lumps to something close to the size of matchbooks.

    I’ve seen the computer on my desktop go from a 4.77MHz 8088 with 512K of RAM and a 10M hard disk to a 3.2GHz hyper-threaded Pentium 4 with 1G of RAM and an 80G hard disk.

    I’ve watched the idea of multimedia on the PC go from squeaky beeps from the speaker to 8-channel surround sound that would rival audiophile quality products.

    I’ve watched video on the PC go from 640x350 Black&White to 32bit color rendered in full 3d with millions of polygons.

    When I started at Microsoft, the computer that they gave me was a 4.77MHz PC/XT, with a 10 megabyte hard disk, and 512K of RAM.  I also had a Microsoft Softcard that increased the RAM to 640K, and it added a clock to the computer, too (they didn’t come with one by default)!  Last month, I bought a new computer for my home (my old one was getting painfully slow).  The new computer is a 3.6GHz Pentium 4, with 2 GIGABYTES(!) of RAM, and a 400 GIGABYTE hard disk.  My new computer cost significantly less than the first one did.  If you index for inflation, the new computer is at least an order of magnitude cheaper.

    I still have the letter that Microsoft sent me confirming my job offer.  It’s dated January 16th, 1984.  It’s formatted in Courier, and the salary and stock option information is written in ink.  It’s signed (in ink) by Steve Ballmer.  The offer letter also specifies the other benefits; it’s not important what they are.  I also have Steve’s business card – his job title?  VP, Corporate Staffs.  Yup, he was head of HR back then (he did lots of other things, but that’s what his title was).  I also have the employee list they gave out for the new hires; as I said before, there are only about 600 people on it.  Of those 600 people, 48 are still with Microsoft.  Their job titles range from Executive Assistant, to UK Project Troubleshooter, to Architect, to Director. 

    The only person from my interview loop who is still at Microsoft is Mark Zbikowski.  Mark also has the most seniority of anyone still at Microsoft (except for Steve Ballmer and Bill Gates).

    When I started in the Lan Manager group, Brian Valentine was a new hire.  He was a test lead in the Lan Manager group, having just joined the company from Intel.  He (and Paul Maritz) used to tell us war stories about their time at Intel (I particularly remember the ones about the clean desk patrol).

    In the past twenty years, I’ve had 16 different managers:  Alan Whitney (MS-DOS 4.0); Anthony Short (MS-DOS 4.0); Eric Evans (MS-DOS 4.0, MS-DOS 4.1); Barry Shaw (Lan Manager 1.0); Ken Masden (Lan Manager 1.5, Lan Manager 2.0); Dave Thompson (Lan Manager 2.0, Windows NT 3.1); Chuck Lenzmeier (Windows NT 3.5); Mike Beckerman (Tiger); Rick Rashid (Tiger); Max Benson (Exchange 4.0, 5.0, 5.5); Soner Terek (Exchange 5.5, Exchange 2000); Jon Avner (Exchange 2000); Harry Pyle (SCP); Frank Yerrace (Longhorn); Annette Crowley (Longhorn) and Noel Cross (Longhorn).

    I’ve moved my office 18 different times (the shortest time I’ve spent in an office: 3 weeks).  I’ve lived through countless re-orgs.  On the other hand, I’ve never had a reorg that affected my day-to-day job.

    There have been so many memorable co-workers I’ve known over the years.  I can’t name them all (and I know that I’ve missed some really, really important ones), but I’ll try to hit some highlights.  If you think you should be on the list but aren’t, blame my poor memory, I apologize, and drop me a line!

    Gordon Letwin – Gordon was the OS guru at Microsoft when I started, he was the author of the original H19 terminal ROM before coming to Microsoft.  In many ways, Gordon was my mentor during my early years at Microsoft.

    Ross Garmoe – Ross was the person who truly taught me how to be a software engineer.  His dedication to quality continues to inspire me.  Ross also ran the “Lost Lambs” Thanksgiving Dinner – all of us in Seattle without families were welcome at Ross’s house where his wife Rose and their gaggle of kids always made us feel like we were home.  Ross, if you’re reading this, drop me a line :)

    Danny Glasser – Danny had the office across the hall from me when I was working on DOS Lan Manager.  He’s the guy who gave me the nickname of “DOS Vader”.

    Dave Cutler – Another inspiration.  He has forgotten more about operating systems than I’ll ever know.

    David Thompson – Dave was the singularly most effective manager I’ve ever had.  He was also my least favorite.  He pushed me harder than I’ve ever been pushed before, and taught me more about how to work on large projects than anyone had done before.  Valorie was very happy when I stopped working for him.

    David Weise – David came to Microsoft from Dynamical Systems Research, which I believe was Microsoft’s third acquisition.  He owned the memory management infrastructure for Windows 3.0.

    Aaron Reynolds – Author of the MS-NET redirector, one of the principal DOS developers.

    Ralph Lipe – Ralph designed most (if not all) of the VxD architecture that continued through Win9x.

    David, Aaron, and Ralph formed the core of the Windows 3.0 team; it wouldn’t have been successful without them.  Collectively they’re the three people that I believe are most responsible for the unbelievable success of Windows 3.0.  Aaron retired a couple of years ago; David and Ralph are still here.  I remember David showing me around building 3 showing off the stuff in Windows 3.0.  The only thing that was going through my mind was “SteveB’s going to freak when he sees this stuff – this will blow OS/2 completely out of the water”.

    Paul Butzi – Paul took me for my lunch interview when I interviewed at Microsoft.  He also was in the office next to mine when I started (ok, I was in a lounge, he was in an office).  When I showed up in a suit, he looked at me and started gagging – “You’re wearing a ne-ne-ne-neckt….”  He never did get the word out.

    Speaking of Paul.  There was also the rest of the Xenix team:  Paul Butzi, Dave Perlin, Lee Smith, Eric Chin, Wayne Chapeski, David Byrne, Mark Bebie (RIP), Neil Friedman and many others.  Xenix 386 was the first operating system for the Intel 386 computer (made by Compaq!).  Paul had a prototype in his office, he had a desk fan blowing on it constantly, and kept a can of canned air handy in case it overheated.

    Ken Masden – the man who brought unicycle juggling to Microsoft.

    All of the “core 12”: Dave Cutler (KE), Lou Perazzoli (MM), Mark Lucovsky (Win32), Steve Wood (Win32, OB), Darryl Havens (IO), Chuck Lenzmeier (Net), John Balciunas (Bizdev), Rob Short (Hardware), Gary Kimura (FS), Tom Miller (FS), Ted Kummert (Hardware), Jim Kelly (SE), Helen Custer (Inside Windows NT), and others.  These folks came to Microsoft from Digital Equipment with a vision to create something brand new.  As Tom Miller put it, it was likely to be the last operating system ever built from scratch (and no, Linux doesn’t count – NT was 100% new code (ok, the command interpreter came from OS/2); the Linux kernel is 100% new, but the rest of the system isn’t).  And these guys delivered.  It took longer than anyone had originally planned, but they delivered.  And these guys collectively taught Microsoft a lesson in how to write a REAL operating system, not a toy operating system like we’d been working on before.  Some day I’ll write about Gary Kimura’s coding style.

    Brian Valentine – Brian is without a doubt the most inspirational leader at Microsoft.  His ability to motivate teams through dark times is legendary.  When I joined the Exchange team in 1994, the team was the laughing stock at Microsoft for our inability to ship product (Exchange had been in development for almost six years at that point), and we still had another year to go.  Brian led the team throughout this period with his unflagging optimism and in-your-face, just do it attitude.  For those reading this on the NT team: The Weekly World News was the official newspaper of the Exchange team LONG before it was the official newspaper of the Windows team.

    Max Benson – Max was my first manager in Exchange.  He took a wild chance on a potentially burned out engineer (my time in Research was rough) and together we made it work.

    Jaya Matthew – Jaya was the second person I ever had report to me; her pragmatism and talent were wonderful to work with.  She’s also a very good friend.

    Jim Lane, Greg Cox, and Ardis Jakubaitis – Jim, Greg, Ardis, Valorie and I used to play Runequest together weekly.  When I started, they were the old hands at Microsoft, and their perspectives on the internals of the company were invaluable.  They were also very good friends.

    And my list of co-workers would not be complete without one other:  Valorie Holden.  Yes, Valorie was a co-worker.  She started at Microsoft in 1985 as a summer intern working on testing Word and Windows 1.0.  While she was out here, she accepted my marriage proposal, and we set a date in 1987.  She went back to school, finished her degree, and we got married.  After coming out here, she started back working at Microsoft, first as the bug coordinator for OS/2, then as Nathan Myhrvold’s administrative assistant, then as a test lead in the Windows Printing Division, eventually as a program manager over in WPD.  Valorie has stood by my side through my 20 years at Microsoft; I’d never have made it without her unflagging support and advice (ok, the threats to my managers didn’t hurt either).

    There’ve been good times: Getting the first connection from the NT redirector to the NT server; Shipping Exchange 4.0; Shipping Exchange 2000 RC2 (the ship party in the rain).  Business trips to England.  Getting a set of cap guns from Brian Valentine in recognition of the time I spent in Austin for Lan Manager 2.0 (I spent 6 months spending Sunday-Wednesday in Austin, Thursday-Saturday in Redmond).

    There’ve been bad times:  Reorgs that never seemed to end.  Spending four years in ship mode (we were going to ship “6 months from now” during that time) for NT 3.1 (Read Showstopper! for more details).  The browser checker (it took almost ten years to get over that one).  A job decision decided over a coin toss.  Working in Research (I’m NOT cut out to work in research).

    But you know, the good times have far outweighed the bad, it’s been a blast.  My only question is: What’s in store for the next twenty years?


    Edit: Forgot some managers in the list :)

    Edit2: Missed Wayne from the Xenix team, there are probably others I also forgot.

    Edit3: Got some more Xenix developers :)



    Open Source and Hot Rods


    I was surfing the web the other day and ran into someone linking to this article by Jack Lanier from Edmunds (the automotive newsletter people).

    The article's entitled "Friends Don't Let Friends Modify Cars".

    From the article:

    Today, it's difficult to make cars better and extremely easy to make them worse. Or dangerous.

    As a journalist driving modified cars, I've been sprayed with gasoline, boiling coolant, super-heated transmission fluid and nitrous oxide. (The latter was more entertaining than the former.) Several have burst into flames. Throttles have stuck wide open, brake calipers snapped clean off, suspensions ripped from their mounts and seatbelt mounting hardware has dropped into my lap. All this is on top of the expected thrown connecting rod, blown head gasket, exploded clutch, disintegrated turbocharger and broken timing belt.

    The vast majority of these vehicles were built by professionals. Many were from big-name tuners. Most performed as if they were constructed in a shop class at a high school with a lax drug policy. Once, after a suspension component fell off a car from a big-name tuner, the car actually handled better.

    For every modified and tuner car that performed better than stock, I've driven numerous examples that were slower. If they were quicker, it was often in an area that can't be used on the street. What's the use of gaining 0.2 second in the quarter-mile if the car is slower 0-60 mph? And costs $10,000 more?


    Recently, I autocrossed a pair of Subaru WRXs. One was a dead-stock WRX. The other, a tricked-out STi lowered with stiffer springs, shocks and bars and an exhaust kit and air filter. The STi is supposed to have an advantage of some 70 horsepower. Maybe the exhaust and filter moved the power up in the rev band where it couldn't be used. The lowered, stiffened STi regularly bottomed against its bump stops. When a car hits its bump stops, the spring rate goes to infinity and tire grip drops to near zero. This caused the STi to leap into the worst understeer I've experienced with inflated front tires. Meanwhile, in the unmodified WRX, I could be hard in the throttle at the same point. The result: The dead-stock WRX was at least as quick as the STi and far easier to drive.

    Easy to make worse, harder to make better.

    I read this article and was struck by the similarities between this and the open source vs COTS model.

    COTS (Commercial Off The Shelf) software is equivalent to a stock automobile.  It's built by professional engineers and tested as a whole.  But you don't get to mess with the system.

    On the other hand, open source gives you the ability to join the software equivalent of the tuner/modified market - you can tweak the system to your heart's content.  You may make it go faster, but you're not totally sure what it's going to do to the overall quality of the system.

    In fact, I constantly read that that's one of the huge benefits of open source - on an open source project, if you don't like how something works, you can just step in and fix it, while with COTS you don't have that ability.

    Software engineering is software engineering, whether it's open source or closed source.  Having the ability to tweak code (or an automobile) doesn't automatically mean that the tweak will be higher quality than what it's replacing.  It's entirely possible that it either won't be better, or that the improvement won't really matter.  On the IMAP mailing list, I CONSTANTLY see people submitting patches to the U.W. IMAP server proposing tweaks to fix one thing or another (even though it's the wrong mailing list, the patches still come in).  And Mark Crispin shoots them down all the time, because the person making the patch didn't really understand the system - their patch might have fixed their problem in their configuration, but it opened up a security hole, broke some other configuration, etc.

    Btw, the same thing holds true for system modifications.  Just because you can put a window into a hard disk doesn’t mean that the hard disk is going to work as well afterwards as it did before you took it apart.

    Just like in the automotive world, simply because you CAN modify something, it doesn't mean that it's a good idea to modify it.

  • Larry Osterman's WebLog

    Why do people think that a server SKU works well as a general purpose operating system?


    Sometimes the expectations of our customers mystify me.


    One of the senior developers at Microsoft recently complained that the audio quality on his machine (running Windows Server 2008) was poor.

    To me, it’s not surprising.  Server SKUs are tuned for high performance in server scenarios; they’re not configured for desktop scenarios.  That’s the entire POINT of having a server SKU - one of the major differences between server SKUs and client SKUs is that the client SKUs are tuned to balance the OS in favor of foreground responsiveness, while the server SKUs are tuned in favor of background responsiveness (after all, it’s a server; there’s usually nobody sitting at the console, so there’s no point in optimizing for the console).


    In this particular case, the documentation for the MMCSS service describes a large part of the root cause for the problem:  The MMCSS service (which is the service that provides glitch resilient services for Windows multimedia applications) is essentially disabled on server SKUs.  It’s just one of probably hundreds of other settings that are tweaked in favor of server responsiveness on server SKUs. 


    Apparently we’ve got a bunch of support requests coming in from customers who are running server SKUs on their desktop and are upset that audio quality is poor.  And this mystifies me.  It’s a server operating system – if you want client operating system performance, use a client operating system.



    PS: To change the MMCSS tuning options, you should follow the suggestions from the MSDN article I linked to above:

    The MMCSS settings are stored in the following registry key:

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Multimedia\SystemProfile

    This key contains a REG_DWORD value named SystemResponsiveness that determines the percentage of CPU resources that should be guaranteed to low-priority tasks. For example, if this value is 20, then 20% of CPU resources are reserved for low-priority tasks. Note that values that are not evenly divisible by 10 are rounded up to the nearest multiple of 10. A value of 0 is also treated as 10.

    For Vista, this value is set to 20; for Server 2008, it is set to 100 (which disables MMCSS).
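For illustration, the registry change described above could be packaged as a .reg file like the following sketch. The key path and value come from the MSDN text quoted above; whether a reboot is needed before the change takes effect is my assumption:

```reg
Windows Registry Editor Version 5.00

; Hypothetical example (not from the original post): applies the
; Vista-style client value of 20 on a server SKU where MMCSS has been
; effectively disabled. dword:00000014 is hex for decimal 20.
; A reboot may be required before the change takes effect.
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Multimedia\SystemProfile]
"SystemResponsiveness"=dword:00000014
```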

  • Larry Osterman's WebLog

    It's the platform, Silly!


    I’ve been mulling writing this one for a while, and I ran into the comment below the other day which inspired me to go further, so here goes.

    Back in May, Jim Gosling was interviewed by Asia Computer Weekly.  In the interview, he commented:

    One of the biggest problems in the Linux world is there is no such thing as Linux. There are like 300 different releases of Linux out there. They are all close but they are not the same. In particular, they are not close enough that if you are a software developer, you can develop one that can run on the others.

    He’s completely right, IMHO.  Just like the IBM PC’s documented architecture meant that people could create PCs that were perfect hardware clones of IBM’s PCs (thus ensuring that the hardware was the same across PCs), Microsoft’s platform stability means that you could write for one platform and trust that it works on every machine running that platform.

    There are huge numbers of people who’ve forgotten what the early days of the computer industry were like.  When I started working, most software was custom, or was tied to a piece of hardware.  My mother worked as the executive director for the American Association of Physicists in Medicine.  When she started working there (in the early 1980’s), most of the word processing was done on old Wang word processors.  These were dedicated machines that did one thing – they ran a custom word processing application that Wang wrote to go with the machine.  If you wanted to computerize the records of your business, you had two choices: You could buy a minicomputer and pay a programmer several thousand dollars to come up with a solution that exactly met your business needs.  Or you could buy a pre-packaged solution for that minicomputer.  That solution would also cost several thousand dollars, but it wouldn’t necessarily meet your needs.

    A large portion of the reason that these solutions were so expensive is that the hardware cost was so high.  The general purpose computers that were available cost tens or hundreds of thousands of dollars and required expensive facilities to manage.  So there weren’t many of them, which means that companies like Unilogic (makers of the Scribe word processing software, written by Brian Reid) charged hundreds of thousands of dollars for installations and tightly managed their code – you bought a license for the software that lasted only a year or so, after which you had to renew it (it was particularly ugly when Scribe’s license ran out (it happened at CMU once by accident) – the program would delete itself off the hard disk).

    PC’s started coming out in the late 1970’s, but there weren’t that many commercial software packages available for them.  One problem developers encountered was that the machines had limited resources, but beyond that, software developers had to write for a specific platform - the hardware was different on all of these machines, as was the operating system, and introducing a new platform linearly increases the amount of testing required.  If it takes two testers to test for one platform, it’ll take four testers to test two platforms, six testers to test three platforms, etc. (This isn’t totally accurate - there are economies of scale - but in general the principle applies: the more platforms you support, the more test resources you require.)

    There WERE successful business solutions for the early PCs, Visicalc first came out for the Apple ][, for example.  But they were few and far between, and were limited to a single hardware platform (again, because the test and development costs of writing to multiple platforms are prohibitive).

    Then the IBM PC came out, with a documented hardware design (it wasn’t really open like “open source”, since only IBM contributed to the design process, but it was fully documented).  And with the IBM PC came a standard OS platform, MS-DOS (actually IBM offered three or four different operating systems, including CP/M and the UCSD P-system, but MS-DOS was the one that took off).  In fact, Visicalc was one of the first applications ported to MS-DOS (it was ported to DOS 2.0).  But it wasn’t until 1983ish, with the introduction of Lotus 1-2-3, that the PC was seen as a business tool and people flocked to it. 

    But the platform still wasn’t completely stable.  The problem was that while MS-DOS did a great job of virtualizing the system storage (with the FAT filesystem), keyboard, and memory, it did a lousy job of providing access to the screen and printers.  The only built-in support for the screen was a simple teletype-like console output mechanism.  The only way to get color output or the ability to position text on the screen was to load a replacement console driver, ANSI.SYS - and output through these console interfaces was quite slow.

    Obviously, most ISVs (like Lotus) weren’t willing to deal with this performance issue, so they started writing directly to the video hardware.  On the original IBM PC, that wasn’t that big a deal – there were two choices, CGA or MDA (Color Graphics Adapter and Monochrome Display Adapter).  Two choices, two code paths to test.  So the test cost was manageable for most ISVs.  Of course, the hardware world didn’t stay still.  Hercules came out with their graphics adapter for the IBM monochrome monitor.  Now we have three paths.  Then IBM came out with the EGA and VGA.  Now we have FIVE paths to test.  Most of these were compatible with the basic CGA/MDA, but not all, and they all had different ways of providing their enhancements.  Some had some “unique” hardware features, like the write-only hardware registers on the EGA.

    At the same time as these display adapter improvements were coming, disks were also improving – first 5 ¼ inch floppies, then 10M hard disks, then 20M hard disks, then 30M.  And system memory increased from 16K to 32K to 64K to 256K to 640K.  Throughout all of it, the MS-DOS filesystem and memory interfaces continued to provide a consistent API to code to.  So developers continued to write to the MS-DOS filesystem APIs and grumbled about the costs of testing the various video combinations.

    But even so, vendors flocked to MS-DOS.  The combination of a consistent hardware platform and a consistent software interface to that platform was an unbelievably attractive combination.  At the time, the major competition to MS-DOS was Unix and the various DR-DOS variants, but none of them provided the same level of consistency.  If you wanted to program to Unix, you had to choose between Solaris, 4.2BSD, AIX, IRIX, or any of the other variants, each of which was a totally different platform.  Solaris’ signals behaved subtly differently from AIX’s, etc.  Even though the platforms were ostensibly the same, there were enough subtle differences that you either wrote for only one platform, or you took on the burden of running the complete test matrix on EVERY version of the platform you supported.  If you ever look at the source code to an application written for *nix, you can see this quite clearly - there are literally dozens of conditional compilation options for the various platforms.

    On MS-DOS, on the other hand, if your app worked on an IBM PC, your app worked on a Compaq.  Because of the effort put forward to ensure upwards compatibility of applications, if your application ran on DOS 2.0, it ran on DOS 3.0 (modulo some minor issues related to FCB I/O).  Because the platforms were almost identical, your app would continue to run.   This commitment to platform stability has continued to this day – Visicalc from DOS 2.0 still runs on Windows XP.

    This meant that you could target the entire ecosystem of IBM PC compatible hardware with a single test pass, which significantly reduced your costs.   You still had to deal with the video and printer issue however.

    Now along came Windows 1.0.  It virtualized the video and printing interfaces providing, for the first time, a consistent view of ALL the hardware on the computer, not just disk and memory.  Now apps could write to one API interface and not worry about the underlying hardware.  Windows took care of all the nasty bits of dealing with the various vagaries of hardware.  This meant that you had an even more stable platform to test against than you had before.  Again, this is a huge improvement for ISV’s developing software – they no longer had to wonder about the video or printing subsystem’s inconsistencies.

    Windows still wasn’t an attractive platform to build on, since it had the same memory constraints as DOS had.  Windows 3.0 fixed that, allowing for a consistent API that finally relieved the 640K memory barrier.

    Fast forward to 1993 - NT 3.1 comes out providing the Win32 API set.  Once again, you have a consistent set of APIs that abstracts the hardware and provides a constant API set.  Win9x, when it came out, continued the tradition.  Again, the API is consistent.  Apps written to Win32s (the subset of Win32 intended for Win 3.1) still run on Windows XP without modification.  One set of development costs, one set of test costs.  The platform is stable.  With the Unix derivatives, you still had to either target a single platform or bear the costs of testing against all the different variants.

    In 1995, Sun announced that its new Java technology would be introduced to the world.  Its biggest promise was that it would, like Windows, deliver platform independent stability.  In addition, it promised cross-operating system stability.  If you wrote to Java, you’d be guaranteed that your app would run on every JVM in the world.  In other words, it would finally provide application authors the same level of platform stability that Windows provided, and it would go Windows one better by providing the same level of stability across multiple hardware and operating system platforms.

    In his interview, Jim Gosling is just expressing his frustration with the fact that Linux isn’t a completely stable platform.  Since Java is supposed to provide a totally stable platform for application development, Java needs to smooth out the differences between operating systems, just like Windows needs to smooth out the differences between the hardware on the PC.

    The problem is that Linux platforms AREN’T totally stable.  While the kernel might be the same on all distributions (and it’s not, since different distributions use different versions of the kernel), the other applications that make up the distribution might not be.  Java needs to be able to smooth out ALL the differences in the platform, since its bread and butter is providing a stable platform.  If some Java facilities require things outside the basic kernel, then they’ve got to deal with all the vagaries of the different versions of the external components.  As Jim commented, “They are all close, but not the same.”  These differences aren’t that big a deal for someone writing an open source application, since the open source methodology fights against packaged software development.  Think about it: How many non open-source software products can you name that are written for open source operating systems?  What distributions do they support?  Does Oracle support Linux distributions other than Red Hat Enterprise?  The reason that there are so few is that the cost of development for the various “Linux” derivatives is close to prohibitive for most shrink-wrapped software vendors; instead they pick a single distribution and use that (thus guaranteeing themselves a stable platform).

    For open source applications, the cost of testing and support is pushed from the developer of the package to the end user.  It’s no longer the responsibility of the author of the software to guarantee that their software works on a given customer’s machine - since the customer has the source, they can fix the problem themselves.

    In my honest opinion, platform stability is the single biggest thing that Microsoft’s monoculture has brought to the PC industry.  Sure, there’s a monoculture, but that means that developers only have to write to a single API.  They only have to test on a single platform.  The code that works on a Dell works on a Compaq, works on a Sue’s Hardware Special.  If an application runs on Windows NT 3.1, it’ll continue to run on Windows XP.

    And as a result of the total stability of the platform, a vendor like Lotus can write a shrink-wrapped application like Lotus 1-2-3 and sell it to hundreds of millions of users and be able to guarantee that their application will run the same on every single customer’s machine. 

    What this does is allow Lotus to reduce the price of their software product.  Instead of a software product costing tens of thousands of dollars, software product costs have fallen to the point where you can buy a fully featured word processor for under $130.  

    Without this platform stability, the testing and development costs go through the roof, and software costs escalate enormously.

    When I started working in the industry, there was no volume market for fully featured shrink wrapped software, which meant that it wasn’t possible to amortize the costs of development over millions of units sold. 

    The existence of a stable platform has allowed the industry to grow and flourish.  Without a stable platform, development and test costs would rise and those costs would be passed onto the customer.

    Having a software monoculture is NOT necessarily an evil. 

  • Larry Osterman's WebLog

    Somehow I don't think I'm going to see this story on slashdot any time soon :)


    Michael Howard sent the following news article to one of our internal DL's this morning.  For some reason, I don't think it's going to hit the front page of Slashdot any time soon:

    Serving as the latest reminder of that fact is Antioch University in Yellow Springs, Ohio, which recently disclosed that Social Security numbers and other personal data belonging to more than 60,000 students, former students and employees may have been compromised by multiple intrusions into its main ERP server.

    The break-ins were discovered Feb. 13 and involved a Sun Solaris server that had not been patched against a previously disclosed FTP vulnerability, even though a fix was available for the flaw at the time of the breach, university CIO William Marshall said today.


    "When we went in and did a further investigation, we found that there was an IRC bot installed on the system," Marshall said.

    So Antioch's Solaris systems were (a) compromised by an old vulnerability, and (b) were being used as botnet clients.  Both of which the slashdot crowd claim only happens on "Windoze" machines.

    At what point do people pull their heads out of the sand and realize that computer security and patching disciplines are an industry-wide issue and not just a single platform issue?  Even after the Pwn2Own contest last month was won by a researcher who exploited a flash vulnerability, the vast majority of the people commenting on the ZDNet article claimed that the issue was somehow "windows only".  Ubuntu even published a blog post that claimed that they "won" (IMHO they didn't, because Shane has said that the only reason he chose not to attack the Ubuntu machine was that he was more familiar with Windows).  The reality is that nobody "wins" these contests (except maybe the security researcher who gets a shiny new computer at the end).  It's just a matter of time before the machine will get 0wned.

    Ignoring stories like this makes people believe that somehow security issues are isolated to a single platform, and that in turn leaves them vulnerable to hackers.  It's far better to acknowledge that the IT industry as a whole has an issue with security and ask how to move forward.


    Edit: Ubunto->Ubuntu (oops :))

  • Larry Osterman's WebLog

    Resilience is NOT necessarily a good thing


    I just ran into this post by Eric Brechner who is the director of Microsoft's Engineering Excellence center.

    What really caught my eye was his opening paragraph:

    I heard a remark the other day that seemed stupid on the surface, but when I really thought about it I realized it was completely idiotic and irresponsible. The remark was that it's better to crash and let Watson report the error than it is to catch the exception and try to correct it.

    Wow.  I'm not going to mince words: What a profoundly stupid assertion to make.  Of course it's better to crash and let the OS handle the exception than to try to continue after an exception.


    I have a HUGE issue with the concept that an application should catch exceptions[1] and attempt to correct them.  In my experience, handling exceptions and attempting to continue is a recipe for disaster.  At best, it turns an easily debuggable problem into one that takes hours of debugging to resolve.  At its worst, exception handling can either introduce security holes or render security mitigations irrelevant.

    I have absolutely no problems with fail fast (which is what Eric suggests with his "Restart" option).  I think that restarting a process after the process crashes is a great idea (as long as you have a way to prevent crashes from spiraling out of control).  In Windows Vista, Microsoft built this functionality directly into the OS with the Restart Manager: if your application calls the RegisterApplicationRestart API, the OS will offer to restart your application if it crashes or becomes non-responsive.  This concept also shows up in the service restart options in the ChangeServiceConfig2 API (if a service crashes, the OS will restart it if you've configured the OS to restart it).

    I also agree with Eric's comment that asserts that cause crashes have no business living in production code, and I have no problems with asserts logging a failure and continuing (assuming that there's someone who is going to actually look at the log and can understand the contents of the log, otherwise the  logs just consume disk space). 


    But I simply can't wrap my head around the idea that it's ok to catch exceptions and continue to run.  Back in the days of Windows 3.1 it might have been a good idea, but after the security fiascos of the early 2000s, any thoughts that you could continue to run after an exception has been thrown should have been removed forever.

    The bottom line is that when an exception is thrown, your program is in an unknown state.  Attempting to continue in that unknown state is pointless and potentially extremely dangerous - you literally have no idea what's going on in your program.  Your best bet is to let the OS exception handler dump core and hopefully your customers will submit those crash dumps to you so you can post-mortem debug the problem.  Any other attempt at continuing is a recipe for disaster.



    [1] To be clear: I'm not necessarily talking about C++ exceptions here, just structured exceptions.  For some C++ and C# exceptions, it's ok to catch the exception and continue, assuming that you understand the root cause of the exception.  But if you don't know the exact cause of the exception you should never proceed.  For instance, if your binary tree class throws a "Tree Corrupt" exception, you really shouldn't continue to run, but if opening a file throws a "file not found" exception, it's likely to be ok.  For structured exceptions, I know of NO circumstance under which it is appropriate to continue running.


    Edit: Cleaned up wording in the footnote.

  • Larry Osterman's WebLog

    Early Easter Eggs


    Jensen Harris's blog post today talked about an early Easter Egg he found in the Radio Shack TRS-80 Color Computer BASIC interpreter.


    What's not widely known is that there were Easter Eggs in MS-DOS.  Not many, but some did slip in.  The earliest one I know of was one in the MS-DOS "Recover" command.

    The "Recover" command was an "interesting" command.

    As it was explained to me, when Microsoft added support for hard disks (and added a hierarchical filesystem to the operating system), the powers that be were worried that people would "lose" their files (by forgetting where they put them).

    The "recover" command was an attempt to solve this.  Of course it "solved" the problem by using the "Take a chainsaw to carve the Sunday roast" technique.

    You see, the "Recover" command flattened your hard disk - it moved all the files from all the subdirectories on your hard disk into the root directory.  And it renamed them to be FILE0001.REC to FILE<n>.REC.

    If someone ever used it, their immediate reaction was "Why on earth did those idiots put such a useless tool in the OS?  Now I've got to figure out which of these files is my file, and I need to put all my files back where they came from."  Fortunately, Microsoft finally removed it from the OS in the MS-DOS 5.0 timeframe.


    Before it flattened your hard disk, it helpfully asked you if you wanted to continue (Y/N)?.

    Here's the Easter Egg: On MS-DOS 2.0 (only), if you hit "CTRL-R" at the Y/N prompt, it would print out the string "<developer email alias> helped with the new DOS, Microsoft Rules!"

    To my knowledge, nobody ever figured out how to get access to this particular easter egg, although I do remember Peter Norton writing a column about it in PC-WEEK (he found the text of the easter egg by running "strings" on the binary).

    Nowadays, adding an easter egg to a Microsoft OS is immediate grounds for termination, so it's highly unlikely you'll ever see another.


    Somewhat later:  I dug up the documentation for the "recover" command - the version of the documentation I found indicates that the tool was intended to recover files with bad sectors in them - apparently if you specified a filename, it would create a new file in the current directory that contained all the clusters from the bad file that were readable.  If you specified just a drive, it did the same thing to all the files on the drive - which had the effect of wiping your entire disk.  So the tool isn't TOTALLY stupid, but it still was pretty surprising to me when I stumbled onto it on my test machine one day.


  • Larry Osterman's WebLog

    What's wrong with this code, part 10

    Ok, time for another "what's wrong with this code".  This one's trivial from a code standpoint, but it's tricky...

    // ----------------------------------------------------------------------
    // Function:
    //      CThing1::OnSomethingHappening()
    // Description:
    //      Called when something happens
    // Return:
    //      S_OK if successful
    // ----------------------------------------------------------------------
    HRESULT CThing1::OnSomethingHappening()
    {
        HRESULT hr;

        <Do Some Stuff>

        // Perform some operation...
        hr = PerformAnOperation();
        if (FAILED(hr))
            hr = ERROR_NOT_SUPPORTED;
        IF_FAILED_JUMP(hr, Error);

    Exit:
        return hr;

    Error:
        goto Exit;
    }

    Not much code, no?  So what's wrong with it?

    As usual, answers and kudos tomorrow.

  • Larry Osterman's WebLog

    A few of my favorite Win7 Sound features – UI refinements


    Well, we shipped Windows 7, and now I’d like to talk about a few of my favorite features that were added by the Sound team.  Most of them fit in the “make it just work the way it’s supposed to” category, but a few are just cool.

    I also want to call out some stuff that people probably are going to miss in the various Windows 7 reviews.

    One of the areas I want to call out is the volume UI.  There’s actually been a ton of work done on the volume UI in Windows 7, although most of it exists under the covers.  For instance, the simple volume control (the one you get to with a single click from the volume notification area) uses what we call “flat buttons”.

    Windows 7 Simple Volume UI:      

    Windows Vista Simple Volume UI:


    Both the mute control and the device button are “flat buttons” – when you mouse over the buttons, the button surfaces:

    simple volume with "flat button" enabled

    By using the “flat buttons”, the UI continues to have the old functionality, but it visually appears cleaner.  There have been a number of other changes to the simple volume UI.  First off, we will now show more than one slider if you have more than one audio solution on your machine and you’re using both of them at the same time.  This behavior is controlled by the new volume control options dialog:


    As I mentioned above, the device icon is also a “flat button” – this enables one-click access to the hardware properties for your audio solution.


    The volume mixer has also changed slightly.  You’ll notice the flat buttons for the device and mute immediately.  We also added a flat button for the System Sounds which launches the system sounds applet.


    Another subtle change to the volume mixer is that there are now meters for individual applications as well as for the master volume:


    And finally, the volume mixer no longer flickers when resizing (yay!).  Fixing the flicker was a problem that took a ton of effort (and I needed to ask the User team for help figuring out the problem) – the solution turned out to be simple but it took some serious digging to figure it out.

  • Larry Osterman's WebLog

    Chris Pirillo's annoyed by the Windows Firewall prompt


    Yesterday, Chris Pirillo made a comment in one of his posts:

    And if you think you’re already completely protected in Windows with its default tools, think again. This morning, after months of regular Firefox use, I get this security warning from the Windows Vista Firewall. Again, this was far from the first time I had used Firefox on this installation of Windows. Not only is the dialog ambiguous, it’s here too late.

    I replied in a comment on his blog:

    The reason that the Windows firewall hasn’t warned you about FF’s accessing the net is that up until this morning, all of its attempts have been outbound. But for some reason, this morning, it decided that it wanted to receive data from the internet.

    The firewall is doing exactly what it’s supposed to do - it’s stopping FF from listening for an inbound connection (which a web browser probably shouldn’t do) and it’s asking you if it’s ok.

    Why has your copy of firefox suddenly decided to start receiving data over the net when you didn’t ask it to?

    Chris responded in email:

    Because I started to play XM Radio?  *shrug*

    My response to him (which I realized could be a post in itself - for some reason, whenever I respond to Chris in email, I end up writing many hundred word essays):

    Could be - so in this case, the firewall is telling you (correctly) exactly what happened.

    That's what firewalls do.

    Firefox HAS the ability to open the ports it needs when it installs, as does whatever plugin you're using to play XM radio.  (I documented the APIs for doing that on my blog about 3 years ago; the current versions of the APIs are easier to use than the ones I used.)  But for whatever reason it CHOSE not to do so, and instead decided that the correct user experience was to prompt the user when downloading.

    This was a choice made by the developers of Firefox and/or the developer of XM radio plugin - either by design, ignorance, schedule pressure or just plain laziness, I honestly don't know (btw, if you're using the WMP FF plugin to play from XM, my comment still stands - I don't know if this was a conscious decision or not).

    Blaming the firewall (or Vista) for this is pointless (with a caveat below). 


    The point of the firewall is to alert you that an application is using the internet in a way that's unexpected and ask you if it makes sense. You, the user, know that you've started playing audio from XM, so you, the user expect that it's reasonable that Firefox start receiving traffic from the internet. But the firewall can't know what you did (and if it was able to figure it out, the system would be so hideously slow that you'd be ranting on and on about how performance sucks).

    Every time someone opens an inbound port in the firewall, they add another opportunity for malware to attack their system. The firewall is just letting the user know about it. And maybe, just maybe, the behavior that's being described might get the user to realize that malware has infected their machine and they'll repair it.

    In your case, the system was doing you a favor. It was a false positive, yes, but that's because you're a reasonably intelligent person. My wife does ad-hoc tech support for a friend who isn't, and the anti-malware stuff in Windows (particularly Windows Defender) has saved the friend's bacon at least three times this year alone.


    On the other hand, you DO have a valid point: The dialog that was displayed by the firewall didn't give you enough information about what was happening.  I believe that this is because you were operating under the belief that the Windows firewall was both an inbound and outbound firewall.  The Windows Vista firewall  IS both, but by default it's set to allow all outbound connections (you need to configure it to block outbound connections).  If you were operating under the impression that it was an outbound firewall, you'd expect it to prompt for outbound connections.

    People HATE outbound firewalls because of the exact same reason you're complaining - they constantly ask people "Are you sure you want to do that?" (Yes, dagnabbit, I WANT to let Firefox access the internet, are you stupid or something?).

    IMHO outbound firewalls are 100% security theater[1][2]. They provide absolutely no value to customers. This has been shown time and time again (remember my comment above about applications being able to punch holes in the firewall? Malware can do the exact same thing). The only thing an outbound firewall does is piss off customers. If the Windows firewall were configured to block outbound connections by default, I guarantee you that within minutes of that release, the malware authors would simply add code to their tools to disable it.  Even if you were to somehow figure out how to block the malware from opening up outbound ports[3], the malware will simply hijack a process running in the context of the user that's allowed to access the web. Say... Firefox. This isn't a Windows-specific issue, btw - every other OS available has exactly the same issues (malware being able to inject itself into processes running in the same security context as the user running the malware).

    Inbound firewalls have very real security value, as do external dedicated firewalls. I honestly believe that the main reason you've NOT seen any internet worms since 2002 is simply because XP SP2 enabled the firewall by default. There certainly have been vulnerabilities found in Windows and other products that had the ability to be turned into a worm - the fact that nobody has managed to successfully weaponize them is a testament to the excellent work done in XP SP2.


    [1] I'm exaggerating slightly here - there is one way in which outbound firewalls provide some level of value, and that's as a defense-in-depth measure (like ASLR or heap randomization). For instance, in Vista, every built-in service (and 3rd party services if they want to take the time to opt-in) defines a set of rules which describes the networking behaviors of the service (I accept inbound connections on UDP from port <foo>, and make outbound connections to port <bar>). The firewall is pre-configured with those rules and will prevent any other access to the network from those services. The outbound firewall rules make it much harder for a piece of malware to make outbound connections (especially if the service is running in a restricted account like NetworkService or LocalService). It is important to realize this is JUST a defense-in-depth measure and CAN be worked around (like all other defense-in-depth measures).

    [2] Others disagree with me on this point - for example, Thomas Ptacek over at Matasano wrote just yesterday: "Outbound filtering is more valuable than inbound filtering; it catches “phone-home” malware. It’s not that hard to implement, and I’m surprised Leopard doesn’t do it."  And he's right, until the "phone-home" malware decides to turn off the firewall. Not surprisingly, I also disagree with him on the value of inbound filtering.

    [3] I'm not sure how you do that while still allowing the user to open up ports - functionality being undocumented has never stopped malware authors.

  • Larry Osterman's WebLog

    Another pet peeve. Nounifying the word "ask"


    Sorry about not blogging, my days are filled with meetings trying to finish up our LH beta2 features - I can't wait until people see this stuff, it's that cool.

    But because I'm in meetings back-to-back (my calendar looks like a PM's these days), I get subjected to a bunch of stuff that I just hate.

    In particular, one "meme" that seems to have taken off here at Microsoft is nounifying the word "ask".

    I can't count how many times I've been in a meeting and heard someone say: "So what are your team's asks for this feature?" or "Our only ask is that we have the source process ID added to this message".

    For the life of me, I can't see where this came from, but it seems like everyone's using it.

    What's wrong with the word "request"?  It's a perfectly good noun and it means the exact same thing that a nounified "ask" means.


  • Larry Osterman's WebLog

    Windows Vista Sound causes Network Throughput slowdowns.


    AKA: How I spent last week :).

    On Tuesday Morning last week, I got an email from "":

    You've probably already seen this article, but just in case I'd love to hear your response.

    Playing Music Slows Vista Network Performance?

    In fact, I'd not seen this until it was pointed out to me.  It seemed surprising, so I went to talk to our perf people, and I ran some experiments on my own.

    They didn't know what was up, and I was unable to reproduce the failure on any of my systems, so I figured it was a false alarm (we get them regularly).  It turns out that at the same time, the networking team had heard about the same problem and they WERE able to reproduce it.  I kept on digging, and by lunchtime I'd also generated a clean reproduction of the problem in my office.

    At the same time, Adrian Kingsley-Hughes over at ZDNet Blogs picked up the issue and started writing about it.

    By Friday, we'd pretty much figured out what was going on and why different groups were seeing different results - it turns out that the issue was highly dependent on your network topology and the amount of data you were pumping through your network adapter.  The reason I hadn't been able to reproduce it is that I only have a 100mbit Ethernet adapter in my office - you can get the problem to reproduce on 100mbit networks, but you've really got to work at it to make it visible.  Some of the people working on the problem sent a private email to Adrian Kingsley-Hughes on Friday evening reporting the results of our investigation, and Mark Russinovich (a Technical Fellow, and all-around insanely smart guy) wrote up a detailed post explaining what's going on in insane detail, which he posted this morning.

    Essentially, the root of the problem is that for Vista, when you're playing multimedia content, the system throttles incoming network packets to prevent them from overwhelming the multimedia rendering path - the system will only process 10,000 network frames per second (this is a hideously simplistic explanation; see Mark's post for the details).

    For 100mbit networks, this isn't a problem - it's pretty hard to get a 100mbit network to generate 10,000 frames in a second (you need to have a hefty CPU and send LOTS of tiny packets), but on a gigabit network, it's really easy to hit the limit.
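    To put rough numbers on that, here's my own back-of-the-envelope arithmetic (not from Mark's post - the 38 bytes of per-frame overhead is the standard Ethernet header/CRC/preamble/inter-frame-gap figure):

```cpp
// Upper bound on Ethernet frames per second for a given link speed and
// payload size.  The 38 extra bytes per frame are the usual Ethernet
// overhead: 18 bytes of header+CRC, 8 of preamble, 12 of inter-frame gap.
double framesPerSecond(double linkBitsPerSec, double payloadBytes) {
    return linkBitsPerSec / ((payloadBytes + 38.0) * 8.0);
}

// framesPerSecond(100e6, 1500) -> ~8,100  (under the 10,000/sec throttle)
// framesPerSecond(1e9,   1500) -> ~81,000 (an order of magnitude over it)
```

    So with full-sized frames, a saturated 100mbit link stays comfortably under the throttle, while a saturated gigabit link blows right past it.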


    One of the comments that came up on Adrian's blog was a comment from George Ou (another zdnet blogger):

    ""The connection between media playback and networking is not immediately obvious. But as you know, the drivers involved in both activities run at extremely high priority. As a result, the network driver can cause media playback to degrade."

    I can't believe we have to put up with this in the era of dual core and quad core computers. Slap the network driver on one CPU core and put the audio playback on another core and problem solved. But even single core CPUs are so fast that this shouldn't ever be a problem even if audio playback gets priority over network-related CPU usage. It's not like network-related CPU consumption uses more than 50% CPU on a modern dual-core processor even when throughput hits 500 mbps. There’s just no excuse for this."

    At some level, George is right - machines these days are really fast and they can do a lot.  But George is missing one of the critical differences between multimedia processing and other processing.

    Multimedia playback is fundamentally different from most of the day-to-day operations that occur on your computer. The core of the problem is that multimedia playback is inherently isochronous. For instance, in Vista, the audio engine runs with a periodicity of 10 milliseconds. That means that every 10 milliseconds, it MUST wake up and process the next set of audio samples, or the user will hear a "pop" or “stutter” in their audio playback. It doesn’t matter how fast your processor is, or how many CPU cores it has, the engine MUST wake up every 10 milliseconds, or you get a “glitch”.
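    To make that concrete (my numbers, assuming the two most common sample rates), here's how much data the engine has to have ready at every single wakeup:

```cpp
#include <cmath>

// Samples (per channel) the audio engine must have ready at each wakeup,
// given the sample rate and the engine's periodicity.
int samplesPerPeriod(int sampleRateHz, double periodSeconds) {
    return static_cast<int>(std::lround(sampleRateHz * periodSeconds));
}

// At the 10 millisecond periodicity described above:
//   samplesPerPeriod(44100, 0.010) -> 441 samples every 10 ms (CD audio)
//   samplesPerPeriod(48000, 0.010) -> 480 samples every 10 ms (DVD audio)
```

    Miss a single one of those 100-per-second deadlines and the gap is audible - no amount of spare CPU afterwards can un-glitch it.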

    For almost everything else in the system, if the system locked up for even as long as 50 milliseconds, you’d never notice it. But for multimedia content (especially audio content), you absolutely will notice the problem. The core reason has to do with the physics of sound: whenever there’s a discontinuity in the audio stream, a high frequency transient is generated. The human ear is quite sensitive to these high frequency transients (they sound like "clicks" or "pops").

    Anything that stops the audio engine from getting to run every 10 milliseconds (like a flurry of high priority network interrupts) will be clearly perceptible. So it doesn’t matter how much horsepower your machine has, it’s about how many interrupts have to be processed.

    We had a meeting the other day with the networking people where we demonstrated the magnitude of the problem - it was pretty dramatic, even on the top-of-the-line laptop.  On a lower-end machine it's even more dramatic.  On some machines, heavy networking can turn video rendering to a slideshow.


    Any car buffs will immediately want to shoot me for this analogy, because I’m sure it’s highly inaccurate (I am NOT a car person), but I think it works: You could almost think of this as an engine with a slip in the timing belt – you’re fine when you’re running the engine at low revs, because the slip doesn’t affect things enough to notice. But when you run the engine at high RPM, the slip becomes catastrophic – the engine requires that the timing be totally accurate, but because it isn’t, valves don’t open when they have to and the engine melts down.


    Anyway, that's a long winded discussion.  The good news is that the right people are actively engaged in making sure a fix is made available for the problem.

  • Larry Osterman's WebLog

    Beware of the dancing bunnies.


    I saw a post the other day (I'm not sure where, otherwise I'd cite it) that proclaimed that a properly designed system didn't need any anti-virus or anti-spyware software.

    Forgive me, but this comment is about as intelligent as "I can see a worldwide market for 10 computers" or "no properly written program should require more than 128K of RAM" or "no properly designed computer should require a fan".

    The reason for this is buried in the subject of this post, it's what I (and others) like to call the "dancing bunnies" problem.

    What's the dancing bunnies problem?

    It's a description of what happens when a user receives an email message that says "click here to see the dancing bunnies".

    The user wants to see the dancing bunnies, so they click there.  It doesn't matter how much you try to dissuade them - if they want to see the dancing bunnies, then by gum, they're going to see the dancing bunnies.  It doesn't matter how many technical hurdles you put in their way; even if those hurdles stop them at first, they're going to find a way around them and see the dancing bunny.

    There are lots of techniques for mitigating the dancing bunny problem.  There's strict privilege separation - users don't have access to any locations that can harm them.  You can prevent users from downloading programs.  You can make the user invoke magic commands to make code executable (chmod +e dancingbunnies).  You can force the user to input a password when they want to access resources.  You can block programs at the firewall.  You can turn off scripting.  You can do lots, and lots of things.

    However, at the end of the day, the user still wants to see the dancing bunny, and they'll do whatever's necessary to bypass your carefully constructed barriers in order to see the bunny.

    We know that users will do whatever's necessary.  How do we know that?  Well, because at least one virus (one of the Beagle derivatives) propagated via a password-encrypted .zip file.  In order to see the contents, the user had to open the zip file and type in the password that was contained in the email.  Users were more than happy to do that, even after years of education, and dozens of technological hurdles.

    All because they wanted to see the dancing bunny.

    The reason for a platform needing anti-virus and anti-spyware software is that it forms a final line of defense against the dancing bunny problem - at its heart, anti-virus software is software that scans every executable before it's loaded and prevents it from running if it looks like it contains a virus.
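    At its simplest, that "scan before it's loaded" step is just a signature search over the executable's bytes. A toy sketch (purely illustrative - the signature below is made up, and real scanners use vastly more sophisticated techniques like heuristics and emulation):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical signature database - real products ship huge numbers of these.
static const std::vector<std::string> signatures = {
    "EVIL_BUNNY_PAYLOAD",   // made-up byte pattern, for illustration only
};

// Return true if the executable image contains a known signature
// and therefore must not be allowed to run.
bool looksInfected(const std::vector<char>& image) {
    for (const auto& sig : signatures) {
        if (std::search(image.begin(), image.end(),
                        sig.begin(), sig.end()) != image.end()) {
            return true;
        }
    }
    return false;
}
```

    A loader hook would run looksInfected() over the file's contents before mapping it into memory, and refuse the load on a match.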

    As long as the user can run code or scripts, then viruses will exist, and anti-virus software will need to exist to protect users from them.


  • Larry Osterman's WebLog

    We've RI'ed!!!

    We've RI'ed!

    ??  What on earth is he talking about ??

    An RI is a "Reverse Integration".  The NT source system is built as a series of branches off of a main tree, and there are two sets of operations that occur - when a change is made to the trunk, the changes are "forward integrated" to the branches.  New feature development goes on in the branches, and when the feature is ready for "prime time", the work is "reverse integrated" back into the main tree, and those changes are subsequently forward integrated into the various other branches.

    The primary reason for this structure is to ensure that the trunk always has a high level of quality - the branches may be of varying quality levels, but the main trunk always remains defect free.

    Well, yesterday afternoon, our feature RI'ed into the main multimedia branch - this is the first step towards having our code in the main Windows product (which should happen fairly soon).

    When a feature is RI'ed into any of the main Windows branches, the code has to go through a series of what are called "Quality Gates".  The quality gates are in place to ensure a consistent level of engineering quality across the product - among other things, they ensure that the feature has up-to-date test and development specifications, an accurate and complete threat model, and that the tests for the feature have a certain level of code coverage.  There are a bunch of other gates beyond these, but they're related to internal processes that aren't relevant here.

    The quality gates may seem like a huge amount of bureaucracy to go through, and they can be difficult, but their purpose is really worthwhile - the quality gates are what ensures that no code is checked into the trunk that doesn't meet the quality bar for being a part of Windows.

    Our team's been working on this feature (no, I can't say what it is, yet :() for over three years, it's been a truly heroic effort on the part of everyone involved, but especially on the part of the group's development leads, Noel Cross and Alper Selcuk, who were at work at 2AM every day for most of the past three weeks ensuring that all the I's were dotted and the T's were crossed.

    This is SO cool.

    Edit: Cut&Paste error led to typo in Noel's name


  • Larry Osterman's WebLog

    Everyone wants a shiny new UI


    Surfing around the web, I often run into web sites that contain critiques of various aspects of Windows UI.

    One of the most common criticisms on those sites is "old style" dialogs.  In other words, dialogs that don't have the most up-to-date theming.  Here's an example I ran into earlier today:


    Windows has a fair number of dialogs like this - they're often fairly old dialogs that were written before new theming elements were added (or contain animations that predate newer theming options).  They all work correctly but they're just ... old.

    Usually the web site wants the Windows team to update the dialog to match the newest styling because the dialog is "wrong".

    Whenever someone asks (or more often insists) that the Windows team update their particular old dialog, I sometimes want to turn around and ask them a question:

    "You get to choose: You can get this dialog fixed OR you can cut a feature from Windows, you can't get both.  Which feature in Windows would you cut to change this dialog?"

    Perhaps an automotive analogy would help explain my rather intemperate reaction:

    One of the roads near my house is a cement road and the road is starting to develop a fair number of cracks in it.  The folks living near the road got upset at the condition of the road and started a petition drive to get the county to repair the road.  Their petition worked and county came out a couple of weeks later and inspected the road and rendered their verdict on the repair (paraphrasing):  We've looked at the road surface and it is 60% degraded.  The threshold for immediate repairs on county roads is 80% degradation.  Your road was built 30 years ago and cement roads in this area have a 40 year expected lifespan.  Since the road doesn't meet our threshold for immediate repair and it hasn't met the end of its lifespan, we can't justify moving this section of road up ahead of the hundreds of other sections of road that need immediate repair.

    In other words, the county had a limited budget for road repairs and there were a lot of other sections of road in the county that were in a lot worse shape than the one near my house.

    The same thing happens in Windows - there are thousands of features in Windows and a limited number of developers who can change those features.   Changing a dialog does not happen for free.  It takes time for the developers to fix UI bugs.  As an example, I just checked in a fix for a particularly tricky UI bug.  I started working on that fix in early October and it's now January.

    Remember, this dialog works just fine, it's just a visual inconsistency.  But it's going to take a developer some amount of time to fix the dialog.  Maybe it's only one day.  Maybe it's a week.  Maybe the fix requires coordination between multiple people (for example, changing an icon usually requires the time of both a developer AND a graphic designer).  That time could be spent working on fixing other bugs.  Every feature team goes through a triage process on incoming bugs to decide which bugs they should fix.  They make choices based on their limited budget (there are n developers on the team, there are m bugs to fix, and each bug takes t time to fix on average, so it's going to take (m*t)/n units of time to work through them all before we can ship).
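    Plugging hypothetical numbers into that triage arithmetic (all of these figures are invented for illustration) shows how quickly the budget disappears:

```cpp
// Calendar time needed to clear a bug backlog, assuming the work
// parallelizes perfectly: m bugs, t average days per bug, n developers.
double daysToClearBacklog(double m, double t, double n) {
    return (m * t) / n;
}

// With invented numbers - 400 bugs, 2 days per bug, 20 developers:
//   daysToClearBacklog(400, 2, 20) -> 40 calendar days of pure bug-fixing
```

    Eight weeks of nothing but bug-fixing for a whole team - and every cosmetic fix you take on pushes some other bug out of that budget.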

    Fixing a theming bug like this takes time that could be spent fixing other bugs.  And (as I've said before) the dialog does work correctly, it's just outdated.

    So again I come back to the question: "Is fixing a working but ugly dialog really more important than all the other bugs?"  It's unfortunate but you have to make a choice.


    PS: Just because we have to make choices like this doesn't mean that you shouldn't send feedback like this.   Just like the neighbors complaining to the county about the road, it helps to let the relevant team know about the issue. Feedback like this is invaluable for the Windows team (that's what the "Send Feedback" link is there for after all).  Even if the team decides not to fix a particular bug in this release it doesn't mean that it won't be fixed in the next release.

  • Larry Osterman's WebLog

    Why is the DOS path character "\"?

    Many, many months ago, Declan Eardly asked why the \ character was chosen as the path separator.

    The answer's from before my time, but I do remember the original reasons.

    It all stems from Microsoft's relationship with IBM.  For DOS 1.0, DOS only supported floppy disks.

    Many of the DOS utilities (except for …) were written by IBM, and they used the "/" character as the "switch" character for their utilities.  The "switch" character is the character that's used to distinguish command line switches - on *nix, it's the "-" character; on most DEC operating systems (including VMS, the DECSystem-20 and DECSystem-10), it's the "/" character.  (Note: I'm grey on whether the "/" character came from IBM or from Microsoft - several of the original MS-DOS developers were old-hand DEC-20 developers, so it's possible that they carried it forward from their DEC background.)

    The fact that the "/" character conflicted with the path character of another relatively popular operating system wasn't particularly relevant to the original developers - after all, DOS didn't support directories, just files in a single root directory.

    Then along came DOS 2.0.  DOS 2.0 was tied to the PC/XT, whose major feature was a 10M hard disk.  IBM asked Microsoft to add support for hard disks, and the MS-DOS developers took this as an opportunity to add support for modern file APIs - they added a whole series of handle based APIs to the system (DOS 1.0 relied on an application controlled structure called an FCB).  They also had to add support for hierarchical paths.

    Now historically there have been a number of different mechanisms for providing hierarchical paths.  The DECSystem-20, for example, represented directories as <volume>:<directory[.subdirectory]>FileName.Extension[,Version] (for example, "PS:<SYSTEM>MONITR.EXE,4").  VMS used a similar naming scheme, but instead of the < and > characters it used [ and ] (and VMS used ";" to differentiate between versions of files).  *nix defines hierarchical paths with a simple hierarchy rooted at "/" - in *nix's naming hierarchy, there's no way of differentiating between files and directories, etc. (this isn't bad, btw, it just is).

    For MS-DOS 2.0, the designers of DOS chose a hybrid version - they already had support for drive letters from DOS 1.0, so they needed to continue using that.  And they chose to use the *nix style method of specifying a hierarchy - instead of calling the directory out in the filename (like VMS and the DEC-20), they simply made the directory and filename indistinguishable parts of the path.

    But there was a problem.  They couldn't use the *nix form of path separator of "/", because the "/" was being used for the switch character.

    So what were they to do?  They could have used the "." character like the DEC machines, but the "." character was being used to differentiate between file and extension.  So they chose the next best thing - the "\" character, which was visually similar to the "/" character.

    And that's how the "\" character was chosen.

    Here's a little known secret about MS-DOS.  The DOS developers weren't particularly happy about this state of affairs - heck, they all used Xenix machines for email and stuff, so they were familiar with the *nix command semantics.  So they coded the OS to accept either "/" or "\" character as the path character (this continues today, btw - try typing "notepad c:/boot.ini"  on an XP machine (if you're an admin)).  And they went one step further.  They added an undocumented system call to change the switch character.  And updated the utilities to respect this flag.
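    The "accept either separator" trick is easy to approximate in application code, too. A sketch (my own helper, not a Windows API) that normalizes forward slashes before handing a path to something that only understands backslashes:

```cpp
#include <string>

// Replace every '/' with '\' - mirroring what the DOS path parser
// effectively does when it accepts either separator on input.
std::string normalizeToBackslashes(std::string path) {
    for (char& c : path) {
        if (c == '/') {
            c = '\\';
        }
    }
    return path;
}

// normalizeToBackslashes("c:/boot.ini") returns the string c:\boot.ini
```

    (The real parser accepts both forms directly rather than rewriting the string, but the effect is the same.)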

    And then they went and finished out the scenario:  They added a config.sys option, SWITCHAR= that would let you set the switch character to "-".

    Which flipped MS-DOS into a *nix style system where command lines used "-switch", and paths were / delimited.

    I don't know the exact fate of the switchar API; it's been gone for many years now.


    So that's why the path character is "\".  It's because "/" was taken.

    Edit: Fixed title - it's been bugging me all week.


  • Larry Osterman's WebLog

    So what's wrong with DRM in the platform anyway?


    As I said yesterday, it's going to take a bit of time to get the next article in the "cdrom playback" series working, so I thought I'd turn the blog around and ask the people who read it a question.

    I was reading Channel9 the other day, and someone turned a discussion of Longhorn into a rant against the fact that Longhorn's going to be all about DRM (it's not; there will be DRM support in Longhorn, just like there has been DRM support in just about every version of Windows that's distributed the Windows Media format).

    But I was curious.  Why is it so evil that a platform contain DRM support?

    My personal opinion is that DRM is a tool for content producers.  Content Producers are customers, just like everyone else that uses our product is a customer.  They want a platform that provides content protection.  You can debate whether or not that is a reasonable decision, but it's moot - the content producers today want it.

    So Microsoft, as a platform vendor provides DRM for the content producers.  If we didn't, they wouldn't use our media formats, they'd find some other media format that DOES have DRM support for their content.

    The decision to use (or not use) DRM is up to the content producer.  It's their content, they can decide how to distribute it.  You can author and distribute WMA/WMV files without content protection - all my ripped CDs are ripped without content protection (because I don't share them).  I have a bunch of WMV files shot on the camcorder that aren't DRM'ed - they're family photos, there's no point in using rights management.

    There are professional content producers out there that aren't using DRM for their content (Thermal and a Quarter is an easy example I have on the tip of my tongue (as I write this, they've run out of bandwidth :( but...)).  And there are content producers that are using DRM.

    But why is it evil to put the ability to use DRM into the product?

  • Larry Osterman's WebLog

    What's wrong with this code, part 5


    So there was a reason behind yesterday’s post about TransmitFile, it was a lead-in for “What’s wrong with this code, part 5”.

    Consider the following rather naïve implementation of TransmitFile (clearly this isn’t the real implementation; it’s just something I cooked up on the spot):

    bool TransmitTcpBuffer(SOCKET socket, BYTE *buffer, DWORD bufferSize)
    {
          DWORD bytesWritten = 0;
          do
          {
                DWORD bytesThisWrite = 0;
                if (!WriteFile((HANDLE)socket, &buffer[bytesWritten], bufferSize - bytesWritten, &bytesThisWrite, NULL))
                {
                      return false;
                }
                bytesWritten += bytesThisWrite;
          } while (bytesWritten < bufferSize);
          return true;
    }

    bool MyTransmitFile(SOCKET socket, HANDLE fileHandle, BYTE *bufferBefore, DWORD bufferBeforeSize, BYTE *bufferAfter, DWORD bufferAfterSize)
    {
          DWORD bytesRead;
          BYTE  fileBuffer[4096];

          if (!TransmitTcpBuffer(socket, bufferBefore, bufferBeforeSize))
          {
                return false;
          }
          do
          {
                if (!ReadFile(fileHandle, (LPVOID)fileBuffer, sizeof(fileBuffer), &bytesRead, NULL))
                {
                      return false;
                }
                if (!TransmitTcpBuffer(socket, fileBuffer, bytesRead))
                {
                      return false;
                }
          } while (bytesRead == sizeof(fileBuffer));

          if (!TransmitTcpBuffer(socket, bufferAfter, bufferAfterSize))
          {
                return false;
          }
          return true;
    }

    Nothing particularly exciting, but it’s got a big bug in it.  Assume that the file in question is opened for sequential I/O, that the file pointer is positioned correctly, and that the socket is open and bound before the API is called.  The API doesn’t close the socket or file handle on failure; it’s the responsibility of the caller to do so (closing the handles would be a layering violation).  The code relies on the fact that on Win32 (and *nix) a socket is just a relabeled file handle.

    As usual, answers and kudos tomorrow.




  • Larry Osterman's WebLog

    Turning the blog around - End of Life issues.


    I'd like to turn the blog around again and ask you all a question about end-of-life issues.

    And no, it's got nothing to do with Terri Schiavo.

    Huge amounts of text have been written about Microsoft's commitment to platform stability.

    But platform stability comes with an engineering cost.  It gets expensive maintaining old code - typically it's not written to modern coding standards, the longer that it exists, the more heavily patched it becomes, etc.

    For some code that's sufficiently old, the amount of engineering that's needed to move the code to a new platform can become prohibitively expensive (think about what would be involved in porting code originally written for MS-DOS to a 128-bit platform).

    So for every API, the older it gets, the more the temptation exists to find a way of ending its viable lifetime.

    On the other hand, you absolutely can't break applications.  And not just the applications that are commercially available - If a customer's line-of-business application fails because you decided to remove an API, you're going to have to put the API back.

    So here's my question: Under what circumstances is it ok to remove an API from the operating system?  Must you carry them on forever?

    This isn't just a Microsoft question.  It's a platform engineering problem - if you're committed to a stable platform (in other words, on your platform, you're not going to break existing applications on a new version of the platform), then you're going to have to face these issues.

    I have some opinions on this (no, really?) but I want to hear from you folks before I spout off on them.
