Larry Osterman's WebLog

Confessions of an Old Fogey

Windows Vista Sound causes Network Throughput slowdowns.

AKA: How I spent last week :).

On Tuesday Morning last week, I got an email from "":

You've probably already seen this article, but just in case I'd love to hear your response.

Playing Music Slows Vista Network Performance?

In fact, I'd not seen this until it was pointed out to me.  It seemed surprising, so I went to talk to our perf people, and I ran some experiments on my own.

They didn't know what was up, and I was unable to reproduce the failure on any of my systems, so I figured it was a false alarm (we get them regularly).  It turns out that at the same time, the networking team had heard about the same problem and they WERE able to reproduce the problem.  I also kept on digging and by lunchtime, I'd also generated a clean reproduction of the problem in my office.

At the same time, Adrian Kingsley-Hughes over at ZDNet Blogs picked up the issue and started writing about it.

By Friday, we'd pretty much figured out what was going on and why different groups were seeing different results. It turns out that the issue is highly dependent on your network topology and the amount of data you're pumping through your network adapter. The reason I hadn't been able to reproduce it is that I only have a 100mbit Ethernet adapter in my office: you can get the problem to reproduce on 100mbit networks, but you've really got to work at it to make it visible.  Some of the people working on the problem sent a private email to Adrian Kingsley-Hughes on Friday evening reporting the results of our investigation, and Mark Russinovich (a Technical Fellow, and all-around insanely smart guy) wrote up a post explaining what's going on in insane detail, which he posted this morning.

Essentially, the root of the problem is that in Vista, when you're playing multimedia content, the system throttles incoming network packets to prevent them from overwhelming the multimedia rendering path: the system will only process 10,000 network frames per second (this is a hideously simplistic explanation, see Mark's post for the details).

For 100mbit networks, this isn't a problem - it's pretty hard to get a 100mbit network to generate 10,000 frames in a second (you need to have a hefty CPU and send LOTS of tiny packets), but on a gigabit network, it's really easy to hit the limit.
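To make the arithmetic concrete, here's a rough back-of-the-envelope sketch in Python. The 10,000 frames per second figure is the one quoted above; the frame sizes (64 and 1500 bytes) are my assumptions for "tiny" and "typical full-size" Ethernet frames, and the calculation ignores per-frame overhead on the wire, so treat the numbers as upper bounds rather than measurements.

```python
# Back-of-the-envelope sketch: when can a link exceed the frame throttle?
# THROTTLE_FRAMES_PER_SEC comes from the post; the frame sizes are assumed.
THROTTLE_FRAMES_PER_SEC = 10_000


def max_frames_per_sec(link_bits_per_sec, frame_bytes):
    """Upper bound on frames/s a link can carry (ignores framing overhead)."""
    return link_bits_per_sec / (frame_bytes * 8)


def throttled_throughput_bits(frame_bytes, limit=THROTTLE_FRAMES_PER_SEC):
    """Bits/s that get through when exactly `limit` frames/s are processed."""
    return limit * frame_bytes * 8


if __name__ == "__main__":
    for name, speed in (("100mbit", 100e6), ("gigabit", 1e9)):
        for size in (64, 1500):  # assumed "tiny" and "full-size" frames
            fps = max_frames_per_sec(speed, size)
            hit = "can exceed" if fps > THROTTLE_FRAMES_PER_SEC else "stays under"
            print(f"{name}, {size}-byte frames: up to {fps:,.0f} frames/s, {hit} the limit")
    cap = throttled_throughput_bits(1500) / 1e6
    print(f"Throttled throughput with 1500-byte frames: {cap:.0f} Mbit/s")
```

Under these assumptions, a 100mbit link full of 1500-byte frames never reaches the limit (you need tiny packets to get there), while a gigabit link blows past it with ordinary full-size frames.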


One of the comments that came up on Adrian's blog was a comment from George Ou (another zdnet blogger):

"'The connection between media playback and networking is not immediately obvious. But as you know, the drivers involved in both activities run at extremely high priority. As a result, the network driver can cause media playback to degrade.'

I can't believe we have to put up with this in the era of dual core and quad core computers. Slap the network driver on one CPU core and put the audio playback on another core and problem solved. But even single core CPUs are so fast that this shouldn't ever be a problem even if audio playback gets priority over network-related CPU usage. It's not like network-related CPU consumption uses more than 50% CPU on a modern dual-core processor even when throughput hits 500 mbps. There’s just no excuse for this."

At some level, George is right - machines these days are really fast and they can do a lot.  But George is missing one of the critical differences between multimedia processing and other processing.

Multimedia playback is fundamentally different from most of the day-to-day operations that occur on your computer. The core of the problem is that multimedia playback is inherently isochronous. For instance, in Vista, the audio engine runs with a periodicity of 10 milliseconds. That means that every 10 milliseconds, it MUST wake up and process the next set of audio samples, or the user will hear a "pop" or “stutter” in their audio playback. It doesn’t matter how fast your processor is, or how many CPU cores it has, the engine MUST wake up every 10 milliseconds, or you get a “glitch”.

For almost everything else in the system, if the system locked up for even as long as 50 milliseconds, you’d never notice it. But for multimedia content (especially for audio content), you absolutely will notice the problem. The core reason behind it has to do with the physics of sound, but whenever there’s a discontinuity in the audio stream, a high frequency transient is generated. The human ear is quite sensitive to these high frequency transients (they sound like "clicks" or "pops"). 

Anything that stops the audio engine from getting to run every 10 milliseconds (like a flurry of high priority network interrupts) will be clearly perceptible. So it doesn’t matter how much horsepower your machine has, it’s about how many interrupts have to be processed.
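Here's a toy model of that in Python. It is a sketch, not how the audio engine is implemented: the 10 millisecond period comes from the paragraph above, but the `missed_deadlines` function, the stall intervals, and the simplification that the engine's own processing takes zero time are all made up for illustration. A wake-up counts as a glitch if higher-priority work holds the CPU from the wake-up time until the next period begins.

```python
# Toy model of an isochronous task that must run once per period.
# PERIOD_MS is from the post; everything else is an illustrative assumption.
PERIOD_MS = 10


def missed_deadlines(stalls, duration_ms, period_ms=PERIOD_MS):
    """Return the wake-up times (ms) at which the engine missed its period.

    stalls: list of (start_ms, length_ms) intervals during which
    higher-priority work (e.g. a burst of interrupts) holds the CPU.
    """
    missed = []
    for wake in range(0, duration_ms, period_ms):
        deadline = wake + period_ms
        start = wake
        # Push the start time past any stall that is in progress.
        for stall_start, length in sorted(stalls):
            if stall_start <= start < stall_start + length:
                start = stall_start + length
        if start >= deadline:
            missed.append(wake)
    return missed


if __name__ == "__main__":
    print(missed_deadlines([(8, 5)], 50))    # short stall: engine runs late but in time
    print(missed_deadlines([(8, 15)], 50))   # 15 ms stall: one period is lost
    print(missed_deadlines([(5, 30)], 50))   # 30 ms stall: two periods are lost
```

Notice that CPU speed doesn't appear anywhere in the model. The only thing that matters is whether the engine gets to run inside each 10 millisecond window.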

We had a meeting the other day with the networking people where we demonstrated the magnitude of the problem - it was pretty dramatic, even on the top-of-the-line laptop.  On a lower-end machine it's even more dramatic.  On some machines, heavy networking can turn video rendering to a slideshow.


Any car buffs will immediately want to shoot me for this analogy, because I’m sure it’s highly inaccurate (I am NOT a car person), but I think it works: You could almost think of this as an engine with a slip in the timing belt – you’re fine when you’re running the engine at low revs, because the slip doesn’t affect things enough to notice. But when you run the engine at high RPM, the slip becomes catastrophic – the engine requires that the timing be totally accurate, but because it isn’t, valves don’t open when they have to and the engine melts down.


Anyway, that's a long winded discussion.  The good news is that the right people are actively engaged on working to ensure that a fix is made available for the problem.

  • I for one don't think this is a huge issue.

    Any server would not have any multimedia playing.

    And anybody playing multimedia will not need massive networking; everybody has iPods, and even the crippled network should be enough to watch streaming media.

    I am wondering if these realtime enhancements would apply to other realtime threads or just the audio, if it's all then we could use windows for more "real-time" tasks!

  • Thanks for posting this information. I'm sure a tested fix will take some time but it's good to just have an explanation so that people can't spread ridiculous, baseless FUD about how the problem is caused by DRM and so on.

    (I'm no fan of DRM but I dislike the wrong thing being blamed for a problem.)

  • But on a dual core system, why can't the interrupts be set to run on separate CPUs (audio on one, network on the other)? In our voice-call-routing systems, that's what we went out of our way to do (make sure our T1 voice lines had their interrupts serviced separately).

  • Nik: It shows up while copying files around, and on a gig network it matters.

    Michael: Because that's not the way the hardware works.  On multicore machines, the hardware interrupts are only serviced on processor 0.  Even on true MP machines, many of them only service interrupts on processor 0.  There's nothing an operating system can do to fix it.

    However there ARE some things that you can do to mitigate the issue, and the right people are working on creating the fix.

  • There are some things that haven't been explained: why doesn't the issue repro on XP? According to reports you can get high speeds without glitches in audio there. I haven't verified this yet though, as I'm lacking another gigabit-capable computer right now.

    There are also several reports that using the power configuration that changes CPU speed/multiplier based on load can cause glitches, and I can verify this one.

    What's really interesting is that the glitches *only* affect the DirectX audio on vista. If I use ASIO4ALL(.com) that bypasses the Vista audio stack the glitches stop even during these power transitions.

  • Joku: It doesn't reproduce on XP because MMCSS doesn't exist on XP.  And those reports are totally wrong.  You can't get the kind of throughput we're talking about here on XP without turning multimedia playback to a slideshow (especially if you enable ipsec).  Our perf team has the demos to prove it.  To blow away multimedia playback on XP, all you need to do is to have an app that loads the CPU running at priority 15.

    And ASIO shouldn't matter - ASIO apps run at the same priority as everything else, and network interrupts will preempt them either way.

    Btw, you should use a tool like TTCP to generate network traffic.  Otherwise there are other processes that get in the way and change the results (like file I/O time, etc).  TTCP has the advantage of being a raw networking I/O test.

  • @ Nik:

    Although one may think this to be true, in the broadcasting industry 'audio servers' keep an online cache of around 100GB of 24-bit WAV audio that normally resides in a tape or MAID archive. I won't do the maths, but a typical broadcast audio server has 16 outputs, each under separate control for a different TV channel: voice-overs, audio description services, multiple languages for the same program, etc. Getting 16 channels of all this content in 24-bit WAV files from an archive is pretty intensive, and although we still use WinServer2003, this particular issue could have big consequences for us if we moved to Vista, or even WinServer 2008!

    Anyway, why doesn't this hard-coded value ramp up with CPU speed? Could this value be an output of the Windows Experience Index?


  • Steve, it's hard coded because someone screwed up.  The spec said that it was supposed to be controlled by a registry key (and disabled under certain circumstances).

    Unfortunately that didn't happen, someone screwed up.

  • "And anybody playing multimedia will not need massive networking..."

    I have to wonder if that is true.

    What if one has a Vista MCE computer with HDTV tuners, able to watch and record HDTV, and then what if there are other computers on a gigabit network simultaneously watching (aka streaming) recorded HDTV off that same Vista computer.

    Seems there would be a very common need to support both a heavy network requirement and multimedia playback.

  • WhatAboutHD: By our measurements, you can run at least 2 HD 1080P video streams over the network without encountering this issue.

    This really is limited to file copies or other hideously network intensive operations.

  • Larry: I got the impression from Russinovich's blog post that the problem tends to be much worse in a computer with multiple network adapters (i.e. 10,000 packets/sec becomes 6,000 packets/sec with three adapters).

    I may be an extreme example but I've got 7 adapters on my machine: 2 VPN, a 1394, a LAN, a WLAN and two VMware. Does this affect the speeds, or do the adapters need to be in use? I guess that 2 (or three if "1394" counts) is standard nowadays?

    I'm running Windows XP so I can unfortunately not try it out.

  • August, it is.  The VPN and 1394 don't count, but the vmware ones do.

    That's a part of the things we need to fix.

  • From Mark's post, there are two issues.  One is prioritizing the CPU usage of the multimedia threads, and the other is capping the number of network packets received per second.

    Will the solution offer the user a way to say "I don't mind the occasional glitch, please prioritize network performance over the playback of some lame podcast I'm barely paying attention to?"  It seems Vista makes the assumption that users will always prefer glitch-free playback over everything else.

    Even intense network traffic is bursty.  Isn't it possible to use more audio buffers to stay a little farther ahead of the playback, which could absorb some missed interrupts and only glitch if heavy network traffic is sustained?

    The CD player in my car glitches about once per hour.  I hardly notice it anymore.  It's still better than listening to lossy MP3s.

  • Prioritizing the multimedia threads over the rest of the OS isn't an issue actually.  The MMCSS service prevents any MMCSS managed thread from consuming more than 80% of the CPU (it's actually way more complicated than that, but you can use that as a rule-of-thumb).

    The core issue is that on gigabit networks, even though the network stack queues the incoming packets, if the rate of incoming packets gets high enough, then the network stack will hold off the multimedia stack for tens of milliseconds at a time.  If the network stack had ever had a chance to run, things would be seamless, but...

    The audio stack had to make a trade-off between low latency (smaller buffers) and fewer glitches (bigger buffers).  We're already getting flak because the latency of the audio engine in shared mode is greater than the latency of the XP audio stack, so increasing the buffer size is not an option.

    The relevant teams have all the data that they need to come up with a good solution for the problem and they're actively working on it.

    Please Note - this comment was edited to reflect the actual consumption allowed by MMCSS
