AKA: How I spent last week :).
On Tuesday morning last week, I got an email from "email@example.com":
You've probably already seen this article, but just in case I'd love to hear your response. http://it.slashdot.org/article.pl?sid=07/08/21/1441240 Playing Music Slows Vista Network Performance?
In fact, I'd not seen this until it was pointed out to me. It seemed surprising, so I went to talk to our perf people, and I ran some experiments on my own.
They didn't know what was up, and I was unable to reproduce the failure on any of my systems, so I figured it was a false alarm (we get them regularly). It turns out that at the same time, the networking team had heard about the same problem and they WERE able to reproduce the problem. I also kept on digging and by lunchtime, I'd also generated a clean reproduction of the problem in my office.
At the same time, Adrian Kingsley-Hughes over at ZDNet Blogs picked up the issue and started writing about it.
By Friday, we'd pretty much figured out what was going on and why different groups were seeing different results. It turns out that the issue is highly dependent on your network topology and the amount of data you're pumping through your network adapter. The reason I hadn't been able to reproduce it at first is that I only have a 100mbit Ethernet adapter in my office: you can get the problem to reproduce on 100mbit networks, but you've really got to work at it to make it visible. Some of the people working on the problem sent a private email to Adrian Kingsley-Hughes on Friday evening reporting the results of our investigation, and Mark Russinovich (a Technical Fellow, and all around insanely smart guy) wrote up a post explaining what's going on in insane detail, which he posted this morning.
Essentially, the root of the problem is that in Vista, when you're playing multimedia content, the system throttles incoming network packets to prevent them from overwhelming the multimedia rendering path: the system will only process 10,000 network frames per second (this is a hideously simplistic explanation, see Mark's post for the details).
For 100mbit networks, this isn't a problem - it's pretty hard to get a 100mbit network to generate 10,000 frames in a second (you need to have a hefty CPU and send LOTS of tiny packets), but on a gigabit network, it's really easy to hit the limit.
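To put rough numbers on the frame-rate argument, here is a back-of-the-envelope sketch. It assumes standard Ethernet framing (38 bytes of per-frame overhead on the wire) and ignores everything above the link layer; the function names are made up for illustration and are not part of any Windows API.

```python
# Bytes each Ethernet frame occupies on the wire beyond its payload
# (assumed standard framing: 14 header + 4 FCS + 8 preamble + 12 inter-frame gap).
WIRE_OVERHEAD_BYTES = 14 + 4 + 8 + 12

THROTTLE_FPS = 10_000  # frames per second, the limit described in the post


def max_frames_per_second(link_bits_per_sec, payload_bytes):
    """Upper bound on frames per second for a link at a given payload size."""
    bits_per_frame = (payload_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bits_per_sec / bits_per_frame


def throttled_payload_mbps(payload_bytes, fps=THROTTLE_FPS):
    """Payload throughput in Mbit/s if only `fps` frames per second get processed."""
    return fps * payload_bytes * 8 / 1e6


if __name__ == "__main__":
    for name, rate in (("100 Mbit", 100e6), ("1 Gbit", 1e9)):
        for payload in (46, 1500):  # smallest and largest standard payloads
            fps = max_frames_per_second(rate, payload)
            verdict = "over" if fps > THROTTLE_FPS else "under"
            print(f"{name}, {payload}-byte payload: {fps:,.0f} frames/s ({verdict} the limit)")
    print(f"Throttled, 1500-byte payloads: {throttled_payload_mbps(1500):.0f} Mbit/s")
```

Under these assumptions, a 100mbit link carrying full-size frames tops out at roughly 8,100 frames per second, which is under the limit, while a gigabit link can deliver about ten times that. That is why only tiny packets push a 100mbit network past 10,000 frames per second.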
One of the comments on Adrian's blog came from George Ou (another ZDNet blogger):
""The connection between media playback and networking is not immediately obvious. But as you know, the drivers involved in both activities run at extremely high priority. As a result, the network driver can cause media playback to degrade."
I can't believe we have to put up with this in the era of dual core and quad core computers. Slap the network driver on one CPU core and put the audio playback on another core and problem solved. But even single core CPUs are so fast that this shouldn't ever be a problem even if audio playback gets priority over network-related CPU usage. It's not like network-related CPU consumption uses more than 50% CPU on a modern dual-core processor even when throughput hits 500 mbps. There’s just no excuse for this."
At some level, George is right - machines these days are really fast and they can do a lot. But George is missing one of the critical differences between multimedia processing and other processing.
Multimedia playback is fundamentally different from most of the day-to-day operations that occur on your computer. The core of the problem is that multimedia playback is inherently isochronous. For instance, in Vista, the audio engine runs with a periodicity of 10 milliseconds. That means that every 10 milliseconds, it MUST wake up and process the next set of audio samples, or the user will hear a "pop" or “stutter” in their audio playback. It doesn’t matter how fast your processor is, or how many CPU cores it has, the engine MUST wake up every 10 milliseconds, or you get a “glitch”.
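A tiny model makes the "it doesn't matter how fast your processor is" point concrete. This is only a sketch: the 10 millisecond period comes from the text above, but the deadline rule (samples must be ready before the period ends) and the function names are simplifying assumptions, not a description of how the Vista audio engine is actually implemented.

```python
PERIOD_MS = 10  # audio engine period stated in the post


def frames_per_period(sample_rate_hz, period_ms=PERIOD_MS):
    """Number of audio frames the engine must produce in each period."""
    return sample_rate_hz * period_ms // 1000


def misses_deadline(wake_delay_ms, processing_ms, period_ms=PERIOD_MS):
    """True if this period's samples are not ready before the period ends.

    wake_delay_ms: how long the engine was kept from running (for example
                   by higher priority interrupt work); assumed input.
    processing_ms: how long the engine needs once it does run; a faster
                   CPU only shrinks this term.
    """
    return wake_delay_ms + processing_ms > period_ms


if __name__ == "__main__":
    print(frames_per_period(44_100), "frames per period at 44.1 kHz")
    print(frames_per_period(48_000), "frames per period at 48 kHz")
    # Slow CPU, woken on time: makes the deadline.
    print("slow CPU, no delay:", misses_deadline(0.0, 2.0))
    # Very fast CPU, but held off for 12 ms: glitch regardless of speed.
    print("fast CPU, 12 ms delay:", misses_deadline(12.0, 0.1))
```

In this model a faster CPU only reduces the processing term. Once the wake-up delay alone exceeds the period, no amount of horsepower recovers the deadline.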
For almost everything else in the system, if the system locked up for even as long as 50 milliseconds, you’d never notice it. But for multimedia content (especially for audio content), you absolutely will notice the problem. The core reason behind it has to do with the physics of sound, but whenever there’s a discontinuity in the audio stream, a high frequency transient is generated. The human ear is quite sensitive to these high frequency transients (they sound like "clicks" or "pops").
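The discontinuity is easy to see in the time domain. The sketch below generates a smooth 440 Hz tone, replaces one 10 millisecond block with silence (a stand-in for a missed engine period), and compares the largest sample-to-sample jump before and after. The tone, the sample rate, and the "dropped period becomes silence" model are all illustrative assumptions; a large jump between adjacent samples is what shows up as high frequency energy.

```python
import math

SAMPLE_RATE = 48_000   # assumed sample rate
TONE_HZ = 440.0        # assumed test tone
PERIOD_SAMPLES = 480   # 10 ms at 48 kHz


def sine(n_samples):
    """A smooth full-scale sine tone."""
    return [math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE) for i in range(n_samples)]


def drop_period(samples, start, length=PERIOD_SAMPLES):
    """Simulate a missed period by replacing one block with silence."""
    out = list(samples)
    out[start:start + length] = [0.0] * length
    return out


def max_step(samples):
    """Largest jump between two adjacent samples."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


if __name__ == "__main__":
    clean = sine(2400)                  # 50 ms of audio
    glitched = drop_period(clean, 1000)
    print(f"largest step, clean tone:    {max_step(clean):.3f}")
    print(f"largest step, dropped block: {max_step(glitched):.3f}")
```

The smooth tone never moves more than a few percent of full scale between adjacent samples; the dropped block introduces a jump more than ten times larger at its edges, and that abrupt edge is the click.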
Anything that stops the audio engine from getting to run every 10 milliseconds (like a flurry of high priority network interrupts) will be clearly perceptible. So it doesn’t matter how much horsepower your machine has, it’s about how many interrupts have to be processed.
We had a meeting the other day with the networking people where we demonstrated the magnitude of the problem - it was pretty dramatic, even on a top-of-the-line laptop. On a lower-end machine it's even more dramatic. On some machines, heavy networking can turn video rendering into a slideshow.
Any car buffs will immediately want to shoot me for this analogy, because I’m sure it’s highly inaccurate (I am NOT a car person), but I think it works: You could almost think of this as an engine with a slip in the timing belt – you’re fine when you’re running the engine at low revs, because the slip doesn’t affect things enough to notice. But when you run the engine at high RPM, the slip becomes catastrophic – the engine requires that the timing be totally accurate, but because it isn’t, valves don’t open when they have to and the engine melts down.
Anyway, that's a long-winded discussion. The good news is that the right people are actively working to ensure that a fix is made available for the problem.
George: 3rd party should be incapable of introducing glitches in Vista as long as MMCSS is running. With MMCSS present, there are basically only two things that can generate glitches. The first is long DPCs (which is the problem with the network), the second is if the disk is so utterly hammered that the I/Os to retrieve the multimedia data can't be read from the disk.
I suspect that the glitches you saw may very well be a result of the same networking issue (but of course I don't know for sure since I've not collected perf traces on your machine).
With the 10K throttling you shouldn't see any glitches. Jumbo frames won't make a difference.
Tim: Welcome to the world of Microsoft :).
Both Raymond Chen and Ed Bott periodically go into a rage about people ponying up advice on how to speed up Windows by tweaking registry keys that don't even exist.
It's fun :).
Let me say kudos for admitting the mistake, but...
Larry said : "I hate having to say this, but.... Trust us."
No can do Larry, not anymore. People trusted Microsoft too often in the past and each time they got screwed. Time for trust is over, people want proof.
Larry said : "The people working on this have literally decades of experience in designing extremely high performance systems"
I doubt it. If they have so much experience then they would:
1. Be able to make 2ms audio latency possible
2. Not make this stupid mistake
or at least:
3. Catch this problem in the alpha version of Vista
This is a shame, it is a clear sign of poor design, and it undermines customer trust even further.
About CPUs and interrupts, other posters are right and I can confirm it (and you can read it in proper documentation) -- any CPU in a multi-core system can service interrupts. You just need to reconfigure an I/O APIC before booting application CPUs. If you still have any doubts, get any live Linux distro on a bootable CD and try it out.
I believe that the cap on the network bandwidth (at least in the current form) is not necessary at all.
Igor, I'm sorry you feel that way. It's possible that in the future, Mark will write a follow-on article which will lay out the myriad trade-offs involved in the discussions (which ranged from the capabilities of various network cards to processor design, scheduler design, test matrices, and a boatload of other factors). But I doubt it.
I mentioned the root cause of this issue above: The networking people weren't testing multimedia and the multimedia people didn't have the hardware necessary to test the network. Stuff happens, we realized the issue and we'll learn from it. One of the outcomes of the discussions about this problem is a better understanding of the issues that BOTH teams face to help ensure that mistakes like this don't happen again.
And that is the last I will say on this subject. You can talk amongst yourselves if you like, but I'm done with this particular thread.
Sorry about that.
I'm a car guy and a computer guy and you are correct about your analogy; it isn't analogous... if a timing belt were to begin slipping, the engine, at the least, would quit running and would require physical attention before it would even run at idle speed again. At the worst, mechanical damage would occur, again requiring physical attention before running again.
A better analogy, although still not perfect, is to say that as the engine has to run faster the ignition can't keep up with the higher speed requirements and occasionally it misfires, causing a noticeable "stutter". The faster the engine runs, the worse the misfires. Reduce the engine speed and the misfires go away.
I just ran a test using XP SP2 to see how this all turns out on that system:
Box 1: Xeon, 3 GHz with Gigabit Ethernet, single chip 2 hyperthreaded processors
Box 2: Core Duo, 2 GHz with Gigabit Ethernet, single chip 2 cores
I used some custom software to blast packets between the computers on a TCP connection. This software only sends/receives the packets without looking at the data in them. Windows firewall was off. On box 2 I played a low resolution video (320 x 240, Audio: Windows Media Audio 9.1 32 kbps, 22 kHz, stereo (A/V) 1-pass CBR, Video: Windows Media Video 9) in Windows Media Player 9.
In all cases the video played and there were no sound clicks or pauses. In some cases, the video had small pauses or jerks. The Ethernet speed did not decrease in any test but continued at full speed.
Sending from Box 1 to Box 2: got to 50% on Gigabit with about 50% CPU on box 1 and 45% on box 2. Mark's Process Explorer showed about 20,000 context switches per second on interrupts and DPCs.
Sending from Box 2 to Box 1: got to 30% on Gigabit with about 25% CPU on box 1 and 60% on box 2.
Also sent a 1 GB folder of image files back and forth between the boxes. Network speed never got above about 15% in these tests.
Interesting Note: The newer dual core machine could not send packets as fast as the older Xeon.
Conclusions: CPU usage is high when sending network packets at high speed on XP. The media player has no problems with playing sound clearly with a high level of interrupts and DPCs.
"George: 3rd party should be incapable of introducing glitches in Vista as long as MMCSS is running. With MMCSS present, there are basically only two things that can generate glitches. The first is long DPCs (which is the problem with the network), the second is if the disk is so utterly hammered that the I/Os to retrieve the multimedia data can't be read from the disk.
I suspect that the glitches you saw may very well be a result of the same networking issue (but of course I don't know for sure since I've not collected perf traces on your machine)."
Oh, we may have a miscommunication here. I was getting glitches in DVD playback while the network test was HALTED and nothing else was going on. I was not having glitches while the 300 mbps test was in progress.
On my Vista Business machine I ran "net start" and did not see the MMCSS service.
Also, mmcss.dll is not present in %windir%\system32.
It seems that one day I removed it because I thought it was a tough trojan horse that changes the dependencies of the Windows Audio service to prevent removal.
Igor: For someone who obviously doesn't hold Microsoft in very high regard, you sure have no problem with holding them to a standard of absolute perfection.
Bugs happen. If you are a programmer, you should know this. Operating systems are very complex and Windows is no exception. In terms of bugs, other OSes are no better, really, when compared to XP SP2 or Vista.
Wilhelm Svenselius: Bugs happen, sure, but network throttling is not a BUG, it's a design decision. It was designed to do that; it's not there because of a software bug, it was an intentionally introduced "feature".
So I ask, what kind of software designer comes up with such a poor solution to the "Audio versus Network" problem?
Igor and OSguy, I am a bit surprised by your comments, because the motherboard I'm using (Tyan K8WE) will physically disable certain devices (like the second NIC) if the second CPU socket isn't populated. Programming the APIC won't help, because physically the connection is gone. (OTOH, in my config, some IRQs will then be serviced by the second CPU, but NIC1 and my soundcard are still both on CPU0)
It seems to me that it would be up to the CPU manufacturer to allow the second core to service interrupts, but it is none too obvious that this is the case, and it does not solve all cases (e.g. the typical multi-socket single-core configs).
I'd love to see Larry address this in a blog posting, but the "Linux does!" type of argumentation strikes me as a bit pointless. (Linux can do many strange and not so wonderful things as well)
Larry said: "And that is the last I will say on this subject."
Ok, but you still haven't explained why we can't have 2ms audio latency when those people are such experts.
Wilhelm said: "For someone who obviously doesn't hold Microsoft in very high regard..."
Let me get this straight, I am _not_ a Microsoft hater (or /. or Linux troll). True, I am a developer but that doesn't have anything to do with the subject.
Let me remind you -- the subject is a bad design decision to throttle down network traffic in case it _might_ interfere with audio. You can replace "audio" in the previous sentence with any other task and it would still be a wrong design decision.
Bikedude said: "I am a bit surprised by your comments..."
Why are you discussing something you don't understand?
I suggest you start by reading "Intel 64 and IA-32 Architectures Software Developer's Manual Volume 3A - System Programming Guide" (document order #253668) chapter 8 titled "Advanced Programmable Interrupt Controller (APIC)" where it clearly says:
"In multiple processor (MP) systems, it sends and receives interprocessor interrupt (IPI) messages to and from other logical processors on the system bus. IPI messages can be used to distribute interrupts among the processors in the system or to execute system wide functions (such as, booting up processors or distributing work among a group of processors)."
While you are at it, take a good look at figures 8-2 and 8-3. Any CPU can handle any IRQ and it is not some Linux Voodoo -- it is clearly documented.
Why are we discussing this anyway?
The Amiga 500, which was released in 1987, was capable of playing four channel audio, playing animation and copying floppies at the same time, and all that on a 7.09 MHz CPU.
Compare that with dual-core Windows PC running at 3,000 MHz today which still blocks while reading a floppy or a CD, and whose audio stutters when you connect to the internet using an internal soft-modem -- to me it is blindingly obvious where the problem is coming from.
Bikedude - we are talking about a problem (sound causing network throughput slowdown) which is reproducible on dual core machines where two processor cores are present and interrupts can be routed to any one of them or both.
Igor nailed it: throttling the network on the assumption that audio will skip is a really bad hack to cover up the problems in the networking stack - it can't consume packets without taking over the full CPU.
The reason Linux was highlighted was not because of fanboyism but to illustrate the fact that it is possible to do Gigabit speed Ethernet traffic and audio on modern CPUs without having to throttle one or the other.
A nice scheduler ought to handle this situation just fine.
OSGuy said : "A nice scheduler ought to handle this situation just fine."
Unfortunately I don't think it would be that easy.
In my Gigabit NIC (Intel Pro/1000 PL) driver properties there is an entry named "Interrupt Moderation Rate" which can be set to one of:
I presume that things would get even worse if you set this to Off and thus disable IRQ moderation completely, leading to an enormous number of IRQs when the NIC is heavily loaded.
My point is that as long as an OS cannot handle such a high IRQ rate without dropping packets, the issue at hand won't be truly fixed; it will just be worked around until we all get 10Gbps adapters, and then it will surface again.
What is interesting is that the following Microsoft document claims that Vista can distribute IRQs to different cores and that it supports Message Signaled Interrupts:
Someone should either check his facts or urge for the document to be updated if what it says is incorrect. It is dated 2004, perhaps that was scrapped?
Igor is right, Vista distributes interrupts to both cores!