Earlier today, someone asked me why 64-bit versions of Windows don’t support the internal PC speaker beeps. The answer is somewhat complicated and ends up being an interesting intersection between a host of conflicting tensions in the PC ecosystem.
Let’s start by talking about how the Beep hardware worked way back in the day[1]. The original IBM PC contained an Intel 8254 programmable interval timer chip to manage the system clock. Because the IBM engineers felt that the PC needed to be able to play sound (but not particularly high quality sound), they decided that they could use the 8254 as a very primitive square wave generator. To do this, they programmed the 3rd timer on the chip to operate in Square Wave mode and to count down with the desired output frequency. This caused the Out2 line on the chip to toggle from high to low every time the clock went to 0. The hardware designers tied the Out2 line on the chip to the PC speaker and voila – they were able to use the clock chip to program the PC speaker to make a noise (not a very high quality noise but a noise nonetheless).
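To make the square wave trick concrete, here is a minimal sketch of the arithmetic involved. It assumes the timer's conventional 1.193182 MHz input clock and the usual control word for counter 2 in square wave mode (0xB6); both come from general PC hardware documentation rather than from anything above, and the function name is made up for illustration.

```python
PIT_INPUT_HZ = 1_193_182  # assumed nominal input clock of the PC interval timer

def speaker_program(frequency_hz):
    """Return the values a driver would load into counter 2 for a given tone."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    divisor = round(PIT_INPUT_HZ / frequency_hz)
    # The counter is 16 bits wide, and square wave mode needs a count of at least 2.
    divisor = max(2, min(divisor, 0xFFFF))
    return {
        "control_word": 0xB6,           # counter 2, low byte then high byte, mode 3
        "divisor_low": divisor & 0xFF,  # written first
        "divisor_high": divisor >> 8,   # written second
        "actual_hz": PIT_INPUT_HZ / divisor,
    }

tone = speaker_program(440)
print(tone)
```

Because the divisor has to be a whole number, the tone you get is only approximately the tone you asked for, which is one more reason the beep was never high quality sound.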
The Beep() Win32 API is basically a thin wrapper around the 8254 PIT functionality. So when you call the Beep() API, you program the 8254 to play sounds on the PC speaker.
Fast forward about 25 years… The PC industry has largely changed and the PC architecture has changed with it. At this point they don’t actually use the 8254 as the programmable interval timer, but it’s still in modern PCs. And that’s because the 8254 is still used to drive the PC speaker.
One of the other things that happened in the intervening 25 years was that machines got a whole lot more capable. Now machines come with capabilities like newfangled hard disk drives (some of which can even hold more than 30 megabytes of storage (but I don’t know why on earth anyone would ever want a hard disk that can hold that much stuff)). And every non server machine sold today has a PC sound card. So every single machine sold today has two ways of generating sounds – the PC sound card and the old 8254 which is tied to the internal PC speaker (or to a dedicated input on the sound card – more on this later).
There’s something else that happened in the past 25 years. PCs became commodity systems. And that started exerting a huge amount of pressure on PC manufacturers to cut costs. They looked at the 8254 and asked “why can’t we remove this?”
It turns out that they couldn’t. And the answer to why they couldn’t came from a totally unexpected place: the Americans with Disabilities Act.
The ADA? What on earth could the ADA have to do with a PC making a beep? Well it turns out that at some point in the intervening 25 years, the Win32 Beep() was used for assistive technologies – in particular the sounds made when you enable the assistive technologies like StickyKeys were generated using the Beep() API. There are about 6 different assistive technology (AT) sounds built into Windows, and their implementation is plumbed fairly deep inside the win32k.sys driver.
But why does that matter? Well it turns out that many enterprises (both governments and corporations) have requirements that prevent them from purchasing equipment that lacks accessible technologies and that meant that you couldn’t sell computers that didn’t have beep hardware to those enterprises.
This issue was first noticed when Microsoft was developing the first 64-bit version of Windows. Because the original 64-bit Windows was intended for servers, the hardware requirements for 64-bit machines didn’t include support for an 8254 (apparently the AT requirements are relaxed on servers). But when we started building a client 64-bit OS, we had a problem – client OSes had to support AT so we needed to bring the beep back even on machines that didn’t have beep hardware.
For Windows XP this was solved with some custom code in winlogon which worked but had some unexpected complications (none of which are relevant to this discussion). For Windows Vista, I redesigned the mechanism to move the accessibility beep logic to a new “user mode system sounds agent”.
Because the only machines with this problem were 64bit machines, this functionality was restricted to 64bit versions of Windows.
That in turn meant that PC manufacturers still had to include support for the 8254 hardware – after all if the user chose to buy the machine with a 32bit operating system on it they might want to use the AT functionality.
For Windows 7, we resolved the issue completely – we moved all the functionality that used to be contained in Beep.Sys into the user mode system sounds agent – now when you call the Beep() API instead of manipulating the 8254 chip the call is re-routed into a user mode agent which actually plays the sounds.
There was another benefit associated with this plan: Remember above when I mentioned that the 8254 output line was tied to a dedicated input on the sound card? Because of this input to the sound card, the sound hardware needed to stay powered on at full power all the time because the system couldn’t know when an application might call Beep and thus activate the 8254 (there’s no connection between the 8254 and the power management infrastructure so the system can’t power on the sound hardware when someone programs the 3rd timer on the 8254). By redirecting the Beep calls through the system audio hardware the system was able to put the sound hardware to sleep until it was needed.
This redirection also had a couple of unexpected benefits. For instance when you accidentally type (or grep) through a file containing 0x07 characters in it (like a .obj file) you can finally turn off the annoying noise – since the beeps are played through the PC speakers, the PC mute key works to shut them up. It also means that you can now control the volume of the beeps.
There were also some unexpected consequences. The biggest was that people started noticing when applications called Beep(). They had placed their PCs far enough away (or there was enough ambient noise) that they had never noticed when their PC was beeping at them until the sounds started coming out their speakers.
[1] Thus providing me with a justification to keep my old Intel component data catalogs from back in the 1980s.
Charles just let me know that he’s posted a video that Elliot, Frank and I did talking about the audio features added to Win7 and some of the architectural decisions that went into it.
Enjoy!
As I mentioned yesterday, the Windows SDK is now live. For the Windows SDK, there are 9 new samples (and one changed sample).
Two of the SDK samples demonstrate the Windows 7 “Ducking” feature – they’re actually based on the code I wrote for my PDC talk last year, but tweaked to show some new scenarios and clean up the code (the original PDC code was not ready for prime time, it was strictly demo-ware).
The other 7 samples reproduce functionality in the old WinAudio SDK sample – instead of providing a single monolithic audio sample, we crafted different samples each of which shows one aspect of audio rendering. All of these samples are simple console applications which read their parameters from the command line. They’re intentionally extremely simple to reduce the potential for confusion in the samples.
The reason we don’t have capture exclusive samples is that we felt that users of the SDK could derive the exclusive capture samples from the render samples if it was important to them.
All the shared mode samples also show how to implement stream switching.
One of the things I’m quite proud about the samples is their structure. Each sample has the following basic file layout:
Directory of C:\Program Files\Microsoft SDKs\Windows\v7.0\Samples\multimedia\audio\RenderExclusiveEventDriven

08/07/2009  10:08 AM    <DIR>          .
08/07/2009  10:08 AM    <DIR>          ..
07/14/2009  06:54 PM             7,079 CmdLine.cpp
07/14/2009  06:54 PM               760 CmdLine.h
07/14/2009  06:54 PM             2,084 ReadMe.txt
07/14/2009  06:54 PM               533 stdafx.cpp
07/14/2009  06:54 PM               935 stdafx.h
07/14/2009  06:54 PM             1,067 targetver.h
07/14/2009  06:54 PM             1,925 ToneGen.h
07/14/2009  06:54 PM            18,376 WASAPIRenderer.cpp
07/14/2009  06:54 PM             2,560 WASAPIRenderer.h
07/14/2009  06:54 PM            14,754 WASAPIRenderExclusiveEventDriven.cpp
07/14/2009  06:54 PM             1,283 WASAPIRenderExclusiveEventDriven.sln
07/14/2009  06:54 PM             8,403 WASAPIRenderExclusiveEventDriven.vcproj
              12 File(s)         59,759 bytes
               2 Dir(s)  62,822,105,088 bytes free
Each sample has the same set of common files:
Each of these samples is essentially identical, in fact they’re sufficiently similar that you can use your favorite file comparison tool to see what code has to change to go from one mode to another. So to see what changes are required to go from exclusive timer driven to exclusive event driven, you can windiff the RenderExclusiveEventDriven\WASAPIRenderer.cpp and RenderExclusiveTimerDriven\WASAPIRenderer.cpp files and see what changes are required to implement the different model.
I just received email that the new Windows 7 SDK is now live! Apparently it’s not on the Windows SDK download site yet, but if you click on the first link, you can download an ISO which contains the SDK.
For the Win7 SDK, I wrote about 8 new samples in the Multimedia\Audio category. They’re described here (check the Multimedia\Audio samples). These samples replace the old WinAudio SDK sample – basically I took the old WinAudio sample and blew it up into 8 different samples which demonstrate different aspects of rendering audio using WASAPI.
I’ll write some more about the samples tomorrow – there are some cool things in the way the samples are structured that I hope will help developers who want to understand the difference between the various WASAPI render and capture modes.
Way back when I was in college, I learned Lisp using a derivative of Lisp called MACLISP (named for MIT’s project MAC, not for anything that came from a fruity company in Cupertino). One of the coolest features that MACLISP offered was the (DWIM) command – basically if you had a typo when entering an s-list into MACLISP, you could type (DWIM) and the MACLISP interpreter would fix your code for you (and yeah, it usually got it somewhat wrong :)).
Stream Switching is a DWIM feature in the audio stack. If an application is rendering to a default audio device and the device is removed, the audio stream will automatically switch to the new device. The same happens if the default device changes for other reasons (if the user changes the default device for example) or if the sample rate on the device changes (this can happen with certain types of audio hardware that allow external controls to change the sample rate for the device).
We were able to figure out how to implement the stream switching logic in a fashion that causes it to work without requiring changes from 3rd party applications, which is really cool because it allows us to enable new scenarios without breaking appcompat – as long as the application is rendering to a default endpoint, we’ll stream switch the application without the app being notified.
If an application is rendering to a specific endpoint, we’re not going to stream switch when the endpoint is removed – we don’t know the reasons for the application choosing that particular endpoint so we don’t attempt to second guess the application’s intent (maybe the user has asked that their music player only play through the headphones and not through the speakers because playing through the speakers would disturb the baby).
We also don’t support stream switching if the application is using WASAPI (the low level audio rendering APIs) to render audio. That’s for a number of reasons, but mostly it’s because the presumption is that any application that is using WASAPI is using a low level rendering API and thus doesn’t want this kind of behavior.
The stream switching logic is really cool in action, especially if you’ve got a machine which supports dynamic jack detection – when you’re watching a DVD in Windows Media Player and you plug in a pair of headphones, poof – the audio gets redirected to the headphones just like you’d expect it to.
The capture monitor is a feature that allows you to listen to a portable media player (or any other microphone input) through your PC speakers.
First a bit of history. Way back in the dark ages (Windows XP timeframe), audio solution manufacturers used to include analog circuitry that connected the line in to the speaker jack on the PC. They routed this through an analog volume control and exposed this through the audio device topology. People used this functionality to connect their portable media players to their PCs. While this feature was popular with customers, the cost of providing the circuitry was too much for some IHVs and they started removing this functionality from their products starting some time before Vista shipped.
Not surprisingly, customers complained about this and we decided to implement equivalent functionality in the audio subsystem in Windows. Because we’re doing this in the audio subsystem instead of in hardware, it allows you to configure the capture to run between any two devices.
To configure the capture monitor, you go to the properties page for the input device and select the “Listen” tab:
Once you select the “Listen to this device” checkbox (and hit apply), the system will start capturing data from the input and rendering it to the output device. I’ve been using it to listen to my media player for months :).
I’ve been really gratified to see that both Ed Bott and Lifehacker (who picked this up from Ed) have noticed this feature, it was a huge amount of fun to write. It was also my first experience using TDD (or rather a variant of TDD – instead of writing the test first then the code, I wrote the code and the test for the code at the exact same time) - based on my experiences, I’m totally sold on it as a development paradigm.
Well, we shipped Windows 7, and now I’d like to talk about a few of my favorite features that were added by the Sound team. Most of them fit in the “make it just work the way it’s supposed to” category, but a few are just cool.
I also want to call out some stuff that people probably are going to miss in the various Windows 7 reviews.
One of the areas I want to call out is the volume UI. There’s actually been a ton of work done on the volume UI in Windows 7, although most of it exists under the covers. For instance, the simple volume control (the one you get to with a single click from the volume notification area) uses what we call “flat buttons”.
Windows 7 Simple Volume UI:
Windows Vista Simple Volume UI:
Both the mute control and the device button are “flat buttons” – when you mouse over the buttons, the button surfaces:
By using the “flat buttons”, the UI continues to have the old functionality, but it visually appears cleaner. There have been a number of other changes to the simple volume UI. First off, we will now show more than one slider if you have more than one audio solution on your machine and you’re using both of them at the same time. This behavior is controlled by the new volume control options dialog:
As I mentioned above, the device icon is also a “flat button” – this enables one click access to the hardware properties for your audio solution.
The volume mixer has also changed slightly. You’ll notice the flat buttons for the device and mute immediately. We also added a flat button for the System Sounds which launches the system sounds applet.
Another subtle change to the volume mixer is that there are now meters for individual applications as well as for the master volume:
And finally, the volume mixer no longer flickers when resizing (yay!). Fixing the flicker was a problem that took a ton of effort (and I needed to ask the User team for help figuring out the problem) – the solution turned out to be simple but it took some serious digging to figure it out.
A number of times in the past, I’ve mentioned that the PlaySound(xxx, xxx, SND_MEMORY|SND_ASYNC) pattern is almost always a bad idea. After the last wave of crash dumps were received for this problem, our team decided to do something about it. Starting with Windows 7, if you call PlaySound with SND_MEMORY|SND_ASYNC, instead of relying on the memory passed in by the application, we allocate our own buffer for the sound file on the heap and copy the file into that buffer. We’ll only do it for WAV files that are smaller than 2M in size, and if the allocation of the buffer fails, we fall back on the original code path, but it should dramatically reduce the number of apps that crash while using this pattern.
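The policy described above can be sketched roughly as follows. This is only a toy model of the decision, not the actual PlaySound code: the function name is hypothetical, "2M" is read as 2 MiB, and overwriting a bytearray stands in for the application freeing its memory while the sound is still playing.

```python
MAX_PRIVATE_COPY = 2 * 1024 * 1024  # "smaller than 2M", assumed here to mean 2 MiB

def buffer_for_async_play(caller_buffer):
    """Pick the buffer an asynchronous player should read from."""
    if len(caller_buffer) < MAX_PRIVATE_COPY:
        try:
            return bytes(caller_buffer)  # private snapshot the caller cannot touch
        except MemoryError:
            pass                         # allocation failed: fall through
    return caller_buffer                 # original path, still at the caller's mercy

wav = bytearray(b"RIFF....WAVEfmt ")
playing = buffer_for_async_play(wav)
wav[:] = b"\x00" * len(wav)   # the caller reuses its memory while the sound "plays"
print(playing[:4])
```

The snapshot still starts with the RIFF header after the caller scribbles over its own buffer, which is the whole point of the copy.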
It’s a little thing, but it should make life much easier for those applications.
Whenever you submit a crash report to OCA, a bug gets filed in the relevant product database and gets automatically assigned to the developer responsible for the code. I had a crashing bug in the PlaySound API assigned to me.
In this case, the call was crashing deep inside of the waveOutOpen API and it was crashing because the input WAVEFORMATEX structure was bogus. The strange thing is that the PlaySound API does some fairly thorough validation of the input WAVEFORMATEX read from the .WAV file and that validation had to have passed to get to the call to waveOutOpen.
I looked a bit deeper and came to the realization that every single one of the crashes (in maybe a dozen different applications) had specified SND_MEMORY | SND_ASYNC in their call to PlaySound.
I’ve talked about that particular combination before in my blog, but I wanted to call it out in a top level post in the hopes that people will stop making this common mistake.
When you call PlaySound with the SND_MEMORY flag, it tells the PlaySound API that instead of reading the audio data from a file, you’re passing in a pointer to memory which holds the wave contents for you. That’s not controversial, and can be quite handy if (for instance) you want to build a .WAV file in memory instead of calling the wave APIs directly.
When you call PlaySound with the SND_ASYNC flag, that flag tells PlaySound that instead of blocking until the sound has finished playing, the API should return immediately while the sound plays in the background.
Neither of these flags is controversial and neither of them is particularly dangerous until you combine the two together.
The problem is that there’s really no way of knowing when the sound has finished playing and thus when the application frees the memory, it’s entirely possible that the PlaySound API is still using it. That means that if you ever call PlaySound with both of these flags, you stand a very high chance of crashing due to the combination of these behaviors.
The unfortunate thing is that this behavior has existed since the SND_MEMORY flag was added back in Windows 3.1. The only safe way of dealing with this that works on all current Windows operating systems is to call PlaySound(NULL, 0, 0) before freeing the memory – the call to PlaySound(NULL, 0, 0) will block until the currently playing sound has completed playing (or abort the playsound if it hasn’t started yet).
I recently got a bug reported to me about the visuals in the sound control panel applet not being aligned properly (this is from the UI for a new Windows 7 feature):
The problem as reported was that the microphone was aligned incorrectly w.r.t. the down arrow – the microphone was too far to the right.
But if you look carefully, you’ll see that that isn’t the case – drawing a box around the controls makes it clearer:
Nitpickers Corner: For those of you that love to count pixels, it’s entirely possible that the arrow might be off by a couple of pixels but fixing it wouldn’t fix the problem, because then the arrow would be off-center with respect to the speakers. The real problem is that the microphone icon is visually weighted to the right – the actual icon resource was lined up with the arrow, but because the visual weight was to the right, it displayed poorly.
It turns out that there’s really no good way of fixing this – if we were to adjust the location of the icons, it wouldn’t help, because a different device would have a different visual center (as the speaker icon does)…
Instead, we looked at the visuals and realized that there was an alternative solution: Adjust the layout for the dialog and the problem more-or-less goes away:
The problem still exists at some level because the arrow is centered with the icons but some icons (like the stalk microphone above) are bottom heavy. But for whatever reason, the visuals aren’t as disconcerting when laid out horizontally.
As I said in the title – sometimes you need to worry about the strangest things.
Sometimes the expectations of our customers mystify me.
One of the senior developers at Microsoft recently complained that the audio quality on his machine (running Windows Server 2008) was poor.
To me, it’s not surprising. Server SKUs are tuned for high performance in server scenarios, they’re not configured for desktop scenarios. That’s the entire POINT of having a server SKU – one of the major differences between server SKUs and client SKUs is that the client SKUs are tuned to balance the OS in favor of foreground responsiveness and the server SKUs are tuned in favor of background responsiveness (after all, it’s a server, there’s usually nobody sitting at the console, so there’s no point in optimizing for the console).
In this particular case, the documentation for the MMCSS service describes a large part of the root cause for the problem: The MMCSS service (which is the service that provides glitch resilient services for Windows multimedia applications) is essentially disabled on server SKUs. It’s just one of probably hundreds of other settings that are tweaked in favor of server responsiveness on server SKUs.
Apparently we’ve got a bunch of support requests coming in from customers who are running server SKUs on their desktop and are upset that audio quality is poor. And this mystifies me. It’s a server operating system – if you want client operating system performance, use a client operating system.
PS: To change the MMCSS tuning options, you should follow the suggestions from the MSDN article I linked to above:
The MMCSS settings are stored in the following registry key:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Multimedia\SystemProfile
This key contains a REG_DWORD value named SystemResponsiveness that determines the percentage of CPU resources that should be guaranteed to low-priority tasks. For example, if this value is 20, then 20% of CPU resources are reserved for low-priority tasks. Note that values that are not evenly divisible by 10 are rounded up to the nearest multiple of 10. A value of 0 is also treated as 10.
For Vista, this value is set to 20, for Server 2008 the value is set to 100 (which disables MMCSS).
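The rounding rule in the quoted documentation is easy to get wrong by one, so here is a small sketch of it. The function name is hypothetical, and it only models the two rules stated above (round up to a multiple of 10, treat 0 as 10); anything else MMCSS may do with the value is outside what the quote says.

```python
def effective_system_responsiveness(raw_value):
    """Apply the documented rounding to a SystemResponsiveness registry value."""
    if raw_value == 0:
        return 10                      # 0 is treated as 10
    return -(-raw_value // 10) * 10    # round up to the next multiple of 10

for raw in (0, 15, 20, 100):
    print(raw, "->", effective_system_responsiveness(raw))
```

With the Vista default of 20, a fifth of the CPU is reserved for low-priority tasks; with the Server 2008 value of 100, everything is reserved for them and multimedia threads get no special treatment.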
Because the alternative is often much worse.
Several months ago, I got a bug report that if you launched mmsys.cpl, set the “Select” sound to a value, and then cleared the sound, the reporter’s application would ding whenever you moved around their tree control.
I dug into the problem a bit and discovered that the problem was actually in the Windows common controls. Under some circumstances the common controls would call PlaySound specifying a sound alias and the SND_ASYNC and SND_ALIAS flags only.
The problem with this is that if you specify SND_ALIAS without also specifying SND_NODEFAULT, the PlaySound API decides that you really want the sound to be played and it will play the default “ding” sound instead.
From the PlaySound API’s point of view this makes sense. After all, you asked the API to play a sound, it doesn’t know that you meant “only play this sound when the sound file represented by the alias exists”.
In fact, that’s the entire reason for the SND_NODEFAULT flag – it lets the PlaySound API know that you only want to play a sound when the sound has been defined.
The PDC folks just announced a host of Windows 7 related PDC talks, one of which is mine.
The title of the talk is “Windows 7: Building Great Communications Applications”. You can find it under the Windows 7 track on the Microsoft PDC site.
The primary target for my talk is developers who are building an application that in any way communicates between users (voice mail, instant messaging, voice over ip, etc). In addition, if you’re a games developer or a media player developer, you should also attend, there’s stuff in the talk for you too.
There are also some other cool talks included in the list that I’m absolutely planning on attending.
See you in LA!
Someone just wandered over to my office and he had noticed the following pattern in his code:
PlaySound(NULL, NULL, SND_NODEFAULT);
PlaySound(".Default", NULL, SND_SYSTEM | SND_ASYNC | SND_NODEFAULT);
He was wondering why on earth the code would do that call to PlaySound(NULL).
As I explained it to him, the reason is because you almost always want to cancel the current sound playing before you queue up the next sound.
The problem is that the current implementation[1] of PlaySound(…, SND_ASYNC) simply queues the request to a worker thread which blocks waiting on the currently playing sound to complete. So if you have a situation where you call PlaySound(…, SND_ASYNC) many times in succession, you’ll find that all the calls to PlaySound pile up behind each other, which means that you might end up playing sounds long after the action associated with the sound has completed.
Of course you might want this behavior – it’s certainly possible to string lots of system sounds together to produce any number of interesting effects.
But most of the time you just want to stop the current sound before you start playing the next.
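The pile-up is easy to see in a toy model. The sketch below is not how PlaySound is implemented; it is a hypothetical queue in which every call arrives at the same instant, each sound lasts two seconds, and a single worker plays them strictly one after another.

```python
from collections import deque

class AsyncSoundQueue:
    """Toy model: one worker plays queued sounds strictly in order."""

    def __init__(self):
        self.pending = deque()  # durations (seconds) of sounds not yet finished

    def play_async(self, duration_s):
        """Queue a sound and return how long it waits before it starts."""
        delay = sum(self.pending)
        self.pending.append(duration_s)
        return delay

    def cancel_all(self):
        """Stands in for PlaySound(NULL, ...): drop everything outstanding."""
        self.pending.clear()

piled_up = AsyncSoundQueue()
delays = [piled_up.play_async(2.0) for _ in range(5)]

cancelling = AsyncSoundQueue()
delays_with_cancel = []
for _ in range(5):
    cancelling.cancel_all()
    delays_with_cancel.append(cancelling.play_async(2.0))

print(delays)
print(delays_with_cancel)
```

Without the cancel, the fifth sound starts eight seconds after the action that triggered it. With the cancel, every sound starts immediately.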
[1] There are obviously no guarantees that the implementation of PlaySound won’t change – I’m just describing what the current code does. Even if the implementation is changed, it won’t change the underlying behavior.
Nils Arthur asked in another post:
While we are talking volume controls. Could you explain why it's only possible to lower the volume in Windows (i.e. setting a volume between 0% and 100%) and not raise it (i.e setting it higher than 100%)?
Before I get into the answer, let me define some terms: Attenuation means reducing the amplitude of a signal from a baseline - so if the signal is a full range sine wave going from 1.0 to -1.0, if you attenuate it to 50%, you get a sine wave from 0.5 to -0.5. I wrote about it (with pictures :)) in this post.
The answer to Nils' question is both simple and complicated.
The simple part of the answer: Because most PC audio hardware only supports attenuation and not amplification.
Now for the complicated parts of the answer: We only support what the hardware allows for master volume. And most hardware only supports attenuation. There are a lot of reasons for that, but the primary one is that it's dramatically cheaper (and uses less power) to attenuate signals than it is to amplify them.
The other issue w.r.t. amplification/attenuation is signal quality. As I mentioned in the post on volume above, you can attenuate a sample in the digital domain without loss of fidelity. However when you attempt to amplify a signal in the digital domain, it clips. That means that amplification MUST be done in the analog domain. Again, this goes to hardware costs - because amplification needs to be done in the analog domain, it means that the audio hardware needs to have an amplifier that can be digitally controlled, which is (again) more expensive. The audio hardware doesn't even have to support hardware volume - if Windows doesn't find a hardware volume control, it simply inserts a master volume into the audio pipeline.
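Here is a small sketch of the difference, assuming floating point samples where full scale is the range from -1.0 to 1.0. The helper name and the 16 sample sine wave are illustrative only.

```python
import math

def apply_gain(samples, gain):
    """Scale samples, clamping the result to the [-1.0, 1.0] full-scale range."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]

# One cycle of a full range sine wave, 16 samples long.
sine = [math.sin(2 * math.pi * n / 16) for n in range(16)]

attenuated = apply_gain(sine, 0.5)  # every sample is halved, the shape survives
amplified = apply_gain(sine, 2.0)   # anything past full scale is flattened

clipped = sum(1 for s in amplified if abs(s) >= 1.0)
print(max(attenuated), max(amplified), clipped)
```

The attenuated wave is the same sine at half the height. The amplified wave has its peaks cut off at full scale, so it is no longer a sine wave at all, and that distortion is what clipping sounds like.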
Some audio hardware DOES support amplification, but the audio volume controls map the volume control from low to high into a range from 0..100 because it's dramatically simpler to represent that to the user.
That means that if an audio solution presents a hardware volume from -96.0dB to +3dB (there are a number of them that do that), we'll map that 99dB range into a 0.0 to 1.0 range that maps nicely into a slider. We've thought about differentiating between attenuation and amplification in the volume UI, but the reality is that the net effect is the same whether we represent amplification or not.
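As a rough illustration of that mapping, the sketch below spreads the hardware range evenly across the slider. The straight-line-in-dB mapping is an assumption made for simplicity (the real slider may well use a different curve), and the function names are hypothetical.

```python
def db_to_slider(level_db, min_db=-96.0, max_db=3.0):
    """Map a hardware level in dB onto a 0.0..1.0 slider position."""
    level_db = max(min_db, min(max_db, level_db))
    return (level_db - min_db) / (max_db - min_db)

def slider_to_db(position, min_db=-96.0, max_db=3.0):
    """Map a 0.0..1.0 slider position back onto the hardware dB range."""
    position = max(0.0, min(1.0, position))
    return min_db + position * (max_db - min_db)

print(db_to_slider(-96.0), db_to_slider(0.0), db_to_slider(3.0))
```

Under this assumption, 0 dB (no attenuation and no amplification) lands just short of the top of the slider, and the last few percent of travel are the hardware's amplification range. The user just sees a slider that goes from quiet to loud.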
You can see if your audio hardware supports amplification by going to the multimedia control panel. Select the audio endpoint you want to check, go to the "Properties" dialog. On that dialog, check the "Levels" tab, the hardware master volume control is present there. You can right click on the text box and change the units from linear to dB, you can then move the slider around to see the dB range. Or you could write some code and call the IAudioEndpointVolume::GetVolumeRange API, which will return the information directly.
The Intertubes are all atwitter with reports that Dell and other OEMs colluded with the RIAA to disable the Wave Out Mix option on new laptops.
Wow, what a tempest in a teapot. I just LOVE watching conspiracy theories as the echo chamber does its magic.
And of course it’s almost certainly hogwash (I don’t know for sure, but I do know that some of the rumors are totally stupid).
First off, what is Wave Out Mix? It’s an option that some audio manufacturers added to their audio hardware (Creative calls it “What U Hear”). Typically the Wave Out Mix is implemented by connecting the analog output from the DAC (Digital-to-Analog Converter) to a specific input on the ADC (Analog-to-Digital Converter) which is labeled as “Wave Out Mix”.
If you record on the Wave Out Mix input, you will capture the samples that are being played via Wave Out.
In Windows Vista, by default we only enable microphone, line in and digital inputs to the audio hardware (the theory being that users typically only want to be able to listen to those inputs). If the audio solution offers other inputs, they’re still there but we bury them somewhat.
You can find those additional inputs in mmsys if you start the sound control panel and go to the “Recording” tab. If you right click and select “Show Disabled Devices” you can enable those alternate inputs.
In addition, these days many OEMs don’t bother adding the Wave Out Mix support. It costs slightly more to order chips with Wave Out Mix support than it does to order chips without the functionality, and OEMs are incredibly cost conscious. The other reason is that for those OEMs that implemented the Wave Out Mix with an analog tap, you can achieve almost the same results with a $2.50 analog cable run between the output and the line in input of the machine.
Part of the reason that I know that this is just a conspiracy theory running rampant is that Windows Vista built the support for the Wave Out Mix input directly into the operating system. If you pass the AUDCLNT_STREAMFLAGS_LOOPBACK flag to the IAudioClient::Initialize method, then the audio system will initialize the engine in loopback mode. You can start capturing data off that IAudioClient object and you’ll get the post-mix output for the endpoint.
The loopback support was designed primarily for use by AEC functionality (which needs to be able to know what samples are being played), but it also allows you to perform essentially the same functionality as the Wave Out Mix hardware used to do.
If you want to play with the loopback functionality, the WinAudio SDK sample application allows you to capture using the loopback functionality.
Someone sent the following screen shot to one of our internal troubleshooting aliases. They wanted to know what the "Name Not Available" slider meant.
The audio system on Vista keeps track of the apps that are playing sounds (it has to, to be able to display the information on what apps are playing sounds :)). It keeps this information around for a period of time after the application has made the sound to enable the scenario where your computer makes a weird sound and you want to find out which application made the noise.
The system only keeps track of the PID for each application; it's the responsibility of the volume mixer to convert the PID to a reasonable name (the audio service can't track this information because of session 0 isolation).
This works great, but there's one possible problem: If an application exits between the time when the application made a noise and the system times out the fact that it played the noise, then the volume mixer has no way of knowing what the name of the application that made the noise was. In that case, it uses the "Name Not Available" text to give the user some information.
Steve Rowe (test lead on the sound team) points to a great article from Ars Technica on D2A:
If you want digital audio in a computer, you have to get it from somewhere. Usually that means taking analog sound out of the air and turning it into the bits that a computer can understand. Ars Technica gives us another installment of the AudioFile. This one covers the subject of Analog to Digital Conversion.
Well worth reading.
Skywing sent me an email earlier today asking me essentially "Why doesn't Windows do a better job of handling the case where the default audio device goes away?"[1]
It's a good question, and one that we've spent a lot of time working on (this isn't a new issue for Vista, it's been there since day one, not that it actually matters).
For the vast majority of existing users, this isn't a problem - they only have one audio device on their machine, and thus it's a moot question - their default device almost never goes away. On the other hand, looking forward, it's going to become a more common scenario because of the prevalence of Bluetooth audio hardware and USB headsets (for example, I'm currently listening to Internet radio over a pair of Bluetooth speakers I purchased the other day).
The short answer to Skywing's question is: "It's the responsibility of the application to deal with handling errors". The audio stack bubbles out the error to the application and lets it figure out how to deal with the problem.
Which then begs the next question: "How do you handle errors so that you can recover from them?" It turns out that the answer to that question requires a bit of digging into the audio stack.
As I've discussed in the past, there are four major mechanisms for accessing the audio functionality in Windows Vista. They are: the MME APIs, DirectSound (and DirectShow), Media Foundation, and WASAPI.
For the MME APIs, an application usually accesses the audio stack using the WAVE_MAPPER (or MIDI_MAPPER) pseudo-device. The nice thing about the WAVE_MAPPER device is that it doesn't matter which device is the current device, it just uses the one chosen as the user's device. Alternatively, for the MME APIs, you can select a specific device. For the MME APIs, devices are numbered from 0 to <n>; for appcompat reasons, starting in Windows Server 2003, device 0 is typically the user's default device (there were applications that hard coded device 0 for their output device, which caused issues when you had more than one audio device).
For DirectShow and DirectSound, the call to the DirectSoundCreate API takes a GUID which represents the output or input device, or NULL to represent the default device. In addition, it also provides a mechanism to address the default voice communications device (DSDEVID_DefaultVoicePlayback).
For Media Foundation and WASAPI, you specify the specific audio endpoint on which you want to render or capture audio in the initialize call (MFCreateAudioRenderer for MF, for WASAPI, you activate an IAudioClient interface on the endpoint).
In every case, once you start streaming to a device, the only mechanism in place when something goes wrong is that an API call fails. That means that it's up to the application to figure out how to recover from any failure.
For the wave APIs and DSound, you really don't have any choice but to detect the failure, close down your local resources and restart streaming - the legacy APIs don't give you a good mechanism for detecting the cause behind a streaming failure.
For MediaFoundation, MF generates events using its event generator mechanism to inform an application of interesting events; among the events that can be received is an event indicating that the audio device was removed. There are other relevant events generated, including events that are generated when the audio service is stopped (which also stops streaming), when the mix format for the audio endpoint is changed, etc.
For WASAPI, a WASAPI client can retrieve the IAudioSessionControl service from an IAudioClient object and can use the IAudioSessionControl::RegisterAudioSessionNotification to register an IAudioSessionEvents interface. The audio service will call the IAudioSessionEvents::OnSessionDisconnected method when it tears down an audio stream (all these notifications are also passed through to MediaFoundation's event mechanism).
In Windows Vista, there are six different disconnect events generated, and there's a specific set of recovery steps for each of them:
[1] That's not the actual question that he asked, but the answer to his question is included in the answer to my question, so...
Nick White over at the Windows Vista Blog just posted an article written by Steve Ball, the PM in charge of the sounds team.
It does a pretty good job of covering why my $2000 PC sometimes glitches like crazy, while my $20 CD player works perfectly every single time.
It's worth a read.
We recently got an internal report from someone using the internal audio notification APIs: they were leaking memory and wanted help from us debugging the problem.
I took a look and discovered that the problem was a circular reference that was created when they called:
CFoo::Initialize()
{
    <SNIP>
    hr = CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_INPROC_SERVER, __uuidof(IMMDeviceEnumerator), (void**)&m_pEnumerator);
    if (FAILED(hr))
        return hr;
    // Registering "this" makes the enumerator take a reference to the CFoo object.
    hr = m_pEnumerator->RegisterEndpointNotificationCallback(this);
    if (FAILED(hr))
        return hr;
    <SNIP>
}

CFoo::~CFoo()
{
    <SNIP>
    m_pEnumerator->UnregisterEndpointNotificationCallback(this);
    <SNIP>
}
The root cause of the problem is that the IMMDeviceEnumerator::RegisterEndpointNotificationCallback takes a reference to the IMMNotificationClient object passed in. This shouldn't be a surprise, because the Counted Pointer design pattern requires that every time you save a pointer to an object, you take a reference to that object (and every interface that derives from IUnknown implements the Counted Pointer design pattern). Since the RegisterEndpointNotificationCallback saves its input pointer for later consumption (when it generates the notification), it has to take a reference to the object.
At the heart of the problem is the fact that the CFoo object only calls UnregisterEndpointNotificationCallback in its destructor. The destructor only runs when the reference count drops to zero, and the count can't drop to zero while the enumerator still holds the reference it took at registration, so the destructor will never be called. If the CFoo object had a "Shutdown()" or other form of finalizer, the call to UnregisterEndpointNotificationCallback could be moved to the finalizer, thus removing the circular reference and avoiding the memory leak. This is by far the best solution - I'm a huge fan of deterministic finalization.
Unfortunately, sometimes it's not possible to have a "Shutdown()" method (for instance, if you're implementing an interface that doesn't implement the finalizer design pattern; in fact, this was the case for the person who reported the problem to us).
In that case, you really want to depend on the fact that the reference count reflects your external references, not your internal references. Effectively, you want to maintain two separate reference counts, one for external clients, the other for internal usage.
One way to achieve this is to use a delegator object - instead of handing "this" to the RegisterEndpointNotificationCallback, you pass a small object that implements IMMNotificationClient. So:
class CFooDelegator: public IMMNotificationClient
{
    ULONG m_cRef;
    CFoo *m_pFoo;   // not reference counted; cleared by OnPFooFinalRelease

    ~CFooDelegator() { }   // private: the object is only destroyed via Release()
public:
    CFooDelegator(CFoo *pFoo) : m_cRef(1), m_pFoo(pFoo) { }

    STDMETHODIMP OnDeviceStateChanged(LPCWSTR pwstrDeviceId, DWORD dwNewState)
    {
        if (m_pFoo) { m_pFoo->OnDeviceStateChanged(pwstrDeviceId, dwNewState); }
        return S_OK;
    }
    STDMETHODIMP OnDeviceAdded(LPCWSTR pwstrDeviceId)
    {
        if (m_pFoo) { m_pFoo->OnDeviceAdded(pwstrDeviceId); }
        return S_OK;
    }
    STDMETHODIMP OnDeviceRemoved(LPCWSTR pwstrDeviceId)
    {
        if (m_pFoo) { m_pFoo->OnDeviceRemoved(pwstrDeviceId); }
        return S_OK;
    }
    STDMETHODIMP OnDefaultDeviceChanged(EDataFlow flow, ERole role, LPCWSTR pwstrDefaultDeviceId)
    {
        if (m_pFoo) { m_pFoo->OnDefaultDeviceChanged(flow, role, pwstrDefaultDeviceId); }
        return S_OK;
    }
    STDMETHODIMP OnPropertyValueChanged(LPCWSTR pwstrDeviceId, const PROPERTYKEY key)
    {
        if (m_pFoo) { m_pFoo->OnPropertyValueChanged(pwstrDeviceId, key); }
        return S_OK;
    }

    // Called by CFoo's destructor after it has unregistered the delegator.
    void OnPFooFinalRelease() { m_pFoo = NULL; }

    STDMETHOD(QueryInterface) (REFIID riid, LPVOID FAR* ppvObj)
    {
        *ppvObj = NULL;
        if (riid == IID_IUnknown) { *ppvObj = static_cast<IUnknown *>(this); }
        else if (riid == IID_IMMNotificationClient) { *ppvObj = static_cast<IMMNotificationClient *>(this); }
        else { return E_NOINTERFACE; }
        return S_OK;
    }
    STDMETHOD_(ULONG,AddRef)() { return InterlockedIncrement((LONG *)&m_cRef); }
    STDMETHOD_(ULONG,Release) ()
    {
        ULONG lRet = InterlockedDecrement((LONG *)&m_cRef);
        if (lRet == 0) { delete this; }
        return lRet;
    }
};
You then have to change CFoo::Initialize to construct a CFooDelegator object before calling RegisterEndpointNotificationCallback(), and pass the delegator instead of "this".
You also need to change the destructor on the CFoo:
CFoo::~CFoo()
{
    <SNIP>
    m_pEnumerator->UnregisterEndpointNotificationCallback(m_pDelegator);
    m_pDelegator->OnPFooFinalRelease();
    m_pDelegator->Release();
    <SNIP>
}
It's important to call UnregisterEndpointNotificationCallback before you call OnPFooFinalRelease - if you don't, there's a possibility that the client's final release of the CFoo might occur while a notification function is being called - if that happens, the destructor might complete and you end up calling back into a partially destructed object. And that's bad :). The good news is that the UnregisterEndpointNotificationCallback function guarantees that all notification routines have completed before it returns.
It's important to realize that this issue occurs with ALL the audio notification callback mechanisms: IAudioEndpointVolume::RegisterControlChangeNotify, IPart::RegisterControlChangeCallback, and IAudioSessionControl::RegisterAudioSessionNotification.
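To see the delegator idea without any COM plumbing, here's a minimal portable sketch. It is a model only: std::shared_ptr stands in for the COM reference count, and Listener, Enumerator, Delegator, Foo and g_liveFoos are all hypothetical names invented for the illustration, not real Core Audio types. It is also single threaded, so it does not model the guarantee that unregistering waits for in-flight notifications.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for IMMNotificationClient.
struct Listener {
    virtual ~Listener() = default;
    virtual void OnDeviceAdded() = 0;
};

// Hypothetical stand-in for the enumerator: registering stores a strong reference.
class Enumerator {
    std::vector<std::shared_ptr<Listener>> m_listeners;
public:
    void Register(std::shared_ptr<Listener> listener) { m_listeners.push_back(std::move(listener)); }
    void Unregister(const Listener* listener) {
        m_listeners.erase(std::remove_if(m_listeners.begin(), m_listeners.end(),
                              [listener](const std::shared_ptr<Listener>& p) { return p.get() == listener; }),
                          m_listeners.end());
    }
    void NotifyAll() { for (const auto& l : m_listeners) l->OnDeviceAdded(); }
    std::size_t ListenerCount() const { return m_listeners.size(); }
};

int g_liveFoos = 0;  // number of Foo objects whose destructor has not run yet

class Foo;

// The small object handed to the enumerator instead of the Foo itself.
class Delegator : public Listener {
    Foo* m_foo;  // plain back pointer, deliberately NOT a counted reference
public:
    explicit Delegator(Foo* foo) : m_foo(foo) {}
    void OnFooFinalRelease() { m_foo = nullptr; }
    void OnDeviceAdded() override;
};

class Foo {
    std::shared_ptr<Enumerator> m_enumerator;
    std::shared_ptr<Delegator> m_delegator;
public:
    int notifications = 0;
    explicit Foo(std::shared_ptr<Enumerator> e) : m_enumerator(std::move(e)) { ++g_liveFoos; }
    void Initialize() {
        m_delegator = std::make_shared<Delegator>(this);
        m_enumerator->Register(m_delegator);  // the enumerator references the delegator, not the Foo
    }
    ~Foo() {
        if (m_delegator) {
            m_enumerator->Unregister(m_delegator.get());  // unregister first...
            m_delegator->OnFooFinalRelease();             // ...then cut the back pointer
        }
        --g_liveFoos;
    }
    void OnDeviceAdded() { ++notifications; }
};

void Delegator::OnDeviceAdded() { if (m_foo) m_foo->OnDeviceAdded(); }
```

Because the enumerator only ever holds the delegator, the Foo's lifetime is decided by its external clients alone; a Foo created on the stack, initialized, and allowed to go out of scope leaves the enumerator with no listeners.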
In the beginning, there was a need to be able to describe the format contained in a stream of audio data.
And thus the WAVEFORMAT structure was born in Windows 3.1.
typedef struct WAVEFORMAT {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
} WAVEFORMAT;
The problem with the WAVEFORMAT is that it was ok at expressing audio streams that contained samples whose size was a power of 2, but there was no way of representing audio streams that contained samples whose size was something other than that (like 24bit samples).
So the PCMWAVEFORMAT was born.
typedef struct PCMWAVEFORMAT {
    WAVEFORMAT wf;
    WORD       wBitsPerSample;
} PCMWAVEFORMAT;
If the application passed in a WAVEFORMAT with a wFormatTag of WAVE_FORMAT_PCM, it was required to actually pass in a PCMWAVEFORMAT so that the audio infrastructure could determine the number of bits per sample.
That worked fine and solved that problem, but the powers that be quickly realized that relying on the format tag for extensibility was going to be a problem in the future.
So once again, the structure was extended, and for Windows NT 3.5 and Windows 95, we got the WAVEFORMATEX that we know and love:
typedef struct tWAVEFORMATEX
{
    WORD  wFormatTag;       /* format type */
    WORD  nChannels;        /* number of channels (i.e. mono, stereo...) */
    DWORD nSamplesPerSec;   /* sample rate */
    DWORD nAvgBytesPerSec;  /* for buffer estimation */
    WORD  nBlockAlign;      /* block size of data */
    WORD  wBitsPerSample;   /* number of bits per sample of mono data */
    WORD  cbSize;           /* the count in bytes of the size of */
                            /* extra information (after cbSize) */
} WAVEFORMATEX, *PWAVEFORMATEX, NEAR *NPWAVEFORMATEX, FAR *LPWAVEFORMATEX;
This solved the problem somewhat. But there was a problem - while all the APIs were changed to express a WAVEFORMATEX, there were still applications that passed in a WAVEFORMAT to the API (and there were WAV files that had been authored with WAVEFORMAT structures). The root of the issue is that there was no way of distinguishing between a WAVEFORMAT (which didn't have a cbSize field) and a WAVEFORMATEX (which did). To resolve this, for WAVEFORMAT structures kept in files, the file metadata provided the size of the structure, so we could use the size of the structure to distinguish the various forms.
When the structure was passed in as a parameter to a function, there was still a problem. For that, the code that parses a WAVEFORMATEX structure must rely on the fact that if the wFormatTag field in the WAVEFORMAT structure is WAVE_FORMAT_PCM, then the WAVEFORMAT structure is actually a PCMWAVEFORMAT, which is the same as a WAVEFORMATEX with a cbSize field set to 0. For all other formats, the code simply assumes that the caller is passing in a WAVEFORMATEX structure.
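The two disambiguation rules can be sketched in a few lines of portable C++. This is a sketch under assumptions rather than the code Windows actually uses: the byte counts are the packed sizes of the three layouts shown above, and the function names, the FormatKind enum, and the exact size thresholds in the file case are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Packed sizes of the three layouts, in bytes.
constexpr std::size_t kWaveFormatSize    = 14;  // WAVEFORMAT
constexpr std::size_t kPcmWaveFormatSize = 16;  // PCMWAVEFORMAT adds wBitsPerSample
constexpr std::size_t kWaveFormatExSize  = 18;  // WAVEFORMATEX adds cbSize
constexpr std::uint16_t kWaveFormatPcm   = 1;   // value of WAVE_FORMAT_PCM

enum class FormatKind { WaveFormat, PcmWaveFormat, WaveFormatEx };

// File case: the file metadata records how many bytes the structure occupies,
// so the size alone identifies the layout (thresholds are an assumption).
FormatKind ClassifyFromFile(std::size_t structureBytes) {
    if (structureBytes >= kWaveFormatExSize)  return FormatKind::WaveFormatEx;
    if (structureBytes >= kPcmWaveFormatSize) return FormatKind::PcmWaveFormat;
    return FormatKind::WaveFormat;
}

// Parameter case: there is no size, so only the format tag can be consulted.
FormatKind ClassifyFromParameter(std::uint16_t wFormatTag) {
    return wFormatTag == kWaveFormatPcm ? FormatKind::PcmWaveFormat
                                        : FormatKind::WaveFormatEx;
}

// How many extra bytes a parser may trust after the fixed header.
// For PCM the cbSize slot may not exist at all, so it is treated as 0.
std::uint16_t EffectiveCbSize(std::uint16_t wFormatTag, std::uint16_t cbSizeField) {
    return wFormatTag == kWaveFormatPcm ? std::uint16_t{0} : cbSizeField;
}
```

With these definitions, EffectiveCbSize(kWaveFormatPcm, anything) is 0, which is exactly the "PCMWAVEFORMAT is a WAVEFORMATEX with cbSize of 0" rule from the paragraph above.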
Unfortunately, the introduction of the WAVEFORMATEX wasn't quite enough. When you're dealing with two channel audio streams, it's easy to simply say that channel 0 is left and channel 1 is right (or whatever). But when you're dealing with a multichannel audio stream, it's not possible to determine which channel goes with which speaker. In addition, with a WAVEFORMATEX, there's still a problem with non power-of-2 formats. This time, the problem happens when you take a 24bit waveformat and try to pack it into 32bit samples - doing this can dramatically speed up any manipulation that needs to be done on the samples, so it's highly desirable.
So one final enhancement was made to the WAVEFORMAT structure, the WAVEFORMATEXTENSIBLE (introduced in Windows 2000):
typedef struct {
    WAVEFORMATEX Format;
    union {
        WORD wValidBitsPerSample;  /* bits of precision */
        WORD wSamplesPerBlock;     /* valid if wBitsPerSample==0 */
        WORD wReserved;            /* If neither applies, set to zero. */
    } Samples;
    DWORD dwChannelMask;           /* which channels are */
                                   /* present in stream */
    GUID SubFormat;
} WAVEFORMATEXTENSIBLE, *PWAVEFORMATEXTENSIBLE;
The WAVEFORMATEXTENSIBLE contains the old WAVEFORMATEX and adds a couple of fields that allow the caller to specify the packing of the samples and to describe which channels in the stream should be redirected to which speaker. For example, if the dwChannelMask is SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT | SPEAKER_LOW_FREQUENCY | SPEAKER_TOP_FRONT_LEFT, then channel 0 is the front left channel, channel 1 is the front right channel, channel 2 is the subwoofer, and channel 3 is the top front left speaker. The way you identify a WAVEFORMATEXTENSIBLE is that the Format.wFormatTag field is set to WAVE_FORMAT_EXTENSIBLE and the Format.cbSize field is always set to 0x16.
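The channel ordering rule in that example is simply "walk the mask from the lowest bit up; the Nth set bit is channel N". Here's a small portable sketch of that rule. The kSpeaker constants mirror the values of the corresponding SPEAKER_ flags but are redefined locally so the sketch compiles without the Windows headers, and ChannelsFromMask is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Local copies of a few speaker position bits (same values as the SPEAKER_* flags).
constexpr std::uint32_t kSpeakerFrontLeft    = 0x1;
constexpr std::uint32_t kSpeakerFrontRight   = 0x2;
constexpr std::uint32_t kSpeakerFrontCenter  = 0x4;
constexpr std::uint32_t kSpeakerLowFrequency = 0x8;
constexpr std::uint32_t kSpeakerBackLeft     = 0x10;
constexpr std::uint32_t kSpeakerBackRight    = 0x20;
constexpr std::uint32_t kSpeakerTopFrontLeft = 0x1000;

// Returns the speaker bit assigned to each channel, in stream order:
// element 0 is the speaker for channel 0, element 1 for channel 1, and so on.
std::vector<std::uint32_t> ChannelsFromMask(std::uint32_t dwChannelMask) {
    std::vector<std::uint32_t> speakers;
    for (std::uint32_t bit = 1; bit != 0; bit <<= 1) {  // lowest bit first
        if (dwChannelMask & bit) {
            speakers.push_back(bit);
        }
    }
    return speakers;
}
```

Feeding it the mask from the example above yields front left, front right, low frequency, top front left, in that order.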
That's where things live for now - who knows if there will be another revision in the future.
Chris Pirillo had an interesting blog post the other day with the rather uninformative title of "Windows Vista Sound Problems". He has a reader who built a shutdown sound that is almost 2 minutes long, and that reader is upset that the system isn't playing his entire shutdown sound when he shuts his system down.
Chris speculates that it might be tied to the sound event process or to audio driver limitations, but the actual answer is much simpler, and is related to the way that the shell handles the shutdown sound.
One of the most significant pieces of feedback that we received about Windows XP was that people (especially people with laptops) were quite upset at the amount of time that it took for XP to shut down. You could see dramatic proof of this by simply walking around the halls here at Microsoft - you'd see people going from their office to a meeting with their laptop lids cracked partly open. The big reason for this was that XP didn't reliably shut down the system - you'd close the lid of your laptop, stick it in your laptop case and head off to your meeting, and when you got there you'd burn your hands because the laptop never shut down, even after 5 minutes with the lid closed.
For Vista, the power management folks decided that they were going to fix this problem - when you closed your laptop (or shut off your computer), they WERE going to shut down the machine. This makes a ton of sense - the act of closing the lid on the laptop is a clear indication that the customer's intent is to stop using their machine, so the system should turn itself off when this happens.
This decision had some consequences though. On Windows XP, an application was allowed to delay system shutdown indefinitely - this was a major cause of the overheated laptop problem; on Vista, the system IS going to shut down, even if your application isn't ready for it. So if your application takes a long time to exit (and Microsoft applications are absolutely NOT excluded from this list), then it's going to have the rug yanked out from under its feet.
Since the shutdown process is effectively synchronous, the shell (explorer.exe) attempts to limit the size of the WAV file that's played during system shutdown (it uses the file size as a first order approximation of the length of the sound). If the .WAV file that's registered for the shutdown sound is larger than 4M in size, it won't be played.
So if Chris's reader reworked his file to keep it under 4M in size (which probably can be done with a reduction in sample rate and channel count) then Explorer will happily play the sound.
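To get a feel for how much reduction that takes, here's a back-of-the-envelope sketch for uncompressed PCM. It rests on assumptions: "4M" is taken to mean 4 * 1024 * 1024 bytes, the small WAV header is ignored, the sound is rounded up to 120 seconds, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: "4M" is read as 4 MiB. A file larger than this is skipped.
constexpr std::uint64_t kShutdownSoundLimitBytes = 4ull * 1024 * 1024;

// Size of the raw PCM sample data for a sound of the given length and format.
constexpr std::uint64_t PcmDataBytes(unsigned seconds, unsigned samplesPerSec,
                                     unsigned channels, unsigned bitsPerSample) {
    return static_cast<std::uint64_t>(seconds) * samplesPerSec * channels * (bitsPerSample / 8);
}

// True if a file of this size stays within the limit described above.
constexpr bool FitsShutdownLimit(std::uint64_t fileBytes) {
    return fileBytes <= kShutdownSoundLimitBytes;
}
```

At CD quality (44.1kHz, stereo, 16 bit) two minutes of PCM is 21,168,000 bytes, about five times over the limit. Dropping to 22.05kHz mono 16 bit gives 5,292,000 bytes, which is still too big; 11.025kHz mono 16 bit gives 2,646,000 bytes, which fits.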
However Chris's reader may still not be happy with the results. To understand why, you need to dig a bit deeper into the shutdown process.
The Windows shutdown process is (very roughly - this is a 100,000 foot approximation, the actual process is much more complicated):
Remember my comment above about shutting down the user's applications? Well, explorer is still one of the user's applications, and it's subject to the same termination rules as every other application. Some number of seconds into playing the shutdown sound, NTUSER will decide that the explorer is hung and will bring up the "This application is hung, do you want to kill it?" screen (the reason will be something like "Explorer / Playing Logoff sound").
What happens next depends on what the user answers (or has previously answered). If the user answers "yes" to the "Do you wish to terminate this application" prompt, then the system enters "forced shutdown" mode. If they answered "no", then the system will wait until all the applications have terminated.
If the system is in "forced shutdown" mode, then 30 seconds after the prompt, the system WILL kill the remaining applications, regardless of whether or not they're shut down. If Explorer is still playing the logoff sound at that time, it'll be yanked as well, and the logoff sound will be cut short.
Recently BillP, the author of the antispyware application WinPatrol asked on the MSDN forums about a problem he was having with his application.
His app called PlaySound(MAKEINTRESOURCE(IDR_WOOF), hInst, SND_RESOURCE | SND_SYNC | SND_NOWAIT) but it was failing (returning false).
He was wondering if this might be a bug in Vista's playsound implementation - a reasonable assumption given that his application worked just fine on previous versions of Windows.
I knew that we were passing all the PlaySound regression tests, and there are a number of elements of Windows that use SND_RESOURCE (the pearl startup sound is one of them), so the SND_RESOURCE functionality wasn't broken. I was puzzled for a bit and then I realized that there WAS one change to PlaySound in Vista (other than some general code cleanup, the addition of support for the SND_SENTRY and SND_SYSTEM flags and support for accessibility events on PlaySound).
For Vista, I tightened up the validation logic that's used when checking files before the PlaySound call. Among other things, we check to make sure that:
a) The cbSize field in the "fmt " tag in the WAV file is less than 1K in length and
b) The WAVEFORMATEX in the "fmt " tag in the WAV file fits inside the "fmt " tag (done by checking that the waveformatex->cbSize+sizeof(WAVEFORMATEX) is less than the size of the "fmt " chunk[1]).
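The two checks can be sketched portably as follows. This is not the PlaySound source, just an illustration under assumptions: the chunk is handed over as a byte vector, cbSize is read as a little-endian 16 bit value at byte offset 16, any chunk too short to hold a cbSize field skips the check, and "fits inside" is taken to allow an exact fit. IsFmtChunkPlausible and MakeFmtChunk are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kWaveFormatSize   = 14;    // packed sizeof(WAVEFORMAT)
constexpr std::size_t kWaveFormatExSize = 18;    // packed sizeof(WAVEFORMATEX)
constexpr std::size_t kMaxCbSize        = 1024;  // check (a): cbSize below 1K

// Applies checks (a) and (b) to the raw bytes of a "fmt " chunk.
bool IsFmtChunkPlausible(const std::vector<std::uint8_t>& fmt) {
    if (fmt.size() < kWaveFormatSize) return false;   // not even a WAVEFORMAT (assumption)
    if (fmt.size() < kWaveFormatExSize) return true;  // no cbSize field to validate
    const std::size_t cbSize = static_cast<std::size_t>(fmt[16]) |
                               (static_cast<std::size_t>(fmt[17]) << 8);
    if (cbSize >= kMaxCbSize) return false;           // check (a)
    return kWaveFormatExSize + cbSize <= fmt.size();  // check (b)
}

// Test helper: a zeroed chunk of the given size with cbSize filled in when there is room.
std::vector<std::uint8_t> MakeFmtChunk(std::size_t chunkBytes, std::uint16_t cbSize) {
    std::vector<std::uint8_t> fmt(chunkBytes, 0);
    if (chunkBytes >= kWaveFormatExSize) {
        fmt[16] = static_cast<std::uint8_t>(cbSize & 0xFF);
        fmt[17] = static_cast<std::uint8_t>(cbSize >> 8);
    }
    return fmt;
}
```

For instance, an 18 byte chunk that claims a cbSize of 0x38 is rejected by check (b), while the same chunk with a cbSize of 0 passes.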
I downloaded BillP's app, and checked the resources in the file. And sure enough, the WAVEFORMATEX in the file had a cbSize of 0x38 when it should have been 0. Once I patched his binary to change the 0x38 to a 0, his application started barking away.
At this point there are two options: (1) BillP can fix his application to correct the corrupted resource or (2) Microsoft can change the PlaySound API to allow this kind of corruption (either to allow a bogus cbSize or to edit the app's WAVEFORMATEX to "fix" the bogus cbSize). Changing PlaySound is not trivial at this point, and it's not clear what the right fix is - the error might be in the size of the "fmt " chunk, which means that the information in the cbSize field might be accurate. In addition, the downstream audio rendering APIs are likely to choke on these malformed structures anyway.
It's this kind of subtle breaking change that makes modifying any of the older Windows APIs such a nightmare.
[1] We only check if the "fmt " chunk size is greater than sizeof(WAVEFORMAT) - if the "fmt " chunk is sizeof(WAVEFORMAT) then we assume that this structure is a WAVEFORMAT structure, which doesn't have a cbSize field.
For better or worse, the Windows UI model ties a window to a particular thread; that has led to a programming paradigm where work is divided between "UI threads" and "I/O threads". In order to keep your application responsive, it's critically important to not perform any blocking operations on your UI thread and instead do them on the "I/O threads".
One thing that people don't always realize is that even asynchronous APIs block. This isn't surprising - a single processor core can only do one thing at a time (to be pedantic, the processor cores can and do more than one thing at a time, but the C (or C++) language is defined to run on an abstract machine that enforces various strict ordering semantics, thus the C (or C++) compiler will do what is necessary to ensure that the language's ordering semantics are met[1]).
So what does an "async" API really do given that most APIs are written in languages that don't contain native concurrency support[2] ? Well, usually it packages up the parameters to the API and queues it to a worker thread (this is what the CLR does for many of the "async" CLR operations - they're not really asynchronous, they're just synchronous calls made on some other thread).
For some asynchronous APIs (like ReadFile and WriteFile) you CAN implement real asynchronous semantics - under the covers, the ReadFile API adds a read request to a worker queue and starts the I/O associated with reading the data from disk. When the hardware interrupt occurs indicating that the read is complete, the I/O subsystem removes the read request from the worker queue and completes it [3].
The critical thing to realize is that even for the APIs that do support real asynchronous activity there's STILL synchronous processing going on - you still need to package up the parameters for the operation and add them to a queue somewhere, and that can stall the processor. For most operations it doesn't matter - the time to queue the parameters is sufficiently small that you can perform it on the UI thread.
And sometimes it isn't. It turns out that my favorite API, PlaySound is a great example of this. PlaySound provides asynchronous behavior with the SND_ASYNC flag, but it does a fair amount of work before dispatching the call to a worker thread. Unfortunately, some of the processing done in the application thread can take many milliseconds (especially if this is the first call to winmm.dll).
I originally wrote down the operations that were performed on the application's thread, but then I realized that doing so would cement the behavior for all time, and I don't want to do that. So the following will have to suffice:
In general, PlaySound does the processing necessary to determine the filename (or WAV image) in the application thread and posts the real work (rendering the sound) to a worker thread. That processing is likely to involve synchronous I/Os and registry reads. It may involve searching the path looking for a filename. For SND_RESOURCE, it will also involve reading the resource data from the specified module.
Because of this processing, it's possible for the PlaySound(..., SND_ASYNC) operation to take several hundred milliseconds (and we've seen it take as long as several seconds if the current directory is located on an unreliable network). As a result, even the SND_ASYNC version of the PlaySound API should be avoided on UI threads[4].
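One way to follow that advice is to give sounds their own worker thread and have the UI thread do nothing but enqueue a request. Here's a minimal portable sketch of such a queue. SoundWorker and Post are hypothetical names, and the jobs are arbitrary callables; in a Windows application the job body would be the place to make the blocking PlaySound call, which is an assumption about usage rather than anything shown here.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// A single background thread that runs posted jobs in order.
class SoundWorker {
    std::mutex m_lock;
    std::condition_variable m_wake;
    std::queue<std::function<void()>> m_jobs;
    bool m_stopping = false;
    std::thread m_thread;  // declared last so the members above exist before it starts

    void Run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> guard(m_lock);
                m_wake.wait(guard, [this] { return m_stopping || !m_jobs.empty(); });
                if (m_jobs.empty()) return;  // stopping and nothing left to run
                job = std::move(m_jobs.front());
                m_jobs.pop();
            }
            job();  // the slow work happens here, outside the lock
        }
    }

public:
    SoundWorker() : m_thread([this] { Run(); }) {}

    // Drains any queued jobs, then joins the worker thread.
    ~SoundWorker() {
        {
            std::lock_guard<std::mutex> guard(m_lock);
            m_stopping = true;
        }
        m_wake.notify_one();
        m_thread.join();
    }

    // Called from the UI thread: only takes a lock and pushes onto a queue.
    void Post(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> guard(m_lock);
            m_jobs.push(std::move(job));
        }
        m_wake.notify_one();
    }
};
```

The only work left on the calling thread is the push onto the queue, so the filename lookup, registry reads and resource loading described above all happen on the worker instead.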
[1] I bet most of you didn't know that the C language definition strictly defines an abstract machine on which the language operates.
[2] Yes, I know about the OpenMP extensions to C/C++, they don't change this scenario.
[3] I know that this is a grotesque simplification of the actual process.
[4] For those that are now scoffing: "What a piece of junk - why on earth would you even bother doing the SND_ASYNC if you're not going to really be asynchronous", I'll counter that the actual rendering of the audio samples for many sounds takes several seconds. The SND_ASYNC flag moves all the actual audio rendering off the application's thread to a worker thread, so it can result in a significant improvement in performance.