Larry Osterman's WebLog

Confessions of an Old Fogey

Audio in Vista, the big picture

So I've talked a bit about some of the details of the Vista audio architecture, but I figure a picture's worth a bunch of text, so here's a simple version of the audio architecture:

This picture is for "shared" mode; I'll talk about exclusive mode in a future post.

The picture looks complicated, but in reality it isn't.  There are a boatload of new constructs to discuss here, so bear with me a bit.

The flow of audio samples through the audio engine is represented by the arrows - data flows from the application, to the right in this example.

The first thing to notice is that once the audio leaves the application, it flows through a very simple graph - the topology is quite straightforward, but it's a graph nonetheless, and I tend to refer to samples as moving through the graph.

Starting from the left, the audio system introduces the concept of an "audio session".  An audio session is essentially a container for audio streams.  In general there is only one session per process, although this isn't strictly true.
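To make the "container for audio streams" idea concrete, here's a minimal sketch of the relationship. This is not the Vista API: `AudioStream`, `AudioSession`, `SessionRegistry` and the `"default"` session name are all hypothetical names invented for illustration. The only things it models are that a session holds streams, and that a process usually (but not always) has exactly one session.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class AudioStream:
    """Hypothetical stand-in for one stream of audio opened by an application."""
    name: str


@dataclass
class AudioSession:
    """Hypothetical session: nothing more than a named container of streams."""
    session_id: str
    streams: List[AudioStream] = field(default_factory=list)

    def add_stream(self, name: str) -> AudioStream:
        stream = AudioStream(name)
        self.streams.append(stream)
        return stream


class SessionRegistry:
    """Maps (process id, session id) to a session.

    Callers that never pass a session id all land in the process's single
    "default" session, which is the common case described above.
    """

    def __init__(self) -> None:
        self._sessions: Dict[Tuple[int, str], AudioSession] = {}

    def get_session(self, pid: int, session_id: str = "default") -> AudioSession:
        key = (pid, session_id)
        if key not in self._sessions:
            self._sessions[key] = AudioSession(session_id)
        return self._sessions[key]


registry = SessionRegistry()
# Two streams from the same (made-up) process share one session by default.
registry.get_session(100).add_stream("music")
registry.get_session(100).add_stream("sound effects")
# A process can still ask for a second session explicitly.
registry.get_session(100, "voice chat").add_stream("microphone monitor")
```

The explicit `session_id` argument is how the sketch expresses the "isn't strictly true" part: one session per process is the default, not a hard rule.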

Next, we have the application that's playing audio.  The application (using WASAPI) renders audio to a "Cross Process Transport".  The CPT's job is to get the audio samples to the audio engine running in the Windows Audio service.

In general, the terminal nodes in the graph are transports.  Three transports ship with Vista: the cross process transport I mentioned above, a "Kernel Streaming" transport (used for rendering audio to a local audio adapter), and an "RDP Transport" (used for rendering audio over a Remote Desktop Connection).

As the audio samples flow from the cross process transport to the kernel streaming transport, they pass through a series of Audio Processing Objects, or APOs.  APOs perform digital signal processing (DSP) on the audio samples.  Some examples of the APOs shipped in Vista are:

  • Volume - The volume APO provides mute and gain control.
  • Format Conversion - The format converter APOs (there are several) provide data format conversion - int to float32, float32 to int, etc.
  • Mixer - The mixer APO mixes multiple audio streams.
  • Meter - The meter APO remembers the peak and RMS values of the audio samples pumped through it.
  • Limiter - The limiter APO prevents audio samples from clipping when rendering.
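
To give a feel for what each of these stages does to the samples, here's a rough sketch with one plain function per APO. None of this is the real APO interface: the function names are made up, the 16-bit input format is an assumption, and the hard clamp in `limit` is a simple stand-in for a real limiter, which would reduce gain smoothly instead. The order in which the stages are chained at the end is also just chosen for illustration.

```python
import math
from typing import List, Tuple

Samples = List[float]


def int16_to_float(pcm: List[int]) -> Samples:
    """Format conversion: 16-bit integer PCM to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in pcm]


def apply_volume(samples: Samples, gain: float, mute: bool = False) -> Samples:
    """Volume: scale by a linear gain, or output silence when muted."""
    if mute:
        return [0.0] * len(samples)
    return [s * gain for s in samples]


def mix(streams: List[Samples]) -> Samples:
    """Mixer: add equal-length streams together sample by sample."""
    return [sum(frame) for frame in zip(*streams)]


def meter(samples: Samples) -> Tuple[float, float]:
    """Meter: return (peak, RMS) for one buffer of samples."""
    if not samples:
        return 0.0, 0.0
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, rms


def limit(samples: Samples, ceiling: float = 1.0) -> Samples:
    """Limiter stand-in: clamp every sample into [-ceiling, ceiling]."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


# Two made-up application streams, each converted and scaled on its own,
# then mixed, limited, and metered.
stream_a = apply_volume(int16_to_float([16384, -16384, 32767, 0]), gain=1.0)
stream_b = apply_volume(int16_to_float([16384, 16384, 16384, 0]), gain=1.0)
mixed = mix([stream_a, stream_b])   # the third sample now exceeds 1.0
output = limit(mixed)               # the clamp pulls it back to the ceiling
peak, rms = meter(output)
```

Converting to floating point before mixing is what makes the limiter necessary in this sketch: the sum of two in-range streams can exceed full scale, and something has to bring it back before it reaches the device.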

All of the code above runs in user mode except for the audio driver at the very end.

  • I can't see anything in Firefox or Konqueror (same rendering engine as Apple's Safari)
  • I did include a VML warning :(

    Unfortunately, I don't know how to get the image to work in Firefox :(
  • What is VML? Sounds like I'm screwed if I have Safari.
  • I've put a screengrab up at:

    http://www.visuar.com/vistasharedaudiostack.gif

    Larry: you can download my image and put that up on your web/blog host and use that instead of the VML...
  • This thing is excessively broken on Safari.  There's text apparently from the VML drawing all over the post, and I can't select any of the underlying text.  Also there's no drawing at all.  I had to tab through the entire navigation bar to get to the comment button...

    Vorn
  • Ok, VML fixed.
  • Can 3rd parties write their own transports and/or APOs, i.e. will there be publicly documented interfaces for implementing them?

    In particular I'm interested in writing a transport similar to the RDP transport to route audio to a remote network device.
  • Sean, yes, IHVs will have the ability to write APOs for their audio solution.

    I'm not 100% on the transport issue.
  • What about ISVs writing an APO, e.g. a graphic equalizer that is independent of any particular hardware audio solution?

    If that's allowed, then even if ISVs can't write their own transport, I could get my APO inserted into the graph. It would copy the audio samples to the target network device and allow the samples to continue through the graph to the local audio driver.
  • Hey Larry,

    I think you could reach a bigger audience if you just took a screenshot of the page in IE and replaced the VML with the image of the screenshot. I created a GIF image from the page in IE and it was only 45 KB, so I don't think that bandwidth would be a big deal.
  • During this series, can you work in a discussion of how Secure Audio Path fits in?
  • All I want from Vista’s Audio is this:

    I will go home. (it is there today).
    I will open my Tablet PC. (it is there today).
    I will start a game over my wireless network (it is there today).

    I will hear the game’s sound over my surround speakers at home wirelessly, either using the media edition PC that is there in the house or any other way. I want a wireless sound driver, not streaming :-) (It does not exist today)
  • G.T., I know I've read that there are people who are actively investigating wireless speaker solutions, so there's no reason to believe that it won't work in the future.
  • PingBack from http://www.laranevans.com/posts/127
  • With the new audio stuff in Vista, is it possible for the user to push a slider or something that merges all audio channels to one speaker? Occasionally one speaker of my headphones will break and some of the songs I listen to make heavy use of stereo effects and it's kind of annoying.