Tagged Content List
  • Blog Post: More on audio buffer alignment requirements

    I chatted in the past about how audio device alignment requirements impact the buffer size and the WASAPI alignment dance. There are three alignment requirements on audio buffers: The buffer size must be a multiple of WAVEFORMATEX.nBlockAlign. This allows individual audio frames to be copied...
  • Blog Post: Using the Speech API to convert speech to text

    Some time ago I created a "listen.exe" tool which used SAPI's ISpRecoContext to listen to the microphone and dump any recognized text to the console. Today I had to debug an issue with SAPI reading from a .wav file, so I updated it to accept a --file argument (listen.exe --file foo.wav); this consumes the...
  • Blog Post: Troubleshooting default audio device heuristics

    In Windows 7 we published a white paper which shows how Windows chooses which audio device should be the default. This remains true for Windows 8 and Windows 8.1. The six factors that are considered for each device are: Jack detection capability: whether KSJACK_DESCRIPTION2.JackCapabilities...
  • Blog Post: A mental model for the Windows Phone AudioRoutingManager API

    The Windows Phone SDK includes a Windows.Phone.Media.Devices.AudioRoutingManager API which I had occasion to use. The API allows apps that have communication audio streams (e.g., Voice over IP calls) to control whether the audio goes out over the earpiece, over the speakerphone, or over the Bluetooth...
  • Blog Post: Why is 1 Pascal equal to 94 dB Sound Pressure Level? (1 Pa = 94 dB SPL)

    Last time we talked about why a full-scale digital sine wave has a power measurement of -3.01 dB FS. (Spoiler: because it's not a square wave.) This time we'll discuss why an atmospheric sound which generates a root-mean-square pressure of 1 Pascal has a power measurement of 94 dB SPL. As before, dB...
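    The excerpt is cut off before the derivation, so the following is only a sketch of the standard arithmetic, not necessarily the post's own presentation. It assumes the conventional SPL reference pressure of 20 micropascals (RMS) and a full-scale reference amplitude of 1 for dB FS; neither value appears in the excerpt.

```latex
% Assumption: p_0 = 20 \mu Pa is the conventional SPL reference pressure.
\begin{align*}
L_p &= 20 \log_{10}\!\left(\frac{p_{\mathrm{rms}}}{p_0}\right)
     = 20 \log_{10}\!\left(\frac{1\ \mathrm{Pa}}{20 \times 10^{-6}\ \mathrm{Pa}}\right)
     = 20 \log_{10}(50\,000) \approx 93.98\ \mathrm{dB\ SPL} \\
% Assumption: a full-scale sine has peak amplitude 1, so its RMS is 1/\sqrt{2}.
L_{\mathrm{FS}} &= 20 \log_{10}\!\left(\frac{1}{\sqrt{2}}\right)
     = 10 \log_{10}\!\left(\frac{1}{2}\right) \approx -3.01\ \mathrm{dB\ FS}
\end{align*}
```

    Under those assumptions, 93.98 dB rounds to the 94 dB SPL quoted in the title.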
  • Blog Post: Getting peak meters and volume settings for all apps and audio devices on the system

    A few previous posts have touched on how to get peak meter readings on the device, and per-app: "Getting the package full name of a Windows Store app, given the process ID"; "More on IAudioSessionControl and IAudioSessionControl2, plus: how to log a GUID"; "Getting audio peak meter values for all...
  • Blog Post: shellproperty.exe - set/read string properties on a file from the command line

    Yesterday Raymond Chen blogged a "Little Program" which could edit audio metadata. As it happens, I have a similar tool I threw together which accepts a property key and a string property value to update a property, or can read a string or string-vector property. Usage: >shellproperty shellproperty...
  • Blog Post: More on IAudioSessionControl and IAudioSessionControl2, plus: how to log a GUID

    A while back I blogged about using IAudioSessionControl and IAudioSessionControl2 to get a list of active sessions, and then using IAudioMeterInformation to see what the amplitude level of the audio being played from each session was. I decided to go back and push this a little further and see what...
  • Blog Post: Buffer size alignment and the audio period

    I got an email from someone today, paraphrased below: Q: When I set the sampling frequency to 48 kHz, and ask Windows what the audio period is, I get exactly 10 milliseconds. When I set it to 44.1 kHz, I get very slightly over 10 milliseconds: 10.1587 milliseconds, to be precise. Why? A: Alignment...
  • Blog Post: Grabbing the output of the Microsoft Speech API text-to-speech engine as audio data

    A while ago I wrote a post on Implementing a "say" command using ISpVoice from the Microsoft Speech API which showed how to use Speech API to do text-to-speech, but was limited to playing the generated audio out of the default audio device. Recently on the Windows Pro Audio forums, user falven asked...
  • Blog Post: How to dump Speech API object properties

    Stamatis Pap asked in a forum thread how to use a Speech API ISpVoice with a non-default audio device. This MSDN article shows how to use SpEnumTokens to list all the currently active audio outputs, but the number and order of audio outputs is subject to change as things come and go, or as the default...
  • Blog Post: Enumerating mixer devices, mixer lines, and mixer controls

    The WinMM multimedia APIs include an API for enumerating and controlling all the paths through the audio device; things like bass boost, treble control, pass-through audio from your CD player to your headphones, etc. This is called the "mixer" API and is the forerunner of the IDeviceTopology API. ...
  • Blog Post: Enumerating MIDI devices

    In addition to audio playback and recording, Windows Multimedia (WinMM) provides a Musical Instrument Digital Interface (MIDI) API. Here's how to make a list of all the MIDI devices on the system, their capabilities, and the hardware device interface associated with each of them. Source and binaries...
  • Blog Post: Implementing a "listen" command using ISpRecoContext from the Microsoft Speech API

    Earlier today I posted a quick "say.exe" sample app: you give it text, and it speaks the text aloud using the text-to-speech part of the Windows Speech API. It was very straightforward - only 67 lines of C++ code. It took me a little longer to figure out how to do this "listen.exe" sample app; you...
  • Blog Post: Implementing a "say" command using ISpVoice from the Microsoft Speech API

    I've known for a while that Microsoft Windows comes with text-to-speech and speech-to-text APIs, which power the Narrator and Speech Recognition features respectively. This forum post prompted me to mess around with them a little. I came up with this implementation of a say.exe command which takes...
  • Blog Post: Muting all audio outputs with IAudioEndpointVolume

    I have a selfhost tool that I use to mute all audio outputs programmatically. Pseudocode:

        IMMDeviceEnumerator::EnumAudioEndpoints
        for each device:
            IMMDevice::Activate(IAudioEndpointVolume)
            IAudioEndpointVolume::SetMute(TRUE)

    Source and binaries attached.
  • Blog Post: Getting audio peak meter values for all active audio sessions

    The Windows Vista volume mixer shows a peak meter for the device. In Windows 7 we added a peak meter for each application. The audio interface for both is IAudioMeterInformation; I've used this before in my post about the linearity of Windows volume APIs. This post showed how an application can...
  • Blog Post: Sample: how to enumerate waveIn and waveOut devices on your system

    This shows how to call waveInGetNumDevs, waveInGetDevCaps, waveOutGetNumDevs, and waveOutGetDevCaps.

        // main.cpp
        #include <windows.h>
        #include <mmsystem.h>
        #include <stdio.h>

        #define LOG(format, ...) wprintf(format L"\n", __VA_ARGS__)

        int _cdecl wmain() {
            UINT devs...
  • Blog Post: Welcome Yuk Lai Suen to the blogosphere!

    Please help to welcome my colleague and fellow Windows Sound team test dev Yuk Lai Suen to the blogosphere! Yuk Lai's first post discusses the utility of manual testing. Yuk Lai Suen http://blogs.msdn.com/b/yuk_lai_suen/
  • Blog Post: Beep sample

    A question came in today about the Beep(...) API not being able to set the frequency of the beep that was generated. In order to confirm that it worked I whipped up a quick sample which would take the frequency (and duration) on the command line. Source and binaries attached. For fun I added the...
  • Blog Post: How to validate and log a WAVEFORMATEX

    Last time I wrote about how to query audio endpoint properties. This time, a few words about WAVEFORMATEX structures. WAVEFORMATEX is a variable-sized structure - it can be as small as a PCMWAVEFORMAT or larger than a WAVEFORMATEXTENSIBLE. If the wFormatTag member (the first field) is WAVE_FORMAT_PCM...
  • Blog Post: How to enumerate audio endpoint (IMMDevice) properties on your system

    Source and binaries (amd64 and x86) attached. Pseudocode:

        CoCreateInstance(..., &pMMDeviceEnumerator);
        pMMDeviceEnumerator->EnumAudioEndpoints(..., &pMMDeviceCollection);
        for (each device in the collection) {
            pMMDevice->OpenPropertyStore(..., &pPropertyStore);
            for (each property...
  • Blog Post: Linearity of Windows volume APIs - render session and stream volumes

    We have talked about some of the volume APIs Windows exposes. We have also talked about what it means for a volume control to be linear in magnitude, linear in power, or linear in dB. We have also talked about how to read IAudioMeterInformation and how the limiter can attenuate full-scale signals...
  • Blog Post: Linearity of Windows volume APIs - IAudioMeterInformation and full-scale signals

    We have talked about some of the volume APIs Windows exposes. We have also talked about what it means for a volume control to be linear in magnitude, linear in power, or linear in dB. The attachment to this blog post contains: An app I wrote to exercise the IAudioStreamVolume, ISimpleAudioVolume...
  • Blog Post: Basic audio volume theory

    Last time we talked about the different Windows Audio Session APIs for setting volume. Let's talk a little about what volume means. For purposes of illustration, let's take our signal to be a full-scale square wave: Incidentally, the answer to the exercise completely characterize the set...