The Windows Phone SDK includes a Windows.Phone.Media.Devices.AudioRoutingManager API which I had occasion to use.

The API allows apps that have communication audio streams (e.g., Voice over IP calls) to control whether the audio goes out over the earpiece, over the speakerphone, or over the Bluetooth headset. This might be done automatically, or might be used to power "Speakerphone" and "Bluetooth" buttons in the app UI.

The starting point is a GetDefault() method which gives you the singleton AudioRoutingManager object.

There are three ways to get information out of this object:

  1. A read-only AvailableAudioEndpoints property tells you the list of currently available audio outputs.
  2. A GetAudioEndpoint method tells you what the current audio output is.
  3. An AudioEndpointChanged callback tells you when either of the previous two things changes.

You can also tell the object to change something:

  1. SetAudioEndpoint(…) lets you tell Windows Phone where audio should come out, subject to some restrictions.

There are two enumerated types used by these methods:

  1. AvailableAudioRoutingEndpoints, which is the type of the AvailableAudioEndpoints property. This is a "flags"-style (multi-valued) enum with the following values:
    • None
    • Earpiece
    • Speakerphone
    • Bluetooth
  2. AudioRoutingEndpoint, which is returned by GetAudioEndpoint and is the sole argument for SetAudioEndpoint. This is a single-valued enum with the following values:
    • Default
    • Earpiece
    • Speakerphone
    • Bluetooth
    • WiredHeadset
    • WiredHeadsetSpeakerOnly
    • BluetoothWithNoiseAndEchoCancellation
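
The practical difference between the two enums is that a flags-style value can hold several outputs at once, so you test it with a bitwise AND rather than with ==. Here is a minimal sketch of that. It uses a stand-in enum declared locally so that it compiles off-device; the numeric values are my assumption for illustration, not necessarily the SDK's, and Describe and HasBluetooth are hypothetical helper names.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the SDK's flags-style enum. Numeric values are assumed.
[Flags]
public enum AvailableAudioRoutingEndpoints
{
    None = 0,
    Earpiece = 1,
    Speakerphone = 2,
    Bluetooth = 4
}

public static class EndpointFlags
{
    // Hypothetical helper: list the individual outputs packed into one flags value.
    public static List<string> Describe(AvailableAudioRoutingEndpoints available)
    {
        var names = new List<string>();
        foreach (AvailableAudioRoutingEndpoints flag in
                 Enum.GetValues(typeof(AvailableAudioRoutingEndpoints)))
        {
            // Skip None (it has no bits) and keep every flag that is set.
            if (flag != AvailableAudioRoutingEndpoints.None && (available & flag) == flag)
            {
                names.Add(flag.ToString());
            }
        }
        return names;
    }

    // Hypothetical helper: the bit test used throughout this post.
    public static bool HasBluetooth(AvailableAudioRoutingEndpoints available)
    {
        return (available & AvailableAudioRoutingEndpoints.Bluetooth) != 0;
    }
}
```

On a device you would pass the manager's AvailableAudioEndpoints property to helpers like these instead of a hand-built value.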

At first I found this very confusing. SetAudioEndpoint takes an AudioRoutingEndpoint, but which value do I pass to it? And why does GetAudioEndpoint always tell me "Speakerphone"?

After experimenting and chatting with the folks who own the API I was able to construct an internal mental model which made more sense to me.

  1. While communications audio is playing, the phone has an audio routing policy. Imagine an AudioRoutingPolicy write-only property with the following values:
    • I'm flexible: play to the first available of { wired headset, Bluetooth device, earpiece }
    • Gimme Bluetooth: play to the first available of { Bluetooth device, wired headset, earpiece }
    • No Bluetooth: play to the first available of { wired headset, earpiece }
    • Speakerphone: play to the built-in speaker
  2. To change this policy, the app needs either the ID_CAP_VOIP or the ID_CAP_VOICEMAIL capability. (The documentation refers to an ID_CAP_AUDIOROUTING capability, but this does not exist.)
    Then call:
        var audioRoutingManager = Windows.Phone.Media.Devices.AudioRoutingManager.GetDefault();
        audioRoutingManager.SetAudioEndpoint(x);

    where x is as follows:
    • x = AudioRoutingEndpoint.Bluetooth sets the policy to Gimme Bluetooth
    • x = AudioRoutingEndpoint.Earpiece sets the policy to No Bluetooth
    • x = AudioRoutingEndpoint.Speakerphone sets the policy to Speakerphone
  3. There is no direct way to set the policy to I'm flexible.
  4. There is no direct way to tell what the current value of the AudioRoutingPolicy is. You can sometimes guess, though, based on the value of GetAudioEndpoint and/or AvailableAudioEndpoints.
    • If GetAudioEndpoint is AudioRoutingEndpoint.Speakerphone, then the current policy must be Speakerphone.
    • If GetAudioEndpoint is AudioRoutingEndpoint.Earpiece and AvailableAudioEndpoints & AvailableAudioRoutingEndpoints.Bluetooth is set, then the current policy must be No Bluetooth.
    • If GetAudioEndpoint is AudioRoutingEndpoint.WiredHeadset and AvailableAudioEndpoints & AvailableAudioRoutingEndpoints.Bluetooth is set, then the current policy must be either I'm flexible or No Bluetooth.
    • If GetAudioEndpoint is AudioRoutingEndpoint.Bluetooth or AudioRoutingEndpoint.BluetoothWithNoiseAndEchoCancellation, then AvailableAudioEndpoints & AvailableAudioRoutingEndpoints.Bluetooth must be set, and the current policy must be either I'm flexible or Gimme Bluetooth.
  5. When there are no audio communications streams, the policy is undefined.
  6. When the number of audio communications streams goes from zero to one (perhaps as the result of a phone call or VoIP call), the policy is reset/defaulted to I'm flexible or No Bluetooth depending on the details. This means you shouldn't bother setting a policy until after your phone call starts playing audio.
  7. When a device is connected (a Bluetooth device is connected, or a wired headset is plugged in) the policy is reset to I'm flexible. This (usually) results in audio switching to the new device, which is usually what the user wants.
    (If a device is removed, on the other hand, the policy is not reset.)
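
Items 2 and 4 of the list above can be written down as code. This is a sketch of my mental model only, not of anything the SDK exposes: AudioRoutingPolicy, PolicyModel, PolicyRequestedBy, and GuessPolicy are all hypothetical names, and AudioRoutingEndpoint is a stand-in declared locally so the sketch compiles off-device. The policy enum is flags-style so that a guess can carry more than one candidate, and Unknown means "cannot tell". Values that item 2 does not list are simply rejected here with an ArgumentException, which is a placeholder of my choosing.

```csharp
using System;

// Stand-in for the SDK enum so the sketch compiles off-device.
public enum AudioRoutingEndpoint
{
    Default, Earpiece, Speakerphone, Bluetooth,
    WiredHeadset, WiredHeadsetSpeakerOnly, BluetoothWithNoiseAndEchoCancellation
}

// The imaginary policy from the mental model (hypothetical, not an SDK type).
[Flags]
public enum AudioRoutingPolicy
{
    Unknown = 0, Flexible = 1, GimmeBluetooth = 2, NoBluetooth = 4, Speakerphone = 8
}

public static class PolicyModel
{
    // Item 2: which policy a SetAudioEndpoint(x) call asks for.
    public static AudioRoutingPolicy PolicyRequestedBy(AudioRoutingEndpoint x)
    {
        switch (x)
        {
            case AudioRoutingEndpoint.Bluetooth: return AudioRoutingPolicy.GimmeBluetooth;
            case AudioRoutingEndpoint.Earpiece: return AudioRoutingPolicy.NoBluetooth;
            case AudioRoutingEndpoint.Speakerphone: return AudioRoutingPolicy.Speakerphone;
            default:
                // Placeholder: item 2 does not list this value.
                throw new ArgumentException("Not covered by item 2: " + x);
        }
    }

    // Item 4: candidate policies given the current output and whether the
    // Bluetooth bit is set in AvailableAudioEndpoints.
    public static AudioRoutingPolicy GuessPolicy(AudioRoutingEndpoint current, bool bluetoothAvailable)
    {
        switch (current)
        {
            case AudioRoutingEndpoint.Speakerphone:
                return AudioRoutingPolicy.Speakerphone;
            case AudioRoutingEndpoint.Earpiece:
                return bluetoothAvailable ? AudioRoutingPolicy.NoBluetooth : AudioRoutingPolicy.Unknown;
            case AudioRoutingEndpoint.WiredHeadset:
                return bluetoothAvailable
                    ? (AudioRoutingPolicy.Flexible | AudioRoutingPolicy.NoBluetooth)
                    : AudioRoutingPolicy.Unknown;
            case AudioRoutingEndpoint.Bluetooth:
            case AudioRoutingEndpoint.BluetoothWithNoiseAndEchoCancellation:
                return AudioRoutingPolicy.Flexible | AudioRoutingPolicy.GimmeBluetooth;
            default:
                return AudioRoutingPolicy.Unknown;
        }
    }
}
```

On a device, the inputs to GuessPolicy would come from GetAudioEndpoint() and from testing AvailableAudioEndpoints against AvailableAudioRoutingEndpoints.Bluetooth.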

Here's a chart I put together on how the different states and policies interact:

The available outputs depend on whether a Bluetooth Hands-Free (HF) device is connected:
    • Connected: AvailableAudioEndpoints = Speakerphone | Earpiece | Bluetooth
    • Not connected: AvailableAudioEndpoints = Speakerphone | Earpiece

  1. I'm flexible audio routing policy
    • This policy may be automatically invoked when a call starts or a device connects.
    • You can manually invoke it with SetAudioEndpoint(Bluetooth).
    • Bluetooth connected: WiredHeadset, WiredHeadsetSpeakerOnly, Bluetooth, or BluetoothWith..., depending on what is plugged in and the capabilities of the device.
    • Bluetooth not connected: WiredHeadset, WiredHeadsetSpeakerOnly, or Earpiece, depending on what is plugged in.
  2. Gimme Bluetooth audio routing policy
    • This policy may be automatically invoked when a call starts.
    • You can manually invoke it with SetAudioEndpoint(Bluetooth).
    • Bluetooth connected: Bluetooth or BluetoothWith..., depending on the capabilities of the device.
    • Bluetooth not connected: WiredHeadset, WiredHeadsetSpeakerOnly, or Earpiece, depending on what is plugged in.
  3. No Bluetooth audio routing policy
    • You can manually invoke this policy with SetAudioEndpoint(Earpiece) or SetAudioEndpoint(Default).
    • Whether or not Bluetooth is connected: WiredHeadset, WiredHeadsetSpeakerOnly, or Earpiece, depending on what is plugged in.
  4. Speakerphone audio routing policy
    • You can manually invoke this policy with SetAudioEndpoint(Speakerphone).
    • Whether or not Bluetooth is connected: Speakerphone.
  5. Invalid audio routing policies
    • The following calls are all errors: SetAudioEndpoint(WiredHeadset), SetAudioEndpoint(WiredHeadsetSpeakerOnly), SetAudioEndpoint(BluetoothWith...).
    • Audio output: N/A. SetAudioEndpoint throws an exception.

Note that if a wired headset is plugged in, the app has no way to make audio come out of the earpiece. This is true regardless of whether Bluetooth is connected.
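
The "first available of" lists from the mental model make this easy to check mechanically. Below is a minimal sketch of the model as a pure function. Policy, Output, Router, and Resolve are hypothetical names of mine, and the sketch collapses the headset and Bluetooth variants (WiredHeadsetSpeakerOnly, BluetoothWithNoiseAndEchoCancellation) into one value each. It describes my model of the behavior, not the phone's actual implementation.

```csharp
using System;

// The four imaginary policies from the mental model (hypothetical type).
public enum Policy { Flexible, GimmeBluetooth, NoBluetooth, Speakerphone }

// Simplified outputs; device capability variants are collapsed.
public enum Output { WiredHeadset, Bluetooth, Earpiece, Speaker }

public static class Router
{
    // Pick the first available output in the priority order of each policy.
    public static Output Resolve(Policy policy, bool wiredHeadset, bool bluetooth)
    {
        switch (policy)
        {
            case Policy.Speakerphone:
                // Always the built-in speaker.
                return Output.Speaker;
            case Policy.GimmeBluetooth:
                // { Bluetooth device, wired headset, earpiece }
                if (bluetooth) return Output.Bluetooth;
                if (wiredHeadset) return Output.WiredHeadset;
                return Output.Earpiece;
            case Policy.NoBluetooth:
                // { wired headset, earpiece }
                return wiredHeadset ? Output.WiredHeadset : Output.Earpiece;
            default:
                // Flexible: { wired headset, Bluetooth device, earpiece }
                if (wiredHeadset) return Output.WiredHeadset;
                if (bluetooth) return Output.Bluetooth;
                return Output.Earpiece;
        }
    }
}
```

Calling Resolve with wiredHeadset set to true never yields Output.Earpiece for any of the four policies, with or without Bluetooth, which is the restriction described above.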

It seems like much of my confusion resulted from a single enumerated type (AudioRoutingEndpoint) serving three purposes:

  1. Tell the app where audio is coming out (WiredHeadset vs. Earpiece)
  2. Tell the app what the capabilities of the current output are (Bluetooth vs. BluetoothWithNoiseAndEchoCancellation)
  3. Allow the app to control the audio routing policy (Default vs. Speakerphone)

I think it would have been clearer to make the audio routing policy a different enumerated type from the current audio output or the available audio outputs. But with the "audio routing policy" mental model, it's not too bad.