Matthew van Eerde's web log
I am a Software Development Engineer in Test working for the Windows Sound team. You can contact me via email: mateer at microsoft dot com
Suppose you have a stereo stream that you want to downmix to mono. Why would you do this? Maybe you're playing stereo music to a Bluetooth headset that is in a call, and thus in "headset / handsfree" mode. Maybe you're capturing from a stereo mic and you want to show a visualization based on a mono reduction of the signal.
The basic formula to use is M = L / 2 + R / 2 but there are a couple of things to be aware of.
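As a minimal sketch of that formula, assume each channel is a list of floating-point samples normalized to the range [-1.0, 1.0]. The function name `downmix_to_mono` is made up for illustration and is not part of any Windows API.

```python
def downmix_to_mono(left, right):
    """Downmix two equal-length channels sample by sample: M = L / 2 + R / 2."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same number of samples")
    # Halving each channel before summing keeps the result within full scale.
    return [l / 2 + r / 2 for l, r in zip(left, right)]


mono = downmix_to_mono([1.0, 0.5, -1.0], [1.0, -0.5, 0.0])
print(mono)  # [1.0, 0.0, -0.5]
```

Halving each channel before adding, rather than adding and then halving, means the intermediate value can never exceed full scale. That matters more for integer sample formats than for the floats used here.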
Consider the simplest case first where the left and right channels are identical. Naturally, the resulting mono signal is identical to both of the channels.
In particular, downmixing a stereo signal of identical full scale sine waves (power = -3 dB FS) results in a -3 dB FS mono sine wave. Well and good.
This, however, is a simple coincidence few could ever have counted upon. (A similar effect is the lack of spectral leakage if your signal period exactly matches up to your FFT window period.) As a rule, downmixing results in a loss of power. To get a basic idea of why this is, let's take two different sine waves and downmix to a two-tone.
Note that downmixing these two totally uncorrelated signals results in a 3 dB loss of power: the two-tone has a power of -6 dB FS, which is 3 dB lower than each of the individual -3 dB FS signals that went into it.
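Both cases can be checked numerically. The sketch below is not the code behind the original measurements. It assumes a 48 kHz sample rate and arbitrarily chosen 1000 Hz and 1500 Hz tones, and it measures power as mean-square relative to 1.0, which is the convention that puts a full-scale sine at about -3 dB FS. The helper names are hypothetical.

```python
import math

SAMPLE_RATE = 48000  # assumed sample rate in Hz


def sine(frequency_hz, seconds=1.0):
    """Full-scale sine wave as a list of samples in [-1.0, 1.0]."""
    count = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE)
            for i in range(count)]


def downmix_to_mono(left, right):
    """M = L / 2 + R / 2, sample by sample."""
    return [l / 2 + r / 2 for l, r in zip(left, right)]


def power_db_fs(samples):
    """Mean-square power in dB, where a mean square of 1.0 is 0 dB FS."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


tone_a = sine(1000)  # arbitrary choice of frequency
tone_b = sine(1500)  # a different, uncorrelated tone

identical = power_db_fs(downmix_to_mono(tone_a, tone_a))
two_tone = power_db_fs(downmix_to_mono(tone_a, tone_b))

print(round(power_db_fs(tone_a), 2))  # -3.01 (each input channel)
print(round(identical, 2))            # -3.01 (identical channels: no loss)
print(round(two_tone, 2))             # -6.02 (uncorrelated channels: 3 dB loss)
```

Both frequencies complete a whole number of cycles in the one-second buffer, so the two tones are exactly uncorrelated over the measurement window.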
It is tempting to conclude that mixing two signals of power P gives a resultant signal of power between P - 3 dB and P, depending on the degree of correlation. However, this conclusion is incorrect: signals can be correlated, uncorrelated, or anticorrelated.
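A short derivation shows the full range. It assumes both channels are zero-mean and have the same power P. The overline denotes a time average, and ρ is the correlation coefficient between L and R. Both notations are introduced here for illustration and do not appear in the original post.

```latex
\begin{aligned}
M &= \frac{L}{2} + \frac{R}{2} \\
\overline{M^2} &= \frac{\overline{L^2} + \overline{R^2} + 2\,\overline{LR}}{4}
               = \frac{P + P + 2\rho P}{4}
               = P \cdot \frac{1 + \rho}{2} \\
P_M\,[\mathrm{dB}] &= P\,[\mathrm{dB}] + 10 \log_{10} \frac{1 + \rho}{2}
\end{aligned}
```

With ρ = 1 (identical channels) the correction term is 0 dB. With ρ = 0 (uncorrelated channels) it is about -3 dB. As ρ approaches -1 (anticorrelated channels) it falls without bound, so the true range of the mixed power is everything up to P, not just P - 3 dB to P.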
Once in a while you get a stereo microphone which captures heavily correlated L and R channels but, for one reason or another, inverts one of the channels. Instead of being heavily correlated, the L and R signals are now heavily anticorrelated. This is bad enough when you try to listen to it, but when you downmix to mono, the signal disappears!
The effect with a "real" stereo signal is somewhat less dramatic because the two channels the microphone captures are very highly correlated, but not perfectly correlated. So the downmix to mono only almost totally destroys the signal.
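The cancellation is easy to reproduce. The sketch below uses the same assumptions as before: a 48 kHz sample rate, mean-square power relative to 1.0, and hypothetical helper names. A 10% gain mismatch between the channels stands in for a "real" imperfectly correlated capture. Actual microphone signals differ in more ways than gain.

```python
import math

SAMPLE_RATE = 48000  # assumed sample rate in Hz


def sine(frequency_hz, seconds=1.0):
    """Full-scale sine wave as a list of samples in [-1.0, 1.0]."""
    count = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE)
            for i in range(count)]


def downmix_to_mono(left, right):
    """M = L / 2 + R / 2, sample by sample."""
    return [l / 2 + r / 2 for l, r in zip(left, right)]


def power_db_fs(samples):
    """Mean-square power in dB; digital silence is reported as -inf."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return -math.inf
    return 10 * math.log10(mean_square)


left = sine(1000)

# Perfect inversion: the right channel is exactly the negated left channel.
inverted = [-s for s in left]
silent = power_db_fs(downmix_to_mono(left, inverted))

# Imperfect inversion: a hypothetical 10% gain mismatch between channels.
nearly_inverted = [-0.9 * s for s in left]
residue = power_db_fs(downmix_to_mono(left, nearly_inverted))

print(silent)             # -inf: the signal disappears entirely
print(round(residue, 1))  # -29.0: about 26 dB below each -3 dB FS input
```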