# Matthew van Eerde's web log

• #### Phase and stereo-to-mono downmix

Last time we looked at how to downmix a stereo signal to mono (M = L/2 + R/2).  The resulting M signal can only be as powerful as the L and R signals if they are perfectly correlated (L = R); if the L and R signals are uncorrelated (no relationship between L and R), the resulting M signal is only half as powerful; and if the L and R signals are anticorrelated (L = -R), the resulting M signal vanishes.

This time I want to look more closely at what happens to M as {L, R} varies from correlated via uncorrelated to anticorrelated.  In particular I'll take L and R to be full-scale sine waves of equal frequency and offset by a constant phase: {L, R} = {sin θ, sin(θ - φ)}.  We'll see what happens to IM / IL as a function of φ.

Recall from before that the intensity of a signal s(t) over a time interval (a, b) is defined as

where the dc value is

Our downmixed signal is defined as

So our dc is zero (note that I am taking (a, b) to be a single period:)

A refresher on the angle-sum identities, which I will use without proof:

• sin(A + B) = sin A cos B + cos A sin B
• Corollary: sin(A - B) = sin A cos B - cos A sin B
• cos(A + B) = cos A cos B - sin A sin B
• Corollary: cos(A - B) = cos A cos B + sin A sin B

We know from before that IL = IR = sqrt(1/2) = 0.707.... Let's calculate IM as a function of φ:

Nice and simple.

Are we done? No.

When you see a fact, try to see it as intuitively as possible -- Pólya

OK, let's graph this.

That looks suspiciously like a cosine in absolute value.  In particular, after messing around with a graphing calculator, it looks like:

Is this in fact a trigonometric identity?  Let's see...

Much nicer and simpler.  So why didn't our original derivation land us here?

Let's go back to our original graph and shift our frame of reference forward by φ/2.   Instead of {L, R} = {sin θ, sin(θ - φ)} we get {L, R} = {sin(θ + φ/2), sin(θ - φ/2)}.  Solving for IM now gives:

This result is much easier to digest.  The sqrt(1/2) is the intensity of the pre-mixed signal; the loss of intensity due to the downmixing, then, is |cos(φ/2)|.

• #### How to enumerate DirectShow filters on your system

Last time I showed how to enumerate Media Foundation transforms.  This time, DirectShow filters.  Source and binaries attached.

In pseudocode:

GUID directshow_categories[] = { ... }; // static list
ICreateDevEnum = CoCreate(...);
for (each category in directshow_categories) {
display the category name and GUID

IEnumMoniker = ICreateDevEnum::CreateClassEnumerator(category);

for (each IMoniker that IEnumMoniker finds) {
IBindCtx = CreateBindCtx(...);
IPropertyBag = IMoniker::BindToStorage(IBindCtx);

pull the friendly name from the IPropertyBag and display it
}
}

Output of the tool on my system:

>devenum.exe
-- Audio Capture Sources ({33D9A762-90C8-11D0-BD43-00A0C911CE86}) --
Front Microphone (High Definiti
Line In (High Definition Audio
Rear Microphone (High Definitio

-- Audio Compressors ({33D9A761-90C8-11D0-BD43-00A0C911CE86}) --
WM Speech Encoder DMO
WMAudio Encoder DMO
PCM
GSM 6.10
CCITT A-Law
CCITT u-Law
MPEG Layer-3

-- Audio Renderers ({E0F158E1-CB04-11D0-BD4E-00A0C911CE86}) --
Speakers (High Definition Audio
Default DirectSound Device
Default WaveOut Device
DirectSound: Speakers (High Definition Audio Device)

-- Device Control Filters ({CC7BFB46-F175-11D1-A392-00E0291F3959}) --

-- DirectShow Filters ({083863F1-70DE-11D0-BD40-00A0C911CE86}) --
WMAudio Decoder DMO
WMAPro over S/PDIF DMO
WMSpeech Decoder DMO
MP3 Decoder DMO
Mpeg4s Decoder DMO
WMV Screen decoder DMO
WMVideo Decoder DMO
Mpeg43 Decoder DMO
Mpeg4 Decoder DMO
DV Muxer
Color Space Converter
AVI Splitter
VGA 16 Color Ditherer
SBE2MediaTypeProfile
Microsoft DTV-DVD Video Decoder
AC3 Parser Filter
StreamBufferSink
Microsoft TV Captions Decoder
MJPEG Decompressor
CBVA DMO wrapper filter
MPEG-I Stream Splitter
SAMI (CC) Parser
VBI Codec
MPEG-2 Splitter
Closed Captions Analysis Filter
SBE2FileScan
Microsoft MPEG-2 Video Encoder
Internal Script Command Renderer
MPEG Audio Decoder
DV Splitter
Video Mixing Renderer 9
Microsoft MPEG-2 Encoder
ACM Wrapper
Video Renderer
MPEG-2 Video Stream Analyzer
Line 21 Decoder
Video Port Manager
Video Renderer
VPS Decoder
WM ASF Writer
VBI Surface Allocator
File writer
iTV Data Sink
iTV Data Capture filter
DVD Navigator
Microsoft TV Subtitles Decoder
Overlay Mixer2
AVI Draw
RDP DShow Redirection Filter
Microsoft MPEG-2 Audio Encoder
WST Pager
MPEG-2 Demultiplexer
DV Video Decoder
SampleGrabber
Null Renderer
MPEG-2 Sections and Tables
Microsoft AC3 Encoder
StreamBufferSource
Smart Tee
Overlay Mixer
AVI Decompressor
NetBridge
AVI/WAV File Source
Wave Parser
MIDI Parser
Multi-file Parser
File stream renderer
Microsoft DTV-DVD Audio Decoder
StreamBufferSink2
AVI Mux
Line 21 Decoder 2
File Source (Async.)
File Source (URL)
Media Center Extender Encryption Filter
AudioRecorder WAV Dest
AudioRecorder Wave Form
SoundRecorder Null Renderer
Infinite Pin Tee Filter
Enhanced Video Renderer
BDA MPEG2 Transport Information Filter
MPEG Video Decoder

-- External Renderers ({CC7BFB41-F175-11D1-A392-00E0291F3959}) --

-- Midi Renderers ({4EFE2452-168A-11D1-BC76-00C04FB9453B}) --
Default MidiOut Device
Microsoft GS Wavetable Synth

-- Video Capture Sources ({860BB310-5D01-11D0-BD3B-00A0C911CE86}) --

-- Video Compressors ({33D9A760-90C8-11D0-BD43-00A0C911CE86}) --
WMVideo8 Encoder DMO
WMVideo9 Encoder DMO
MSScreen 9 encoder DMO
DV Video Encoder
MJPEG Compressor
Cinepak Codec by Radius
Intel IYUV codec
Intel IYUV codec
Microsoft RLE
Microsoft Video 1

-- WDM Stream Decompression Devices ({2721AE20-7E70-11D0-A5D6-28DB04C10000}) --

-- WDM Streaming Capture Devices ({65E8773D-8F56-11D0-A3B9-00A0C9223196}) --
HD Audio Microphone
HD Audio Muxed capture

-- WDM Streaming Crossbar Devices ({A799A801-A46D-11D0-A18C-00A02401DCD4}) --

-- WDM Streaming Rendering Devices ({65E8773E-8F56-11D0-A3B9-00A0C9223196}) --
HD Audio Speaker

-- WDM Streaming Tee/Splitter Devices ({0A4252A0-7E70-11D0-A5D6-28DB04C10000}) --
Tee/Sink-to-Sink Converter

-- WDM Streaming TV Audio Devices ({A799A802-A46D-11D0-A18C-00A02401DCD4}) --

-- WDM Streaming TV Tuner Devices ({A799A800-A46D-11D0-A18C-00A02401DCD4}) --

-- WDM Streaming VBI Codecs ({07DAD660-22F1-11D1-A9F4-00C04FBBDE8F}) --

-- WDM Streaming Communication Transforms ({CF1DDA2C-9743-11D0-A3EE-00A0C9223196}) --
Tee/Sink-to-Sink Converter

-- WDM Streaming Data Transforms ({2EB07EA0-7E70-11D0-A5D6-28DB04C10000}) --

-- WDM Streaming Interface Transforms ({CF1DDA2D-9743-11D0-A3EE-00A0C9223196}) --

-- WDM Streaming Mixer Devices ({AD809C00-7B88-11D0-A5D6-28DB04C10000}) --

-- BDA Network Providers ({71985F4B-1CA1-11D3-9CC8-00C04F7971E0}) --
Microsoft ATSC Network Provider
Microsoft DVBC Network Provider
Microsoft DVBS Network Provider
Microsoft DVBT Network Provider
Microsoft Network Provider

-- BDA Receiver Components ({FD0A5AF4-B41D-11D2-9C95-00C04F7971E0}) --

-- BDA Rendering Filters ({71985F4A-1CA1-11D3-9CC8-00C04F7971E0}) --

-- BDA Source Filters ({71985F48-1CA1-11D3-9CC8-00C04F7971E0}) --

-- BDA Transport Information Renderers ({A2E3074F-6C3D-11D3-B653-00C04F79498E}) --
BDA MPEG2 Transport Information Filter
MPEG-2 Sections and Tables

-- Video Effects (1 input) ({CC7BFB42-F175-11D1-A392-00E0291F3959}) --
BasicImage
Convolution
Chroma
Matrix
Pixelate
DxtAlphaSetter Class
Text Label Class
Scale
Blur
Glow
ICMFilter
Alpha
Wave
MotionBlur
Emboss
Engrave
Light

-- Video Effects (2 inputs) ({CC7BFB43-F175-11D1-A392-00E0291F3959}) --
CrBlinds
Iris
ZigZag
RandomBars
CrIris
Spiral
Pixelate
Wheel
Strips
CrStretch
Inset
CrSlide
CrInset
Compositor
Blinds
CrSpiral
Wipe
CheckerBoard
DxtCompositor Class
CrBarn
DxtKey Class
Slide
DxtJpeg Class
CrZigzag
Barn
Stretch
RandomDissolve

-- EncAPI Encoders ({7D22E920-5CA9-4787-8C2B-A6779BD11781}) --
Microsoft MPEG-2 Audio Encoder
Microsoft MPEG-2 Video Encoder

-- EncAPI Multiplexers ({236C9559-ADCE-4736-BF72-BAB34E392196}) --
Microsoft MPEG-2 Encoder

• #### How to calculate a sine sweep - the right way

Three quarters are better than a dollar because they make noise!
-- Lilly, Lilly's Purple Plastic Purse

Last time I talked about how to calculate a sine sweep mathematically.  There's a closed-form solution which is quite elegant but, alas, useless in a practical computer implementation due to the very big numbers that are being fed to the sin(...) function.

A computer implementation will care only about the discrete samples that need to be generated.  The integral in the previous post turns out to be counterproductive - we're much more interested in the infinite sequence of φi, and more particularly in their residue mod 2π.

Here are Matlab scripts that implement a sine sweep both ways:

First, the naïve mathematical solution that just calculates the exponential:

function signal = sinesweep_nostate( ...
start, ...
finish, ...
seconds, ...
sample_rate, ...
amplitude, ...
initial_phase, ...
dc ...
)
% sinesweep_nostate returns a single-channel sine logarithmic sweep
%     DO NOT USE THIS FUNCTION IN PRODUCTION
%     THIS IS A PEDAGOGICAL EXERCISE ONLY
%     THE SIGNAL THIS PRODUCES GRADUALLY DISTORTS
%     INSTEAD USE sinesweep
%
%     start: starting frequency in Hz
%     finish: ending frequency in Hz
%     seconds: duration of sweep in seconds
%     samplerate: samples per second
%     amplitude: amplitude
%     initial_phase: starting phase
%     dc: dc

% echo these interesting intermediate values to the console
R = (finish / start) ^ (1 / seconds)
B = 2 * pi * start / log(R)
A = initial_phase - B

time = 0 : 1 / sample_rate : seconds;
phase = A + B * R .^ time;
signal = amplitude * sin(phase) + dc;

end % sinesweep_nostate

Now, the iterative version that adds up the little Δφs to calculate φi iteratively:

function signal = sinesweep( ...
start, ...
finish, ...
seconds, ...
sample_rate, ...
amplitude, ...
initial_phase, ...
dc ...
)
% sinesweep returns a single-channel sine logarithmic sweep
%     start: starting frequency in Hz
%     finish: ending frequency in Hz
%     seconds: duration of sweep in seconds
%     samplerate: samples per second
%     amplitude: amplitude
%     initial_phase: starting phase
%     dc: dc

time = 0 : 1 / sample_rate : seconds;
frequency = exp( ...
log(start) * (1 - time / seconds) + ...
log(finish) * (time / seconds) ...
);

phase = 0 * time;
phase(1) = initial_phase;
for i = 2:length(phase)
phase(i) = ...
mod(phase(i - 1) + 2 * pi * frequency(i) / sample_rate, 2 * pi);
end

signal = amplitude * sin(phase) + dc;

end % sinesweep

If anything, one would expect the first to be more accurate, since it is mathematically precise, and the second is only an approximation.  But let's see how the signal produced by each of them holds up.

I calculate, and plot, the same sweep both ways, with a slight dc offset to facilitate seeing the different sweeps on the same plot.  In particular this is a logarithmic sweep from 20 kHz to 21 kHz, over 30 seconds, using 44100 samples per second, an amplitude of 0.2, an initial phase of π, and a dc of 0.25 for one and 0.26 for the other:

plot(...
x, sinesweep(20000, 21000, 30, 44100, 0.2, pi, 0.25), 'd', ...
x, sinesweep_nostate(20000, 21000, 30, 44100, 0.2, pi, 0.26), 's' ...
)

(x is just 1:30*44100 - necessary to allow a single plot statement.)

Note that the approximate method is plotted in diamonds, and will be 0.01 lower on the resulting graph than the "exact" method, plotted in squares.

(This takes a while to plot because there are a heckuva lot of points...)

Ah, a nice solid mass of green and blue.  I won't show it here.

OK, let's zoom in to a very early chunk of the signal - I expect these to match up closely, with only the dc offset separating the points:

Yup.  Looks pretty good - give or take a pixel (quantization error in the display.)

Now let's see what happens to the signal towards the end of the 30 seconds.

Ick.  Something's rotten in the state of Denmark.  This distortion is so bad that it's visible - you don't even have to take an FFT to see it.

Exercise: take an FFT and look at the distortion.

Advanced Exercise: demonstrate that the distortion is, in fact, in the "exact" method and not in the iterative method... that is to say, show which clock is right.

EDIT: On second glance it looks like the second picture is just showing a horizontal shift between the two signals, which is expected.  I'll need to dig deeper if I'm to prove my assertion that the iterative method produces less distortion than the "exact" method.

• #### How to calculate a sine sweep - the wrong way

Suppose you want to generate a continuous sine-sweep from time tstart to time tend. You want the starting frequency to be ωstart, and the ending frequency to be ωend; you want the sweep to be logarithmic, so that octaves are swept out in equal times. (The alternative would be to sweep linearly, but usually logarithmic sweeping is what you want.) For today we're going to have continuous time and sample values, so sample rate and bit depth will not be part of this discussion; in particular, the units of t will be seconds, and our sample values will go from -1 to 1.

Here's a picture of want we want to end up with:

Without loss of generality, set tstart = 0. This simplifies some of the math.

Let ω(t) be the "instantaneous frequency" at time t. Advanced Exercise: Define "instantaneous frequency" mathematically.

Sweeping ω(t) logarithmically means that log ω(t) is swept linearly from log ωstart to log ωend. So

Quick check: ω(0) = ωstart, ω(tend) = ωend. Yup.

Let's define R (the rate of change of the frequency) as (ωend / ωstart)1 / tend. This formula reduces to

ω(t) = ωstart Rt

Next step is to look at phase. For a sine wave of constant frequency ω, phase progresses linearly with t on the circular interval [0, 2π):
φ = k ω t
for some constant k. If the frequency of a sine wave was 1 cycle per second, the phase would go from 0 to 2π in a single second - this reveals that the correct value of k is 2π:
φ = 2π ω t

What happens with a variable frequency? A little calculus provides the answer. Consider a small change Δt, small enough so that ω(t) is almost constant over the interval:
Δφ ≈ 2π ω(t) Δt
because over that interval the sine sweep function is well approximated by a sine wave of constant frequency ω(t).

Starting from an initial phase φstart, let Δt → 0, and:

Substitute:
u = τ log R
τ = u / log R
= du / log R

Define:
B = 2π ωstart / log R
A = φstart - B
and we have:

φ(t) = A + BRt

Just for fun, here's the explicit solution without intermediate values:

Now, the payoff: the signal is just the sine of the phase, so our sine sweep is:

or, more explicitly:

Ah, beautiful mathematically.

But useless practically.

Why useless?

Well, note that our equation for phase is an exponential. That expression inside the sin(...) gets very big, very quickly. Any practical implementation of sin(...) is going to be utterly useless once its argument gets beyond a certain threshold - you'll get ugly chunky steps before you're halfway into the sweep.

Nonetheless, there is a practical way to generate (pretty good) sine sweeps. More on that next time.

EDIT: all logs are base e.

• #### The case of the default audio device from the future

Hover the mouse over each picture (xkcd-style) in turn to get the text that tells the story.

An amusing behavior of Windows Vista recently came to my attention... sometimes the Sound control panel wants to go back... to the future!

So far, so good.

D'oh! Luckily, there's a way to recover...

Voilà tout.
• #### xargs start

Like many programmers, I've messed around in a lot of development environments - including the UNIX/Linux family of operating systems.

One of the UNIX commands I like is xargs... this is very handy when writing one-off command lines or scripts.

I like it so much that I miss it terribly when I'm in a Windows OS.  So I wrote an xargs.bat and stuck it in my PATH.  Now instead of writing complicated for loops, I can just pipe to xargs.

The script is "powered by" the highlighted line:

C:\Users\MatEer\Desktop\custom-path>type xargs.bat
@echo off

setlocal

if /i (%1)==(/?) goto USAGE

if /i (%1)==() goto USAGE

goto NOQUOTES

:USAGE
echo usage: something-that-produces-output ^| %0 [/?] [/addquotes] thing-to-run
echo   xargs.bat by Matthew van Eerde 10/3/2005
echo.
echo   something-that-produces-output should write to its STDOUT
echo   thing-to-run will have each line of the output appended to it,
echo   then will be run successively
echo.
echo   If /addquotes is set, quotes will be added around the line
echo   before appending the line to thing-to-run
echo.
echo   If you call xargs without piping output to it, xargs will wait
echo   for you to type something in on STDIN.
echo   Ctrl-Z on its own line to finish.
goto END

rem eat /addquotes parameter
shift

rem Alas, shift doesn't affect %*
if (%1)==() goto USAGE
set basecommand=%1
shift

:BUILDBASECOMMAND
if (%1)==() goto DONEBASECOMMAND
set basecommand=%basecommand% %1
shift
goto BUILDBASECOMMAND
:DONEBASECOMMAND

rem run the program specified by %*
rem as many times as there are lines in STDIN
rem with one extra argument -- defined by each line of STDIN -- in quotes
rem
rem all that the find command does is intercept STDIN
rem
for /F "usebackq delims=" %%a in (`find /v ""`) do %basecommand% "%%a"

goto END

:NOQUOTES

rem run the program specified by %*
rem as many times as there are lines in STDIN
rem with extra arguments defined by each line of STDIN
rem
rem all that the find command does is intercept STDIN
rem
for /F "usebackq delims=" %%a in (`find /v ""`) do call %* %%a

goto END

:END

endlocal

This allows wonderfully anaOStic things like findstr /m IFoo * | xargs start

• #### Nitpicking Sam Loyd's "Organ Pipes" mate in two

Sam Loyd was a wacky guy.  His legacy of puzzles is well worth digging through.  (I'd stay away from the 15/14 puzzle though.)

A sprinkling of his puzzles were chess-related, and plenty of them stand up well in the greater lexicon of chess puzzles.  For example, his "Organ Pipes" mate-in-two problem is quite well known:

White to move and mate in two (Sam Loyd, 1859)

(If you haven't seen it before, go ahead and try it out.  It's a cute little interference problem.)

OK, now we get to it.

From a technical point of view, there's a slight flaw here.  Ideally, in a chess problem Black has very many options, and White has only a single path to victory in all variations.

This is not the case here.  If the Bishop on f8 wanders away (1. ... Bg7, 1. ... Bh6) or is blocked by the Rook on e8 (1. ... Re7) then White has two mates: 2. Qb6# or 2. Qxb4#.

 1. Qa5 1. ... Bb7 2. Nf5# 1. ... Bd7 2. Qd5# 1. ... Be6 2. Qe5# 1. ... Bf5 2. Nxf5# 1. ... Rd7 2. Nf5# 1. ... Rd6 2. Qxb4# 1. ... Rd5 2. Qxd5# 1. ... Re7 2. Qb6# or 2. Qxb4# 1. ... Rd6 2. Nf5# 1. ... Re5 2. Qxe5# 1. ... Bc5 2. Qa1# 1. ... Bd6 2. Qd5# 1. ... Be7 2. Qe5# 1. ... Bg7 2. Qb6# or 2. Qxb4# 1. ... Bh6 2. Qb6# or 2. Qxb4#

Luckily, the problem can be easily "fixed" - add a black pawn on a7:

White to move and mate in two
(Sam Loyd, 1859; version by Matthew van Eerde, 2008)

This covers the b6 square.  Now the only mate is to take on b4.

Beauty is a tricky beast though.  This is now a more "technically correct" problem, but the position is no longer quite as "natural"-looking.  (I quote "natural" because I've seen plenty of ugly over-the-board positions.)  The pawn on a6 must have come from b7, and the pawn on b4 must have come from c7 (or perhaps from further away.)

I'm not sure which version I prefer, but it's nice that the flaw is not serious.

• #### How much is a perfect Jeopardy! score?

Some sports and games have a notion of a "perfect score."  For example, a perfect score in bowling is 300 (twelve consecutive strikes;) a perfect score in golf would be 18 (18 holes-in-one.)

After watching Watson destroy Ken Jennings and Brad Rutter I began to wonder what a perfect score in Jeopardy! would be.

Assumptions:

• The player knows (or can guess) all the correct questions for the given answers.
• The player can consistently beat the other players to the buzzer.
• The player knows (or can guess) where all three Daily Doubles are.
• The Daily Double in the Jeopardy! round happens to be on a \$200 clue.
• The two Daily Doubles in the Double Jeopardy! round both happen to be on \$400 clues.
• The player is willing to "bet it all" at all four opportunities (the three Daily Doubles and Final Jeopardy!).

Jeopardy! has three rounds:

1. The Jeopardy! round
The player starts this round with \$0.
There are six categories with five clues each: \$200, 400, 600, 800, 1000.  That's \$3000 for each category for a nominal total of \$18,000.
But wait - one of those clues is a Daily Double!  For maximum value we're assuming it's a \$200 clue.
The player gets all the normal clues correct and amasses \$17,800.
The player then "makes it a true daily double", gets it right, and doubles their total to \$35,600.
2. The Double Jeopardy! round
The player starts this round with \$35,600.  Again, there are six categories with five clues each.  But this time the values are \$400, 800, 1200, 1600, and \$2000.
That's \$6000 for each category for a nominal total of \$36,000.
But wait - two of those clues are Daily Doubles!  For maximum value we're assuming they're both \$400 clues.
The player gets all the normal clues correct and amasses an additional \$35,200.  They now have \$70,800.
The player then selects the first of the two Daily Doubles for this round, "makes it a true daily double", gets it right, and doubles their total to \$141,600.  (Jeopardy!, unlike some game shows, does not materialize winnings at the end of a round; there's just a single running total for each player throughout the episode.)
The player then selects the last Daily Double, "makes it a true daily double", gets it right, and doubles their total to \$283,200.
3. Final Jeopardy!
The player starts this round with \$283,200.
The player looks at the category, bets everything, looks at the answer, writes down the correct question, and doubles their total to \$566,400.

Conclusion: A perfect score for a single episode of Jeopardy! is \$566,400.

The actual record appears to be Roger Craig's win of \$77,000 on September 14 2010.

• #### Generating formulae for polygonal numbers (triangular, square, pentagonal...)

The formula for square numbers is easy to remember:
sn = n2
There are also triangular numbers, pentagonal numbers, hexagonal numbers, and in general k-gonal numbers. These have formulae too... for example, triangular numbers:
tn = n (n + 1) / 2
... but they can be hard to remember.

If you see a fact, try to see it as intuitively as possible.

Some formulae are not worth memorizing - it is more worthwhile to grasp the ideas behind the formulae and rederive them when you need them (or look them up.) This blog post shows how to derive the formulae for k-gonal numbers.

Let's define a few terms. I've defined sn and tn above, and I'll define pn as the nth pentagonal number and hn as the nth hexagonal number, but let's generalize... let gk,n = g(k, n) be the nth k-gonal number. For example:
g(3, n) = tn = n (n + 1) / 2
g(4, n) = sn = n2
The task, then, is to find a formula for g(k, n) for general k and n.

Consider the nth k-gonal number. For simplicity of illustration, I will focus on a particular number, but the approach followed here should be followed while keeping the general case in mind. So, specifically, let's take a look at the 4th pentagonal number p4 = g(5, 4):

The idea is to conceptually "cut" (n – 1) specific lines...

... and then "unroll" the structure to form a kind of "bar chart":

The base of the chart is precisely one edge of the original diagram and so has length n = 4 (counting dots, not lines.) There are a couple of ways to find the height of the last bar - you can observe that each bar (not counting the bottom dot) adds (k – 2) = 3 over the bar before it (as marked below,) and there are (n – 1) = 3 nontrivial bars.
Or you can note that each bar is the bottom dot, plus the sum of (k – 2) = 3 edges, each of which has (n – 1) = 3 unique dots, if we don't count the first dot as being part of the edge.
Either way, you come up with the height being:
h = 1 + (k – 2)(n – 1) = 1 + 3(3) = 10

So we know the base and the height of the triangle. How do we find the area? Well, if we double the triangle, we get a rectangle:

The base of the rectangle is n = 4, and the height is:
h + 1 = 2 + (k – 2) (n – 1) = (k – 2) n + (4 – k) = (5 – 2) 4 + (4 – 5) = 11
The area of the triangle is half of the area of the rectangle:
g(k, n) = n ((k – 2) n + (4 – k)) / 2 = 4 ((5 – 2) 4 + (4 – 5)) / 2 = 22
Counting dots confirms that there are, in fact, 22.

So we have a general theorem - let's try it out on a few special cases. We can use the ones we know as a check.

g(3, n) = n ((3 – 2) n + (4 – 3)) / 2 = n (n + 1) / 2 = tn
g(4, n) = n ((4 – 2) n + (4 – 4)) / 2 = n2 = sn
g(5, n) = n ((5 – 2) n + (4 – 5)) / 2 = n (3n – 1) / 2 = pn
g(6, n) = n ((6 – 2) n + (4 – 6)) / 2 = n (2n – 1) = hn
g(7, n) = n ((7 – 2) n + (4 – 7)) / 2 = n (5n – 3) / 2
...
g(k, n) = n ((k – 2) n + (4 – k)) / 2
...

Further checks: g(k, 1) should be 1 for all k, and g(k, 2) should be k for all k. These also check out.

• #### Car Talk puzzler analysis - palindromic odometers

Last week's Car Talk question was again of mathematical interest - it dealt with palindromic numbers on a car's odometer. (This is not the first palindromic odometer question they've done.)

Read Car Talk's "Tommy's Drive to Work" puzzler question

Short version:

Tommy goes for a short drive (under an hour; possibly well under an hour) in his 19-year old new car. He notices that the odometer reading at the beginning and the end of the drive are both palindromes (we're given that his odometer shows six digits.) The question is, how many miles did he drive?

You're supposed to do a bunch of math, and then experience an epiphany when you realize that, against all odds, there's a single unique answer, even though the odometer readings themselves cannot be uniquely determined.

Alas, the problem is flawed. There are two perfectly good (and different) answers.

## The intended answer

The first reading is given as a palindrome - the first three digits (let's call them a, b, c) are the same as the last three digits in reverse (c, b, a). So we can write the first number as abccba.

The second reading is also given as a palindrome, and we're given that it's a different palindrome (we infer that the odometer is not broken, and he drove at least a mile.) So we can write the second number as xyzzyx.

We want to find whether it's possible to get these two readings while driving a reasonably small distance. The distance driven is the difference between the odometer readings: xyzzyxabccba.

xyzzyxabccba
= (100001 x + 10010 y + 1100 z) – (100001 a + 10010 b + 1100 c)
= 100001 (xa) + 10010 (yb) + 1100 (zc)

Because xyzzyx > abccba, we know that xa. Also, if x = a, we know that yb. Finally, if x = a and y = b, we know that z > c (z cannot equal c in this case.) Let's look at these three cases separately.

Case 1: x = a, y = b, z > c
xyzzyxabccba
= 100001 (xa) + 10010 (yb) + 1100 (zc)
= 100001 (0) + 10010 (0) + 1100 (zc)
= 1100 (zc)

z and c can range from 0 to 9, and z > c, so (zc) can range from 1 to 9. So the smallest the distance can be is 1100 miles - for example, 250052 → 259952 (1100 miles.) This is a bit far for an hour's drive.

Case 2: x = a, y > b, no conditions imposed on z vs. c
xyzzyxabccba
= 100001 (xa) + 10010 (yb) + 1100 (zc)
= 100001 (0) + 10010 (yb) + 1100 (zc)
= 10010 (yb) + 1100 (zc)

y, b, z, and c can range from 0 to 9; y > b; and no conditions are imposed on z vs. c. (yb) can range from 1 to 9, (zc) can range from -9 to 9. So the smallest the distance can be is 100101 + 1100 (-9) = 110 miles - for example, 379973 → 380083 (110 miles.) It's certainly possible to drive 110 miles in an hour but not without attracting the attention of the local police!

Case 3: x > a, no conditions imposed on y vs. b or z vs. c
xyzzyxabccba
= 100001 (xa) + 10010 (yb) + 1100 (zc)

x, a, y, b, z, and c can range from 0 to 9; x > a; and no conditions are imposed on y vs. b, or z vs. c. (xa) can range from 1 to 9, (yb) and (zc) can range from -9 to 9. So the smallest the distance can be is 100001 (1) + 10010 (-9) + 1100 (-9) = 11 miles - for example, 499994 → 500005 (11 miles.) This is much more reasonable.

The intended answer, then, is 11 miles. There are 9 possible ways to get this distance with palindromic starting and ending readings:

099990 → 100001 (11 miles)
199991 → 200002 (11 miles)
299992 → 300003 (11 miles)
399993 → 400004 (11 miles)
499994 → 500005 (11 miles)
599995 → 600006 (11 miles)
699996 → 700007 (11 miles)
799997 → 800008 (11 miles)
899998 → 900009 (11 miles)

## Not a flaw

Do we consider numbers with unmatched leading zeros (like 001221) to be palindromes? I would say no. If we did this, it would open up a whole new field of answers such as 77 → 101 (24 miles.) The problem statement seems to imply that all six digits of the number, including leading zeros, need to participate in the palindrome. We can probably infer that Tommy's odometer is an analog odometer that actually shows leading zeros, rather than a digital odometer which hides them; this is consistent with the car being 19 years old.

## The flaw

All well and good. Alas, there's a flaw in this beautiful analysis - namely, odometers wrap. It's entirely possible for the assumption that xyzzyx > abccba to break down if Tommy started his drive with a number close to 999999, and his odometer wrapped around to 000000 during the drive.

In fact, this leads to the second reasonable answer:
999999 → 000000 (1 mile.)

There are other "wrapped" palindromes, but the next smallest are 998899 → 000000 (1101 miles) and 999999 → 001100 (1101 miles) which are even worse than case 1.

Could a 19-year-old car have a million miles on it?

That comes out to an average of 144.1 miles a day, which is high, but achievable (it's only 6 mph.) It's true that Tommy is unlikely to have put this many miles on a car if he lives only a mile from work, but remember that this is his new car (that is, it only recently came into his possession.)

• #### Grabbing the output of the Microsoft Speech API text-to-speech engine as audio data

A while ago I wrote a post on Implementing a "say" command using ISpVoice from the Microsoft Speech API which showed how to use Speech API to do text-to-speech, but was limited to playing the generated audio out of the default audio device.

Recently on the Windows Pro Audio forums, user falven asked a question about how to grab the output of the text-to-speech engine as a stream for further processing.

Here's how to do it.

The key part is to use ISpStream::BindToFile to save the audio data to a .wav file, and ISpStream::SetBaseStream to save to a given IStream. Then call ISpVoice::SetOutput with the ISpStream, prior to calling ISpVoice::Speak.

ISpStream *pSpStream = nullptr;
hr = CoCreateInstance(
CLSID_SpStream, nullptr, CLSCTX_ALL,
__uuidof(ISpStream),
(void**)&pSpStream
);
if (FAILED(hr)) {
ERR(L"CoCreateInstance(ISpVoice) failed: hr = 0x%08x", hr);
return -__LINE__;
}
ReleaseOnExit rSpStream(pSpStream);

if (File == where) {
hr = pSpStream->BindToFile(
file,
SPFM_CREATE_ALWAYS,
&SPDFID_WaveFormatEx,
&fmt,
0
);
if (FAILED(hr)) {
ERR(L"ISpStream::BindToFile failed: hr = 0x%08x", hr);
return -__LINE__;
}
} else {
// stream
pStream = SHCreateMemStream(NULL, 0);
if (nullptr == pStream) {
ERR(L"SHCreateMemStream failed");
return -__LINE__;
}

hr = pSpStream->SetBaseStream(
pStream,
SPDFID_WaveFormatEx,
&fmt
);
if (FAILED(hr)) {
ERR(L"ISpStream::SetBaseStream failed: hr = 0x%08x", hr);
return -__LINE__;
}
}

hr = pSpVoice->SetOutput(pSpStream, TRUE);
if (FAILED(hr)) {
ERR(L"ISpVoice::SetOutput failed: hr = 0x%08x", hr);
return -__LINE__;
}

Updated source and binaries attached.

Usage:

>say.exe
say "phrase" [--file <filename> | --stream]
runs phrase through text-to-speech engine
if --file is specified, writes to .wav file
if --stream is specified, captures to a stream
if neither is specified, plays to default output

Here's how to generate a .wav file (uh.wav attached)

>say.exe "uh" --file uh.wav
Stream is 1

And here's how to generate an output stream. The app consumes this and prints the INT16 sample values to the console. uh.txt attached.

>say.exe "uh" --stream
Stream is 1
0        0;        0        0;        0        0;        0        0
0        0;        0        0;        0        0;        0        0
...
86       86;    -1052    -1052;    -2839    -2839;    -3774    -3774
-4199    -4199;    -4581    -4581;    -4284    -4284;    -3640    -3640
-3100    -3100;    -2011    -2011;     -393     -393;      533      533
...

• #### Getting peak meters and volume settings for all apps and audio devices on the system

A few previous posts have touched on how to get peak meter readings on the device, and per-app

Let's put it all together and write a single command-line tool which enumerates:
1. All active audio devices (both playback and recording)
2. Dumps the peak meter and volume levels for each device
3. All active audio applications (audio sessions) per device
4. Dumps the peak meter and volume levels for each audio session
Note there is no API for enumerating individual streams within a session.
Pseudocode:
For each flow in (render, capture)
For each device in IMMDeviceEnumerator::EnumAudioEndpoints(flow)
Display the name of the device
Get and display IAudioMeterInformation::GetPeakValue for the device
Get and display IAudioEndpointVolume data for the device
For each session in IAudioSessionManager2::GetSessionEnumerator
Skip the session unless the state is "active"
Get and display IAudioMeterInformation::GetPeakValue for the session
Display session information
Get and display ISimpleAudioVolume information
Get and display IChannelAudioVolume information

Sample output:

>meters.exe
-- Playback devices --
Line out (High Definition Audio Device)
Peak: 0.404736
Mute: 0
Volume range: 0% to 100% (-46.5 dB to 0 dB in steps of 1.5 dB)
Master: 74% (-4.32831 dB)
Channel 1 of 2: 74% (-4.32831 dB)
Channel 2 of 2: 74% (-4.32831 dB)

Active session #1
Peak value: 0.240089
Icon path:
Display name:
Grouping parameter: {98710e41-6535-4cf0-b9b3-4181a0b7103e}
Process ID: 3496 (single-process)
System sounds session: no
Package full name: Microsoft.ZuneMusic_2.2.41.0_x64__8wekyb3d8bbwe
Master volume: 1 (0 dB FS)
Not muted
Channel #1 volume: 1 (0 dB FS)
Channel #2 volume: 1 (0 dB FS)

Active session #2
Peak value: 0.329753
Icon path:
Display name:
Grouping parameter: {fc078096-d2fc-4883-8b0d-af4619266c02}
Process ID: 6720 (multi-process)
System sounds session: no
HWND: 0x00000000000D1390 Windows Media Player
Master volume: 1 (0 dB FS)
Not muted
Channel #1 volume: 1 (0 dB FS)
Channel #2 volume: 1 (0 dB FS)

Internal speakers (High Definition Audio Device)
Peak: 0
Mute: 1
Volume range: 0% to 100% (-46.5 dB to 0 dB in steps of 1.5 dB)
Master: 65.7804% (-6 dB)
Channel 1 of 1: 65.7804% (-6 dB)

-- Recording devices --
Microphone (High Definition Audio Device)
Peak: 0.000274658
Mute: 0
Volume range: 0% to 100% (-34.5 dB to 12 dB in steps of 1.5 dB)
Master: 84.7652% (6 dB)
Channel 1 of 2: 84.7652% (6 dB)
Channel 2 of 2: 84.7652% (6 dB)

Active session #1
Peak value: 0.000274658
Icon path:
Display name:
Grouping parameter: {cee77f5a-d651-4392-8ffc-232c6eecdf51}
Process ID: 8212 (single-process)
Session identifier: {0.0.1.00000000}.{878a0979-89d6-43ec-9cff-e3f70dac2618}|\Device\HarddiskVolume1\Program Files\WindowsApps\Microsoft.WindowsSoundRecorder_6.3.9600.16384_x64__8wekyb3d8bbwe\soundrec.exe%b{00000000-0000-0000-0000-000000000000}
Session instance identifier: {0.0.1.00000000}.{878a0979-89d6-43ec-9cff-e3f70dac2618}|\Device\HarddiskVolume1\Program Files\WindowsApps\Microsoft.WindowsSoundRecorder_6.3.9600.16384_x64__8wekyb3d8bbwe\soundrec.exe%b{00000000-0000-0000-0000-000000000000}|1%b8212
System sounds session: no
Package full name: Microsoft.WindowsSoundRecorder_6.3.9600.16384_x64__8wekyb3d8bbwe
Master volume: 0.847652 (-1.43565 dB FS)
Not muted
Channel #1 volume: 1 (0 dB FS)
Channel #2 volume: 1 (0 dB FS)

Active session #2
Peak value: 0.000274658
Icon path:
Display name:
Grouping parameter: {c346e9e3-a37e-427b-a2be-1feb2c81b469}
Process ID: 2608 (single-process)
Session identifier: {0.0.1.00000000}.{878a0979-89d6-43ec-9cff-e3f70dac2618}|\Device\HarddiskVolume1\Windows\System32\SoundRecorder.exe%b{00000000-0000-0000-0000-000000000000}
Session instance identifier: {0.0.1.00000000}.{878a0979-89d6-43ec-9cff-e3f70dac2618}|\Device\HarddiskVolume1\Windows\System32\SoundRecorder.exe%b{00000000-0000-0000-0000-000000000000}|1%b2608
System sounds session: no
HWND: 0x00000000004611FA Sound Recorder
Master volume: 0.847652 (-1.43565 dB FS)
Not muted
Channel #1 volume: 1 (0 dB FS)
Channel #2 volume: 1 (0 dB FS)

Source and binaries attached.

• #### Addition and multiplication table for GF(2²)

I'm reading Joan Daemen and Vincent Rijmen's book The Design of Rijndael and I'm giving myself a refresher course on group theory.

Key to the encryption standard is the Galois field on 256 elements GF(28). A multiplication table of 256 elements by 256 elements quickly becomes a wall of text, so let's reason by analogy and look at GF(22).

There are a number of ways to represent elements of the field; we'll start by representing them as polynomials with degree at most 1, and with integer coefficients modulo 2. There are four such polynomials: {0, 1, x, x + 1}.

Here are the addition and multiplication tables:

 + 0 1 x x + 1 0 0 1 x x + 1 1 1 0 x + 1 x x x x + 1 0 1 x + 1 x + 1 x 1 0
 ⋅ 0 1 x x + 1 0 0 0 0 0 1 0 1 x x + 1 x 0 x x2 mod m x2 + x mod m x + 1 0 x + 1 x2 + x mod m x2 + 1 mod m

Hold on. What's that funny-looking m?

It's a "reduction polynomial" which brings the product back down to degree 1 or less. It has to be a polynomial of degree 2. There are four such polynomials: let's try each and see what we get.

x2
 0 x x 1
x2 + 1
 1 x + 1 x + 1 0
x2 + x
 x 0 0 x + 1
x2 + x + 1
 x + 1 1 1 x

Note that the first three polynomials all factor into products of lower-degree polynomials: x2 = x(x), x2 + 1 = (x + 1)(x + 1), x2 + x = x(x + 1). Only x2 + x + 1 is prime; and this prime reduction polynomial generates a complete multiplication table with no 0s. This is a necessary condition to be a field. Our final tables are:

 + 0 1 x x + 1 0 0 1 x x + 1 1 1 0 x + 1 x x x x + 1 0 1 x + 1 x + 1 x 1 0
 ⋅ 0 1 x x + 1 0 0 0 0 0 1 0 1 x x + 1 x 0 x x + 1 1 x + 1 0 x + 1 1 x

We can also write our elements in binary form: 0 => 00, 1 => 01, x => 10, and x + 1 => 11. In this notation our tables become:

 + 00 01 10 11 00 00 01 10 11 01 01 00 11 10 10 10 11 00 01 11 11 10 01 00
 ⋅ 00 01 10 11 00 00 00 00 00 01 00 01 10 11 10 00 10 11 01 11 00 11 01 10

Rijndael works in GF(28) and uses a reduction polynomial of x8 + x4 + x3 + x + 1. They say this is prime. I sure hope so.

• #### World Cup tiebreaks: how the United States can survive Group G

With World Cup Group G standings being the way they are, I decided to read up on tiebreaks.

Here are the FIFA rules (PDF)

The relevant sections are:

18.4.a: ... with three points for a win, one point for a draw and no points for a defeat (league format)

...

18.6: In the league format, the ranking in each group is determined as follows:

a) greatest number of points obtained in all group matches;
b) goal difference in all group matches;
c) greatest number of goals scored in all group matches.

If two or more teams are equal on the basis of the above three criteria, their rankings shall be determined as follows:

d) greatest number of points obtained in the group matches between the teams concerned;
e) goal difference resulting from the group matches between the teams concerned;
f) greater number of goals scored in all group matches between the teams concerned;
g) the goals scored away from home count double between the teams concerned (if the tie is only between two teams).

18.7 discusses play-off matches between teams that are still tied after all of 18.6.a-g are applied. As we'll see, there is no possibility of the United States being involved in a play-off.

Let's start with 18.6.a. The current point standings in Group G are:

Germany
Four points, with a win and a tie
United States
Four points, with a win and a tie
Ghana
One point, with a tie and a defeat
Portugal
One point, with a tie and a defeat

The remaining games are United States-Germany and Portugal-Ghana.

If the United States-Germany game results in a tie, then the United States and Germany will have five points apiece, with a win and two ties. They will be the top two teams in the group, regardless of the outcome of the Portugal-Ghana game.

If the Portugal-Ghana game results in a tie, then Ghana and Portugal will have two points apiece, with two ties and a defeat. Regardless of the results of the United States-Germany game, the United States and Germany will be the top two teams in the group.

So we need only consider what happens if both games are decisive.

The winner of the United States-Germany game will have seven points, with two wins and a tie. They are in on points.

The loser of the Portugal-Ghana game will have one point, with one tie and two defeats. They are out on points.

The remaining two teams (the loser of the United States-Germany game, and the winner of the Portugal-Ghana game) will have four points apiece, with a win, a tie, and a defeat.

So in this scenario, 18.6.b will be invoked. At this point we have to look at the goal difference for the teams. (The goal difference for, say, Germany is just the number of goals Germany scored, minus the number of goals Germany allowed.) These are currently:

Germany
+4
United States
+1
Ghana
-1
Portugal
-4

The raw numbers look good for Germany and the United States, and bad for Ghana and Portugal. But let's bear in mind that we are considering the scenario where Germany or the United States has just lost, and Ghana or Portugal has just won. How do you win? By scoring more goals than you allow! So the goal differential of the loser of the United States-Germany game will be lower than it is now, and the goal differential of the winner of the Portugal-Ghana game will be higher than it is now.

So as a United States fan:

1. Beat Germany and we're in on points.
2. If we can't beat Germany, tie Germany and we're in on points.
3. If Germany beats us, root for a Portugal-Ghana tie and we're in points.
4. If Germany beats us, and the Portugal-Ghana game is also decisive, root for Portugal to win, and for the margin of victory for both games to add up to no higher than four, and we're in on goal differential.
More explicitly:
1. If Germany wins by one goal, our GD will be zero; Portugal can win by as many as three goals and we will be in on goal differential. If Portugal wins by five or more, we're out on goal differential. If Portugal wins by precisely four, see below.
2. If Germany wins by two goals, our GD will be -1; Portugal can win by as many as two goals and we will be in on goal differential. If Portugal wins by four or more, we're out on goal differential. If Portugal wins by precisely three, see below.
3. If Germany wins by three goals, our GD will be -2; Portugal can win by one goal and we will be in on goal differential. If Portugal wins by three or more, we're out on goal differential. If Portugal wins by precisely two, see below.
4. If Germany wins by four goals, our GD will be -3. If Portugal wins by two or more, we're out on goal differential. If Portugal wins by precisely one, see below.
5. If Germany wins by five goals or more, we're out on goal differential.
5. If Germany beats us, and Ghana beats Portugal, root for the margin of victory in both games to add up to two.
1. If Germany wins by one goal, our GD will be zero; if Ghana wins by one goal, their GD will also be zero. See below.
2. If Germany wins by two goals or more, we're out on goal differential.

What if Germany beats us, and Portugal beats Ghana, but the margin of victory of both games adds up to exactly four? Or what if Germany beats us and Ghana beats Portugal, and the margin of victory of both games adds up to precisely two - that is, each game is decided by a single goal?

We can still get in on 18.6.c. At this point we have to look at the number of goals scored by each team. These are currently:

Germany
6
United States
4
Ghana
3
Portugal
2

Note that the United States has one more goal than Ghana, and two more goals than Portugal. This is good. In the "goal differential" case, there was a strong relationship between winning and changing your goal differential. But there is only a weak relationship between winning and changing your raw goal count.

For example, if Germany beats the United States 3-2, and Ghana beats Portugal 1-0, the United States will actually have pulled even farther ahead in the 18.6.c race!

So:

1. If Germany beats us, and Portugal beats Ghana, and the margin of victory of both games adds up to exactly four, root for Portugal's winning score to be no more than one goal higher than our losing score, and we're in on goals scored.
1. If Portugal's winning score is three or more goals higher than our losing score, we're out on goals scored.
2. If Portugal's winning score is precisely two goal higher than our losing score, see below.
2. If Germany beats us, and Ghana beats Portugal, and the margin of victory of both games adds up to exactly two, root for Ghana's winning score to be no more than our losing score, and we're in on goals scored.
1. If Ghana's winning score is two or more goals than our losing score, we're out on goals scored.
2. If Ghana's winning score is precisely one goal higher than our losing score, see below.

What if Germany beats us, and Portugal beats Ghana, and the margin of victory of the two games adds up to exactly four, and Portugal's winning score is precisely two goals more than our losing score? Or what if Germany beats us and Ghana beats Portugal, and the margin of victory of both games adds up to precisely two, and Ghana's winning score is precisely one goal higher than our losing score?

Then we are in or out on 18.6.d-g. At this point we have to look at the results of the Ghana-United States and United States-Portugal matches. In each match, the first team is the home team and the second team is the away team.

Ghana-United States
1-2
United States-Portugal
2-2

So:

1. If Germany beats us, and Portugal beats Ghana, and the margin of victory of both games adds up to exactly four, and Portugal's winning score is precisely two goals higher than our losing score, we're out:
1. We're still tied on direct match points (18.6.d)
2. We're still tied on direct match goal differential (18.6.e)
3. We're still tied on direct match goals scored (18.6.f)
4. We're out on direct match goals scored with away goals counting double (18.6.g)
2. If Germany beats us, and Ghana beats Portugal, and the margin of victory of both games adds up to exactly two, and Ghana's winning score is precisely one more than our losing score, then we're in on direct match points (18.6.d)

Note there is no possibility of a playoff. Had the United States-Portugal game been a 0-0 tie instead of a 2-2 tie, that would have been a possibility, since double 0 is still 0.

• #### shellproperty.exe - set/read string properties on a file from the command line

Yesterday Raymond Chen blogged a "Little Program" which could edit audio metadata. As it happens, I have a similar tool I threw together which accepts a property key and a string property value to update a property, or can read a string or string-vector property.

Usage:

>shellproperty
shellproperty read <key> from <filename>
shellproperty set <key> to <string> on <filename>

Here's an example _fixup.bat script I use to set audio metadata on my copy of Giuseppe Sinopoli's recording of Madama Butterfly, to help distinguish it from other recordings of the same opera that I have.

@echo off
dir /s /b "I *.mp3" | xargs /addquotes shellproperty set PKEY_Music_AlbumTitle to "Madama Butterfly - Sinopoli / Freni: 1 of 3" on
dir /s /b "II *.mp3" | xargs /addquotes shellproperty set PKEY_Music_AlbumTitle to "Madama Butterfly - Sinopoli / Freni: 2 of 3" on
dir /s /b "III *.mp3" | xargs /addquotes shellproperty set PKEY_Music_AlbumTitle to "Madama Butterfly - Sinopoli / Freni: 3 of 3" on

Source and amd64/x86 binaries attached, but in substance it's very similar to Raymond's "Little Program".

Possible future improvements:

1. When setting, allow specifying a vartype on the command line.
2. Allow specifying a property key by fmtid and pid.
3. Handle more vartypes for displaying properties.
4. Allow dumping all properties on a given file.
• #### Finding the longest substring which occurs twice in a given string

I'm reading Jon Bentley's Programming Pearls and one of the interesting exercises was to find the longest substring which occurs twice in a given string of length n.

There's a naïve solution where you look at every pair of (distinct) indexes (i, j), and calculate the length of the common prefix of the substrings starting at those locations; this is O(n2) assuming that the length of the eventual maximum substring is O(1) (that is, << n.)

Jon shows that there is an O(n log n) algorithm, which is a significant savings if n is large (e.g., War and Peace.)  This involves building an array of all substrings, then sorting them lexically, then walking the sorted array; the trick is that now we only need to check pairs of indexes that correspond to adjacent entries in the array.  That step can be done in O(n) time; the O(n log n) comes from the sorting step.

I wrote up a quick app that implements his suggested algorithm.  Source follows.

// main.cpp

#include <windows.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <search.h>

#define LOG(format, ...) wprintf(format L"\n", __VA_ARGS__)

template<typename T>
class DeleteArrayOnExit {
public:
DeleteArrayOnExit(T *p) : m_p(p), m_canceled(false) {}
~DeleteArrayOnExit() { if (!m_canceled) { delete [] m_p; } }
void Cancel() { m_canceled = true; }
void Swap(T *p) { m_p = p; }
private:
T *m_p;
bool m_canceled;
};

// caller must delete [] the buffer when done with it
int __cdecl pwcscmp(const void *pstra, const void *pstrb) {
return wcscmp(*(LPWSTR*)pstra, *(LPWSTR*)pstrb);
}

// returns length of longest common substring
// don't include the null terminator if both strings are identical
// e.g., comlen(banana, bananas) == comlen(banana, banana) == 6
int comlen(LPCWSTR a, LPCWSTR b) {
int i = 0;

// keep going while the strings are the same and not ended
while (a[i] && (a[i] == b[i])) { i++; }

return i;
}

int _cdecl wmain() {
LPWSTR text = ReadFromStdIn();
if (nullptr == text) {
return -__LINE__;
}
DeleteArrayOnExit<WCHAR> deleteText(text);

size_t size = wcslen(text) + 1;
LPWSTR *suffixes = new LPWSTR[size];
if (nullptr == suffixes) {
LOG(L"Could not allocate memory for suffixes");
return -__LINE__;
}
DeleteArrayOnExit<LPWSTR> deleteSuffixes(suffixes);

for (size_t i = 0; i < size; i++) {
suffixes[i] = &text[i];
}

qsort(suffixes, size, sizeof(LPWSTR), pwcscmp);

// find the longest common adjacent pair
LPWSTR szMax = suffixes[0];
size_t lenMax = 0;
for (size_t i = 0; i < size - 1; i++) {
size_t len = comlen(suffixes[i], suffixes[i + 1]);
if (len > lenMax) {
lenMax = len;
szMax = suffixes[i];
}
}

WCHAR *substring = new WCHAR[lenMax + 1];
if (nullptr == substring) {
LOG(L"Could not allocate memory for substring");
return -__LINE__;
}
DeleteArrayOnExit<WCHAR> deleteSubstring(substring);
if (0 != wcsncpy_s(substring, lenMax + 1, szMax, lenMax)) {
LOG(L"wcsncpy_s failed");
return -__LINE__;
}
substring[lenMax] = L'\0';

// intentionally not using LOG to avoid trailing newline
wprintf(L"%s", substring);

return 0;
}

WCHAR *text = new WCHAR[1];
if (nullptr == text) {
LOG(L"Could not allocate memory for text");
return nullptr;
}
DeleteArrayOnExit<WCHAR> deleteText(text);
text[0] = L'\0';

// read a 2 KB chunk
const size_t buffer_size = 1024;
WCHAR *buffer = new WCHAR[buffer_size];
if (nullptr == buffer) {
LOG(L"Could not allocate memory for buffer");
return nullptr;
}
DeleteArrayOnExit<WCHAR> deleteBuffer(buffer);

bool done = false;
do {
if (fgetws(buffer, buffer_size, stdin)) {
size_t size = wcslen(text) + wcslen(buffer) + 1;
WCHAR *new_text = new WCHAR[size];
if (nullptr == new_text) {
LOG(L"Could not allocate memory for new text");
return nullptr;
}
DeleteArrayOnExit<WCHAR> deleteNewText(new_text);

WCHAR *dest = new_text;

if (0 != wcscpy_s(dest, size, text)) {
LOG(L"wcscpy_s failed");
return nullptr;
}
dest += wcslen(text);
size -= wcslen(text);

if (0 != wcscpy_s(dest, size, buffer)) {
LOG(L"wcscpy_s failed");
return nullptr;
}

// that should do it for copying
// now swap new_text => text

delete [] text;
text = new_text;

deleteText.Swap(new_text);
deleteNewText.Cancel();

} else if (feof(stdin)) {
done = true;
} else {
LOG(L"Error reading from STDIN");
return nullptr;
}
} while (!done);

deleteText.Cancel();
return text;
}

• #### Buffer size alignment and the audio period

I got an email from someone today, paraphrased below:

Q: When I set the sampling frequency to 48 kHz, and ask Windows what the audio period is, I get exactly 10 milliseconds. When I set it to 44.1 kHz, I get very slightly over 10 milliseconds: 10.1587 milliseconds, to be precise. Why?
A: Alignment.

A while back I talked a bit about the WASAPI exclusive-mode alignment dance. Some audio drivers have a requirement that they deal in buffer sizes which are multiples of a certain byte size - for example, a common alignment restriction for HD Audio hardware is 128 bytes.

A more general audio requirement is that buffer sizes be a multiple of the size of a PCM audio frame.
For example, suppose the audio format of a stream is stereo 16-bit integer. A single PCM audio frame will be 2 * 2 = 4 bytes. The first two bytes will be the 16-bit signed integer with the sample value for the left channel; the last two bytes will be the right channel.
As another example, suppose the audio format of a stream is 5.1 32-bit floating point. A single PCM audio frame will be 6 * 4 = 24 bytes. Each of the six channels are a four-byte IEEE floating-point value; the channel order in Windows will be {Left, Right, Center, Low-Frequency Effects, Side Left, Side Right}.

The audio engine tries to run at as close to a 10 millisecond cadence as possible, subject to the two restrictions above. Given a "desired minimum interval" (in milliseconds), and a streaming format, and an "alignment requirement" in bytes, you can calculate the closest achievable interval (without going under the desired interval) as follows:

Note: this only works for uncompressed formats
aligned_buffer(desired_milliseconds, format, alignment_bytes)
desired_frames = nearest_integer(desired_milliseconds / 1000.0 * format.nSamplesPerSec)
alignment_frames = least_common_multiple(alignment_bytes, format.nBlockAlign) / format.nBlockAlign
actual_frames = ceiling(desired_frames / alignment_frames) * alignment_frames
actual_milliseconds = actual_frames / format.nSamplesPerSec * 1000.0

Here's a table of the actual buffer size (in frames and milliseconds), given a few typical inputs:

 Desired (milliseconds) Format Alignment (bytes) Desired frames Alignment (frames) Actual (frames) Actual (milliseconds) 10 44.1 kHz stereo 16-bit integer 128 441 32 448 10.16 10 48 kHz stereo 16-bit integer 128 480 32 480 10 10 44.1 kHz 5.1 16-bit integer 128 441 32 448 10.16 10 48 kHz 5.1 16-bit integer 128 480 32 480 10 10 44.1 kHz 5.1 24-bit integer 128 441 64 448 10.16 10 48 kHz 5.1 24-bit integer 128 480 64 512 10.66

So to be really precise, the buffer size is actually 640/63 = 10.158730 milliseconds.

• #### Basic audio volume theory

Last time we talked about the different Windows Audio Session APIs for setting volume. Let's talk a little about what volume means.

For purposes of illustration, let's take our signal to be a full-scale square wave:

Incidentally, the answer to the exercise

completely characterize the set of digital signals that achieve 0 dB FS intensity. If you have, say, 1000 samples of mono 16-bit integer audio to play with, how many distinct signals achieve 0 dB FS intensity?

is "all signals with N/2 samples having value 1 and N/2 samples having value -1". There are

such signals. For N = 1000 this is comes out to about 2.7e299.

## Linear in amplitude

If we multiply the signal by a number between 0 and 1, we reduce the volume. That's the first natural volume scale - the "linear in amplitude" scale. 0 is silence, 1 is an unchanged signal, and 1/2 is a signal with all the amplitudes divided by 2.

## Linear in power

Recalling the formula for the intensity of a signal, we realize that the power (intensity squared) of a signal depends on the square of the magnitude. This leads us to the second natural volume scale - the "linear in power" scale. 0 is still silence, and 1 is still an unchanged signal, but now 1/2 maps to a signal with half of the power of the original signal. In particular, a full-scale square wave with power 1 would drop to a square wave with power 1/2 - this has amplitude sqrt(1/2) which is significantly more than 1/2.

## Linear in dB

So far so good. However, sound is perceived in a relative fashion. Humans are not very good at determining absolute volume, but are very good at determining relative volume. If I play a sound and ask you "what is the dB SPL of the sound I just played", you would have trouble answering; but if I played two sounds and asked "which sound was louder", you would be able to tell me. You would even be able to give me an estimate such as "it's about twice as loud".

So a natural scale for volume is based on relative power - that is, a logarithmic scale dB = 10 * log10( PA / P0 ) where PA is the attenuated signal and P0 is the original. That looks something like this:

Note that the relative power scale has no bottom! It's turtles all the way down.

So how do you map an infinite volume scale onto a finite slider, while preserving the nice property that equal distances map to equal power ratios?

• #### Car Talk puzzler analysis - Limerick equation

Car Talk's puzzler of the week has some mathematical interest. See Car Talk's transcript of the question.

In brief, we are given the equation...

(12 + 144 + 20 + 3√4) / 7 + 5(11) = 92 + 0

(which holds true, by the way)
... and asked to transcribe it as a limerick.

The last line of the limeric is given: "Is nine squared, and not a bit more."

The answer, ROT13'd for your convenience until the Car Talk folks reveal it:

N qbmra, n tebff, naq n fpber
Cyhf guerr gvzrf gur fdhner ebbg bs sbhe
Qvivqrq ol frira
Cyhf svir gvzrf ryrira
Is nine squared, and not a bit more.

This is not the only equation to be immortalized in a limerick. There's also this old chestnut. It relies on pronouncing the letter z as "zee" (rather than "zed",) as well as reading "log base e" (usually spelled "ln") for "log".

The integral z2 dz
From one to the cube root of three
All times the cosine
Of three pi over nine
Equals log of the cube root of e.

This equation also holds true.

There's also this classic from Lewis Carroll which waxes a bit philosophical:

Yet what mean all such gaieties to me
Whose life is full of indices and surds
x2 + 7x + 53
= 11/3.

Perhaps fittingly, this last equation has only imaginary solutions.

• #### Anand vs. Gelfand world chess championship 2012 oldest pair of contenders since 1886

In 2012, World Chess Champion Viswanathan Anand will attempt to defend his title aqainst challenger Boris Gelfand. This is a very unusual match in that both players are fairly old by World Chess Champion contender standards. I decided to see just how unusual it was, and do so with some degree of rigor.

One tricky bit is that chess championships (usually) have (at least) two players, so we have to define an age metric for pairs of people. Creating a well-ordering on tuples is sometimes controversial. I chose to have the comparison routine be:
age(contenders) = min(age(contender) : contender ∈ contenders)
which is to say, the age of the youngest contender.

Another tricky bit was deciding which matches were definitively "world chess championship matches." I pulled the list of world chess championship matches from chessgames.com. For time periods where the organizational ownership of the title is in question, this includes matches sponsored by all contending organizations.

As a naïve first pass, I looked up the birth years for all the contenders and subtracted that from the year of the championship to get an estimated age. This could be off by a year if the youngest contender's birthday comes after (or during?) the match. Nevertheless, this was accurate enough to give me a short list of matches to investigate further:.

Closer investigation of each of the highlighted matches revealed that, astonishingly, in every case the youngest contender's birthday came after the match:

• 2012 Viswanathan Anand (43?) vs. Boris Gelfand: Anand's birthday (December 11) comes after the match (starts in May) so he will still be 42.
• 1993 Anatoly Karpov (42?) vs. Jan Timman (42?): Timman's birthday (December 14) came after the match (finished November 1) so he was still 41.
• 1934 Alexander Alekhine (42?) vs. Efim Bogolyubov: Alekhine's birthday (October 31) came after the match (April to June) so he was still 41.
• 1910 Emanuel Lasker (42?) vs. Dawid Janowski (42?): Janowski was born May 25; Lasker December 24. Lasker's birthday came after the match (finished December 8) so he was still 41.
• 1892 Wilhelm Steinitz vs. Mikhail Chigorin (42?): Chigorin's birthday November 12 (October 31 old style) came after the match (finished February 28) so he was still 41.
• 1886 Wilhelm Steinitz vs. Johannes Zukertort (44?): Zukertort's birthday (September 7) came after the match (finished March 29) so he was still 43.

We conclude that Anand vs. Gelfand (2012) features the oldest contenders since the very first World Chess Championship Steinitz vs. Zukertort (1886) - and is within a year of even that! If the 2014 championship is a rematch, it will set the record.

1 Topalov was the clear winner of the 2005 FIDE World Championship Tournament so there was no need for a runoff.

• #### Implementing a "listen" command using ISpRecoContext from the Microsoft Speech API

Earlier today I posted a quick "say.exe" sample app which you give text and it speaks the text aloud using the text-to-speech part of the Windows Speech API.  It was very straightforward - only 67 lines of C++ code.

It took me a little longer to figure out how to do this "listen.exe" sample app; you run it, speak into the microphone, and it uses the speech-to-text part of the Windows Speech API to print what you're saying to the console.  This is a little more involved: 202 lines of C++ code.

Pseudocode:

CoInitialize()
CoCreateInstance(ISpRecoContext)
pSpRecoContext->SetInterest(recognition events only, thanks)
pSpRecoContext->CreateGrammar()
pSpRecoGrammar->SetDictationState(active)
while(...) {
wait for a speech event (or the user to press Enter)
pSpRecoContext->GetEvents()
for each speech event {
make sure SPEVENT.eEventId is SPEI_RECOGNITION
event.lParam is an ISpRecoResult
pSpRecoResult->GetText()
print the text
}
}

Usage:

>listen.exe
Speak into the microphone naturally; I will print what I understand.
Press ENTER to quit.
(At this point you start talking into the microphone. Text shows up here shortly after you say it.)

Source and binaries attached.

• #### How many numbers are straights?

As many people, I have to deal with a lot of numbers all the time (bug numbers, changelist numbers, tracking numbers, ...) and as a mathematician I sometimes notice when numbers have peculiar properties.

For example, I recently ran into 152463, which is a straight (made up of consecutive digits, not necessarily in order.)  I then ran into 35426, which is also a straight.  Is this odd?

To put it another way - what proportion of numbers are straights?

We consider only non-negative integer numbers {0, 1, 2, ... }.  No duplicate digits are allowed, so the very last straight is 9,876,543,210.

Break it down by the length of the number in digits.  For example, how many four-digit numbers are there?  We can choose any number from 1-9 for the first digit, and any number from 0-9 for each of the other three digits; so there are 9 * 103 numbers of length 4.

How many of these are straights?  There are seven ways to choose a set of four consecutive digits to make up a four-digit straight: {0,1, 2, 3}, {1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}, {4, 5, 6, 7}, {5, 6, 7, 8}, {6, 7, 8, 9}.

Each of these seven ways can then be shuffled to make the straights themselves.  For six of these seven sets of digits, all possible shufflings are allowed; there are 4! shufflings for each set.  This gives 6 * 4! straights.

But for {0, 1, 2, 3}, we have to be careful not to end up with 0 as a leading digit.  We choose the leading digit first out of the set {1, 2, 3} (there are three ways to do this) and then we have complete freedom to shuffle the three remaining digits (including 0.)  This gives 3 * 3! straights.

So there are 6 * 4! + 3 * 3! straights which are four digits long.

Considering all possible lengths from 1 to 10 (we'll consider 0 as a number of length 1) gives the following table:

 Length Numbers Straights % Straights 1 10 10 100.000% 2 9 * 101 17 18.889% 3 9 * 102 46 5.111% 4 9 * 103 162 1.800% 5 9 * 104 696 0.773% 6 9 * 105 3480 0.387% 7 9 * 106 19,440 0.216% 8 9 * 107 115,920 0.129% 9 9 * 108 685,440 0.076% 10 9 * 109 3,265,920 0.036% Total 1010 40,91,131 0.041%

Since roughly 1% of five-digit numbers are straights, it is not surprising that I see them that often.

• #### Rotating a matrix redux

As Vincent Tan points out, there is no way to create an n x n matrix R such that RA is a rotated version of A - for example, in the 2 x 2 case:

There is no 2 x 2 matrix R such that
`R [ a b ] = [ b a ]  [ c d ]   [ c d ]`

Vincent Tan points out that such an R would entail hidden assumptions but asks for a simpler proof.

Here it is.  I'll go back to the n by n case, assuming n >= 2.  (The 0 x 0 case and the 1 x 1 case are actually trivially solvable using the identity matrix.)

Assume there is such a matrix R = (rij)i,j∈{1...n} which has the property that RA = B = (bij)i,j∈{1...n} is the rotated version of A for any n x n matrix A = (aij)i,j∈{1...n}.

To see that this is impossible, consider the matrix with a 1 in the upper-left-hand corner and zeros everywhere else; that is, a1,1 = 1, and aij = 0 for all other combinations of i and j.

To be the rotated version of this matrix, B should have bn1 = 1 and bij = 0 for all other combinations of i and j.  (My coordinates are such that bn1 is the number in the top right corner of the matrix.)

But how did that 1 get in the top right hand corner?  The rules of matrix multiplication imply that bn1 = r1,1an1 + r2,1an2 + ... + rn1ann. At first blush, this looks fine... until we realize that bn1 is 1, and all of the anj are 0... so we have the contradiction

1 = r1,10 + r2,10 + ... + rn10 = 0

QED.

But this is not how rotation matrices are applied.  Rotation matrices R are not applied as RA = B; instead, they're applied as RAR-1 = B.  Not that this helps our hapless interviewee; the operation being asked of him ("rotate a matrix") can't be done by a rotation matrix anyway.  That is, there is no magic matrix R that works under this operation either.

EDIT: A simple way to prove that RAR-1 = B doesn't work either (for n >= 2) is to think about what happens when you choose A to be the identity matrix I (the matrix with 1's on the main diagonal [i = j] and 0's everywhere else.)

RIR-1 = RR-1 = I; and for n >= 2, I is not its own "rotation."

We therefore arrive at the paradoxical result that you can't rotate a matrix via matrix rotation.

• #### Spot the Bug - IMFOutputSchema

I found a simple but nasty bug the other day in this implementation of the IMFOutputSchema interface.

The symptoms: running this code outside of a debugger caused the app to crash.  Running it under a debugger (to catch the crash) caused the app to run clean.

// outputschema.h
HRESULT CreateTrustedAudioDriversOutputSchema(
DWORD dwConfigData,
GUID guidOriginatorID,
IMFOutputSchema **ppMFOutputSchema
);

// outputschema.cpp
// ... various include files removed ...
// CMFAttributesImpl implements the IMFAttributes interface, minus the IUnknown methods
class CTrustedAudioDriversOutputSchema : public CMFAttributesImpl<IMFOutputSchema> {

friend
HRESULT CreateTrustedAudioDriversOutputSchema(
DWORD dwConfigData,
GUID guidOriginatorID,
IMFOutputSchema **ppMFOutputSchema
);

private:
CTrustedAudioDriversOutputSchema(DWORD dwConfigData, GUID guidOriginatorID);
~CTrustedAudioDriversOutputSchema();

ULONG m_cRefCount;
DWORD m_dwConfigData;
GUID m_guidOriginatorID;

public:
// IUnknown methods
HRESULT STDMETHODCALLTYPE QueryInterface(
/* [in] */ REFIID riid,
/* [out] */ LPVOID *ppvObject
);
ULONG STDMETHODCALLTYPE Release();

// IMFOutputSchema methods
HRESULT STDMETHODCALLTYPE GetConfigurationData(__out DWORD *pdwVal);
HRESULT STDMETHODCALLTYPE GetOriginatorID(__out GUID *pguidOriginatorID);
HRESULT STDMETHODCALLTYPE GetSchemaType(__out GUID *pguidSchemaType);

}; // CTrustedAudioDriversOutputSchema

HRESULT CreateTrustedAudioDriversOutputSchema(
DWORD dwConfigData,
GUID guidOriginatorID,
IMFOutputSchema **ppMFOutputSchema
) {
if (NULL == ppMFOutputSchema) {
return E_POINTER;
}

*ppMFOutputSchema = NULL;

CTrustedAudioDriversOutputSchema *pSchema =
new CTrustedAudioDriversOutputSchema(dwConfigData, guidOriginatorID);

if (NULL == pSchema) {
LOG(eError, _T("new CTrustedAudioDriversOutputSchema returned a NULL pointer"));
return E_OUTOFMEMORY;
}

*ppMFOutputSchema = static_cast<IMFOutputSchema *>(pSchema);

return S_OK;
} // CreateTrustedAudioDriversOutputSchema

// constructor
CTrustedAudioDriversOutputSchema::CTrustedAudioDriversOutputSchema(
DWORD dwConfigData, GUID guidOriginatorID
)
: m_dwConfigData(dwConfigData)
, m_guidOriginatorID(guidOriginatorID)
{}

// destructor
CTrustedAudioDriversOutputSchema::~CTrustedAudioDriversOutputSchema() {}

#define RETURN_INTERFACE(T, iid, ppOut) \
if (IsEqualIID(__uuidof(T), (iid))) { \
*(ppOut) = static_cast<T *>(this); \
return S_OK; \
} else {} (void)0

// IUnknown::QueryInterface
HRESULT STDMETHODCALLTYPE
CTrustedAudioDriversOutputSchema::QueryInterface(
/* [in] */ REFIID riid,
/* [out] */ LPVOID *ppvObject
) {

if (NULL == ppvObject) {
return E_POINTER;
}

*ppvObject = NULL;

RETURN_INTERFACE(IUnknown, riid, ppvObject);
RETURN_INTERFACE(IMFAttributes, riid, ppvObject);
RETURN_INTERFACE(IMFOutputSchema, riid, ppvObject);

return E_NOINTERFACE;
}

ULONG STDMETHODCALLTYPE

ULONG uNewRefCount = InterlockedIncrement(&m_cRefCount);

return uNewRefCount;
}

// IUnknown::Release
ULONG STDMETHODCALLTYPE
CTrustedAudioDriversOutputSchema::Release() {

ULONG uNewRefCount = InterlockedDecrement(&m_cRefCount);

if (0 == uNewRefCount) {
delete this;
}

return uNewRefCount;
}

// IMFOutputSchema::GetConfigurationData
HRESULT STDMETHODCALLTYPE
CTrustedAudioDriversOutputSchema::GetConfigurationData(__out DWORD *pdwVal) {

LOG(eInfo1, _T("IMFOutputSchema::GetConfigurationData called"));

if (NULL == pdwVal) { return E_POINTER; }

*pdwVal = m_dwConfigData;

return S_OK;
}

// IMFOutputSchema::GetOriginatorID
HRESULT STDMETHODCALLTYPE
CTrustedAudioDriversOutputSchema::GetOriginatorID(__out GUID *pguidOriginatorID) {

LOG(eInfo1, _T("IMFOutputSchema::GetOriginatorID called"));

if (NULL == pguidOriginatorID) { return E_POINTER; }

*pguidOriginatorID = m_guidOriginatorID;

return S_OK;
}

// IMFOutputSchema::GetSchemaType
HRESULT STDMETHODCALLTYPE
CTrustedAudioDriversOutputSchema::GetSchemaType(__out GUID *pguidSchemaType) {

LOG(eInfo1, _T("IMFOutputSchema::GetSchemaType called"));

if (NULL == pguidSchemaType) { return E_POINTER; }

*pguidSchemaType = MFPROTECTION_TRUSTEDAUDIODRIVERS;

return S_OK;
}

• #### Blown movie lines - Casablanca

Ah... Casablanca.

Arguably the best movie of all time.  And yet a blown line?  How could I be so bold?

The line:

Play it [As Time Goes By] again, Sam. -- Casablanca

Wow... that line, blown?  Am I way off base here?  Arguably the best line of arguably the best movie... blown?  How so?

Well, for two reasons.

1. The line does not actually appear in the movie at all.

There are two scenes where Sam is urged to play As Time Goes By to recall the Paris romance between the two major characters.  Here is the dialog from both scenes:

Ilsa: Play it once, Sam. For old times' sake.
Sam: I don't know what you mean, Miss Ilsa.
Ilsa: Play it, Sam. Play "As Time Goes By."

... and later...

Rick: You played it for her, you can play it for me!
Sam: Well, I don't think I can remember...
Rick: If she can stand it, I can! Play it!

"Play it again, Sam" has thus entered the pantheon of unforgettable movie lines... even though it was never actually said.

2. The song was nearly changed.

The film composer - a fellow named Max Steiner - did a very good job on the film.  As Time Goes By is, thanks to his score, now indelibly etched in the ears of many a classic movie buff.

But he hated the song!  He attempted to convince the producers of the film that the song was boring, and that he could write a much better song.

Shame, Mr. Steiner.  Shame.

What's worse... the producers went along with it! They agreed to allow him to write a song to replace As Time Goes By.  There were only a couple of scenes that needed to be reshot, and they started the logistical machinery to recall the cast in those scenes to do the reshooting.

Unfortunately... or rather, fortunately... Ingrid Bergman had already moved on to For Whom The Bell Tolls... and she had cut her hair...

The topic of this arguably-best-line of the arguably-best-movie was thus saved by a haircut.

Page 2 of 6 (138 items) 12345»