Some time ago I created a "listen.exe" tool which used SAPI's ISpRecoContext to listen to the microphone and dump any recognized text to the console.

Today I had to debug an issue with SAPI reading from a .wav file, so I updated it to accept a listen.exe --file foo.wav argument; this consumes the audio in the .wav file instead of listening to the microphone.

Pseudocode for the difference:


Also, we have to tell the ISpRecoContext that we're interested in SPEI_END_SR_STREAM events as well as SPEI_RECOGNITION events.

Full source and binaries attached.

A gotcha: the .wav file has to have a WAVEFORMATEX.wFormatTag = WAVE_FORMAT_PCM. If it's anything else, ISpRecoGrammar::SetDictationState fails with SPERR_UNSUPPORTED_FORMAT. Neither WAVE_FORMAT_IEEE_FLOAT nor (WAVE_FORMAT_EXTENSIBLE with SubFormat = KSDATAFORMAT_SUBTYPE_PCM) work.

EDIT September 22 2015: moved source to github