I am a Software Development Engineer in Test working for the Windows Sound team. You can contact me via email: mateer at microsoft dot com
Friend key: 28904932216450_59cd9d55374be03d8167d37c8ff4196b
Some time ago I created a "listen.exe" tool which used SAPI's ISpRecoContext to listen to the microphone and dump any recognized text to the console.
Today I had to debug an issue with SAPI reading from a .wav file, so I updated it to accept a listen.exe --file foo.wav argument; this consumes the audio in the .wav file instead of listening to the microphone.
Pseudocode for the difference:
Also, we have to tell the ISpRecoContext that we're interested in SPEI_END_SR_STREAM events as well as SPEI_RECOGNITION events.
Full source and binaries attached.
A gotcha: the .wav file has to have a WAVEFORMATEX.wFormatTag = WAVE_FORMAT_PCM. If it's anything else, ISpRecoGrammar::SetDictationState fails with SPERR_UNSUPPORTED_FORMAT. Neither WAVE_FORMAT_IEEE_FLOAT nor (WAVE_FORMAT_EXTENSIBLE with SubFormat = KSDATAFORMAT_SUBTYPE_PCM) work.