I’m delighted to announce that our new managed Speech API is in the Avalon & Indigo Beta 1 RC!
With the System.Speech namespace you can incorporate both speech recognition and speech synthesis in your applications.
Recognition:
The main classes for speech recognition are:
For example, to load a grammar containing your app’s commands into the shared desktop recognizer:
DesktopRecognizer desktopRecognizer = new DesktopRecognizer();
desktopRecognizer.LoadGrammar(new Grammar(new Uri(grammarPath)));
desktopRecognizer.SpeechRecognized += delegate(object sender, RecognitionEventArgs e)
{
    // Do appropriate handling when we get a recognition
    // Console.WriteLine("User said {0}", e.Result.Text);
};
You’ll also need an SR engine installed. There are several ways to get one: Tablets already have an engine, recent versions of Office include one, and you can download one from the SAPI web site: http://www.microsoft.com/speech/download/sdk51/
Synthesis:
The main classes for speech synthesis are:
For example, if you want your app to say “hello world”, just write:
SpeechSynthesizer synth = new SpeechSynthesizer();
synth.Speak("Hello world!");
You can easily splice this with a “ding” wave file by using the PromptBuilder:
PromptBuilder builder = new PromptBuilder();
builder.AddAudio(new Uri(@"file://\windows\media\ding.wav"));
builder.AddText("Hello world!");
SpeechSynthesizer synth = new SpeechSynthesizer();
synth.Speak(builder);
Windows comes with a synthesis engine.
The API uses the W3C standard formats for recognition grammars (SRGS) and synthesis (SSML).
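To give a feel for these formats, here is a minimal sketch of an SRGS grammar file of the kind you might point grammarPath at. The rule name and the three command phrases are made up for illustration; only the element names and namespace come from the W3C SRGS 1.0 specification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical command grammar; the phrases are placeholders -->
<grammar version="1.0" xml:lang="en-US" mode="voice" root="command"
         xmlns="http://www.w3.org/2001/06/grammar">
  <!-- Root rule: matches exactly one of the listed phrases -->
  <rule id="command" scope="public">
    <one-of>
      <item>open file</item>
      <item>save file</item>
      <item>close window</item>
    </one-of>
  </rule>
</grammar>
```

On the synthesis side, the “ding” plus “Hello world!” prompt could be written in SSML roughly as follows. The audio path is a placeholder, and I am assuming the engine can resolve it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" xml:lang="en-US"
       xmlns="http://www.w3.org/2001/10/synthesis">
  <!-- Placeholder path: play a wave file, then speak the text -->
  <audio src="ding.wav"/>
  Hello world!
</speak>
```

Because both are plain XML, you can author and version them separately from your application code.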