Giving Computers a Voice
This article demonstrates how easy it is to enable managed applications to finally talk back using Microsoft's Speech API (SAPI).
Matt Harrington
Difficulty: Easy
Time Required: 1-3 hours
Cost: Free
Hardware:
Summary
Most of us talk to our computers all the time. You wouldn't believe the things my wife says to her poor machine when something goes wrong. This article demonstrates how easy it is to enable managed applications to finally talk back using Microsoft's Speech API (SAPI). But don't worry: since you control what the application says, you can ensure it uses nicer language than my wife.
SAPI
SAPI is the speech API that gives applications access to speech recognition and text-to-speech (TTS) engines. This article focuses on TTS. For TTS, SAPI takes text as input and uses the TTS engine to output that text as spoken audio. This is the same technology used by the Windows accessibility tool, Narrator. Every version of Windows since XP has shipped with SAPI and an English TTS engine.
TTS puts the user's ears to work. It lets applications deliver information without requiring the user's eyes or hands. This is a powerful output option that is rarely used on PCs.
Three steps are needed to use TTS in a managed application:
- Create an interop DLL
Since SAPI is a COM component, an interop DLL is needed to use it from a managed app. To create this, open the project in Visual Studio. Select the Project menu and click Add Reference. Select the COM tab, select "Microsoft Speech Object Library" in the list, and click OK. These steps add this reference to your project and create an Interop.SpeechLib.dll in the same folder as your executable. This interop DLL must always be in the same folder as your .exe to work correctly.
- Reference the interop namespace
Include this namespace in your application. In C#, add "using SpeechLib;"; in VB, add "Imports SpeechLib".
- Call Speak()
Create a SpVoice object and call Speak():
Visual C#
SpVoice voice = new SpVoice();
voice.Speak("Hello World!", SpeechVoiceSpeakFlags.SVSFDefault);
Visual Basic
Dim voice As New SpVoice()
voice.Speak("Hello World!", SpeechVoiceSpeakFlags.SVSFDefault)
That's it! The downloads for this article are simple C# and VB.NET "hello world" samples to try out.
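The three steps above can be pulled together into one small console program. This is a minimal sketch, not the code from the article's downloads: the class name HelloSpeech and the 10-second timeout are illustrative assumptions, and it assumes the Interop.SpeechLib.dll reference has already been added as described. It also shows the SVSFlagsAsync flag and the WaitUntilDone() method from SAPI's automation interface, which the article itself does not use, as one way to speak without blocking the caller.

```csharp
using System;
using SpeechLib; // namespace exposed by the generated Interop.SpeechLib.dll

// Hypothetical class name, for illustration only.
class HelloSpeech
{
    static void Main()
    {
        SpVoice voice = new SpVoice();

        // Synchronous: Speak() returns only after the text has been spoken.
        voice.Speak("Hello World!", SpeechVoiceSpeakFlags.SVSFDefault);

        // Asynchronous: Speak() queues the text and returns immediately,
        // so the application can keep working while the voice talks.
        voice.Speak("This sentence is spoken in the background.",
                    SpeechVoiceSpeakFlags.SVSFlagsAsync);

        // Wait up to 10 seconds (an arbitrary choice) so the process
        // does not exit before the queued speech has finished.
        voice.WaitUntilDone(10000);
    }
}
```

The asynchronous form is likely the better fit for background notifications, since a long sentence would otherwise hold up the calling thread until it finishes.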
Another Example
One of the best uses of TTS is for audio notifications. The System Monitor sample on coding4fun is a perfect candidate to be TTS-enabled. It informs a user when a particular system event has occurred through a message box, system tray balloon tip, or email. I'll use its extensible notification infrastructure to add TTS to this list.
First, create an interop DLL and add a reference to the namespace using the steps above.
Second, add this file to the SystemMonitor project:
SpeechNotifier.cs:
Visual C#
using System;
using SpeechLib;

namespace SystemMonitor.Notifiers
{
    // Notifier that speaks the event title through the default TTS voice.
    class SpeechNotifier : NotifierBase
    {
        private SpVoice _voice;

        public SpeechNotifier()
        {
            _voice = new SpVoice();
        }

        // Only the title is spoken; the longer message is ignored to keep it short.
        public override void Execute(string title, string message)
        {
            string text = "System monitor warning! " + title;
            _voice.Speak(text, SpeechVoiceSpeakFlags.SVSFDefault);
        }
    }
}
SpeechNotifier.vb:
Visual Basic
Imports System
Imports SpeechLib

Namespace SystemMonitor.Notifiers
    ' Notifier that speaks the event title through the default TTS voice.
    Class SpeechNotifier
        Inherits NotifierBase

        Private _voice As SpVoice

        Public Sub New()
            _voice = New SpVoice()
        End Sub

        ' Only the title is spoken; the longer message is ignored to keep it short.
        Public Overrides Sub Execute(ByVal title As String, ByVal message As String)
            Dim text As String = "System monitor warning! " & title
            _voice.Speak(text, SpeechVoiceSpeakFlags.SVSFDefault)
        End Sub
    End Class
End Namespace
Third, add this line in the App.config file to any monitors you want to be spoken:
<notifier type="SystemMonitor.Notifiers.SpeechNotifier,SystemMonitor" />
For example:
<monitor runFrequency="00:04" type="SystemMonitor.Monitors.DiskSpaceMonitor,SystemMonitor">
  <settings>
    <setting name="driveLetter" value="C" />
    <setting name="freeMegabytes" value="10000" />
  </settings>
  <notifiers>
    <notifier type="SystemMonitor.Notifiers.SpeechNotifier,SystemMonitor" />
  </notifiers>
</monitor>
This example will say "System monitor warning! Disk space low." when the space on c:\ drops below the specified value.
General TTS Tips
- Keep it short. The voice used by the default XP TTS engine is not exactly soothing. Most users would not want to hear it speak long texts, but it often works well for conveying short, functional pieces of information.
- Have a preface. Many TTS apps are designed to be run in the background and speak audio when the user is focused on a different application. In this case it's helpful to let the user know the context. It wouldn't make sense if your computer randomly said "Bill Gates," but you would know what "New mail from Bill Gates" meant.
- Keep it "speakable." Test out the text your app intends to say. The default TTS engine has some limitations you can avoid. For example, "c:\" would be pronounced "see colon backslash". This can be avoided by speaking just "c" instead.
- Change defaults. The speech control panel allows the user to select the default TTS engine (for computers with more than one installed) and set the default speaking rate.
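The "have a preface" and "keep it speakable" tips can be applied in one small helper that prepares text before it is handed to Speak(). This is only a sketch under stated assumptions: the SpeakableText class, its Build() method, and the choice to read a bare drive root as "drive C" (rather than just "c") are all hypothetical and are not part of SAPI or the System Monitor sample.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only; not part of SAPI.
static class SpeakableText
{
    // Matches a bare drive root such as "c:\" that is not followed by
    // more of a path, so "c:\windows" is left untouched.
    private static readonly Regex DriveRoot =
        new Regex(@"\b([A-Za-z]):\\(?![\w\\])");

    // Prepends a short context phrase and rewrites drive roots so the
    // engine does not read them out as "see colon backslash".
    public static string Build(string preface, string message)
    {
        string cleaned = DriveRoot.Replace(
            message,
            m => "drive " + m.Groups[1].Value.ToUpperInvariant());
        return preface + " " + cleaned;
    }
}
```

For example, SpeakableText.Build("New mail from", "Bill Gates") returns "New mail from Bill Gates", and SpeakableText.Build("System monitor warning!", @"Disk space low on c:\") returns "System monitor warning! Disk space low on drive C". The result can then be passed to voice.Speak() exactly as in the earlier samples.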
Cool project ideas
The TTS functionality on PCs is an under-used resource with much promise. Here are some more project ideas:
- GotDotNet has the source code of a few email and RSS readers. Reading the title of a message as it arrives in the background would be a great use of TTS.
- The System Monitor sample above just scratches the surface of event notifications. That application could be the foundation for monitoring any kind of system resource or application event. And with TTS you can let your ears do the work while your hands and eyes are busy.
- The "It's Hot in Here" coding4fun sample uses a web service to read weather forecasts. Don't bother checking a web page for the weather; just modify this app to read the forecast aloud when you press a hotkey, while your eyes do more important things. TTS is great for any similar web service update application: news headlines, stock quotes, sports scores, auction prices, and more.
- Perhaps more importantly, that same article also introduces us to Phidgets. Upstage the guy next door by sending your Phidget battlebot into the fray with a TTS "You killed my father, prepare to die!".
What's next?
This article outlines how you can use TTS on PCs today. But the real story is what's coming up.
WinFX will contain a fully managed speech API. No more ugly COM interop. Developers will get all the benefits of a managed API, including full IntelliSense support.
The default TTS engine on Windows Vista (formerly Windows code name "Longhorn") is much better than the default XP engine. Check out some of the resources below for updates as Windows Vista gets closer to shipping.
Resources
Here are some blogs from members of the Speech Components Group at Microsoft: