Speech recognition in Vista works well for me. Dictation accuracy is very high - especially since I flicked the switch to train it on my emails and documents - and the correction experience is smooth and efficient. But I hardly use it. I find it very difficult to dictate to my computer.
WTF? Typing is easier than speaking?
Yes. I can't speak the way I write. If I'm typing, I'll begin a sentence without knowing how it's going to end; future phrases form as I finish typing previous ones. I'll pause, go back, select text, delete it, write again. It feels almost as if hitting the keys is a direct extension of the thought process.
If I'm dictating, there's no such flow. I'll begin a sentence (having got over the minor panic triggered by the Listening microphone UI over a blank document) and the phrase will appear - correctly - on my page, but saying it out loud has corrupted the thoughts that would have helped complete it. I have to step back and think hard about what should come next. Every phrase forces a little mental reboot.
I think what's going on - in addition to my inability to think in complete sentences - is that my thought processes while typing have habituated themselves to the gated speeds of my motor operations. The extra time is valuable and they use it to do the work of sounding out the current phrase in context, thinking up the next phrase, and so on. So typing actually greases the wheels of forming the right words and sentences. That's missing when I dictate, and I find it quite paralyzing.
Surely many keyboard users, even very slow typists, will be reluctant to move to speech recognition systems - even high-accuracy ones - because of the thinking on the side that we seem to do while hitting keys. The standard words-per-minute (wpm) metric for text input is usually measured by copying pre-existing text, so the effort of composing the text is assumed to be the same for every input method. But my wpm drops dramatically when I dictate, because the composing processes that work while I type are unavailable to me. So what does dictation wpm look like for someone who does have them available?
Many users of speech recognition (and of dictation-taking secretaries) have obviously overcome this. Most visibly, Richard Powers, the novelist who wrote the 2006 National Book Award winner using speech recognition on his Tablet PC, trained himself to dictate after years of typing:
"I needed weeks to get over the oddness of auditioning myself in an empty room, to trust to the flow of speech, to learn to hear myself think all over again."
He broke through - and argues now that typing is the obstacle to the thought process:
"What could be less conducive to thought’s cadences than stopping every time your short-term memory fills to pass those large-scale musical phrases through your fingers, one tedious letter at a time? You’d be hard-pressed to invent a greater barrier to cognitive flow."
Is the grass really greener over there? I'd love to know. If anyone has done it, please send a comment. I'm also going to try it out for myself. Over the next few weeks, I'll be training myself to think aloud.