Mike Cohen (Google) podcast
Spykdog introduced me to an interesting 15-min podcast interview with Mike Cohen, who runs speech recognition development at Google.
- The big recent trend in speech recognition (SR) is concept recognition, as in how-may-I-help-you systems, but these systems are most interesting for cutting the number of menus you have to navigate, not for making other tasks more straightforward.
- Multimodal interfaces will take several more years to take off because the web infrastructure isn't there yet, so in the meantime it's better to focus on those conversational-type systems.
- He doesn't have an opinion about whether we'll see more hosted speech systems (the "in-the-cloud" environments like Angel.com) or whether it's better to run your own server.
- Speech development is so complicated that it's unclear whether packaged dialog or grammar libraries help much -- you will still need lots of experts for customization.
- The best speech hires are computational linguists, because an intuitive feel for language is often more relevant than computer skills.
We'll see where this goes. I'm a little surprised Google isn't more interested in predictive AX modeling, but then this was only a short interview.