Grunt recognition

Wired News is carrying a story today on a research project for identifying and classifying animal communication using speech recognition technology developed for humans. The most interesting thing about the Dr Dolittle Project is that the researchers are applying a framework of contextual and behavioural information:

Researchers feed observations and sound recordings into a computer that uses algorithms to analyze and group the sounds according to what the scientists would like to study (for example, calls elephants make while playing). Each sound can also be matched with an individual label, marking things like the animal it came from.

Now, researchers are adding more sounds to the computer database without telling the computer associated behaviors or which animals they come from. By doing this, they hope the computer can learn to determine what the sounds mean, said Pete Scheifele, an assistant professor in residence at the University of Connecticut...

This use of contextual knowledge to learn semantics is something that human-geared recognition systems don't do nearly enough of. (Try substituting the word 'human' for all the animals in the article, and you'll get a laugh, but also a sense of how a behaviour-based communication framework could be useful.) When did you last interact with a system that sensibly applied some knowledge of your context - the 'state' of your business with the company; the fact that you called a minute ago; your tendency to barge in on prompts you have heard before; your immediate dialog history...? Sure, some dialog context is hard-wired into grammars specific to dialog states, but those grammars are probably the same for every user, and not geared to you and your context.
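As a rough illustration of what "applying context" could look like, here is a minimal Python sketch that re-ranks a recognizer's n-best list using what is known about the caller. Everything in it is an assumption made for illustration: the `CallerContext` fields, the intent names, and the boost factors are hypothetical, and none of it is the Microsoft Speech Server API or any real recognizer's interface.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CallerContext:
    """Hypothetical per-caller context; every field name is illustrative."""
    open_order: bool = False                         # the 'state' of the caller's business
    seconds_since_last_call: float = float("inf")    # did they call a minute ago?
    recent_intents: List[str] = field(default_factory=list)  # immediate dialog history


def context_prior(intent: str, ctx: CallerContext) -> float:
    """Return a multiplicative boost for an intent, given the caller's context.

    The boost values (2.0, 1.5) are arbitrary placeholders, not tuned numbers.
    """
    boost = 1.0
    if ctx.open_order and intent == "order_status":
        boost *= 2.0
    if ctx.seconds_since_last_call < 120 and intent in ctx.recent_intents:
        boost *= 1.5
    return boost


def rescore(n_best: List[Tuple[str, float]], ctx: CallerContext) -> List[Tuple[str, float]]:
    """Re-rank (intent, confidence) pairs by confidence * prior, normalized to sum to 1."""
    weighted = [(intent, conf * context_prior(intent, ctx)) for intent, conf in n_best]
    total = sum(score for _, score in weighted)
    if total == 0:
        return list(n_best)  # nothing to normalize; leave the list as it was
    ranked = [(intent, score / total) for intent, score in weighted]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    hypotheses = [("new_order", 0.5), ("order_status", 0.4), ("agent", 0.1)]
    caller = CallerContext(open_order=True,
                           seconds_since_last_call=60,
                           recent_intents=["order_status"])
    for intent, score in rescore(hypotheses, caller):
        print(f"{intent}: {score:.2f}")
```

With no context the ranking is whatever the recognizer produced. For a caller with an open order who rang a minute ago about that order, the same acoustic evidence now favours "order_status". A real system would presumably learn these weights from logged calls rather than hard-code them, which is where analytics comes in.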

This is something I've been thinking about for a while - analytics tools are all about learning from the data, after all. I'd be very interested to know if you've thought about scenarios that would benefit from an application of broader contextual information, and how the Microsoft Speech Server tools could help you realize them.

Finally, it's nice to know that speech recognition in the animal kingdom faces an all too familiar problem:

"Just discriminating animal sounds from background noise can be a considerable challenge," [a researcher] said.

Published Thursday, July 27, 2006 4:30 PM by Stephen Potter

Comments

Thursday, July 27, 2006 8:52 PM by BlakeHandler

# re: Grunt recognition

I wonder if this can be used to understand the grunts of my daughter and her teenage friends? (^_^)

Thursday, July 27, 2006 9:04 PM by Stephen Potter

# re: Grunt recognition

Haha, I'm sure they could put together some pretty realistic models in no time at all. :-)  

(My daughter's only 2, and I can say now that context is everything; grammars alone would get me nowhere...)