Dictomail: automatic voicemail transcription

I just tried out a very cool system from http://www.dictomail.com/, using an interactive demo hosted by a company called Admiral Online. The demo tells you to call a phone number, leave any message, and hang up. Here is how it transcribed my message:
ANI:4255551212 HI JENNIFER, THIS IS RICHARD SPRAGUE. I'M JUST CALLING TO ASK YOU A COUPLE QUESTIONS ABOUT THE MEETING AND TO TELL YOU THAT THE RAIN IN SPAIN FALLS MAINLY ON THE PLAIN. YOU CAN REACH ME AGAIN AT 2136782217. I LOOK FORWARD TO HEARING FROM YOU, BYE., -- Your voice message was translated by DictoMail. Receive messages on your cellphone and desktop. Call 818-206-0775 to order service now.
The transcription was perfect, in spite of the following ways I tried to fool it:
  • "the rain in spain" thing, spoken quickly and with blurred speech.
  • the return phone number (which I made up) spoken with lots of 'ums' and repeated digits. I actually said "two one three, um, six seven eight, ah, two--um--two two one seven."
  • "I look forward..." spoken as quickly as I could
Obvious questions I'd like answered: where did they get their ASR engine? Why didn't they use Microsoft's engine? And how did they make such a cool demo?
Published 28 January 05 05:43 by sprague

Comments

# Jerry Dennany said on January 28, 2005 7:30 AM:
I can answer one of your questions - They didn't use Microsoft's engine because it sucks. It's slow, and doesn't do as good of a job as several other solutions on the market.
# Charles Cook said on January 28, 2005 7:52 AM:
ASR then corrected by people? Hence the delay. Or just people?
# Ron said on January 28, 2005 9:46 AM:
If I were them, I'd flag passages like the iffy phone number for a human to QA. I would prioritize iffy passages based on, say, randomization (don't do all of them) and how important the message is to transcribe (important customer?).
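Ron's triage idea can be sketched roughly as follows. This is only an illustration, not DictoMail's actual pipeline: the `Segment` type, the per-segment `confidence` score, the `importance` weight, and the threshold and sampling values are all assumptions made up for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    confidence: float  # hypothetical ASR confidence score in [0, 1]


def select_for_review(segments, importance, threshold=0.8, sample_rate=0.5, rng=None):
    """Pick low-confidence segments to send to a human for QA.

    Only a random sample of the iffy segments is chosen, and the
    sampling probability grows with the message's importance.
    """
    if rng is None:
        rng = random.Random()
    # Scale the sampling rate by importance, capped at certainty.
    probability = min(1.0, sample_rate * importance)
    return [
        s for s in segments
        if s.confidence < threshold and rng.random() < probability
    ]


segments = [Segment("HI JENNIFER", 0.95), Segment("2136782217", 0.40)]
flagged = select_for_review(segments, importance=2.0)
```

With `importance=2.0` the sampling probability reaches 1.0, so every segment under the threshold is flagged; a lower importance would flag only some of them.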
# Richard Sprague said on January 28, 2005 2:52 PM:
But tell me WHY the MS engine sucks. All of us in the MS group want to improve our accuracy. Please give us examples and we'll look.

Note that the vast majority of problems with SR accuracy are caused by things like poor microphone quality or the error correction UI--yes, we know and we're trying to fix them. But head-to-head under apples-to-apples conditions, we think we're pretty good--at least we want to be, and we love to hear about counter-examples.

Richard Sprague
# Brent said on February 1, 2005 4:56 PM:
I have no serious experience with the MS ASR engine, but I'm sure it's competent from my hobby time with it. But you know, I don't think it matters which SR engine they're using, because they are obviously correcting it with a manual processing step.
# Sean said on February 18, 2005 4:36 PM:
After using the demo, I too was convinced that a human being was handling the translation. I was actually called by a DictoMail sales person right after the demo. The first question I asked was whether a human being was involved. She said she could not comment because that was part of their proprietary technology (i.e., a big fat yes). What I want to know is whether someone has gone through their terms of service to see if they notify the user that someone will be listening to their messages. To highlight that point, I sent an email this morning asking DictoMail to confirm whether a human being is translating the messages, because of privacy concerns on my end. By the way, the sales person stated that many doctors use the service. I am sure there must be some HIPAA violations related to that if a human being is involved!!!
# HAB said on February 21, 2005 11:40 AM:
Well, I've been around with them on this issue because of HIPAA concerns. They assure me that there is a combination of techniques used to translate the messages, including a computerized voice recognition engine, coupled with some human review of certain words or phrases that do not translate. I was assured that no single human reviews an entire message, nor do they have access to where the messages are coming from. Further, message fragments are randomly distributed to different people at different service centers.
I was also assured that many clients, including major law firms, physicians, and even the US Army, have used the service with assurances of extreme confidentiality. For what it is worth, I've been very impressed with the service so far.
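The fragment-distribution scheme HAB describes might look something like the sketch below. It is a guess at the idea, not DictoMail's implementation: the function name, the reviewer IDs, and the rule for breaking up an accidental single-reviewer assignment are all hypothetical.

```python
import random


def assign_fragments(fragments, reviewers, rng=None):
    """Randomly assign each message fragment to a reviewer.

    Guarantees that no single reviewer receives every fragment,
    so nobody can reconstruct the whole message alone.
    Returns a dict mapping fragment index -> reviewer.
    """
    pool = sorted(set(reviewers))
    if len(pool) < 2 or len(fragments) < 2:
        raise ValueError("need at least two reviewers and two fragments")
    if rng is None:
        rng = random.Random()

    assignment = {i: rng.choice(pool) for i in range(len(fragments))}

    # If chance handed one reviewer everything, move one fragment elsewhere.
    if len(set(assignment.values())) == 1:
        only = assignment[0]
        others = [r for r in pool if r != only]
        assignment[rng.randrange(len(fragments))] = rng.choice(others)
    return assignment


# Hypothetical fragments of one voicemail and two reviewer IDs.
plan = assign_fragments(["HI JENNIFER", "CALL ME AT", "2136782217"], ["r1", "r2"])
```

A real service would also need to strip caller information from each fragment, which this sketch does not attempt.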