At long last I have finished a post on the new DetectAnsweringMachine activity in the latest Beta. While writing it, I found a number of things confusing, so I hope this information helps.
I wrote a post some time ago about detecting answering machines. Basically, my approach was to use a concept grammar to determine whether a response is an answering machine or a live person. The DetectAnsweringMachine activity that ships with Office Communications Server Speech Server works in a somewhat similar way, except that it also takes the length of the utterance into account when deciding whether the response came from a human or an answering machine.
The algorithm works as follows.
1) Using a grammar that you supply, DetectAnsweringMachine recognizes the response and, in particular, measures how long the response lasted. A typical ‘hello’ response takes less than a second.
2) By default, if the length of the utterance is less than one second, it assumes the response was human (this is of course configurable).
3) By default, if the length of the utterance is greater than four seconds, it assumes the response was an answering machine (also configurable).
4) If the length of the utterance was between one and four seconds, it uses the grammar’s classification.
This means that if your grammar returns “answering machine” but the response took less than one second, the activity will still classify it as a human, regardless of what the grammar returned. In practice, we have found that the timing rules on their own are more reliable for answering machine detection than the grammar on its own. However, adding a good grammar should give an improvement of a few percentage points, which in many systems is considerable.
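To make the four steps concrete, here is a minimal sketch of the decision logic in Python. This is not the activity's actual code or API. The names `Detection` and `classify` are made up for illustration, and how the real activity treats an utterance of exactly one or exactly four seconds is an assumption on my part.

```python
from enum import Enum


class Detection(Enum):
    """Hypothetical labels for the two possible outcomes."""
    LIVE_PERSON = "LivePerson"
    ANSWERING_MACHINE = "AnsweringMachine"


# Default thresholds described in the post; both are configurable.
HUMAN_MAX_SECONDS = 1.0
MACHINE_MIN_SECONDS = 4.0


def classify(utterance_seconds, grammar_result,
             human_max=HUMAN_MAX_SECONDS, machine_min=MACHINE_MIN_SECONDS):
    """Combine utterance length with the grammar's classification."""
    if utterance_seconds < human_max:
        # Very short response: assume a person, whatever the grammar said.
        return Detection.LIVE_PERSON
    if utterance_seconds > machine_min:
        # Very long response: assume a machine, whatever the grammar said.
        return Detection.ANSWERING_MACHINE
    # In between (boundaries included here by assumption): trust the grammar.
    return grammar_result


if __name__ == "__main__":
    print(classify(0.6, Detection.ANSWERING_MACHINE))
    print(classify(2.5, Detection.ANSWERING_MACHINE))
    print(classify(7.0, Detection.LIVE_PERSON))
```

The first call shows the case described above: the grammar says "answering machine", but the short utterance wins.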
The confusing part is creating the grammar. Note that if you do not add a grammar to the activity (using the TurnStarting event), the runtime will throw an exception. Between going through the research of some others on my team and doing my own research, I created an HMIHY grammar that you can use for the classification. HMIHY grammars still work best here because you are unlikely to guess the actual text of the response; you really need to generate a background model to recognize the utterance, which is what HMIHY grammars do nicely.
To create an HMIHY grammar, first open a .gbuilder file in the conversational grammar builder, right click on the answers node, and select ‘Add Concept Answer’. You will need to name this new answer ‘DetectedEntity’. This is very important, and I did not find it mentioned anywhere in the documentation I have. If you do not give your answer this name, you will receive an exception when you run your application.
Next you will need to add two concepts to the answer: ANSWERING_MACHINE_RESPONSE and LIVE_PERSON_RESPONSE. The sentences you use are up to you, but the following are what I started with. Everything from ‘Hi this is John’ through the ‘no longer in service’ messages goes under ANSWERING_MACHINE_RESPONSE; the nine short sentences at the end, starting with ‘This is John speaking’, go under LIVE_PERSON_RESPONSE.
Hi this is John
you have reached
five five five one two five three
we are sorry we are not able to answer your phone
please leave your name telephone number and a brief message after the beep
get back to you as soon as possible
thank you for calling
please leave a voice message for
when you are finished recording hang up or press pound for more options
after the tone please record your message
Hi this is
we are sorry we are not able to answer your phone Thanks bye
please leave your name telephone number and a brief message after the beep
leave your name and number I'll get back to you when I can
I am not here right now please leave your message
are not able to come to the phone be happy to call you back as soon as we can
to leave a fax please start transmission To leave a voice message please speak after the beep
Hi you've reached John if you want to send a fax press the start button on your fax machine after you finish your message
good evening I'm not here now but if you leave a message I'll get you back as soon as possible
Sorry that we can not come to the phone right now record your message after the tone to send numerical page press nine
When you are finished recording hang up or for delivery options press nine Thank you
Hi you've reached five five five one two five three Please leave a message we'll call you as soon as we can Thanks for calling we'll return your call
please leave your name and phone number and we'll call you back Thank you bye
We are unavailable to take your call we'll call you back Have a nice day
we are not home right now Have a great day
no one can come to the phone at this moment leave a message and we'll call you back
leave a message after the beep
After the tone record your message
it's not available you may leave a message after the tone your call is important to us sorry that I missed your phone call
Please leave your message after the tone
please leave us a message
You reached the voice mailbox of Sorry that I missed your call I can't speak with you right now or not in my office I'll get back to you when I return
sorry that I missed your call I'm either on the phone or away from my desk or on the other line
you have reached a voice message system after the tone please record your message
your call has been forwarded to an automatic voice mail system
Please leave a message for
Please leave a voice mail for
Hello we are not at home right now please leave a message after the beep
Hello no one is available to take your call please leave a message after the tone
Hello I'm unavailable to answer your call right now please leave your name number and a message after the tone
Hello we are not available now Please leave your name and phone number after the beep We'll return your call
You have reached a number that does not accept solicitations if you are a solicitor please add this number to your do not call list and hang up now Otherwise please press five or stay on the line
number which does not accept calls from telemarketers All other callers may press five if they wish to complete the call
the number that does not accept calls from telemarketers if you are a telemarketer please add this number to your do not call list and hang up now If you are not a telemarketer press five
We do not accept telemarketing calls Please remove us from the calling list that is required by federal laws
notice all phone solicitors place this name and phone number on your do not call list personal or invited business callers press five on your telephone
I'm sorry your call can not be completed at this dial Please check the number and dial again or call your operator to help This is a recording
The number five five five one two five three has been disconnected No further information is available about this number
This number is not in service
You have reached an invalid D. I. D. number Please try your call again
The number you have reached is not in service If you feel you have reached this recorded message in errors Please check the number and try your call again
We are sorry Your call can not be completed at this dial Please check the number and dial again Or call your operator to help This is a recording
The number you have reached five five five one two one two is not in service If you feel you have reached this recorded message in errors Please check the number and try your call again Thank you
Please note the area code and dial your call again
The number five five five one two one two has been disconnected No further information is available about this number
Sorry the number you have dialed is no longer in service
the number you have dialed is not in service at this time Please check your number or call your operator for assistance
We are sorry You have reached a number that has been disconnected or no longer in service
This is John speaking
I am fine
I am OK
Who is it may I help you
Hello May I ask who is calling
Hey what's up man
Hello how are you
hello this is Mary speaking
Hello there mate
Note that the conversational grammar builder supports copying in multiple phrases at once, so that is the easiest way to add these sentences to your .gbuilder file.
If you look through these sentences, you will probably notice that several of the answering machine sentences actually come from privacy systems and ‘no longer in service’ messages. Ideally we would be able to tell apart a person, an answering machine, a privacy system, and a number that is no longer in service. The reality is that when you use the DetectAnsweringMachine activity, the grammar can only return the two semantic results above; any other result will cause an exception. So in this case we just classify them all as answering machines.
If your situation requires that you determine whether you have encountered a privacy system or a number no longer in service, it would not be difficult to use the same algorithm DetectAnsweringMachine uses in a custom activity that understands other semantic results. If you have difficulties with this, you may be able to butter me up with a few nice comments on my blog, but keep in mind how long it took me to write this post in the first place. :)
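As a rough starting point for such a custom activity, here is one way the same timing rules could be stretched to cover more categories, sketched in Python. Everything in it is hypothetical: the `CallResult` labels, the `classify_extended` function, and especially the choice to keep the grammar's answer for long utterances when it names some kind of automated system. That last rule is my own assumption, not something DetectAnsweringMachine does.

```python
from enum import Enum


class CallResult(Enum):
    """Hypothetical set of semantic results for a custom activity."""
    LIVE_PERSON = "live person"
    ANSWERING_MACHINE = "answering machine"
    PRIVACY_SYSTEM = "privacy system"
    NOT_IN_SERVICE = "not in service"


# Results that mean "no human picked up".
AUTOMATED = {
    CallResult.ANSWERING_MACHINE,
    CallResult.PRIVACY_SYSTEM,
    CallResult.NOT_IN_SERVICE,
}


def classify_extended(utterance_seconds, grammar_result,
                      human_max=1.0, machine_min=4.0):
    """Same timing thresholds as before, with more possible outcomes."""
    if utterance_seconds < human_max:
        # Short utterance: treat as a person regardless of the grammar.
        return CallResult.LIVE_PERSON
    if utterance_seconds > machine_min:
        # Long utterance: some kind of recording. Keep the grammar's guess
        # if it names an automated system (assumption), otherwise fall back
        # to a plain answering machine.
        if grammar_result in AUTOMATED:
            return grammar_result
        return CallResult.ANSWERING_MACHINE
    # In between: use whatever the grammar returned.
    return grammar_result


if __name__ == "__main__":
    print(classify_extended(8.0, CallResult.NOT_IN_SERVICE))
    print(classify_extended(8.0, CallResult.LIVE_PERSON))
    print(classify_extended(0.7, CallResult.PRIVACY_SYSTEM))
```

The grammar for this would need one concept per category instead of lumping the privacy system and ‘no longer in service’ sentences in with the answering machine ones.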
To determine the result of the activity, check its DetectionResult property. One thing to note is that the DetectionResult enumeration includes a value called OtherAutomata. You can safely ignore this value, as it is never used. You will only come across None (if the activity has not yet run), AnsweringMachine, or LivePerson.
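For illustration, here is how the handling of those values might look, again sketched in Python rather than against the real API. The enum below only mirrors the four names mentioned above (with `NONE` standing in for None, which is a reserved word in Python), and the strings returned by `describe` are placeholders of my own.

```python
from enum import Enum, auto


class DetectionResult(Enum):
    """Stand-in mirroring the value names mentioned in the post."""
    NONE = auto()              # activity has not run yet
    ANSWERING_MACHINE = auto()
    LIVE_PERSON = auto()
    OTHER_AUTOMATA = auto()    # defined but, per the post, never returned


def describe(result):
    """Map a detection result to a placeholder description."""
    if result is DetectionResult.LIVE_PERSON:
        return "person answered"
    if result is DetectionResult.ANSWERING_MACHINE:
        return "machine answered"
    if result is DetectionResult.NONE:
        return "detection has not run yet"
    # OTHER_AUTOMATA should never occur; flag it rather than guess.
    return "unexpected result"


if __name__ == "__main__":
    for value in DetectionResult:
        print(value.name, "->", describe(value))
```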