
Mary-Jo Foley posted an article on her blog on ZDNet yesterday talking about how the RTM buzz for Microsoft Exchange 2010 is growing louder. It's an interesting read.

The Speech team has been busy at work supporting the Exchange team for quite a while, with a new feature for Exchange that allows users to see a text preview of voice mails that arrive in their inbox.

I've been using it personally for almost 2 years, in one form or another. I recently set my own Exchange server up, and redirected my wife's cell phone voice mails to it too. She now never listens to her voice mails directly. This is a milestone for me, because this is one of the best examples of my work life impacting my personal life, and vice versa. She does give me constructive advice on how to make the system better ... :-)

In most reviews I've read on Exchange 2010 since the public announcement earlier this year, text preview of voice mails is very often listed in the short list of key new features. That's true in Mary-Jo's article also. Here's what she says about Exchange 2010:

The Exchange 2010 release includes new, integrated e-mail archive functionality; the ability to see text previews of voice mail; a new “Conversation View” feature; customizable call-routing menus; and a “MailTips” feature designed to help stamp out e-mail “faux pas.”

Let us know, once you're using Exchange 2010, how you like the voice mail text preview feature. We like hearing from our customers directly. Both what you like, and what you don't like.

I just read a pretty cool post by Joel 'Jaykul' Bennett from HuddledMasses.org where he shows his readers how to use a PowerShell script he wrote to invoke other PowerShell scripts using Speech Recognition. Great idea, Joel. I like it!

Check it out here.

One of the cooler parts of my job is interacting with customers and partners. One such opportunity I have is being on the board of AVIOS, The Applied Voice Input and Output Society. As a member of AVIOS, I've also been involved behind the scenes in the annual Speech Programming Competition.

This year Microsoft is back on the scene as a sponsor, as we have been for the 2 years before. I just posted over on the speech team blog about the contest. You should check it out.

Do you have what it takes to win a free Xbox 360, Visual Studio Professional, or a fancy new Zune?

Last night, at 12:01am, Tellme announced their Spring 09 upgrades to their underlying speech platform. You can read all about it here on Tellme.com.

There have been a ton of news articles about it since the announcement, but I liked what TechCrunch said about the new Zira voice. They said it’s “almost sexy”. I never really thought about TTS voices sounding sexy, but … It made for a good article title anyway. :-)

And … BTW … My team here at Microsoft is really excited about this announcement, too, because our underlying speech technology is powering the new “better speech recognition” and the “almost-sexy new voice called Zira”.

It’s been a real pleasure working with Tellme since the acquisition. I’ve had the good fortune of going down to the Tellme offices in Mountain View regularly working with them on a variety of projects for a while now, and I’m super impressed with all the people I’ve interacted with.

In fact, I’m sitting inside Tellme’s main building typing this right now wondering where I’ll go eat lunch on Castro Street in Mountain View. Will it be Amici’s pizza, or someplace else today?

I've been reading a lot of blog posts lately about bloggers using Speech Recognition technology themselves to write their blog articles. Because some bloggers get paid per post that they create, and some others even get paid per character that they write, Speech Recognition is an obvious choice since it can allow users to "type" up to 160 WPM (words per minute).

Weeding thru my inbox and RSS feeds this morning (there's a lot to read thru since I've been on vacation for a couple weeks) I found this article where Jeff Meundel says on PracticalCommerce.com that Windows Speech Recognition is "perhaps the best and coolest part of [...] Vista."

Thanks Jeff. We like it too. :-)

Are you using Windows Speech Recognition to write blog posts?

This morning I read a blog entry on Wired about a new application coming out from a guy named Scott Forman called ShutterVoice. It sounds pretty cool! Basically, ShutterVoice promises to bring hands-free photography to photographers who are using Canon DSLRs along with a utility from Canon called the EOS Utility.

If you have all three, you can do wireless hands-free photography.

Check out the video here.

A reader recently wrote to us here at Microsoft, and asked us if there was a way to disable the built in command for "Cap [textInDocument]".

Sometimes, for some people, this command can be recognized a little too easily if you're actually trying to insert text that already exists in the document. So ... I thought ... Time for another Macro of the Day!

You can disable the built-in command once this macro is in place by saying "Disable Cap [textInDocument]". You can re-enable it by saying "Enable Cap [textInDocument]".

You can download it here in our macro code gallery, or just take a look at it here:

<speechMacros>

  <!--

    Author:  Rob Chambers [MSFT]
    Contact: listen@microsoft.com

    ================
    What can I say?
    ================

      Disable Cap textInDocument
      Enable Cap textInDocument

    ================
    How does it work
    ================

      For some people, at times, one of the built-in
      Vista commands ("Cap [textInDocument]") gets accidentally
      recognized when it shouldn't.

      This macro is a workaround for that problem. It "hijacks"
      the recognition for "Cap [textInDocument]" (if the command
      is enabled) and instead just inserts the text in the document.

      The macro also demonstrates how to have another set of
      commands that enables and disables a 3rd command...
    -->

<command>
<listenFor>Disable cap text in document</listenFor>
<setState name="disableCapTextInDocument" value="1"/>
</command>

<command>
<listenFor>Enable cap text in document</listenFor>
<setState name="disableCapTextInDocument" />
</command>

<command priority="100">
<stateIsSet name="disableCapTextInDocument" value="1"/>
<listenFor>cap [textInDocument]</listenFor>
<setTextFeedback>Inserting text. Say Enable "Cap [textInDocument]" to re-enable...</setTextFeedback>
<insertText>{[textInDocument]}</insertText>
</command>

</speechMacros>

I've got two microphones that I use most of the time.

  1. An Andrea SoundMAX SuperBeam 2 element array-microphone from Andrea Electronics
  2. An xTag wireless microphone from RevoLabs

I really love both of them, and I use them interchangeably throughout the day.

If I'm doing quick command and control stuff while I'm typing, I prefer to use the Andrea microphone. That way, I can just mash the <ctrl-windows> key, say what I want (like "Insert my public signature") and wishes come true.

However, if I'm in a more relaxed position, doing dictation, or not sitting near my PC (I have a couch in my office :-)), I'll use my wireless xTag microphone.

But ... Having two different microphones causes me a bit of a problem sometimes. I have to actually switch back and forth between the two microphones. That's certainly possible in Vista with Windows Speech Recognition, but it's not as easy as I wanted it to be.

Thus ... A new "Macro of the day!" was born.

Today's macro allows you to say one of the following things:

Microphone
Switch microphone

[microphone] microphone
Switch to [microphone]
Switch to [microphone] microphone

The first two commands will simply bring up a dialog and let you pick which microphone you want to use. The 2nd group of commands will automatically switch to the microphone you want, simply by saying "Andrea microphone" or "xTag microphone".

Now... I still have to use one of the microphones to switch to the other one, but now it's a seamless transfer.

Here's the macro both in copy/paste form, as well as a direct link to my macro library on MSDN here:

<speechMacros>

  <!--

  Author:  Rob Chambers [MSFT]
  Contact: listen@microsoft.com

  ================
  What can I say?
  ================

    Microphone
    Switch microphone

    [microphone] microphone
    Switch to [microphone]
    Switch to [microphone] microphone

  ================
  How does it work
  ================
  This macro demonstrates how to build a dynamic rule based on
  the audio inputs that the Speech API (SAPI) knows to exist
  on the PC.

  Upon recognition, it will disambiguate the microphone name
  if it's ambiguous (or not included in the phrase), and then
  ultimately set the current microphone to use.

  -->

<command priority="100">
  <listenFor>Microphone</listenFor>
  <listenFor>*+ microphone</listenFor>
  <listenFor>[microphone] microphone</listenFor>
  <listenFor>Switch microphone</listenFor>
  <listenFor>Switch to *+ microphone</listenFor>
  <listenFor>Switch to [microphone] ?microphone</listenFor>
  <disambiguate title="Which microphone do you want to use?" prompt="Choose a Microphone" timeout="15" propname="TokenId"/>
  <script language="VBScript">
    <![CDATA[

      ' Get the token id that was stored as a semantic property by the rule script below,
      ' as well as the text that was spoken in this utterance that generated the token id semantic property
      strMicrophoneTokenId = "{[TokenId]}"
      strMicrophoneName = "{[*TokenId]}"

      ' If there was no matching microphone spoken (e.g. if the user just said, "Microphone")   
      If (strMicrophoneTokenId = "") Then

        ' Find the rule generator script below, and tell it to update the list of microphones on the system
        Set audioInputs = CommandSet.RuleGenerators("microphone").Script.UpdateMicrophones()

        ' Loop thru each of those inputs, and prepare to ask the user which to use with a ChooseFromList object
        For i = 0 to audioInputs.Count - 1
          Call ChooseFromList.Items.AddItem(audioInputs(i).Phrase, audioInputs(i).Property)
        Next
        ' Go ahead and ask the user which microphone to use
        audioInputIndex = ChooseFromList.Choose("Change the default audio input from " + Result.RecoContext.Recognizer.AudioInput.GetDescription() + " to:", "Change audio input")

        ' If the user made a selection, update the token id and what the text for that token is
        If (audioInputIndex >= 0) Then
          strMicrophoneName = audioInputs(audioInputIndex).Phrase
          strMicrophoneTokenId = audioInputs(audioInputIndex).Property
        End If

      End If

      ' If we know what microphone to switch to now...
      If (strMicrophoneTokenId <> "") Then

        ' Tell the speech ux to go to off mode (273 is WM_COMMAND, and 102 is "Off" for the speech ux in Vista)
        Call Application.SendMessage("MS:SpeechTopLevel", "", 273, 102, 0)

        ' Update the default for the input category to the new microphone token id
        Result.RecoContext.Recognizer.GetAudioInputs()(0).Category.Default = strMicrophoneTokenId
        ' Let the user know what we did, but automatically time out after 1 second
        Call Application.Alert("Switched to " & strMicrophoneName , "Microphone", 1)

      End If

    ]]>
  </script>
</command>
<ruleScript name="microphone" propname="TokenId" language="VBScript">
  <![CDATA[

    ' Update the list of Microphones right now
    Call UpdateMicrophones()

    Function UpdateMicrophones()

      ' Clear all the items in case we're updating the list
      Call Rule.Items.RemoveAll

      ' Create a shared recognizer, get the audio inputs that recognizer can use
      ' and add each of the inputs as a phrase to the rule (using the token id as the property)
      Set recognizer = CreateObject("SAPI.SpSharedRecognizer")
      Set audioInputs = recognizer.GetAudioInputs()
      For i = 0 to audioInputs.Count - 1
        Call Rule.Items.AddItem(audioInputs(i).GetDescription(), audioInputs.Item(i).Id, True)
      Next

      ' Commit the changes to the rule now, and return the items to the caller. This enables
      ' the command's script to use this function to update the list and return the list
      ' in a single call/function.
      Call Rule.Commit
      Set UpdateMicrophones = Rule.Items
    End Function

  ]]>
</ruleScript>

</speechMacros>

It’s official. Microsoft’s audio indexing solution (born out of Microsoft Research) is now online as a part of a Washington State pilot program aimed at making audio recordings from 1973 to present available to the public, with an easy-to-use search interface.

You can read more about it here or you can just play with it here. You can read the official press release here.

I just tried it out, and it worked great!

I love seeing the technology transfer from research to product groups, and research using existing technology off the shelf from the product groups.

Nice job Microsoft Research. Nice job Speech team.

Many of you know that we've been working on our Windows Speech Recognition Macros utility for a while now.

We released the first technical preview in April, a refresh in August/September, and we're continuing to refine the technology that will ultimately lead to a full release sometime soon.

One of the difficulties users have faced is the lack of good documentation for WSR Macros. We're working on completing our documentation and will include it with the product in the future, but we're not quite done yet.

But ... A macro enthusiast in the community has come to the rescue for WSR Macros users.

Enter Brad Trott (from Marty Markoe's mymsspeech.com web site) and his latest efforts: WSRMacros: The User Guide.

WSRMacros: The User Guide is a 70-page electronic book chock full of insightful thoughts and ideas on how to use Windows Speech Recognition Macros.

If you're curious how to build powerful macros, and you have $9.95 to spare, it's likely well worth it to purchase your very own copy here.

You can also see other products mymsspeech.com offers for Windows Speech Recognition (including instructional videos, toolkits, voice recorders, and some of the best microphones you can find on the web) here.

If you want to start developing on Speech Server 2007, you should probably start off using the Speech Server 2007 Developer Edition – a free, yet fully functional version of the retail product. You can download it here. Please be sure to follow the steps in the Installation Notes exactly, otherwise the installation may fail.

The Developer Edition is only free for developing your application. You'll still need to purchase Office Communications Server 2007 licenses once you're ready to deploy your application in a non-test deployment. You can read more about Speech Server Licensing here.

Speech Server 2007 works with Visual Studio 2005, not Visual Studio 2008. After installing Speech Server 2007, you'll also need to install a patch for a bug that occurs when you have .NET 3.0 SP1 or Vista SP1. You can download that patch here.

For support, you can use the Speech Server 2007 Community Support forums here. For bugs, please file the bug in the ECS-UC-Dev queue, or under OCS 2007.

Documentation is found here. White papers can be found here under the technical articles section. For example, here is a white paper on capacity planning.

The Speech Server (2007) MOM pack can be found here.

Books:

Additional web resources:

I haven't tried it out personally, but over on the Habitually Good blog, Vaibhav says that Google's new Chrome browser doesn't work with Windows Speech Recognition in Windows Vista. That's too bad... IE7, IE8 Beta 2, and Firefox all work great with it.

Hey Google: What's up? If any Google Devs want to learn more about how to add support for SR, let me know. I'll point you to the documented interfaces for doing that.

One of the joys of being a parent is helping your kids learn. I'm very lucky, in that regard, because my kids love learning. In fact, my son Zac really loves Math. He gets that from both his Mom and his Dad.

So, when he was a little guy, I made a Word macro to make math worksheets. At first this was just for addition. I'd load Word up, tell the macro that I wanted the worksheet to be for +1, it'd create 50 random +1 math problems, I'd print it out, give it to Zac, and he'd do it. We'd do that over and over until he was really good at +1. Then we'd move on to +2, and so on. Once he'd mastered addition, I changed the macro to produce +1 through +9 review worksheets. He'd do those until he was really good at them. This progressed thru subtraction, then multiplication, and then division. Around the summer between 2nd and 3rd grade, he had all his math facts squarely mastered.
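
The worksheet logic itself is simple enough to sketch in a few lines of Python. To be clear, this is not the original Word macro (that one isn't shown here). The function names, the 0 to 9 range for the first operand, and the "____" answer blank are all assumptions made purely for illustration.

```python
import random


def make_worksheet(addend, count=50, max_operand=9, seed=None):
    """Build `count` random addition problems that all add `addend`.

    Hypothetical helper: the 0..max_operand range and the output
    format are assumptions, not details from the original Word macro.
    """
    rng = random.Random(seed)  # a fixed seed makes the sheet repeatable
    return [f"{rng.randint(0, max_operand)} + {addend} = ____"
            for _ in range(count)]


def make_review_worksheet(count=50, max_operand=9, seed=None):
    """Build a mixed review sheet with addends drawn from +1 through +9."""
    rng = random.Random(seed)
    return [f"{rng.randint(0, max_operand)} + {rng.randint(1, 9)} = ____"
            for _ in range(count)]


if __name__ == "__main__":
    # Print a +1 worksheet, one problem per line, ready to print out.
    print("\n".join(make_worksheet(1)))
```

Calling `make_worksheet(2)` would give the +2 sheet, and `make_review_worksheet()` the mixed review. Subtraction, multiplication, and division would follow the same pattern with a different operator.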

All along, one of the challenges for him was actually writing the answers down on paper. It was certainly important that he improve his dexterity, but the writing also skewed the picture of how well he actually knew the math facts.

So ... Being the speech guy that I am, I always thought, "Hmm... Wouldn't it be cool if I wrote a Math Test program for him?" Then, a few years ago when we started developing the Windows Speech Recognition Macro system, I always had in the back of my mind, "Yeah... I should make a Math Test macro someday."

Last weekend, I finally decided to give that a go. About an hour later I had the basic macro done. You can download it here.

It's self-documenting, but if you're a bit skeptical about reading XML, here's the gist of what you can say to start the math test (the rest of the math test should be self-explanatory):

Math Test

Math Test [Plus/Minus/Times/Divided by] [0to9]
Math Test [Plus/Minus/Times/Divided by] [0to9] to [0to9]

[0to99] problem [Addition/Subtraction/Multiplication/Division] Math Test
[0to99] problem [Addition/Subtraction/Multiplication/Division] Math Test [0to9] to [0to9]

[0to99] problem Math Test [Plus/Minus/Times/Divided by] [0to9]
[0to99] problem Math Test [Plus/Minus/Times/Divided by] [0to9] to [0to9]

Let me know what you think!

Here's an interesting article on JetBlue's choice of using Microsoft's speech technology over that of Nuance. I especially like this tag line:

"Redmond shakes up the voice-recognition space by offering more reliable software for less"

The author of the article also has the following to say about Windows Speech Recognition that's baked into every copy of Windows Vista:

"Personally, I've worked with the new speech-recognition tools and I thought they were amazing. Easy to use and, best of all, it understood my New York accent without a problem."

Thanks, Mr. Bruzzese. Glad you like it.

Every year in the Fall it's time for me to check out what's new on TV, and decide what, if anything, I'm going to start watching. This year's no different...

Ah... Well ... It was actually different. This year, just a couple weeks ago, my Tivo completely died. I'd only had it for about 7 months, but it died. Oh well. It was under warranty.

After a couple back-and-forths with Tivo customer service, I was the proud owner of a brand new Tivo HD. But ... It didn't have any of my old recorded shows on it, nor did it have any of my programming on it. Sigh.

If you've ever tried to set up a Tivo with your full schedule of what you watch, you probably already know it's a lot of work. Especially from the remote control. But ... Tivo does have that fancy Tivo Central Online, which allows you to use your web browser to find shows and schedule them for recording.

Cool! But ... that's still a lot of typing and going back and forth with my web browser.

Or ... Is it?

Nope! Not if you're using Windows Speech Recognition Macros!

I just whipped up a quick macro that lets me say:

  • "Search Tivo for that", or
  • "Search Tivo for [...]"

I can say anything I want for the [...] part, too.

Yeah!!! Now, I can set up my Tivo as easily as saying "Search Tivo for House", or "Search Tivo for Prison Break". Neat, eh?

You too can set up your Tivo this fall, just as easily, by downloading today's Macro of the Day: Search Tivo for That!
