SAPI Documentation Errata: ISpRecoGrammar::SetRuleState
25 April 08 12:46 PM | Charles Oppermann | 0 Comments   

There is a typo in the documentation for the ISpRecoGrammar::SetRuleState method in SAPI 5.3.  The input parameters are listed as:

HRESULT SetRuleState(
   LPCWSTR       *pszName,
   void          *pReserved,
   SPRULESTATE   NewState
);

Instead, it should be:

HRESULT SetRuleState(
   LPCWSTR       pszName,
   void          *pReserved,
   SPRULESTATE   NewState
);

Note that instead of "*pszName" the parameter should be "pszName".
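
To illustrate why the extra asterisk matters, here is a minimal sketch.  It does not use the real SAPI headers: LPCWSTR, HRESULT and SPRULESTATE are local stand-ins, and SetRuleStateSketch is a hypothetical free function that only mirrors the corrected parameter list.  The compile-time checks show that a rule-name literal fits an LPCWSTR parameter but not an LPCWSTR* one.

```cpp
#include <type_traits>

// Stand-ins for Windows SDK types so the sketch compiles without <sapi.h>.
typedef const wchar_t* LPCWSTR;
typedef long HRESULT;
enum SPRULESTATE { SPRS_INACTIVE = 0, SPRS_ACTIVE = 1 };

// Hypothetical free function that mirrors the corrected parameter list.
// It is not the SAPI method: it only checks that the reserved pointer is null
// and returns 0 on success or -1 otherwise (stand-ins for real HRESULT codes).
constexpr HRESULT SetRuleStateSketch(LPCWSTR pszName, void* pReserved, SPRULESTATE NewState) {
    return (pReserved == nullptr) ? 0 : -1;
}

// A wide string literal converts to LPCWSTR, so it can be passed as pszName...
static_assert(std::is_convertible<decltype(L"Rule"), LPCWSTR>::value,
              "a rule name literal fits the corrected signature");
// ...but not to LPCWSTR*, which is what the misprinted signature asked for.
static_assert(!std::is_convertible<decltype(L"Rule"), LPCWSTR*>::value,
              "a rule name literal does not fit the misprinted signature");
```

With the real interface, the equivalent call would look something like pGrammar->SetRuleState(L"Rule", NULL, SPRS_ACTIVE), where pGrammar is an ISpRecoGrammar pointer you already hold and "Rule" is a placeholder rule name.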

We'll update MSDN and the Windows SDK documentation, but in the meantime we wanted to publish this erratum.

Speech Content in the Windows SDK
26 February 08 01:46 AM | Charles Oppermann | 5 Comments   

I'm happy to announce the availability of the RTM release of the Windows SDK.  This release - the first RTM one since Vista - contains the following speech-related items:

  • Updated: SAPI 5.3 documentation
  • Updated: System.Speech documentation
  • Updated: Sample source code
    • 8 C++ projects
    • 3 C# projects
    • 2 sample engines - TTS and SR
  • New: Grammar Compiler (GC.EXE) tool now part of the tool binaries included in the SDK

The Windows SDK completely replaces the older SAPI 5.1 SDK and supports development on Windows XP, Windows Server 2003, Windows Vista, and Windows Server 2008.

Customers can download this SDK as a DVD image (1,330MB ISO file) from this location:
http://www.microsoft.com/downloads/details.aspx?FamilyId=F26B1AA4-741A-433A-9BE5-FA919850BDBF

Or go through a guided setup process where only the components they need are downloaded.  Speech is part of the base install.
http://www.microsoft.com/downloads/details.aspx?FamilyId=E6E1C3DF-A74F-4207-8586-711EBE331CDC

I'm particularly interested in your feedback on the Windows SDK as a whole, and especially on how easy it is to get speech information from it.

Display Context Menus Where The Cursor Is, Not Where the Mouse Is
06 September 07 01:39 PM | Charles Oppermann | 0 Comments   

This is a little user interface rant of mine since I'm speech and keyboard-oriented.  While editing text, when I say "Press Shift F Ten" or press the Application Key (to the right of the spacebar on Windows keyboards), I expect the context menu to appear at the text cursor location, since that's where the action is going to take place.

However, some applications assume the mouse activated the functionality and position the context menu wherever the mouse is.  Since I'm using speech or typing and haven't touched the mouse in a while, the menu appears nowhere near the cursor or selection.

A more common variant of this is when the menu appears in the upper-left corner of the edit box when activated by keyboard. 

The article titled Using Menus on MSDN contains sample code that always uses lParam for the X/Y location to display the menu.  The documentation on WM_CONTEXTMENU is clear:

If the context menu is generated from the keyboard—for example, if the user types SHIFT+F10—then the x- and y-coordinates are -1 and the application should display the context menu at the location of the current selection rather than at (xPos, yPos).

That advice is ignored in the Using Menus topic, so I added a note there, and I'll file a bug on this so that it can be fixed in the future.

Using MSDN's Community Content feature, I added the following to the Using Menus article:

Remember when processing the WM_CONTEXTMENU message, that the X/Y coordinates might be -1/-1 which indicates that the keyboard generated the menu, thus, the menu should be shown at the cursor location or at the location of the selection - NOT at -1/-1 or the mouse pointer location.

The samples currently in this article do not account for this and will attempt to display the menu at -1/-1, which is confusing to the keyboard user.  Pressing the Application Key on Windows keyboards (to the right of the spacebar) generates a VK_APPS virtual-key code, which by default generates a WM_CONTEXTMENU.  You can also get this if the user presses SHIFT+F10.

Never handle SHIFT+F10 or VK_APPS yourself to pop up a context menu.  Rely on the WM_CONTEXTMENU message, and if the location given is -1/-1, fall back to the text cursor and/or selection information to place the menu.
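
A minimal sketch of that rule follows.  It is written as portable C++ so it compiles anywhere: Point is a stand-in for the Win32 POINT structure, and XFromLParam, YFromLParam and ContextMenuAnchor are hypothetical helper names (the first two behave like the GET_X_LPARAM and GET_Y_LPARAM macros).  It assumes the caller has already worked out where the caret or selection is in screen coordinates.

```cpp
#include <cstdint>

// Stand-in for the Win32 POINT structure (screen coordinates).
struct Point {
    int x;
    int y;
};

// Unpack the signed 16-bit x coordinate from the low word of lParam.
constexpr int XFromLParam(std::int64_t lParam) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(lParam & 0xFFFF));
}

// Unpack the signed 16-bit y coordinate from the high word of lParam.
constexpr int YFromLParam(std::int64_t lParam) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>((lParam >> 16) & 0xFFFF));
}

// Decide where the context menu should appear.
// -1/-1 means the keyboard generated the message, so use the caret or
// selection location supplied by the caller instead of the message coordinates.
constexpr Point ContextMenuAnchor(std::int64_t lParam, Point caretOnScreen) {
    return (XFromLParam(lParam) == -1 && YFromLParam(lParam) == -1)
        ? caretOnScreen
        : Point{XFromLParam(lParam), YFromLParam(lParam)};
}
```

In a real window procedure you would obtain the caret location with something like GetCaretPos followed by ClientToScreen, then pass the resulting point to TrackPopupMenu.  Extracting the coordinates as signed values matters because screen coordinates can be negative on multi-monitor setups.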

The Desktop Is Not For Programs
04 September 07 01:11 AM | Charles Oppermann | 6 Comments   

I'm constantly amazed that people think that putting shortcuts to programs on the desktop makes accessing that program "easier".

For the second time in about a week, I've encountered people asking how to put shortcuts to programs on the desktop. 

The desktop is ill-suited for this. To start with, items located on it are often not visible because other windows are placed in front of the desktop.  Depending on the current window layout, you might have to make one or more mouse or keyboard operations to select the desktop item you want. 

To make matters worse, the items will shift position as screen resolutions change (because of games, connecting monitors, etc.) and as items are added.

While oftentimes commercial software will litter the desktop with shortcuts, the purpose is to increase visibility, not ease of use.

Why not use the Start Menu? If WordPad is a program you use often, just "Pin" it to the Start Menu and it'll always be there, available in fewer keystrokes than trying to use it off the desktop.

If you really want quick access, pin the item to the Start Menu and then modify the item's properties to have a shortcut key assigned. Only items in the Start Menu can have shortcut keys assigned to them.

Update 9/4/2007: My bad, shortcut keys can be assigned to shortcut files that are located on the desktop.  My initial test of this failed, and since I knew that only certain locations respect shortcut keys, I figured that the desktop was not one of them.  I'll try to find a definitive list, but it appears that any of the Start Menu locations and the Desktop are valid places for a shortcut file to have a shortcut key assigned.  Interestingly, shortcut keys for items in the Quick Launch toolbar location are not respected.

Cool Developers Go Flying
13 May 07 01:24 AM | Charles Oppermann | 0 Comments   

Last week the speech team wrapped up a milestone of work, and to celebrate I took our group program manager Richard Sprague up for a quick tour of downtown Seattle and the Eastside area, including Bill Gates' house and the main Microsoft campus in Redmond.

Richard took some video of the trip and posted the highlights on MSN's Soapbox video service:


Video: Flying over Seattle

Background on Audio Volume in Windows Vista
18 April 07 09:33 AM | Charles Oppermann | 1 Comments   

 Our friend in the multimedia group and prolific blogger Larry Osterman is writing a series of articles on how volume is treated in Windows Vista.

There is a whole new audio sub-system in Vista and Larry's blog is a great source of information for developers.

Volume in Windows Vista, part 1: What is "volume"?
Volume in Windows Vista, part 2: Types of volume in Windows Vista

Upgrading to Windows Vista
08 February 07 08:23 PM | Charles Oppermann | 0 Comments   

I've personally found the upgrade process from Windows XP to Windows Vista to be seamless.  However, I know that people have concerns about their software and devices working with a new version of the operating system.

Windows Vista includes both proactive and reactive technology to help with compatibility issues, so I think compatibility problems will not be as big a concern.  However, with over 200 million users, there will be issues for some people.  The best thing you can do is plan ahead.

Use the Windows Vista Upgrade Advisor on your current machine to identify potential problem areas.  In many cases where there are compatibility issues, a new version is already available from the manufacturer.  The Upgrade Advisor links directly to where you can get the new version.

For technical folks, you can browse the Hardware Compatibility List (HCL) and see the level of support available.

Now for a purely personal comment.  In the past, I've found that software that doesn't work with newer versions of the operating system (not just Windows Vista) tends to be of lower quality overall, or tends to perform tasks that require intertwining with the operating system.  The days of allowing applications free rein to manipulate the file system and registry are over.  Too many applications abused this and Windows has had to clamp down to prevent exploits.

Already have Windows Vista installed and are having problems?  Here is some help.

Now, for Speech API developers, I'm happy to say that Windows Vista includes SAPI 5.3, which is backwards compatible with SAPI 5.1 that was included with Windows XP.

Every single thing Windows Vista Speech Recognition is listening for
23 January 07 07:05 PM | Charles Oppermann | 1 Comments   

Rob Chambers is passionate about speech and a prolific blogger.  Something that I've always wanted was a list of all the commands that Windows Speech Recognition recognizes.  I knew I could probably scan through the internal grammars that WSR uses, but what I didn't know was that Rob had already posted such a list nearly a year ago.

One command that surprised me was "move speech recognition to the bottom" (or "top").  Sometimes the UI panel at the top of the screen gets in the way.  I knew I could click the Minimize button, but that would not be an option for everyone trying to use their computer hands-free.

Link to Rob's Rhapsody : Every single thing Windows Vista Speech Recognition is listening for

Windows Live WiFi Hotspot Locator
02 January 07 07:06 PM | Charles Oppermann | 3 Comments   

How I wish I had this tool last month, when I was without power and telephone/DSL service at home and looking for an Internet hookup.

Sure, Starbucks and McDonalds have WiFi, but Starbucks wants $9.95 for 24 hours.  McDonalds will give you 2 hours for $2.95, which is not bad.

The Windows Live WiFi Hotspot Locator will accept location information and show you the hotspots within a selected radius.  You can easily print the list or map each location.  Very handy.

How NASA Can Help Microsoft
13 December 06 10:46 AM | Charles Oppermann | 0 Comments   

On an internal blog at Microsoft, I came across a posting by a Corporate VP on some books he was going to read while on vacation.  One of the books was an autobiography of one of my heroes, Gene Kranz, who was Flight Director for several flights in Project Gemini and Apollo.

Kranz led his team of Flight Controllers (known as the White Team) during two of the most dramatic events in the space program at the time: the touchdown of the Apollo 11 Lunar Module, Eagle, and again when the Apollo 13 Service Module exploded.

Kranz's book is called Failure Is Not An Option; it is wonderfully written and I recommend it highly.  In my opinion, and echoed in the internal blog posting I read, the lessons of the space race of the late 1950s and 1960s are of value to any large organization.  Kranz's management style was no-nonsense, and constant practice through simulation kept everyone alert and allowed them to react quickly to unplanned situations.

Probably the worst thing in his Flight Control world was encountering a problem that hadn't already been thought about. The idea was that when anything went wrong, everyone already had experience with the problem from simulation and could react very quickly. There are absolutely numerous lessons for Microsoft in how NASA and its contractors approached the race to the Moon.

In this arena, anything could happen, so you assumed that from the start, planned for it, designed for it, and executed it with that in mind.  Imagine a world in which software developers assume everything could fail and one in which simulation (testing) does fail everything.  The result would be much more robust code.

I think Microsoft does more of this kind of testing than any other major software maker.  While software components, particularly their interactions with the operating system platform, are complex, the various systems of the Apollo missions were incredibly complex as well.  There were several major companies and hundreds of small contractors providing pieces to the system and they all had to work together perfectly.

After reading his book, I got to meet Gene at a MoF dinner a few years ago. He's got a great personality and mentality for thinking through problems.

I also recommend Angle of Attack by Mike Gray, the story of Harrison Storms, who was a senior VP at North American Aviation and personally responsible for the Apollo Command Module. It too has lessons on how a business can react to tragic failure (as happened when the Apollo 1 command module caught fire during a ground test, killing three astronauts).

These are not business books, but biographies of head-strong people leading large organizations doing high-profile work.

Windows Speech Recognition Getting Some Respect
12 December 06 07:46 AM | Charles Oppermann | 2 Comments   

When our new speech recognition for Windows Vista was demonstrated at the Microsoft Financial Analyst Meeting this past August, it went disastrously (read about it at my personal blog, along with links to the video).

As I said in that posting, I was angry at first because the high-profile failure didn't need to happen (Larry Osterman explains the technical details).  But at the same time, I knew our technology was good, and as a side benefit, the fallout from the demo would give WSR more recognition.

Now that Windows Vista is getting more attention in the mainstream technology press, people are trying out WSR and discovering that it works pretty well.  Here's what Jupiter Research analyst Michael Gartenberg had to say:

"I’ve been using the integrated speech recognition in Windows Vista for the last few days, for a variety of tasks and in a variety of applications. I’m pleased to say it works well, and greatly improves the usability of my computer for entering text. It’s so darn good, it feels a little bit out of science fiction. But then again, isn’t that the way technology is supposed to work?"

Thanks to Todd Bishop's Microsoft Blog over at the Seattle Post-Intelligencer for pointing this out.

Browse the Speech API 5.3 SDK Documentation Online
08 December 06 12:24 AM | Charles Oppermann | 0 Comments   

You can download the full Windows SDK from here, but if you just want to read the documentation for SAPI, you can do so from this link.

To browse the Managed Speech API documentation, here are the links to the various namespaces:

System.Speech.AudioFormat
System.Speech.Recognition
System.Speech.Recognition.SrgsGrammar
System.Speech.Synthesis
System.Speech.Synthesis.TtsEngine

Microsoft Speech API SDK
24 November 06 09:58 AM | Charles Oppermann | 10 Comments   

The Speech API Software Developers Kit (SAPI SDK) contains the documentation, samples, and header and library files to create applications and utilities that use speech recognition and voice synthesis. In addition, the SAPI SDK can be used to create speech recognition and voice synthesis engines that can be used by other applications.

Generally, the version of SAPI is determined by the platform that shipped it. SAPI 5.1 was included with Windows XP along with the Microsoft Sam TTS engine. The initial release of Windows XP did not include a speech recognition engine. The Tablet PC Edition of Windows XP did include version 6.1 of Microsoft's speech recognition engine, which also shipped with Office 2003. Office 2003 also included SAPI TTS voices from Lernout & Hauspie, called LH Michael and LH Michelle. Also note that some vendors include SR and TTS engines with their products. For example, my laptop came with speech recognition and TTS engines provided by Toshiba.

With Windows Vista, the version of SAPI that is installed is 5.3. We have replaced the Microsoft Sam voice with next generation technology in a new female voice we call Microsoft Anna. We have also made major improvements to the speech recognition engine (now version 8.0) and that is included in all editions of Windows Vista.

For the SDK, you can download the SAPI 5.1 SDK to create applications and engines that work on Windows XP and Windows Server 2003. These applications or engines should also be forward-compatible with SAPI 5.3 on Windows Vista and beyond. The SAPI 5.1 SDK is a stand-alone package, separate from other Microsoft SDKs.

With SAPI 5.3, we integrated our SDK into the main Windows SDK (sometimes known as the Platform SDK). You can use the Windows SDK to create applications for Windows Vista, Windows XP, and Windows Server 2003. The OS version you target is chosen at compile time, which prevents features that exist only in later versions from being available.
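
As a rough sketch of what compile-time targeting looks like, the usual Windows SDK mechanism is to define _WIN32_WINNT before including any Windows header.  The helper name TargetsVistaOrLater below is hypothetical, and the sketch deliberately includes no Windows headers so that it compiles on any platform.

```cpp
// Define the minimum target before including any Windows header.
// 0x0501 is Windows XP, 0x0502 is Windows Server 2003, 0x0600 is Windows Vista.
#define _WIN32_WINNT 0x0501

// On Windows you would include <windows.h> (and <sapi.h>) here; the headers
// then hide declarations that need a newer OS than the one selected above.

// Hypothetical helper: true when the build may rely on Vista-only features.
constexpr bool TargetsVistaOrLater() {
    return _WIN32_WINNT >= 0x0600;
}

#if _WIN32_WINNT >= 0x0600
// Code that needs Windows Vista or later goes here.
#else
// Fallback that sticks to what Windows XP and Windows Server 2003 provide.
#endif
```

Raising the value to 0x0600 opts the build in to Vista-only declarations, at the cost of no longer running on Windows XP or Windows Server 2003.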

You can get an ISO image to burn the SDK to a DVD here:

http://www.microsoft.com/downloads/details.aspx?familyid=7614FE22-8A64-4DFB-AA0C-DB53035F40A0

To selectively download and install various components of the Windows SDK, go here:

http://www.microsoft.com/downloads/details.aspx?FamilyId=C2B1E300-F358-4523-B479-F53D234CDCCF

Something else that is new is our Managed Speech API. Codenamed SpeechFX, the Managed Speech API is part of the Microsoft .NET Framework 3.0. The new System.Speech namespace provides managed classes for speech recognition and synthesis. This makes it much easier to write speech applications from managed code, such as C# or Visual Basic .NET.

The Managed Speech API documentation is included with the Windows SDK. Applications that use .NET Framework 3.0 will work on Windows Vista, Windows XP and Windows Server 2003. Note that you have to redistribute the .NET Framework 3.0 with your application for Windows XP and Windows Server 2003. The framework is already included with Windows Vista.

Internet Explorer, Screen Readers and Keyboard Access
07 November 06 02:53 PM | Charles Oppermann | 0 Comments   

When I get around to posting a bio, you'll find out that I have a background in making technology accessible to people with disabilities.  I worked in this area for Windows 95 through Windows 2000 and Internet Explorer 2.0 through 4.01.

With the recent release of IE7, there have been the usual questions about whether adaptive aids will work with it, such as screen readers for the blind and visually impaired.

Kelly Ford, a tester in the IE group has produced a nice write up regarding this.  Read it at the IE Team Blog.

Steve Wozniak at Microsoft
06 October 06 02:35 PM | Charles Oppermann | 10 Comments   

One of the things I love about Microsoft is that you often get to talk with, or listen to, interesting people.  Over the years, I got to shake hands with Steven Spielberg (wearing a Microsoft Bob baseball cap), Jay Leno (Windows 95 launch), Stevie Wonder (accessibility event), and just the other week, former NFL player Mike Utley.  I just missed out on meeting Muhammad Ali once because I was heads down on shipping IE 4.01.

Today, Steve Wozniak, co-founder of Apple Computer, came by the campus.  "Woz", as he is commonly known, was an electronics geek in his teens and designed many little gadgets before Steve Jobs convinced him to sell his micro computer design that HP rejected.  That became the Apple I in 1976.

Woz was giving a speech and signing copies of his book, "iWoz". He gave a well-practiced but informal talk about his early years of fiddling with electronic things and his early interactions with Steve Jobs.

Jobs and Woz considered themselves best friends, but in all the accounts I've read of both men, they really couldn't be much more different.  Woz was grounded and knew he wanted to be an engineer, while Jobs was a free spirit.  In his talk, Woz didn't dispel that image.  He described Jobs as having a lot of strange friends, doing strange things, and being a "free thinker" several times.  It makes me wonder what they saw in each other in the first place.

In all, a nice talk by Steve.

UPDATE on Monday, October 9, 2006:  Port 25 has a video interview with Woz on the day he visited.

UPDATE on Monday, October 9, 2006:  Reworded portion on Apple I for clarity.  Thanks Josh.
