<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Working the Spoken Word : Speech-to-text</title><link>http://blogs.msdn.com/spokenword/archive/tags/Speech-to-text/default.aspx</link><description>Tags: Speech-to-text</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>What a year for speech recognition at Microsoft</title><link>http://blogs.msdn.com/spokenword/archive/2007/12/31/what-a-year-for-speech-recognition-at-microsoft.aspx</link><pubDate>Mon, 31 Dec 2007 23:45:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6890751</guid><dc:creator>Stephen Potter</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/spokenword/comments/6890751.aspx</comments><wfw:commentRss>http://blogs.msdn.com/spokenword/commentrss.aspx?PostID=6890751</wfw:commentRss><wfw:comment>http://blogs.msdn.com/spokenword/rsscomments.aspx?PostID=6890751</wfw:comment><description>&lt;P&gt;Yeah, yeah, the year in review, what a&amp;nbsp;crushingly unoriginal idea for a post. But&amp;nbsp;wait - this is worth it.&amp;nbsp;2007 was a huge year for speech recognition products at Microsoft.&amp;nbsp;I think we'll look back on it as a real turning point. Here's how it shaped up.&lt;/P&gt;
&lt;P&gt;&lt;A class="" href="http://www.microsoft.com/exchange/evaluation/unifiedmessaging/default.mspx" mce_href="http://www.microsoft.com/exchange/evaluation/unifiedmessaging/default.mspx"&gt;&lt;IMG style="WIDTH: 116px; HEIGHT: 140px" height=140 hspace=5 src="http://blogs.msdn.com/photos/spokenword/images/6923676/secondarythumb.aspx" width=116 align=right border=0 mce_src="http://blogs.msdn.com/photos/spokenword/images/6923676/secondarythumb.aspx"&gt;&lt;/A&gt;(Going into the year, Exchange Server 2007 had just shipped with &lt;A class="" href="http://www.microsoft.com/exchange/evaluation/unifiedmessaging/default.mspx" mce_href="http://www.microsoft.com/exchange/evaluation/unifiedmessaging/default.mspx"&gt;Unified Messaging&lt;/A&gt;, including Outlook Voice Access that gives you access over the phone to email, calendar and other useful features. It's a significant integration of speech technology into the heart of a high-volume server product. Huge posters had been up on campus for months, and inside Exchange, they called it 'the sizzle on the steak'. &lt;/P&gt;
&lt;P&gt;Meanwhile the teams in Windows, Speech Server, automotive and core technology are hard at work... )&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;January 2007&lt;/EM&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;A class="" href="http://www.microsoft.com/windows/products/windowsvista/features/details/speechrecognition.mspx" mce_href="http://www.microsoft.com/windows/products/windowsvista/features/details/speechrecognition.mspx"&gt;&lt;IMG style="WIDTH: 127px; HEIGHT: 140px" height=140 src="http://blogs.msdn.com/photos/spokenword/images/6923407/secondarythumb.aspx" width=127 align=left border=0 mce_src="http://blogs.msdn.com/photos/spokenword/images/6923407/secondarythumb.aspx"&gt;&lt;/A&gt;&lt;STRONG&gt;Windows Vista&lt;/STRONG&gt;&amp;nbsp;ships with&amp;nbsp;&lt;A class="" href="http://www.microsoft.com/windows/products/windowsvista/features/details/speechrecognition.mspx" mce_href="http://www.microsoft.com/windows/products/windowsvista/features/details/speechrecognition.mspx"&gt;Windows Speech Recognition&lt;/A&gt; (WSR) built into the operating system in&amp;nbsp;eight different languages.&amp;nbsp;Now this is a significant investment in the voice user interface as a means of command and dictation for desktop users. The entire desktop is speech-enabled under the 'say what you see' metaphor; correction and selection are easy; and the system adapts to your voice and your typical word usage as time goes on. Since the release of WSR, media reviews have been overwhelmingly positive - check out &lt;A class="" href="http://blogs.msdn.com/robch/" mce_href="http://blogs.msdn.com/robch/"&gt;Rob Chambers' blog&lt;/A&gt;&amp;nbsp;(Rob is one of the driving forces behind speech in Vista) for links and discussions. &lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;March 2007 &lt;/EM&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;A class="" href="http://www.microsoft.com/presspass/press/2007/mar07/03-14PowerOfSpeechPR.mspx" mce_href="http://www.microsoft.com/presspass/press/2007/mar07/03-14PowerOfSpeechPR.mspx"&gt;&lt;IMG style="WIDTH: 250px; HEIGHT: 166px" height=166 hspace=5 src="http://www.microsoft.com/presspass/images/press/2007/03-14PowerOfSpeech_thumb.jpg" width=250 align=right vspace=5 border=0 mce_src="http://www.microsoft.com/presspass/images/press/2007/03-14PowerOfSpeech_thumb.jpg"&gt;&lt;/A&gt;Microsoft announces intent to &lt;A class="" href="http://www.microsoft.com/presspass/press/2007/mar07/03-14PowerOfSpeechPR.mspx" mce_href="http://www.microsoft.com/presspass/press/2007/mar07/03-14PowerOfSpeechPR.mspx"&gt;acquire &lt;STRONG&gt;Tellme Networks&lt;/STRONG&gt;&lt;/A&gt;. Steve Ballmer &lt;A class="" href="http://www.microsoft.com/presspass/press/2007/mar07/03-14PowerOfSpeechPR.mspx" mce_href="http://www.microsoft.com/presspass/press/2007/mar07/03-14PowerOfSpeechPR.mspx"&gt;says it all&lt;/A&gt;: &lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;“Speech is universal, simple and holds incredible promise as a key interface for computing. Tellme brings to Microsoft the talent, technology and proven experience in speech that will enable us to deliver a new wave of products and revolutionize human-computer interaction.”&lt;/EM&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;(Incidentally, CNET has a nice &lt;A class="" href="http://www.news.com/Behind-Redmonds-Tellme-deal/2100-1014_3-6167422.html" mce_href="http://www.news.com/Behind-Redmonds-Tellme-deal/2100-1014_3-6167422.html"&gt;inside look&lt;/A&gt; at the discussions in Building 34 on Super Bowl day between Steve Ballmer&amp;nbsp;and Mike&amp;nbsp;McCue, Tellme's CEO,&amp;nbsp;that led up to the deal.)&lt;/P&gt;
&lt;P&gt;&lt;A class="" href="http://www.microsoft.com/responsepoint/default.aspx" mce_href="http://www.microsoft.com/responsepoint/default.aspx"&gt;&lt;IMG style="WIDTH: 160px; HEIGHT: 76px" height=76 hspace=8 src="http://blogs.msdn.com/photos/spokenword/images/6923386/secondarythumb.aspx" width=160 align=left border=0 mce_src="http://blogs.msdn.com/photos/spokenword/images/6923386/secondarythumb.aspx"&gt;&lt;/A&gt;Also in March&amp;nbsp;- &lt;A class="" href="http://www.microsoft.com/responsepoint/default.aspx" mce_href="http://www.microsoft.com/responsepoint/default.aspx"&gt;&lt;STRONG&gt;Microsoft Response Point&lt;/STRONG&gt;&lt;/A&gt;&amp;nbsp;is&amp;nbsp;&lt;A class="" href="http://www.microsoft.com/presspass/press/2007/mar07/03-19MSResponsePointPR.mspx" mce_href="http://www.microsoft.com/presspass/press/2007/mar07/03-19MSResponsePointPR.mspx"&gt;launched&lt;/A&gt;&amp;nbsp;out of Microsoft Research. Response Point is a new way for small businesses to manage their phone systems - inexpensive, easy to set up and easy to use. All thanks to VoIP and the speech technology that underlies the user interface.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;&lt;A class="" href="http://www.microsoft.com/presspass/press/2007/may07/05-03TellmeClosePR.mspx" mce_href="http://www.microsoft.com/presspass/press/2007/may07/05-03TellmeClosePR.mspx"&gt;&lt;IMG title="tellme logo" style="WIDTH: 146px; HEIGHT: 80px" height=80 alt="tellme logo" hspace=8 src="http://www.tellme.com/images/site/tellme_logo_large.gif" width=146 align=right vspace=5 border=0 mce_src="http://www.tellme.com/images/site/tellme_logo_large.gif"&gt;&lt;/A&gt;May 2007&amp;nbsp;&lt;/EM&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;The &lt;A class="" href="http://www.microsoft.com/presspass/press/2007/may07/05-03TellmeClosePR.mspx" mce_href="http://www.microsoft.com/presspass/press/2007/may07/05-03TellmeClosePR.mspx"&gt;acquisition of Tellme closes&lt;/A&gt;.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;September 2007&lt;/EM&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;In the mobile space, &lt;A class="" href="http://www.microsoft.com/presspass/press/2007/Sep07/09-18SprintMobileSearchPR.mspx" mce_href="http://www.microsoft.com/presspass/press/2007/Sep07/09-18SprintMobileSearchPR.mspx"&gt;Tellme announces a deal with Sprint &lt;/A&gt;to incorporate&amp;nbsp;Tellme's voice search technology with Live Search into certain phones.&lt;/P&gt;
&lt;P&gt;&lt;A class="" href="http://www.syncmyride.com/#/home/" mce_href="http://www.syncmyride.com/#/home/"&gt;&lt;IMG style="WIDTH: 63px; HEIGHT: 33px" height=33 hspace=8 src="http://www.syncmyride.com/Own/img/Global/icons/iconUseSync.png" width=63 align=left vspace=5 border=0 mce_src="http://www.syncmyride.com/Own/img/Global/icons/iconUseSync.png"&gt;&lt;/A&gt;Meanwhile, the first Ford cars hit the market in the USA with &lt;A class="" href="http://www.syncmyride.com/#/home/" mce_href="http://www.syncmyride.com/#/home/"&gt;&lt;STRONG&gt;Sync&lt;/STRONG&gt;&lt;/A&gt; - hands-free speech technology for voice dialing, messaging&amp;nbsp;and media control within the car.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;October 2007&lt;/EM&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;A class="" href="http://www.microsoft.com/presspass/press/2007/oct07/10-16UC2LaunchPR.mspx" mce_href="http://www.microsoft.com/presspass/press/2007/oct07/10-16UC2LaunchPR.mspx"&gt;&lt;IMG style="WIDTH: 136px; HEIGHT: 151px" height=151 src="http://blogs.msdn.com/photos/spokenword/images/6923379/original.aspx" width=136 align=right vspace=3 border=0 mce_src="http://blogs.msdn.com/photos/spokenword/images/6923379/original.aspx"&gt;&lt;/A&gt;&lt;STRONG&gt;Office Communications Server 2007&lt;/STRONG&gt; is &lt;A class="" href="http://www.microsoft.com/presspass/press/2007/oct07/10-16UC2LaunchPR.mspx" mce_href="http://www.microsoft.com/presspass/press/2007/oct07/10-16UC2LaunchPR.mspx"&gt;released&lt;/A&gt; as the flagship of Microsoft's Unified Communications strategy. Bundled with OCS 2007 is the latest version of Microsoft&amp;nbsp;&lt;A class="" href="http://msdn2.microsoft.com/en-us/library/bb857803.aspx" mce_href="http://msdn2.microsoft.com/en-us/library/bb857803.aspx"&gt;Speech Server&lt;/A&gt; - now called &lt;EM&gt;Office Communications Server 2007 Speech Server&lt;/EM&gt; (oh yes). It's a significant upgrade from Speech Server 2004, including native VoIP support, graphical dialog editing, conversational grammars, and rich data mining and tuning tools. &lt;/P&gt;
&lt;P&gt;&lt;A class="" href="http://www.microsoft.com/presspass/press/2007/oct07/10-15OSBUpdatesPR.mspx?rss_fdn=Press%20Releases" mce_href="http://www.microsoft.com/presspass/press/2007/oct07/10-15OSBUpdatesPR.mspx?rss_fdn=Press%20Releases"&gt;&lt;IMG style="WIDTH: 160px; HEIGHT: 31px" height=31 hspace=8 src="http://blogs.msdn.com/photos/spokenword/images/6923373/secondarythumb.aspx" width=160 align=left vspace=5 border=0 mce_src="http://blogs.msdn.com/photos/spokenword/images/6923373/secondarythumb.aspx"&gt;&lt;/A&gt;And - what a month - &lt;STRONG&gt;Live Search for Windows Mobile&lt;/STRONG&gt; &lt;A class="" href="http://www.microsoft.com/presspass/press/2007/oct07/10-15OSBUpdatesPR.mspx?rss_fdn=Press%20Releases" mce_href="http://www.microsoft.com/presspass/press/2007/oct07/10-15OSBUpdatesPR.mspx?rss_fdn=Press%20Releases"&gt;goes live&lt;/A&gt; with speech recognition. The speech team blog has&amp;nbsp;&lt;A class="" href="http://blogs.msdn.com/speech/archive/2007/10/16/live-search-for-mobile-now-with-speech-recognition.aspx" mce_href="http://blogs.msdn.com/speech/archive/2007/10/16/live-search-for-mobile-now-with-speech-recognition.aspx"&gt;more details &lt;/A&gt;of the kinds of searches possible. And you don't even need a mobile phone to make &lt;A class="" href="http://www.livesearch411.com/" mce_href="http://www.livesearch411.com/"&gt;free 411 calls &lt;/A&gt;using the Live Search speech technology.&amp;nbsp;Insider details from Long Zheng's &lt;A class="" href="http://www.istartedsomething.com/20071101/live-search-mobile-voice-input/" mce_href="http://www.istartedsomething.com/20071101/live-search-mobile-voice-input/"&gt;interview with Program Manager Oliver Scholz&lt;/A&gt;. &lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;So what's to come in 2008?&lt;/EM&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;Let me say only that we have not been sitting around (well, actually, that's not quite true, I have been sitting around for the last month, since I was out on paternity leave. Only it wasn't really sitting around, there was a lot to do in terms of coping with the newborn's data streams&amp;nbsp;and all that, but I wasn't building software, that's what I meant, now let me rescue my point) - all the teams behind these releases have been planning and executing on the next waves since even before the dates above, so huge momentum has already built in a number of areas, old and new, and we'll start to see evidence of this as the year progresses.&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;And - did I mention that we're hiring in a number of speech technology-related areas? Please &lt;A class="" href="http://blogs.msdn.com/spokenword/contact.aspx" mce_href="http://blogs.msdn.com/spokenword/contact.aspx"&gt;contact me &lt;/A&gt;for details if you're interested. &lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=6890751" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/spokenword/archive/tags/Unified+Communications/default.aspx">Unified Communications</category><category domain="http://blogs.msdn.com/spokenword/archive/tags/Speech+Server+2007/default.aspx">Speech Server 2007</category><category domain="http://blogs.msdn.com/spokenword/archive/tags/Speech-to-text/default.aspx">Speech-to-text</category></item><item><title>Bored medical student impressed by speech recognition</title><link>http://blogs.msdn.com/spokenword/archive/2007/08/15/bored-medical-student-impressed-by-speech-recognition.aspx</link><pubDate>Thu, 16 Aug 2007 04:02:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:4407745</guid><dc:creator>Stephen Potter</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/spokenword/comments/4407745.aspx</comments><wfw:commentRss>http://blogs.msdn.com/spokenword/commentrss.aspx?PostID=4407745</wfw:commentRss><wfw:comment>http://blogs.msdn.com/spokenword/rsscomments.aspx?PostID=4407745</wfw:comment><description>&lt;P&gt;Sullen student, bored by x-rays of terrible chest diseases, is&amp;nbsp;&lt;A class="" href="http://www.marionsmits.net/2007/08/14/speech-recognition/" mce_href="http://www.marionsmits.net/2007/08/14/speech-recognition/"&gt;mesmerised by&amp;nbsp;speech recognition&lt;/A&gt;, mutters &lt;EM&gt;that's so cool&lt;/EM&gt;. &lt;/P&gt;
&lt;P&gt;Can't argue with that.&lt;/P&gt;
&lt;P&gt;(And can't resist real stories with a whiff of &lt;A class="" href="http://www.theonion.com/content/index" mce_href="http://www.theonion.com/content/index"&gt;The Onion&lt;/A&gt;.)&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=4407745" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/spokenword/archive/tags/Speech-to-text/default.aspx">Speech-to-text</category></item><item><title>Thinking aloud</title><link>http://blogs.msdn.com/spokenword/archive/2007/08/01/thinking-aloud.aspx</link><pubDate>Thu, 02 Aug 2007 04:50:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:4180151</guid><dc:creator>Stephen Potter</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/spokenword/comments/4180151.aspx</comments><wfw:commentRss>http://blogs.msdn.com/spokenword/commentrss.aspx?PostID=4180151</wfw:commentRss><wfw:comment>http://blogs.msdn.com/spokenword/rsscomments.aspx?PostID=4180151</wfw:comment><description>&lt;P&gt;&lt;IMG title="Windows Vista Speech Toolbar" style="WIDTH: 335px; HEIGHT: 81px" height=81 alt="Windows Vista Speech Toolbar" hspace=5 src="http://blogs.msdn.com/photos/spokenword/images/4179946/original.aspx" width=335 align=right vspace=5 mce_src="http://blogs.msdn.com/photos/spokenword/images/4179946/original.aspx"&gt;Speech recognition in Vista works well for me. Dictation accuracy is very high - especially since I flicked the switch to train it on my emails and documents - and the correction experience is smooth and efficient. But I hardly use it. I find it very difficult to dictate to my computer.&lt;/P&gt;
&lt;P&gt;WTF? Typing is easier than speaking? &lt;/P&gt;
&lt;P&gt;Yes. I can't speak the way I write. If I'm typing, I'll begin a sentence without knowing how it's going to end; future phrases form as I finish typing previous ones. I'll pause, go back, select text, delete it, write again. It feels almost as if hitting the keys is a direct extension of the thought process.&lt;/P&gt;
&lt;P&gt;If I'm dictating, there's no such flow. I'll begin a sentence (having got over the minor panic triggered by the&amp;nbsp;&lt;EM&gt;Listening&lt;/EM&gt; microphone UI over a blank document) and the phrase will appear - correctly - on my page, but saying it out loud has corrupted the thoughts that would have helped complete it. I have to step back and think hard about what should come next. Every phrase forces a little mental reboot. &lt;/P&gt;
&lt;P&gt;I think what's going on - in addition to my inability to think in complete sentences - is that my thought processes while typing have habituated themselves to the gated speeds of my motor operations. The extra time is valuable and they use it to do the work of sounding out the current phrase in context, thinking up the next phrase, and so on. So typing actually greases the wheels of forming the right words and sentences. That's missing when I dictate, and I find it quite paralyzing.&lt;/P&gt;
&lt;P&gt;Surely many keyboard users, even very slow typists, will be reluctant to move to speech recognition systems - even high-accuracy systems - because of the thinking on the side that we seem to do while hitting keys. The standard words-per-minute (wpm) metric for text input is usually measured by copying pre-existing texts, so the cost of creating the text is assumed to be the same for every input method. But my wpm drops dramatically when I dictate, because the creation processes that work with typing are unavailable to me. What does the wpm look like when you do have them available? &lt;/P&gt;
&lt;P&gt;Many users of speech recognition (and of dictation-taking secretaries) have obviously overcome this. Most visibly,&amp;nbsp;&lt;A class="" href="http://www2.english.uiuc.edu/powers/bib/index.htm" mce_href="http://www2.english.uiuc.edu/powers/bib/index.htm"&gt;Richard Powers&lt;/A&gt;, a novelist who wrote the 2006 National Book Award winner using speech recognition on his TabletPC, &lt;A class="" href="http://www.nytimes.com/2007/01/07/books/review/Powers2.t.html" mce_href="http://www.nytimes.com/2007/01/07/books/review/Powers2.t.html"&gt;trained himself&lt;/A&gt; after years of typing: &lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;I needed weeks to get over the oddness of auditioning myself in an empty room, to trust to the flow of speech, to learn to hear myself think all over again.&amp;nbsp; &lt;/EM&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;He broke through - and argues now that typing is the obstacle to the thought process: &lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;What could be less conducive to thought’s cadences than stopping every time your short-term memory fills to pass those large-scale musical phrases through your fingers, one tedious letter at a time? You’d be hard-pressed to invent a greater barrier to cognitive flow.&lt;/EM&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Is the grass really greener over there? I'd love to know. If anyone has done it, please send a comment. I'm also going to try it out for myself. Over the next few weeks, I'll be training myself to think aloud. &lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=4180151" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/spokenword/archive/tags/Speech-to-text/default.aspx">Speech-to-text</category></item><item><title>The novelist's method</title><link>http://blogs.msdn.com/spokenword/archive/2007/01/18/the-novelist-s-method.aspx</link><pubDate>Fri, 19 Jan 2007 05:20:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1491385</guid><dc:creator>Stephen Potter</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/spokenword/comments/1491385.aspx</comments><wfw:commentRss>http://blogs.msdn.com/spokenword/commentrss.aspx?PostID=1491385</wfw:commentRss><wfw:comment>http://blogs.msdn.com/spokenword/rsscomments.aspx?PostID=1491385</wfw:comment><description>&lt;P&gt;IBM engineer and blogger&amp;nbsp;Michael Tolva &lt;A class="" href="http://www.ascentstage.com/archives/2007/01/echoes_and_real.html" mce_href="http://www.ascentstage.com/archives/2007/01/echoes_and_real.html"&gt;contacted Richard Powers&lt;/A&gt; in disbelief about the dictation of his recent work (see &lt;A class="" href="http://blogs.msdn.com/spokenword/archive/2007/01/16/writing-with-speech.aspx" mce_href="http://blogs.msdn.com/spokenword/archive/2007/01/16/writing-with-speech.aspx"&gt;my last post&lt;/A&gt;). Here's &lt;A class="" href="http://www.ascentstage.com/archives/2007/01/echoes_and_real.html#347" mce_href="http://www.ascentstage.com/archives/2007/01/echoes_and_real.html#347"&gt;Mr Powers' reply&lt;/A&gt; about his correction UI: &lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;...the combination of stylus and speech is far more fluid and powerful for me than a keyboard ever was. Two taps are usually all it takes to fix a misheard word. Occasionally, I highlight a phrase and speak again. When a word isn’t in the speech lexicon, I handwrite it in. Using handwriting, I can easily change individual letters faster than it would take me to navigate and correct with the arrow keys, backspace, and letter keys. On rare occasions when it’s needed, I can also drop into spelling mode and speak the spelling of a word out loud. As a last resort, for short acronyms, foreign words, etc., I can peck things in with the on-screen keyboard. &lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;Poetry in motion?&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1491385" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/spokenword/archive/tags/Speech-to-text/default.aspx">Speech-to-text</category></item><item><title>Writing with speech</title><link>http://blogs.msdn.com/spokenword/archive/2007/01/16/writing-with-speech.aspx</link><pubDate>Wed, 17 Jan 2007 06:45:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1481424</guid><dc:creator>Stephen Potter</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/spokenword/comments/1481424.aspx</comments><wfw:commentRss>http://blogs.msdn.com/spokenword/commentrss.aspx?PostID=1481424</wfw:commentRss><wfw:comment>http://blogs.msdn.com/spokenword/rsscomments.aspx?PostID=1481424</wfw:comment><description>&lt;P&gt;&lt;A class="" href="http://www.nytimes.com/2007/01/07/books/review/Powers2.t.html" mce_href="http://www.nytimes.com/2007/01/07/books/review/Powers2.t.html"&gt;Speaking in last Sunday's New York Times&lt;/A&gt;, novelist Richard Powers tells how he has used speech recognition on his TabletPC to compose all his writings of recent years. &lt;/P&gt;
&lt;P&gt;It's a fine article - you could quote him almost anywhere on the obstacles of the keyboard, on speech recognition (SR) speed and accuracy, on getting accustomed to talking to a machine, or on "speakos and mondegreens"... And it says so much about speech interface technology - it would have been inconceivable only five years ago. &lt;/P&gt;
&lt;P&gt;One factor behind this is the productivity of the TabletPC correction UI:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;I speak untethered, without a headset, into the slate’s microphone array. The words appear as fast as I can speak, or they wait out my long pauses. I touch them up with a stylus, scribbling or re-speaking as needed. Whole phrases die and revive, as quickly as I could have hit the backspace.&lt;/EM&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Stylus vs. Keyboard. Editing is more direct and efficient with the stylus than with mouse and keyboard, given the word-on-word nature of transcript editing. (And assuming Mr Powers is using TabletPC's built-in speech recognition UI - hats off to &lt;A class="" href="http://blogs.msdn.com/robch/default.aspx" mce_href="http://blogs.msdn.com/robch/default.aspx"&gt;Rob &lt;/A&gt;and team who put it there!)&lt;/P&gt;
&lt;P&gt;Having read, years ago, &lt;EM&gt;&lt;A class="" href="http://www.amazon.com/o/ASIN/0312423136/ref=s9_asin_image_1/102-9559785-7207307" mce_href="http://www.amazon.com/o/ASIN/0312423136/ref=s9_asin_image_1/102-9559785-7207307"&gt;Galatea 2.2&lt;/A&gt;&lt;/EM&gt;, I almost wish this article had gone behind the lines of the human interface and into the SR technology itself - the hum of the decoders, the scatter-charts of the acoustic models, the forced couplings of the words in the language models... If anyone can surface the poetry of a speech recognition engine, it's the writer who brought to life the neural networks of a fictional machine that is taught to listen and (heartbreakingly) to speak, in that book.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1481424" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/spokenword/archive/tags/Speech-to-text/default.aspx">Speech-to-text</category></item></channel></rss>