<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Jay's blog on text-to-speech (Now Defunct)</title><link>http://blogs.msdn.com/texttospeech/default.aspx</link><description>What's up with text-to-speech (TTS) and Microsoft? Heck, what's up with TTS in general these days? Speech, language, and technology. Cool stuff, indeed.</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Moving away from Text to Speech... but other great folks are still here</title><link>http://blogs.msdn.com/texttospeech/archive/2006/10/02/Moving-away-from-Text-to-Speech_2E002E002E00_-but-other-great-folks-are-still-here.aspx</link><pubDate>Tue, 03 Oct 2006 01:16:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:783008</guid><dc:creator>jaywaltm</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/783008.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=783008</wfw:commentRss><description>&lt;P&gt;I'm no longer blogging on&amp;nbsp;text to speech; however, there are still a lot of great people blogging on speech at Microsoft.&amp;nbsp;Your best contact is going to be Chuck Opperman who has a blog related to TTS and Speech API content:&lt;/P&gt;
&lt;P&gt;&lt;FONT color=#0033cc&gt;&lt;A href="http://blogs.msdn.com/chuckop/"&gt;&lt;STRONG&gt;http://blogs.msdn.com/chuckop/&lt;/STRONG&gt;&lt;/A&gt;&lt;/FONT&gt; &lt;/P&gt;
&lt;P&gt;Also, the Speech team participates on &lt;A href="http://blogs.msdn.com/sprague/archive/2006/06/21/642393.aspx"&gt;&lt;STRONG&gt;&lt;FONT color=#006bad&gt;newsgroups&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/A&gt; and on a &lt;A href="http://tech.groups.yahoo.com/group/ms-speech/"&gt;&lt;STRONG&gt;&lt;FONT color=#006bad&gt;Yahoo Speech forum&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/A&gt;&amp;nbsp;if you are looking for additional insight.&lt;/P&gt;
&lt;P&gt;Best, &lt;/P&gt;
&lt;P&gt;Jay&lt;/P&gt;
&lt;P mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=783008" width="1" height="1"&gt;</description></item><item><title>Model Talker will allow individuals to create their own custom TTS voice</title><link>http://blogs.msdn.com/texttospeech/archive/2006/07/12/662916.aspx</link><pubDate>Wed, 12 Jul 2006 05:03:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:662916</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/662916.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=662916</wfw:commentRss><description>&lt;P&gt;It was only a matter of time.... AgoraNet, Inc. appears to be in the process of introducing the &lt;A href="http://www.modeltalker.com/mt.php"&gt;Model Talker system of TTS&lt;/A&gt;, allowing an individual to create a TTS engine based on her voice&amp;nbsp;to be used with any SAPI 5.0-compliant system (Windows XP/Vista). The technology is targeted towards individuals losing their ability to speak. Very cool.&lt;/P&gt;
&lt;P&gt;I can imagine that someday you'll check into a new company and, as part of your new hire task list, you'll be asked to sit down at your desk for the first hour of each day recording a series of sentences. At the end of a few weeks, you'll upload your recordings to HR and they will run them through a tool and then send the result to your IT team. Now, when you send an email to someone in your company, they can listen to your email - with not just any TTS voice, but a TTS version of your very own voice.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=662916" width="1" height="1"&gt;</description></item><item><title>The Blizzard Challenge: Rate text to speech samples</title><link>http://blogs.msdn.com/texttospeech/archive/2006/06/30/652466.aspx</link><pubDate>Fri, 30 Jun 2006 19:43:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:652466</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/652466.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=652466</wfw:commentRss><description>&lt;P&gt;What's the Blizzard Challenge? In a shell's nut, it's a competition that provides researchers and speech labs with the same set of originally recorded waves (i.e., 5 hours of a male speaker), and then challenges teams to create the "best" (i.e., most natural and intelligible) TTS voice. If it were an architecture challenge, it would be like giving the same set of building materials to many different architects to come up with the most functional and beautiful building.&lt;/P&gt;
&lt;P&gt;And now&amp;nbsp;rating the results is open to the public here:&lt;/P&gt;
&lt;P&gt;&lt;A href="http://www.speech.cs.cmu.edu/blizzard2006/register-R.html"&gt;http://www.speech.cs.cmu.edu/blizzard2006/register-R.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;It took me about 20 minutes to complete the study. There are five tasks: the first three involve scoring waves, and the last two involve transcription. It's probably a good indication of how good a TTS system you can get with 5 hours of recordings. I can't tell who all of the participating groups are at this point. Festvox has a little bit more &lt;A href="http://festvox.org/blizzard/"&gt;information on its site&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=652466" width="1" height="1"&gt;</description></item><item><title>Vista Beta 2 Screen Reader and Text to Speech</title><link>http://blogs.msdn.com/texttospeech/archive/2006/06/09/624203.aspx</link><pubDate>Fri, 09 Jun 2006 21:22:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:624203</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/624203.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=624203</wfw:commentRss><description>&lt;P&gt;If you are running Vista Beta 2, check out the new text to speech functionality using the Narrator screen reader program. (Narrator is also enabled in Windows XP; however, the Vista TTS voice is way better in terms of naturalness and intelligibility IMO compared to MS Sam).&lt;/P&gt;
&lt;P&gt;You can&amp;nbsp;start Narrator in one of two ways. &lt;/P&gt;
&lt;P&gt;1. If you have the "Run" command enabled in your Start menu, go to START &amp;gt; Run and type "narrator.exe".&amp;nbsp;(Egads, no, the "Run" command&amp;nbsp;is not there by default. To get it there, right-click on the taskbar and go to PROPERTIES &amp;gt; click on the START MENU tab &amp;gt; click on the Customize button and then find the check box for the &amp;nbsp;"Run-command on".)&lt;/P&gt;
&lt;P&gt;2. Or, simply go to the control panel and click on "Ease of Access". There you'll find the "Narrator" button to get you started.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=624203" width="1" height="1"&gt;</description></item><item><title>Text to Speech in Mission Impossible 3: A Dissection</title><link>http://blogs.msdn.com/texttospeech/archive/2006/05/12/596477.aspx</link><pubDate>Fri, 12 May 2006 23:47:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:596477</guid><dc:creator>jaywaltm</dc:creator><slash:comments>8</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/596477.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=596477</wfw:commentRss><description>&lt;P&gt;Besides being the best of the three MI movies, there were 2 instances of TTS in the movie that deserve some discussion (and clarification). One of the scenes was simple and plausible, while the second was a definite stretch (i.e., not doable by today's technology).&lt;/P&gt;
&lt;P&gt;In the first scene with TTS, one of the "good" guys was automating the destruction of a warehouse full of "bad" guys, using vehicles equipped with large guns. When the automation started, the computer began speaking out some information using TTS. I'm pretty sure it was Mac OS X TTS. Definitely low on the naturalness scale, but intelligible nonetheless. (Can anyone confirm which TTS voice this was?)&lt;/P&gt;
&lt;P&gt;In the second scene(s), THE "good" guy (i.e., Tom Cruise's character) forces THE bad guy to read several sentences that are syntactically well formed but semantically nonsensical off of a business-card-sized card at gunpoint. Within seconds of completing the reading of the card, another "good" guy has intercepted the wave beneath the complex to generate a highly natural and intelligible TTS voice which is sent back to our protagonist in a bathroom who then can talk with the&amp;nbsp;"bad" guy's voice.&amp;nbsp; OK, so I'm actually quite forgiving in movies, giving the technology the benefit of the doubt (i.e., I pretend that I'm watching Sci-Fi and not a modern day action movie). So, if we assume that this was some other technology beyond TTS, great. No worries. However, if you are insisting that the movie follow current plausible technology, then here's what's wrong with the TTS in this second scene:&lt;/P&gt;
&lt;P&gt;1) The TTS engine was generated from several sentences. Today, it takes many, many hours of recordings to generate a natural-sounding engine.&lt;/P&gt;
&lt;P&gt;2) The recording was done in a bathroom next to a loud party and then streamed to a nearby underground location. Not likely to result in the high quality recordings that one would need for TTS.&lt;/P&gt;
&lt;P&gt;3) The recording was streamed through rock. I'm imagining that some signal loss would be encountered in real life.&lt;/P&gt;
&lt;P&gt;4) The resulting TTS sounded almost EXACTLY (egads, as if it truly was the other actor speaking with Tom lip-synching) like the "Bad" guy! &amp;nbsp;Even on the BEST concatenative engines (i.e., based on 40+ hours of recording a person's voice), it won't sound just like the real person.&lt;/P&gt;
&lt;P&gt;Comments? Alternative takes?&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=596477" width="1" height="1"&gt;</description></item><item><title>Problems with French Liaison in TTS - "A thread is a happy husband?"</title><link>http://blogs.msdn.com/texttospeech/archive/2006/03/23/559319.aspx</link><pubDate>Thu, 23 Mar 2006 23:49:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:559319</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/559319.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=559319</wfw:commentRss><description>&lt;P&gt;Recently I was talking to a colleague who is a native French speaker (hey, that's the great benefit of working here at Microsoft - lots of non-English speakers around) and she pointed out an interesting occurrence of mispronunciation in French TTS that she found while casually taking advantage of speech technology (she knows I dig TTS).&amp;nbsp;The phrase is the following:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN lang=FR-CA style="FONT-SIZE: 10pt; COLOR: navy; FONT-FAMILY: Arial; mso-ansi-language: FR-CA"&gt;"Un fils et un mari heureux"&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN lang=FR-CA style="FONT-SIZE: 10pt; COLOR: navy; FONT-FAMILY: Arial; mso-ansi-language: FR-CA"&gt;&lt;/SPAN&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;Literally, this translates into English as "A son and a happy husband"; however,&amp;nbsp;it comes out sounding like&amp;nbsp;"A&amp;nbsp;thread is a happy husband!"&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;Why is that? Well, the word for&amp;nbsp;"son"&amp;nbsp;should be pronounced as [f i s] but comes out as [f i l]. And "fil" means&amp;nbsp;string or thread. Also, the "et un" string should be [eh ah] as in "and a", but comes out incorrectly as [e tah], meaning "is a" (please forgive my phonetic convention). All you linguists know that this is all due to liaison.&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;TTS is cool... but only as good as its sons, threads, and husbands. &lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=559319" width="1" height="1"&gt;</description></item><item><title>Shakespeare video with text to speech</title><link>http://blogs.msdn.com/texttospeech/archive/2006/02/14/532068.aspx</link><pubDate>Wed, 15 Feb 2006 00:15:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:532068</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/532068.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=532068</wfw:commentRss><description>&lt;P&gt;Egads, it's quite clear that Shakespeare wasn't meant to be read with a text to speech voice... or at least not with the Microsoft Mary and Microsoft Mike voices, which are representative of older MS technology (old in terms of what users will get with Vista). At any rate, &lt;A href="http://medialab.ifc.com/film_detail.jsp?film_id=542&amp;amp;list=1"&gt;check out this video&lt;/A&gt; if you want to see TTS acting at its worst.&lt;/P&gt;
&lt;P&gt;This video suggests that having emotive TTS might be a feature worth supporting. I know that &lt;A href="http://www.loquendo.com"&gt;Loquendo &lt;/A&gt;has such a feature with its TTS, but it appears to be limited to a few pre-recorded emotive sayings.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=532068" width="1" height="1"&gt;</description></item><item><title>What is TTS? How does it work?</title><link>http://blogs.msdn.com/texttospeech/archive/2006/02/08/527214.aspx</link><pubDate>Wed, 08 Feb 2006 05:21:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:527214</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/527214.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=527214</wfw:commentRss><description>&lt;p&gt;I recently went over &lt;a href="http://en.wikipedia.org/wiki/Speech_synthesis"&gt;the entry&amp;nbsp;in Wikipedia on text-to-speech&lt;/a&gt;, and it turns out to be a really good overview of the technology. If you are interested in its history and the various components that would have to be handled by any TTS system, I highly recommend this site. 
There are also a number of helpful links at the end of the article.&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=527214" width="1" height="1"&gt;</description></item><item><title>Another blogger on TTS</title><link>http://blogs.msdn.com/texttospeech/archive/2006/01/30/519580.aspx</link><pubDate>Mon, 30 Jan 2006 22:23:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:519580</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/519580.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=519580</wfw:commentRss><description>Check out &lt;a href="http://spaces.msn.com/bhandler/blog/cns!70F64BC910C9F7F3!727.entry"&gt;Blake's blog here &lt;/a&gt;and you can see some of the enthusiasm around text to speech. Blake's site, The Road to Know Where, is loaded with all sorts of tips&amp;nbsp;for Windows users. I'll know where to send my mom the next time she is looking for helpful user hints. :)&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=519580" width="1" height="1"&gt;</description></item><item><title>Get El Mundo in podcast in Spanish</title><link>http://blogs.msdn.com/texttospeech/archive/2006/01/27/518167.aspx</link><pubDate>Fri, 27 Jan 2006 06:25:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:518167</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/518167.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=518167</wfw:commentRss><description>&lt;P&gt;El Mundo is one of the more well known newspapers in Spain. And they are now offering an &lt;A href="http://www.elmundo.es/tts/rosa_portada.html"&gt;MP3 podcast download of the daily news&lt;/A&gt;. 
It appears that the voice of Rosa is the very&amp;nbsp;same &lt;A href="http://www.naturalvoices.att.com/demos/"&gt;Rosa of AT&amp;amp;T Natural Voices&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;I found it interesting that the podcast actually caveats its TTSness on start-up. That is, it says, &lt;EM&gt;many may be surprised by my tone of voice... but keep in mind that I'm a machine. &lt;/EM&gt;My guess is that people are a lot less critical of a TTS voice if you tell them ahead of time that it's a computer rather than a real voice. That would be an interesting study if it hasn't already been done. &lt;/P&gt;
&lt;P&gt;(Special thanks to Enrique Lopez Diaz for pointing out the El Mundo TTS site.)&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=518167" width="1" height="1"&gt;</description></item><item><title>Who else blogs on speech at Microsoft?</title><link>http://blogs.msdn.com/texttospeech/archive/2006/01/19/515048.aspx</link><pubDate>Fri, 20 Jan 2006 00:45:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:515048</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/515048.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=515048</wfw:commentRss><description>(No, it can't just be me... oh, and it isn't...) &lt;a href="https://blogs.msdn.com:443/spokenword/archive/2005/11/30/498709.aspx"&gt;Stephen Potter has an extensive list of bloggers&lt;/A&gt; across several teams that get a thrill out of talking Speech. And while I'm the only one focused on TTS (life beyond TTS?), a lot of interest around here is in Speech Recognition.&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=515048" width="1" height="1"&gt;</description></item><item><title>How do you pronounce the digit "1"? </title><link>http://blogs.msdn.com/texttospeech/archive/2006/01/17/513861.aspx</link><pubDate>Tue, 17 Jan 2006 20:38:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:513861</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/513861.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=513861</wfw:commentRss><description>&lt;P&gt;This is an interesting, if not challenging, problem for TTS systems. So, how should you pronounce "1"? Should an English TTS system simply say "one" when it encounters "1"? (Okay, the answer is "no", and here's why:)&lt;/P&gt;
&lt;P&gt;Let's look at the following English sentences:&lt;/P&gt;
&lt;BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px"&gt;
&lt;P&gt;(a)&amp;nbsp; &lt;EM&gt;I have 1 friend&lt;/EM&gt;.&amp;nbsp; ("i have one friend")&lt;/P&gt;
&lt;P&gt;(b)&amp;nbsp; &lt;EM&gt;Can you meet me on 1/3/05?&lt;/EM&gt;&amp;nbsp; ("can you meet me on january third two thousand and five?")&lt;/P&gt;
&lt;P&gt;(c)&amp;nbsp; &lt;EM&gt;My birthday is on March 1. &lt;/EM&gt;("my birthday is on march first.")&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;In sentence (a), we can read the digit "1" as "one". But in sentence (b), the digit "1" can be spoken as "January" because it is in the context of a date. And in sentence (c), the "1" commonly takes on an ordinal reading by being pronounced as "first".&lt;/P&gt;
&lt;P&gt;This problem of properly disambiguating written text into correct word expansions is known as text normalization. For TTS systems it is hard because the right expansion depends on the context. While digits are the most frequently affected, other kinds of patterns are also very problematic (I'll save this for another post).&lt;/P&gt;
&lt;P&gt;So, the digit "1" has several pronunciations in English, as just seen. But what about in another language such as Spanish?&lt;/P&gt;
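&lt;P&gt;To make the three English readings concrete, here is a minimal Python sketch of a rule-based normalizer. It is only an illustration, not how any shipping TTS engine works: the function names (cardinal, ordinal, normalize), the regular expressions, and the assumption that a two-digit year means 20xx are all mine. It covers just the patterns in sentences (a), (b), and (c), and only numbers below 100.&lt;/P&gt;
&lt;PRE&gt;
```python
import re

# Hypothetical lookup tables for this sketch; numbers 0 to 99 only.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
IRREGULAR = {"one": "first", "two": "second", "three": "third",
             "five": "fifth", "eight": "eighth", "nine": "ninth",
             "twelve": "twelfth"}
MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]

# Assumed US-style month/day/two-digit-year dates, as in sentence (b).
DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{2})\b")
# A month name followed by a day number, as in sentence (c).
MONTH_DAY = re.compile(r"\b(" + "|".join(MONTHS) + r")\s+(\d{1,2})\b",
                       re.IGNORECASE)
# Any remaining one- or two-digit number, as in sentence (a).
NUMBER = re.compile(r"\b\d{1,2}\b")


def cardinal(n):
    """Spell out 0..99 as a cardinal, e.g. 21 becomes 'twenty one'."""
    if n in range(20):
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + (" " + ONES[ones] if ones else "")


def ordinal(n):
    """Spell out 0..99 as an ordinal by rewriting the last word."""
    words = cardinal(n).split()
    last = words[-1]
    if last in IRREGULAR:
        last = IRREGULAR[last]
    elif last.endswith("y"):
        last = last[:-1] + "ieth"
    else:
        last += "th"
    return " ".join(words[:-1] + [last])


def expand_date(m):
    """Read m/d/yy as 'month ordinal-day year', assuming the year is 20yy."""
    month, day, year = (int(g) for g in m.groups())
    if month not in range(1, 13) or day not in range(1, 32):
        return m.group(0)  # not a plausible date; leave it untouched
    year_words = "two thousand" + (" and " + cardinal(year) if year else "")
    return MONTHS[month - 1] + " " + ordinal(day) + " " + year_words


def normalize(text):
    """Apply the most specific pattern first, the bare number last."""
    text = DATE.sub(expand_date, text)
    text = MONTH_DAY.sub(
        lambda m: m.group(1) + " " + ordinal(int(m.group(2))), text)
    return NUMBER.sub(lambda m: cardinal(int(m.group(0))), text)
```
&lt;/PRE&gt;
&lt;P&gt;The order of the rules matters: dates are tried first, then month-plus-day, and only then the plain cardinal reading. Even so, the month rule is easy to fool. In a sentence like "I may 1 day visit", it would read the "1" as "first", which is exactly why context is the hard part.&lt;/P&gt;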
&lt;P&gt;Let's look at similar Spanish sentences:&lt;/P&gt;
&lt;BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px"&gt;
&lt;P&gt;(d)&amp;nbsp;&amp;nbsp;&lt;EM&gt;Yo tengo&amp;nbsp;1 amigo.&lt;/EM&gt;&amp;nbsp; ("i have one friend") - "1" pronounced as "un"&lt;/P&gt;
&lt;P&gt;(e)&amp;nbsp; &lt;EM&gt;Yo tengo&amp;nbsp;1 amiga.&lt;/EM&gt;&amp;nbsp; ("i have one friend") - "1" pronounced as "una"&lt;/P&gt;
&lt;P&gt;(f)&amp;nbsp; &lt;EM&gt;Yo tengo 1.&lt;/EM&gt;&amp;nbsp; ("i have one") - "1" pronounced as "uno"&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P dir=ltr style="MARGIN-RIGHT: 0px"&gt;In these examples, the "1" can take on three different pronunciations! Here it's not the semantic context (e.g., "date", "time", "fraction") that requires disambiguation; rather, what matters is the gender of the word that the&amp;nbsp;"1" modifies, or the part of speech of the "1" itself.&amp;nbsp;So, in sentence (d) the pronunciation is "un" because the following noun is masculine, but in sentence (e) the pronunciation is "una" because "amiga" is feminine. But, in (f), the pronunciation is "uno" because the "1" is itself acting as a noun.&lt;/P&gt;
&lt;P dir=ltr style="MARGIN-RIGHT: 0px"&gt;Can you see why proper identification of part of speech is so important for text normalization?&lt;/P&gt;
&lt;P dir=ltr style="MARGIN-RIGHT: 0px"&gt;You might be thinking of how to write a rule to capture the distinctions in sentences (d), (e), and (f). For example: if a noun follows the digit "1", read&amp;nbsp;the "1" as "un" or "una" to match the noun's gender; otherwise, read the "1" as "uno". However, can your rule account for the following long-distance dependency?&lt;/P&gt;
&lt;BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px"&gt;
&lt;P dir=ltr style="MARGIN-RIGHT: 0px"&gt;(g)&amp;nbsp; De todas las chicas en la clase, hay 1 que me gusta.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P dir=ltr style="MARGIN-RIGHT: 0px"&gt;Does your TTS system get (g) correct? (Note, the "1" should be read out as "una".)&lt;/P&gt;
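&lt;P dir=ltr style="MARGIN-RIGHT: 0px"&gt;For the curious, here is a rough Python sketch of one way to approach the Spanish cases. Everything in it is an assumption made for illustration: the tiny GENDER lexicon, the function names, and especially the fallback heuristic of agreeing with the nearest earlier noun when no noun follows the "1". A real system would need a full dictionary or a part-of-speech tagger.&lt;/P&gt;
&lt;PRE&gt;
```python
import re

# Hypothetical mini-lexicon of noun genders, just enough for (d) to (g).
GENDER = {"amigo": "m", "amiga": "f", "chico": "m", "chica": "f",
          "clase": "f", "libro": "m"}


def gender_of(word):
    """Return 'm', 'f', or None if the word is not a known noun."""
    w = word.lower()
    # Crude plural handling: also try the word without trailing "s".
    return GENDER.get(w) or GENDER.get(w.rstrip("s"))


def normalize_uno(text):
    """Expand each standalone digit 1 as 'un', 'una', or 'uno'."""
    def expand(m):
        # Case 1: a known noun directly follows, so "1" modifies it.
        following = re.match(r"\s+(\w+)", text[m.end():])
        if following:
            g = gender_of(following.group(1))
            if g:
                return "un" if g == "m" else "una"
        # Case 2: "1" stands alone; agree with the nearest earlier noun.
        for word in reversed(re.findall(r"\w+", text[:m.start()])):
            g = gender_of(word)
            if g:
                return "uno" if g == "m" else "una"
        # Case 3: no gender information at all; fall back to "uno".
        return "uno"

    return re.sub(r"\b1\b", expand, text)
```
&lt;/PRE&gt;
&lt;P dir=ltr style="MARGIN-RIGHT: 0px"&gt;This sketch does read sentence (g) with "una", but partly by luck: the nearest earlier noun is "clase", which happens to be feminine just like "chicas". Swap in a masculine noun between the antecedent and the "1" and the heuristic breaks, so long-distance agreement really does call for more than a local rule.&lt;/P&gt;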
&lt;P dir=ltr style="MARGIN-RIGHT: 0px"&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=513861" width="1" height="1"&gt;</description></item><item><title>Listen to emails and word documents in MP3 format on Pocket PC</title><link>http://blogs.msdn.com/texttospeech/archive/2006/01/04/509409.aspx</link><pubDate>Thu, 05 Jan 2006 01:44:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:509409</guid><dc:creator>jaywaltm</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/509409.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=509409</wfw:commentRss><description>&lt;P&gt;Just when you thought you might be able to use your car commute time to sit back and relax, listen to NPR, and creep slowly across the Seattle bridge... TTS is there to readout that document that you didn't have time to read the night before.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://www.magnetictime.com/"&gt;Magnetictime &lt;/A&gt;has a product that I have not tested, but I thought I'd mention it here as TTS is core to its functionality. It also uses &lt;A href="http://demo.acapela-group.com/"&gt;Acapela &lt;/A&gt;engines. "MT1" is the name of the program and apparently it will read out your Outlook mail as well as Word docs on your Pocket PC device. When you create a new folder in Outlook, text is converted into waves and the program can then transfer the waves directly to your device. It currently works for English, but is soon to work with more languages.&lt;/P&gt;
&lt;P&gt;Seems like a cool app if you have a device that's not a smart phone. It doesn't appear that you'd get up to the minute info... you have to download to the device before you hit the road. And at least you aren't using precious airtime minutes to use it either.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=509409" width="1" height="1"&gt;</description></item><item><title>Deskbot adds cool animations with text to speech</title><link>http://blogs.msdn.com/texttospeech/archive/2005/12/29/508078.aspx</link><pubDate>Fri, 30 Dec 2005 00:35:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:508078</guid><dc:creator>jaywaltm</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/508078.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=508078</wfw:commentRss><description>&lt;P&gt;I recently found &lt;A href="http://www.mlive.com/entertainment/grpress/index.ssf?/base/entertainment-1/1135007108184530.xml&amp;amp;coll=6#continue"&gt;an article &lt;/A&gt;describing the &lt;A href="http://www.bellcraft.com/deskbot/"&gt;Deskbot &lt;/A&gt;which is one of the &lt;STRONG&gt;cooler&lt;/STRONG&gt; text to speech reader programs available IMHO. Oh, and it's free! (BTW, does your TTS engine correctly readout the string "IMHO" as "in my humble opinion"? -- it should). (BTW, Neospeech's Paul reads it right, but XP Sam does not).&lt;/P&gt;
&lt;P&gt;So what's Deskbot do? After it's installed, a small animated character appears on the screen (you can put the character anywhere you want). It then speaks text from any window when the text is copied to the clipboard, or when the character is double-clicked while text is highlighted. It will also announce the time at specified intervals. And the best part is that you don't need to have loaded some special screen reader program.&lt;/P&gt;
&lt;P&gt;You can also change options like altering the voice (pitch, speed, volume). Oh, and you can even change the character by downloading an MS Agent compatible character through the &lt;A href="http://www.msagentring.org/chars.aspx"&gt;MS Agent Ring site&lt;/A&gt;.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=508078" width="1" height="1"&gt;</description></item><item><title>New Hindi text to speech voice. Are you a native Hindi speaker? What's your opinion?</title><link>http://blogs.msdn.com/texttospeech/archive/2005/12/13/503301.aspx</link><pubDate>Wed, 14 Dec 2005 00:18:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:503301</guid><dc:creator>jaywaltm</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/texttospeech/comments/503301.aspx</comments><wfw:commentRss>http://blogs.msdn.com/texttospeech/commentrss.aspx?PostID=503301</wfw:commentRss><description>&lt;P&gt;There was &lt;A href="http://www.thehindubusinessline.com/blnus/15061310.htm"&gt;an article on the web &lt;/A&gt;that describes a company with a TTS voice for Hindi. And for all you SAPI devs, the desktop version is SAPI compliant.&amp;nbsp;According to the article, "'Vaachak', developed by &lt;A href="http://www.prologixsoft.com/vaachak.htm"&gt;Prologix Software Solutions&lt;/A&gt;, is the first, high-quality, Indian language TTS system, which converts Hindi and English text into natural and pleasant sounding speech."&lt;/P&gt;
&lt;P&gt;So, is it any good? I can tell you that the acoustic quality, at least from the demos, sounds a bit compressed and there are certainly some artifacts from concatenation. But still, who else out there has a Hindi TTS system? &lt;/P&gt;
&lt;P&gt;What do you think? Are you a native speaker of Hindi? I'd love to get your impressions of the &lt;A href="http://www.prologixsoft.com/downloads.htm"&gt;sample waves &lt;/A&gt;or you can use the &lt;A href="http://vaachak.mla.iitk.ac.in/vaachak/speak.htm"&gt;demo to enter in Hindi text and have it read out&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=503301" width="1" height="1"&gt;</description></item></channel></rss>