<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Office Natural Language Team Blog</title><link>http://blogs.msdn.com/b/naturallanguage/</link><description /><dc:language>en-US</dc:language><generator>Telligent Community 5.6.583.19199 (Build: 5.6.583.19199)</generator><item><title>Want to give input on proofing tools? Participate in the Proofing Tools Survey!</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2010/03/08/want-to-give-input-on-proofing-tools-participate-in-the-proofing-tools-survey.aspx</link><pubDate>Mon, 08 Mar 2010 23:34:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9975177</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>8</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=9975177</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2010/03/08/want-to-give-input-on-proofing-tools-participate-in-the-proofing-tools-survey.aspx#comments</comments><description>&lt;P&gt;We are currently running a survey on how users edit and review content and how they use proofing tools such as the spell checker, the grammar checker or the thesaurus in Microsoft Office. We want to learn more about user preferences so we can improve the user experience in some future version of Office. The survey takes about 15 minutes. If you are interested in participating please go to &lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt 0.5in" class=MsoNormal&gt;&lt;FONT color=#0000ff size=3 face=Calibri&gt;&lt;A href="http://office.microsoft.com/en-us/word/FX100649251033.aspx" mce_href="http://office.microsoft.com/en-us/word/FX100649251033.aspx"&gt;http://office.microsoft.com/en-us/word/FX100649251033.aspx&lt;/A&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;and look for the "Help us make Word Better!" field in the lower section of the page. Click "Start the Survey" to participate.&lt;/P&gt;
&lt;P&gt;We are looking forward to hearing from you!&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Stefanie Schiller&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;Natural Language Group - Program Manager&lt;/EM&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt 0.5in" class=MsoNormal mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9975177" width="1" height="1"&gt;</description></item><item><title>Contextual spelling for French in Office 2010</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2009/07/17/contextual-spelling-for-french-in-office-2010.aspx</link><pubDate>Sat, 18 Jul 2009 00:05:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9837869</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>3</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=9837869</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2009/07/17/contextual-spelling-for-french-in-office-2010.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;At the &lt;I&gt;Worldwide Partner Conference &lt;/I&gt;in New Orleans on 13 July 2009, we announced the launch of the Office 2010 Technical Preview. This technical preview can now be downloaded by thousands of customers. You can discover the innovations on the &lt;/FONT&gt;&lt;A href="http://blogs.technet.com/office2010/"&gt;&lt;FONT face=Calibri size=3&gt;Office 2010 blog&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; and watch really cool videos on &lt;/FONT&gt;&lt;A href="http://www.microsoft.com/office2010"&gt;&lt;FONT face=Calibri size=3&gt;www.microsoft.com/office2010&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. 
My colleague Stefanie Schiller wrote a few words &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2009/07/16/proofing-tools-in-office-2010.aspx"&gt;&lt;FONT face=Calibri size=3&gt;about the proofing tools integrated in this Technical Preview&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; and about some of the improvements we have made, specifically with respect to the English thesaurus.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;French-speaking users will also be delighted. We have talked on multiple occasions on this blog about the &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/tags/contextual+speller/default.aspx"&gt;&lt;FONT face=Calibri size=3&gt;English and Spanish contextual spellers&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; that we launched in Office 2007 (which also includes such a tool for German). We have improved them all in Office 2010 and we are happy to introduce a brand-new contextual speller for French which, when added to the French spellchecker and grammar checker, will greatly improve our French-speaking users’ proofing experience.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The French contextual speller in Office 2010 detects many mistakes that the traditional proofing tools have so far missed. Unlike the grammar checker, which is based on a syntactic parser, our contextual speller is based on statistical analyses of very large textual corpora and on “language models”, which enable the program to compare the user’s text with huge lists of word sequences and their frequencies. Words that exist in the language but are used improperly in a given context can then be flagged. A blue squiggly line will appear under mistakes such as the following:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt;Ils &lt;U&gt;on&lt;/U&gt; faim. (on &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; ont)&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt;Elles &lt;U&gt;son&lt;/U&gt; malades. (son &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; sont)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;I&gt;&lt;U&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;Quand&lt;/SPAN&gt;&lt;/U&gt;&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt; à moi, j’avoue que je &lt;U&gt;sui&lt;/U&gt; fier de lui. (Quand &lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; Quant&amp;nbsp;; sui &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; suis)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt;Si je &lt;U&gt;peu&lt;/U&gt; me permettre, dans son &lt;U&gt;fort&lt;/U&gt; intérieur, elle pense qu’elle a raison. (peu &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; peux&amp;nbsp;; fort &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; for)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;I&gt;&lt;U&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;Se&lt;/SPAN&gt;&lt;/U&gt;&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt; test montre que le correcteur ne fonctionne pas trop mal. (Se &lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; Ce)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt;L’installation de la fosse &lt;U&gt;sceptique&lt;/U&gt; a pris plus de temps que prévu. (sceptique &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; septique)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt;Il arrive cet &lt;U&gt;après midi&lt;/U&gt;. (après midi &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; après-midi)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt;Mon frère &lt;U&gt;ma&lt;/U&gt; dit qu’il ne viendrait pas. (ma &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; m’a)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt;Il y a &lt;U&gt;long temps&lt;/U&gt; que je&amp;nbsp;l’aime, jamais je ne l’oublierai… (chanson populaire) (long temps &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;I&gt;→&lt;/I&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt; longtemps)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;I&gt;&lt;FONT face=Calibri&gt;&lt;FONT size=3&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;En &lt;U&gt;temps&lt;/U&gt; que client de l’hôtel, vous avez gratuitement accès à l’Internet. &lt;/SPAN&gt;(temps &lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;I&gt;&lt;FONT size=3&gt;→&lt;FONT face=Calibri&gt; tant)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT face=Calibri size=3&gt;The screenshot below shows this contextual speller in action:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;IMG height=696 src="http://blogs.msdn.com/photos/naturallanguage/images/9834691/original.aspx" width=1310&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;What is a “&lt;B&gt;contextual speller&lt;/B&gt;”? As you know, the traditional Office spellchecker flags the odd typo (omission of a letter, permutation of two letters, etc.) with a red squiggly line. The grammar checker deals with agreement mistakes (such as between a verb and its subject, or agreement in number and gender between a noun and an adjective in French). Mistakes involving words that are pronounced similarly but spelled differently are very hard to detect, however. Anyone who knows a bit of French knows how frequently people (native and non-native speakers alike) mix up similar-sounding words like &lt;I&gt;son/sont&lt;/I&gt; or &lt;I&gt;on/ont&lt;/I&gt;. If I write “&lt;I&gt;Ils on faim&lt;/I&gt;” (they are hungry), a grammar checker based on a syntactic parser has difficulty detecting the mistake (“&lt;I&gt;on&lt;/I&gt;” should read “&lt;I&gt;ont&lt;/I&gt;”) because the erroneous sentence is made up of a pronoun (&lt;I&gt;Ils&lt;/I&gt;), followed by another pronoun (&lt;I&gt;on&lt;/I&gt;, instead of the correct verb form &lt;I&gt;ont&lt;/I&gt;) and a noun (&lt;I&gt;faim&lt;/I&gt;). The parser cannot easily make sense of this structure, since it is not a traditional agreement problem as in “&lt;I&gt;Ils mange du pain&lt;/I&gt;” (they eat bread), where “&lt;I&gt;mange&lt;/I&gt;” (a singular verb) should read “&lt;I&gt;mangent&lt;/I&gt;” (plural form).&lt;/FONT&gt;&lt;/P&gt;
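&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;To make the language-model idea more concrete, here is a minimal Python sketch of how word-sequence frequencies can catch “&lt;I&gt;Ils on faim&lt;/I&gt;”. It is only an illustration and not the Office implementation: the tiny corpus, the confusion sets and all function names are invented for this example, and a real system relies on far larger corpora and richer models.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
from collections import Counter

# Hypothetical confusion sets: French words that sound alike.
CONFUSION_SETS = [{"on", "ont"}, {"son", "sont"}, {"ce", "se"}]

# Toy corpus, invented for this sketch. A real language model is built
# from very large textual corpora.
CORPUS = [
    "ils ont faim",
    "elles ont faim",
    "ils ont raison",
    "elles sont malades",
    "ils sont malades",
    "on a faim",
    "son ami a faim",
    "ce test fonctionne",
]


def count_bigrams(sentences):
    """Count adjacent word pairs, with ^ and $ marking sentence edges."""
    counts = Counter()
    for sentence in sentences:
        words = ["^"] + sentence.lower().split() + ["$"]
        counts.update(zip(words, words[1:]))
    return counts


def context_score(words, i, candidate, bigrams):
    """How often the candidate was seen next to its two neighbours."""
    left = bigrams[(words[i - 1], candidate)]
    right = bigrams[(candidate, words[i + 1])]
    return left + right


def flag_contextual_errors(sentence, bigrams):
    """Return (word, suggestion) pairs for words that fit their context
    worse than a sound-alike alternative does."""
    words = ["^"] + sentence.lower().split() + ["$"]
    flags = []
    for i in range(1, len(words) - 1):
        for group in CONFUSION_SETS:
            if words[i] not in group:
                continue
            scores = {c: context_score(words, i, c, bigrams)
                      for c in sorted(group)}
            best = max(scores, key=scores.get)
            # Flag only when an alternative is strictly more frequent here.
            if scores[best] != scores[words[i]]:
                flags.append((words[i], best))
    return flags


BIGRAMS = count_bigrams(CORPUS)
print(flag_contextual_errors("Ils on faim", BIGRAMS))        # [('on', 'ont')]
print(flag_contextual_errors("Elles son malades", BIGRAMS))  # [('son', 'sont')]
print(flag_contextual_errors("On a faim", BIGRAMS))          # []
```
&lt;/PRE&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Note that no parsing is involved: the sketch flags “&lt;I&gt;on&lt;/I&gt;” simply because “&lt;I&gt;ils ont&lt;/I&gt;” and “&lt;I&gt;ont faim&lt;/I&gt;” occur in the toy corpus while “&lt;I&gt;ils on&lt;/I&gt;” and “&lt;I&gt;on faim&lt;/I&gt;” do not.&lt;/FONT&gt;&lt;/P&gt;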
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Of course, you should not expect this tool to flag every kind of mistake. No existing tool can do that, and any tool that tried would probably raise a lot of false flags, or false positives, which tend to irritate the user. I discussed the notions of precision and recall when I blogged about an &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2009/01/08/an-academic-evaluation-of-the-office-2007-contextual-spelling-checker.aspx"&gt;&lt;FONT face=Calibri size=3&gt;academic evaluation of our Office 2007 contextual speller&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. When we developed our tool, we constantly tried to avoid false flags, and our tool has very high precision, which means it rarely makes mistakes when it flags something (in fact, it is right nearly all the time, but there will of course always be mistakes that go undetected). I believe this new contextual speller will quickly prove to be an indispensable tool for many Office 2010 users who write documents in French. It will certainly be a useful complement to the range of proofing tools we make available to them.&lt;/FONT&gt;&lt;/P&gt;
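&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;For readers unfamiliar with the two measures, the short Python sketch below shows how precision and recall are usually computed. The function name and the word positions are made up for illustration; they are not measurements of the Office speller.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
def precision_recall(flagged, actual_errors):
    """Precision: share of flags that are real errors.
    Recall: share of real errors that were flagged."""
    flagged = set(flagged)
    actual_errors = set(actual_errors)
    true_positives = len(flagged.intersection(actual_errors))
    precision = true_positives / len(flagged) if flagged else 1.0
    recall = true_positives / len(actual_errors) if actual_errors else 1.0
    return precision, recall


# Invented example: word positions flagged by a checker versus the
# positions a human proofreader marked as real errors.
flagged_positions = {2, 7, 11}
real_error_positions = {2, 7, 11, 15, 20, 23}
print(precision_recall(flagged_positions, real_error_positions))  # (1.0, 0.5)
```
&lt;/PRE&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;In this invented case every flag is correct (precision 1.0) but only half of the real errors are caught (recall 0.5), which is the kind of trade-off a tool designed to avoid false flags accepts.&lt;/FONT&gt;&lt;/P&gt;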
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Thierry Fontenelle&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT face=Calibri size=3&gt;Microsoft Natural Language Group – Program Manager&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9837869" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/contextual+speller/">contextual speller</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/French/">French</category></item><item><title>Proofing Tools in Office 2010</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2009/07/16/proofing-tools-in-office-2010.aspx</link><pubDate>Fri, 17 Jul 2009 01:45:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9836216</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>63</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=9836216</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2009/07/16/proofing-tools-in-office-2010.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;On Monday this week, Microsoft launched the Office 2010 Technical Preview.&amp;nbsp; Thousands of customers can now download this latest version of Office to try out the cool new features and provide feedback. &lt;/FONT&gt;&lt;/P&gt;
&lt;H2 style="MARGIN: 10pt 0in 0pt"&gt;&lt;SPAN class=MsoSubtleEmphasis&gt;&lt;EM&gt;&lt;FONT size=4&gt;&lt;FONT color=#808080&gt;Updates and Replacements&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/EM&gt;&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The Technical Preview includes proofing tools for English as well as for French and Spanish. We are delighted to announce that these proofing tools include brand-new spell checkers, contextual spellers, hyphenators and thesauri. We are excited about the release of these new proofing tools and eager to receive your feedback. &lt;/FONT&gt;&lt;/P&gt;
&lt;H2 style="MARGIN: 10pt 0in 0pt"&gt;&lt;SPAN class=MsoSubtleEmphasis&gt;&lt;EM&gt;&lt;FONT size=4&gt;&lt;FONT color=#808080&gt;Improvements in the English Thesaurus&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/EM&gt;&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The English thesaurus in particular has seen some changes. You will now be able to find more synonyms in fewer steps. In Office 2007, if you look up the word &lt;I&gt;tables&lt;/I&gt;, the only suggestion is &lt;I&gt;table&lt;/I&gt;, the same as the lookup word, only in the singular. Next you have to look up the word &lt;I&gt;table&lt;/I&gt;, select the synonym you are looking for and manually add an ‘s’ to the word to convert it back to the plural. That is four steps, including a manual edit, for one simple lookup. All of this can be done in a single step in Office 2010. The screenshot below shows the results of the thesaurus in Office 2010 for the word &lt;I&gt;tables&lt;/I&gt;:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt; TEXT-ALIGN: center" align=center&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&lt;/o:p&gt;&lt;IMG src="http://blogs.msdn.com/photos/naturallanguage/images/9837663/original.aspx" mce_src="http://blogs.msdn.com/photos/naturallanguage/images/9837663/original.aspx"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;We call this “inflectional morphology”. Try the same with the word &lt;I&gt;smarter&lt;/I&gt; in your Office 2007 version, and you will see why we think that inflectional morphology is a big step forward. We have also added more data to provide more synonyms for your lookups.&lt;/FONT&gt;&lt;/P&gt;
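&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;As a rough illustration of the idea, the Python sketch below reduces an inflected word to its base form, looks up the synonyms, and then re-applies the same ending. It is a deliberately naive sketch and not how the Office thesaurus is implemented: the mini-thesaurus, the suffix list and the function name are hypothetical, and real inflectional morphology must also handle irregular forms and spelling changes.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
# Hypothetical mini-thesaurus keyed by base forms.
THESAURUS = {
    "table": ["desk", "counter", "board"],
    "smart": ["clever", "bright", "quick"],
}

# Naive list of inflectional endings handled by this sketch.
SUFFIXES = ["er", "s"]


def lookup(word):
    """Return synonyms carrying the same inflection as the lookup word."""
    word = word.lower()
    if word in THESAURUS:
        return list(THESAURUS[word])
    for suffix in SUFFIXES:
        stem = word[: len(word) - len(suffix)]
        if word.endswith(suffix) and stem in THESAURUS:
            # Re-apply the ending that was stripped from the lookup word.
            return [synonym + suffix for synonym in THESAURUS[stem]]
    return []


print(lookup("tables"))   # ['desks', 'counters', 'boards']
print(lookup("smarter"))  # ['cleverer', 'brighter', 'quicker']
```
&lt;/PRE&gt;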
&lt;H2 style="MARGIN: 10pt 0in 0pt"&gt;&lt;SPAN class=MsoSubtleEmphasis&gt;&lt;EM&gt;&lt;FONT size=4&gt;&lt;FONT color=#808080&gt;Localized Versions and Language Packs&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/EM&gt;&lt;/SPAN&gt;&lt;/H2&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The current Technical Preview includes proofing tools for English, French and Spanish. If you install the Technical Preview, you will not be able to run proofing tools for any other language. For this release we have made significant changes to the proofing infrastructure, so the Language Packs from previous Office versions, including Office 2007, are not compatible with Office 2010. Proofing tools for other languages will become available with the release of the localized versions or with the subsequent release of the Language Packs for Office 2010.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Stefanie Schiller&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Microsoft Natural Language Group – Program Manager&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9836216" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/proofing+tools/">proofing tools</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/English/">English</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Multi_2D00_Language+Pack/">Multi-Language Pack</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Spanish/">Spanish</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/French/">French</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Office+2010/">Office 2010</category></item><item><title>Add the Microsoft Translator to the Office Research pane</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2009/06/08/add-the-microsoft-translator-to-the-office-research-pane.aspx</link><pubDate>Mon, 08 Jun 2009 23:24:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9709317</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>0</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=9709317</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2009/06/08/add-the-microsoft-translator-to-the-office-research-pane.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; 
mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;FONT face=Calibri&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;The Microsoft Translator from Microsoft Research (MSR) is now available for download from the &lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://go.microsoft.com/?linkid=9660314"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: blue; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;Microsoft Download Center&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;. Another translation service for Microsoft Office? Do you need it? Well, more choices are a good thing, especially in the area of languages and machine translation. After all, you don’t want to inadvertently say something to your grandmother or to your pen pal in a machine-translated letter that you don’t mean to say. &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;o:p&gt;&lt;FONT face=Calibri&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;The Microsoft Translator is free, available in many language pairs with more being added soon, and can be used with Microsoft Office 2003 and Microsoft Office 2007. So you don’t need to worry about having an older version of Microsoft Office. The best feature of this new service is the side-by-side comparison of the original document and the translated document. This allows you to easily compare the translation quality to the original document in a single browser window. &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 1in; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;o:p&gt;&lt;FONT face=Calibri&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;If you find that you prefer one service for one language pair and a different service for another, you can change the preferred service for each pair. In the &lt;B&gt;Research&lt;/B&gt; pane, under the &lt;B&gt;Search for&lt;/B&gt; text box, select &lt;B&gt;Translation&lt;/B&gt;, and then click &lt;B&gt;Translation options&lt;/B&gt;. In the &lt;B&gt;Translation options&lt;/B&gt; dialog box, under the &lt;B&gt;Machine Translation&lt;/B&gt; section, scroll to the right of the relevant language pair and select the service you want from the drop-down menu.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;o:p&gt;&lt;FONT face=Calibri&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;Ultimately, human translation is optimal—so you don’t say something to your grandmother or your pen pal that you need to apologize for later—but machine translation is the next best solution. Having more translation service choices available in the product that you are using to create your letter to your grandmother or your pen pal makes it easier for you to say exactly what you want to say, in the language that you want to say it. You can also quickly understand the gist of the text you have received.&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;You can find more information about this feature in the &lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://office.microsoft.com/en-us/help/HA103514221033.aspx#6"&gt;&lt;SPAN style="FONT-SIZE: 11pt; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri color=#0000ff&gt;“Translate Text” Help asset&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;o:p&gt;&lt;FONT face=Calibri&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;&lt;EM&gt;Daj Oberg&lt;o:p&gt;&lt;/o:p&gt;&lt;/EM&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;FONT face=Calibri&gt;&lt;EM&gt;Office Content Publishing&lt;o:p&gt;&lt;/o:p&gt;&lt;/EM&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-pagination: widow-orphan"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-bidi-language: AR-SA; mso-font-kerning: 0pt; mso-fareast-language: ZH-CN"&gt;&lt;o:p&gt;&lt;FONT face=Calibri&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9709317" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/machine+translation/">machine translation</category></item><item><title>Why is chef-d’œuvre my favorite French word?</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2009/03/12/why-is-chef-d-uvre-my-favorite-french-word.aspx</link><pubDate>Thu, 12 Mar 2009 07:49:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9471326</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>7</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=9471326</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2009/03/12/why-is-chef-d-uvre-my-favorite-french-word.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;In this post, I would like to share the reasons why I love the French word &lt;I&gt;chef-d’œuvre&lt;/I&gt; (= masterpiece). My interest in this word has nothing to do with its meaning. As a program manager working with computational linguists, I find it fascinating because it epitomizes the many difficult decisions one has to make when building natural language processing systems such as word-breakers, tokenizers, and spell-checkers. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;One of the most vexing problems in NLP is deciding whether hyphens and apostrophes are breaking characters. The identification of word boundaries (tokenization) is indeed essential, as I have argued elsewhere in an attempt to show that &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/correcteurorthographiqueoffice/archive/2005/12/07/identifying-tokens-is-word-breaking-so-easy.aspx" mce_href="http://blogs.msdn.com/correcteurorthographiqueoffice/archive/2005/12/07/identifying-tokens-is-word-breaking-so-easy.aspx"&gt;&lt;FONT face=Calibri color=#0000ff size=3&gt;word-breaking is not easy&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. Hyphens frequently separate distinct tokens, as in &lt;I&gt;le match France-Italie&lt;/I&gt; (nobody would argue that &lt;I&gt;France-Italie&lt;/I&gt; is one word and nobody would expect to find the whole string in a dictionary). In &lt;I&gt;chef-d’œuvre&lt;/I&gt;, however, the hyphen is part of the word and everyone expects the whole string to have its own dictionary entry. It should be treated as one token that has little to do with the word &lt;I&gt;chef&lt;/I&gt; (which typically refers to a person), unless one considers the etymology of the word. This means that, in a search scenario, a user who typed the keyword &lt;I&gt;chef-d’œuvre&lt;/I&gt; would not consider a document containing the words &lt;I&gt;chef&lt;/I&gt; and &lt;I&gt;œuvre&lt;/I&gt; used separately to be relevant. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The apostrophe in &lt;I&gt;chef-d’œuvre &lt;/I&gt;is also interesting. Apostrophes are frequently used in French elided forms when a pronoun or a determiner is followed by a word that starts with a vowel (consider &lt;I&gt;l’école&lt;/I&gt; [the school], &lt;I&gt;je l’aime&lt;/I&gt; [I love her], &lt;I&gt;j’arrive&lt;/I&gt; [I’m coming]). In such cases, it is normal to consider that the string is made up of two distinct tokens (&lt;I&gt;l’école -&amp;gt; l’ + école&lt;/I&gt;). The apostrophe in &lt;I&gt;chef-d’œuvre&lt;/I&gt; has a different status: it is an integral part of the word, as it is in a few other French words such as &lt;I&gt;aujourd’hui&lt;/I&gt; and &lt;I&gt;prud’homme&lt;/I&gt;.&lt;/FONT&gt;&lt;/P&gt;
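&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;As a rough illustration of the two behaviours described above, here is a minimal Python sketch. It is not the word-breaker that Office actually uses. It assumes a small hand-made lexicon of forms whose hyphen and apostrophe are word-internal, and the names tokenize, split_chunk, LEXICALIZED and ELISION are hypothetical. Anything that is not in the lexicon is split at hyphens and after elided clitics such as &lt;I&gt;l’&lt;/I&gt; or &lt;I&gt;j’&lt;/I&gt;.&lt;/FONT&gt;&lt;/P&gt;

```python
import re

# Hypothetical mini-lexicon (an assumption for this sketch): forms whose
# hyphen and apostrophe are word-internal and must not trigger a split.
LEXICALIZED = {
    "chef-d’œuvre", "chefs-d’œuvre", "hors-d’œuvre", "main-d’œuvre",
    "aujourd’hui", "prud’homme",
}

# Elided clitics (l’, j’, d’, qu’, ...) that normally form their own token.
ELISION = re.compile(r"^([cdjlmnst]|qu)’(.+)$", re.IGNORECASE)


def split_chunk(chunk):
    """Split one whitespace-delimited chunk into tokens."""
    if chunk.lower() in LEXICALIZED:
        return [chunk]  # lexicon entry: keep the whole string as one token
    match = ELISION.match(chunk)
    if match:
        # Elided form: the clitic keeps its apostrophe, the rest is re-examined.
        return [match.group(1) + "’"] + split_chunk(match.group(2))
    if "-" in chunk:
        tokens = []
        for piece in chunk.split("-"):
            if piece:
                tokens.extend(split_chunk(piece))
        return tokens
    return [chunk]


def tokenize(text):
    """Tokenize a French string; straight apostrophes are normalized first."""
    tokens = []
    for chunk in text.replace("'", "’").split():
        chunk = chunk.strip(".,;:!?()[]")
        if chunk:
            tokens.extend(split_chunk(chunk))
    return tokens


print(tokenize("le match France-Italie"))  # ['le', 'match', 'France', 'Italie']
print(tokenize("un chef-d’œuvre"))         # ['un', 'chef-d’œuvre']
print(tokenize("je l’aime"))               # ['je', 'l’', 'aime']
```

&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;A real word-breaker would need a far larger lexicon and language-specific rules. The sketch only shows that the same character can be a separator in one string and part of the word in another.&lt;/FONT&gt;&lt;/P&gt;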
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The word &lt;I&gt;chef-d’œuvre &lt;/I&gt;is also interesting because it includes a special character, &lt;I&gt;œ,&lt;/I&gt; known as a “ligature” (two or more letters joined together). Many other words in French include ligatures such as &lt;I&gt;œ&lt;/I&gt; or &lt;I&gt;æ &lt;/I&gt;(&lt;I&gt;œuf &lt;/I&gt;[egg],&lt;I&gt; sœur &lt;/I&gt;[sister], &lt;I&gt;cœur &lt;/I&gt;[heart], &lt;I&gt;cæcum&lt;/I&gt; …) and many other languages use characters which are not traditionally found in English (the German ß or the Spanish ñ are cases in point). This reminds us that many NLP projects started with applications developed for English and subsequently required specific changes to take into account the non-ASCII characters found in many other languages. Until very recently, the OpenOffice.org French spell-checker flagged forms with ligatures like &lt;I&gt;sœur&lt;/I&gt; or &lt;I&gt;œuf&lt;/I&gt; as incorrect and accepted only the incorrect spellings with two distinct characters (soeur, oeuf…). With the advent of Unicode, such problems are probably less frequent today, but it is clear that any multilingual project needs to consider idiosyncrasies such as the use of diacritics and other special characters some languages love so much…&lt;/FONT&gt;&lt;/P&gt;
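&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;To make the ligature issue concrete, here is a small Python sketch under stated assumptions: the helper names search_key and is_correct and the folding table are hypothetical and are not taken from any real speller. Unicode normalization alone does not turn &lt;I&gt;œ&lt;/I&gt; into &lt;I&gt;oe&lt;/I&gt;, so the sketch folds ligatures explicitly when building a search key, while the spelling check still compares exact forms.&lt;/FONT&gt;&lt;/P&gt;

```python
import unicodedata

# Explicit folding table: Unicode normalization leaves œ and æ untouched
# because they are encoded as letters, not as compatibility ligatures.
LIGATURE_FOLD = str.maketrans({"œ": "oe", "Œ": "OE", "æ": "ae", "Æ": "AE"})


def search_key(word):
    """Key under which a search index might file a word (hypothetical helper)."""
    composed = unicodedata.normalize("NFC", word)
    return composed.translate(LIGATURE_FOLD).casefold()


def is_correct(word, lexicon):
    """A speller, by contrast, compares exact forms against its lexicon."""
    return unicodedata.normalize("NFC", word) in lexicon


lexicon = {"sœur", "œuf", "cœur"}  # illustrative entries only

print(search_key("Sœur") == search_key("soeur"))  # True: same search key
print(is_correct("sœur", lexicon))                # True
print(is_correct("soeur", lexicon))               # False: flagged by the speller
```

&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Keeping the two functions separate reflects the distinction made above: a search engine may want to treat &lt;I&gt;soeur&lt;/I&gt; and &lt;I&gt;sœur&lt;/I&gt; as the same query, whereas a speller should accept only the form with the ligature.&lt;/FONT&gt;&lt;/P&gt;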
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;From a morphological point of view, the word &lt;I&gt;chef-d’œuvre &lt;/I&gt;is also atypical. While regular nouns typically take a final “s” in the plural in French (singular &lt;I&gt;maison&lt;/I&gt; [house] -&amp;gt; plural &lt;I&gt;maisons&lt;/I&gt;), the form *&lt;I&gt;chef-d’œuvres &lt;/I&gt;is&lt;I&gt; &lt;/I&gt;incorrect and should be flagged by a spell-checker, even if &lt;I&gt;œuvres &lt;/I&gt;is correct on its own in other contexts. Rather, the plural is formed by adding an “s” at the end of the subtoken &lt;I&gt;chef&lt;/I&gt;: des &lt;I&gt;chefs-d’œuvre. &lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Finally, if you consider how the word is pronounced, it is clear that &lt;I&gt;chef-d’œuvre &lt;/I&gt;poses a number of challenges: why is it that the “f” is pronounced in &lt;I&gt;chef&lt;/I&gt;&lt;/FONT&gt;&lt;SPAN class=resultbody1&gt;&lt;SPAN style="mso-bidi-font-family: Tahoma"&gt;&lt;FONT face="MS Reference Sans Serif" color=#333333&gt; &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri&gt;[&lt;/FONT&gt;&lt;/FONT&gt;&lt;A href="http://fr.encarta.msn.com/encnet/features/dictionary/PronounceF.aspx?search=chef" mce_href="http://fr.encarta.msn.com/encnet/features/dictionary/PronounceF.aspx?search=chef"&gt;&lt;SPAN style="COLOR: windowtext; TEXT-DECORATION: none; text-underline: none"&gt;&lt;FONT face=Calibri size=3&gt;ʃɛf&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; ], but not in &lt;I&gt;chef-d’œuvre &lt;/I&gt;[ &lt;/FONT&gt;&lt;A href="http://fr.encarta.msn.com/encnet/features/dictionary/PronounceF.aspx?search=chef-d%e2%80%99%c5%93uvre" mce_href="http://fr.encarta.msn.com/encnet/features/dictionary/PronounceF.aspx?search=chef-d%e2%80%99%c5%93uvre"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="COLOR: windowtext; TEXT-DECORATION: none; text-underline: none"&gt;&lt;FONT size=3&gt;ʃɛdœv&lt;/FONT&gt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: windowtext; LINE-HEIGHT: 115%; TEXT-DECORATION: none; text-underline: none"&gt;R&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; ]? Interesting problem for my colleagues who create text-to-speech systems…&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;I&gt;Chef-d’œuvre &lt;/I&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;is certainly not the only complex word which encapsulates so many difficulties for those of us who create NLP applications. I could probably also have chosen &lt;I&gt;le&lt;/I&gt; &lt;I&gt;hors-d’œuvre &lt;/I&gt;[appetizer]&lt;I&gt;, &lt;/I&gt;which begins with an aspirated ‘h’ and does not admit elision, unlike &lt;I&gt;homme &lt;/I&gt;[man] -&amp;gt; &lt;I&gt;l’homme&lt;/I&gt;. &lt;I&gt;Main-d’œuvre&lt;/I&gt; [manpower] would probably have been a nice candidate, too. In fact, there is no dearth of thorny issues linguists need to deal with. Well, languages are difficult, aren’t they? That’s probably what makes my job here so challenging and so interesting …&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt" mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt; LINE-HEIGHT: normal"&gt;&lt;I&gt;&lt;SPAN lang=FR style="FONT-SIZE: 12pt; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-ansi-language: FR"&gt;&lt;FONT face=Calibri&gt;-- Thierry Fontenelle (Program Manager)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9471326" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/French/">French</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/ligature/">ligature</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/word_2D00_breaking/">word-breaking</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/tokenization/">tokenization</category></item><item><title>Natural Language Group blog featured in Language Tech News</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2009/03/03/natural-language-group-blog-featured-in-language-tech-news.aspx</link><pubDate>Tue, 03 Mar 2009 06:59:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9455803</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>1</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=9455803</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2009/03/03/natural-language-group-blog-featured-in-language-tech-news.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The post I wrote a few months ago about &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2008/09/11/can-i-remove-a-word-from-office-s-speller-dictionary.aspx"&gt;&lt;FONT face=Calibri color=#0000ff size=3&gt;how users can remove a word from the main dictionary&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; of their Office 
speller has been reprinted in the latest edition of &lt;/FONT&gt;&lt;A href="http://www.ata-divisions.org/LTD/documents/newsletter/2008-4_LTDnewsletter.pdf"&gt;&lt;I&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri color=#0000ff&gt;Language Tech News&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/A&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt; (&lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://www.ata-divisions.org/LTD/documents/newsletter/2008-4_LTDnewsletter.pdf"&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri color=#0000ff&gt;vol.2, No.4, February 2009&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;), a publication of the &lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://www.ata-divisions.org/LTD/"&gt;&lt;FONT color=#0000ff&gt;&lt;FONT face=Calibri&gt;&lt;I&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;Language Technology Division&lt;/SPAN&gt;&lt;/I&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt; of the 
&lt;I&gt;American Translators Association&lt;/I&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt; (ATA). Besides some technical details about how to use these “exclude dictionaries”, which the editor of the Newsletter felt would interest his readers, this post was a nice opportunity to show how the new contextual spellers in Office 2007 reduce the need that some users feel to exclude certain words from their speller dictionary.&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;This issue of &lt;I&gt;Language Tech News&lt;/I&gt; also includes interesting articles on Trados’s translation memory technology, terminology management tools, font converters, and statistical machine translation (SMT). It is interesting to read the following comment about the &lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://www.windowslivetranslator.com/"&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri color=#0000ff&gt;Windows Live Translator&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;, the MT system created by our Microsoft Research colleagues: «&amp;nbsp;Perhaps the most successful MT application in the world today, the Microsoft Knowledge Base, used by hundreds of millions of users across the globe, is mostly a SMT-based effort&amp;nbsp;» (p.17).&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Thierry Fontenelle (Program Manager)&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt" mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9455803" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/exclude+dictionary/">exclude dictionary</category></item><item><title>Hotfix for German hyphenation</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2009/01/22/hotfix-f-r-die-deutsche-silbentrennung.aspx</link><pubDate>Thu, 22 Jan 2009 19:56:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9369448</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>4</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=9369448</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2009/01/22/hotfix-f-r-die-deutsche-silbentrennung.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN lang=DE style="mso-ansi-language: DE"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Microsoft has released a hotfix for German hyphenation in Office 2007. This hotfix corrects a number of incorrect hyphenations involving the separation of prefixes, such as &lt;I&gt;be-fasst, ein-lässt, auf-läuft, um-bringt&lt;/I&gt;, the separation of suffixes, such as &lt;I&gt;Kennt-nis&lt;/I&gt;, and the separation of compound words, such as &lt;I&gt;Haus-tür, Glas-tisch&lt;/I&gt;. The hotfix affects only the German (Germany) language setting, where the incorrect hyphenations occurred. 
German (Austria) and German (Switzerland) are not affected.&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN lang=DE style="mso-ansi-language: DE"&gt;&lt;FONT face=Calibri size=3&gt;The KB article at &lt;A href="http://support.microsoft.com/kb/960500/en-us" mce_href="http://support.microsoft.com/kb/960500/en-us"&gt;http://support.microsoft.com/kb/960500/en-us&lt;/A&gt;&amp;nbsp;&lt;/FONT&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;provides the hotfix. See the KB article for more information about the installation requirements. The hotfix will also be made available in one of the upcoming service packs.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN lang=DE style="mso-ansi-language: DE"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;EM&gt;Stefanie Schiller - Program Manager&lt;/EM&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9369448" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/German/">German</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/hyphenator/">hyphenator</category></item><item><title>A context-sensitive speller for Spanish in Office 2007</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2009/01/15/a-context-sensitive-speller-for-spanish-in-office-2007.aspx</link><pubDate>Thu, 15 Jan 2009 07:25:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9320046</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>2</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=9320046</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2009/01/15/a-context-sensitive-speller-for-spanish-in-office-2007.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&lt;/o:p&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;A few weeks ago, Microsoft announced an &lt;/FONT&gt;&lt;A href="http://www.microsoft.com/Presspass/press/2008/dec08/12-05OfficeHolidayPR.mspx"&gt;&lt;FONT face=Calibri size=3&gt;initiative targeting the Hispanic community&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;, with special offers for Microsoft Office 2007 and &lt;/FONT&gt;&lt;A href="http://www.microsoft.com/office/offers/hispano/default.aspx"&gt;&lt;FONT face=Calibri size=3&gt;Microsoft Office 2007 Language Pack in Spanish&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;It may be worth pointing out that the Spanish proofing tools in Office 2007 include a brand-new context-sensitive speller in addition to the regular spell-checker, thesaurus, hyphenator and grammar checker. We have discussed the &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/tags/contextual+speller/default.aspx"&gt;&lt;FONT face=Calibri size=3&gt;English context-sensitive speller&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; on multiple occasions on this blog and regular readers know that blue squiggly lines now appear under contextual mistakes (real-word errors which traditional spellers cannot flag) such as:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;Nobody knows &lt;B style="mso-bidi-font-weight: normal"&gt;&lt;U&gt;weather&lt;/U&gt;&lt;/B&gt; he is innocent or guilty. (&lt;/FONT&gt;&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: Wingdings; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-char-type: symbol; mso-symbol-font-family: Wingdings"&gt;&lt;SPAN style="mso-char-type: symbol; mso-symbol-font-family: Wingdings"&gt;→&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt; whether)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;President Bush addresses &lt;B style="mso-bidi-font-weight: normal"&gt;&lt;U&gt;Untied&lt;/U&gt;&lt;/B&gt; Nations General Assembly. &lt;/SPAN&gt;&lt;SPAN style="mso-bidi-language: AR-SA"&gt;(&lt;/SPAN&gt;&lt;/FONT&gt;&lt;SPAN style="FONT-FAMILY: Wingdings; mso-bidi-language: AR-SA; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-char-type: symbol; mso-symbol-font-family: Wingdings"&gt;&lt;SPAN style="mso-char-type: symbol; mso-symbol-font-family: Wingdings"&gt;→&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bidi-language: AR-SA"&gt; United)&lt;/SPAN&gt;&lt;SPAN style="mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;People say your &lt;B style="mso-bidi-font-weight: normal"&gt;&lt;U&gt;hole&lt;/U&gt;&lt;/B&gt; life flashes before your eyes before you die. (&lt;/FONT&gt;&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: Wingdings; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-char-type: symbol; mso-symbol-font-family: Wingdings"&gt;&lt;SPAN style="mso-char-type: symbol; mso-symbol-font-family: Wingdings"&gt;→&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt; whole)&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;Life insurance plays a very essential part in our &lt;B style="mso-bidi-font-weight: normal"&gt;&lt;U&gt;every day&lt;/U&gt;&lt;/B&gt; life. (&lt;/FONT&gt;&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: Wingdings; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-char-type: symbol; mso-symbol-font-family: Wingdings"&gt;&lt;SPAN style="mso-char-type: symbol; mso-symbol-font-family: Wingdings"&gt;→&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt; everyday)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;If you teach Spanish, if you are a translator, or if you write documents in Spanish regularly, you will most certainly understand how difficult it can be to spot the erroneous use of the word &lt;I&gt;echo&lt;/I&gt; in a context where &lt;I&gt;hecho&lt;/I&gt; should be used. The blue squiggles under contextual mistakes testify to the progress made in this area in Office 2007: if you right-click the word &lt;B&gt;echo&lt;/B&gt; in the sentence “&lt;I&gt;Lo ha &lt;B&gt;&lt;U&gt;echo&lt;/U&gt;&lt;/B&gt; muy bien&lt;/I&gt;”, you will see that the speller correctly suggests &lt;B&gt;hecho&lt;/B&gt;, as shown below.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;IMG src="http://blogs.msdn.com/photos/naturallanguage/images/8567277/original.aspx"&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Missing accents (as in &lt;B&gt;medico&lt;/B&gt;, which should be &lt;B&gt;médico&lt;/B&gt; in the sentence “&lt;I&gt;Fuimos al &lt;B&gt;&lt;U&gt;medico&lt;/U&gt;&lt;/B&gt; ayer&lt;/I&gt;”) are also very frequent mistakes, and this new tool should help you catch them before you publish your document or send your email.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;-- Thierry Fontenelle (Program Manager)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9320046" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/contextual+speller/">contextual speller</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Spanish/">Spanish</category></item><item><title>An academic evaluation of the Office 2007 contextual spelling checker</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2009/01/08/an-academic-evaluation-of-the-office-2007-contextual-spelling-checker.aspx</link><pubDate>Thu, 08 Jan 2009 08:00:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9293846</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>8</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=9293846</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2009/01/08/an-academic-evaluation-of-the-office-2007-contextual-spelling-checker.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;FONT face=Calibri size=3&gt;A few days ago, I discovered an analysis of our Office 2007 contextual speller carried out by Prof. 
Graeme Hirst, from the University of Toronto: &amp;nbsp;&lt;/FONT&gt;&lt;A href="http://ftp.cs.toronto.edu/pub/gh/Hirst-2008-Word.pdf"&gt;&lt;FONT face=Calibri color=#0000ff size=3&gt;An Evaluation of the Contextual Spelling Checker of Microsoft Office Word 2007&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;We have discussed this new &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/tags/contextual+speller/default.aspx"&gt;&lt;FONT face=Calibri color=#0000ff size=3&gt;context-sensitive speller&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; on several occasions on this blog (as well as &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/correcteurorthographiqueoffice/archive/2006/06/05/contextual-spelling-in-the-2007-microsoft-office-system.aspx"&gt;&lt;FONT face=Calibri color=#0000ff size=3&gt;here&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;) and it is nice to see that it is attracting the attention of researchers in the academic world. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;It’s an interesting paper that provides some food for thought, especially with respect to how “aggressive” we should be in our approach to recall.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;His conclusion nicely sums up our trade-offs and dilemmas (emphasis mine):&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; LINE-HEIGHT: normal"&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="FONT-FAMILY: Times-Roman"&gt;In an evaluation on 1400 examples, &lt;B&gt;&lt;U&gt;it is found to have high precision but low recall&lt;/U&gt; &lt;/B&gt;— that is, it fails to find most errors, but when it does flag a possible error, it is almost always correct.&lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; LINE-HEIGHT: normal"&gt;&lt;I&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; LINE-HEIGHT: normal"&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;The contextual spelling corrector in Microsoft Office Word 2007 is &lt;B&gt;&lt;U&gt;a cautious (low recall) but believable (high precision) system&lt;/U&gt;&lt;/B&gt;. However, its overall performance, as measured by F, is much poorer than that of the trigram method of Mays et al (1991).&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; LINE-HEIGHT: normal"&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;The trade-off between the two systems is a difficult one. In simple terms, better performance is better; but believability is an important attribute for a consumer-level system (“if Word says it’s wrong then it’s wrong”) and could well be considered worth sacrificing performance for.&amp;nbsp; The problem with this, however, is that as users become familiar with the system, their expectations will rise and believability will start to apply also to what Word fails to flag (“If Word says it’s right then it’s right”).&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.5in"&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;A system that is more visibly error-prone might actually serve users better.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The methodology used by Prof. Hirst and his colleagues to evaluate the system deserves a few comments:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraph style="MARGIN: 0in 0in 10pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT size=3&gt;·&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN dir=ltr&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;They automatically induced real-word errors by replacing words with spelling variations found in the lexicon of the &lt;I&gt;ispell&lt;/I&gt; spelling checker, limiting each change to an edit distance of 1. These errors are therefore artificial rather than natural mistakes.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraph style="MARGIN: 0in 0in 10pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT size=3&gt;·&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN dir=ltr&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;They did not consider “malapropisms” (real-word mistakes) involving closed-class words, or words formed by the insertion or deletion of an apostrophe or by splitting a word. This means they exclude pairs which we have found to be extremely frequent in real texts (&lt;I&gt;then/than; your/you’re; its/it’s; everyday/every day; to/too; their/there/they’re&lt;/I&gt;…). These pairs feature prominently in any analysis of real mistakes, especially in the literature devoted to English as a Second Language. Many native speakers of English also have a lot of difficulty mastering these confusables, which is why we decided to specifically target them.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraph style="MARGIN: 0in 0in 10pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT size=3&gt;·&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN dir=ltr&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;They did not include phonetic confusables such as &lt;I&gt;cymbal/symbol,&lt;/I&gt; &lt;I&gt;principle/principal, pear/pair, there/their&lt;/I&gt;, which have an edit distance &amp;gt; 1. &lt;/FONT&gt;&lt;/P&gt;
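&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;To make the edit-distance criterion concrete, here is a minimal Python sketch. It assumes plain Levenshtein distance (insertions, deletions and substitutions each count as one edit), which may differ from the exact measure used in the paper. The function names and the tiny lexicon are invented for illustration and are not taken from the paper or from ispell.&lt;/FONT&gt;&lt;/P&gt;

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def real_word_variants(word, lexicon):
    """Lexicon entries that are exactly one edit away from word."""
    return sorted(w for w in lexicon if edit_distance(word, w) == 1)


# Hypothetical mini-lexicon standing in for a real spelling lexicon.
lexicon = {"money", "monkey", "honey", "then", "than", "thin"}
print(real_word_variants("money", lexicon))

# Pairs discussed in this post.
pairs = [("then", "than"), ("to", "too"), ("its", "it's"),
         ("everyday", "every day"), ("cymbal", "symbol"),
         ("principle", "principal"), ("pear", "pair"), ("there", "their")]
for left, right in pairs:
    print(left, "/", right, "->", edit_distance(left, right))
```

&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Under these assumptions, the four phonetic pairs all come out at a distance of 2, so an evaluation limited to a single edit never generates them, whereas a substitution like &lt;I&gt;monkey&lt;/I&gt; for &lt;I&gt;money&lt;/I&gt; is exactly the kind of error it produces.&lt;/FONT&gt;&lt;/P&gt;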
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;The categories they did not include in their tests are precisely those we focused on, because flagging these real and frequent mistakes is very useful for users of Office and Word. Assessing the “performance” of a system while ignoring them may therefore be a bit unfair, at least if one equates “performance” with “usefulness”. Will users find the system more useful if we flag “have not lost &lt;U&gt;monkey&lt;/U&gt;” (→ money), a rare and unnatural mistake, or if we flag “it is &lt;U&gt;to&lt;/U&gt; expensive”, a mistake which our data shows is very frequent and which we seem to be good at flagging? Recall would be a lot higher if pairs involving closed-class words and the standard phonetic confusables above were taken into account: our own metrics, based on a large corpus of real mistakes, show that our recall is around 40%, higher than the 20-25% found by Hirst. The alternative methods which he proposes have even higher recall (50%), but their precision (50%) is much lower than our system’s (96%). Hirst clearly favors recall. His question is: do people want to use a system like Microsoft’s, which only spots one mistake out of 5 (our metrics show it’s in fact closer to 2 out of 5, i.e. 40%) but is right nearly all the time? Our question is: would users really want a system based on the trigram method advocated by Prof. Hirst, which flags 50% of the mistakes but is wrong in 50% of the cases? The feedback we generally get indicates that our users tend to prefer unobtrusive tools and switch off a tool which they consider unreliable.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
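&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;For readers who want to see how these precision and recall figures combine into the F measure mentioned in the quotation, here is a small Python sketch. It assumes the balanced F measure (F1, the harmonic mean of precision and recall); the paper may weight the two differently, and the numbers are simply the rounded figures quoted in this post, not new measurements.&lt;/FONT&gt;&lt;/P&gt;

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta is 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    beta_squared = beta * beta
    return ((1 + beta_squared) * precision * recall
            / (beta_squared * precision + recall))


# (precision, recall) pairs, using the rounded figures quoted in this post.
systems = {
    "Word 2007, recall as found by Hirst (about 20%)": (0.96, 0.20),
    "Word 2007, recall on our own corpus (about 40%)": (0.96, 0.40),
    "Trigram method": (0.50, 0.50),
}

for name, (precision, recall) in systems.items():
    print(f"{name}: F1 = {f_measure(precision, recall):.2f}")
```

&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;With these inputs the sketch prints roughly 0.33, 0.56 and 0.50, which shows how much the comparison depends on which recall figure is used, and why a single F score cannot settle the question of usefulness.&lt;/FONT&gt;&lt;/P&gt;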
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Interesting debate, isn’t it? I am really grateful to Prof. Hirst for making this discussion possible.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;So, what do &lt;B&gt;&lt;U&gt;you&lt;/U&gt;&lt;/B&gt; think? We are interested in hearing your opinion. Do you prefer a tool which casts the net as wide as possible and catches many mistakes, at the risk of being frequently wrong and of creating many false flags (false positives), or do you prefer a tool which does not catch all possible mistakes, but which you can trust when it does catch one? Do not hesitate to leave your comments below… &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Thierry Fontenelle – Program Manager&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9293846" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/contextual+speller/">contextual speller</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/proofing+tools/">proofing tools</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/English/">English</category></item><item><title>English Grammar Checker, Fragments, and Settings</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2008/10/22/english-grammar-checker-fragments-and-settings.aspx</link><pubDate>Thu, 23 Oct 2008 00:54:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9011791</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>5</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=9011791</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2008/10/22/english-grammar-checker-fragments-and-settings.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;A French translator was asking an interesting question the other day on the &lt;/FONT&gt;&lt;A href="http://www.microsoft.com/communities/newsgroups/en-us/default.aspx?&amp;amp;lang=fr&amp;amp;cr=FR&amp;amp;guid=&amp;amp;sloc=en-us&amp;amp;dg=microsoft.public.fr.word&amp;amp;p=1&amp;amp;tid=18f01f76-5e64-4ae9-9f91-8abeb5a085fa&amp;amp;mid=18f01f76-5e64-4ae9-9f91-8abeb5a085fa"&gt;&lt;FONT face=Calibri size=3&gt;Word community newsgroup&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri 
size=3&gt;. He wanted to know how he could switch off the grammar rule which flags “Fragments”, i.e. incomplete sentences that the writer is invited to revise. The user did not want to turn off the English spell-checker, which he found very useful, or even the grammar checker as a whole, but only that particular rule, which he found annoying when translating texts into English. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;As can be seen below, the English grammar checker flags sentences that are considered incomplete (consider the fragment “And oranges.”, which is green-squiggled, or a verb-less sentence like “He happy.”). If you right-click on the squiggled string, you will see that the grammar checker advises you to consider revising this fragment. &lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;IMG src="http://blogs.msdn.com/photos/naturallanguage/images/9005167/original.aspx"&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The grammar checker is aware of the context in which the fragment has been used. A fragment such as “Flight to Paris” will not be flagged if it is not followed by a period (for instance in a title). The grammar checker will not say anything either if the same fragment is used in a bulleted or numbered list, as in the example below. In contexts other than titles, headings, or lists, however, it may seem reasonable to draw the user’s attention to a suspicious fragment. The user is of course free to ignore these flags.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Let us come back to our translator’s original question. It is indeed possible to customize the grammar checker and to prevent this rule from firing if you do not find it useful. To do so, open &lt;B&gt;Word Options&lt;/B&gt; (via the Office button in the top-left corner). Click on &lt;B&gt;Proofing&lt;/B&gt;, then on &lt;B&gt;Settings&lt;/B&gt; in the section “&lt;B&gt;When correcting spelling and grammar in Word&lt;/B&gt;”. You will then see the list of grammar rules used by the English grammar checker, as displayed in the screenshot below. You just need to uncheck the 2&lt;SUP&gt;nd&lt;/SUP&gt; rule (&lt;B&gt;Fragments and Run-ons&lt;/B&gt;), between the “&lt;B&gt;Capitalization&lt;/B&gt;” rule and the “&lt;B&gt;Misused words&lt;/B&gt;” rule. Done! The green squiggle will no longer appear under structures like those shown in the examples above.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="mso-no-proof: yes"&gt;&lt;/SPAN&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;IMG src="http://blogs.msdn.com/photos/naturallanguage/images/9011779/original.aspx"&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;If you are writing a document in multiple languages, the cursor needs to be placed in a text whose language has been set to English. The English grammar checker will only work if the language of the text is set to English. If the text is in French, the rules of the French grammar checker will obviously be very different (the same applies to any other language), and you can customize them using the method described above. The settings of the French grammar checker will then be displayed as follows:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="mso-no-proof: yes"&gt;&lt;/SPAN&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;IMG src="http://blogs.msdn.com/photos/naturallanguage/images/9011782/original.aspx"&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;As you can see, users have some freedom to decide how they want to use their grammar checker. Personally, I have unchecked the rule “&lt;STRONG&gt;Style – Contractions&lt;/STRONG&gt;” of my English grammar checker because I constantly use contracted forms like “I’ll” and “You’ll” in my interactions with my colleagues. I know, however, that in many businesses users who write very official documents want to spot these contracted forms and replace them with more formal forms like “I will” and “you will”. That is why this range of settings is offered to the users of these complex tools.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Do not hesitate to send your feedback or your comments.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;-- Thierry Fontenelle (Program Manager)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9011791" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/proofing+tools/">proofing tools</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/grammar+checker/">grammar checker</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/English/">English</category></item><item><title>Can I remove a word from Office’s speller dictionary?</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2008/09/11/can-i-remove-a-word-from-office-s-speller-dictionary.aspx</link><pubDate>Thu, 11 Sep 2008 08:50:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8943406</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>33</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=8943406</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2008/09/11/can-i-remove-a-word-from-office-s-speller-dictionary.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The other day, I was discussing a number of &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2008/09/04/suggested-improvements-to-microsoft-word-s-spell-checker.aspx" mce_href="http://blogs.msdn.com/naturallanguage/archive/2008/09/04/suggested-improvements-to-microsoft-word-s-spell-checker.aspx"&gt;&lt;FONT face=Calibri size=3&gt;suggestions to improve Office’s spell-checker&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. 
A customer was suggesting we should allow users to delete individual items from Word’s spell-checker lexicon. This feature is already available, in fact: if you want to specify a preferred spelling for a word and to exclude a given spelling from the main lexicon used by the Office speller, you need to use an “&lt;B&gt;exclusion dictionary&lt;/B&gt;”. Your speller comes with an empty exclusion dictionary and you can add words to it if you want them to be permanently red-squiggled. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;You first need to locate your exclusion dictionary, which, if you use Vista and Office 2007, can be found in the following folder:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;C:\Users\&lt;I&gt;User Name&lt;/I&gt;\AppData\Roaming\Microsoft\UProof\&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Each language has a specific dictionary whose name starts with “ExcludeDictionary”, followed by the language code (EN for English, FR for French, SP for Spanish, GE for German…), followed by the LCID (locale identifier, written in hexadecimal). The extension is .lex. For instance:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT face=Calibri size=3&gt;English (US): &lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;ExcludeDictionaryEN0409.lex&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT face=Calibri size=3&gt;English (UK): &lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;ExcludeDictionaryEN0809.lex&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT face=Calibri size=3&gt;English (Australia):&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ExcludeDictionaryEN0c09.lex&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT face=Calibri size=3&gt;English (Canada):&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ExcludeDictionaryEN1009.lex&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT face=Calibri size=3&gt;French: &lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;ExcludeDictionaryFR040c.lex&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT face=Calibri size=3&gt;You can open the file with Notepad or WordPad and add a word which you want the speller to flag as misspelled. Save and close the file. You are done!&lt;/FONT&gt;&lt;/P&gt;
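&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT face=Calibri size=3&gt;The naming scheme and the manual edit described above can be sketched in a few lines of Python. This is an illustration only: the function names are invented, only the five file names listed above are covered, and both the text encoding (UTF-16 here) and the one-word-per-line layout are assumptions that you should check against your own file before writing to it.&lt;/FONT&gt;&lt;/P&gt;

```python
from pathlib import Path

# Language code and LCID for each dictionary, exactly as listed in this post.
EXCLUDE_DICTIONARY_CODES = {
    "English (US)": ("EN", "0409"),
    "English (UK)": ("EN", "0809"),
    "English (Australia)": ("EN", "0c09"),
    "English (Canada)": ("EN", "1009"),
    "French": ("FR", "040c"),
}


def exclude_dictionary_name(language):
    """Build the file name: prefix, language code, LCID, then the .lex extension."""
    code, lcid = EXCLUDE_DICTIONARY_CODES[language]
    return "ExcludeDictionary" + code + lcid + ".lex"


def add_excluded_word(folder, language, word, encoding="utf-16"):
    """Append word to the exclusion dictionary in folder unless already listed.

    Assumptions (not stated in the post): one word per line, UTF-16 text.
    """
    path = Path(folder) / exclude_dictionary_name(language)
    words = []
    if path.exists():
        words = path.read_text(encoding=encoding).splitlines()
    if word not in words:
        words.append(word)
    path.write_text("\n".join(words) + "\n", encoding=encoding)
    return path


print(exclude_dictionary_name("English (US)"))
```

&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;FONT face=Calibri size=3&gt;Calling add_excluded_word(folder, "English (US)", "manger"), where folder is the UProof folder given above, would then have the same effect as adding the word by hand in Notepad. Making a backup copy of the file first is a sensible precaution.&lt;/FONT&gt;&lt;/P&gt;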
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;You can type “exclude dictionary” or “exclusion dictionary” in the Office help to get more information about this feature.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Of course, caution should be exercised when you decide to remove a word from your Office speller. If you decide to remove the word &lt;I&gt;manger&lt;/I&gt; because you frequently type &lt;I&gt;program manger&lt;/I&gt; instead of &lt;I&gt;program manager, &lt;/I&gt;you should not be surprised when your speller flags &lt;I&gt;manger&lt;/I&gt; in a sentence like “&lt;I&gt;Jesus was laid in a manger&lt;/I&gt;”. This is why we have introduced a &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/tags/contextual+speller/default.aspx" mce_href="http://blogs.msdn.com/naturallanguage/archive/tags/contextual+speller/default.aspx"&gt;&lt;FONT face=Calibri size=3&gt;contextual speller&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;, which tries to identify words which exist but are misspelled in a given context (see the post I was referring to, in which I showed &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2008/09/04/suggested-improvements-to-microsoft-word-s-spell-checker.aspx" mce_href="http://blogs.msdn.com/naturallanguage/archive/2008/09/04/suggested-improvements-to-microsoft-word-s-spell-checker.aspx"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;how Office 2007 flags some erroneous uses of &lt;I&gt;manger&lt;/I&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; in &lt;I&gt;program manger&lt;/I&gt;). &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;To give another example where contextual spelling might be preferable to exclusion, consider the user who contacted the Word newsgroup to find out how to exclude the word “ahs” from the main speller lexicon. This user kept typing &lt;I&gt;ahs&lt;/I&gt; instead of &lt;I&gt;has&lt;/I&gt;. However, the new context-sensitive speller in Office 2007 already flags a number of contexts where "ahs" should not be used, which should address this user's problem without removing the word from the lexicon altogether. You will see a blue squiggly line under "ahs" if you write something like "He &lt;B&gt;&lt;U style="text-underline: wavy-heavy"&gt;ahs&lt;/U&gt;&lt;/B&gt; never done it before", for instance. But you will not get any flag under "ahs" if you write "we definitely got oohs and &lt;B&gt;ahs&lt;/B&gt; all around when we launched this product". &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;EM&gt;Thierry Fontenelle – Program Manager&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8943406" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Spelling/">Spelling</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/contextual+speller/">contextual speller</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/exclude+dictionary/">exclude dictionary</category></item><item><title>Suggested improvements to Microsoft Word’s spell-checker</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2008/09/04/suggested-improvements-to-microsoft-word-s-spell-checker.aspx</link><pubDate>Thu, 04 Sep 2008 07:30:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8923553</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>7</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=8923553</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2008/09/04/suggested-improvements-to-microsoft-word-s-spell-checker.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;A few days ago, James wrote about the &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2008/08/28/natural-language-group-in-the-local-news.aspx"&gt;&lt;FONT face=Calibri size=3&gt;articles the Seattle Times published about our Natural Language Group and the Office spell-checkers&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. 
One of these articles was encouraging the &lt;I&gt;Seattle Times&lt;/I&gt; readers to suggest improvements (&lt;/FONT&gt;&lt;A href="http://blog.seattletimes.nwsource.com/techtracks/2008/08/28/what_words_would_you_add_to_the_microsoft_spell_ch.html"&gt;&lt;FONT face=Calibri size=3&gt;What words would you add to the Microsoft spell-checker?&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;).&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;It is very interesting to read what our users consider pain points.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;The following suggestions from one of the readers (RogerKni) are interesting:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="BACKGROUND: white; MARGIN: 0in 0in 11.25pt 0.5in; LINE-HEIGHT: 14.25pt"&gt;&lt;SPAN style="FONT-SIZE: 8.5pt; FONT-FAMILY: 'Arial','sans-serif'; mso-fareast-font-family: 'Times New Roman'"&gt;Here are my four suggested improvements to Word's spell-checker:&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="BACKGROUND: white; MARGIN: 0in 0in 11.25pt 0.5in; LINE-HEIGHT: 14.25pt"&gt;&lt;SPAN style="FONT-SIZE: 8.5pt; FONT-FAMILY: 'Arial','sans-serif'; mso-fareast-font-family: 'Times New Roman'"&gt;1. Give the user the option to flag rare words with an orange wavy line. An example would be “manger,” which 99% of the time is a misspelling of “manager.” Or “fro”: usually a misspelling of “for.” Or “whey” for “why.” Allow users to delete or add individual items to the Word-provided list of rarities. &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="BACKGROUND: white; MARGIN: 0in 0in 11.25pt 0.5in; LINE-HEIGHT: 14.25pt"&gt;&lt;SPAN style="FONT-SIZE: 8.5pt; FONT-FAMILY: 'Arial','sans-serif'; mso-fareast-font-family: 'Times New Roman'"&gt;2. Automatically add the apostrophized version of any noun added to a dictionary. &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="BACKGROUND: white; MARGIN: 0in 0in 11.25pt 0.5in; LINE-HEIGHT: 14.25pt"&gt;&lt;SPAN style="FONT-SIZE: 8.5pt; FONT-FAMILY: 'Arial','sans-serif'; mso-fareast-font-family: 'Times New Roman'"&gt;3. Add a “picky-mode” option, which one could turn on to get the preferred spelling of certain words. Currently, if there are two options for spelling a word, Word flags neither. That’s usually what’s wanted—the user doesn’t want to be harassed about a minor issue. But sometimes, as when publishing a book, the user wants to pick the best spelling. (Second-best spellings could be flagged with a wavy purple line.)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="BACKGROUND: white; MARGIN: 0in 0in 11.25pt 0.5in; LINE-HEIGHT: 14.25pt"&gt;&lt;SPAN style="FONT-SIZE: 8.5pt; FONT-FAMILY: 'Arial','sans-serif'; mso-fareast-font-family: 'Times New Roman'"&gt;4. Give the user the option to tell Word to aggressively correct misspellings with what it thinks is the Best Match. Some would prefer this to Word’s cautious policy of merely flagging such words in red and making the user choose the correction.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;In past releases, we looked into solving the problem described in Suggestion (2) for our English users. The solution presupposes that we can identify (singular) nouns. We could ask users to do this for us, opting into the &lt;I&gt;’s&lt;/I&gt; form for words like &lt;I&gt;Palin&lt;/I&gt; (&lt;I&gt;Palin’s&lt;/I&gt;), and not for the plural noun &lt;I&gt;subproblems&lt;/I&gt; (*&lt;I&gt;subproblems’s&lt;/I&gt;), the adjective &lt;I&gt;semicontinuous&lt;/I&gt;, or &lt;I&gt;dermatoglyphics&lt;/I&gt; (*&lt;I&gt;dermatoglyphics’s&lt;/I&gt; – see my colleague Mari Olsen’s &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2007/03/22/you-say-arkansas-i-say-arkansas-s-let-s-write-it-into-the-law.aspx"&gt;&lt;FONT face=Calibri size=3&gt;post on possessives and apostrophes&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;). Of course, we could also make guesses based on what we know about the words already in our lexicon (&lt;I&gt;continuous&lt;/I&gt; is an adjective; &lt;I&gt;problems&lt;/I&gt; is a plural noun), but the computational cost would have to be weighed against the user benefit.&lt;SPAN style="COLOR: #1f497d"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Suggestion (1) is in fact already covered in a large number of cases in Office 2007 (albeit not with an orange wavy line: we used a blue wavy line to signal &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/tags/contextual+speller/default.aspx"&gt;&lt;FONT face=Calibri size=3&gt;contextual mistakes, which have been mentioned on several occasions on this blog&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;). Look at the following screenshots, which show that the contextual speller in Office 2007 is able to flag words like “manger” or “fro” used in the wrong contexts:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;SPAN minmax_bound="true"&gt;&lt;A href="http://blogs.msdn.com/photos/naturallanguage/images/8923544/original.aspx" minmax_bound="true"&gt;&lt;IMG id=ctl00___ctl00___ctl00_ctl00_bcr_PictureDetails1___detailsImage_SmallThumb8923544 height=67 alt=mangerCSS src="http://blogs.msdn.com/photos/naturallanguage/images/8923544/425x67.aspx" width=425 border=0 minmax_bound="true"&gt;&lt;/A&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;SPAN minmax_bound="true"&gt;&lt;/SPAN&gt;Note, too, that users have the possibility of deleting words from the Office speller lexicon. We will come back to this issue in a future post (if you are impatient, type “&lt;B&gt;exclude&lt;/B&gt; &lt;B&gt;dictionary&lt;/B&gt;” or “&lt;B&gt;exclusion&lt;/B&gt; &lt;B&gt;dictionary&lt;/B&gt;” in the Help file). You may use that feature to exclude the non-preferred variants (of course, this is an individual decision you need to make: we cannot impose your own preferred spelling onto everyone).&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Last but not least, suggestion (4) also exists: internally, it is called AutoReplace. This feature “aggressively” corrects misspellings with what it thinks is the best match, as RogerKni suggests. Try typing “infomation”, for instance, and you will see that Word automatically corrects it to &lt;I style="mso-bidi-font-style: normal"&gt;information&lt;/I&gt; as soon as you hit the space bar. To activate that feature, go to the big Office button in the top-left corner of your Office application, click on &lt;B style="mso-bidi-font-weight: normal"&gt;Word Options&lt;/B&gt;, then on &lt;B style="mso-bidi-font-weight: normal"&gt;Proofing&lt;/B&gt;, then on &lt;B style="mso-bidi-font-weight: normal"&gt;AutoCorrect Options&lt;/B&gt;. At the bottom of the screen, you will see the option “&lt;B style="mso-bidi-font-weight: normal"&gt;Automatically use suggestions from the spelling checker&lt;/B&gt;”. Tick the box as in the screenshot below:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="mso-bidi-language: JI; mso-no-proof: yes"&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;SPAN minmax_bound="true"&gt;&lt;A href="http://blogs.msdn.com/photos/naturallanguage/images/8923532/original.aspx" minmax_bound="true"&gt;&lt;IMG id=ctl00___ctl00___ctl00_ctl00_bcr_PictureDetails1___detailsImage_SmallThumb8923532 height=425 alt=AutoReplace src="http://blogs.msdn.com/photos/naturallanguage/images/8923532/376x425.aspx" width=376 border=0 minmax_bound="true"&gt;&lt;/A&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;SPAN minmax_bound="true"&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;I hope you will find these tips useful.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;-- Thierry Fontenelle (Program Manager)&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;I&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/I&gt;&amp;nbsp;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8923553" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Spelling/">Spelling</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/contextual+speller/">contextual speller</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/English/">English</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/possessives/">possessives</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/AutoReplace/">AutoReplace</category></item><item><title>Natural Language Group in the (Local) News</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2008/08/28/natural-language-group-in-the-local-news.aspx</link><pubDate>Thu, 28 Aug 2008 21:06:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8903890</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>2</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=8903890</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2008/08/28/natural-language-group-in-the-local-news.aspx#comments</comments><description>&lt;P&gt;Last week&amp;nbsp;our group was&amp;nbsp;visited by reporters from the Seattle Times, who took some photographs and talked to folks on the team about the challenges of developing software features that keep up with the world's &lt;A class="" href="http://blogs.msdn.com/naturallanguage/archive/tags/Spelling+Reform/default.aspx" target=_blank 
mce_href="http://blogs.msdn.com/naturallanguage/archive/tags/Spelling+Reform/default.aspx"&gt;rapidly changing&lt;/A&gt; languages.&amp;nbsp; This&amp;nbsp;morning, the front page of the&amp;nbsp;Business &amp;amp; Technology section&amp;nbsp;&lt;A class="" href="http://seattletimes.nwsource.com/html/businesstechnology/2008143381_spellchecker28.html" target=_blank mce_href="http://seattletimes.nwsource.com/html/businesstechnology/2008143381_spellchecker28.html"&gt;discusses&lt;/A&gt; those challenges, and a sidebar gives a pretty nice &lt;A class="" href="http://seattletimes.nwsource.com/html/businesstechnology/2008143384_spellcheckerbar28.html" target=_blank mce_href="http://seattletimes.nwsource.com/html/businesstechnology/2008143384_spellcheckerbar28.html"&gt;summary&lt;/A&gt; of how we select words to add or remove from our spellchecking dictionaries, not just for English but for &lt;A class="" href="http://blogs.msdn.com/naturallanguage/archive/tags/Language+Interface+Pack/default.aspx" target=_blank mce_href="http://blogs.msdn.com/naturallanguage/archive/tags/Language+Interface+Pack/default.aspx"&gt;lots&lt;/A&gt; and &lt;A class="" href="http://blogs.msdn.com/naturallanguage/archive/tags/Single+Language+Pack/default.aspx" target=_blank mce_href="http://blogs.msdn.com/naturallanguage/archive/tags/Single+Language+Pack/default.aspx"&gt;lots&lt;/A&gt; of languages.&amp;nbsp; Worth a look, if for no other reason than to get a glimpse of the terribly charming and attractive people who work on your spellcheckers.&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;-- James Lyle, Tester&lt;/EM&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8903890" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Spelling/">Spelling</category></item><item><title>Lets Play Two?</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2008/04/04/lets-play-two.aspx</link><pubDate>Fri, 04 Apr 2008 02:41:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8355282</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>1</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=8355282</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2008/04/04/lets-play-two.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;For opening day this year, my beloved Chicago Cubs unveiled a statue of the immortal &lt;/FONT&gt;&lt;A href="http://en.wikipedia.org/wiki/Ernie_Banks"&gt;&lt;FONT face=Calibri size=3&gt;Ernie Banks&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;On its pedestal was engraved the catchphrase that made famous his enthusiasm for the game of baseball: “Let’s Play Two”.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Except that the sculptor forgot one little detail, as reported by the &lt;/FONT&gt;&lt;A href="http://www.chicagotribune.com/news/local/chi-schmich_both_02apr02,0,3247109.column"&gt;&lt;FONT face=Calibri size=3&gt;Chicago Tribune&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;: the apostrophe!&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;He should have checked it out in Microsoft Word, first. 
The contextual spellchecker would have saved him a lot of grief:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN minmax_bound="true"&gt;&lt;IMG style="WIDTH: 425px; HEIGHT: 332px" height=332 src="http://blogs.msdn.com/photos/naturallanguage/images/8355274/425x332.aspx" width=425 mce_src="http://blogs.msdn.com/photos/naturallanguage/images/8355274/425x332.aspx"&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN minmax_bound="true"&gt;&lt;/SPAN&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN minmax_bound="true"&gt;-- &lt;EM&gt;James Lyle, Tester&lt;/EM&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8355282" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/contextual+speller/">contextual speller</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/baseball/">baseball</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/English/">English</category></item><item><title>The Grammar Checker and Rain Man</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2008/03/06/the-grammar-checker-and-rain-man.aspx</link><pubDate>Thu, 06 Mar 2008 19:22:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8074299</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>2</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=8074299</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2008/03/06/the-grammar-checker-and-rain-man.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;Thinking about a couple of recent Language Log posts&amp;nbsp;about&amp;nbsp;grammar checkers, &lt;A class="" href="http://itre.cis.upenn.edu/~myl/languagelog/archives/005431.html" target=_blank mce_href="http://itre.cis.upenn.edu/~myl/languagelog/archives/005431.html"&gt;passive sentences&lt;/A&gt;,&amp;nbsp;and &lt;/FONT&gt;&lt;A class="" href="http://itre.cis.upenn.edu/~myl/languagelog/archives/005404.html" target=_blank mce_href="http://itre.cis.upenn.edu/~myl/languagelog/archives/005404.html"&gt;&lt;FONT face=Calibri size=3&gt;grammatical Cupertinos&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri 
size=3&gt;, it occurred to me that grammar checkers are a bit like &lt;/FONT&gt;&lt;A class="" href="http://www.imdb.com/title/tt0095953/" target=_blank mce_href="http://www.imdb.com/title/tt0095953/"&gt;&lt;FONT face=Calibri size=3&gt;Rain Man&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;.&amp;nbsp; For the benefit of younger readers (no one alive in 1988 could have missed the hype surrounding this movie), Dustin Hoffman plays Raymond (“Rain Man”), an autistic man who is also a mathematical savant.&amp;nbsp; Tom Cruise is his hotshot younger brother who finds himself with the difficult task of caring for Raymond, whose inability to relate to the real world leads them into bizarre situations.&amp;nbsp; Tom takes them to Las Vegas, where Raymond’s uncanny mathematical aptitude makes them big winners until casino security accuses them of cheating.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;What does this have to do with grammar checkers?&amp;nbsp; A computer grammar checker is like a grammatical Rain Man—a savant with an encyclopedic knowledge of grammar rules, but no common-sense understanding of the real world, which sometimes leads to strange behavior.&amp;nbsp; Take the famous example “&lt;/FONT&gt;&lt;A class="" href="http://search.live.com/results.aspx?q=%22time+flies+like+an+arrow%22+ambiguity+computer&amp;amp;form=QBRE" target=_blank mce_href="http://search.live.com/results.aspx?q=%22time+flies+like+an+arrow%22+ambiguity+computer&amp;amp;form=QBRE"&gt;&lt;FONT face=Calibri size=3&gt;Time flies like an arrow&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;”.&amp;nbsp; This sentence is notable because it has a lot of possible but quite strange interpretations, in addition to its straightforward meaning of “time moves fast the way an arrow moves fast”.&amp;nbsp; It could mean, for example, something like: whenever you encounter a fly, time it in the same way you would time an arrow (using a stopwatch, perhaps).&amp;nbsp; We humans naturally disregard such interpretations (they may never even occur to us).&amp;nbsp; But our grammatical Rain Man sees every such possibility at the same time, and has to decide among them.&amp;nbsp; And it doesn’t have our common sense to help it.&amp;nbsp; Instead, it has to figure out the intended meaning of the sentence by purely computational means, basically by reasoning only about things that can be counted.&amp;nbsp; Such as, say, the number of times it has seen “time” used as an imperative verb vs. the number of times it’s seen it used as a noun subject, and so on.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;Most of the time this all works as advertised, but it can &lt;I&gt;occasionally&lt;/I&gt; lead to some odd decisions. Once in a while the grammar checker will make a suggestion that just looks weird, so if you blindly accept all its recommendations, you end up with those &lt;/FONT&gt;&lt;A class="" href="http://itre.cis.upenn.edu/~myl/languagelog/archives/005404.html" mce_href="http://itre.cis.upenn.edu/~myl/languagelog/archives/005404.html"&gt;&lt;FONT face=Calibri size=3&gt;cupertinos&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri&gt;&lt;FONT size=3&gt;.&amp;nbsp; Letting Rain Man call the shots may sound like an easy way to get rich, but like Tom at the casino, if you’re not careful you can end up in trouble.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 11pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: EN-US; mso-fareast-font-family: SimSun; mso-bidi-font-family: 'Times New Roman'; mso-fareast-language: ZH-CN; mso-bidi-language: HE; mso-fareast-theme-font: minor-fareast"&gt;So should you take advice on grammar from a Rain Man?&amp;nbsp; Well, sure, if you need it—after all, it knows &lt;B&gt;all&lt;/B&gt; the rules, and does a pretty good job most of the time.&amp;nbsp; But take it only as &lt;B&gt;advice&lt;/B&gt;, and look over its shoulder while it counts the cards.&amp;nbsp; Somebody has to be watching the dealer’s eyes.&amp;nbsp; That job is better left to a human.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 11pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: EN-US; mso-fareast-font-family: SimSun; mso-bidi-font-family: 'Times New Roman'; mso-fareast-language: ZH-CN; mso-bidi-language: HE; mso-fareast-theme-font: minor-fareast"&gt;-- &lt;EM&gt;James Lyle, Tester&lt;/EM&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8074299" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/grammar+checker/">grammar checker</category></item><item><title>Spellchecking ain't easy</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2008/02/15/spellchecking-ain-t-easy.aspx</link><pubDate>Sat, 16 Feb 2008 00:15:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:7722206</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>12</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=7722206</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2008/02/15/spellchecking-ain-t-easy.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;A customer wrote to us recently with the observation that our English spellchecker doesn’t recognize the word &lt;I&gt;ain’t&lt;/I&gt;, a fact which struck this customer as a tad, well, old-fashioned.&amp;nbsp; Pedantic, perhaps.&amp;nbsp; The words “uptight” and “shortsighted” might have been used.&amp;nbsp; Yikes!&amp;nbsp; I’ve been accused of a lot of things, but…&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;First, let’s admit that this is not a mistake.&amp;nbsp; Yes, we deliberately excluded &lt;I&gt;ain’t&lt;/I&gt;.&amp;nbsp; You can tell, because we made sure to get just the right set of words to suggest in its place (&lt;I&gt;isn’t, am not, aren’t&lt;/I&gt;) despite these words being pretty far away from &lt;I&gt;ain’t &lt;/I&gt;in &lt;/FONT&gt;&lt;A href="http://itre.cis.upenn.edu/~myl/languagelog/archives/005355.html" mce_href="http://itre.cis.upenn.edu/~myl/languagelog/archives/005355.html"&gt;&lt;FONT face=Calibri size=3&gt;edit distance&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;.&amp;nbsp; You could say the same about &lt;I&gt;gonna&lt;/I&gt; and &lt;I&gt;wanna&lt;/I&gt;, which are also excluded, but for which we suggest &lt;I&gt;going to&lt;/I&gt; and &lt;I&gt;want to&lt;/I&gt;, respectively.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;This is one of those &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2007/03/22/you-say-arkansas-i-say-arkansas-s-let-s-write-it-into-the-law.aspx" mce_href="http://blogs.msdn.com/naturallanguage/archive/2007/03/22/you-say-arkansas-i-say-arkansas-s-let-s-write-it-into-the-law.aspx"&gt;&lt;FONT face=Calibri size=3&gt;tough&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2006/08/03/687635.aspx" mce_href="http://blogs.msdn.com/naturallanguage/archive/2006/08/03/687635.aspx"&gt;&lt;FONT face=Calibri size=3&gt;calls&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; we encounter when building a spellchecker.&amp;nbsp; As a linguist, I have no inherent objection to the word “ain’t” on any moral, intellectual, or even aesthetic grounds.&amp;nbsp; It’s a part of my own spoken idiolect, and I tend to use it unselfconsciously in informal contexts.&amp;nbsp; Ain’t no reason not to, usually.&amp;nbsp; But clearly, it is still universally regarded as nonstandard, and people naturally want to avoid using it in formal writing.&amp;nbsp; That goes for me, too--I doubt I’d want to use it when writing to the boss, even when the boss is a pretty &lt;/FONT&gt;&lt;A href="http://en.wikipedia.org/wiki/Image:Bill_Gates_Master_Chief.jpg" mce_href="http://en.wikipedia.org/wiki/Image:Bill_Gates_Master_Chief.jpg"&gt;&lt;FONT face=Calibri size=3&gt;informal&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; kind of guy (Hi Bill!).&amp;nbsp; So from that point of view, flagging this word as an error is a good thing for customers, a lot of whom are using MS Office at work.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;On the other hand, people also have lots of &lt;/FONT&gt;&lt;A href="http://people.ischool.berkeley.edu/~nunberg/aint.html" mce_href="http://people.ischool.berkeley.edu/~nunberg/aint.html"&gt;&lt;FONT face=Calibri size=3&gt;reasons&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; to &lt;B&gt;want&lt;/B&gt; to write &lt;I&gt;ain’t&lt;/I&gt;--to be deliberately jocular, say, or to sound folksy, or just because it’s natural.&amp;nbsp; Or because they’re a &lt;/FONT&gt;&lt;A href="http://www.crossmyt.com/hc/linghebr/aus-aint.html" mce_href="http://www.crossmyt.com/hc/linghebr/aus-aint.html"&gt;&lt;FONT face=Calibri size=3&gt;Jane Austen&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; character.&amp;nbsp; And in those cases, they’d prefer to spell it right--there’s only one correct way to spell &lt;I&gt;ain’t&lt;/I&gt;, after all, and your fingers can stumble on it just as easily as any other word.&amp;nbsp; By not recognizing &lt;I&gt;ain’t,&lt;/I&gt; we sure ain’t helping folks in those situations.&amp;nbsp; And that goes for &lt;I&gt;gonna&lt;/I&gt; and &lt;I&gt;wanna&lt;/I&gt;, too.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;So what do you think?&amp;nbsp; Should there be a red squiggle under &lt;I&gt;ain’t&lt;/I&gt;, or not?&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;EM&gt;&lt;FONT face=Calibri size=3&gt;-- James Lyle, Tester&lt;/FONT&gt;&lt;/EM&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=7722206" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Spelling/">Spelling</category></item><item><title>Microsoft Office 2008 for Mac now supports the French spelling reform</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2008/01/20/microsoft-office-2008-for-mac-now-supports-the-french-spelling-reform.aspx</link><pubDate>Mon, 21 Jan 2008 02:58:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:7177608</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>7</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=7177608</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2008/01/20/microsoft-office-2008-for-mac-now-supports-the-french-spelling-reform.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;A href="http://www.macoffice2008.com/"&gt;&lt;FONT face=Calibri color=#800080 size=3&gt;Microsoft Office 2008 for Mac&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; was released this week. 
Users who are interested in the French language will notice a change in the French spell-checker, which now takes into account the French spelling reform, which is recommended by official bodies such as the &lt;/FONT&gt;&lt;A href="http://www.academie-francaise.fr/dictionnaire/index.html"&gt;&lt;I&gt;&lt;FONT face=Calibri color=#800080 size=3&gt;Académie Française&lt;/FONT&gt;&lt;/I&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;, the &lt;I&gt;Conseil Supérieur de la langue française&lt;/I&gt;, the &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/correcteurorthographiqueoffice/archive/2007/04/26/les-programmes-scolaires-du-minist-re-fran-ais-de-l-ducation-nationale-et-les-consignes-relatives-l-orthographe-rectifi-e.aspx"&gt;&lt;I&gt;&lt;FONT face=Calibri color=#800080 size=3&gt;French Ministry of Education&lt;/FONT&gt;&lt;/I&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;, the &lt;/FONT&gt;&lt;A href="http://www.renouvo.org/gqmnf"&gt;&lt;I&gt;&lt;FONT face=Calibri color=#800080 size=3&gt;Groupe québécois pour la modernisation de la norme du français&lt;/FONT&gt;&lt;/I&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;, the &lt;/FONT&gt;&lt;A href="http://www.renouvo.org/"&gt;&lt;FONT face=Calibri color=#800080 size=3&gt;Réseau pour la nouvelle orthographe du français&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;, etc. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The official texts make it abundantly clear that both the traditional (‘old’) spelling and the ‘new’ (recommended) spelling are valid. As can be seen below, the Mac Office 2008 speller accepts both forms by default (traditional and new spellings). Users of the &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2006/07/05/old-vs-new-spelling-in-french-a-new-speller-based-on-the-french-spelling-reform.aspx"&gt;&lt;FONT face=Calibri color=#800080 size=3&gt;Office 2007 French speller&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; are already familiar with the three options which enable them to change this default mode. As of this week, Mac Office users can now also decide whether they want their speller to:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraph style="MARGIN: 0in 0in 10pt 0.75in; TEXT-INDENT: -0.5in; TEXT-ALIGN: justify"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Verdana','sans-serif'"&gt;(a)&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 7pt; FONT-FAMILY: 'Verdana','sans-serif'"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Verdana','sans-serif'"&gt;&amp;nbsp;consider the old and new forms as valid (which is the default option)&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoListParagraph style="MARGIN: 0in 0in 10pt 0.75in; TEXT-INDENT: -0.5in; TEXT-ALIGN: justify"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Verdana','sans-serif'"&gt;(b) apply only the traditional (‘old’) spelling (i.e. ‘new’ forms will be squiggled)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoListParagraph style="MARGIN: 0in 0in 10pt 0.75in; TEXT-INDENT: -0.5in; TEXT-ALIGN: justify"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Verdana','sans-serif'"&gt;(c)&amp;nbsp;apply the ‘new’ (rectified) spelling only (i.e. the ‘old’ forms will be squiggled)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;If you are interested in a very brief description of the kinds of changes you might notice with this new speller, you can check out &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/correcteurorthographiqueoffice/archive/2005/10/16/481531.aspx"&gt;&lt;FONT face=Calibri color=#800080 size=3&gt;this post&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;, which was written when we released the new speller for Office users a while ago.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&amp;nbsp;&lt;IMG src="http://blogs.msdn.com/photos/naturallanguage/images/7177513/original.aspx"&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&amp;nbsp;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;EM&gt;-- Thierry Fontenelle (Program Manager)&amp;nbsp;&lt;/EM&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=7177608" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Spelling+Reform/">Spelling Reform</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Mac+Office/">Mac Office</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/French/">French</category></item><item><title>Building corpora with the Live Search API</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2007/12/22/building-corpora-with-the-live-search-api.aspx</link><pubDate>Sat, 22 Dec 2007 03:01:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6831141</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>1</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=6831141</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2007/12/22/building-corpora-with-the-live-search-api.aspx#comments</comments><description>&lt;FONT face=Calibri size=3&gt;&lt;SPAN style="FONT-FAMILY: 'Times New Roman','serif'"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;I just read &lt;A href="http://www.uclouvain.be/13406.html?fct=document&amp;amp;documentID=1008699"&gt;&lt;SPAN style="COLOR: purple"&gt;Building and Exploring Web Corpora&lt;/SPAN&gt;&lt;/A&gt;, which includes the Proceedings of the &lt;I style="mso-bidi-font-style: normal"&gt;3&lt;SUP&gt;rd&lt;/SUP&gt; Web as Corpus Workshop&lt;/I&gt; (WAC3-2007) held at the University of Louvain-la-Neuve in September 2007. A number of papers describe how computational linguists have been using Microsoft’s &lt;A href="http://www.live.com/"&gt;&lt;SPAN style="COLOR: purple"&gt;Live Search&lt;/SPAN&gt;&lt;/A&gt; Application Programming Interface (API) to build and clean corpora to be used in natural language processing.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;One of the papers (Leturia &lt;I&gt;et al&lt;/I&gt;.) describes the &lt;I style="mso-bidi-font-style: normal"&gt;CorpEus&lt;/I&gt; tool, which uses the Live Search (LS) API and which the authors designed to create web corpora for the Basque language. &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;Another &lt;A href="http://webascorpus.org/wac3/wac3-WHFletcher-revised.pdf" mce_href="http://webascorpus.org/wac3/wac3-WHFletcher-revised.pdf"&gt;&lt;SPAN style="COLOR: purple"&gt;very interesting paper, by William Fletcher&lt;/SPAN&gt;&lt;/A&gt;, explains why the API was found to meet linguists’ requirements for generating concordances for linguistic research. Let me quote Fletcher here:&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; LINE-HEIGHT: 18pt; mso-layout-grid-align: none"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: Symbol; mso-ansi-language: EN; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;·&lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 7pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: Symbol"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;Of the Search Engines which provide free APIs to developers, Live Search is the most generous by far: it allows 10,000 queries per application id (AppID) per IP address per day; &lt;SPAN lang=EN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'; mso-fareast-language: EN-US; mso-bidi-language: AR-SA"&gt;[TF: Fletcher mentions 10,000 queries per day while Leturia et al. indicate that the API allows 25,000 queries per day. The latter figure is the correct one, in fact, which makes it even more generous]&lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; LINE-HEIGHT: 18pt; mso-layout-grid-align: none"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: Symbol; mso-ansi-language: EN; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;·&lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 7pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: Symbol"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;LS provides high-quality search results, with relatively few pages from link farms or “scraper sites”, which repeat content from or link to other pages merely for advertising revenue;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; LINE-HEIGHT: 18pt; mso-layout-grid-align: none"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: Symbol; mso-ansi-language: EN; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;·&lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 7pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: Symbol"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;It also supports search by location, i.e. by country or even latitude and longitude;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; LINE-HEIGHT: 18pt; mso-layout-grid-align: none"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: Symbol; mso-ansi-language: EN; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;·&lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 7pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: Symbol"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;Live Search is more responsive to changes on the Web: there is faster turnover in the top hits returned for a given query than with Google or Yahoo!, and documents in the cache tend to be “fresher”, i.e. updated more frequently;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; LINE-HEIGHT: 18pt; mso-layout-grid-align: none"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: Symbol; mso-ansi-language: EN; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;·&lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 7pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: Symbol"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;The LS cache provides quick, reliable access to the original texts. In documents retrieved from the cache, LS generally detects the character set encoding accurately and converts it to UTF-8, thereby eliminating a potential source of variability and errors;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; LINE-HEIGHT: 18pt; mso-layout-grid-align: none"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: Symbol; mso-ansi-language: EN; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;·&lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 7pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: Symbol"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;LS also converts Adobe Acrobat PDF documents to HTML which closely reflects the formatting of the original;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; LINE-HEIGHT: 18pt; mso-layout-grid-align: none"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: Symbol; mso-ansi-language: EN; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;·&lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 7pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: Symbol"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;The Live Search API provides direct links to the cache, and the site responds rapidly and at a high transfer rate, permitting very efficient data collection without delays, redirections or dead links.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt; mso-layout-grid-align: none"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;Here is the full reference for that paper, in case you want to read the whole story: &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;William H. Fletcher: &lt;A href="http://webascorpus.org/wac3/wac3-WHFletcher-revised.pdf" mce_href="http://webascorpus.org/wac3/wac3-WHFletcher-revised.pdf"&gt;&lt;SPAN style="COLOR: purple"&gt;Implementing a BNC-Compare-able Web Corpus&lt;/SPAN&gt;&lt;/A&gt;, in Fairon, C., Naets, H., Kilgarriff, A., de Schrijver, G-M (eds): &lt;I style="mso-bidi-font-style: normal"&gt;Building and Exploring Web Corpora – Proceedings of the 3&lt;SUP&gt;rd&lt;/SUP&gt; Web as Corpus Workshop, Incorporating Cleaneval&lt;/I&gt; (WAC3-2007, September 2007), UCL, Presses Universitaires de Louvain, 2007. &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;It’s cool to see that this search engine is found useful by some members of the computational linguistics communities and not just because it allows 25,000 queries a day. I’m sure more can be done to meet linguists’ needs, but it’s definitely encouraging to read that feedback. If you are interested in this Live Search API, you can check out its “Terms of Use” here: &lt;A href="http://dev.live.com/livesearch/"&gt;&lt;FONT color=#800080&gt;http://dev.live.com/livesearch/&lt;/FONT&gt;&lt;/A&gt;. &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/SPAN&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;/SPAN&gt;&lt;/I&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;/SPAN&gt;&lt;/I&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 18pt"&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;-- Thierry Fontenelle (Program Manager)&lt;/SPAN&gt;&lt;/I&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;EM&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&lt;/EM&gt;&lt;/P&gt;&lt;/o:p&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;EM&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt" mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=6831141" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Web+as+Corpus/">Web as Corpus</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Live+Search/">Live Search</category></item><item><title>Untied Nations or United Nations?</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2007/12/20/untied-nations-or-united-nations.aspx</link><pubDate>Thu, 20 Dec 2007 04:14:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6811758</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>2</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=6811758</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2007/12/20/untied-nations-or-united-nations.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="COLOR: #1f497d"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;During my vacation in December 2007, I had a chance to visit a friend of mine who works for the United Nations in Bangkok. &lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="COLOR: #1f497d"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;On a Friday evening right before Secretary-General Ban Ki-moon's visit to the UN Bangkok office, I chatted with my friend's colleagues in the UN building over beer and wine. &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="COLOR: #1f497d"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Many of them said they have a big problem with the Office 2003 English spellchecker because it doesn’t catch the most common spelling error in their organization: “the &lt;U&gt;Untied&lt;/U&gt; Nations”. &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="COLOR: #1f497d"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;They were fascinated by the idea of upgrading to Office 2007, which can correctly identify the error and suggest “United” thanks to its Contextual Spellchecker.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="COLOR: #1f497d"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="COLOR: #1f497d"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;-- Kazami Uchida (International Product Engineer)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=6811758" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/contextual+speller/">contextual speller</category></item><item><title>MSR blog on the Microsoft Research Machine Translation system</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2007/11/20/msr-blog-on-the-microsoft-research-machine-translation-system.aspx</link><pubDate>Wed, 21 Nov 2007 01:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6445520</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>2</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=6445520</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2007/11/20/msr-blog-on-the-microsoft-research-machine-translation-system.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;Our colleagues from the Microsoft Research (MSR) group have started blogging about the statistical machine translation (MSR-MT) system they are developing. We announced the &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2007/09/13/windows-live-translator-microsoft-s-new-machine-translation-web-service.aspx"&gt;&lt;FONT face=Calibri color=#800080 size=3&gt;Windows Live Translator&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; when it was launched in September. 
&lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/translation/default.aspx"&gt;&lt;FONT face=Calibri color=#800080 size=3&gt;Check out their blog&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; if you want to know all the details about this system. For instance, you will discover how to install and use the &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/translation/archive/2007/11/11/new-on-the-windows-live-gallery-the-toolbar-translator-button.aspx"&gt;&lt;FONT face=Calibri color=#800080 size=3&gt;Windows Live Toolbar Translator Button&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;, which is now available in (and for) 12 languages.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&lt;EM&gt;-- Thierry Fontenelle (Program Manager)&lt;/EM&gt;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=6445520" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/machine+translation/">machine translation</category></item><item><title>New Blog on Enterprise Search</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2007/11/16/new-blog-on-enterprise-search.aspx</link><pubDate>Fri, 16 Nov 2007 20:02:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6310247</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>1</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=6310247</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2007/11/16/new-blog-on-enterprise-search.aspx#comments</comments><description>&lt;P&gt;There's a new blog out there from the team that's&amp;nbsp;working on Enterprise Search for MOSS (Microsoft Office SharePoint Server). They've got tips and tricks for administrators and will be posting info on features. It's a&amp;nbsp;team that we work closely with, delivering query spelling suggestions and tokenization with morphological analysis. Some of the most recent news there is around a new release of Search. Check it out. &lt;A href="http://blogs.msdn.com/enterprisesearch/" mce_href="http://blogs.msdn.com/enterprisesearch/"&gt;http://blogs.msdn.com/enterprisesearch/&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;-- Jay Waltmunson (Program Manager)&lt;/EM&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=6310247" width="1" height="1"&gt;</description></item><item><title>The French spelling reform in the Canadian press</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2007/11/11/the-french-spelling-reform-in-the-canadian-press.aspx</link><pubDate>Sun, 11 Nov 2007 05:38:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:6081578</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>4</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=6081578</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2007/11/11/the-french-spelling-reform-in-the-canadian-press.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"&gt;For readers who are interested in the French spelling reform, two recent articles published in Montreal newspapers discuss the spread of the spelling reform and its slow but increasing adoption by teachers and the press in Canada, Belgium, Switzerland and France. Both articles, which quote Chantal Contant from the &lt;I style="mso-bidi-font-style: normal"&gt;Groupe québécois pour la modernisation de la norme du français&lt;/I&gt;, list the reference dictionaries and computerized tools that take the new spelling into account and, in both cases, the Microsoft Office speller is listed as a tool which covers 100% of the new forms.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in"&gt;&lt;SPAN lang=FR-CA style="FONT-SIZE: 11pt; COLOR: black; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: FR-CA; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;Chantal Contant fait valoir que certains dictionnaires, tels le Hachette, le Littré et le Bescherelle, ont adopté intégralement les rectifications, tout comme les logiciels de correction &lt;I&gt;Antidote&lt;/I&gt;, &lt;I&gt;Myriade&lt;/I&gt;, &lt;I&gt;ProLexis&lt;/I&gt;, &lt;I&gt;Cordial&lt;/I&gt; et le correcteur de Word.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: 12pt"&gt;&lt;SPAN lang=FR-CA style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: FR-CA; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; TEXT-INDENT: 0.5in; LINE-HEIGHT: 12pt"&gt;&lt;SPAN lang=FR style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: FR; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;(L'Actualité, no. Vol: 32 No: 16, 15 octobre 2007, p. 70: Débat - &lt;/SPAN&gt;&lt;A href="http://www.lactualite.com/education/article.jsp?content=20070921_165020_5136"&gt;&lt;SPAN lang=FR style="FONT-SIZE: 11pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: FR; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT color=#800080&gt;Le français frisote&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN lang=FR style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: FR; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN-LEFT: 0.5in; TEXT-ALIGN: justify"&gt;&lt;SPAN lang=FR-CA style="FONT-SIZE: 11pt; COLOR: windowtext; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: FR-CA; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;Du côté des ouvrages de référence, Le Petit Robert et Le Petit Larousse sont plus réticents que le dictionnaire Hachette, le Nouveau Littré, les correcteurs Antidote, ProLexis et Word, ou les grammaires Bescherelle et Grevisse, qui intègrent 100 % des changements. &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 1.5in; TEXT-INDENT: -1in; mso-line-height-alt: 8.0pt"&gt;&lt;SPAN lang=FR style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: FR; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;(Le Devoir, LES ACTUALITÉS, mardi 2 octobre 2007, p. A4&amp;nbsp;: &lt;A href="http://www.ledevoir.com/2007/10/02/159095.html"&gt;&lt;FONT color=#800080&gt;Rectifications de l'orthographe&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 1.5in; TEXT-INDENT: -1in; mso-line-height-alt: 8.0pt"&gt;&lt;SPAN class=MsoHyperlink&gt;&lt;B&gt;&lt;SPAN lang=FR style="FONT-SIZE: 11pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: FR; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;A href="http://www.ledevoir.com/2007/10/02/159095.html"&gt;&lt;FONT color=#800080&gt;Les graphies font peu à peu leur chemin&lt;/FONT&gt;&lt;/A&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/SPAN&gt;&lt;B&gt;&lt;SPAN lang=FR style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'; mso-ansi-language: FR; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN lang=FR style="FONT-SIZE: 10pt; FONT-FAMILY: 'Trebuchet MS','sans-serif'; mso-ansi-language: FR"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-SIZE: 11pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;I blogged a few months ago about the &lt;/SPAN&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2006/07/05/656998.aspx"&gt;&lt;SPAN style="FONT-SIZE: 11pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT color=#800080&gt;three options offered to the users of the Office 2007 speller&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="FONT-SIZE: 11pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;. You can also find a brief description of the differences between the traditional spelling and the “new” spelling (which is now also recommended by the French Ministry of Education in its official curriculum) &lt;/SPAN&gt;&lt;A href="http://blogs.msdn.com/correcteurorthographiqueoffice/archive/2005/10/16/481531.aspx"&gt;&lt;SPAN style="FONT-SIZE: 11pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT color=#800080&gt;here&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="FONT-SIZE: 11pt; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Trebuchet MS','sans-serif'"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;SPAN lang=FR style="mso-ansi-language: FR"&gt;&lt;FONT size=3&gt;&lt;FONT face="Times New Roman"&gt;Thierry Fontenelle – Program Manager&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=6081578" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/Spelling+Reform/">Spelling Reform</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/French/">French</category></item><item><title>Contextual spelling: US English only?</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2007/10/24/contextual-spelling-us-english-only.aspx</link><pubDate>Wed, 24 Oct 2007 20:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:5654563</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>3</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=5654563</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2007/10/24/contextual-spelling-us-english-only.aspx#comments</comments><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'"&gt;Laurie asked us via the Email/Contact link:&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.5in"&gt;&lt;TT&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'"&gt;I was always under the impression that the Contextual Spell Checker only works if your language is set to English (US) rather than English (UK). However, I have recently seen the blue squiggly lines appear for English (UK).&lt;/SPAN&gt;&lt;/I&gt;&lt;/TT&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'"&gt;&lt;BR&gt;&lt;BR&gt;&lt;TT&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'; mso-bidi-font-size: 12.0pt; mso-ansi-font-size: 12.0pt"&gt;Can you confirm whether this has come about as a result of a specific Office update?&lt;/SPAN&gt;&lt;/TT&gt;&lt;BR style="mso-special-character: line-break"&gt;&lt;BR style="mso-special-character: line-break"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;TT&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'"&gt;The contextual speller works for all varieties of English (UK, US, Australian, Canadian). This has been the case since the launch of Office 2007 and there has not been any specific update for that version of Office. If you write something like this:&lt;/SPAN&gt;&lt;/TT&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpFirst style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; COLOR: black; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;SPAN style="mso-list: Ignore"&gt;(a)&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN lang=EN style="FONT-SIZE: 12pt; COLOR: black; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN"&gt;When inserting an Excel chart into a Word document, the chart &lt;U style="text-underline: #0070C0 wavy-heavy"&gt;looses&lt;/U&gt; its &lt;STRONG&gt;color&lt;/STRONG&gt; when the focus is set to the document.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpLast style="MARGIN: 0in 0in 10pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN lang=EN-GB style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN-GB; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;SPAN style="mso-list: Ignore"&gt;(b)&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN lang=EN-GB style="FONT-SIZE: 12pt; COLOR: black; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN-GB"&gt;When inserting an Excel chart into a Word document, the chart &lt;U style="text-underline: #0070C0 wavy-heavy"&gt;looses&lt;/U&gt; its &lt;STRONG&gt;colour&lt;/STRONG&gt; when the focus is set to the document.&lt;/SPAN&gt;&lt;SPAN lang=EN-GB style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'; mso-ansi-language: EN-GB"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;TT&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'"&gt;You will see the blue squiggles under &lt;I style="mso-bidi-font-style: normal"&gt;looses&lt;/I&gt; whether you are in US English or in UK English mode. If you set (b) to UK English to make sure &lt;I style="mso-bidi-font-style: normal"&gt;colour&lt;/I&gt; is not red-squiggled, &lt;I style="mso-bidi-font-style: normal"&gt;looses&lt;/I&gt; will nevertheless be flagged as a contextual mistake and the contextual speller will suggest &lt;I style="mso-bidi-font-style: normal"&gt;loses&lt;/I&gt;.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/TT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;TT&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'"&gt;Thanks for giving us the opportunity to dispel that rumo(u)r, Laurie.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/TT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;TT&gt;&lt;SPAN style="FONT-SIZE: 12pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Times New Roman','serif'"&gt;&lt;o:p&gt;&lt;EM&gt;-- &amp;nbsp;Thierry Fontenelle (Program Manager)&lt;/EM&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/TT&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=5654563" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/contextual+speller/">contextual speller</category><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/English/">English</category></item><item><title>When Languages Die</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2007/10/17/when-languages-die.aspx</link><pubDate>Wed, 17 Oct 2007 03:40:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:5478885</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>12</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=5478885</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2007/10/17/when-languages-die.aspx#comments</comments><description>&lt;FONT face=Calibri&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;James was talking about &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/naturallanguage/archive/2007/09/21/on-endangered-languages.aspx"&gt;&lt;FONT color=#800080 size=3&gt;endangered languages&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3&gt; the other day. I have just finished reading David Harrison’s new book, “&lt;/FONT&gt;&lt;A href="http://www.oup.com/us/catalog/general/subject/Linguistics/SociolinguisticsAnthropologicalL/?view=usa&amp;amp;ci=9780195181920"&gt;&lt;FONT color=#800080 size=3&gt;When Languages Die – The Extinction of the World’s Languages and the Erosion of Human Knowledge&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3&gt;”, which I discovered via Michael Kaplan’s &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/michkap/archive/2007/06/09/3186017.aspx"&gt;&lt;FONT color=#800080 size=3&gt;blog&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3&gt;. It’s a fascinating account of how thousands of languages are disappearing as they are gradually “crowded out” by bigger languages. Six years ago, there were an estimated 6,900 distinct languages, and Harrison points out that, by the end of the 21&lt;SUP&gt;st&lt;/SUP&gt; century, only about half of them may still be spoken, because their speakers will have abandoned them to turn to more dominant, more prestigious or more widely known languages. Harrison brilliantly demonstrates what language death or language extinction means for us. He focuses on the vast body of knowledge that will soon be lost and explores various knowledge systems (moon phases, folk taxonomies, knowledge encoded in traditional calendars, topographic naming systems…) to show how cultural knowledge is packaged in languages and cannot be transferred when people stop using their language. I found the discussion about number systems enlightening and captivating. 
He points out that counting systems provide a window into human cognition and that a lot is lost when the speakers of a language decide to move to the decimal counting system. His demonstration is simply superb. Harrison argues that it is urgent to document languages and to do whatever we can to preserve them and to encourage their speakers to go on using them. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;Everyone must play their part here. As a software company, we have a number of initiatives to help linguistic communities (see, for instance, the &lt;/FONT&gt;&lt;A href="http://www.microsoft.com/industry/publicsector/government/locallanguage/default.aspx"&gt;&lt;FONT color=#800080 size=3&gt;Microsoft Local Language Program&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3&gt;, which provides Language Interface Packs (LIPs) in a wide range of languages, or the &lt;/FONT&gt;&lt;A href="http://members.microsoft.com/wincg/"&gt;&lt;FONT color=#800080 size=3&gt;community glossaries&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3&gt; of IT terms, which are built by local volunteers with the aim of helping local groups promote and preserve their languages; I also talked recently, in French, about a &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/correcteurorthographiqueoffice/archive/2007/09/18/un-correcteur-orthographique-et-grammatical-breton-pour-office-2007.aspx"&gt;&lt;FONT color=#800080 size=3&gt;new Breton speller for Office 2007&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3&gt;, which was created by a Breton-speaking volunteer who devotes a lot of time and energy to the preservation of his language). We have talked a lot on this blog about proofing tools, and building word lists for spellers and for other tools such as thesauri or word-breakers is certainly something that needs to be done if one wishes to help communities access technology in their languages. To some extent, I feel that Harrison and a group like ours (and several other groups in the company, of course) share a common passion for languages and a common goal: “what scientists can do is to capture an accurate record in the form of recordings and analyses”, he writes. Our technology can certainly help, and I hope we will be able to offer even more in the future to help communities preserve their languages. 
At the same time, Harrison points out that no one but speakers themselves can preserve languages, since there is no such thing as a living human language without speakers (p. 10). My sincere hope is that we’ll manage to create the synergies that are necessary to preserve language diversity and perhaps to prevent some languages from dying. Meanwhile, I definitely encourage you to read David Harrison’s book. You won’t regret it.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;FONT size=3&gt;Thierry Fontenelle – Program Manager&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;/FONT&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&lt;/FONT&gt;&lt;/o:p&gt;&amp;nbsp;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=5478885" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/b/naturallanguage/archive/tags/endangered+languages/">endangered languages</category></item><item><title>Fellow linguist blogger in Windows International</title><link>http://blogs.msdn.com/b/naturallanguage/archive/2007/10/01/fellow-linguist-blogger-in-windows-international.aspx</link><pubDate>Mon, 01 Oct 2007 22:40:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:5227216</guid><dc:creator>Natural Language Group Microsoft Office</dc:creator><slash:comments>1</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://blogs.msdn.com/b/naturallanguage/rsscomments.aspx?WeblogPostID=5227216</wfw:commentRss><comments>http://blogs.msdn.com/b/naturallanguage/archive/2007/10/01/fellow-linguist-blogger-in-windows-international.aspx#comments</comments><description>&lt;P&gt;Kieran is a fellow linguist on the Windows International team, working closely with the&amp;nbsp;team delivering Windows Desktop Search. She's got some great insight into language and technology on her "Loneliness of the Long Distance Linguist" blog. Check her out here: &lt;A href="http://blogs.msdn.com/kierans/"&gt;http://blogs.msdn.com/kierans/&lt;/A&gt;. Linguists, we are everywhere! :)&lt;/P&gt;
&lt;P mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;-- Jay Waltmunson (Program Manager)&lt;/EM&gt;&lt;/P&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=5227216" width="1" height="1"&gt;</description></item></channel></rss>