We are a group of French-speaking computational linguists (French, Belgian, Canadian) based in Redmond (USA). We develop linguistic tools for Microsoft. This blog is for you: we look forward to your comments and suggestions. Pens at the ready!

  • Blog Post: Identifying tokens: Is word-breaking so easy?

    What is a word? It is a question we linguists have to answer whenever we develop spell-checkers or grammar checkers, perform automatic dictionary look-up, or try to interpret (and expand) queries for a search engine. I recently wrote a paper to show that doing word-breaking and tokenization...