Microsoft Web N-Gram

Bringing you web-scale language model data. Web N-Gram is a joint project between Microsoft Bing and Microsoft Research.

Tagged Content List
  • Blog Post: Microsoft Research Speller Challenge is open for business

    After a few bumps here and there, we have the site up and running. If you prefer a write-up by a professional writer on the subject, I'll refer you to this announcement. Some of you may be wondering why certain design choices were made in the process of designing this challenge. I hope to address...
  • Blog Post: Wordbreakingisacinchwithdata

    For the task of word-breaking, many different approaches exist. Today we're writing about a purely data-driven approach, and it's actually quite straightforward: all we do is consider every character boundary as a potential word boundary, and compare the relative joint probabilities, with...
  • Blog Post: Who doesn't like models?

    If there ever was an overloaded term in Computer Science, it's models. For instance, my colleagues in the eXtreme Computing Group have this terrific ambition to model the entire world! What we're talking about here is much simpler: it is a representation of a particular corpus. One of the key insights...
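
The word-breaking excerpt above describes treating every character boundary as a candidate word boundary and comparing the joint probabilities of the resulting segmentations. A minimal sketch of that idea follows, under loud assumptions: it uses a unigram model with a tiny hand-made count table (`TOY_COUNTS`, hypothetical) rather than the Web N-Gram service, and the names `log_prob` and `segment` are illustrative, not taken from the post.

```python
import math
from functools import lru_cache

# Hypothetical toy counts, invented for illustration only. A real system
# would look up probabilities in a web-scale language model instead.
TOY_COUNTS = {
    "word": 50, "breaking": 20, "is": 200, "a": 300, "cinch": 5,
    "with": 150, "data": 60, "in": 180, "at": 120,
}
TOTAL = sum(TOY_COUNTS.values())


def log_prob(token):
    """Unigram log-probability of a token under the toy counts."""
    count = TOY_COUNTS.get(token)
    if count is None:
        # Assumption: unseen tokens get a penalty that grows with length,
        # so long runs of unknown characters are strongly disfavoured.
        return math.log(1.0 / (TOTAL * 10 ** len(token)))
    return math.log(count / TOTAL)


def segment(text):
    """Return the segmentation of `text` with the highest joint probability.

    Every character boundary is tried as a possible word boundary; the
    joint probability of a segmentation is the product of its unigram
    probabilities (summed here in log space).
    """

    @lru_cache(maxsize=None)
    def best(start):
        # Best (score, words) for the suffix beginning at `start`.
        if start == len(text):
            return 0.0, ()
        candidates = []
        for end in range(start + 1, len(text) + 1):
            head = text[start:end]
            tail_score, tail_words = best(end)
            candidates.append((log_prob(head) + tail_score, (head,) + tail_words))
        return max(candidates)

    return list(best(0)[1])
```

Calling `segment("wordbreakingisacinchwithdata")` runs the comparison over all boundary placements. Memoising on the start index keeps the search quadratic in the input length instead of exponential. A model that conditions on preceding words would need the scoring step changed accordingly.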