The top 100K words for the Apr10 body stream are now available for analysis.
The Generative-Mode API gives you new insight into the language data of the web.
A quick tutorial on the MicrosoftNgram Python library.
Announcing new datasets from Spring 2010. Now serving 5-grams!
Different models reflect different writing styles on the web.
A very brief introduction to the Microsoft Web N-Gram service.
Language Modeling 101: An introduction to conditional probabilities in the context of language data.
Language Modeling 102: A lesson on joint probabilities.
One use of our service is word breaking based on n-gram probabilities. No linguistic knowledge necessary.
Learn about some of the details of tokenization in our service.
Some simple performance tips that may speed up your WCF application.
What happens when you encounter the unknown?
Introducing the Speller Challenge, a contest from Microsoft Research and Bing.
Working with large lexicons means engineering trade-offs become necessary.
Some additional FAQs for the now-open Microsoft Research Speller Challenge.