Learn about some of the details of tokenization in our service.
One use of our service is word breaking based on n-gram probabilities, with no linguistic knowledge required.
The top 100K words for the Apr10 body stream are now available for analysis.
A quick tutorial on the MicrosoftNgram Python library.
Different models reflect different writing styles on the web.
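To make the word-breaking idea above concrete, here is a minimal sketch of probability-driven segmentation. It does not call our service or the MicrosoftNgram library. Instead it assumes a tiny, made-up unigram table (`TOY_UNIGRAM_PROB`) and a crude per-character penalty for unknown strings, both hypothetical stand-ins for real model lookups. The function name `break_words` and its parameters are likewise illustrative only.

```python
import math

# Hypothetical toy unigram probabilities, invented for illustration.
# A real system would look these up in an n-gram model instead.
TOY_UNIGRAM_PROB = {
    "the": 0.05,
    "we": 0.03,
    "web": 0.02,
    "search": 0.015,
    "engine": 0.01,
}

# Assumed penalty (log probability per character) for strings not in the table.
UNKNOWN_LOG_PROB_PER_CHAR = math.log(1e-6)


def word_log_prob(word):
    """Return the log probability of a candidate word under the toy model."""
    prob = TOY_UNIGRAM_PROB.get(word)
    if prob is not None:
        return math.log(prob)
    return UNKNOWN_LOG_PROB_PER_CHAR * len(word)


def break_words(text, max_word_len=20):
    """Split unspaced text into the most probable word sequence.

    Uses dynamic programming: best[i] holds the highest total log
    probability of any segmentation of text[:i].
    """
    n = len(text)
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)  # back[i] = start index of the last word ending at i
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            score = best[start] + word_log_prob(text[start:end])
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Walk the back-pointers from the end to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    words.reverse()
    return words


if __name__ == "__main__":
    print(break_words("websearchengine"))
```

Nothing in the sketch knows about grammar or morphology: the split is chosen purely because it maximizes the summed log probabilities. Swapping the toy table for higher-order n-gram scores would change only `word_log_prob` and the scoring step, not the overall search.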