Today's post was delayed slightly, but we have good news: additional language model datasets are now available. As always, the easiest way to get a list is to navigate to http://web-ngram.research.microsoft.com/rest/lookup.svc. The new items are shown below in URN form:

urn:ngram:bing-title:apr10:1
urn:ngram:bing-title:apr10:2
urn:ngram:bing-title:apr10:3
urn:ngram:bing-title:apr10:4
urn:ngram:bing-title:apr10:5
urn:ngram:bing-anchor:apr10:1
urn:ngram:bing-anchor:apr10:2
urn:ngram:bing-anchor:apr10:3
urn:ngram:bing-anchor:apr10:4
urn:ngram:bing-anchor:apr10:5
urn:ngram:bing-body:apr10:1
urn:ngram:bing-body:apr10:2
urn:ngram:bing-body:apr10:3
urn:ngram:bing-body:apr10:4
urn:ngram:bing-body:apr10:5

Those of you familiar with the naming scheme will notice right away that we now support 5-grams for the three main streams. What the naming scheme does not capture is that, unlike the jun09 dataset for the body stream, the apr10 body dataset has a cutoff of 10. The title and anchor streams still have a cutoff of 0, as did all of the jun09 streams.
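
For readers who want to work with these identifiers in code, here is a minimal sketch of splitting a model URN into its parts and looking up the cutoff described above. The field names (stream, version, order) and the helper names `parse_model_urn` and `cutoff_for` are our own labels for illustration, not part of the service, and the cutoff values are simply the ones stated in this post, since the URN does not carry them.

```python
from typing import NamedTuple, Optional


class ModelUrn(NamedTuple):
    """Parts of a model URN such as urn:ngram:bing-body:apr10:5."""
    stream: str   # e.g. "bing-body" (label chosen for this sketch)
    version: str  # e.g. "apr10"
    order: int    # n-gram order, 1 through 5 in the list above


# Cutoffs for the apr10 datasets, as stated in the post.
APR10_CUTOFFS = {
    "bing-title": 0,
    "bing-anchor": 0,
    "bing-body": 10,
}


def parse_model_urn(urn: str) -> ModelUrn:
    """Split a URN of the form urn:ngram:<stream>:<version>:<order>."""
    parts = urn.strip().split(":")
    if len(parts) != 5 or parts[0] != "urn" or parts[1] != "ngram":
        raise ValueError(f"not a recognized model URN: {urn!r}")
    _, _, stream, version, order = parts
    return ModelUrn(stream, version, int(order))


def cutoff_for(model: ModelUrn) -> Optional[int]:
    """Return the cutoff stated in the post, or None if the post does not say."""
    if model.version == "jun09":
        return 0  # all jun09 streams had a cutoff of 0
    if model.version == "apr10":
        return APR10_CUTOFFS.get(model.stream)
    return None


if __name__ == "__main__":
    for urn in ("urn:ngram:bing-body:apr10:5", "urn:ngram:bing-title:apr10:5"):
        model = parse_model_urn(urn)
        print(model.stream, model.version, model.order, cutoff_for(model))
```

Keeping the cutoffs in a separate table, rather than trying to derive them from the URN, mirrors the point made above: the name tells you the stream, version and order, and nothing more.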