I had a requirement to crawl Japanese content, and we had an English install of SharePoint 2007 (MOSS). For some reason, the crawler was not identifying the locale of the HTML content (i.e. all the Publishing pages) as 1041 (the Japanese locale),