Dirty words in Lithuanian

One of the fun things about working in the speech and natural language group is getting access to tons of cool language data. Last week we got a new data set called "Offensive Word List", and it's available in almost every language you can think of. My mother is Lithuanian, so I'm glad to now have a handy list of terms that I had better not say in front of her, including:
  • pro*isys
  • by*io
  • de*ilas
  • juod*vernis
  • nepi* prot*
  • per*i
  • piz*a
Don't worry, when we have our speech recognition system working in Lithuanian, we'll be sure to handle these words correctly.
Published 08 March 05 05:54 by sprague

Comments

# Sushant Bhatia said on March 8, 2005 10:05 PM:
LOL. Ahh yes..the way we all learn a new language is to first learn the cuss words :-)
# Laurynas said on March 8, 2005 11:09 PM:
I'm a native Lithuanian and I'm not sure how "juodaskvernis" is offensive. It could be translated as "black-skirted". However, the rest of this list is quite good WRT being offensive :)
# Romualdas Stonkus said on March 9, 2005 4:18 AM:
Quite a good collection, though I'd think this is not a very good place for these words :)
There is an entire site on the net devoted to dirty and pretty insulting words. If you cannot find it through Google, let me know by email: First name At hotmail.
# Adrian Florea said on March 9, 2005 4:46 AM:
Oh, nice discovery of the day: for "pizda" there is the same word in Lithuanian as in Romanian :-)
# Mikhail Arkhipov (MSFT) said on March 9, 2005 11:03 PM:
As is in Russian.
# Richard Sprague WebLog said on September 5, 2006 8:23 PM:
This new Blackberry software (reported by WashPost and CNet) from MobileVoiceControl sounds...
