Terry Zink's Cyber Security Blog

Discussing Internet security in (mostly) plain English

My great idea–quashed by political correctness


I had a great idea at work.

We’re looking at a future feature wherein we allow users to block email by language.  If you’re an English speaker and never receive mail in Chinese, you can set an option to block all Chinese mail.  Want no Spanish?  Set the option to block Spanish.  Voila, you’ve just achieved foreign language spam filtering at no extra cost.

Most filters do this with charset analysis: certain charsets correspond to certain languages.  This is true of Russian, Japanese, Chinese, Korean and Turkish.  Unfortunately, other languages like French, Spanish, German and Portuguese all share the same charset, so you can’t block any one of them by charset without incurring a lot of false positives.  English text also fits inside many other charsets, and mail in every language is sometimes encoded in UTF-8.  In other words, you could block the Russian charset but you wouldn’t block all Russian spam, because some Russian spam is encoded in UTF-8; and if you blocked UTF-8, you would block a lot of legitimate English-language mail.
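
To make the limitation concrete, here is a minimal sketch of charset-based language blocking in Python. It is only an illustration, not how any particular filter does it: the charset table is a small, hand-picked assumption (for instance, windows-1251 covers other Cyrillic languages besides Russian), the function names are made up, and it only reads the top-level Content-Type header of a single-part message.

```python
from email import message_from_string

# Assumed, deliberately incomplete table: charsets that strongly suggest one language.
# Shared charsets (iso-8859-1, windows-1252, utf-8, us-ascii) are left out on purpose,
# because they say nothing reliable about the language.
CHARSET_TO_LANGUAGE = {
    "koi8-r": "Russian",
    "windows-1251": "Russian",   # assumption: really "Cyrillic", not only Russian
    "iso-2022-jp": "Japanese",
    "shift_jis": "Japanese",
    "euc-jp": "Japanese",
    "gb2312": "Chinese",
    "big5": "Chinese",
    "euc-kr": "Korean",
    "iso-8859-9": "Turkish",
    "windows-1254": "Turkish",
}


def language_from_charset(raw_message):
    """Guess the language from the declared charset, or return None if ambiguous."""
    msg = message_from_string(raw_message)
    charset = msg.get_content_charset()  # lower-cased charset, or None if not declared
    if charset is None:
        return None
    return CHARSET_TO_LANGUAGE.get(charset)


def should_block(raw_message, blocked_languages):
    """Block only when the charset clearly maps to a language the user has blocked."""
    language = language_from_charset(raw_message)
    return language is not None and language in blocked_languages
```

A message declaring charset=KOI8-R would be caught by a Russian block, but the same text sent as UTF-8 maps to no language at all and sails through, which is exactly the gap described above.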

I’m not going to give away the feature we’re looking at just yet, but it will be much more accurate than simple charset analysis.

In terms of how to expose this to the user, I had a great idea.  The current blocked encodings dialog box in Outlook 2010 looks like this:

[Image: the blocked encodings dialog box in Outlook 2010]

You can see that all the languages are in alphabetical order.  That’s not too bad because, judging by the scroll bar, there don’t appear to be that many languages, and you can navigate through the list pretty easily if you have to.

Our new feature will let you select more languages – something like 90 (I didn’t even know that there were that many detectable languages).  If you want to pick and choose the ones you don’t want to receive anymore, you have to navigate a long list of checkboxes.  And navigating a long list of checkboxes is difficult and non-intuitive, especially if the languages start with letters that aren’t close to each other.  It’s easy for your eye to miss, and if you ever want to check your settings, you have to scroll through and mentally keep track of what you picked because everything won’t fit in one window.

I had an idea to make this better.

Instead of putting all 90 languages (or more) in alphabetical order, why not put the ten languages that users most often ask to block at the top of the list, followed by a dotted line, followed by the rest of the languages?  The bulk of our user base (for now) speaks English, and most of them want ways to block spam in languages other than English.

This way, the languages a user is most likely to pick (19 times out of 20) are right there in front of them, with no scrolling through a long list of choices; we’ve thought ahead and put their choices up front.  It’s like this:

I want to block all mail in Chinese, Russian and Japanese.  I get all this spam in those languages and don’t want it anymore.  Let’s see, how do I do this?

Oh, okay, there’s the Languages button.  Let’s click that… Hey!  The languages I want to block are right there!  <click> <click> <click>  Hmm, I get a lot of Spanish spam and I don’t ever speak to anyone in Spanish, let’s click that one, too.  <click>

Done. <click OK>

The whole process takes 10 seconds or so.  Easy.  But if you have to hunt through an alphabetical list, finding each language takes a second or two (or, more likely, 8-10 seconds), and the helpful suggestions aren’t there either.  Instead of being easy-peasy, it’s just a basic interface.  And how do you check to see what you’ve picked?  You have to memorize the list as you scroll through it, remembering what you clicked.

Thus, my idea is to make it simple for the user by giving the options that almost everyone picks.
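
The ordering itself is simple to express. The sketch below is one way it might look in Python; the function name, the separator string and the sample language lists are all invented for illustration, and the real "most blocked" ranking would have to come from actual complaint data.

```python
SEPARATOR = "- - - - -"  # stands in for the dotted line in the dialog box


def order_language_list(all_languages, most_blocked, separator=SEPARATOR):
    """Return the most-blocked languages first (in ranked order), then a
    separator, then every remaining language in alphabetical order."""
    available = set(all_languages)
    # Keep the ranking order, but only for languages the product can detect.
    top = [lang for lang in most_blocked if lang in available]
    top_set = set(top)
    rest = sorted(lang for lang in all_languages if lang not in top_set)
    if not top:
        return rest  # nothing to promote, so no dotted line either
    return top + [separator] + rest


# Hypothetical data, purely for demonstration.
languages = ["Swedish", "Chinese", "Finnish", "Russian", "Spanish", "Japanese"]
ranking = ["Chinese", "Russian", "Japanese", "Spanish"]
for entry in order_language_list(languages, ranking):
    print(entry)
```

Because the ranking is just an input, the same function could take a different top ten per market, so that Spanish-speaking users see the languages they most often block rather than the ones English speakers do.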

I ran this idea past a couple of folks in Marketing and Legal.  Their opinion was no dice.  Why?  Because a user who speaks a language that we singled out could find it offensive or upsetting.  A Chinese user might say “Hey, why is my language on the top of this list?  Why am I being singled out?”

I have no defense for that.  Yes, if you’re in a particular country and speak a particular language, the language you speak might just be on the list of top languages that people want to block.  That’s the reality.  And you might find that offensive, it’s true.

But it’s not your fault.  It’s the spammers’ fault.  They’re the ones abusing your language.  And lots of English speakers want to block Chinese spam, or Spanish spam, or Russian spam.  But lots of Spanish speakers might want to block English spam.  Lots of German users might want to block Turkish spam, and so forth.  And many other languages won’t make the cut because spammers don’t abuse them, and people never complain about them.  For example, I would not put Swedish on that top 10 list.  I have seen Swedish spam, but it is very rare.  The same goes for Finnish and Catalan.

Spammers pick the languages that people use most often.  There is a long tail of languages that people use, but not all languages are used equally.  That’s a fact.

But it’s a battle I’m not going to win.  Maaaaaah.

Leave a Comment
  • Interesting idea.

    How about making a set of 'top ten' languages so that every country has _its own_ top ten most common languages readily available to block? An algorithm maybe? Or at least exclude the language of the country you're in?

    Or maybe do it the other way round: list the languages you _want_ to receive mail in on top and default to blocking all others? After all, few can read more than 7 languages anyway.

  • How about inverting it? "I only want email in Japanese or English" instead of "I don't want Chinese, Korean, Russian, that squiggly language on Star Trek or Froolian Beep Speak."

    Surely less pain involved in using a highlighter rather than a thick black marker.

  • Per Siden,

    There are many countries that speak more than one language. For example, India, Switzerland, and so on.

  • Our products compare language with country of IP address of the sender and block if there is a mismatch: spammers often send spam from compromised machines worldwide, in a variety of languages. It is rare for legitimate email to originate in one country and be in a different language.

  • Just filter country-code domains (e.g. .cn, .ru) and a lot of spam won't reach your inbox.
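
The inversion suggested in the comments above (name the languages you want and block everything else) could be sketched roughly as follows. This is only an illustration: the function name is made up, it assumes some language detector has already produced a language name (or None when it could not decide), and delivering undetected mail is an assumed policy, not something the post specifies.

```python
def blocks_under_allowlist(detected_language, allowed_languages):
    """Allow-list policy: block any mail whose detected language is not wanted.

    detected_language: language name from a hypothetical detector, or None
    allowed_languages: the set of languages the user wants to receive
    """
    if detected_language is None:
        # Assumption: when detection fails, deliver rather than risk a false positive.
        return False
    return detected_language not in allowed_languages


# "I only want email in Japanese or English"
wanted = {"Japanese", "English"}
print(blocks_under_allowlist("Russian", wanted))   # blocked
print(blocks_under_allowlist("English", wanted))   # delivered
```

With an allow list the user ticks two or three boxes instead of up to 90, which sidesteps both the long-list problem and the question of which languages to put at the top.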
