Terry Zink's Cyber Security Blog

Discussing Internet security in (mostly) plain English

The other half of accurate metrics

Following up on my previous post on accurate metrics, which covered spam-in-the-inbox (SITI): spam is one side of the picture, and false positives (FPs) are the other.

Whereas we measure spam as a proportion of what the user sees, we can measure false positives as a proportion of the user's legitimate mail stream.  I have seen many organizations say that their FP rate is 1/250,000 messages, but that is quite vague.  Is that 250,000 total messages received, spam + nonspam, or is it 1 FP per 250,000 legitimate messages?  If it is per total messages received, then it is pretty easy to hit that metric, because spam volume keeps going up while a person's legitimate mail stream stays about the same, so the denominator grows on its own.

Thus, that leaves us with how many false positives occur per legitimate message.  I would say that 1/100,000 should be the minimum goal to shoot for.  This corresponds to an FP rate of 0.001%.
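
To make the denominator question concrete, here is a minimal Python sketch.  The volumes (100,000 legitimate messages, 900,000 spam, 4 false positives) and the helper names fp_rate and one_in are assumptions picked only for illustration; they are not measurements from any real filter.

```python
def fp_rate(false_positives, denominator):
    """False positives as a fraction of whichever denominator is chosen."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return false_positives / denominator


def one_in(rate):
    """Express a rate as '1 in N' by returning N, rounded to the nearest integer."""
    return round(1 / rate)


# Hypothetical volumes, for illustration only.
legit_messages = 100_000
spam_messages = 900_000
false_positives = 4

per_legit = fp_rate(false_positives, legit_messages)
per_total = fp_rate(false_positives, legit_messages + spam_messages)

print(f"Per legitimate message: 1 in {one_in(per_legit):,}")
print(f"Per total message:      1 in {one_in(per_total):,}")
```

With these made-up numbers the same filter can be quoted as 1 FP in 250,000 (per total mail) or 1 FP in 25,000 (per legitimate mail), which is why the denominator has to be stated.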

The bonus of acquiring both SITI and the FP rate is that we can plot the two metrics on a scatter plot and calculate the correlation coefficient between them to see whether any trend exists (i.e., does a higher SITI correspond to a lower FP rate?).
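
A rough sketch of that correlation check in Python follows.  It uses the Pearson coefficient, which is my assumption about which correlation coefficient is meant, and the paired SITI / FP-rate samples are invented purely to show the mechanics.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired observations (one pair per measurement period).
siti_samples = [0.04, 0.06, 0.08, 0.10, 0.12]
fp_rate_samples = [1 / 20_000, 1 / 40_000, 1 / 60_000, 1 / 90_000, 1 / 120_000]

r = pearson(siti_samples, fp_rate_samples)
print(f"correlation between SITI and FP rate: {r:.2f}")
```

A coefficient near -1 would suggest that a higher SITI goes with a lower FP rate, a value near 0 would suggest no linear relationship, and real data would be needed before drawing either conclusion.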

Once we have both the FP rate and SITI, we need to figure out how a bad FP rate affects SITI.  For example, suppose we have the following:

FP rate = 1/22,000

SITI = 8%

That's a decent spam metric, but a high false positive rate.  If we baseline the FP rate to 1/100,000, how does that affect (increase) the spam-in-the-inbox number?  One way we could look at it is the following:

Baseline = 1/100,000

FP rate = 1/22,000

100,000 / 22,000 = 4.55

Equivalent SITI = 4.55 x 8% = 36%

That's one way of looking at it, but it assumes that an increase in the FP rate corresponds to a proportional increase in SITI, and that is something I just pulled out of the air and probably not reflective of reality.  More work needs to be done in this area to refine this model.
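
For anyone who wants to play with the numbers, the same back-of-the-envelope calculation can be sketched in Python.  The function name equivalent_siti and the cap at 100% are my own additions for illustration, and the proportional scaling is the same unverified assumption described above.

```python
def equivalent_siti(siti, fp_rate, baseline_fp_rate):
    """Scale SITI by how far the FP rate is from the baseline.

    Assumes (without evidence) that SITI scales in proportion to the
    FP-rate ratio.  The result is capped at 1.0 because SITI is a
    proportion; the cap is an addition for this sketch.
    """
    if baseline_fp_rate <= 0:
        raise ValueError("baseline_fp_rate must be positive")
    factor = fp_rate / baseline_fp_rate
    return min(1.0, siti * factor)


# Numbers from the example above.
baseline = 1 / 100_000
observed_fp_rate = 1 / 22_000
observed_siti = 0.08

result = equivalent_siti(observed_siti, observed_fp_rate, baseline)
print(f"Equivalent SITI: {result:.0%}")
```

A filter already at the baseline FP rate gets a factor of 1, so its SITI is left unchanged.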

  • > Is that 250,000 total messages received,
    > spam + nonspam, or is it 1 FP per 250,000
    > legitimate messages?  If it is per total
    > messages received, then it is pretty easy
    > to hit that metric as spam keeps going up
    > but a person's legitimate mail stream stays
    > the same.
    >
    > Thus, that leaves us with how many false
    > positives occur per legitimate mail.
    > I would say that 1/100,000 should be the
    > minimum goal to shoot for. This corresponds
    > to an FP rate of 0.001%.

    I don't understand that. What do you mean by "1/100,000 should be the minimum goal"? Per 100K received messages (spam and ham together) you expect one error (either FP or FN)?

    I know many different companies here in Europe, and I am not aware of any of them reaching this "minimum" with the solution they use (either self-made or from an external supplier).

    And a lot of them use rule-based filtering, and it is damn easy to trigger a simple FP by using certain words in the mail body.

    So 1 FP per 100K messages appears very good to me. Does Hotmail offer this kind of accuracy?

    // Steve

  • In my experience every vendor who quotes a FP figure bases it on the total number of inbound messages (including those that get 5xx-rejected).

    On the one hand... well, they would, wouldn't they? It creates the smallest, most attractive number.

    On the other hand, it is arguably the fairest way to measure FPs, as it reflects the total workload of the spam filter. All those messages have to go through the filter, so it makes sense to reflect them in the calculations.

    Personally, I can see both sides of the argument, but the pragmatic fact is that "the market" measures FPs as a proportion of *total* email, so arguments that they should do otherwise are a bit academic.

  • They count all the inbound traffic? Really? I did not know that. Thanks for explaining.

  • Richi, Stevan,

    Yes, it's counted as FPs per inbound traffic.  That's the way it has been done traditionally but I don't think it's a relevant statistic.  In my next post, I will explain why.
