  • The Old New Thing

    What happens when you specify RegexOptions.ECMAScript?

    • 8 Comments

    The RegexOptions.ECMAScript flag changes the behavior of .NET regular expressions. One of the changes I had discussed earlier was with respect to matching digits. For those who want to know more, a summary of the differences is documented in MSDN under the devious title "ECMAScript vs. Canonical Matching Behavior".

    Apparently some people had trouble finding that page, so I figured I'd point to it explicitly.

  • The Old New Thing

    A visual history of spam (and virus) email

    • 156 Comments

    I have kept every single piece of spam and virus email since mid-1997. Occasionally, it comes in handy, for example, to add a naïve Bayesian spam filter to my custom-written email filter. And occasionally I use it to build a chart of spam and virus email.

    The following chart plots every single piece of spam and virus email that arrived at my work email address since April 1997. Blue dots are spam and red dots are email viruses. The horizontal axis is time, and the vertical axis is size of mail (on a logarithmic scale). Darker dots represent more messages. (Messages larger than 1MB have been treated as if they were 1MB.)

    Note that this chart is not scientific. Only mail which makes it past the corporate spam and virus filters shows up on the chart.

    Why does so much spam and virus mail get through the filters? Because corporate mail filters cannot take the risk of accidentally classifying valid business email as spam. Consequently, the filters have to make sure to remove something only if they have extremely high confidence that the message is unwanted.

    Okay, enough dawdling. Let's see the chart.

    Overall statistics and extrema:

    • First message in chart: April 22, 1997.
    • Last message in chart: September 10, 2004.
    • Smallest message: 372 bytes, received March 11, 1998.
      From: 15841.
      To: 15841.
      Subject: About your account...
      Content-Type: text/plain; charset=ISO-8859-1
      Content-Transfer-Encoding: 7bit
      
      P
      
    • Largest message: 1,406,967 bytes, received January 8, 2004. HTML mail with a lot of text and 41 large images. A slightly smaller version was received the previous day. (I guess they figured that their first version wasn't big enough, so they sent out an updated version the next day.)
    • Single worst spam day by volume: January 8, 2004. That one monster message sealed the deal.
    • Single worst spam day by number of messages: August 22, 2002. 67 pieces of spam. The vertical blue line.
    • Single worst virus day: August 24, 2003. This is the winner both by volume (1.7MB) and by number (49). The red splotch.
    • Totals: 227.6MB of spam in roughly 19,000 messages. 61.8MB of viruses in roughly 3,500 messages.

    Things you can see on the chart:

    • Spam went ballistic starting in 2002. You could see it growing in 2001, but 2002 was when it really took off.
    • Vertical blue lines are "bad spam days". Vertical red lines are "bad virus days".
    • Horizontal red lines let you watch the lifetime of a particular email virus. (This works only for viruses with a fixed-size payload. Viruses with variable-size payload are smeared vertically.)
    • The big red splotch in August 2003 around the 100K mark is the Sobig virus.
    • The horizontal line in 2004 that wanders around the 2K mark is the Netsky virus.
    • For most of this time, the company policy on spam filtering was not to filter it out at all, because all the filters they tried had too high a false-positive rate. (I.e., they were rejecting too many valid messages as spam.) You can see that in late 2003, the blue dot density diminished considerably. That's when mail administrators found a filter whose false-positive rate was low enough to be acceptable.

    As a comparison, here's the same chart based on email received at one of my inactive personal email addresses.

    This particular email address has been inactive since 1995; all the mail it gets is therefore from harvesting done prior to 1995. (That's why you don't see any red dots: None of my friends have this address in their address book since it is inactive.) The graph doesn't go back as far because I didn't start saving spam from this address until late 2000.

    Overall statistics and extrema:

    • First message in chart: September 2, 2000.
    • Last message in chart: September 10, 2004.
    • Smallest message: 256 bytes, received July 24, 2004.
      Received: from dhcp065-025-005-032.neo.rr.com ([65.25.5.32]) by ...
               Sat, 24 Jul 2004 12:30:35 -0700
      X-Message-Info: 10
      
    • Largest message: 3,661,900 bytes, received April 11, 2003. Mail with four large bitmap attachments, each of which is a Windows screenshot of Word with a document open, each bitmap showing a different page of the document. Perhaps one of the most inefficient ways of distributing a four-page document.
    • Single worst spam day by volume: April 11, 2003. Again, the monster message drowns out the competition.
    • Single worst spam day by number of messages: October 3, 2003. 74 pieces of spam.
    • Totals: 237MB of spam in roughly 35,000 messages.

    I cannot explain the mysterious "quiet period" at the beginning of 2004. Perhaps my ISP instituted a filter for a while? Perhaps I didn't log on often enough to pick up my spam and it expired on the server? I don't know.

    One theory is that the lull was due to uncertainty created by the CAN-SPAM Act, which took effect on January 1, 2004. I don't buy this theory since there was no significant corresponding lull at my other email account, and follow-up reports indicate that CAN-SPAM was widely disregarded. Even in its heyday, compliance was only 3%.

    Curiously, the trend in spam size for this particular account is that it has been going down since 2002. In the previous chart, you could see a clear upward trend since 1997. My theory is that since this second dataset is more focused on current trends, it missed out on the growth trend in the late 1990s and instead is seeing the shift in spam from text to <IMG> tags.

  • The Old New Thing

    Interlocked operations don't solve everything

    • 17 Comments

    Interlocked operations are a high-performance way of updating DWORD-sized or pointer-sized values in an atomic manner. Note, however, that this doesn't mean that you can avoid the critical section.

    For example, suppose you have a critical section that protects a variable, and in some other part of the code, you want to update the variable atomically. "Well," you say, "this is a simple increment, so I can skip the critical section and just do a direct InterlockedIncrement. Woo-hoo, I avoided the critical section bottleneck."

    Well, except that the purpose of that critical section was to ensure that nobody changed the value of the variable while the protected section of code was running. You just ran in and changed the value behind that code's back.

    Conversely, some people suggested emulating complex interlocked operations by having a critical section whose job it was to protect the variable. For example, you might have an InterlockedMultiply that goes like this:

    // Wrong!
    LONG InterlockedMultiply(volatile LONG *plMultiplicand, LONG lMultiplier)
    {
      EnterCriticalSection(&SomeCriticalSection);
      LONG lResult = *plMultiplicand *= lMultiplier;
      LeaveCriticalSection(&SomeCriticalSection);
      return lResult;
    }
    

    While this code does protect against two threads performing an InterlockedMultiply against the same variable simultaneously, it fails to protect against other code performing a simple atomic write to the variable. Consider the following:

    volatile LONG x = 2;

    void Thread1()
    {
      InterlockedIncrement(&x);
    }

    void Thread2()
    {
      InterlockedMultiply(&x, 5);
    }
    

    If the InterlockedMultiply were truly interlocked, the only valid results would be x=15 (if the interlocked increment beat the interlocked multiply) or x=11 (if the interlocked multiply beat the interlocked increment). But since it isn't truly interlocked, you can get other weird values:

    x = 2 at start

    Thread 1                           Thread 2
                                       InterlockedMultiply(&x, 5)
                                       EnterCriticalSection
                                       load x (loads 2)
    InterlockedIncrement(&x);
    x is now 3
                                       multiply by 5 (result: 10)
                                       store x (stores 10)
                                       LeaveCriticalSection

    x = 10 at end; the InterlockedIncrement was lost

    Oh no, our interlocked multiply isn't very interlocked after all! How can we fix it?

    If the operation you want to perform is a function solely of the starting numerical value and the other function parameters (with no dependencies on any other memory locations), you can write your own interlocked-style operation with the help of InterlockedCompareExchange.

    LONG InterlockedMultiply(volatile LONG *plMultiplicand, LONG lMultiplier)
    {
      LONG lOriginal, lResult;
      do {
        lOriginal = *plMultiplicand;        // 1. capture the value from memory
        lResult = lOriginal * lMultiplier;  // 2. compute the desired result
        // 3. store the result only if the value in memory has not changed;
        //    otherwise somebody raced with us, so loop back and try again
      } while (InterlockedCompareExchange(plMultiplicand,
                                          lResult, lOriginal) != lOriginal);
      return lResult;
    }
    

    [Typo in algorithm fixed 9:00am.]

    To perform a complicated function on the multiplicand, we perform three steps.

    First, capture the value from memory: lOriginal = *plMultiplicand;

    Second, compute the desired result from the captured value: lResult = lOriginal * lMultiplier;

    Third, store the result provided the value in memory has not changed: InterlockedCompareExchange(plMultiplicand, lResult, lOriginal)

    If the value did change, then this means that the interlocked operation was unsuccessful because somebody else changed the value while we were busy doing our computation. In that case, loop back and try again.

    If you walk through the scenario above with this new InterlockedMultiply function, you will see that after the interloping InterlockedIncrement, the loop will detect that the value of "x" has changed and restart. Since the final update of "x" is performed by an InterlockedCompareExchange operation, the result of the computation is trusted only if "x" did not change value.
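
    To make the scenario concrete, here is a minimal test harness that races the two threads against each other; the thread scaffolding (_beginthreadex, the thread function names) is mine, not part of the original example:

    #include <windows.h>
    #include <process.h>
    #include <stdio.h>

    volatile LONG x = 2;

    // The corrected InterlockedMultiply from above.
    LONG InterlockedMultiply(volatile LONG *plMultiplicand, LONG lMultiplier)
    {
      LONG lOriginal, lResult;
      do {
        lOriginal = *plMultiplicand;
        lResult = lOriginal * lMultiplier;
      } while (InterlockedCompareExchange(plMultiplicand,
                                          lResult, lOriginal) != lOriginal);
      return lResult;
    }

    unsigned __stdcall IncrementThread(void *)
    {
      InterlockedIncrement(&x);
      return 0;
    }

    unsigned __stdcall MultiplyThread(void *)
    {
      InterlockedMultiply(&x, 5);
      return 0;
    }

    int main()
    {
      HANDLE h[2];
      h[0] = (HANDLE)_beginthreadex(NULL, 0, IncrementThread, NULL, 0, NULL);
      h[1] = (HANDLE)_beginthreadex(NULL, 0, MultiplyThread, NULL, 0, NULL);
      WaitForMultipleObjects(2, h, TRUE, INFINITE);
      // Only 11 or 15 is possible now; the lost-update result 10 cannot occur.
      printf("x = %ld\n", (long)x);
      CloseHandle(h[0]);
      CloseHandle(h[1]);
      return 0;
    }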

    Note that this technique works only if the operation being performed is a pure function of the memory value and the function parameters. If you have to access other memory as part of the computation, then this technique will not work! That's because those other memory locations might have changed during the computation and you would have no way of knowing, since InterlockedCompareExchange checks only the memory value being updated.

    Failure to heed the above note results in problems such as the so-called "ABA Problem". I'll leave you to google on that term and read about it. Fortunately, everybody who talks about it also talks about how to solve the ABA Problem, so I'll leave you to read that, too.

    Once you've read about the ABA Problem and its solution, you should be aware that the solution has already been implemented for you, via the Interlocked SList functions.
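
    Here's a minimal sketch of those functions in action; the ItemEntry structure and its integer payload are illustrative inventions of mine. Note the documented alignment requirement: entries must be aligned to MEMORY_ALLOCATION_ALIGNMENT, hence the _aligned_malloc.

    #include <windows.h>
    #include <malloc.h>
    #include <stdio.h>

    // The SLIST_ENTRY must come first so we can cast between the two types.
    struct ItemEntry
    {
        SLIST_ENTRY entry;
        int value;  // illustrative payload
    };

    int main()
    {
        SLIST_HEADER head;
        InitializeSListHead(&head);

        // Pushes are atomic and lock-free, and the implementation
        // already guards against the ABA Problem.
        for (int i = 0; i < 3; i++) {
            ItemEntry *item = (ItemEntry *)_aligned_malloc(
                sizeof(ItemEntry), MEMORY_ALLOCATION_ALIGNMENT);
            item->value = i;
            InterlockedPushEntrySList(&head, &item->entry);
        }

        // Pop entries (in LIFO order) until the list is empty.
        PSLIST_ENTRY e;
        while ((e = InterlockedPopEntrySList(&head)) != NULL) {
            ItemEntry *item = (ItemEntry *)e;
            printf("popped %d\n", item->value);
            _aligned_free(item);
        }
        return 0;
    }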

  • The Old New Thing

    The x86 architecture is the weirdo

    • 67 Comments

    The x86 architecture does things that almost no other modern architecture does, but due to its overwhelming popularity, people think that the x86 way is the normal way and that everybody else is weird.

    Let's get one thing straight: The x86 architecture is the weirdo.

    The x86 has a small number (8) of general-purpose registers; the other modern processors have far more. (PPC, MIPS, and Alpha each have 32; ia64 has 128.)

    The x86 uses the stack to pass function parameters; the others use registers.

    The x86 forgives access to unaligned data, silently fixing up the misalignment. The others raise a misalignment exception, which can optionally be emulated by the supervisor at an amazingly huge performance penalty.

    The x86 has variable-sized instructions. The others use fixed-sized instructions. (PPC, MIPS, and Alpha each have fixed-sized 32-bit instructions; ia64 has fixed-sized 41-bit instructions. Yes, 41-bit instructions.)

    The x86 has a strict memory model, where external memory access matches the order in which memory accesses are issued by the code stream. The others have weak memory models, requiring explicit memory barriers to ensure that issues to the bus are made (and completed) in a specific order. (A sketch of what such a barrier looks like appears after this list.)

    The x86 supports atomic load-modify-store operations. None of the others do.

    The x86 passes function return addresses on the stack. The others use a link register.
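
    Here is what the publish-with-a-barrier pattern from the memory-model item looks like in practice; a minimal sketch, assuming the Win32 MemoryBarrier() macro (which expands to the appropriate fence instruction for each architecture):

    #include <windows.h>

    int g_data;              // the payload
    volatile LONG g_ready;   // the "data is ready" flag

    void Producer()
    {
        g_data = 42;       // store the payload
        MemoryBarrier();   // make sure the payload is visible...
        g_ready = 1;       // ...before the flag is raised
    }

    void Consumer()
    {
        while (!g_ready) { /* spin */ }
        MemoryBarrier();   // don't let reads of g_data move ahead of the flag
        // g_data is guaranteed to be 42 here
    }

    On x86, the strict memory model keeps the two stores in program order anyway; on a weak-memory architecture, omitting the barriers means the consumer can see g_ready == 1 while g_data is still stale.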

    Bear this in mind when you write what you think is portable code. Like many things, the culture you grow up with is the one that feels "normal" to you, even if, in the grand scheme of things, it is one of the more bizarre ones out there.

  • The Old New Thing

    How does Windows exploit hyperthreading?

    • 42 Comments

    It depends which version of Windows you're asking about.

    For Windows 95, Windows 98, and Windows Me, the answer is simple: Not at all. These are not multiprocessor operating systems.

    For Windows NT and Windows 2000, the answer is "It doesn't even know." These operating systems are not hyperthreading-aware because they were written before hyperthreading was invented. If you enable hyperthreading, then each of your CPUs looks like two separate CPUs to these operating systems. (And will get charged as two separate CPUs for licensing purposes.) Since the scheduler doesn't realize the connection between the virtual CPUs, it can end up doing a worse job than if you had never enabled hyperthreading to begin with.

    Consider a dual-hyperthreaded-processor machine. There are two physical processors A and B, each with two virtual hyperthreaded processors, call them A1, A2, B1, and B2.

    Suppose you have two CPU-intensive tasks. As far as the Windows NT and Windows 2000 schedulers are concerned, all four processors are equivalent, so they figure it doesn't matter which two they use. And if you're unlucky, they'll pick A1 and A2, forcing one physical processor to shoulder two heavy loads (each of which will probably run at something between half-speed and three-quarter speed), leaving physical processor B idle, completely unaware that they could have done a better job by putting one task on A1 and the other on B1.

    Windows XP and Windows Server 2003 are hyperthreading-aware. When faced with the above scenario, those schedulers will know that it is better to put one task on one of the A's and the other on one of the B's.
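
    As an aside, software can ask which logical processors share a physical core. Here's a hedged sketch using GetLogicalProcessorInformation; note that this API arrived in later versions of Windows than most of those discussed here, so treat it as illustrative:

    #include <windows.h>
    #include <stdio.h>
    #include <vector>

    int main()
    {
        // First call fails with ERROR_INSUFFICIENT_BUFFER and reports the size.
        DWORD len = 0;
        GetLogicalProcessorInformation(NULL, &len);
        if (len == 0) return 1;
        std::vector<SYSTEM_LOGICAL_PROCESSOR_INFORMATION> info(
            len / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION));
        if (!GetLogicalProcessorInformation(&info[0], &len)) return 1;

        int cores = 0, logical = 0;
        for (size_t i = 0; i < info.size(); i++) {
            if (info[i].Relationship == RelationProcessorCore) {
                cores++;
                // Each set bit in ProcessorMask is one logical processor
                // (a hyperthreaded sibling) on this physical core.
                for (ULONG_PTR m = info[i].ProcessorMask; m; m &= m - 1)
                    logical++;
            }
        }
        printf("%d physical cores, %d logical processors\n", cores, logical);
        return 0;
    }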

    Note that even with a hyperthreading-aware scheduler, you can concoct pathological scenarios where hyperthreading ends up a net loss. (For example, if you have four tasks, two of which rely heavily on L2 cache and two of which don't, you'd be better off putting each of the L2-intensive tasks on a separate physical processor, since the L2 cache is shared by the two virtual processors. Putting them both on the same physical processor would result in a lot of L2-cache misses as the two tasks fight over L2 cache slots.)

    When you go to the expensive end of the scale (the Datacenter Servers, the Enterprise Servers), things get tricky again. I refer still-interested parties to the Windows Support for Hyper-Threading Technology white paper.

    Update 06/2007: The white paper appears to have moved.

  • The Old New Thing

    Sometimes the bug isn't apparent until late in the game

    • 42 Comments

    I didn't debug it personally, but I know the people who did. During Windows XP development, a bug report arrived for a computer game that crashed only after you got to one of the higher levels.

    After many saved and restored games, the problem was finally identified.

    The program does its video work in an offscreen buffer and transfers it to the screen when it's done. When it draws text with a shadow, it first draws the text in black, offset down one and right one pixel, then draws it again in the foreground color.

    So far so good.

    Except that it didn't check whether moving down and right one pixel was going to go beyond the end of the screen buffer.

    That's why it took until one of the higher levels before the bug manifested itself. Not until then did you accomplish a mission whose name contained a lowercase letter with a descender! Shifting the descender down one pixel caused the bottom row of pixels in the character to extend past the video buffer and start corrupting memory.
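
    In code, the bug has roughly this shape; a hedged reconstruction, where all the names are mine, not the game's:

    // The clipping check the game was missing. Without it, drawing the
    // shadow for a descender on the bottom row writes at y == height,
    // one row past the end of the offscreen buffer, corrupting whatever
    // the heap happened to place after it.
    void PutPixel(unsigned char *buf, int width, int height,
                  int x, int y, unsigned char c)
    {
        if (x < 0 || x >= width || y < 0 || y >= height) return;
        buf[y * width + x] = c;
    }

    void DrawShadowedPixel(unsigned char *buf, int w, int h,
                           int x, int y, unsigned char fg)
    {
        PutPixel(buf, w, h, x + 1, y + 1, 0);   // shadow: black, offset (+1, +1)
        PutPixel(buf, w, h, x, y, fg);          // then the foreground color
    }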

    Once the problem was identified, fixing it was comparatively easy. The application compatibility team has a bag of tricks, and one of them is called "HeapPadAllocation". This particular compatibility fix adds padding to every heap allocation so that when a program overruns a heap buffer, all that gets corrupted is the padding. Enable that fix for the bad program (specifying the amount of padding necessary, in this case, one row's worth of pixels), and run through the game again. No crash this time.

    What made this interesting to me was that you had to play the game for hours before the bug finally surfaced.

  • The Old New Thing

    Storsjöodjur hunting season will open soon

    • 4 Comments

    Scotland doesn't have the corner on monsters in lakes. You'll also find them in Norway, in Sweden (read about a recent expedition), and in Canada, among many, many others. Anywhere there are lakes, there's bound to be a legend about a monster in one of them.

    It appears, however, that Sweden's Storsjöodjur is about to lose its protected species status, owing to an inquiry inspired by a man's request to harvest the creature's eggs so he could hatch them.

    As a result, it will soon be open season on Storsjöodjuret. Happy hunting.

    (I find the Swedish word odjur somewhat poetic. It translates as "monster" but literally means "un-animal".)

  • The Old New Thing

    Why isn't the original window order always preserved when you undo a Show Desktop?

    • 52 Comments

    A commenter asked why the original window order is not always preserved when you undo a Show Desktop.

    The answer is "Because the alternative is worse."

    Guaranteeing that the window order is restored can result in Explorer hanging.

    When you undo a Show Desktop, Explorer goes through and asks each window that it had minimized to restore itself. If each window is quick to respond, then the windows are restored and the order is preserved.

    However, if there is a window that is slow to respond (or even hung), then it loses its chance and Explorer moves on to the next window in the list. That way, a hung window doesn't cause Explorer to hang, too. But it does mean that the windows restore out of order.
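
    The general pattern for talking to windows that might be hung looks something like this; a minimal sketch, assuming a hypothetical list of windows to restore. (Explorer's actual undo-Show-Desktop mechanism is internal; the WM_SYSCOMMAND/SC_RESTORE message here is just an illustrative stand-in.)

    #include <windows.h>

    void RestoreWindows(HWND *windows, int count)
    {
        for (int i = 0; i < count; i++) {
            DWORD_PTR result;
            // SMTO_ABORTIFHUNG returns immediately if the target window
            // is not responding, so one hung window can't stall the whole
            // loop; it just loses its turn, and the original order is no
            // longer guaranteed.
            SendMessageTimeout(windows[i], WM_SYSCOMMAND, SC_RESTORE, 0,
                               SMTO_ABORTIFHUNG, 2000, &result);
        }
    }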

  • The Old New Thing

    Why is the page size on ia64 8K?

    • 31 Comments

    On x86 machines, Windows chooses a page size of 4K because that was the only page size supported by that architecture at the time the operating system was designed. (4MB pages were added to the CPU later, in the Pentium as I recall, but clearly that is too large for everyday use.)

    For the ia64, Windows chose a page size of 8K. Why 8K?

    It's a balance between two competing objectives. Large page sizes allow more efficient I/O since you are reading twice as much data at one go. However, large page sizes also increase the likelihood that the extra I/O you perform is wasted because of poor locality.

    Experiments were run on the ia64 with various page sizes (even with 64K pages, which were seriously considered at one point), and 8K provided the best balance.

    Note that changing the page size creates all sorts of problems for compatibility. There are large numbers of programs out there that blindly assume that the page size is 4K. Boy are they in for a surprise.
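
    If you don't want your program to be one of those surprised programs, ask the system instead of hard-coding 4096. A minimal sketch using GetSystemInfo:

    #include <windows.h>
    #include <stdio.h>

    int main()
    {
        SYSTEM_INFO si;
        GetSystemInfo(&si);
        printf("page size: %lu bytes\n", si.dwPageSize);
        // dwAllocationGranularity (usually 64K) is the unit VirtualAlloc
        // reserves address space in; hard-coding that is another common trap.
        printf("allocation granularity: %lu bytes\n", si.dwAllocationGranularity);
        return 0;
    }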

  • The Old New Thing

    Converting a byte[] to a System.String

    • 7 Comments

    For some reason, this question gets asked a lot. How do I convert a byte[] to a System.String? (Yes, this is a CLR question. Sorry.)

    You can use the System.Text.UnicodeEncoding.GetString() method, which takes a byte[] array and produces a string.

    Note that this is not the same as just blindly copying the bytes from the byte[] array into a hunk of memory and calling it a string. The GetString() method must validate the bytes and forbid invalid surrogates, for example.

    You might be tempted to create a string and just mash the bytes into it, but that violates string immutability and can lead to subtle problems.
