Larry Osterman's WebLog

Confessions of an Old Fogey

Concurrency, part 11 - Hidden scalability issues


So you're writing a server.  You've done your research, and you've designed your system to be as scalable as you possibly can.

All your linked lists are interlocked lists, your app uses only one thread per CPU core, you're using fibers to manage your scheduling so that you make full use of your quanta, you've set each thread's processor affinity so that it's locked to a single CPU core, etc.

So you're done, right?

Well, no.  The odds are pretty good that you've STILL got concurrency issues.  But they were hidden from you because the concurrency issues aren't in your application, they're elsewhere in the system.

This is what makes programming for scalability SO darned hard.

So here are some of the common places where scalability issues hide.

The biggest one (from my standpoint, although the relevant people on the base team get on my case whenever I mention it) is the NT heap manager.  When you create a heap with HeapCreate, unless you specify the HEAP_NO_SERIALIZE flag, the heap will have a critical section associated with it (and the process heap is a serialized heap).

What this means is that every time you call LocalAlloc() (or HeapAlloc, or HeapFree, or any other heap APIs), you're entering a critical section.  If your application performs a large number of allocations, then you're going to be acquiring and releasing this critical section a LOT.  It turns out that this single critical section can quickly become the hottest critical section in your process.   And the consequences of this can be absolutely huge.  When I accidentally checked in a change to the Exchange store's heap manager that reduced the number of heaps used by the Exchange store from 5 to 1, the overall performance of the store dropped by 15%.  That 15% reduction in performance was directly caused by serialization on the heap critical section.

The good news is that the base team knows that this is a big deal, and they've done a huge amount of work to reduce contentions on the heap.   For Windows Server 2003, the base team added support for the "low fragmentation heap", which can be enabled with a call to HeapSetInformation.  One of the benefits of switching to the low fragmentation heap (along with the obvious benefit of reducing heap fragmentation) is that the LFH is significantly more scalable than the base heap.

And there are other sources of contention that can occur below your application.  In fact, many of the base system services have internal locks and synchronization structures that could cause your application to block - for instance, if you didn't open your file handles for overlapped I/O, then the I/O subsystem acquires an auto-reset event across all file operations on the file.  This is done entirely under the covers, but can potentially cause scalability issues.

And there are scalability issues that come from physics as well. For example, yesterday Jeff Parker asked about ripping CDs in Windows Media Player. It turns out that there's no point in dedicating more than one thread to reading data from the CD, because the CDROM drive has only one head - it can't read from two locations simultaneously (and on CDROM drives, head motion is particularly expensive). The same laws of physics hold for all physical media - I touched on this in the answers to the "What's wrong with this code, part 9" post. You can't speed up hard disk copies by throwing more threads or overlapped I/O at the problem, because copy speed is ultimately limited by the physical speed of the underlying media - and with only one spindle, the drive can service only one read or write at a time.

But even if you've identified all the bottlenecks in your application, and added disks to ensure that your I/O is as fast as possible, there STILL may be bottlenecks that you've not yet seen.

Next time, I'll talk about those bottlenecks...

  • If a heap's critical section is a bottleneck, maybe the solution is to use less dynamic memory allocation in performance-critical code.
  • Runtime, sure - avoiding the bottleneck is always a good idea.

    But sometimes it's not possible to avoid the bottleneck. And the LFH helps hugely in that scenario.
  • Thanks Larry, you confirmed my suspicions. I had wondered if somehow you were overcoming the laws of physics or had just found a workaround. I've taken CD-ROM drives apart before, and I know there's just one head. I take hardware apart constantly to see what makes it tick and whether there are any cool parts I can use for anything else. For example, on my desk is a really nice clock on a circuit board, but the face of the clock is platters from a hard drive. Make new things out of old things.
  • I'm going off-topic again, but your mention of "added disks to ensure that your I/O is as fast as possible" reminds me that I've always wanted to know what happens when, in XP, you manually set up pagefiles in multiple places (on different physical HDDs, of course). Does the OS look at the amount of I/O in the queue and use that to determine (when possible - i.e., when writing new data) which pagefile to use? It's easy to tell that with only a single HDD there is a lot of head seeking going on if an app is trying to read in new data while old data has to be paged out at the same time. Of course, when possible, one would have an HDD holding only a pagefile and some rarely-accessed backup/archival data.
  • That's ok Joku.

    First off, on XP, you can legally have only one pagefile.

    Having said that, I'm not sure. I don't believe there are weighting algorithms specially tuned to the paging file, instead, I believe there are generic disk queuing algorithms. You ALWAYS want your paging file to be somewhere other than your server's data drives though.
  • You can *definitely* have more than one pagefile on XP - I'm running two right now.

    Read all about it in KB 237740.

  • Or just go to Control Panel / System / Advanced / Performance / Advanced / Virtual Memory
  • There's more to having separate hard drives. Most consumer PCs use IDE drives, which run the bus at the speed of the slowest device and synchronize operations. So if you connect your second hard drive to the same IDE channel, you gain almost nothing, as the I/O on those two drives will be synchronized. And if you connect your second hard drive together with your CD/DVD drive it gets even worse, as the I/O operations to that drive are synchronized with operations on your CD/DVD drive. More physical drives make sense on SCSI or at least SATA controllers, not on IDE - unless you get a separate channel for each device (which is what SATA does).

    Oh and XP Pro can have multiple page files, the limit is one per partition (not per drive). Also, the partition has to be mounted with a drive letter, you can't put a page file on a partition that's mounted into a folder.
  • Wow. The concurrency issues here already go much deeper than anything I could think of at the beginning. (Most of what I could think of was deadlock problems - programming logic that can go wrong in any language that supports multithreading.)

    I've learned new techniques here, but most of them seem to require C++ or some other language that doesn't hide the details from the programmer. Is there any level of control available if I'm using a .NET language such as VB or C#?
  • M Hotchkin, I was unaware we'd documented the registry keys for more than one paging file - that's why I said "legally".

    Jerry - I'd forgotten that - of course more than one pagefile per spindle is a total waste, so...

    Cheong, I'm hoping to talk about .Net but since I'm not a .Net programmer I unfortunately don't have the knowledge to go into this deeply. However most of the issues I mentioned (except for the reference counting issues) are absolutely valid for managed code.
  • > First off, on XP, you can legally have only one pagefile.

    WTF, where the heck did that come from?

    Joku, for everything you ever wanted to know (or didn't) about pagefiles and other low-level details, read some of DriverGuru's posts at

    That guy knows his stuff.
  • minor spaz: "you're app" should be "your app"
  • M Knight: I KNOW that you can have more than one paging file. But to my knowledge, when I wrote that, the mechanism for having more than one paging file per drive wasn't documented.

    My mistake was "more than one paging file" should have been "more than one paging file per drive".
  • Take a look at KB 237740 again. From the last sentence of the first paragraph (summary section): "you can create multiple paging files on a single drive by placing them in separate folders"
    I don't know why anyone would want to do that, though.

    If Exchange was bottlenecked on the NT heap, did you look into using a different heap? I thought I remembered some team using Rockall, but I never spent the time learning about it. Not sure what info on it is public. Rockalldll.dll is evidently a publicly-known file (released with what product?), so I figured it was ok to mention Rockall at least.
  • Drew, of course we used our own heap - at one time, it was the mpheap SDK sample, but it's been tweaked HEAVILY for Exchange's use (Exchange has a huge issue with heap fragmentation, so it uses a primitive version of the LFH, for example).

  • 3/4/2005 4:58 PM M. Hotchin

    > Or just go Control Panel / System /Advanced /Performance / Advanced / Virtual Memory

    Yes, that makes me wonder why Mr. Osterman was concerned about whether registry keys were documented or not.

    Now if anyone knows how to make XP obey the settings that can be specified in Control Panel / System / Advanced / Performance / Advanced / Virtual Memory, please say. In my experience it is possible to have one or more pagefiles on the partitions you specified - until about two reboots later. After that, you suddenly have just one pagefile, and it's on your C drive, even when your C drive is a miniature little thing holding just 4 files for emergency use from MS-DOS boot floppies and doesn't have room to hold an entire image of your RAM.

    A different Knowledge Base article admits to the problem and says to download a patch from Intel, but Intel doesn't provide patches either for the Intel ICH5 (non-R) chipset or for an Acer Labs chipset. If anyone knows how to make Windows XP obey the settings, please say.

    3/4/2005 11:35 PM M Knight

    > Joku, for everything you ever wanted to know
    > (or didnt) on pagefiles and other low level
    > details read some of DriverGuru posts at

    Any chance you might have any specific links you can post? I don't see an obvious way to search that site.