I've been asked by a few folks to describe the Power Pack 1 algorithms for "auto-migration", duplication, and balancing. In short, the algorithms are the same as before, with a few minor adjustments based on customer feedback from our first release and the various Power Pack betas.

For reference, there is a document located here with more info on the pre-Power Pack 1 algorithms:

http://download.microsoft.com/download/2/F/C/2FC09C20-587F-4F16-AA33-C6C4C75FB3DD/Windows_Home_Server_Drive_Extender.pdf

The decision of which volume a file will be located on is based on one of three questions:

  1. Did the file get created for the first time (auto-migration)
  2. Did the user ask for the file to be duplicated (duplication)
  3. Does enough free space on one volume allow another volume to offload some files (balancing)

Balancing is often triggered by:

  • one of the server's volumes filling up
  • a bunch of files getting deleted
  • a hard disk being added or removed

Auto-migration:

When a file is first created it is said to be "auto-migrated". This basically means a volume with sufficient free space was chosen to hold the file. The choice of volume is designed to keep related files on the same volume, which limits the damage from a disk failure when the folder was not chosen by the user for duplication. Imagine a situation where your music isn't duplicated and you have a physical disk failure: you'd probably prefer to have one or two CDs go missing rather than one or two songs from several albums.

This is the algorithm Drive Extender uses to decide which volume to place your files on:

i. Use the volume with the least amount of free space, as long as it has more than 10GB free

ii. Use the volume with the most free space, so long as it has more free space than the Primary Volume (D volume)

iii. Use the Primary Volume (D volume)
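
To make the three steps concrete, here is a minimal Python sketch of the preference order. It is only an illustration, not the Drive Extender code: the Volume class and the choose_migration_target function are hypothetical names, and I'm assuming that steps i and ii only look at the non-primary volumes and that the size of the incoming file is ignored.

```python
from dataclasses import dataclass
from typing import List

DANGER_GB = 10.0  # the 10GB threshold from step i


@dataclass
class Volume:
    # Hypothetical model of a volume; not a Drive Extender type.
    name: str
    free_gb: float
    is_primary: bool = False


def choose_migration_target(volumes: List[Volume]) -> Volume:
    """Pick a volume for a newly created file using steps i to iii."""
    primary = next(v for v in volumes if v.is_primary)
    # Assumption: steps i and ii consider only non-primary volumes.
    secondaries = [v for v in volumes if not v.is_primary]

    # Step i: least free space, but still more than 10GB free.
    roomy = [v for v in secondaries if v.free_gb > DANGER_GB]
    if roomy:
        return min(roomy, key=lambda v: v.free_gb)

    # Step ii: most free space, if that beats the primary volume.
    if secondaries:
        best = max(secondaries, key=lambda v: v.free_gb)
        if best.free_gb > primary.free_gb:
            return best

    # Step iii: fall back to the primary volume.
    return primary


volumes = [Volume("D", 50, is_primary=True), Volume("E", 100),
           Volume("F", 30), Volume("G", 5)]
print(choose_migration_target(volumes).name)  # F
```

In the example, G is skipped because it has under 10GB free, and F wins over E because it is the fullest volume that still has room.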

If you play this out in your head with imaginary file copies, I think you'll see how it ends up clustering files together. You'll end up filling the volume with the least free space first, then moving on to the next most-used volume and filling that one up.

If we had used a simple "most free" algorithm, we'd get into a situation where, once all volumes had the same amount of free space, we'd interleave every other song/picture across several volumes.
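
If you'd rather not play it out in your head, the difference between the two policies can be simulated in a few lines of Python. This is a toy sketch under my own assumptions: simulate is a hypothetical helper, every file is the same size, and only step i of the real preference order is modelled.

```python
from typing import Dict, List

DANGER_GB = 10.0  # the 10GB threshold from step i


def simulate(free_gb: Dict[str, float], n_files: int,
             file_gb: float, policy: str) -> List[str]:
    """Return the volume chosen for each of n_files equally sized files."""
    free = dict(free_gb)  # work on a copy so the caller's dict is untouched
    placements = []
    for _ in range(n_files):
        roomy = {name: gb for name, gb in free.items() if gb > DANGER_GB}
        if policy == "least_free" and roomy:
            # Step i: fullest volume that still has more than 10GB free.
            target = min(roomy, key=roomy.get)
        else:
            # Simple "most free" policy (also the toy fallback here).
            target = max(free, key=free.get)
        free[target] -= file_gb
        placements.append(target)
    return placements


start = {"E": 100.0, "F": 100.0}
print(simulate(start, 6, 1.0, "least_free"))  # ['E', 'E', 'E', 'E', 'E', 'E']
print(simulate(start, 6, 1.0, "most_free"))   # ['E', 'F', 'E', 'F', 'E', 'F']
```

With two equally empty volumes, the "least free" rule keeps all six files of an album together on E, while "most free" alternates between E and F.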

Duplication:

Auto-migration doesn't duplicate or balance your files; it just chooses where a file first goes. Duplication occurs (in PP1) every hour, with the goal of making sure your chosen files have multiple copies. The algorithm inspects every file, looking for changes since the last duplication. If the file has changed, or if the file has not yet been duplicated, we create a duplicate copy using the preferences below.

Duplication preferences for destination volumes

i. Most empty non-primary volume

ii. Primary volume
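
As a rough Python sketch of those two preferences (again, hypothetical names and not the shipping code): I'm assuming "most empty" means the most free space in GB, and that the duplicate must land on a different volume than the original copy, which the list above doesn't state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    # Hypothetical model of a volume; not a Drive Extender type.
    name: str
    free_gb: float
    is_primary: bool = False


def choose_duplicate_target(volumes: List[Volume],
                            source_name: str) -> Optional[Volume]:
    """Pick where the second copy of a file should go."""
    # Assumption: the copy must not share a volume with the original.
    others = [v for v in volumes if v.name != source_name]

    # Preference i: most empty non-primary volume.
    secondaries = [v for v in others if not v.is_primary]
    if secondaries:
        return max(secondaries, key=lambda v: v.free_gb)

    # Preference ii: the primary volume.
    primaries = [v for v in others if v.is_primary]
    return primaries[0] if primaries else None


volumes = [Volume("D", 50, is_primary=True), Volume("E", 100), Volume("F", 30)]
print(choose_duplicate_target(volumes, source_name="F").name)  # E
```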

The algorithm here is a little different from auto-migration because it will end up interleaving between volumes. We did this because the problem of locality isn't as important since we're in the process of duplicating.

Balancing:

Balancing solves the problem of how to handle disk space imbalances. Just like duplication, balancing occurs every hour in PP1. The need to balance typically arises with the addition of a new volume or when the user deletes a bunch of files. The goal of balancing is to move files off any volume that contains less than 10GB of free space. If this condition happens we say the volume has reached a "danger" level, because it's possible to have a situation where files cannot be extended (imagine your Outlook .pst file getting bigger and bigger over time).  [EDIT - thanks to Brett Pound at Microsoft for asking me to clarify.  I mentioned an Outlook PST file while thinking of running Outlook locally on the Home Server.  Brett kindly pointed out that using an Outlook PST file over the network isn't a good idea.  A better example would have been pasting lots of photos into a Word document or growing a video file by adding in a new video feed.]

The PP1 balancing algorithm starts only when a volume contains less than 10GB of free space. When this occurs, the goal is to move files to volumes with more space until the crowded volume has 20GB free. Think of this as mowing your yard. You wait until your significant other tells you the yard is out of hand, and then you cut it back enough that it looks good and you don't have to do it again for a while. It's the same idea in balancing: we start moving files when the volume has less than 10GB free and we stop at 20GB so we don't have to come back for a while.
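
The start-at-10GB, stop-at-20GB behaviour is a simple two-threshold rule, and a small Python sketch may make it clearer. plan_offload is a hypothetical helper of mine; the order in which files are picked is not described in this post, so the sketch just takes them in the order given.

```python
from typing import List


def plan_offload(free_gb: float, file_sizes_gb: List[float],
                 danger_gb: float = 10.0, target_gb: float = 20.0) -> List[float]:
    """Return the sizes of the files to move off a volume, in order."""
    # Nothing happens until the volume drops below the danger level.
    if free_gb >= danger_gb:
        return []

    moved = []
    for size in file_sizes_gb:
        # Stop once the volume is comfortable again (20GB free).
        if free_gb >= target_gb:
            break
        moved.append(size)
        free_gb += size  # moving a file away frees its space
    return moved


print(plan_offload(15.0, [4.0, 4.0, 4.0]))            # []
print(plan_offload(8.0, [4.0, 4.0, 4.0, 4.0, 4.0]))   # [4.0, 4.0, 4.0]
```

A volume with 15GB free is left alone even though it is under 20GB, while a volume with 8GB free sheds files until it reaches 20GB.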

The algorithm for finding free space is the same as in auto-migration, with the only difference being that we won't push another volume into an unhappy state in order to achieve self happiness. A quick example: while balancing a volume with 5GB of free space, we won't push another volume out of its comfort range by making it go under 10GB.
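
Here is how that extra rule might look in a Python sketch. The function name and the dictionary of free space per volume are my own illustration, and I'm assuming that volumes already under 10GB are never picked as destinations and that the primary volume is left out of the picture.

```python
from typing import Dict, Optional


def choose_balance_target(free_gb: Dict[str, float], source: str,
                          file_gb: float, danger_gb: float = 10.0) -> Optional[str]:
    """Pick a destination for a file being moved off the source volume."""
    # Only keep volumes that would still have at least 10GB free afterwards.
    eligible = {name: gb for name, gb in free_gb.items()
                if name != source and gb - file_gb >= danger_gb}
    if not eligible:
        return None  # nowhere to move the file without causing new trouble
    # Same preference as auto-migration step i: the fullest eligible volume.
    return min(eligible, key=eligible.get)


free = {"E": 5.0, "F": 12.0, "G": 40.0}
print(choose_balance_target(free, source="E", file_gb=3.0))  # G
print(choose_balance_target(free, source="E", file_gb=1.0))  # F
```

Moving a 3GB file to F would leave it with 9GB free, so G is used instead; a 1GB file still fits on F without pushing it under 10GB.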

Defining these numbers is tough, as there are tradeoffs in all directions. If you've got huge hard disks with lots and lots of free space and gigantic files you may prefer bigger numbers (maybe 30GB and 60GB), but if you've got just two drives you may prefer smaller numbers. Our extensive beta program showed 10GB and 20GB to be good numbers.

[error - apologies but I'm wrong here, these keys did not make the cut for the final release of Power Pack 1.  Maybe we can get them into the next update]

However, if you understand the description above, you can reconfigure the 10GB and 20GB lines with the registry keys under

HKLM\Software\Microsoft\DriveExtender (both are DWORDs)

"SecondaryFreeSpaceDangerLevel" (defaults to 10)

"SecondaryFreeSpaceWarningLevel" (defaults to 20)

Both values are measured in GB.