If you haven't heard of LUN Resync or are not sure why your provider should implement this useful feature, be sure to read Dinesh's excellent introduction post over at the File Cabinet.

Now that you've decided to implement and use Resync, you may want to watch out for these common mistakes / gotchas sent in to us from the field.

Page 80h support is no longer required

Once upon a time, VSS required the implementation of Page 80 identifiers for matching LUNs. This requirement was dropped (and replaced with page 83) in Windows Server 2008. However, this dependency somehow crept back into early implementations of LUN Resync. Luckily one of our partners has spotted this problem, and it has been fixed for the Release Candidate of Windows Server 2008 R2.

Affected builds: Server 2k8 R2 Beta and before
Fixed in: Server 2k8 R2 RC

Page 83h support IS required

While page 80 identifiers are no longer required, page 83 identifiers are. Since any given block on the resync destination can change, VSS must rely on the identifiers that the hardware provider gives us. Without this, we would have no way of knowing which disk was the destination once the resync operation had completed.

Open handles

VSS opens an exclusive handle to each destination disk involved in the resynchronization operation. Because of this, you may run into the following event log entry:

The LUN resynchronization operation failed because the destination disk could not be found or because another application holds an exclusive handle to the destination LUN.

Make sure that all applications have released their handles to the LUN and retry the operation.

More often than not, this message is logged at the very beginning of LUN Resync, before the provider is asked to perform the actual resynchronization operation. The most common culprit has been the Disk Management UI. A fix went into late RC builds and is present in RTM: VSS now retries opening disk handles for 5 seconds before failing. In most cases, this should be enough to avoid the failure (it is for the Disk Management case).

Affected builds: Server 2k8 R2 RC and before
Fixed in: Server 2k8 R2 RTM
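
If your own tooling opens the destination disk around a resync, the same retry-until-deadline idea can be sketched as below. This is only an illustration, not the VSS implementation: open_with_retry and the open_disk callable are hypothetical names, and the 0.25 second polling interval is an assumption (only the 5 second window comes from the description above).

```python
import time


def open_with_retry(open_disk, timeout_s=5.0, interval_s=0.25,
                    clock=time.monotonic, sleep=time.sleep):
    """Call open_disk() until it succeeds or timeout_s has elapsed.

    open_disk is a hypothetical callable that returns a handle or raises
    OSError while another application still holds the disk.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            return open_disk()
        except OSError:
            # Someone else (for example the Disk Management UI) may still
            # hold the disk. Give up only once the deadline has passed.
            if clock() >= deadline:
                raise
            sleep(interval_s)
```

The clock and sleep parameters exist only so the loop can be exercised without real waiting; callers would normally leave them at their defaults.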

Snapshot creation may fail after Resync

There is a known bug that shipped in Server 2008 R2 that affects snapshot creation. Unfortunately it was discovered late in the release and could not be fixed in time. The scenario is as follows:

  1. Perform a resync from volume S' to S
  2. Create a snapshot of volume T
  3. The snapshot creation fails with VSS_E_PROVIDER_VETO, VSS_E_NO_SNAPSHOTS_IMPORTED, or possibly other errors. Sometimes an event log entry complains about corrupt XML. (The exact error and event log entry depend on several factors, one being the provider implementation.)

The Resync operation does not clean up properly, which can corrupt the internal backup components document. There is no risk of actual data corruption. The problem is that the volumes involved in the resync operation carry over into the snapshot creation operation. In the example above, volume S will be included with volume T in step 2. VSS will try to create the snapshot set with both S and T, and one of two things will happen: either a) the hardware provider returns VETO in the LocateLuns call, or b) VSS times out waiting for an extra volume that will never arrive.

Workarounds:

  1. Retry the operation. Even though the previous attempt failed, VSS has now cleaned up its internal state and is good to go.
  2. Restart the VSS service. In an elevated command prompt, run "net stop vss" followed by "net start vss", and then retry the operation.

Affected builds: Server 2k8 R2 RC and beyond (including RTM)
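
A backup script could apply both workarounds in order, roughly as sketched below. Treat this as a sketch under assumptions: create_snapshot is a hypothetical callable standing in for whatever your requestor does, and it is assumed to raise RuntimeError on failure. The only real commands used are the "net stop vss" and "net start vss" calls from the workaround above.

```python
import subprocess


def restart_vss(run=subprocess.run):
    """Restart the VSS service. Requires an elevated prompt on Windows."""
    # "net stop" reports an error when the service is not running,
    # so its exit code is deliberately not checked.
    run(["net", "stop", "vss"], check=False)
    run(["net", "start", "vss"], check=True)


def snapshot_with_workarounds(create_snapshot, restart=restart_vss):
    """Try create_snapshot(); retry once, then restart VSS and retry again.

    create_snapshot is a hypothetical callable that raises RuntimeError
    when snapshot creation fails.
    """
    try:
        return create_snapshot()
    except RuntimeError:
        pass  # Workaround 1: the failed attempt lets VSS clean up its state.
    try:
        return create_snapshot()
    except RuntimeError:
        restart()  # Workaround 2: restart the service before the last try.
    return create_snapshot()  # A failure here propagates to the caller.
```

In the scenario described above, the plain retry should usually be enough, so the service restart is kept as the fallback.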