Failover Clustering and Network Load Balancing Team Blog
Cluster Shared Volumes (CSV) is a layer of abstraction on either the ReFS or NTFS file system (which is used to format the underlying private cloud storage). Just as with a non-CSV volume, at times it may be necessary to run ChkDsk and Defrag on the file system. In this blog post, I will first address the recommended procedure to run Defrag on your CSV in Windows Server 2012 R2. I will then discuss how ChkDsk is run on your CSVs.
Fragmentation of files on a CSV can impact the perceived file system performance by increasing the seek time to retrieve file system metadata. It is therefore recommended to periodically run Defrag on your CSV. Fragmentation is primarily a concern when running dynamic VHDs and is less prevalent with static VHDs. On a stand-alone server, defrag runs as part of the “Maintenance Task”, so it runs automatically. However, on a CSV it will never run automatically, so you need to run it manually or script it to run (potentially using a Clustered Scheduled Task). It is recommended to conduct this process during non-peak production times, as performance may be impacted. The following are the steps to defragment your CSV:
1. Determine if defragmentation is required for your CSV by running the following on an elevated command prompt:
Defrag.exe <CSV Mount Point> /A /U /V
/A Perform analysis on the specified volumes
/U Print the progress of the operation on the screen
/V Print verbose output containing the fragmentation statistics
2. If defragmentation is required for your CSV, put the CSV into redirected mode. This can be achieved in either of the following ways:
a. Using Windows PowerShell, open a new elevated Windows PowerShell console and run the following:
Suspend-ClusterResource <Cluster Disk Name> -RedirectedAccess
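b. Using Failover Cluster Manager, right-click the CSV and select “Turn On Redirected Access”.
Note: If you attempt to run defrag on a CSV that is not in redirected mode, it fails with the following error:
CSVFS failed operation as volume is not in redirected mode. (0x8007174F)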
3. Run defrag on your CSV by running the following on an elevated command prompt:
Defrag.exe <CSV Mount Point>
4. Once defrag has completed, revert the CSV back into direct mode by using either of the following methods:
a. Using Windows PowerShell, run the following in an elevated Windows PowerShell console:
Resume-ClusterResource <Cluster Disk Name>
b. Using Failover Cluster Manager, right-click the CSV and select “Turn Off Redirected Access”.
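The steps above can be sketched as a small planning helper. This is only an illustration, not an official tool: it assumes the verbose analysis report contains a line of the form "Total fragmented space = NN%", and the function names, the threshold, and the example disk name "Cluster Disk 1" are all hypothetical. It only builds the command strings for steps 2 to 4 in order; it does not execute anything.

```python
import re
from typing import List, Optional

# Hypothetical cut-off, chosen only for this example; it is not a value
# recommended by the article.
MIN_FRAGMENTATION_PERCENT = 10


def fragmentation_percent(report: str) -> Optional[int]:
    """Extract the fragmentation percentage from defrag analysis text.

    ASSUMPTION: the report contains a line such as
    'Total fragmented space = 52%'. Returns None if no such line is found.
    """
    match = re.search(r"Total fragmented space\s*=\s*(\d+)\s*%", report)
    return int(match.group(1)) if match else None


def defrag_plan(mount_point: str, disk_name: str) -> List[str]:
    """Return the commands for steps 2-4, in the order they should run."""
    return [
        # Step 2: put the CSV into redirected mode.
        f'Suspend-ClusterResource "{disk_name}" -RedirectedAccess',
        # Step 3: defragment the volume.
        f"Defrag.exe {mount_point}",
        # Step 4: return the CSV to direct mode.
        f'Resume-ClusterResource "{disk_name}"',
    ]


if __name__ == "__main__":
    # Sample text standing in for the output of step 1 (assumed format).
    sample_report = "Fragmentation:\n    Total fragmented space      = 52%\n"
    percent = fragmentation_percent(sample_report)
    if percent is not None and percent >= MIN_FRAGMENTATION_PERCENT:
        for command in defrag_plan(r"C:\ClusterStorage\Volume1", "Cluster Disk 1"):
            print(command)
```

Keeping the plan as plain strings makes the order easy to review before anything touches the cluster. Whatever runs these commands should make sure the last one (Resume-ClusterResource) still runs if defrag fails, so the CSV is not left in redirected mode.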
During the lifecycle of your file system, corruptions may occur that require resolution through ChkDsk. CSVs in Windows Server 2012 R2 also support the ReFS file system. However, the ReFS file system achieves self-healing through integrity checks on metadata. As a consequence, ChkDsk does not need to be run for CSVs formatted with ReFS, so this discussion is scoped to corruptions in CSVs with the NTFS file system. Also note the redesigned ChkDsk operation introduced with Windows Server 2012, which separates the ChkDsk scan for errors (an online operation) from the ChkDsk fix (an offline operation). This results in higher availability for your private cloud storage, since you only need to take your storage offline to fix corruptions (which is a significantly faster process than the scan for corruptions). In Windows Server 2012, we integrated ChkDsk /SpotFix into the cluster IsAlive health check for the Physical Disk Resource corresponding to the CSV. As a consequence, we will now attempt to fix corruptions in your CSV without any perceptible downtime for your application.
The following is the workflow on Windows Server 2012 R2 systems to scan for NTFS corruptions:
chkdsk.exe <CSV mount point name> /scan
The following is the CSV workflow in Windows Server 2012 R2 to fix corruptions:
chkdsk.exe <CSV mount point name> /SpotFix
Running Defrag or ChkDsk on your CSV through the Repair-ClusterSharedVolume cmdlet is deprecated. It is instead highly encouraged to use Defrag.exe or ChkDsk.exe directly on your CSV, following the procedures in the preceding sections. The Repair-ClusterSharedVolume cmdlet is, however, still supported by Microsoft. To use this cmdlet to run chkdsk or defrag, run the following in a new elevated Windows PowerShell console:
Repair-ClusterSharedVolume <Cluster Disk Name> -ChkDsk -Parameters <ChkDsk parameters>
Repair-ClusterSharedVolume <Cluster Disk Name> -Defrag -Parameters <Defrag parameters>
You can determine the Cluster Disk Name corresponding to your CSV using the Get-ClusterSharedVolume cmdlet by running the following:
Get-ClusterSharedVolume | fl *
Subhasish Bhattacharya
Program Manager
Clustering and High Availability
Microsoft
Thank you Subhasish !
FYI, we have another case open with NetApp: a quick format of 700 GB takes about 5 minutes, which is very strange...
Hello Subhasish Bhattacharya
As you already stated, defrag has a bug in the analysis process when the disk is thin provisioned, so we have to switch the disk to redirected mode.
In my case it is worse: I cannot run defrag /a or Optimize-Volume -Analyze in any mode except "Maintenance Mode", which is unacceptable for Hyper-V CSV disks.
When I run defrag /a in redirected mode, I receive "Slab Analysis: 0% complete..." and then "Incorrect function. (0x80070001)".
The PowerShell equivalent shows "The specified extrinsic Method does not exist."
The disk is thin provisioned.
When I tried the Defrag GUI, the analysis process does not work in redirected mode, and neither does optimize.
This differs from defrag.exe and the PowerShell command, where the defrag itself works in redirected mode!
The problem is that SCOM or regular analysis checks can never complete successfully on these disks, even in redirected mode.
Is this a known bug?
Thanks for your answer
Which OS version are you running? Are you up to date on your hotfixes? Which storage array do you have?
To make sure I understand correctly...
- When you use defrag.exe or PowerShell you are able to run analyze in redirected mode
- When you use SCOM you have to use maintenance mode? What do you mean by "regular" analyze checks?
Hi Subhasish and sorry for my late answer.
OS is 2012 R2 with latest updates.
Storage array is EMC Symmetrix VMAX, thin provisioned.
When I use defrag via defrag.exe or Optimize-Volume while the disk is in redirected mode and I try -Analyze only, it fails. Slab Analysis (which runs as part of the -Analyze process) returns "The specified extrinsic Method does not exist", so the command is cancelled.
If I use the -Defrag parameter instead of -Analyze, where I suppose slab analysis is not used (-Verbose doesn't mention that Slab Analysis is run in this case), the command completes successfully.
In short, defrag analysis (Optimize-Volume -Analyze) is not possible in normal or redirected mode. Only maintenance mode allows defrag analysis to complete successfully. But if I defragment without analyzing the disk (Optimize-Volume -Defrag), it completes successfully, perhaps because no "Slab Analysis" is used, which seems to be the root cause.
If the Defrag GUI is used, it is not possible to analyze or defragment disks in redirected mode. It seems this is because the GUI always forces -Analyze before "-Defrag only" is applied, so the GUI is useless here.
About SCOM: I mentioned it because it checks disks for fragmentation, so the equivalent of Optimize-Volume -Analyze is used to detect whether a disk is fragmented. This can't complete successfully because of the bug mentioned above, even when the disk is in redirected mode.
Thanks for any feedback
Thanks for the information!
I believe we have identified the cause of this issue and will be rolling out a fix for WS2012 R2 (and possibly WS2012). I don't have an estimate for when the QFE will be available but I expect within the next 3 months.
The NetApp array has options that allow it to be thinly provisioned but report differently. To check this, use the command 'lun show -vserver vsa1 -volume vol2 -lun lunname -fields space-allocation'. It will tell you if space allocation reporting is on. Use the LUN modify option to change the setting. With this disabled, the LUN will not report back as thin to the OS, and defrag may do the wrong things.
Another option that works more broadly is the NetApp PowerShell Toolkit (a free download). It works on CSVs or regular volumes, on Windows Server 2008 R2 through 2012 R2, and can be used to execute the same trim-type operations. The command is 'invoke-nahostspacereclaim C:\ClusterStorage\Volume1', or whatever volume you want thinned out.
For more specific help: are you running 7-Mode or C-Mode, and what version of the array code (8.1, 8.2, etc.)?
Any update on the QFE?
I have the same 'defrag /a in redirected mode received "Slab Analysis: 0% complete..." and then "Incorrect function. (0x80070001)"'
and would appreciate a solution :)
@AndrzejP, Thanks for your note. A QFE for this issue is currently slated for an October 2014 release...
I also have this error on a 2012 R2 Hyper-V cluster:
The slab consolidation / trim operation cannot be performed because the volume alignment is invalid. (0x89000029)
CSV LUNs are all formatted with 64K. Storage: 3PAR StoreServ.
Any hints on how to fix this? I've never seen this before since 2008.
You might need to contact your storage vendor, HP, to help you fix this, and tell them the violation is that OptimalUnmapGranularity or UnmapGranularityAlignment aren't multiples of the cluster size. If you know the slab size, you can choose a cluster size that meets these requirements.
So your options might be to:
1) Contact HP, find out the values of the slab size and slab alignment, and change your cluster size so that both values are multiples of it.
2) Contact HP to change the values of your slab size and slab alignment so that they conform.
Hope this helps!
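The requirement stated in this reply (OptimalUnmapGranularity and UnmapGranularityAlignment must be multiples of the cluster size) can be sketched as a simple check. This is an illustration only: the function name is hypothetical, all three values are assumed to be expressed in bytes, and the numbers in the example are made up rather than taken from any particular array.

```python
from typing import List


def alignment_violations(
    cluster_size: int,
    optimal_unmap_granularity: int,
    unmap_granularity_alignment: int,
) -> List[str]:
    """Return the names of the values that are not whole multiples of
    the cluster size. An empty list means the stated rule is satisfied.

    ASSUMPTION: all three arguments are in bytes.
    """
    if cluster_size <= 0:
        raise ValueError("cluster_size must be a positive number of bytes")
    values = {
        "OptimalUnmapGranularity": optimal_unmap_granularity,
        "UnmapGranularityAlignment": unmap_granularity_alignment,
    }
    return [name for name, value in values.items() if value % cluster_size != 0]


if __name__ == "__main__":
    # Made-up values: a 64 KiB cluster size against a 16 KiB granularity.
    print(alignment_violations(64 * 1024, 16 * 1024, 0))
    # The same granularity with a 16 KiB cluster size satisfies the rule.
    print(alignment_violations(16 * 1024, 16 * 1024, 0))
```

With values like these, option 1 amounts to picking a cluster size that divides the reported granularity evenly, and option 2 amounts to having the vendor change the reported values instead.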
Good info there.
Been doing a search on both the Optimize-Volume command and the defrag command for use on CSVs and a couple of things stand out. At least for me.
1) Nobody tells you what to expect. Optimize-Volume did three passes on a 13.8 TB volume, all at about the same speed, taking a total of seven days. Defrag did 5 passes, with the first one taking about four days and the last four completing in about 10 hours total. This volume was listed as being 52% fragmented.
2) Yes, you can put a running CSV into redirected mode and no VMs on that volume will stop working. It really is as simple as that.
I have no idea if these results are the norm or we just mutated into some kind of twilight zone.
Now for the followup:
After running both Optimize-Volume and Defrag to actually do the defrag, the volume is still 52% fragmented even though both runs reported success.
Makes me wonder what I missed, or if this is really the way to do defragmentation. Has anyone else encountered this lack of expected results?