We’ve written about tons of improvements in the OS kernel, networking, and file system. While the tried and true chkdsk utility is one that most client PCs rarely need anymore, we are using Windows 8 as an opportunity to improve it. We wanted to rethink how the utility works in order to increase availability and reduce downtime due to chkdsk operations. Looking at real-world usage of chkdsk, we note that corruptions are exceedingly rare, though running chkdsk is not. While we’ve worked hard to reduce the manual invocation of disk tools (like defrag), we know many prefer to run them manually “just in case,” so we worked to improve the overall throughput of chkdsk, since running it reduces the availability of the machine. With disk capacities becoming extremely large and multi-disk systems more common, the time was right to improve the utility. Kiran Bangalore, a program manager on our core system team, authored this post. --Steven
In this blog post, I’ll talk about the new NTFS health model for Windows 8 and our redesigned tool for disk corruption detection and fixing, the chkdsk utility.
We’ve all experienced the frustration that can be caused by an unexpected chkdsk that pops up while restarting a computer at home or a server at the office. Beyond the surprise, there’s the interruption while waiting for the process to complete and Windows to be available. With Windows 8, we provide quick resolution to these problems when they arise, putting the user in control and making systems more available and more scalable.
One of our key design goals for Windows 8 was to increase availability and reduce the overall down-time of systems; this feature, along with other storage features such as Storage Spaces and the new ReFS file system, helps reduce the complexity of fixing corruptions and increases the overall availability of the entire system.
While exceedingly rare, there are a variety of unique causes for disk corruption today. Whether they are caused by media errors from the hard disk or transient memory errors, corruptions can happen in file system metadata (the information used to map physical blocks to that vacation photo you took last year). To maintain access to your data, Windows must isolate and correct these errors, and the way to do this is by running the chkdsk utility.
In past versions, NTFS implemented a simpler health model, where the file system volume was either healthy or not. In that model, the volume was taken offline for as long as necessary to fix the file system corruptions and bring the volume back to a healthy state. Downtime was directly proportional to the number of files in the volume.
Reliable telemetry data from systems all over the world has shown us that, although corruptions are quite rare, when chkdsk is needed it can take anywhere from a few seconds to a few hours to run, depending on the number of files on the drive, and even longer on larger storage servers.
In Windows Vista and Windows 7, we made significant optimizations to the speed of chkdsk but, as hard disk capacities have continued to double every 18 months and the number of files per volume is increasing at an equal rate, chkdsk has taken longer and longer to complete (even with speed improvements).
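To make the scaling argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes only what the text states (downtime proportional to file count, file count doubling every 18 months) and deliberately ignores the speed improvements mentioned above. The function names and the 30-minute baseline are hypothetical, chosen purely for illustration.

```python
# Back-of-the-envelope model, not measured data.
# Assumptions taken from the text: offline chkdsk downtime is proportional
# to the number of files, and files per volume double every 18 months.
DOUBLING_PERIOD_MONTHS = 18


def file_count_growth(months: float) -> float:
    """Factor by which the file count grows after `months` months."""
    return 2 ** (months / DOUBLING_PERIOD_MONTHS)


def projected_downtime(baseline_minutes: float, months: float) -> float:
    """Projected offline chkdsk time if nothing else changes.

    `baseline_minutes` is a hypothetical starting point, not a real figure.
    """
    return baseline_minutes * file_count_growth(months)


if __name__ == "__main__":
    # A hypothetical 30-minute run today, projected three years out.
    print(projected_downtime(30, 36))  # 120.0
```

Under these assumptions the run time quadruples in three years, which is why optimizing chkdsk speed alone could not keep downtime in check.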
So in Windows 8, we’ve changed the way we approach the health model of NTFS and changed the way we fix corruptions so as to minimize the downtime due to chkdsk. We’ve also introduced a new file system for the future, ReFS, which does not require an offline chkdsk to repair corruptions.
The incredible growth in storage capacity and user data files has necessitated the redesign of the NTFS health model and chkdsk.
There were three important requirements for file system health that our customers made clear:
Our design included changes both in the file system and the chkdsk utility to ensure the best availability. The new design splits the process into the following phases to ensure a coordinated, rapid, and transparent resolution to the corruption.
We developed a new method of communication that describes types of corruptions as “verbs” that act upon the key components and points of the design – the file system driver (NTFS), the self-healing module, the spot-verification service, and the chkdsk utility. All file system corruptions are classified as needing one of 18 different “verbs” that we’ve defined in Windows 8. We have also left room for possible new verb definitions that can help us diagnose issues even better in the future.
Comparison of Windows Server: chkdsk /f vs chkdsk /spotfix
In the new health model, the file system health status transitions through four states – some that are simply informational, and others that require you to act. The health states are:
Windows 8 file system health states
More advanced users who want to avoid restarting their system to fix a corruption on a non-system volume can open the Properties dialog for the affected volume; on the Tools tab, they’ll see an option to check the drive for file system errors. Corruption on drives that are not currently in use can be fixed without a full restart of the computer.
In Windows 8, we have made the detection and correction of file system errors more transparent and less intrusive. We believe these changes will be a welcome enhancement for you and we look forward to hearing your feedback.
-- Kiran Bangalore Senior Program Manager, Windows Core Storage and File Systems
Q) Will the new health model work on removable drives? Yes, this works on removable drives that report fixed media, like most external hard drives.
Q) How do I enable the new file system health model? You don’t need to do a thing—the new file system health model is enabled by default.
Q) Will the new file system health model apply to Windows Server? Yes, the health model is identical for both server and client. One thing that will be different by default is that the data drives will not be checked or fixed during boot of the system – this maintenance will be left to the administrator when time permits.
Q) Can I move between Windows 8 and Windows 7 and not affect the file system health model? Yes, the file system health model will adapt to whichever operating system version it is mounted on.
Q) Will ReFS need to run chkdsk? ReFS follows a different model for resiliency and does not need to run the traditional chkdsk utility.
Q) Will I ever need to run the old chkdsk /f? There are cases where failing hardware can produce such severe corruption as to make the file system un-mountable; in these cases, you should perform a full, offline chkdsk to fix the file system. If for some reason this fails, we recommend that you restore from a backup.
Q) Is a reboot absolutely required to fix non-system volumes? No, but the Action Center generally provides the simplest experience. If you’re an advanced user, you can fix non-system volumes by opening the properties of the drive, or by running chkdsk /scan <volume>: and chkdsk /spotfix <volume>: from the command line.
Q) I run chkdsk /f often to check the status of our drives; is that still needed? No, the system will inform you when a corruption is found, and you can then choose to run chkdsk /scan to detect all the issues. An online chkdsk /scan will not take away from the availability of the drive or system.
Q) I run read-only chkdsk today to check the status of our drives; do I still need to do this? No, we recommend you run chkdsk /scan instead, since this will also perform all possible online repairs and will also prepare for a spotfix, if needed.
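For administrators who script the command-line route mentioned in the answers above, a minimal sketch of building the two invocations might look like this. The helper name `chkdsk_command` and its validation rules are hypothetical; only the chkdsk utility and its /scan and /spotfix switches come from the post. The sketch only constructs the argument list and does not run anything.

```python
from typing import List

# Hypothetical helper: builds the argument list for the two switches
# described in this post. It does not execute chkdsk.
ALLOWED_MODES = ("scan", "spotfix")


def chkdsk_command(volume: str, mode: str) -> List[str]:
    """Return e.g. ['chkdsk', 'D:', '/scan'] for volume 'd:' and mode 'scan'."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
    # Accept 'D', 'D:' or 'D:\' and normalize to a bare drive letter.
    letter = volume.rstrip(":\\")
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError(f"expected a drive letter such as 'D:', got {volume!r}")
    return ["chkdsk", f"{letter.upper()}:", f"/{mode}"]


if __name__ == "__main__":
    print(chkdsk_command("d:", "scan"))     # ['chkdsk', 'D:', '/scan']
    print(chkdsk_command("D", "spotfix"))   # ['chkdsk', 'D:', '/spotfix']
```

On Windows 8 or later, the resulting list could be passed to `subprocess.run` from an elevated prompt: the scan first, then the spot fix only if the scan reports that one is needed.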