Engineering Windows 7

Welcome to our blog dedicated to the engineering of Microsoft Windows 7

Disk Space

This post is about disk space and the disk space “consumed” by Windows 7. Disk space is the sort of thing where everyone wants to use less, but the cost of using a bit more relative to the benefits has generally been a positive tradeoff. Things have changed recently with the availability of solid-state drives in capacities significantly smaller than the trend in spinning drives. Traditionally almost all software, including Windows, would not hesitate to consume 100MB for a specific (justified) need when looking at a 60GB (or 1,500GB) drive. With desirable machines shipping with 16GB of solid-state storage, we are looking carefully at the disk space used by Windows, both at setup time and as a PC “ages”. We also had a specific session at WinHEC on solid-state drives that might be interesting to folks. This post is authored by Michael Beck, a program manager in the core OS deployment feature team. --Steven

Let’s talk about “footprint”. For the purposes of this post, when I say “footprint” I’m talking about the total amount of physical disk space used by Windows. This includes not only the Windows binaries, but all disk space consumed or reserved for system operations. Later in this entry, I’ll discuss in detail how the disk footprint is consumed by various Windows technologies.

A number of comments have asked about disk footprint and what to expect in terms of Windows 7’s usage of disk space. Like many of the design issues we have talked about, disk space involves tradeoffs, so this post goes into the details of some of those tradeoffs and also discusses some of the feedback we have received. It should be noted that we are not yet at the point of committing to system requirements for Windows 7, so consider this background and engineering focus.

To structure this post we’ll take two important points of feedback or questions we have received:

  • What does the WinSxS directory contain, why is it so big, and can I just delete it?
  • Where does all the disk space go for Windows components?

We’ll then talk about the focus and engineering of Windows 7.

WinSxS directory

We definitely get a lot of questions about the new (to Vista) Windows SxS directory (%SystemRoot%\winsxs). Many folks believe this is a big consumer of disk space, as just bringing up the properties on a newly installed system shows over 3000 files and over 3.5 GB of disk consumed. Over time this directory grows to even higher numbers. Yikes--below is an example from Steven's home PC.

Example properties sheet for WinSxS directory.

“Modularizing” the operating system was an engineering goal in Windows Vista. This was to solve a number of issues in legacy Windows related to installation, servicing and reliability. The Windows SxS directory represents the “installation and servicing state” of all system components. But in reality it doesn’t actually consume as much disk space as it appears when using the built-in tools (DIR and Explorer) to measure disk space used. The fact that we make it tricky for you to know how much space is actually consumed in a directory is definitely a fair point!

In practice, nearly every file in the WinSxS directory is a “hard link” to the physical files elsewhere on the system, meaning that the files are not actually in this directory. For instance, in WinSxS there might be a file called advapi32.dll that takes up more than 700K. However, what’s being reported is a hard link to the actual file that lives in Windows\System32, and it will be counted twice (or more) when simply looking at the individual directories from Windows Explorer.
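
To make the double counting concrete, here is a minimal Python sketch (not part of the original post) that measures a directory tree two ways: the naive sum that adding up every directory entry produces, and a sum that counts each underlying file only once. The function name measure_tree is hypothetical, and the sketch assumes the file system and Python build report stable file identifiers through os.stat.

```python
import os
import stat


def measure_tree(root):
    """Return (naive_bytes, unique_bytes) for regular files under root.

    naive_bytes adds the size of every directory entry, so a file with
    several hard links inside the tree is counted once per link.
    unique_bytes counts each underlying file once, identified by its
    (device, file ID) pair.
    """
    naive_bytes = 0
    unique_bytes = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path, follow_symlinks=False)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, devices, and other special entries
            naive_bytes += st.st_size
            file_id = (st.st_dev, st.st_ino)
            if file_id not in seen:
                seen.add(file_id)
                unique_bytes += st.st_size
    return naive_bytes, unique_bytes
```

Note that this only removes duplicates within the tree being scanned. A file whose other name lives outside the tree (for example, in System32) is still counted once here, so the result does not settle which directory "owns" those bytes.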

The value of this is that the servicing platform (the tools that deliver patches and service packs) in Windows can query the WinSxS directory to determine a number of key details about the state of the system, like what’s installed, or available to be installed (optional components, more on those later), what versions, and what updates are on the system to help determine applicability of Windows patches to your specific system. This functionality gives us increased servicing reliability and performance, and supports future engineering efforts providing additional system layering and great configurability.

The WinSxS directory also enables offline servicing, and makes Windows Vista “safe for imaging”. Prior to Windows Vista, inbox deployment support was through “Setup” only. IT professionals would install a single system, and then leverage any number of 3rd party tools to capture the installed state as a general image they then deployed to multiple systems. Windows wasn’t built to be “image aware”. This meant that greater than 80% of systems were deployed and serviced using a technology that wasn’t supported natively, and required IT departments to create custom solutions to deploy and manage Windows effectively. In addition, state stored in the WinSxS directory can be queried “offline”, meaning the image doesn’t have to be booted or running, and patches can be applied to it. These two features of WinSxS give great flexibility and cost reductions to IT departments who deploy Windows Vista, making it easier to create and then service standard corporate images offline.

While it’s true that WinSxS does consume some disk space by simply existing, and there are a number of metadata files, folders, manifests, and catalogs in it, it’s significantly smaller than reported. The actual amount of storage consumed varies, but on a typical system it is about 400MB. While that is not small, we think the robustness provided for servicing is a reasonable tradeoff.

So why does the shell report hard links the way it does? Hard links work to optimize disk footprint for duplicate files all over the system. Application developers can use this functionality to optimize the disk consumption of their applications as well. It’s critical that any path expected by an application appear as a physical file in the file system to support the appropriate loading of the actual file. In this case, the shell is just another application reporting on the files it sees. As a result of this confusion and a desire to reduce disk footprint, many folks have endeavored to just delete this directory to save space.

There have been several blogs and even some “underground” tools that tell you it’s ok to delete the WinSxS directory, and it’s certainly true that after installation, you can remove it from the system and it will appear that the system boots and runs fine. But as described above, this is a very bad practice, as you’re removing the ability to reliably service all operating system components and the ability to update or configure optional components on your system. Windows Vista only supports the WinSxS directory on the physical drive in its originally installed location. Given the data described above, the risks of removing or relocating it far outweigh the gains.

Where does the disk space go?

As we all know, adding new functionality consumes additional disk space--in Windows or any software. In reality, “code” takes up a relatively small percentage of the overall Windows footprint. The actual code required for a Windows Vista Ultimate install is just over 2GB, with the rest of the footprint going to “data” broadly defined. Let’s dig deeper into the use of storage in a Windows Vista installation and what we mean by "data".

Reliability and security were core considerations during the engineering process that built Windows Vista. Much of the growth in footprint comes from a number of core reliability features that users depend on for system recovery, performance, data protection, and troubleshooting. Some of these include system restore, hibernation, page file, registry backup, and logging. Each of these represents “backup state” that is available to the system to recover from any number of situations, some planned and others not. Because we know that different customers will want to make different tradeoffs of disk space relative to recovery (especially on small footprint devices), with Windows 7 we want to make sure you have more control than you currently do to decide ahead of time how much disk space to use for these mechanisms. We will also tune our defaults to be more sensitive to overall consumption due to the changing nature of storage.

System restore and hibernation are features that help users to confidently recover their system and prevent data loss, in a number of situations such as low battery (hibernation), bad application installation or other machine corruption (system restore). Combined, these features consume a large percentage of the footprint. Because of the amount of space these use, they are easy to identify and make decisions regarding.

System Restore protects users by taking snapshots of the system prior to changes and at regular intervals. In Windows Vista, System Restore is configured to consume a minimum of 300MB and up to 15% of the physical disk. As the space fills up with restore points, System Restore will delete older restore points to make room for new ones. The more space you have, the greater the number of restore points you have available to “roll back” to. We have definitely heard the feedback from Windows Vista customers around System Restore and recognize that it takes significant space and is not easy to tune. Some have already seen that the pre-beta for Windows 7 provides an interface to manage the space better.
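
The limits above reduce to simple arithmetic. The Python sketch below is an illustration only: the function name is hypothetical, and it models just the 300MB floor and 15% ceiling stated in this post, not the actual System Restore implementation. It also assumes that on a disk where 15% is less than 300MB, the floor wins.

```python
MB = 1024 ** 2
GB = 1024 ** 3

MIN_RESTORE_BYTES = 300 * MB    # floor stated in the post
MAX_RESTORE_PERCENT = 15        # ceiling stated in the post


def restore_space_range(disk_bytes):
    """Return (minimum, maximum) bytes System Restore may use, per the post."""
    ceiling = disk_bytes * MAX_RESTORE_PERCENT // 100
    # Assumption: the 300MB floor is binding when 15% of the disk is smaller.
    return MIN_RESTORE_BYTES, max(ceiling, MIN_RESTORE_BYTES)


for disk_gb in (16, 60, 1500):
    low, high = restore_space_range(disk_gb * GB)
    print(f"{disk_gb}GB disk: {low / MB:.0f}MB minimum, {high / GB:.1f}GB maximum")
```

On a 16GB drive the 15% ceiling is about 2.4GB, which is a noticeable share of a small solid-state disk.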

Hibernate is primarily used on mobile PCs and saves your work to the hard disk and puts the computer in an extremely low power state.  Hibernate is used on mobile PCs when the battery drains below a certain threshold or when turning the computer off without using Shut Down to extend battery life as much as possible.  On Windows Vista, Hibernate is also automatically used with Sleep on desktop PCs to keep a backup copy of open programs and work. This feature is called Hybrid Sleep and is used to save state to the hard disk in case power fails while the computer is sleeping.  Hibernate writes all of the content in memory (RAM) to a file on the hard drive named Hiberfil.sys.  Therefore, the size of the reserved Hiberfil.sys is equal to the amount of RAM in the machine.  In the Windows Vista timeframe, the amount of RAM being built into computers has increased significantly, thus the disk footprint of Hibernate is more noticeable than before. This space must be reserved up front to guarantee that in a critical low battery situation, the system can easily write memory contents to the disk.  Any mobile PC user that has experienced their computer automatically entering Hibernate when the battery is critically low can appreciate the peace of mind this footprint growth provides. While we're talking about RAM and disk footprint in the same paragraph, Mark Russinovich has a post this week on virtual memory and how big the swapfile could/should/can be that you might find interesting.
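
As a rough illustration of why this matters on small drives, the following Python sketch applies the rule stated above (the reserved Hiberfil.sys is the size of installed RAM) to compute what share of a disk the reservation takes. The function name is hypothetical, and the RAM and disk sizes are example inputs, not measurements.

```python
GB = 1024 ** 3


def hiberfil_share(ram_bytes, disk_bytes):
    """Fraction of the disk reserved for Hiberfil.sys, assuming the file
    is exactly as large as installed RAM (the rule described in the post)."""
    if disk_bytes <= 0:
        raise ValueError("disk_bytes must be positive")
    return ram_bytes / disk_bytes


for ram_gb, disk_gb in [(1, 16), (2, 16), (2, 60)]:
    share = hiberfil_share(ram_gb * GB, disk_gb * GB)
    print(f"{ram_gb}GB RAM on a {disk_gb}GB disk: {share:.1%} reserved")
```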

Now it’s clear that in the description above, I don’t account for the entire footprint required by Windows Vista. For instance, we also include many sample files, videos, high resolution backgrounds that allow users to easily customize their experience, and try out new features, but we’ve covered a couple of the more common questions out there.

It’s important that we consider more than just the size of the system once deployed; we must also look at how the system grows over time as services write logs, updates and service packs are installed, system snapshots are taken, and so on. For many, the “growth” over time of the installation proves to be the most perplexing. We hear that, and we need to do better to (a) make smarter choices and (b) make it clearer what space is being consumed and can be reclaimed.

The following table provides one view of the installation footprint of a Windows Vista Premium/Ultimate installation. This includes the full installation, but to make it digestible it has been broken down into some logical categories, with some specific features highlighted. Part of the reason to highlight specific features is to illustrate the “costs” for items that have been raised as questions (or questionable).

Table of disk space utilization of Windows Vista SP1.

Here are some items worth calling out:

  • ~1GB driver support. Windows Vista works with thousands and thousands of different devices. The ability to plug in almost any device, even your old printer and have it get recognized and install automatically is something customers have come to expect from Windows. We receive lots of feedback wanting to remove some or all of these and each release we carefully scrub the “in-box” device support relative to what we see from telemetry in terms of used devices. The ability to install a printer or USB device while offline is a key value, especially with laptops representing over half of all PCs being sold. In the future we can possibly assume “always go to Windows Update” but we’re not there yet in most places in the world.
  • ~1GB of system growth in serviced and superseded components to support robust rollback and recovery after installing critical security and functionality updates. We receive a lot of positive feedback about the robustness of servicing, but at the same time, the ability to roll back a specific fix for any variety of reasons remains an important robustness and reliability measure. We also understand the feedback we have received regarding the disk requirements to install SP1 on top of RTM. We hope folks who are in need of disk space are aware of the vsp1cln.exe utility in the system32 directory.
  • ~1GB hibernation support is necessary in order to prevent data loss when a machine has been in standby for many hours. This can be removed via the Disk Cleanup wizard or via an elevated command prompt (powercfg /h off).
  • ~315MB of fonts. Windows users speak many different languages, often on the same PC, and wish Windows to “speak” to them. Windows Vista contains native font support to allow users with systems defaulted to one language to be able to read documents or websites in another. As we know, however, fonts are easy to delete should you desire.
  • ~52MB of log files. Whether it is the event log, servicing logs, or device installation logs (or more) this space is consumed and becomes critical when trying to diagnose a failure. These logs are often used by our support personnel or corporate helpdesks to diagnose a specific failure.

Engineering Windows 7

Windows disk space consumption has trended larger over time. While not desirable, the degree to which it’s been allowed is due in large part to ever-increasing hard drive capacity, combined with a customer need and an engineering focus on recoverability, data protection, increasing breadth of device support, and demand for innovative new features. However, the proliferation of Solid State Drives (SSDs) has challenged this trend, and is pushing us to consider disk footprint in a much more thoughtful way and take that into account for Windows 7.

This doesn’t mean that we’re going to stop adding great features or make Windows less reliable or recoverable. As we look to the future, it’s critical that as we innovate, we do so treating the disk space consumed by our work as a valuable resource, and have a clearer design for how Windows uses it. We want to make sure that we are making smart choices for the vast majority of customers and, for those desiring more control, providing places to fine-tune these choices as appropriate. This design goal isn’t about a type of machine or a specific design; all Windows editions benefit from efforts that focus on a reduction of the overall footprint.

For example, as we consider the driver support discussed above, Windows Vista with SP1 installs almost 1GB of drivers on the system to support plug and play of devices. This local cache can get out of date as IHVs release updates to their drivers, and as a result, users are pushed to Windows Update to get the latest version once the device is installed.

Why not extend the PnP user experience to include (or only use) the Windows Update cache of drivers and save some disk space? This has several benefits:

  1. Because mobile PCs rarely lack a network connection, they can simply get the new driver from the web.
  2. People don’t have to install the driver twice on updated devices because they do the round trip to the web anyway.

With this example it’s easy to see how engineering for a minimal footprint might actually deliver a better experience for people when attaching new devices to their systems. At the same time, we want to be careful about going too far too soon. We get a tremendous amount of feedback regarding the “plug and play” experience or feedback about costly download times (if download is at all possible). For Windows 7 we are going to continue to be deliberate in what we include based on the telemetry of real world devices and reducing the inbox set to cover the most popular devices around the world. At the same time we will continue a very significant effort around having the best available Windows Update site for all devices we can possibly support.

Windows features installed by default make sense in most cases to support many scenarios. We should consider how we make some features/components (such as Media Center) optional when they are not required rather than installing them by default on every system. We’re committed to making more features of Windows optionally installed. As you might notice today in Windows, when you choose to add a feature that was not installed, Windows does not require a source (a DVD or network location). This is because the feature is stashed away as part of a complete Windows install, which is itself a feature. We will always keep features available and will always service them even when components are not installed. That way, if you add a component later, you do not risk adding a piece of code that might have been exploited earlier. This is another important way we keep Windows up to date and secure, even for optional features.

System growth over time is an area where we need to provide more “transparency”. For instance, Windows will archive previous versions of updated system components to allow robust rollback. A new system will install patches as Windows Update makes them available, just as expected by design. When a Service Pack or other large update is installed that contains or supersedes any of the previous patches, we can simply recover the space used by the old updates sometime after the update is successfully installed.

Windows writes logs in many places to aid in troubleshooting and these logs can grow very large. For instance, when an application crashes, Windows will archive a very large dump file to support analysis of the failure. There are many good reasons for this behavior, but as we change our mindset towards footprint, we need to extend our scenarios to include discussions of how to manage the growth and recover the disk space consumed whenever possible. Other areas where we are considering the default disk space reserved include System Restore and hibernation. On a disk-constrained system, the 1GB or more reserved to support hibernation is costly, and there may be ways to shrink the size of hiberfil.sys. System Restore should be configurable, and default in all cases to the minimally useful number of snapshots rather than a blanket 15% of the system disk.

At WinHEC we had several machines on display with 16GB drives/partitions and on those you could see there was plenty of free disk space. Like all the benchmarks, measuring disk space on the pre-beta is not something we’re encouraging at this time.

In conclusion, as we develop Windows 7 it’s likely that the system footprint will be smaller than Windows Vista with the engineering efforts across the team which should allow for greater flexibility in system designs by PC manufacturers. We will do so with more attention to defaults, more control available to OEMs, end-users and IT pros, and will do so without compromising the reliability and robustness of Windows overall.

-Michael Beck

Leave a Comment
  • After this last install of 6801 I had a total of 12 gb of update backup files after running microsoft update, I exclusively use MS products for programming and general computing.

    Please include an option in Windows 7 update to "Do Not Backup this Update..." and an easy way of purging the system of unwanted files.

    Also consider the fact that all do not use new units, allow the installation of drivers at the end of the Windows install process.

    Also at the beginning of the installation you should have a dialog as to whether the computer is a Laptop or a Desktop, this way only the drivers that are needed are installed, I do not need a mess of drivers that have no application sitting there taking up space, being indexed and generally messing things up.

    Wow as I examine the Registry I find a convoluted pile of...no word to describe the mess.

    Make it simple make it clean and you will have happy consumers...

    (Slight Deviation from Disk Space)

    And Also throw in some graphics and sound drivers for old machine's that the consumer can install and not be locked out of features that are needed because you have raised the graphical fence...at least have a base score of 1 for all performance tuning.

    I think that your team has done a wonderful job to get it this far in such a short time, but there is a lot more work to do even before it is released.

  • > In practice, nearly every file in the WinSxS
    > directory is a “hard link” to the physical
    > files...

    According to wiki the Hard link is "...a directory reference, or pointer, to a file on a storage volume. The name associated with the file is a label stored in a directory structure that refers the operating system to the file data. As such, more than one name can be associated with the same file. When accessed through different names, any changes made will affect the same file data."

    I.e. if I have one 1GB file named "file.tst" and one hard-link to this file named "hl.to.file.tst" I'll have two files 1GB each (2GB total). But the real data size is just 1GB.

    > While it’s true that WinSxS does consume
    > some disk space by simply existing, and
    > there are a number of metadata files,
    > folders, manifests, and catalogs in it, it’s
    > significantly smaller than reported. The
    > actual amount of storage consumed varies,
    > but on a typical system it is about 400MB.

    Ok. And how I can see this? On my system I can see 10GB for my winsxs folder and there is no way to see that its size is  much smaller. I understand that all the hardlinked files under the winsxs folder counted as many times as hardlinks exist in this folder. But... If we have 5 hardlinks for one 500KB file it's absolutely clear that only 500KB for this file and all the hardlinks is allocated on hard drive. And if we want to back up our system only 500KB have to be backed up, but not 5 times by 500KB.

    For example if I have one file in Linux with 5 hardlinks and I try to back up these files I can simply tar them and then gzip. And I will have one archive with only one file inside and some info about hardlinks to original data.

    And it's wrong for Vista :(.  Vista can back up all hardlinks only as separate files.

    Has anyone tried to back up a system partition using backup software like Acronis True Image? In my system I have a 100GB disk C with 65GB free space. My whole system is 35GB. My winsxs folder's size is 15GB. So I have 15GB for winsxs and 20GB for the rest of the files. When I try to back up my system partition I usually get a 15GB archive where all the hardlinks are backed up as separate files instead of being backed up as data plus file names for this data :(.

  • It would be great if I could on release version Windows 7 install only core component's without IME Components, Internet Explorer and Speech components. My installed Windows would be smaller and faster.

  • Hi,

    I've written a batch file to clear the winsxs folder in Vista, of redundancies.

    I would like to know if it works on Win7 too, but I don't have access to it.

    So, is there anyone with a spare Win7 install they wouldn't mind testing it out on ?

    If so, download the batch from here:

    http://rapidshare.com/users/6HHIJB

  • New version, WinsxsLite v1.80 is now up.

    Added functionality to relocate folders.

    Enjoy,

    Chris

  • Keris wrote:

    > The reason that Explorer can’t tell you the
    > correct amount of actual disk space taken up
    > by files is because of the way it calculates
    > the information [..] I do believe this is
    > erroneous behavior.  Much like the path length
    > limitations Explorer imposes that NTFS doesn’t
    > have, it is something that needs to be fixed.
    > The Size on Disk part SHOULD report the actual
    > size on disk, taking into account hardlinks.

    Calculating the used disk space accurately is indeed not very easy. The only graphical tool I know that is capable of doing this is TreeSize Professional from www.jam-software.com. It is aware of NTFS compression, sparse files, hardlinks and Alternate Data Streams (ADS); for the last two an option needs to be activated as it slows down a scan.

    In case of hardlinks there is a special problem: If you have two hardlinks in two different folders pointing to the same file, for which of the folders do you count the file if you don't want to count it twice?

    Eriwik wrote:

    > Generally speaking it is not possible to find
    > the other names that points to some data
    > (unless you search through all files), all you
    > know is that it will be in the same filesystem.

    That means if we see a file having more than one hardlink, we do not know if we already counted it elsewhere. TreeSize solves this problem by counting 1/n of the file's size which has n hardlinks. That way you get pretty exact statistics for a Windows Vista system. But the WinSxS folder will never show up with 500KB!

  • Please check out the hard link aware Sizesxs utility posted at www.msftmvp.com - love to get feedback and suggestions. This utility requires no admin privileges and scans the winsxs directory in hard link aware manner, summarizing the total disk space used. In verbose mode, it lists every file under winsxs and reports whether it is a hard link or not

    Should it include any other directory besides windows\winsxs?

  • Dilip: Your utility does not work. I get a "not a Valid Win32 application" error.

    Vista Home Premium SP1.

    Michael Beck: Why does WinSxS contain copies of WMVs, JPGs, Media Centre recordings and other "non-system" files?

  • "But in reality it doesn’t actually consume as much disk space as it appears when using the built-in tools (DIR and Explorer) to measure disk space used....The actual amount of storage consumed varies, but on a typical system it is about 400MB"

    On my system, Explorer and other tools reports 7GB.  If it was the case that the disk space used is much smaller, then space used for all of C: as reported by Explorer subtracted from my partition size would be much smaller than the actual free space reported by the system, but in fact they match.  What gives?

  • Has the WinSxS problem been addressed in the current Windows 7 RC?

    I can think of a couple of possible solutions :

    - make explorer report the actual size of the WinSxS folder

    - build a system tool that can remove useless files/links, either automatically and/or manually

    Anyway, having just installed the W7 RC on my home PC, it looks like WinSxS works exactly the same as in Vista.

  • Just out of interest, what happens when your winsxs file becomes as big as the partition you've assigned for System?

    I'm about 2GB away from this happening.

  • I think giving advanced install options to not only drop driver files but also font files as well is a good idea. What about specifying your storage location away from the specific drive windows will be installed on then have the install move all the folders that store backup, history install user etc to that drive.

  • Why is your explanation of the winsxs folder exactly the opposite of the explanation posted at: http://blogs.technet.com/askcore/archive/2008/09/17/what-is-the-winsxs-directory-in-windows-2008-and-windows-vista-and-why-is-it-so-large.aspx ??

    Your explaintion: "In practice, nearly every file in the WinSxS directory is a “hard link” to the physical files elsewhere on the system—meaning that the files are not actually in this directory."

    Their explaination: "The WinSxS folder is the only location that the component is found on the system, all other instances of the files that you see on the system are “projected” by hard linking from the component store.  Let me repeat that last point – there is only one instance (or full data copy) of each version of each file in the OS, and that instance is located in the WinSxS folder."

  • Thanks for all the posting and information.  Here is what it should come down to: Great programming, which I think most Microsoft developers are driving toward, must have attitude of being nimble, precise and clean.  Like being green and conservative, or getting the rovers to Mars, or being stranded on a deserted island, every new technological advance, every single line of new or old codes, we should always think about how we can be more efficient and do our best to maximize the output giving the limited resources in long term.  I am not asking for a miracle, it is just a common sense.  Clearly in the case of WinSxS directory for Vista or Windows 7, Microsoft Windows R&D has not been at their best.  I hope they can change that old attitude someday soon.  

  • Too many compromises.

    About a year ago I just got sick of waiting eternities for things to happen in Vista - whether booting - or loading a file. I became paranoid all the time watching the little green lights flashing so hard you'd think the system would melt down, but no indication anywhere about what the computer was doing. Anyway, at last I downloaded a Mint Linux iso, installed it for dual boot. Linux is still snappy after a year and its been dead reliable the whole time (I never got anything like a year out of a Windows installation before). Since then, I've almost never been back to Windows. After reading up on Win7 I still can't see what it offers.  For comparo, linux loads in a snap, uses just about 2 gig of hard disk with all applications installed (office, multimedia, media studio, graphics etc etc) while the naked Vista partition - with no apps installed - consumes 29gb. That's not windows reporting on itself. That's "offline" reporting by a diagnostic boot CD looking at the hdd while it is asleep.

    I think there's a lot more to the MS problems than they're letting on about.
