Welcome to our blog dedicated to the engineering of Microsoft Windows 7
This post is about disk space and the disk space “consumed” by Windows 7. Disk space is the sort of thing where everyone wants to use less, but the cost of using a bit more relative to the benefits has generally been a positive tradeoff. Things have changed recently with the availability of solid-state drives in capacities significantly smaller than the trend in spinning drives. Traditionally almost all software, including Windows, would not hesitate to consume 100MB for a specific (justified) need when looking at a 60GB (or 1,500GB) drive; with desirable machines shipping with 16GB of solid-state storage, we are looking carefully at the disk space used by Windows, both at setup time and as a PC “ages”. We also had a specific session at WinHEC on solid-state drives that might be interesting to folks. This post is authored by Michael Beck, a program manager in the core OS deployment feature team. --Steven
Let’s talk about “footprint”. For the purposes of this post, when I say “footprint” I’m talking about the total amount of physical disk space used by Windows. This includes not only the Windows binaries, but all disk space consumed or reserved for system operations. Later in this entry, I’ll discuss in detail how the disk footprint is consumed by various Windows technologies.
A number of comments have asked about disk footprint and what to expect in terms of Windows 7’s usage of disk space. Like many of the design issues we have talked about, disk space is one where there are tradeoffs involved, so this post goes into the details of some of those tradeoffs and also discusses some of the feedback we have received. It should be noted that we are not at the point where we are committing to system requirements for Windows 7, so consider this background and engineering focus.
To structure this post we’ll take two important points of feedback or questions we have received:
We’ll then talk about the focus and engineering of Windows 7.
We definitely get a lot of questions about the new (to Vista) Windows SxS directory (%SystemRoot%\winsxs), and many folks believe this is a big consumer of disk space, as just bringing up the properties on a newly installed system shows over 3000 files and over 3.5 GB of disk consumed. Over time this directory grows to even higher numbers. Yikes--below is an example from Steven's home PC.
“Modularizing” the operating system was an engineering goal in Windows Vista. This was to solve a number of issues in legacy Windows related to installation, servicing and reliability. The Windows SxS directory represents the “installation and servicing state” of all system components. But in reality it doesn’t actually consume as much disk space as it appears when using the built-in tools (DIR and Explorer) to measure disk space used. The fact that we make it tricky for you to know how much space is actually consumed in a directory is definitely a fair point!
In practice, nearly every file in the WinSxS directory is a “hard link” to the physical files elsewhere on the system, meaning that the files are not actually in this directory. For instance, in the WinSxS directory there might be a file called advapi32.dll that takes up more than 700K; however, what’s being reported is a hard link to the actual file that lives in Windows\System32, and it will be counted twice (or more) when simply looking at the individual directories from Windows Explorer.
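To make the double counting concrete, here is a minimal Python sketch. It is not how Explorer or the servicing stack measures anything; it simply builds a throwaway directory tree (the System32 and winsxs folder names and the 700,000-byte file are stand-ins chosen for illustration), hard links one file into a second folder, and compares a per-path total against a total that counts each underlying file once by its volume and file identifier.

```python
import os
import tempfile


def naive_size(root):
    """Sum file sizes per path, so a file with two hard links is counted twice."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            total += os.stat(os.path.join(dirpath, name)).st_size
    return total


def unique_size(root):
    """Sum file sizes, counting each underlying file once regardless of link count."""
    seen = set()
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.stat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)  # identifies the file, not the path
            if key in seen:
                continue
            seen.add(key)
            total += st.st_size
    return total


def demo():
    """Build a tiny tree with one hard-linked file and return (naive, unique) totals."""
    with tempfile.TemporaryDirectory() as root:
        system32 = os.path.join(root, "System32")  # stand-in folder name
        winsxs = os.path.join(root, "winsxs")      # stand-in folder name
        os.mkdir(system32)
        os.mkdir(winsxs)
        original = os.path.join(system32, "advapi32.dll")
        with open(original, "wb") as f:
            f.write(b"\0" * 700_000)
        # Second directory entry for the same data; no extra file content is stored.
        os.link(original, os.path.join(winsxs, "advapi32.dll"))
        return naive_size(root), unique_size(root)
```

Calling demo() returns a per-path total that is twice the de-duplicated total, which is the same effect that inflates the size shown for the WinSxS directory.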
The value of this is that the servicing platform (the tools that deliver patches and service packs) in Windows can query the WinSxS directory to determine a number of key details about the state of the system, like what’s installed, or available to be installed (optional components, more on those later), what versions, and what updates are on the system to help determine applicability of Windows patches to your specific system. This functionality gives us increased servicing reliability and performance, and supports future engineering efforts providing additional system layering and great configurability.
The WinSxS directory also enables offline servicing, and makes Windows Vista “safe for imaging”. Prior to Windows Vista, inbox deployment support was through “Setup” only. IT professionals would install a single system, and then leverage any number of 3rd party tools to capture the installed state as a general image they then deployed to multiple systems. Windows wasn’t built to be “image aware”. This meant that greater than 80% of systems were deployed and serviced using a technology that wasn’t supported natively, and required IT departments to create custom solutions to deploy and manage Windows effectively. In addition, state stored in the WinSxS directory can be queried “offline”, meaning the image doesn’t have to be booted or running, and patches can be applied to it. These two features of WinSxS give great flexibility and cost reductions to IT departments who deploy Windows Vista, making it easier to create and then service standard corporate images offline.
While it’s true that WinSxS does consume some disk space by simply existing, and there are a number of metadata files, folders, manifests, and catalogs in it, it’s significantly smaller than reported. The actual amount of storage consumed varies, but on a typical system it is about 400MB. While that is not small, we think the robustness provided for servicing is a reasonable tradeoff.
So why does the shell report hard links the way it does? Hard links work to optimize disk footprint for duplicate files all over the system. Application developers can use this functionality to optimize the disk consumption of their applications as well. It’s critical that any path expected by an application appear as a physical file in the file system to support the appropriate loading of the actual file. In this case, the shell is just another application reporting on the files it sees. As a result of this confusion and a desire to reduce disk footprint, many folks have endeavored to just delete this directory to save space.
There have been several blogs and even some “underground” tools that tell you it’s ok to delete the WinSxS directory, and it’s certainly true that after installation, you can remove it from the system and it will appear that the system boots and runs fine. But as described above, this is a very bad practice, as you’re removing the ability to reliably service all operating system components and the ability to update or configure optional components on your system. Windows Vista only supports the WinSxS directory on the physical drive in its originally installed location. The risks of removing or relocating it far outweigh the gains, given the data described above.
As we all know adding new functionality consumes additional disk space--in Windows or any software. In reality, “code” takes up a relatively small percentage of the overall Windows footprint. The actual code required for a Windows Vista Ultimate install is just over 2GB, with the rest of the footprint going to “data” broadly defined. Let’s dig deeper into the use of storage in a Windows Vista installation and what we mean by "data".
Reliability and security were core considerations during the engineering process that built Windows Vista. Much of the growth in footprint comes from a number of core reliability features that users depend on for system recovery, performance, data protection, and troubleshooting. Some of these include system restore, hibernation, page file, registry backup, and logging. Each of these represents “backup state” that is available to the system to recover from any number of situations, some planned and others not. Because we know that different customers will want to make different tradeoffs of disk space relative to recovery (especially on small footprint devices), with Windows 7 we want to make sure you have more control than you currently do to decide ahead of time how much disk space to use for these mechanisms, and we will also tune our defaults to be more sensitive to overall consumption due to the changing nature of storage.
System restore and hibernation are features that help users to confidently recover their system and prevent data loss, in a number of situations such as low battery (hibernation), bad application installation or other machine corruption (system restore). Combined, these features consume a large percentage of the footprint. Because of the amount of space these use, they are easy to identify and make decisions regarding.
System restore protects users by taking snapshots of the system prior to changes and at regular intervals. In Windows Vista, system restore is configured to consume a minimum of 300MB and up to 15% of the physical disk. As the space fills up with restore points, System Restore will delete older restore points to make room for new ones. The more space you have, the greater the number of restore points you have available to “roll back” to. We have definitely heard the feedback from Windows Vista customers around system restore and recognize that it takes significant space and is not easy to tune. Some have already seen that the pre-beta for Windows 7 provides an interface to manage the space better.
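As a rough illustration of that policy, the Python sketch below computes a space budget from the two numbers above and then drops the oldest restore points when a new one would not fit. This is a simplified model, not the actual System Restore implementation: the function and class names are made up, and treating the 300MB figure as a floor that wins when 15% of the disk is smaller is an assumption.

```python
from collections import deque

MB = 1024 * 1024
GB = 1024 * MB


def restore_budget(disk_bytes, floor_bytes=300 * MB, percent=15):
    """Space allowed for restore points: 15% of the disk, but at least the floor.

    Assumption: when 15% of the disk is below 300 MB, the floor applies.
    """
    return max(floor_bytes, disk_bytes * percent // 100)


class RestoreStore:
    """Toy model of restore-point storage with oldest-first deletion."""

    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.points = deque()  # (label, size) pairs, oldest on the left
        self.used = 0

    def add(self, label, size):
        if size > self.budget:
            raise ValueError("restore point is larger than the whole budget")
        # Delete older restore points until the new one fits.
        while self.used + size > self.budget:
            _old_label, old_size = self.points.popleft()
            self.used -= old_size
        self.points.append((label, size))
        self.used += size
```

With this model, restore_budget(16 * GB) comes to roughly 2.4GB, which shows why a blanket percentage is noticeable on a small solid-state drive.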
Hibernate is primarily used on mobile PCs and saves your work to the hard disk and puts the computer in an extremely low power state. Hibernate is used on mobile PCs when the battery drains below a certain threshold or when turning the computer off without using Shut Down to extend battery life as much as possible. On Windows Vista, Hibernate is also automatically used with Sleep on desktop PCs to keep a backup copy of open programs and work. This feature is called Hybrid Sleep and is used to save state to the hard disk in case power fails while the computer is sleeping. Hibernate writes all of the content in memory (RAM) to a file on the hard drive named Hiberfil.sys. Therefore, the size of the reserved Hiberfil.sys is equal to the amount of RAM in the machine. In the Windows Vista timeframe, the amount of RAM being built into computers has increased significantly, thus the disk footprint of Hibernate is more noticeable than before. This space must be reserved up front to guarantee that in a critical low battery situation, the system can easily write memory contents to the disk. Any mobile PC user that has experienced their computer automatically entering Hibernate when the battery is critically low can appreciate the peace of mind this footprint growth provides. While we're talking about RAM and disk footprint in the same paragraph, Mark Russinovich has a post this week on virtual memory and how big the swapfile could/should/can be that you might find interesting.
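To put the reservation in perspective, here is a small worked calculation in Python. It assumes only what is stated above, that Hiberfil.sys is sized equal to installed RAM; the RAM and disk sizes passed in are hypothetical examples, not measurements.

```python
def hibernation_share(ram_gb, disk_gb):
    """Fraction of the disk reserved when Hiberfil.sys is sized equal to RAM."""
    if ram_gb < 0 or disk_gb <= 0:
        raise ValueError("RAM must be non-negative and disk size positive")
    return ram_gb / disk_gb


def describe(ram_gb, disk_gb):
    """Format the reservation as a percentage of the disk."""
    share = hibernation_share(ram_gb, disk_gb)
    return f"{ram_gb}GB RAM on a {disk_gb}GB disk reserves {share:.2%}"
```

For example, describe(1, 16) reports 6.25% of a 16GB drive, while the same 1GB of RAM is a tiny fraction of a 500GB drive.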
Now it’s clear that in the description above, I don’t account for the entire footprint required by Windows Vista. For instance, we also include many sample files, videos, high resolution backgrounds that allow users to easily customize their experience, and try out new features, but we’ve covered a couple of the more common questions out there.
It’s important that we consider not just the size of the system once deployed, but also how the system grows over time as services write logs, updates and service packs are installed, system snapshots are taken, etc. For many, the “growth” over time of the installation proves to be the most perplexing, and we hear that and need to do better to (a) make smarter choices and (b) make it clearer what space is being consumed and can be reclaimed.
The following table provides one view of the installation footprint of a Windows Vista Premium/Ultimate installation. This includes the full installation, but to make it digestible this has been broken down into some logical categories and also highlights some specific features. Part of the reason to highlight specific features is to illustrate the “costs” for items that have been raised as questions (or questionable).
Here are some items worth calling out:
Windows disk space consumption has trended larger over time. While not desirable, the degree to which it’s been allowed is due in large part to ever-increasing hard drive capacity, combined with customer need and an engineering emphasis on recoverability, data protection, increasing breadth of device support, and demand for innovative new features. However, the proliferation of Solid State Drives (SSDs) has challenged this trend, and is pushing us to consider disk footprint in a much more thoughtful way and take that into account for Windows 7.
This doesn’t mean that we’re going to stop adding great features or make Windows less reliable or recoverable. As we look to the future, it’s critical that as we innovate, we do so treating the disk space consumed by our work as a valuable resource, and have a clearer design for how Windows uses it. We want to make sure that we are making smart choices for the vast majority of customers and, for those desiring more control, providing places to fine tune these choices as appropriate. This design goal isn’t about a type of machine or a specific design; all Windows editions benefit from efforts that focus on a reduction of the overall footprint.
For example, as we consider the driver support discussed above, Windows Vista with SP1 installs almost 1GB of drivers on the system to support plug and play of devices. This local cache can get out of date as IHVs release updates to their drivers, and as a result, users are pushed to Windows Update to get the latest version once the device is installed.
Why not extend the PnP user experience to include (or only use) the Windows Update cache of drivers and save some disk space? This has several benefits:
With this example it’s easy to see how engineering for a minimal footprint might actually deliver a better experience for people when attaching new devices to their systems. At the same time, we want to be careful about going too far too soon. We get a tremendous amount of feedback regarding the “plug and play” experience or feedback about costly download times (if download is at all possible). For Windows 7 we are going to continue to be deliberate in what we include based on the telemetry of real world devices and reducing the inbox set to cover the most popular devices around the world. At the same time we will continue a very significant effort around having the best available Windows Update site for all devices we can possibly support.
Windows features installed by default make sense in most cases to support many scenarios. We should consider how we make some features/components (such as Media Center) optional when they are not required rather than installing them by default on every system. We’re committed to making more features of Windows optionally installed. As you might notice today in Windows, when you choose to add a feature that was not installed, Windows does not require a source (a DVD or network location). This is because the feature is stashed away as part of a complete Windows install, which is itself a feature. We will always keep features available and will always service them even when components are not installed, so that if you add a component later you do not risk adding a piece of code that might have been exploited earlier. This is another important way we keep Windows up to date and secure, even for optional features.
System growth over time is an area where we need to provide more “transparency”. For instance, Windows will archive previous versions of updated system components to allow robust rollback. A new system will install patches as Windows Update makes them available, just as expected by design. When a Service Pack or other large update is installed that contains or supersedes any of the previous patches, we can simply recover the space used by the old updates sometime after the update is successfully installed.
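The bookkeeping behind that idea can be sketched in a few lines of Python. This is only an illustration of the accounting, not how the servicing stack stores or tracks anything: the update identifiers and sizes are invented, and the function name is hypothetical.

```python
def reclaimable_bytes(archived_sizes, superseded_ids):
    """Total size of archived rollback data for updates that a newer update supersedes."""
    return sum(
        size
        for update_id, size in archived_sizes.items()
        if update_id in superseded_ids
    )


# Hypothetical archive: update identifier -> bytes kept for rollback.
archived_sizes = {
    "update-A": 40_000_000,
    "update-B": 25_000_000,
    "update-C": 10_000_000,
}

# Hypothetical service pack that contains update-A and update-B.
superseded_by_service_pack = {"update-A", "update-B"}
```

Here reclaimable_bytes(archived_sizes, superseded_by_service_pack) adds up the rollback data for update-A and update-B only, since update-C is not covered by the service pack and still needs its archive.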
Windows writes logs in many places to aid in troubleshooting and these logs can grow very large. For instance, when an application crashes, Windows will archive a very large dump file to support analysis of the failure. There are many good reasons for this behavior, but as we change our mindset towards footprint, we need to extend our scenarios to include discussions of how to manage the growth, and recover the disk space consumed whenever possible. Other areas where we are considering the default disk space reserved include System restore and hibernation. On a disk constrained system, the 1GB or more reserved to support hibernation is costly and there may be ways to shrink the size of hiberfil.sys. System restore should be configurable, and default in all cases to the minimally useful number of snapshots vs. a blanket 15% of the system disk.
At WinHEC we had several machines on display with 16GB drives/partitions and on those you could see there was plenty of free disk space. Like all the benchmarks, measuring disk space on the pre-beta is not something we’re encouraging at this time.
In conclusion, as we develop Windows 7 it’s likely that the system footprint will be smaller than Windows Vista with the engineering efforts across the team which should allow for greater flexibility in system designs by PC manufacturers. We will do so with more attention to defaults, more control available to OEMs, end-users and IT pros, and will do so without compromising the reliability and robustness of Windows overall.
Would you say that drivers are the biggest "space hog" on a fresh Windows install?
An easy solution would be to check if the computer is connected to the Internet while installing, and if so offer the user the option to install the drivers or not. I'd be willing to bet that most computers that are connected to the Internet while installing will always be connected to the Internet.
As an English only user that's not at all interested in other languages, being able to remove things like IME components, various fonts, and Lang Pack Resources would save quite a bit of space, nearly as much as the Printer Drivers.
I haven't used Windows Update to grab drivers in Vista yet, but is it improved over XP? When a driver is large (or one's connection is slow), there's no way of knowing whether or not WU is working, or how much it's downloading unless you do something to watch the network traffic.
And there's a very easy way of timing when to clean out superseded files: When the user decides they no longer want to uninstall a service pack by running vsp1cln or some other built-in tool. Extending that to encompass individual patches wouldn't hurt either.
And while I'm a fan of having patches for components that aren't active already in place so I don't have to visit Windows Update after adding one, it'd be nice if the unpatched portions of those components weren't sitting on my hard drive anyway. Does Microsoft have a way of tracking metrics on component addition and removal? For desktops I'd expect that if it's not added shortly after setup, it's unlikely to be added at all. Maybe Windows could cache all the components on the hard drive during OS install, as it does now, but use some kind of heuristic (Hours of use? Just a wild guess.) to determine a point in time that a user isn't likely to be adding any more components and that it's a good time to purge that cache. If you wanted to be really tricky, you could try to gauge "most related" components so that if a user does add something later and needs the DVD, Windows would also cache some of those components for a while. Some way for power users and administrators to override this behavior would also be welcome, be it a GUI, CLI command, or registry change. It'd allow either fine tuning those timing parameters, an "always cache" and "never cache" list, and/or a "purge cache" option.
Just some ideas...
Ooo, disk space. The thing I absolutely hate about Windows in this respect isn't that it uses lots of disk space. I can live with that. 15GB is nothing when I've got 500GB harddrives.
The problem is that Windows decides that all of this must pretty much be one atomic blob. It's 15GB on *one* disk on *one* partition.
This causes problems in several ways. First, for those of us who prefer to keep Windows on a separate partition so we can easily reinstall it, it's hard to predict how much disk space to put on that partition. And later on, when you inevitably run out of disk space *anyway*, being able to move parts of Windows to a separate partition would be a lifesaver.
Of course this would be simpler to do for some components than others. I'm not suggesting that the \System32 folder should be able to be relocated, but all the rollback/uninstaller stuff could easily be relocated, both when installing Windows and later on, when the folder already exists. It seems like half the folders under \Windows don't need to be there at all. "Offline Web Pages"? Isn't that something my browser needs to keep track of, rather than the OS? The log files from each and every installation? I don't mind them existing, but why under \Windows? The place is crowded enough as it is. This is pretty much the same reason why I don't use the many folders you create by default under my user account folder. I don't want my many gigabytes of music stuffed into *that* folder on *that* partition. That's where I want to keep the data I work with and change on a regular basis. My music is much more static, I just need to listen to it. That can easily be somewhere else, even in a folder with no write access, I don't care, just not there.
And Program Files? Great, *one* folder for all programs, which quickly, especially once you install a few games, end up taking hundreds of gigabytes. The idea that all of a user's data can fit into two folders (User folder and Program Files) is just ridiculous.
I could (and do) use hardlinks and junction points to try to split this up, but this brings us to another important point. Pretty much nothing, except the core NTFS filesystem, takes these into account. As you showed, disk space usage is computed wrong because Explorer doesn't understand the various kinds of symlinks. Installers *always* fail to understand them as well. If I install something to D:\Games\SomeGame, the installer will check how much disk space is free on D:\, whether or not Games or SomeGame are junction points, which leads to applications refusing to install when there is plenty of disk space, or installers running out of disk space after they've passed the disk space check (which, coincidentally, causes the VS2008 installer to hang so it has to be killed from Task Manager).
Please, if the necessary APIs don't yet exist, add them so installers can compute *correct* disk space requirements, and for heaven's sake, make your *own* software (like Explorer) aware of the feature you added to the filesystem almost a decade ago.
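For what it's worth, the check being asked for can be sketched in a few lines of Python. This is an illustration, not what any real installer does: the two function names are made up, and it assumes Python 3.8 or later, where os.path.realpath resolves junctions and symbolic links on Windows. The idea is to find the deepest part of the target path that already exists, resolve any links in it, and ask for free space there instead of at the drive root.

```python
import os
import shutil


def nearest_existing_dir(path):
    """Walk up from path until a directory that already exists is found."""
    current = os.path.abspath(path)
    while not os.path.isdir(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the root without finding a directory
            raise FileNotFoundError(path)
        current = parent
    return current


def free_bytes_for_install(target):
    """Free space on the volume that would actually receive files written to target."""
    # Resolve junctions/symlinks so the query lands on the real destination volume.
    real_dir = os.path.realpath(nearest_existing_dir(target))
    return shutil.disk_usage(real_dir).free
```

An installer would compare free_bytes_for_install(r"D:\Games\SomeGame") against its requirement, so a Games folder that is a junction to another volume is measured on that volume.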
I already mentioned that I'd like to be able to relocate the folders containing uninstall/rollback information, which really has no business being hidden in \Windows, but another question is, why is it there at all? Yes, it's a valuable *option*, but is it something that all users need for everything they install, forever and ever?
Once a piece of software has turned out to install correctly, and work as expected, I don't really need the original installer any longer, do I? I can't imagine why I would ever uninstall a Windows service pack or other update. So why do I have dozens and dozens of $NtUninstall folders under \Windows?
Yes, for the paranoid sysadmin, that might be a useful option. But *option* is the keyword then, isn't it? The rest of us have no intention of ever uninstalling KB954211, whatever it is, so why do I need a $NtUninstallKB954211$ folder?
Finally, relying on Windows Update to install missing drivers is... interesting. I've never ever managed to find the correct driver there. Occasionally, it suggests I install some 6 month old driver, or more often, one with no version number or date at all.
Of course keeping it up to date with new drivers is a pretty big task, and I don't blame you for not offering the latest version of every driver. But failing to tell me which version you're offering to install is unacceptable, and makes Windows Update completely useless for drivers. I need to know which *version* of the driver you're installing, the date at which it was released, and if you're feeling friendly, detection of the currently installed driver version would be handy too. And of course, a link to the manufacturer so I can manually grab the latest driver in case you're offering a very old one.
Instead, what I see now is pretty much "New ethernet driver. See this link for more information", and then one of the infamous links Microsoft does so well, linking to a generic and totally useless help message saying I should contact my system administrator for more information.
It would be nice if there was an option to just include the most basic drivers: keyboard, mouse, monitor, basic video card functionality, IDE/SATA/RAID, and as many NIC/Wireless options as possible. As long as the system can install and boot with the most basic UI and 'net connection, everything else can be grabbed online. There are still some people who have a slow or nonexistent connection - give them the option for the full mess o' drivers. This may not be as important immediately after Windows is released, but a year down the road most expansion cards/motherboards/etc. are going to have updated drivers anyway, so there's no point in recognizing "advanced" functionality over "basic" functionality (in a video card, for instance) if it's just going to get replaced immediately anyway.
For the non-tech-savvy user, this would require an improvement in the relevance/quality of the Windows Update drivers...I've been offered various versions of a Dell printer driver for almost a year and it either never installs (though it never errors out) or it never picks up that it IS installed. That makes me skeptical of trusting any drivers from Windows Update, I just go straight to the manufacturer.
@Jalf: One of the reasons for leaving the option of uninstalling a patch or service pack available is, as you know, to afford for the case when a patch causes an issue. My question for you is: "When do you know that everything's working?" Some bugs are subtle and may not show up for some users for quite some time.
Also, isn't it up to the hardware vendors to supply Microsoft with drivers for Windows Update? If that's the case, blaming MS for driver revisions is misdirected. If I'm incorrect I apologize. I agree with you on the install experience though -- too many install failures and they should be revoked. Which brings us back to having older drivers on WU.
Yes, for most hardware the drivers are provided by the hardware company such as ATI, Nvidia, Logitech, etc. But there are many drivers that are not offered through Windows Update. I am not saying that every driver should be, but we need a higher number than we currently have. Also, what happens when a Windows Update driver install goes bad? The driver does not fail to install, so MS does not find out, because Windows Update says it installed correctly, which it did, but upon reboot the system fails to boot or causes some other issue.
Also the user experience is vastly different. Take a Vista SP1 machine and a newer Logitech webcam. Plug in the webcam without installing the driver first as the manual states you should. Allow Windows to find the driver. It installs, then it says hey, you don't have the Logitech QuickCam software, do you want to download it? Click yes, wait. It walks through the installer for the latest version of the software for your cam, then has you do one reboot. This is the best end user hardware install experience I have ever had. Why can't more work like this?
Steven: Could we get a post from the Windows Update driver team or whoever maintains that to know what is being done to improve accuracy of drivers from Windows Update?
Michael, another note. Please be careful about using obscure Microsoft internal lingo in a post like this.
It took me most of the way through the post to figure out that when you were using the word 'inbox', that was internal lingo for 'the set of things that ship in-the-box with windows'.
I think that will cause unnecessary confusion. Even just hyphenating it to in-box as in 'in the box' makes it more understandable.
Larger drives are a pretty poor reason to increase the footprint of any OS. Why can't I have a basic streamlined Windows OS without all the useless fluff pre-installed? All features, extra drivers, and eye candy should be optional installations. Example: Windows Aero adds absolutely no value to my gaming system (my primary use of Windows). I have it disabled completely but it still occupies space on my hard drive along with tons of other features I have never used in Windows (like the Narrator). The future success of Windows is dependent on a minimalist approach.
Uh-oh, post I was eager to see!
WinSxS and all that 'servicing stack': it's just plain wrong. It's an extremely ugly technology. It's a set of files that is not friendly to any human (and btw, what is HKLM\COMPONENTS?), and actually, no one cares if they're not real disk space - they are an enormous bottleneck for performance and compatibility (it seems that way to me).
Old programs used to scan hard drive choke opening it. Explorer hangs browsing it (although there is no reason to open it - it's non human-readable anyway). TrustedInstaller eating all CPU - what it exactly does? I'm afraid scanning through this huge directory and NTFS clot associated with it.
That's the one of the biggest things which makes me clench my teeth working with Vista.
Every time I see it, I think - who implemented such an ugly thing, and why? Who made the OS a slave to itself?
In W7 you guys overcame UAC, which was killing UX for technology's sake (and to teach developers). Can you fix that ugly WinSxS stick? I don't believe everyone is okay with it!
I would use Windows 7 - XP's time has run out, but please - please - one small request: if you can't make the servicing stack okay, can you make Windows\WinSxS super-hidden? Pre-Vista Explorer could mask folders with 'barricades', like %SystemDrive% or %ProgramFiles%.
How compressible is hiberfil.sys? I would think that it would compress very well. Most apps store their in-memory data in non-compressed format (like a C string, or arrays of structures that contain much repetitive data).
If it compresses well, why not use Zlib and put more burden on the CPU and less burden on the disk? I think most systems, when hibernating and waking up, have the CPU at <5% utilization, but have the disk at 100%. Supposing hiberfil.sys can compress down to 100MB, you're reading 100MB over slow disk instead of 1GB. In fact, if the disk is fast enough and the compression good enough, you could get rid of sleep altogether if the de-hibernation only takes 3-4 seconds.
I just wanted to put in another vote for the option to do an install where all but 'needed-to-boot-and-get-on-the-network' drivers are retrieved from Windows Update or an external-to-Windows folder that we can update.
For a great example, look at the System Center Configuration Manager installer. It allows you to run setup explicitly to download updates and put them in a folder to be used during a later install.
One thing that this would require is giving much more information to the user while the system is checking for drivers on Windows Update. Right now, when you plug in a piece of hardware and tell Windows to check WU, you are presented with a static dialog for a LONG time with no information. It always looks like the system has hung and it can take a LONG time.
I still don't quite get why the folder size calculations are not accurate. Can this not be improved?
My main issue with Vista and XP (and in fact even versions before XP) is that Windows seems to slow down quite a lot after a few months of using it and you can never get the "new PC" feel without formatting. It would be great to understand why that is and why there isn't a way (a tool or method) of restoring the original speed of the OS.
@caywen: I believe it's already compressed. But this fact doesn't save any diskspace because Windows has to prepare for the worst case situation: uncompressible ram content.
@d_e: Ah, I didn't know, thanks for correcting me. I made the assumption that it is not since it is the same size as physical RAM. I suppose it's the full size anyways for a reason.
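The worst-case point is easy to see with a quick Python experiment. This says nothing about how Windows actually writes hiberfil.sys; it only uses the standard zlib module to compare two hypothetical 1MB buffers, one made of repeated zero bytes and one made of random bytes standing in for incompressible memory contents.

```python
import os
import zlib


def compressed_ratio(data):
    """Compressed size divided by original size (lower means better compression)."""
    return len(zlib.compress(data)) / len(data)


# Highly repetitive content, like zero-filled or padded memory pages.
repetitive = b"\0" * 1_000_000

# Random bytes stand in for content that is already compressed or encrypted.
random_like = os.urandom(1_000_000)

repetitive_ratio = compressed_ratio(repetitive)
random_ratio = compressed_ratio(random_like)
```

The repetitive buffer shrinks to a tiny fraction of its size while the random buffer stays essentially the same size, so a reservation that must always succeed has to plan for the second case.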
OK, I have a quick question. I don't even own a printer. I don't want any of the print drivers at all. This pertains to XP and Vista. Is there a way to remove them? I just don't need them, and with my SSD being only 16 gig it would really be nice to have that extra space. Please let me know.