Engineering Windows 7

Support and Q&A for Solid-State Drives

There’s a lot of excitement around the potential for widespread adoption of solid-state drives (SSDs) for primary storage, particularly on laptops and also among many folks in the server world.  As with any new technology, its introduction requires us to revisit the assumptions baked into the overall system (OS, device support, applications), because the performance characteristics of the underlying hardware change.  This post looks at the way we have tuned Windows 7 to the current generation of SSDs.  This is a rapidly moving area: we expect there will continue to be ways to tune Windows, and we expect the technology itself to continue to evolve, perhaps introducing new tradeoffs or challenging other underlying assumptions.  Michael Fortin authored this post with help from many folks across the storage and fundamentals teams.  --Steven

Many of today’s Solid State Drives (SSDs) offer the promise of improved performance, more consistent responsiveness, increased battery life, superior ruggedness, quicker startup times, and noise and vibration reductions. With prices dropping precipitously, most analysts expect more and more PCs to be sold with SSDs in place of traditional rotating hard disk drives (HDDs).

In Windows 7, we’ve focused a number of our engineering efforts with SSD operating characteristics in mind. As a result, Windows 7’s default behavior is to operate efficiently on SSDs without requiring any customer intervention. Before delving into how Windows 7’s behavior is automatically tuned to work efficiently on SSDs, a brief overview of SSD operating characteristics is warranted.

Random Reads: A very good story for SSDs

SSDs tend to be very fast for random reads. Most SSDs thoroughly trounce traditional HDDs because the mechanical work required to position a rotating disk head isn’t required. As a result, the better SSDs can perform 4 KB random reads almost 100 times faster than the typical HDD (about 1/10th of a millisecond per read vs. roughly 10 milliseconds).
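
As a rough sanity check, those latencies translate directly into random-read throughput. Here is a minimal back-of-the-envelope sketch in C, using the 0.1 ms and 10 ms figures from the paragraph above as assumed per-read costs (not measurements):

    /* Back-of-the-envelope: convert the quoted per-read latencies into
       reads/sec and MB/s for 4 KB random reads. */
    #include <stdio.h>

    int main(void)
    {
        const double kb = 4.0;                      /* request size        */
        const double ssd_ms = 0.1, hdd_ms = 10.0;   /* latency assumptions */

        printf("SSD: %5.0f reads/s, %6.2f MB/s\n",
               1000.0 / ssd_ms, kb * (1000.0 / ssd_ms) / 1024.0);
        printf("HDD: %5.0f reads/s, %6.2f MB/s\n",
               1000.0 / hdd_ms, kb * (1000.0 / hdd_ms) / 1024.0);
        return 0;
    }

This works out to roughly 10,000 reads per second (about 39 MB/s) for the SSD versus about 100 reads per second (under 0.4 MB/s) for the HDD, the same order of magnitude as the random read rates reported in the FAQ section below.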

Sequential Reads and Writes: Also Good

Sequential read and write operations range from quite good to superb. Because flash chips can be configured in parallel and data spread across the chips, today’s better SSDs can read sequentially at rates greater than 200 MB/s, which is close to double the rate many 7200 RPM drives can deliver. For sequential writes, we see some devices greatly exceeding the rates of typical HDDs, and most SSDs doing fairly well in comparison. In today’s market, there are still considerable differences in sequential write rates between SSDs. Some greatly outperform the typical HDD, others lag by a bit, and a few are poor in comparison.

Random Writes & Flushes: Your mileage will vary greatly

The differences in sequential write rates are interesting to note, but for most users they won’t make for as notable a difference in overall performance as random writes.

What’s a long time for a random write? Well, an average HDD can typically move a 4 KB random write to its spinning media in 7 to 15 milliseconds, which has proven to be largely unacceptable. As a result, most HDDs come with 4, 8 or more megabytes of internal memory and attempt to cache small random writes rather than wait the full 7 to 15 milliseconds. When they do cache a write, they return success to the OS even though the bytes haven’t been moved to the spinning media. We typically see these cached writes completing in a few hundred microseconds (10X, 20X or even faster than actually writing to the spinning media). In looking at millions of disk writes from thousands of telemetry traces, we observe 92% of 4 KB or smaller IOs taking less than 1 millisecond, 80% taking less than 600 microseconds, and an impressive 48% taking less than 200 microseconds. Caching works!

On occasion, we’ll see HDDs struggle with bursts of random writes and flushes. Drives that cache too much for too long, and then get caught with too large a backlog of work when a flush comes along, have proven to be problematic. These flushes and surrounding IOs can have considerably lengthened response times. We’ve seen some devices take a half second to a full second to complete individual IOs and take tens of seconds to return to a more consistently responsive state. For the user, this can be awful to endure as responsiveness drops to painful levels. Think of it: the response time for a single I/O can range from 200 microseconds up to a whopping 1,000,000 microseconds (1 second).

When presented with realistic workloads, we see the worst of the SSDs producing very long IO times as well, as much as one half to one full second to complete individual random write and flush requests. This is abysmal for many workloads and can make the entire system feel choppy, unresponsive and sluggish.

Random Writes & Flushes: Why is this so hard?

For many, the notion that a purely electronic SSD can have more trouble with random writes than a traditional HDD seems hard to comprehend at first. After all, SSDs don’t need to seek and position a disk head above a track on a rotating disk, so why would random writes present such a daunting challenge?

The answer to this takes quite a bit of explaining; Anand’s article admirably covers many of the details. We highly encourage motivated folks to take the time to read it, as well as this fine USENIX paper. In an attempt to avoid covering too much of the same material, we’ll just make a handful of points.

  • Most SSDs are composed of flash cells (either SLC or MLC). It is possible to build SSDs out of DRAM. These can be extremely fast, but also very costly and power hungry. Since these are relatively rare, we’ll focus our discussion on the much more popular NAND flash based SSDs. Future SSDs may take advantage of nonvolatile memory technologies other than flash.
  • A flash cell is really a trap, a trap for electrons, and electrons don’t like to be trapped. Consider this: if placing 100 electrons in a flash cell constitutes a bit value of 0, and fewer means the value is 1, then the controller logic may have to consider 80 to 120 as the acceptable range for a bit value of 0. A range is necessary because some electrons may escape the trap, others may fall into the trap when attempting to fill nearby cells, etc. As a result, some very sophisticated error correction logic is needed to ensure data integrity.
  • Flash chips tend to be organized in complex arrangements, such as blocks, dies, planes and packages. The size, arrangement, parallelism, wear, interconnects and transfer speed characteristics of which can and do vary greatly.
  • Flash cells need to be erased before they can be written. You simply can’t trust that a flash cell has no residual electrons in it before use, so cells need to be erased before filling with electrons. Erasing is done on a large scale. You don’t erase a cell; rather you erase a large block of cells (like 128 KB worth). Erase times are typically long -- a millisecond or more. The sketch after this list works through what that implies for the cost of a small random write.
  • Flash wears out. At some point, a flash cell simply stops working as a trap for electrons. If frequently updated data (e.g., a file system log file) was always stored in the same cells, those cells would wear out more quickly than cells containing read-mostly data. Wear leveling logic is employed by flash controller firmware to spread out writes across a device’s full set of cells. If done properly, most devices will last years under normal desktop/laptop workloads.
  • It takes some pretty clever device physicists and some solid engineering to trap electrons at high speed, to do so without errors, and to keep the devices from wearing out unevenly. Not all SSD manufacturers are as far along as others in figuring out how to do this well.
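
To make the erase-block point concrete, here is a small, purely illustrative C sketch of the worst case for a 4 KB random write: no pre-erased page is available, so the controller must read the whole 128 KB block, merge the new data, erase, and write the block back. The sizes and timings are the illustrative figures from the list above plus an assumed 200 MB/s internal transfer rate; real controllers use remapping and spare area precisely to avoid this path.

    /* Worst-case cost of a 4 KB random write when a full
       read-merge-erase-write cycle is forced. All figures are
       assumptions for illustration, not device measurements. */
    #include <stdio.h>

    int main(void)
    {
        const double block_kb = 128.0;  /* erase-block size from the list above */
        const double write_kb = 4.0;    /* the application's actual write       */
        const double erase_ms = 1.0;    /* "a millisecond or more"              */
        const double xfer_mb_s = 200.0; /* assumed internal transfer rate       */

        /* Read 128 KB out, then write 128 KB back, plus the erase itself. */
        double xfer_ms  = 2.0 * (block_kb / 1024.0) / xfer_mb_s * 1000.0;
        double total_ms = xfer_ms + erase_ms;

        printf("4 KB write with forced merge: %.2f ms\n", total_ms);
        printf("Write amplification: %.0fx (%.0f KB moved for %.0f KB of data)\n",
               block_kb / write_kb, block_kb, write_kb);
        return 0;
    }

Even with these generous assumptions, the forced merge costs over 2 milliseconds for a 4 KB write (versus roughly 0.1 ms for a read), and moves 32 times more data than the application asked for. This is why controller quality and spare-area management matter so much for random write performance.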

Performance Degradation Over Time, Wear, and Trim

As mentioned above, flash blocks and cells need to be erased before new bytes can be written to them. As a result, newly purchased devices (with all flash blocks pre-erased) can perform notably better at purchase time than after considerable use. While we’ve observed this performance degradation ourselves, we do not consider this to be a show stopper. In fact, except via benchmarking measurements, we don’t expect users to notice the drop during normal use.

Of course, device manufacturers and Microsoft want to maintain superior performance characteristics as best we can. One can easily imagine the better SSD manufacturers attempting to overcome the aging issues by pre-erasing blocks so the performance penalty is largely unrealized during normal use, or by maintaining a large enough spare area to absorb short bursts of writes. SSD drives designed for the enterprise may have as much as 50% of their space reserved in order to provide lengthy periods of high sustained write performance.

In addition to the above, Microsoft and SSD manufacturers are adopting the Trim operation. In Windows 7, if an SSD reports it supports the Trim attribute of the ATA protocol’s Data Set Management command, the NTFS file system will request the ATA driver to issue the new operation to the device when files are deleted and it is safe to erase the SSD pages backing the files. With this information, an SSD can plan to erase the relevant blocks opportunistically (and lazily) in the hope that subsequent writes will not require a blocking erase operation since erased pages are available for reuse.

As an added benefit, the Trim operation can help SSDs reduce wear by eliminating the need for many merge operations to occur. As an example, consider a single 128 KB SSD block that contained a 128 KB file. If the file is deleted and a Trim operation is requested, then the SSD can avoid having to mix bytes from the SSD block with any other bytes that are subsequently written to that block. This reduces wear.

Windows 7 requests the Trim operation for more than just file delete operations. The Trim operation is fully integrated with partition- and volume-level commands like Format and Delete, with file system commands relating to truncate and compression, and with the System Restore (aka Volume Snapshot) feature.
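
For the curious, Windows 7 also surfaces the Data Set Management (Trim) operation to user mode via the IOCTL_STORAGE_MANAGE_DATA_SET_ATTRIBUTES device control. The sketch below shows the shape of such a request; it is illustrative only -- the device path and byte range are placeholders, and trimming a range on a live volume discards its contents. In normal operation NTFS issues these requests itself, as described above.

    /* Illustrative only: the shape of a Trim (Data Set Management)
       request as surfaced by the Windows 7 storage stack. NTFS sends
       these itself when clusters are freed; the device path and byte
       range below are made-up placeholders. Requires the Windows 7 SDK
       headers; run as Administrator. */
    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE h = CreateFileW(L"\\\\.\\PhysicalDrive1",   /* hypothetical SSD */
                               GENERIC_READ | GENERIC_WRITE,
                               FILE_SHARE_READ | FILE_SHARE_WRITE,
                               NULL, OPEN_EXISTING, 0, NULL);
        if (h == INVALID_HANDLE_VALUE) return 1;

        unsigned long long buf[8] = { 0 };  /* zeroed, 8-byte-aligned scratch */
        DEVICE_MANAGE_DATA_SET_ATTRIBUTES *attrs =
            (DEVICE_MANAGE_DATA_SET_ATTRIBUTES *)buf;
        DWORD rangeOff = (sizeof(*attrs) + 7) & ~(DWORD)7;  /* align range array */
        DEVICE_DATA_SET_RANGE *range =
            (DEVICE_DATA_SET_RANGE *)((BYTE *)buf + rangeOff);

        attrs->Size = sizeof(*attrs);
        attrs->Action = DeviceDsmAction_Trim;
        attrs->DataSetRangesOffset = rangeOff;
        attrs->DataSetRangesLength = sizeof(*range);
        range->StartingOffset = 10ll * 1024 * 1024;  /* placeholder offset */
        range->LengthInBytes  = 1ull * 1024 * 1024;  /* placeholder length */

        DWORD bytes = 0;
        BOOL ok = DeviceIoControl(h, IOCTL_STORAGE_MANAGE_DATA_SET_ATTRIBUTES,
                                  buf, rangeOff + sizeof(*range),
                                  NULL, 0, &bytes, NULL);
        printf("Trim %s\n", ok ? "sent" : "not supported by this device/driver");
        CloseHandle(h);
        return 0;
    }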

Windows 7 Optimizations and Default Behavior Summary

As noted above, all of today’s SSDs have considerable work to do when presented with disk writes and disk flushes. Windows 7 tends to perform well on today’s SSDs, in part, because we made many engineering changes to reduce the frequency of writes and flushes. This benefits traditional HDDs as well, but is particularly helpful on today’s SSDs.

Windows 7 will disable disk defragmentation on SSD system drives. Because SSDs perform extremely well on random read operations, defragmenting files isn’t helpful enough to warrant the added disk writing defragmentation produces. The FAQ section below has some additional details.

By default, Windows 7 will disable Superfetch, ReadyBoost, and boot and application launch prefetching on SSDs with good random read, random write and flush performance. These technologies were all designed to improve performance on traditional HDDs, where random read performance could easily be a major bottleneck. See the FAQ section for more details.

Since SSDs tend to perform at their best when the operating system’s partitions are created with the SSD’s alignment needs in mind, all of the partition-creating tools in Windows 7 place newly created partitions with the appropriate alignment.
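
The arithmetic involved is simple. Here is a sketch assuming a 1 MB alignment granularity, a convenient multiple of typical SSD page and erase-block sizes (the post does not spell out the exact value the Windows 7 tools use). The legacy CHS-style start of sector 63 shown below is exactly the kind of placement that straddles SSD page boundaries:

    /* Round a proposed partition start up to an aligned boundary.
       The 1 MB granularity is an assumption for illustration. */
    #include <stdio.h>

    static unsigned long long align_up(unsigned long long offset,
                                       unsigned long long granularity)
    {
        return ((offset + granularity - 1) / granularity) * granularity;
    }

    int main(void)
    {
        const unsigned long long one_mb = 1024ULL * 1024;
        unsigned long long legacy = 63ULL * 512;  /* sector 63 * 512 bytes */

        printf("legacy start:  %llu bytes (misaligned)\n", legacy);
        printf("aligned start: %llu bytes\n", align_up(legacy, one_mb));
        return 0;
    }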

Frequently Asked Questions

Before addressing some frequently asked questions, we’d like to remind everyone that the future of SSDs in mobile and desktop PCs (as well as enterprise servers) looks very bright to us. SSDs can deliver on the promise of improved performance, more consistent responsiveness, increased battery life, superior ruggedness, quicker startup times, and noise and vibration reductions. With prices steadily dropping and quality on the rise, we expect more and more PCs to be sold with SSDs in place of traditional rotating HDDs. With that in mind, we focused an appropriate amount of our engineering efforts towards ensuring Windows 7 users have great experiences on SSDs.

Will Windows 7 support Trim?

Yes. See the above section for details.

Will disk defragmentation be disabled by default on SSDs?

Yes. The automatic scheduling of defragmentation will exclude partitions on devices that declare themselves as SSDs. Additionally, if the system disk has random read performance characteristics above the threshold of 8 MB/sec, then it too will be excluded. The threshold was determined by internal analysis.

The random read threshold test was added to the final product to address the fact that few SSDs on the market today properly identify themselves as SSDs. 8 MB/sec is a relatively conservative rate. While none of our tested HDDs could approach 8 MB/sec, all of our tested SSDs exceeded that threshold. SSD performance ranged between 11 MB/sec and 130 MB/sec. Of the 182 HDDs tested, only 6 configurations managed to exceed 2 MB/sec on our random read test. The other 176 ranged between 0.8 MB/sec and 1.6 MB/sec.
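
In sketch form, the exclusion rule described above amounts to the following (the function and parameter names are invented for illustration; the real check lives inside the defragmentation scheduler):

    /* Sketch of the scheduling exclusion: skip a disk if it declares
       itself an SSD, or if its measured 4 KB random read rate exceeds
       the 8 MB/sec threshold described in the text. */
    #include <stdbool.h>
    #include <stdio.h>

    static bool should_schedule_defrag(bool reports_as_ssd,
                                       double random_read_mb_per_s)
    {
        const double threshold_mb_per_s = 8.0;
        if (reports_as_ssd) return false;
        if (random_read_mb_per_s > threshold_mb_per_s) return false;
        return true;
    }

    int main(void)
    {
        printf("HDD @ 1.2 MB/s: %s\n",
               should_schedule_defrag(false, 1.2) ? "defrag" : "skip");
        printf("unreported SSD @ 11 MB/s: %s\n",
               should_schedule_defrag(false, 11.0) ? "defrag" : "skip");
        return 0;
    }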

Will Superfetch be disabled on SSDs?

Yes, for most systems with SSDs.

If the system disk is an SSD, and the SSD performs adequately on random reads and doesn’t have glaring performance issues with random writes or flushes, then Superfetch, boot prefetching, application launch prefetching, ReadyBoost and ReadyDrive will all be disabled.

Initially, we had configured all of these features to be off on all SSDs, but we encountered sizable performance regressions on some systems. In root-causing those regressions, we found that some first generation SSDs had severe enough random write and flush problems that they ultimately led to disk reads being blocked for long periods of time. With Superfetch and other prefetching re-enabled, performance on key scenarios was markedly improved.

Is NTFS Compression of Files and Directories recommended on SSDs?

Compressing files helps save space, but the effort of compressing and decompressing requires extra CPU cycles and therefore power on mobile systems. That said, for infrequently modified directories and files, compression is a fine way to conserve valuable SSD space and can be a good tradeoff if space is truly at a premium.

We do not, however, recommend compressing files or directories that will be written to with great frequency. Your Documents directory and files are likely to be fine, but temporary internet directories or mail folder directories aren’t such a good idea, because they receive large numbers of file writes in bursts.
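
For those who want to apply compression selectively from code rather than from Explorer or compact.exe, the FSCTL_SET_COMPRESSION file system control does the work. A minimal sketch follows; the file path is a placeholder:

    /* Enable NTFS compression on a single, rarely modified file.
       COMPRESSION_FORMAT_DEFAULT selects NTFS's standard algorithm. */
    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE h = CreateFileW(L"C:\\archive\\old-report.txt",  /* placeholder */
                               GENERIC_READ | GENERIC_WRITE,
                               FILE_SHARE_READ, NULL, OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE) return 1;

        USHORT format = COMPRESSION_FORMAT_DEFAULT;
        DWORD bytes = 0;
        BOOL ok = DeviceIoControl(h, FSCTL_SET_COMPRESSION,
                                  &format, sizeof(format),
                                  NULL, 0, &bytes, NULL);
        printf("compression %s\n", ok ? "enabled" : "failed");
        CloseHandle(h);
        return 0;
    }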

Does the Windows Search Indexer operate differently on SSDs?

No.

Is BitLocker’s encryption process optimized to work on SSDs?

Yes, on NTFS. When BitLocker is first configured on a partition, the entire partition is read, encrypted and written back out. As this is done, the NTFS file system will issue Trim commands to help the SSD optimize its behavior.

We do encourage users concerned about their data privacy and protection to enable BitLocker on their drives, including SSDs.

Does Media Center do anything special when configured on SSDs?

No. While SSDs do have advantages over traditional HDDs, SSDs are more costly per GB than their HDD counterparts. For most users, an HDD optimized for media recording is a better choice, as media recording and playback workloads are largely sequential in nature.

Does Write Caching make sense on SSDs and does Windows 7 do anything special if an SSD supports write caching?

Some SSD manufacturers include RAM in their devices for more than just their control logic; they mimic the behavior of traditional disks by caching writes, and possibly reads. For devices that do cache writes in volatile memory, Windows 7 expects flush commands and write-ordering to be preserved to at least the same degree as on traditional rotating disks. Additionally, Windows 7 expects user settings that disable write caching to be honored by write-caching SSDs just as they are on traditional disks.
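
From an application’s point of view, this flush expectation is what makes APIs like FlushFileBuffers meaningful on SSDs: the call must not return until cached writes are on stable media. A minimal sketch, with a placeholder file name:

    /* Write a record, then force it (and everything written before it)
       out of any volatile caches. A write-caching SSD is expected to
       honor the flush just as a rotating disk would. */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE h = CreateFileW(L"C:\\temp\\journal.dat",  /* placeholder */
                               GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                               FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE) return 1;

        const char record[] = "committed-record";
        DWORD written = 0;
        WriteFile(h, record, sizeof(record), &written, NULL);

        /* Durability point: data may sit in OS and drive caches until here. */
        if (!FlushFileBuffers(h))
            printf("flush failed: %lu\n", GetLastError());

        CloseHandle(h);
        return 0;
    }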

Do RAID configurations make sense with SSDs?

Yes. The reliability and performance benefits one can obtain via HDD RAID configurations can be had with SSD RAID configurations.

Should the pagefile be placed on SSDs?

Yes. Most pagefile operations are small random reads or larger sequential writes, both of which are types of operations that SSDs handle well.

In looking at telemetry data from thousands of traces and focusing on pagefile reads and writes, we find that:

  • Pagefile.sys reads outnumber pagefile.sys writes by about 40 to 1.
  • Pagefile.sys read sizes are typically quite small, with 67% less than or equal to 4 KB, and 88% less than 16 KB.
  • Pagefile.sys writes are relatively large, with 62% greater than or equal to 128 KB and 45% being exactly 1 MB in size.

In fact, given typical pagefile reference patterns and the favorable performance characteristics SSDs have on those patterns, there are few files better than the pagefile to place on an SSD.

Are there any concerns regarding the Hibernate file and SSDs?

No, hiberfile.sys is written to and read from sequentially and in large chunks, and thus can be placed on either HDDs or SSDs.

What Windows Experience Index changes were made to address SSD performance characteristics?

In Windows 7, there are new random read, random write and flush assessments. Better SSDs can score above 6.5 all the way to 7.9. To be included in that range, an SSD has to have outstanding random read rates and be resilient to flush and random write workloads.

In the Beta timeframe of Windows 7, there was a capping of scores at 1.9, 2.9 or the like if a disk (SSD or HDD) didn’t perform adequately when confronted with our random write and flush assessments. Feedback on this was pretty consistent, with most feeling the level of capping to be excessive. As a result, we now simply restrict SSDs with performance issues from joining the newly added 6.0+ and 7.0+ ranges. SSDs that are not solid performers across all assessments effectively get scored in a manner similar to what they would have been in Windows Vista, gaining no Win7 boost for great random read performance.

Comments
  • Good read, especially the tip about putting a pagefile on the SSD.  I would've expected the pagefile to be bad for the SSD because it would wear out the flash faster!

  • @m.oreilly:

    Yes, fsutil shows the same value in the Beta and the RC, because Trim is supported and enabled by default in both the Beta and the RC. ;)

    I should add that very few (if any) SSD drives in the marketplace today actually support Trim.  Most of the ones that do are next-generation prototypes.  But when they do become available, Windows 7 will take advantage.

    - Craig (NTFS team)

  • I plopped an SSD into build 7100 RC and, as stated, it works like a charm.

    There are a lot of really great things happening with this Operating System, I have people who come to me for their computer needs testing the RC...all of them will be purchasing the Final Release.

    Thnx

  • All "Certified for Windows 7" logoed drives will support trim right?

  • @Craig (NTFS team):

    <quote>

    Having Trim enabled according to this setting, which you do, means that the filesystem will send Trim commands down the storage stack.  The filesystem doesn't actually know whether this command will be supported or not at a lower level.  When the disk driver receives the command, it will either act on it or ignore it.

    </quote>

    So do the Microsoft IDE Controller disk drivers support it? If you have a 3rd party disk driver provider (i.e. non-Microsoft), then I guess you rely on them implementing the Trim() functionality?

    Thanks for your quick responses!

    -krish

  • thanks for the clarification. is this the final form of trim to be implemented in win7 (barring any future rtm update/sp)? i am using vertex drives which have, at this point, a somewhat proprietary, functioning trim fw, though we are expecting an ms compatible version in about a week's time. can i assume you are "good to go" re win7 trim, and that drive controller manufacturers will have what they need?

  • I'm using the Windows 7 RC and I love it. I have just one suggestion: make unpinning an icon from the taskbar work like drag-and-drop. To pin, you drag the icon to the taskbar (as in the Win7 RC); to unpin, you drag the icon to the desktop and it disappears.

    ;)

    Thanks...

  • PLEASE PLEASE PUT THE CLASSIC START MENU OPTION BACK IN WIN 7!!!!

  • <<"I should add that very few (if any) SSD drives in the marketplace today actually support Trim.  Most of the ones that do are next-generation prototypes.  But when they do become available, Windows 7 will take advantage.

    - Craig (NTFS team)">>

    Thanks for your input so far Craig.. much appreciated.

    While the bigger folk sort out SSD 'logo' reqs etc.. are there any 'issues' with a little fella like me trying to disable Win 7 (RC) native TRIM and using a brute-force proprietary trimming 'tool', which would work really well for me on a customised 'schedule' basis? I intend to use SATA Native IDE mode and default Microsoft ATA/ATAPI device drivers with a combined MLC/SLC implementation.

    By 'issues' I refer to legal/proprietary as well as technical.. my assumption being that Procmon can keep me reasonably well informed of what's going on 'technically' between the OS and my SSDs.

    Regards,

  • Hi,

    Just installed the RC and there's one thing strange to me: sharing is enabled by default for the USERS folder (not the public folder) and from my other computer I can access everything in this folder and perform operations (read/copy/delete). It was a clean Win7 install and of course I haven't changed the sharing policy for myself.

    I disabled the sharing and now it is as it should be.

  • At what level does Win 7 disable defragmentation, Superfetch, ReadyBoost etc for SSDs?  I have an SSD as the system drive in my Win 7 RC machine, and the defrag GUI shows scheduled defrag as turned on, and in the schedule, select disks includes the SSD.  Is defrag for SSDs disabled at some lower level that the GUI does not show?

    I manually turned off scheduled defrag, suspecting that my SSD was not being handled correctly by the Operating System.

    When I run fsutil behavior query DisableDeleteNotify, I get a 0.  Does this mean that my SSD is being correctly recognised as an SSD?

    I do notice that on the ReadyBoost tab for an SD card I put into my machine, it says "This device cannot be used for ReadyBoost.   ReadyBoost is not enabled because the system disk is fast enough...."

    Finally, is there a way for me to determine what rotational speed my SSD is reporting to the Operating System?  fsutil?  wmi call?

  • It is important to note that, for IOPS-intensive enterprise storage, there are alternatives to a "NAND only SSD".  One alternative, the DDRdrive X1, elegantly avoids all of the above-mentioned limitations of Flash by using DRAM for IO reads/writes and Flash solely for backup/restore.

    1) No LBA remapping, thus no wear leveling overhead.

    2) Deterministic performance, no pauses or stuttering when dealing with writes - ever.

    3) No OS alignment issues, i.e. no performance penalty on Windows XP.

    4) Unlimited reads/writes - no fundamental wear mechanism until drive performs a backup/restore.

    5) Defragmentation can be turned on, with resulting performance increases.

    The above benefits do come with both capacity and price tradeoffs.  Certain applications, for example databases, can be architected to easily work within these constraints.

    The drive for speed,

    Christopher George

    Founder/CTO

    DDRdrive LLC

    www.ddrdrive.com

  • @krish4u:

    I'm not an expert on our storage drivers (I deal mainly at the file system level), but it appears that our ATA port driver (ataport) does implement trim support.  This means that SSD drives which present themselves as ATA drives (which I think most if not all do), can support trim provided the drive itself also supports trim.  Non-ATA devices -- including USB drives and SCSI drives -- don't yet have the ability to support trim, since our other port drivers don't implement trim.  This may change as the market evolves.  I don't know if any 3rd-party storage drivers implement trim as of yet, but yes, they would have to implement it for it to work.

    @m.oreilly:

    As far as Win7 RTM goes, trim is in its final form.  Of course it could evolve in service packs, etc., as the market demands.  It's a pretty new market.  Drive manufacturers know what they need to implement on their end.  Some have provided prototypes to us for testing.

    @me&er:

    I doubt there are any legal issues with you sending down trim commands yourself, but it sounds like an awful lot of work to me.  Firstly I'm skeptical that you can even get the proper information you need.  I don't think you can infer from ProcMon output alone what clusters are in use and what aren't.  The file system knows this; what makes you think you can do better?  If you get this wrong, you can end up corrupting the volume.  Secondly, I'm not sure what the benefit of your approach would be even if you could get it right.  By having the file system send down trim commands when appropriate, you enable the drive to immediately benefit from this information.  There's very little overhead to these commands.  Contrast this with defragging, where if you were constantly defragging everything the cost would outweigh the benefits.  I strongly suggest you don't try to implement this yourself.

    @lukechip:

    That fsutil query just tells you whether the file system is sending down trim commands or not.  The file system doesn't know (or much care) what kind of storage lies at the very bottom; it might even be multiple types (think volumes that span multiple disks, RAID arrays, etc.).  If trim is enabled, NTFS sends down trim commands on all volumes and lets the underlying layers sort it out.  I'm not sure how you can get the physical characteristics you want about your SSD drive.  As a start, poke around at its properties in Device Manager (devmgmt.msc).

  • <<QUOTE>> @me&er:

    I doubt there are any legal issues with you sending down trim commands yourself, but it sounds like an awful lot of work to me.  [...]  I strongly suggest you don't try to implement this yourself.

    Tuesday, May 12, 2009 3:37 PM by craigbarkhouse <</QUOTE>>

    Many thanks Craig. Appreciate your responses.

    Just to put the record straight, I’m not looking at programming an alternative to TRIM and have no wish to compromise

    I’m using a proprietary TRIM tool in beta at the ‘mo from OCZ/Indilinx for their Vertex series with the Barefoot controller. If you guys haven’t got it yet.. I would give it a go. I’m not marketing it at all.. all I am interested in is confirming that it actually consolidates the free space effectively and without excessive ‘overhead/nand longevity issues’. My limited testing ability confirms it initiates fine in cmd/conhost.exe and works along the FS stack with ‘native’ Microsoft storage drivers, writing to the MFT and paging to file effectively with no file corruption identified.. the end result being a noticeable increase in random read/write speed at little or no cost to sequential read/write. Calculations for the longevity of the specific MLC drive are somewhat complex but nevertheless quite acute on the 30GB models, so I will need to compare this proprietary tool ‘initiated’ usage against the FS one.

    TRIM or trimming or defragmentation/consolidation of free space is much the same for my needs.. what is important is fitting this in to how it affects the smaller capacity MLC SSD, as part of a write optimisation strategy.

    This strategy involves looking at controller/FW Wear Leveling and how it interacts with Win 7 with both SLC and MLC SSD, which is why any form of TRIM and Device/OS write caching is right at the top of my analysis. Neatly brings me onto this question:

    If..

    <<QUOTE>> for devices that do cache writes in volatile memory, Windows 7 expects flush commands and write-ordering to be preserved to at least the same degree as traditional rotating disks... <</QUOTE>>

    Q. What’s the reasoning behind Win 7’s default ‘enable’ of OS cache/buffer ‘flushing’, as opposed to the default ‘disabled’ in Vista/XP (albeit described as Advanced Performance afaik)?

    Regards,

  • I have noted that W7 is massively faster on my SSDs than XP.  In my research, I found that Windows XP/2003 have three major storage drivers (scsiport.sys, atapi.sys and storport.sys).

    I have found that SSDs fly under storport RAID controllers and don't do as well under atapi.sys.

    I understand that the old scsiport.sys and atapi.sys use a form of the C-LOOK disk scheduling algorithm, while storport does no software scheduling (leaving scheduling up to the controller).

    Q.  Since SSDs do not have moving parts, does the lack of C-LOOK overhead account for the performance increase with storport?

    If so, what scheduling changes, if any, does the new ataport in Windows 7 undertake when an SSD is detected?

    Regards
