Cache is used to reduce the performance impact when accessing data that resides on slower storage media. Without it your PC would crawl along and become nearly unusable. If data or code pages for a file reside on the hard disk, it can take the system 10 milliseconds to access the page. If that same page resides in physical RAM, it can take the system 10 nanoseconds to access the page. Access to physical RAM is about 1 million times faster than to a hard drive. It would be great if we could load up all the contents of the hard drive into RAM, but that scenario is cost prohibitive and dangerous. Hard disk space is far less costly and is non-volatile (the data is persistent even when disconnected from a power source).
Since we are limited in how much RAM we can put in a box, we have to make the most of it. We have to share this crucial physical resource among all running processes, the kernel and the file system cache. You can read more about how this works in "The Memory Shell Game" post.
The file system cache resides in kernel address space. It is used to buffer access to the much slower hard drive. The file system cache will map and unmap sections of files based on access patterns, application requests and I/O demand. The file system cache operates like a process working set. You can monitor the size of your file system cache's working set using the Memory\System Cache Resident Bytes performance monitor counter. This value will only show you the system cache's current working set. Once a page is removed from the cache's working set it is placed on the standby list. You should consider the standby pages from the cache manager as a part of your file cache. You can also consider these standby pages to be available pages. This is what the pre-Vista Task Manager does. Most of what you see as available pages is probably standby pages for the system cache. Once again, you can read more about this in "The Memory Shell Game" post.
Too Much Cache is a Bad Thing
The memory manager works on a demand-based algorithm. Physical pages are given to wherever the current demand is. If the demand isn't satisfied, the memory manager will start pulling pages from other areas, scrub them, and send them to help meet the growing demand. Just like any process, the system file cache can consume physical memory if there is sufficient demand.
Having a lot of cache is generally not a bad thing, but if it comes at the expense of other processes it can be detrimental to system performance. There are two ways this can occur: excessive cached write I/O and excessive cached read I/O.
Excessive Cached Write I/O
Applications and services can dump lots of write I/O to files through the system file cache. The system cache's working set will grow as it buffers this write I/O. System threads will start flushing these dirty pages to disk. Typically the disk can't keep up with the I/O speed of an application, so the writes get buffered into the system cache. At a certain point the cache manager will reach a dirty page threshold and start to throttle I/O into the cache manager. It does this to prevent applications from overtaking physical RAM with write I/O. There are, however, some isolated scenarios where this throttle doesn't work as well as we would expect. This could be due to bad applications or drivers, or to not having enough memory. Fortunately, we can tune the number of dirty pages allowed before the system starts throttling cached write I/O. This is handled by the SystemCacheDirtyPageThreshold registry value as described in Knowledge Base article 920739: http://support.microsoft.com/default.aspx?scid=kb;EN-US;920739
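As a sketch of what that tuning looks like, here is a .reg fragment. The key and value name follow the layout described in the KB article; the dword data shown is purely a placeholder, not a recommendation - consult KB 920739 for an appropriate threshold for your system:

```reg
Windows Registry Editor Version 5.00

; SystemCacheDirtyPageThreshold caps the number of dirty pages the
; cache manager holds before throttling cached write I/O.
; The data below (0x64 = 100) is illustrative only.
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management]
"SystemCacheDirtyPageThreshold"=dword:00000064
```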
Excessive Cached Read I/O
While the SystemCacheDirtyPageThreshold registry value can tune the number of write/dirty pages in physical memory, it does not affect the number of read pages in the system cache. If an application or driver opens many files and actively reads from them continuously through the cache manager, the memory manager will move more physical pages to the cache manager. If this demand continues to grow, the cache manager can grow to consume physical memory, and other processes (with less memory demand) will get paged out to disk. This read I/O demand may be legitimate or may be due to poor application scalability. The memory manager doesn't know whether the demand is due to bad behavior or not, so pages are moved simply because there is demand for them. On a 32 bit system, the file system cache working set is essentially limited to 1 GB. This is the maximum size that we blocked off in the kernel for the system cache working set. Since most systems today have more than 1 GB of physical RAM, having the system cache working set consume physical RAM with read I/O is less likely.
This scenario, however, is more prevalent on 64 bit systems. With the increase in pointer length, the kernel's address space is greatly expanded. The system cache's working set limit can, and typically does, exceed how much memory is installed in the system. It is much easier for applications and drivers to load up the system cache with read I/O. If the demand is sustained, the system cache's working set can grow to consume physical memory. This pushes other process and kernel resources out to the page file and can be very detrimental to system performance.
Fortunately we can also tune the server for this scenario. We have added two APIs to query and set the system file cache size - GetSystemFileCacheSize() and SetSystemFileCacheSize(). We chose to implement this tuning option via API calls to allow setting the cache working set size dynamically. I’ve uploaded the source code and compiled binaries for a sample application that calls these APIs. The source code can be compiled using the Windows DDK, or you can use the included binaries. The 32 bit version is limited to setting the cache working set to a maximum of 4 GB. The 64 bit version does not have this limitation. The sample code and included binaries are completely unsupported. It is just a quick and dirty implementation with little error handling.
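As an illustration, here is a minimal sketch of what such a tool might do. This is not the sample's actual source; the minimum/maximum sizes and the hard-limit flag are assumptions for the example, the code is Windows-only, and SetSystemFileCacheSize requires the caller to hold SeIncreaseQuotaPrivilege:

```c
/* Sketch: query the current system file cache limits, then cap the
   maximum at 1 GB as a hard limit. Compile against the Windows SDK. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    SIZE_T minSize, maxSize;
    DWORD flags;

    if (!GetSystemFileCacheSize(&minSize, &maxSize, &flags)) {
        fprintf(stderr, "GetSystemFileCacheSize failed: %lu\n",
                GetLastError());
        return 1;
    }
    printf("min %Iu bytes, max %Iu bytes, flags 0x%lx\n",
           minSize, maxSize, flags);

    /* Illustrative values: 100 MB minimum, 1 GB hard maximum. */
    if (!SetSystemFileCacheSize(100 * 1024 * 1024,
                                1024 * 1024 * 1024,
                                FILE_CACHE_MAX_HARD_ENABLE)) {
        fprintf(stderr, "SetSystemFileCacheSize failed: %lu\n",
                GetLastError());
        return 1;
    }
    return 0;
}
```

Without FILE_CACHE_MAX_HARD_ENABLE the maximum is advisory; with it, the memory manager trims the cache working set back when it exceeds the limit.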
I'm getting an "is not a valid Win32 application" error.
Good article. Keep it up to keep us up to date.
Whilst coming out of L1/L2 cache may get you 20 ns, coming from physical RAM is going to be between 80 and 120 ns on most systems (probably the latter), and the larger systems needing NUMA node traversal over a crossbar or AMD HyperTransport links will add to this number.
I've not seen any systems going under 80 ns, but I haven't done performance work on DDR3 yet, so it may be possible outside of cache.
However, your thought about access being a million times quicker is nearly right. Disk access on the big arrays from EMC, Hitachi or HP would average (roundtrip) between 4 and 8 ms, if we average in cache utilization on their arrays.
Taking 8 ms and 80 ns to make the sums easier, it's around 100,000 times quicker.
This does bring up a point about the new flash drives coming out: the spec says access will be less than 1 microsecond, which is at the slower end of RAM territory (around 150 ns). So at worst case (1 us) these drives are about 8,000 times better than mechanical spindles with a big front-side cache. Hence hybrid technology and ReadyBoost are being used to improve latency. Flash drives are too expensive for most consumers so far, but in the enterprise the tipping point for mass production is about a year away, if we assume price/capacity rates keep improving linearly for the respective technologies. A single fibre flash drive can saturate a 4 Gb/s Fibre Channel link, so the capacity is there.
This will give a very needed boost for many customers on the performance envelope.
How do you find out how much cache is on your PC?
(Nice article, learnt a bit, keep it up.)
Very helpful. However, setting max cache size to >= 2 GB does not work for me. Tried on w2k3 server r2 and xp.
I have upgraded our backup server to W2k8 x64 from W2k3 x86 and it kept using all physical RAM. I even put 16 GB in it and it used it all! This explains it all!
I have managed to reduce the cache to 1 GB, but can't set it any higher; if I try anything over 1 GB it sets to 8 TB, e.g.:
Current Cache Settings:
Minimum File Cache Size: 100 MBytes
Maximum File Cache Size: 1024 MBytes
New Cache Settings:
Maximum File Cache Size: 8388607 MBytes
Any ideas? Otherwise I'll take some of this expensive memory out if it can't be used usefully!
Hello again - thanks for the updated setcache that works perfectly!
I have another problem though: after a month or so of our backup server being on (no reboots or installs, just doing its normal backups as far as we can tell), the machine will reset the cache back to 8 TB. At that point the machine grinds to a halt as all physical RAM is used, and today it was so bad we had to actually press the power button because it wouldn't respond to Ctrl-Alt-Del or pslist etc.
Any ideas what might cause that? There shouldn't be a time limit on how long the cache setting lasts, should there?
I have set up a scheduled task to run setcache daily and have to see how it behaves from now on.
Very informative, thanks. But does the SetSystemFileCacheSize() work on Vista SP1?
Here's my test: on a system with 1.7 GB RAM, copy 1.5 GB of pictures, file size varies 5 - 15 MB.
With default settings (no tweaks), Task Manager's Physical Memory Cached grows while Free goes to 0.
I ran your SetCache, setting max cache size to 512 MB - no change, Cached still grows to max.
I set SystemCacheDirtyPageThreshold per the KB article, still no change.
BTW, if I delete the copied files the Free memory immediately jumps up, indicating to me the system is still caching the files that were just written. Which isn't necessarily bad, I just want the system to cache LESS of them!
What really bugs me is my Vista box with 4GB RAM will use 3+ GB for cache while copying files and slowing everything else down.
Aug 29, 2008
Re: Cache tuning API's and Registry values
The problem seems to be the one-size-fits-all mentality of the operating system, not bad applications. There is no excellent algorithm for an operating system as broadly used as Windows, despite Mark Russinovich's statement that "...the Windows file copy engine tries to handle all scenarios...".
People should read Mark's well-written description of the efforts expended by Microsoft in improving the cache system. It seems they improve it in one place only to have bad performance immediately pop up somewhere else.
The APIs mentioned in this article are dangerous because they are global and affect all applications on the system. If the system is a 16-core Datacenter machine, tuning the system with these APIs is like using a sledgehammer on earrings.
There will come a time when the Windows I/O subsystem will be redesigned. This will happen because the amount of parallelism arriving with the many-core systems of the near future will bring cache performance, scalability and partitioning into focus, where these issues were not visible to the original Windows operating system designers. Until that time, I think we all must endure the one-size-fits-all design of the Windows cache.
I could be wrong, though.
Just my 1 cent.
Excessive paging on Exchange 2007 servers when working sets are trimmed
I used setCache.exe on Win2008 64-bit Enterprise to set 2048 MB, and the "cached" value shown in Task Manager (directly below the memory gauge) just keeps going higher than 2048 MB. Mine is now 12,224 MB! Does anyone know what is going on?
Using the sample code, is it possible to set the minimum cache size? It defaults to 100MB instead of the original 1MB.
This reads and acts like a Russinovich post and tool - easily understandable, educational, small, fast, and useful. Thank you!
I have Vista Home Premium x64 with 4GB RAM. 1GB max cache on a 4GB system seems much more reasonable than the default of 8.4 TB!
Do I need to reboot in order for the new cache setting to take effect?
Excessive cached read I/O is a growing problem. For over one year we have been working on this problem
Microsoft Windows Dynamic Cache Service released