Too Much Cache?

Cache is used to reduce the performance impact when accessing data that resides on slower storage media.  Without it your PC would crawl along and become nearly unusable.  If data or code pages for a file reside on the hard disk, it can take the system 10 milliseconds to access the page.  If that same page resides in physical RAM, it can take the system 10 nanoseconds to access the page.  Access to physical RAM is about 1 million times faster than to a hard drive.  It would be great if we could load up all the contents of the hard drive into RAM, but that scenario is cost prohibitive and dangerous.  Hard disk space is far less costly and is non-volatile (the data is persistent even when disconnected from a power source). 
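To put those figures side by side, here is a small sketch of the arithmetic. The 10 ms and 10 ns values are the round numbers used above, not measurements, and the variable names are only for illustration.

```python
# Round-number access times from the paragraph above, in nanoseconds.
# Integers are used so the ratio is exact.
DISK_ACCESS_NS = 10 * 1_000_000   # 10 milliseconds
RAM_ACCESS_NS = 10                # 10 nanoseconds

# How many RAM accesses fit into the time of one disk access.
speedup = DISK_ACCESS_NS // RAM_ACCESS_NS

if __name__ == "__main__":
    print(f"RAM is roughly {speedup:,}x faster than disk with these figures")
```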

 

Since we are limited in how much RAM we can stick in a box, we have to make the most of it.  We have to share this crucial physical resource among all running processes, the kernel and the file system cache.  You can read more about how this works here:

http://blogs.msdn.com/ntdebugging/archive/2007/10/10/the-memory-shell-game.aspx

 

The file system cache resides in kernel address space.  It is used to buffer access to the much slower hard drive.  The file system cache will map and unmap sections of files based on access patterns, application requests and I/O demand.  The file system cache operates like a process working set.  You can monitor the size of your file system cache's working set using the Memory\System Cache Resident Bytes performance monitor counter.  This value will only show you the system cache's current working set.  Once a page is removed from the cache's working set it is placed on the standby list.  You should consider the standby pages from the cache manager as a part of your file cache.  You can also consider these standby pages to be available pages.  This is what the pre-Vista Task Manager does.  Most of what you see as available pages is probably standby pages for the system cache.  Once again, you can read more about this in "The Memory Shell Game" post.
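If you would rather sample that counter from a script than from the Performance Monitor UI, one option is the typeperf command-line tool that ships with Windows. The following is a minimal sketch, assuming typeperf is on the PATH and an English counter name. The helper names (parse_typeperf_csv, sample_cache_resident_bytes) are made up for this example.

```python
import csv
import io
import subprocess
import sys

# Counter path as named in the text above (assumes an English-language system).
COUNTER = r"\Memory\System Cache Resident Bytes"


def parse_typeperf_csv(text):
    """Return every numeric sample found in typeperf's CSV output.

    typeperf prints a header row, then "timestamp","value" rows, then
    status messages. Anything whose second column is not a number is skipped.
    """
    values = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue
        try:
            values.append(float(row[1]))
        except ValueError:
            continue  # header row or a status message
    return values


def sample_cache_resident_bytes():
    """Take one sample of the counter; returns None when not on Windows."""
    if sys.platform != "win32":
        return None
    # -sc 1 asks typeperf to collect a single sample and exit.
    result = subprocess.run(
        ["typeperf", COUNTER, "-sc", "1"],
        capture_output=True, text=True, check=True,
    )
    values = parse_typeperf_csv(result.stdout)
    return values[-1] if values else None


if __name__ == "__main__":
    sample = sample_cache_resident_bytes()
    if sample is None:
        print("No sample available (not running on Windows?)")
    else:
        print(f"System cache working set: {sample / (1024 * 1024):.1f} MB")
```

Keep in mind that this only reports the cache's current working set. Standby pages that came from the cache are not included in this number.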

 

Too Much Cache is a Bad Thing

The memory manager works on a demand-based algorithm.  Physical pages are given to wherever the current demand is.  If the demand isn't satisfied, the memory manager will start pulling pages from other areas, scrub them and send them to help meet the growing demand.  Just like any process, the system file cache can consume physical memory if there is sufficient demand.

Having a lot of cache is generally not a bad thing, but if it is at the expense of other processes it can be detrimental to system performance.  There are two different ways this can occur - read and write I/O.

 

Excessive Cached Write I/O

Applications and services can dump lots of write I/O to files through the system file cache.  The system cache's working set will grow as it buffers this write I/O.  System threads will start flushing these dirty pages to disk.  Typically the disk can't keep up with the I/O speed of an application, so the writes get buffered into the system cache.  At a certain point the cache manager will reach a dirty page threshold and start to throttle I/O into the cache manager.  It does this to prevent applications from overtaking physical RAM with write I/O.  There are, however, some isolated scenarios where this throttle doesn't work as well as we would expect.  This could be due to bad applications or drivers, or to not having enough memory.  Fortunately, we can tune the number of dirty pages allowed before the system starts throttling cached write I/O.  This is handled by the SystemCacheDirtyPageThreshold registry value as described in Knowledge Base article 920739: http://support.microsoft.com/default.aspx?scid=kb;EN-US;920739

 

Excessive Cached Read I/O

While the SystemCacheDirtyPageThreshold registry value can tune the number of write/dirty pages in physical memory, it does not affect the number of read pages in the system cache.  If an application or driver opens many files and actively reads from them continuously through the cache manager, then the memory manager will move more physical pages to the cache manager.  If this demand continues to grow, the cache manager can grow to consume physical memory and other processes (with less memory demand) will get paged out to disk.  This read I/O demand may be legitimate or may be due to poor application scalability.  The memory manager doesn't know if the demand is due to bad behavior or not, so pages are moved simply because there is demand for them.  On a 32 bit system, the file system cache working set is essentially limited to 1 GB.  This is the maximum size that we blocked off in the kernel for the system cache working set.  Since most systems have more than 1 GB of physical RAM today, having the system cache working set consume all of physical RAM with read I/O is less likely.

This scenario, however, is more prevalent on 64 bit systems.  With the increase in pointer length, the kernel's address space is greatly expanded.  The system cache's working set limit can and typically does exceed how much memory is installed in the system.  It is much easier for applications and drivers to load up the system cache with read I/O.  If the demand is sustained, the system cache's working set can grow to consume physical memory.  This will push other processes and kernel resources out to the page file and can be very detrimental to system performance.

Fortunately we can also tune the server for this scenario.  We have added two APIs to query and set the system file cache size - GetSystemFileCacheSize() and SetSystemFileCacheSize().  We chose to implement this tuning option via API calls to allow setting the cache working set size dynamically.  I’ve uploaded the source code and compiled binaries for a sample application that calls these APIs.  The source code can be compiled using the Windows DDK, or you can use the included binaries.  The 32 bit version is limited to setting the cache working set to a maximum of 4 GB.  The 64 bit version does not have this limitation.  The sample code and included binaries are completely unsupported.  It is just a quick and dirty implementation with little error handling.
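For readers who want to see the shape of these calls without opening the sample, here is a rough sketch using Python's ctypes. It is not the SetCache sample itself. The helper names (mb_to_size_t, get_file_cache_limits, set_file_cache_limits) are invented for this example, and the flag values are the ones I understand the Windows SDK headers to define. The sketch also assumes the caller has already enabled the SE_INCREASE_QUOTA_NAME privilege that SetSystemFileCacheSize() requires; that step is omitted, so the set call will fail without it.

```python
import ctypes
import sys
from ctypes import wintypes

# Flag values as defined in the Windows SDK headers (winnt.h).
FILE_CACHE_MAX_HARD_ENABLE = 0x1
FILE_CACHE_MAX_HARD_DISABLE = 0x2
FILE_CACHE_MIN_HARD_ENABLE = 0x4
FILE_CACHE_MIN_HARD_DISABLE = 0x8

MB = 1024 * 1024


def mb_to_size_t(megabytes, pointer_bits=64):
    """Convert megabytes to bytes, refusing values that overflow SIZE_T."""
    size = megabytes * MB
    if size < 0 or size >= 2 ** pointer_bits:
        raise OverflowError(f"{megabytes} MB does not fit in {pointer_bits} bits")
    return size


def get_file_cache_limits():
    """Return (min_bytes, max_bytes, flags), or None when not on Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    fn = kernel32.GetSystemFileCacheSize
    fn.argtypes = [ctypes.POINTER(ctypes.c_size_t),
                   ctypes.POINTER(ctypes.c_size_t),
                   ctypes.POINTER(wintypes.DWORD)]
    fn.restype = wintypes.BOOL
    min_size, max_size = ctypes.c_size_t(), ctypes.c_size_t()
    flags = wintypes.DWORD()
    if not fn(ctypes.byref(min_size), ctypes.byref(max_size), ctypes.byref(flags)):
        raise ctypes.WinError(ctypes.get_last_error())
    return min_size.value, max_size.value, flags.value


def set_file_cache_limits(min_mb, max_mb, flags=FILE_CACHE_MAX_HARD_ENABLE):
    """Set the cache working set limits; returns False when not on Windows.

    Assumes SE_INCREASE_QUOTA_NAME is already enabled in the process token.
    """
    if sys.platform != "win32":
        return False
    bits = ctypes.sizeof(ctypes.c_size_t) * 8  # 32 or 64, matching SIZE_T
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    fn = kernel32.SetSystemFileCacheSize
    fn.argtypes = [ctypes.c_size_t, ctypes.c_size_t, wintypes.DWORD]
    fn.restype = wintypes.BOOL
    if not fn(mb_to_size_t(min_mb, bits), mb_to_size_t(max_mb, bits), flags):
        raise ctypes.WinError(ctypes.get_last_error())
    return True


if __name__ == "__main__":
    limits = get_file_cache_limits()
    if limits is None:
        print("GetSystemFileCacheSize is only available on Windows")
    else:
        print("min=%d MB max=%d MB flags=%d"
              % (limits[0] // MB, limits[1] // MB, limits[2]))
```

The mb_to_size_t() check is there to illustrate the 32 bit limitation: a 32 bit SIZE_T cannot hold a byte count of 4 GB or more, so a 32 bit build has to reject such a request rather than let it wrap around.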

Comments
  • I'm getting "is not a valid Win32 application" error.

    [I’m guessing that you are running SetCache.exe on Windows XP (or earlier). The GetSystemFileCacheSize and SetSystemFileCacheSize API functions require Windows Server 2003 SP1 or later (this includes Windows XP x64 Edition, since it is built from the 2003 SP1 codebase). Since these functions don’t exist on earlier versions of Windows, the SetCache.exe binary was compiled with the subsystem version set to 5.02, which prevents it from running on versions of Windows where the API functions do not exist.]
  • Good article. keep it up to keep us up to date.

  • Whilst coming out of L1/L2 cache may get you 20ns, coming from physical RAM is going to take between 80 and 120ns on most systems (probably the latter). On the larger ones, NUMA node traversal over a crossbar or AMD HyperTransport links will also add to this number.

    I've not seen any systems going under 80 ns, but not done performance work on DDR3 yet so it may be possible outside of cache.

    However, your thought about access being a million times quicker is nearly right. Disk access on the big Arrays from EMC, Hitachi or HP would average (roundtrip) to be between 4 - 8 ms, if we average in cache utilization on their arrays.

    Taking 8ms and 80ns to make the sums easier it's around 100,000 times quicker.

    This does bring up a point about the new flash drives coming out. The spec is that access will take less than 1 microsecond, and typically sit at the slower end of RAM, which means 150ns, so even in the worst case (1us) these drives are about 8000 times better than mechanical spindles with a big front side cache. Hence Hybrid technology and ReadyBoost are being used to improve latency. Flash drives are too expensive for most consumers so far, but in the Enterprise the tipping point for mass production is about a year away if we assume price/capacity rates keep diminishing linearly for the respective technologies. A single fibre flash drive can saturate a 4Gb/s Fibre link, so capacity is there.

    This will give a very needed boost for many customers on the performance envelope.

  • how do u find out how much cache is on your pc?

    (nice article, learnt a bit, keep it up)

    [If you want to see how much cache is currently being used, you can do so with the Performance Monitor counter Memory\Cache Bytes. If you want to see the limits, you can use the SetCache executable or !filecache in a postmortem or live debug. !filecache will also show you the current file cache usage along with how much cache each file is using.]
  • Very helpful.  However, setting max cache size to >= 2 GB does not work for me.  Tried on w2k3 server r2 and xp.

    [This is a limitation on 32 bit systems. With this API and tool you are setting the working set of the file cache. The file cache resides in the kernel address space. On 32 bit systems, you are limited to 2 GB of Kernel address space. Additionally the cache needs to share the kernel address space with other kernel resources, so your cache's working set won't even get up to 2 GB. Even though your cache's working set is limited on 32 bit systems, many pages still may be "cached" on standby pages. If you use the 32 bit version on an x64 box, you can set the cache up to 4GB. At that point you reach the 32 bit SIZE_T limit (32 bits can only address up to 4 GB).]
  • Fantastic!

    I have upgraded our backup server to W2k8x64 from w2k3x86 and it kept using all physical RAM, I even put 16GB in it and it used it all! this explains it all!

    I have managed to reduce the cache to 1GB, but can't set it any higher. If I try anything over 1GB it sets to 8TB, e.g.:

    C:\>setcache 2048

    Current Cache Settings:

    Minimum File Cache Size: 100 MBytes

    Maximum File Cache Size: 1024 MBytes

    Flags: 1

    New Cache Settings:

    Minimum File Cache Size: 100 MBytes

    Maximum File Cache Size: 8388607 MBytes

    Flags: 1

    Any ideas? otherwise I'll take some of this expensive memory out if it can't be used usefully!

    [Thanks for the feedback. It turns out there was a bug in the sample code. The bug reveals that I didn't have a modern system with a lot of RAM to properly test this code. I've updated the code and the binaries. Try this new version.]
  • Hello again - thanks for the updated setcache that works perfectly!

    I have another problem though. After a month or so of our backup server being on (no reboots or installs, just doing its normal backups as far as we can tell) the machine will reset the cache back to 8TB. At this point the machine grinds to a halt as all physical RAM is used, and today it was so bad we had to actually press the power button because it wouldn't respond to ctrl-alt-del or pslist etc.

    Any ideas what might cause that? there shouldn't be a time limit on how long the cache is set should there?

    I have set up a scheduled task to run setcache daily and have to see how it behaves from now on.

    Many Thanks!

    [I don't see a direct timer that would reset the max size. There could be some crazy set of events that could end up triggering a reset, but I cannot guess on a possible scenario. There is also the possibility that another application on the system is calling SetSystemFileCacheSize() and resetting the cache size. If scheduling a daily task to reset the max size is not enough for you, I recommend opening a support incident with us so that we can investigate this further.]
  • Very informative, thanks. But does the SetSystemFileCacheSize() work on Vista SP1?

    Here's my test: on a system with 1.7 GB RAM, copy 1.5 GB of pictures, file size varies 5 - 15 MB.

    With default settings (no tweaks), Task Manager's Physical Memory Cached grows while Free goes to 0.

    I ran your SetCache, setting max cache size to 512 MB - no change, Cached still grows to max.

    I set SystemCacheDirtyPageThreshold per the KB article, still no change.

    BTW, if I delete the copied files the Free memory immediately jumps up, indicating to me the system is still caching the files that were just written. Which isn't necessarily bad, I just want the system to cache LESS of them!

    What really bugs me is my Vista box with 4GB RAM will use 3+ GB for cache while copying files and slowing everything else down.

    [These APIs will work with Vista SP1. The client requirements are XP x64 and Vista. For Server systems, you will need Server 2003 SP1 or Server 2008.

    Do not rely on Task Manager for these values. It is not telling you what you think it is. Please read the Memory Shell Game. Task Manager is telling you that you are using Standby Pages. They are like cached pages and very close to freed pages. In your example the 3+ GB is probably in standby pages. Since they are not being actively used by processes, the system is using them as pseudo cache pages. We don't want memory to go unused.]
  • Aug 29, 2008

    Re: Cache tuning API's and Registry values

    The problem seems to be the one-size-fits-all mentality of the operating system, not bad applications. There is no excellent algorithm for an operating system as broadly used as Windows, despite Mark Russinovich's statement "...the Windows file copy engine tries to handle all scenarios well" [from: http://blogs.technet.com/markrussinovich/archive/2008/02/04/2826167.aspx]. People should read Mark's well-written description of the efforts expended by Microsoft in improving the cache system. It seems they improve it in one place only to have bad performance immediately pop up somewhere else.

    The APIs mentioned in this article are dangerous because they are global and affect all applications on the system. If the system is a 16-core Datacenter, tuning the system with these APIs is like using a sledgehammer on earrings.

    There will come a time when the Windows I/O subsystem will be redesigned. This will happen because the amount of parallelism arriving with the many cores of systems in the near future will bring cache performance, scalability and partitioning into focus, where these issues were not visible to the original Windows operating system designers.

    Until that time, I think we all must endure the one-size-fits-all design of the Windows cache.

    I could be wrong, though.

    Just my 1 cent.

    Ed

    [There are many design challenges for a general use operating system. Windows isn’t a one trick pony. Seldom do servers only run one type of application. You have to throw in backups, management, and a host of other add-on services. While the Cache Manager does a good job for most of the usage scenarios, there are unique cases where it doesn’t work well. For the read I/O cache consumption on 64 bit case, the fundamental problem is a runaway working set. On 64 bit systems, working sets can grow larger than physical RAM on most servers. The default settings for the Memory Manager can’t handle processes that take more than their fair share of physical RAM. It can’t reliably determine whether one process deserves more pages than another. This is why administrators need to tune the server’s configuration. They can use things like WSRM to limit working set growth for individual processes or use the provided APIs to limit the working set of the system file cache. We are working on improving this experience in the next version of Windows. The changes are extensive and the risk of regression is far too high to backport to the current operating systems.]
  • Excessive paging on Exchange 2007 servers when working sets are trimmed

  • I used setCache.exe on win2008 64bit enterprise to set the cache to 2048MB, and the "cached" value that shows in task manager (directly below the memory gauge) just keeps going higher than 2048MB. Mine is now 12,224MB!  Does anyone know what is going on?

    [Task Manager’s value for “Cached” is not what you think it is. In addition to the Cache Manager’s working set, this number includes the number of standby pages in physical RAM. While standby pages are like cached pages, they can be quickly disassociated from the previous working set, scrubbed and handed to a new process. Task Manager is just showing you another way of looking at the data. To see the real working set size of the System File Cache you need to use Performance Monitor. You can read more about this in the Memory Shell Game post.]
  • Using the sample code, is it possible to set the minimum cache size? It defaults to 100MB instead of the original 1MB.

    [By default the sample code is hard coded to set the minimum to 100 MB, but not enforce it. This was done to allow the memory manager to reduce the working set size as needed. You can modify the sample code to set a hard limit by changing the Flags parameter in the call to SetSystemFileCacheSize().]
  • This reads and acts like a Russinovich post and tool - easily understandable, educational, small, fast, and useful. Thank you!

    I have Vista Home Premium x64 with 4GB RAM. 1GB max cache on a 4GB system seems much more reasonable than the default of 8.4 TB!

    Do I need to reboot in order for the new cache setting to take effect?

    [Thanks for the feedback. In response to your question a reboot is not required. The setting is dynamically applied to the system file cache’s working set size. It is important to note this setting is not persistent so rebooting the machine will revert the size back to the default. In order to maintain the settings you’ll need to run the tool at least once per boot. One approach is the use of a machine start up script to automate the process after a reboot.]
  • Excessive cached read I/O is a growing problem. For over one year we have been working on this problem

  • Microsoft Windows Dynamic Cache Service released
