Brett Keown, a TFS Senior Support Engineer, has put together a collection of TFS performance tips based on his support experiences. The document contains tips on monitoring TFS performance, getting the most from your Data Tier (DT), improving SCC cache reactions, addressing build time problems, and a host of others. He has collected so much info that I will be posting it as a two-parter… with part one below. Tune in Monday for part two. Please feel free to comment on this collection from Brett, thank him for it <g>, or add tips of your own in the comments section.
Performance Data Collection
There are a few things we can do to isolate and test performance issues in TFS. We can use TFS Performance counters to record the server's performance. For more information on TFS Performance Counters and how to use them, please review the following material:
There is a command to check the queue in TFS for processing source code control requests. You can run this from the Application Tier (AT) machine. Open a browser and type the following path:
http://localhost:8080/VersionControl/v1.0/administration.asmx. Once you are at the page, choose to run QueryServerRequests to see what is currently being processed in TFS Version Control. If you have configured your Application Tier to allow it, you may also perform this test remotely by replacing “localhost” with the server name.
It is also possible to kill a long-running SCC process here that may be hanging up your server. Say, for example, someone starts a GET from the root ($/) and then goes to lunch. Depending on the size of your organization and your server, this could have very detrimental effects on the server's performance. After finding the ServerProcessID of the long-running process in the QueryServerRequests output, return to the administration.asmx page and select the KillProcess web service. Provide the ServerProcessID and a reason for termination, then click Invoke. Presto - you get your server back.
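As a small convenience, the administration page described above can be opened from a script instead of typing the path by hand. This is only a sketch: the function names are made up for illustration, the server name "tfs-at01" is hypothetical, and port 8080 is simply the one shown in the URL above. It opens the page in your default browser; running QueryServerRequests or KillProcess is still done on the page itself.

```python
import webbrowser


def admin_page_url(server: str = "localhost", port: int = 8080) -> str:
    """Build the URL of the version control administration page.

    The path is the one quoted in the text; server and port are parameters
    so the same helper works remotely when the Application Tier allows it.
    """
    return f"http://{server}:{port}/VersionControl/v1.0/administration.asmx"


def open_admin_page(server: str = "localhost", port: int = 8080) -> bool:
    """Open the administration page in the default browser."""
    return webbrowser.open(admin_page_url(server, port))


if __name__ == "__main__":
    # On the Application Tier itself:
    print(admin_page_url())
    # Remotely, with a hypothetical server name:
    print(admin_page_url("tfs-at01"))
```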
There are a few issues we see pretty regularly. The first thing you should always do is to make sure you have the most recent Service Packs installed for Visual Studio, SQL Server and Team Foundation Server. Team Foundation Server Service Packs do not need to be installed on clients, only the server. Please note that if you are opening Team Explorer on the server (or on a build server), you will need to install the Visual Studio Service Packs on those machines as well.
NOTE: You should be aware that if you install VS 2008 SP1 on any machine, it will require the installation of the 3.5 SP1 version of the .NET Framework. This version of the framework is not compatible with SharePoint 2.0. Once these service packs are installed you may need to upgrade your WSS to version 3.0. Here’s a blog post which discusses this issue and how to work around it: http://blogs.msdn.com/sharepoint/archive/2008/08/27/net-framework-3-5-sp1-issue-on-windows-sharepoint-services-v2-0.aspx
In enterprise development environments where large builds, heavy 24-hour-a-day server usage and distributed development are common, it may be a good idea to move your Data Tier (DT) to its own dedicated 64-bit environment to make use of the larger address space and additional RAM. Moving your Data Tier to new hardware, instead of running it in a Single Tier installation or VM session, can give you quite a performance boost. Documentation on moving the TFS Data Tier:
If your CPU is constantly maxed out, check your Event Log for event ID 9000 errors. These typically refer to the Team Foundation server-side SCC cache being reset. On its own, a cache reset is a good thing. When the cache reaches its size limit, we recycle a portion of it. This keeps room in the cache, which in turn speeds up requests for clients by having them get their files from the cache instead of directly from the Data Tier. The problem occurs when the total available size of the cache is smaller than a typical transaction from TFS.
By default, Team Foundation will use 10% of available disk space for its cache, and it will recycle 20% of the cached data when it hits the cap. Here is where we run into the issue. Let’s say a customer has a TFS installation on a VM with only 10GB of free space. The customer is also performing continuous builds of approximately 1GB in size. With 10GB of total free space, Team Foundation will take 1GB for its cache. When the cache gets full, it will remove the oldest 200MB of content. So the server ends up spending all of its time clearing the cache, which spikes the CPU and kills performance. This doesn’t even take into account the load incurred by other users of the system!
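The arithmetic in this example is easy to rerun for your own server. The sketch below uses the default percentages quoted above (10% of free space for the cache, 20% recycled at the cap); the function names and the "recycle passes" estimate are my own rough model for illustration, not something TFS reports.

```python
CACHE_PERCENT = 10    # share of free disk space used for the cache (default per the text)
RECYCLE_PERCENT = 20  # share of the cache recycled when the cap is hit


def cache_budget(free_space_mb: int) -> tuple[int, int]:
    """Return (cache size, amount freed per recycle), both in MB."""
    cache_mb = free_space_mb * CACHE_PERCENT // 100
    recycled_mb = cache_mb * RECYCLE_PERCENT // 100
    return cache_mb, recycled_mb


def recycle_passes(free_space_mb: int, transaction_mb: int) -> int:
    """Rough estimate of how many recycles a full cache needs to absorb
    one transaction of the given size (a simplified model)."""
    _, recycled_mb = cache_budget(free_space_mb)
    if recycled_mb <= 0:
        raise ValueError("no cache space available to recycle")
    return -(-transaction_mb // recycled_mb)  # integer ceiling division


if __name__ == "__main__":
    # The scenario from the text: 10GB free, builds of roughly 1GB.
    cache_mb, recycled_mb = cache_budget(10_000)
    print(f"cache: {cache_mb} MB, freed per recycle: {recycled_mb} MB")
    print(f"recycles per 1GB build: {recycle_passes(10_000, 1_000)}")
```

If the estimate comes out well above one recycle per typical transaction, the cache is probably too small for your workload.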
To adjust the cache settings review this MSDN article:
If you would like to check the cache performance on either the AT or Proxy Server machines, please see:
There are a few very common issues with Virus Protection software and Team Foundation Server that every TFS Administrator should be aware of:
· Real-time virus scanning of the Team Foundation directories on the server can severely slow down TFS. When using real-time scanning, the AV application will scan the TFS cache for each user as their files are updated. We also run the risk of the scanner holding Team Foundation files open if there is a problem with the scan. If a file is locked open while we try to access it, TFS throws some pretty weird errors that may or may not be related to a real issue. Background scanning is our recommendation for anti-virus software on a Team Foundation Server.
· If running virtual images of Windows Servers that contain Team Foundation Servers, make sure that any anti-virus software runs as an installed application inside the image itself and not against the VHD or VHC files from the host operating system. We have found that some customers will try to run the anti-virus software against the VHD files themselves. Not only does this cause performance issues, but it can also corrupt the Virtual Hard Drive.
TempDB and RCSI
As any complex SQL Server database application would, Team Foundation Server uses the SQL Server TempDB both explicitly and implicitly. In addition, the version control component uses Read Committed Snapshot Isolation (RCSI) for improved concurrency. RCSI was introduced in SQL Server 2005 and provides a mechanism for readers to read committed data without having to take a shared lock on it. To do so, it stores versions of rows changed by active transactions in its version store in TempDB. This further emphasizes the need for an optimally configured TempDB. Here are some pointers discovered by the support team during testing:
· Manually grow TempDB - file growth can be expensive and time consuming. SQL Server reverts to the original TempDB size upon restart, thereby incurring the cost again after every restart. I suggest manually growing the data file to approximately 20% of the sum of all TFS databases. This will also prevent file fragmentation if multiple files are open on the same drive.
· Use several equally sized data files - if using a multi-processor data tier, use n equally sized data files, where n is the number of processors. The sum of all the data files can equal 20% of all TFS databases combined. This will allow SQL Server to allocate space across the files in a round robin fashion and reduce contention. If you have a dedicated drive for TempDB with more than adequate space assigned, turn off auto-growth to prevent the data files from growing unevenly and interfering with the round robin allocation. If you do choose to turn off auto-growth, make sure that you have allocated enough space for TempDB. We wouldn’t want it to run out of space!
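To put numbers on the two pointers above, here is a minimal sizing sketch. It assumes the 20% guideline and one data file per processor as described; the function name and the database sizes in the example are hypothetical, so substitute the actual sizes of your TFS databases.

```python
def tempdb_file_plan(tfs_db_sizes_mb, processors, target_percent=20):
    """Return the size in MB of each TempDB data file.

    tfs_db_sizes_mb: sizes of all TFS databases, in MB
    processors:      number of processors on the data tier (one file each)
    target_percent:  total TempDB size as a percentage of all TFS databases
    """
    if processors < 1:
        raise ValueError("processors must be at least 1")
    target_total_mb = sum(tfs_db_sizes_mb) * target_percent // 100
    # Round up so the equally sized files together cover the target.
    per_file_mb = -(-target_total_mb // processors)
    return [per_file_mb] * processors


if __name__ == "__main__":
    # Hypothetical sizes: three TFS databases totalling 50,000 MB, 4 processors.
    plan = tempdb_file_plan([40_000, 8_000, 2_000], processors=4)
    print(plan)  # four equal files covering 20% of the total
```

Keeping every file the same size matters more than hitting the 20% figure exactly, since uneven files interfere with the round robin allocation.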
If Team Foundation Server is on the local network and you don't need a proxy to connect to it, you can bypass the proxy locally by adding the following registry key. This will give a small performance improvement:
In HKLM (global), create the following registry entry:
This should be of type string with the value "true"
The Developer Support Team Foundation Server Blog on TFS Performance Tips & Tricks - Part One Jeff
"RCSI is a new feature to SQL Server 2008 and provides a mechanism for readers to read committed changes without having to take a shared lock on the data..."
I remembered that I read about RCSI in SQL Server 2005 related documents. Here is one resource which supports this statement:
"Read committed snapshot isolation (RCSI) was introduced in SQL Server 2005 as a new mechanism to prevent queries reading data to block, or be blocked by, other queries modifying data in the same tables."