Everything you want to know about Visual Studio ALM and Farming
Brian Harry is a Microsoft Technical Fellow working as the Product Unit Manager for Team Foundation Server.
We have just completed our testing for TFS 2008 scalability and are ready to publish the final recommendations on server sizing and hardware configurations. If you want to compare this to the TFS 2005 recommendations, you will find them here.
Ultimately, making capacity recommendations is a little like throwing darts at a board. The problem is that no two teams are the same. They use different processes, have different usage patterns, have different sized applications, are organized differently, etc. When we make estimates of things like how much load an average user puts on the system, we base that largely on what we observe in our own use of our internal TFS installation. It's not perfect and it changes over time. If you read the details below, I'll spell out all of the assumptions we made.
Quite a few things have changed since TFS 2005.
The net result, though, is that our recommendations, while more conservative, afford more users on similarly sized hardware.
Before I go into any gory detail, I'll spell out the configurations we tested and the results we got.
There are several things to note about this.
For a good background on the general approach we use to determine TFS's scaling abilities, read http://blogs.msdn.com/bharry/archive/2005/10/24/how-many-users-will-your-team-foundation-server-support.aspx. While the numbers in that post are out of date, the methodology is still accurate.
Load per user
The biggest change between TFS 2005 and TFS 2008 is that we changed the assumption for the amount of load an average user puts on TFS. We measure this on our own DevDiv TFS server by looking at load patterns and dividing by the number of "active" users. When we shipped TFS 2005, an average user in DevDiv put approximately 0.1 requests per second of load on the server (in other words, an average of 1 request every 10 seconds during peak usage hours). That number has gone up quite a bit in the intervening year and a half or so. Why? Well, it's hard to know for sure, but I can speculate on a few things.
The end result is that we are now using 0.15 requests per second per user. That's a 50% increase over the number that we used to compute TFS 2005 capacity. So just to maintain the same user recommendation, TFS 2008 has to be 50% faster on the same hardware.
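If you want to sanity-check your own sizing against these assumptions, the arithmetic is simple enough to sketch. This is purely an illustrative back-of-envelope script, not official tooling; the team sizes are just ones mentioned elsewhere in this post:

```python
# Back-of-envelope peak-load math from the assumptions above.
# 0.10 req/s/user was the TFS 2005 assumption; 0.15 is the TFS 2008 one.

REQ_PER_USER_2005 = 0.10  # avg requests/sec per active user (TFS 2005)
REQ_PER_USER_2008 = 0.15  # avg requests/sec per active user (TFS 2008)

def peak_load(users, req_per_user):
    """Estimated aggregate requests/sec at peak for a team of `users`."""
    return users * req_per_user

for team in (250, 2200, 3600):
    print(f"{team:>5} users: "
          f"{peak_load(team, REQ_PER_USER_2005):.0f} req/s (2005 model) -> "
          f"{peak_load(team, REQ_PER_USER_2008):.0f} req/s (2008 model)")
```

The 50% figure falls straight out of this: for the same team size, the 2008 model predicts 1.5x the aggregate request rate of the 2005 model.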
Another key change is that we've reassessed the amount of data that corresponds to various team sizes. We've done a survey of usage by different teams to determine how big their databases are on average. The result, in some cases, is almost a 10X increase in the size of the databases we tested with. This also, of course, causes TFS to have to work harder to accomplish the same throughput on the same hardware. Here are the sizes we used for TFS 2008:
These numbers are based on teams at the higher end of each range. They are also based on the amount of data accrued over about a 2 year period. Of course all teams are different and your numbers may be higher or lower but at least you know what assumptions we used.
Here's an example of how these data size assumptions affect the performance of TFS: look at the Avg workspace size column. This is the number of files that users typically work with on teams of that size. When our load testing simulates a version control "get" operation, it is getting that many files. So a get on a 3,600 person team is a 20 times larger operation than a get on a 250 person team.
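To make the scaling concrete, here's a tiny sketch. The absolute file counts below are hypothetical placeholders I made up for illustration; only the roughly 20x ratio between the 250-person and 3,600-person workspaces comes from this post:

```python
# Illustrative only: the cost of a simulated "get" scales with the team's
# average workspace size. Absolute counts are hypothetical; the ~20x ratio
# between the two team sizes is the point.

WORKSPACE_SIZES = {   # avg files per workspace (hypothetical values)
    250: 5_000,
    3_600: 100_000,   # ~20x the 250-person workspace
}

def relative_get_cost(team_a, team_b):
    """How many times larger a 'get' is for team_a than for team_b."""
    return WORKSPACE_SIZES[team_a] / WORKSPACE_SIZES[team_b]

print(relative_get_cost(3_600, 250))  # -> 20.0
```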
The last substantial change we made was to the hardware configurations. Some of this was deliberate - for example, we decided to start officially including 8 proc data tier numbers because, with the advent of multi-core machines (particularly quad core), an 8 proc machine is no longer an outrageously expensive machine. In fact, the 8P machine we tested on was actually a dual proc, quad core machine.
As I mentioned above, we also added TFS proxies to the two larger configs. We did this because many of our larger customers use proxies and we use them internally quite a lot. In fact, we've set up proxies even on the same LAN for our highest demand users. For example, our build lab has its own proxy because it does approximately 75 full gets of a several million file tree every day. It probably adds up to 3 or 4 million file downloads a day. In our simulation, we configured half of the users to use the proxy. This doesn't actually mean that half of their load went to the proxy because it only handles downloads. Downloads are comparatively inexpensive and all other load goes straight to the TFS server.
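You can roughly estimate how much of the simulated load the proxies absorb. The half-of-users figure comes from this post, but the download fraction below is a hypothetical placeholder, not a measured number, and the sketch assumes a warm proxy cache that fully absorbs its users' downloads:

```python
# Rough, illustrative sketch of proxy offload in the simulation.

PROXY_USER_FRACTION = 0.5  # half of simulated users go through a proxy
DOWNLOAD_FRACTION = 0.3    # hypothetical share of requests that are downloads

def server_load_fraction(proxy_users=PROXY_USER_FRACTION,
                         downloads=DOWNLOAD_FRACTION):
    """Fraction of total request load that still hits the TFS server,
    assuming the proxy fully absorbs downloads for its users."""
    return 1.0 - proxy_users * downloads

print(f"{server_load_fraction():.0%} of requests still reach the server")
```

This is consistent with the point above: routing half the users through a proxy offloads well under half the total load, because only downloads are served from the proxy and everything else still goes straight to the TFS server.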
Some of it was not deliberate. The hardware availability in our lab changes, and the drive arrays and machines we used last time had been repurposed for something else. So we picked machines that were generally close to what we tested last time. The only thing I regret is that we didn't have higher performance drive arrays to test. The 3,600 user configuration should have been a SAN, and the 2,200 user configuration should have at least been a SCSI array instead of a SATA2 array. I suspect the differences wouldn't have been huge, but the higher capability I/O systems would have provided better performance and been more representative of what someone would use in a production environment.
The end result is that our hardware configurations for TFS 2008 allow more users on similar hardware than our recommendations for TFS 2005, even though they are based on a substantially more conservative estimate of how much load a user puts on the system. I'd estimate that between the increased request load, increased data size, etc., the estimates for TFS 2008 assume about double the load per user.
TFS 2008 is more than twice as fast as TFS 2005 and can support extremely large teams. Of course, even larger teams can deploy multiple servers and scale to any size they need.
I'm interested in hearing your own stories about TFS 2008 performance if you have them. Please feel free to share.