MS.COM Operations Tools Team WebLog

Hey - What does this button do?

Performance Analysis and Capacity Planning

It always fascinates me how differently people view Capacity Planning and Performance Analysis depending on their perspective.  For instance, I could not count the number of times a developer or PM for a project has stated “we need a new server for our database”.  My response, of course, is to ask how they know they need a dedicated server for their lone database, and invariably the answer comes down to the fact that they don’t really know.  They’re making an assumption.  Sometimes that assumption is based on a smattering of actual data, but most often it is just a raw assumption.

 

Engineers, on the other hand, look at things a little differently, though often with the same limited perspective.  Many times an engineer will ask for a new server because “the CPU said 100%”.  100% for how long?  How many times did it reach 100% in a 24-hour period?
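Those two questions are easy to answer once you have the raw samples.  Here is a minimal sketch in Python; the one-minute sampling interval, the sample values and the function name are all made up for illustration and are not taken from our collection routine.

```python
from itertools import groupby

def saturation_episodes(samples, interval_s=60, threshold=100.0):
    """Return the length in seconds of each unbroken run of samples
    at or above the threshold."""
    episodes = []
    for saturated, run in groupby(samples, key=lambda pct: pct >= threshold):
        if saturated:
            episodes.append(sum(1 for _ in run) * interval_s)
    return episodes

# Hypothetical one-minute CPU readings (percent); a real day has 1440 of them.
cpu = [35, 100, 100, 40, 100, 60, 100, 100, 100, 20]
episodes = saturation_episodes(cpu)
print(len(episodes), "episodes; longest", max(episodes), "s; total", sum(episodes), "s")
```

Three short spikes in a day tell a very different story from one episode that lasts six hours, even though both "said 100%".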

 

So let’s talk a minute about Performance Analysis and Capacity Planning.  The two are distinctly different, but often based on similar data.  The primary difference between Performance Analysis and Capacity Planning is duration.

 

The purpose of a Performance Analysis, whether for a specific application or for a server, is to establish a pattern of use and to determine when, and how severely, a detrimental pattern might be occurring.

 

The purpose of Capacity Planning, whether for a specific application or for a server, is to establish a pattern of use and to determine the short-term and long-term operating requirements.

 

The difference between the two is somewhat subtle.  Capacity Planning is simply a view of performance data over a duration, whereas Performance Analysis is an examination of a specific time slice.  You might want to know what the performance characteristics of a server were for a five-minute period, the last twenty-four hours, or the last seven days.  Examining the data in this manner will allow you to detect possible problem areas in performance.  For instance, an application may start generating significant time-out errors at 3 AM PST and keep generating them until 9 AM PST, but only on weekdays.  By examining the performance data for these time periods you could correlate the errors with a large ramp-up of connections to an IIS application, which in turn sucks up the available memory on that server (or servers) until timeouts occur.  The time slice of 3 to 9 AM PST means your East coast customers are coming onto your site as they start their work day, and the ramp-up continues until the West coast customers have followed suit.  All of this data could mean that the login page for your application is too memory intensive or that the initial page load is too intensive.  Obviously this is a lame example, but you get the idea.
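To make the time-slice idea concrete, here is a rough sketch in Python.  The samples, the tuple layout and the helper names are hypothetical, and the timestamps are assumed to already be in PST.  It keeps only the weekday 3 to 9 AM readings and checks how strongly connections and available memory move together.

```python
from datetime import datetime
from statistics import mean

def in_slice(ts, start_hour=3, end_hour=9):
    """True for weekday timestamps inside [start_hour, end_hour)."""
    return ts.weekday() < 5 and start_hour <= ts.hour < end_hour

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical samples: (timestamp in PST, IIS connections, available MB)
samples = [
    (datetime(2004, 7, 26, 2, 0), 120, 900),
    (datetime(2004, 7, 26, 4, 0), 400, 600),
    (datetime(2004, 7, 26, 6, 0), 650, 350),
    (datetime(2004, 7, 26, 8, 0), 800, 150),
    (datetime(2004, 7, 26, 10, 0), 300, 700),
    (datetime(2004, 7, 31, 5, 0), 90, 950),  # a Saturday, so it is filtered out
]

window = [s for s in samples if in_slice(s[0])]
r = pearson([s[1] for s in window], [s[2] for s in window])
print(len(window), "samples in slice; correlation", round(r, 3))
```

A coefficient close to -1 inside the slice says memory drops as connections climb.  It does not prove the login page is the culprit, but it tells you where to look next.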

 

Now take that same data and look at the performance numbers for the entire month.  If the number of IIS connections was averaging 500 per day for week 1, 550 per day for week 2, 625 per day for week 3 and 700 per day for week 4, you can see a usage trend begin to develop over a duration.  If tests have shown your application to peak at 1000 connections, then you know from your examination of the data that, if growth continues at this pace, you will reach your peak capacity in roughly four weeks.
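The arithmetic behind that estimate can be sketched with a simple straight-line fit.  This is only one way to extrapolate, and it assumes growth stays linear; the function name is mine, while the weekly averages and the 1000-connection peak are the figures from the example.

```python
def linear_fit(ys):
    """Least-squares slope and intercept for ys observed at x = 1, 2, ..., n."""
    n = len(ys)
    xs = range(1, n + 1)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

weekly_avg = [500, 550, 625, 700]  # average daily IIS connections, weeks 1 to 4
peak = 1000                        # tested peak from the example

slope, intercept = linear_fit(weekly_avg)
week_at_peak = (peak - intercept) / slope
weeks_left = week_at_peak - len(weekly_avg)
print(f"growth {slope:.1f} per week; peak reached about {weeks_left:.1f} weeks after week 4")
```

The fit gives a growth rate of 67.5 connections per week and puts the peak about four and a half weeks out.  Using only the most recent jump of 75 per week gives four weeks exactly, so either way you have about a month.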

 

Unless it is a ridiculously obvious case, there is no way you can adequately plan for capacity with less than seven days of data.  I personally prefer thirty days, but it may depend on the application.

 

I’m sure my man Chris Ball has a plethora of algorithms to twist all of that data into massive reports with standard deviation and standard error lines.  In fact, Chris has built many of these mathematical functions into our standard performance collection routine.  This is a striking benefit for us: we can examine immediate, detailed data for a variety of performance counters, and we can also establish long-duration trends that assist in planning our hardware purchases and re-purposing.  I find the standard deviation line to be particularly useful.  Kudos, Chris!
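For anyone who has not worked with those two statistics, here is a minimal Python illustration.  This is not Chris's routine; the daily figures are invented and the function name is mine.

```python
from statistics import mean, stdev

def summarize(values):
    """Return the mean, sample standard deviation and standard error of the mean."""
    m = mean(values)
    sd = stdev(values)               # sample standard deviation (n - 1 divisor)
    se = sd / len(values) ** 0.5     # standard error of the mean
    return m, sd, se

# Hypothetical average daily connection counts for one week.
daily = [480, 510, 495, 530, 505, 490, 490]
m, sd, se = summarize(daily)
low, high = m - 2 * sd, m + 2 * sd   # a rough "normal" band around the mean
print(f"mean {m}, std dev {sd:.1f}, std error {se:.1f}, band {low:.0f} to {high:.0f}")
```

The standard deviation tells you how much a typical day strays from the average, which is why a reading far outside the band is worth a closer look.  The standard error tells you how far to trust the average itself given how few samples went into it.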

 

Will

Published Wednesday, July 28, 2004 12:56 PM by mscomts

Comments

 

David said:

Seems odd that you talk about capacity planning. Why do I, and everyone else I know, see so many errors on microsoft.com? Images missing, style sheets not loading, random script errors, all get fixed with a refresh so it's not code issues. Horribly unreliable.
July 28, 2004 11:34 PM
 

Scott said:

David -

I understand your frustration and confusion. Honestly, most of what you are experiencing likely has very little to do with capacity of the site and servers. As you're probably aware, there are many contributing factors that can cause the sort of experience you are describing. ISP routes to us, DNS issues, and partner links have all been issues we see regularly.

I won't try to tell you that it's not on our side either (that would be an outright lie). In the world that is www.microsoft.com, we have over 1000 content providers worldwide that all publish to our site (and not always in the same way). Typically, we experience almost 3 GB worth of content churn daily. With a large server farm like we have, you may end up hitting a server that either hasn't received a piece of content yet (or won't at all) because of publishing issues, servers out of rotation when the publishing happened, etc. On refresh, you simply end up hitting another server that does have the correct content.

If there are particular portions of the site that you or your friends experience this on more than others, please let us know via the Contact Us link. The Contact Us folks are going to be better at routing the issues to the right subsidiary or partner and they live for feedback on making the customer experience better. I'm happy to help in any way I can as well, but they will be a much better conduit.

Thanks

Scott
July 29, 2004 2:35 PM

All opinions posted here are those of the author(s) and are in no way intended to represent the opinions of our employer. This is provided "AS IS" with no warranties, and confers no rights. Use of included code samples are subject to the terms specified in the Terms of Use.

© 2009 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement