This question comes up all the time, and unfortunately there isn't a single correct answer. There are, however, a number of things to consider that will help lead you to an acceptable one. The most important is to be sure you have defined the proper goals, objectives, and success criteria (see this post). I also want to point out that I am not covering how to use results to find issues or troubleshoot test runs; this post focuses solely on the mathematics of results and the expectations of the people receiving them. Let's look at a question and some of the responses I have seen before. Some of the answers look very reasonable at first, but they may have some caveats:
INITIAL QUESTION: In my performance report there are 3 columns, say Average Response Time, 95% Response Time and 99% Response Time. Which one should I use to report to my Business?
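Before getting to the responses, it helps to see why the three columns can disagree. Here is a minimal sketch (in Python, with made-up latency numbers and a simple nearest-rank percentile helper, both assumptions for illustration only) showing how a single slow outlier pulls the average away from both the typical request and the tail:

```python
import statistics

# Hypothetical response times in milliseconds: nine fast requests
# and one slow outlier.
response_times_ms = [120, 130, 125, 118, 122, 127, 131, 119, 124, 2500]

def nearest_rank_percentile(data, pct):
    """Nearest-rank percentile: smallest value such that at least
    pct% of the samples are at or below it."""
    ordered = sorted(data)
    rank = -(-pct * len(ordered) // 100)  # ceil(pct/100 * n)
    return ordered[max(rank - 1, 0)]

avg = statistics.mean(response_times_ms)
p95 = nearest_rank_percentile(response_times_ms, 95)
p99 = nearest_rank_percentile(response_times_ms, 99)

# The average lands nowhere near the typical request (~124 ms)
# or the worst case (2500 ms).
print(f"average: {avg:.1f} ms, 95%: {p95} ms, 99%: {p99} ms")
```

With this data the average comes out around 362 ms, while both tail percentiles surface the 2,500 ms outlier. Which of the three you report depends entirely on what the business agreed matters, which is the point the responses below keep circling back to.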
RESPONSES: Here are some responses I have seen, along with some thoughts to consider when reading them.
I would love to hear some other thoughts on this subject as well.
Thanks for writing this post, it's nice to see this type of thing discussed. Most people just use an average and get on with it, not realising that it doesn't really help them.
I think the best approach I've seen is to create graphs like these ones from jHiccup (www.azulsystems.com/.../jHiccup).
They include percentiles overall (bottom graph) and per-interval (i.e. over time, in the top graph). Once you understand them, they give a really nice overview of latency and, most importantly, they don't hide outliers the way an average can.
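To make the per-interval idea concrete (this is just an illustrative sketch with invented samples, not jHiccup's actual implementation): group the samples into time buckets and report the worst latency in each bucket, so a hiccup in one interval is not averaged away by the quiet intervals around it.

```python
from collections import defaultdict

# Hypothetical (seconds_since_start, latency_ms) samples;
# second 1 contains a hiccup.
samples = [
    (0.2, 12), (0.8, 15),
    (1.1, 14), (1.9, 250),
    (2.3, 13), (2.7, 16),
    (3.4, 12), (3.8, 14),
]

# Bucket the samples into one-second intervals.
by_interval = defaultdict(list)
for t, latency_ms in samples:
    by_interval[int(t)].append(latency_ms)

# Report the worst latency per interval, so the stall in second 1
# stays visible.
worst_per_interval = {i: max(v) for i, v in sorted(by_interval.items())}
print(worst_per_interval)  # -> {0: 15, 1: 250, 2: 16, 3: 14}
```

An overall average of these eight samples would be about 43 ms and would completely hide the 250 ms stall in second 1; the per-interval view shows exactly when it happened.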
I think your suggestion of using different values depending on the standard deviation (a, b, c, or d) could get a bit confusing.
Thanks Matt. I like the graphs you pointed out. I am working on a tool to help create all kinds of different reports from Visual Studio, but I keep running into two issues (Hopefully I will have some things to share with everyone very soon):
1) There are so many different ways to report things, and so many different things to report, that it can become overwhelming to decide what to use and what not to use.
2) I am trying to build this as a side project since it is not part of my normal job so time is extremely limited.
As for the Std Dev variants, I agree that it can be confusing but I included it since it was a suggestion that came from one of the other test teams in the company. They use it that way and it seems to work for them. The biggest point to take away from any of this is to ensure that YOU report on whatever you NEED and whatever you UNDERSTAND. And even more importantly, make sure that the things you report on are RELEVANT to the desired need (which is why I wrote Part #3).
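I don't know the exact rules that team uses, but one hypothetical version of a std-dev-driven report might look like the sketch below: if the spread is small relative to the mean, report the mean; otherwise fall back to a high percentile so the spread is not hidden. Treat the 10% threshold and the sample data as assumptions for illustration only.

```python
import statistics

# Made-up latency samples in milliseconds.
latencies_ms = [120, 130, 125, 118, 122, 127, 131, 119, 124, 140]

mean = statistics.mean(latencies_ms)
sd = statistics.pstdev(latencies_ms)

# Hypothetical rule: a coefficient of variation under 10% means the
# mean is representative; otherwise report the 95th percentile
# (nearest-rank) instead.
if sd / mean < 0.10:
    reported = mean
else:
    ordered = sorted(latencies_ms)
    reported = ordered[max(-(-95 * len(ordered) // 100) - 1, 0)]

print(f"mean={mean:.1f} ms, stddev={sd:.1f} ms, reported={reported:.1f} ms")
```

The design choice here matches the advice above: the rule only reports the mean when the data is tight enough that the mean actually represents the user experience.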
I've just read Part #3; I really like the example you show of how to talk through the performance goals with a customer.
It's been a while since I have run load tests; that was a couple of jobs ago. But I remember that when talking with the business side, they cared about two things:
1. Are our users having a great experience?
2. Is the site generating the end result? (most likely revenue)
You then bucketize the reports into those two categories. Response time data (login, search results, add to cart, submit order, etc.) mainly falls into the first bucket, whereas transaction-count data (max concurrent, max per hour, average per hour, etc.) falls into the second. This is where I think Application Insights, run in conjunction with the tests, is extremely valuable.
Your Part #3 post is spot on as it illustrates the importance of getting the business to agree upon this stuff before you even create your tests.
This has been a great series of posts! I would love to hear your opinion on extrapolating test environment results to prod, in those cases where customers do not have a 1:1 representation. How would you correlate values in those cases?
Thanks. Glad you are enjoying this. Unfortunately, your question is one that plagues many of us all the time. I will start with the de facto answer of "it depends!" (sorry, I hate that answer, but it is so true). I am going to be publishing an article or two very soon on extrapolation and how it has bitten me and a couple of teammates in the past. I will attempt to lay out a few things you can consider when extrapolating, but there are many more than I am even aware of. So the best answer is to fully understand the application, the architecture, and the behaviors of similar architectures you may be able to reference, and above all else, ADD DISCLAIMERS to any results you publish.