I believe for any performance investigation you need to have a scenario. Something which is believable and real world like. In my opinion scenarios drive requirements and design.

I want to have a workload which I can use to test and demonstrate various capabilities of SQL Server database. Why I am doing it you may ask? Over the years SQL Server has grown to be such a huge product that it is impossible to keep track of all the things you can do with it. There are tons of books, online articles, white papers, suggestions, tips for getting best performance out of SQL Server. Most of these are very disjointed and keeping track of them is proving impossible for me. So what I want to do is to start with one system and see how these scenarios apply to this system and what should be done to get the best of SQL Server.

I am going to start with "a" workload and start testing various suggestions on it. See how well these suggestions apply to this workload.

Here are the requirements of the workload off the top of my head -

  1. Scalable. The workload should be able to scale from running on a small desktop level machine (dual core 2 GB RAM with one physical hard disk) to a decent commodity server (four socket quad core 16 GB 6 physical disk) class machine.
  2. Real life like. Should represent a real life scenario. Amazon.com model comes to mind.
  3. Reproducible / Predictable. Should reproduce the same results no matter how many times it is executed. For a performance validation, being able to reproduce a scenario is very important.
  4. Shippable. The workload should be able to zip into a small download so that anyone can reproduce the workload on his / her machine.

Other requirements that are good to have but not necessary a deal breaker are -

  1. Able to execute from 2 machine set up. I would prefer not to have a application server layer just to keep things simple.
  2. Ability to capture performance metrics. Sure one can keep track of individual tests and the performance data using text files, word documents. But it would nice to have a system to keep track of all the tests and corresponding results.
  3. Ability to run on other commercial database systems. I am not targeting any other database system than SQL Sever 2005 and future versions of SQL Server. But if I can can have a database agnostic workload, that would be sweet.

What do you guys think the requirements of this workload should be? Anything else comes to mind?

If there is anything already available like this please point me to it.