Welcome to MSDN Blogs Sign in | Join | Help

Over next few posts I am going to talk about "a" methodology to do performance assessment / tuning of SQL Server based applications. Main focus will be on SQL Server engine but wherever necessary references to changes needed in application code (non T-SQL, non schema based, non configuration options etc.) will be made. I would love to hear your thoughts and comments on this.

 

SQL Server performance issues show up in following categories –

  1. Increased utilization of resources – This condition happens when SQL Server starts consuming more resources – CPU, IO, memory, network bandwidth etc. even when the workload has not changed much.
  2. Slow response time – Application which was running within user expected time start running slowly even when the workload has not changed much.

Basic essential information

Before starting performance investigation it is necessary to collect basic information about the machine.

  • Hardware – CPU, memory, Disk layout – physical / logical, network bandwidth
  • Other applications running on the system
  • SQL Server configuration – sp_configure
  • Database, transaction log, TempDB files layout.

Resource bottleneck

Typical resources which can become bottleneck in SQL Server set up are –

  • CPU
  • Memory
  • Disk I/O
  • Network
  • TempDB

We will look at these various resources to understand why it has become bottleneck and what can be done to reduce its impact.

I believe for any performance investigation you need to have a scenario. Something which is believable and real world like. In my opinion scenarios drive requirements and design.

I want to have a workload which I can use to test and demonstrate various capabilities of SQL Server database. Why I am doing it you may ask? Over the years SQL Server has grown to be such a huge product that it is impossible to keep track of all the things you can do with it. There are tons of books, online articles, white papers, suggestions, tips for getting best performance out of SQL Server. Most of these are very disjointed and keeping track of them is proving impossible for me. So what I want to do is to start with one system and see how these scenarios apply to this system and what should be done to get the best of SQL Server.

I am going to start with "a" workload and start testing various suggestions on it. See how well these suggestions apply to this workload.

Here are the requirements of the workload off the top of my head -

  1. Scalable. The workload should be able to scale from running on a small desktop level machine (dual core 2 GB RAM with one physical hard disk) to a decent commodity server (four socket quad core 16 GB 6 physical disk) class machine.
  2. Real life like. Should represent a real life scenario. Amazon.com model comes to mind.
  3. Reproducible / Predictable. Should reproduce the same results no matter how many times it is executed. For a performance validation, being able to reproduce a scenario is very important.
  4. Shippable. The workload should be able to zip into a small download so that anyone can reproduce the workload on his / her machine.

Other requirements that are good to have but not necessary a deal breaker are -

  1. Able to execute from 2 machine set up. I would prefer not to have a application server layer just to keep things simple.
  2. Ability to capture performance metrics. Sure one can keep track of individual tests and the performance data using text files, word documents. But it would nice to have a system to keep track of all the tests and corresponding results.
  3. Ability to run on other commercial database systems. I am not targeting any other database system than SQL Sever 2005 and future versions of SQL Server. But if I can can have a database agnostic workload, that would be sweet.

What do you guys think the requirements of this workload should be? Anything else comes to mind?

If there is anything already available like this please point me to it.

Here's an old document on analyzing memory problems. Here's summary -

This document is intended to help you diagnose common CLR performance issues. Over the years we have seen a wide variety of CLR performance issues from our customers. This document tries to classify these issues in broad categories and provide guidance for each class of problems. Many facts mentioned in this document hold for the CLR versions V1.0 and V1.1. If there are differences in behavior between these two versions then they are called out. V2.0 behavior, if documented here, is subject to change.

I think there have been some updates to this document. I'll try to dig out links to this.

While doing memory analysis on one of my scenarios, I noticed that System.Collections.BitArray.BitArrayEnumeratorSimple::get_Current Object () is doing lot of allocations of System.Boolean object.

I wrote a simple program to reproduce the behavior and as it turns out, for BitArray object returning an enumerator by calling foreach operator does allocate a lot of memory. In a simple test of 1000 entries in BitArray object, enumerating over the entries 1000 times resulted in 11,767,392 bytes of allocations for 980,616 instances of System.Boolean object. The total allocations for the scenario were 11.49 MB. The total time for executing the scenario was 35.120 ms.

bool b;

BitArray newBitArray = new BitArray(1000, false);

for (int j = 0; j < 1000; j++)

{

foreach (bool b2 in newBitArray)

    {

        b = b2;

    }

}

I changed the code to access the properties based on index rather than getting enumerator over the BitArray object. The allocations to System.Boolean object disappeared and the total allocations for the scenario reduced to 23.52kB. The time for execution came down to 8.189 ms.

bool b;

BitArray newBitArray = new BitArray(1000, false);

for (int j = 0; j < 1000; j++)

{

for (int i = 0; i < newBitArray.Count; i++)

    {

        b = newBitArray[i];

    }

}

So there you go. Watch for BitArray.GetEnumerator() in your code.

In case you are wondering, I used CLR Profiler for analysis. You can download this for free from here.

Update: Added file for sample program

Kim Hamilto on BCL team has written a great post on performance differences in Named Value Collections and Hashtable.

Lookups scale pretty well except that Named Value Collections is 2.6x worse. Addition also scales very well but this time Named Value Collections is 8.5x worse. He explains -

What causes the additional overhead for NameValueCollection? NameValueCollection allows a key to be associated with one or more values. So the additional cost is caused by the way lookups are performed internally: NameValueCollection actually delegates the hash key lookups to an internal Hashtable, which may contain multiple entries associated with that key.

If there are multiple entries associated with the key, it will return them appended together. So in fact, if you’ve assigned multiple entries to the same key (the trials above did not), then the lookup cost will be linear in the number of items you’ve assigned to the key, because the accessor will append them together. This is demonstrated in the following NameValueCollection.

The remove operation is worse of the three and does not scale. Here's explanation -

For NameValueCollection, the cost of removes is linear in the size of the collection, caused by the need to shift the index-based lookup array. Note that this means NameValueCollection and Hashtable have different asymptotic behavior for removes.

Here's recommendations on use of these data structures -

NameValueCollection only accepts keys and values that are Strings, so this is a very specialized collection. It’s useful in a situation in which you either need to associate multiple values with a key, or to do hash-based lookups as well as lookup by index (and hopefully not perform too many removes).

However, if you need to store string key/value pairs and you don’t need to perform index-based lookups or associate multiple values with a key, you may prefer to use the generic Dictionary class. This has the same asymptotic behavior as Hashtable in all cases and furthermore avoids any costs due to boxing.

Ryan Byington has written an entry on changes needed for reading performance counters data when logged as non-admin on the box. Basically the change required needs adding yourself to Performance Monitor Users group. For accessing performance counter on remote machine -

On Windows 2003, Windows XP x64 Edition, and Vista you still must be part of the Performance Monitor Users group on the remote machine but there is a problem with the PerformanceCounter class where it tries to read some registry keys on the remote machine that a non admin users do not have access to. To give your user read access to these keys without having to be an admin on the remote machine complete the following steps on the remote machine:

  1. Open the Registry Editor by going to the Start Menu and selecting Run…, then type “regedit”, and click the OK button.
  2. Navigate to the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurePipeServers\winreg registry key.
  3. Right click on the "winreg" key and select Permissions. Add users or groups to which you want to grant Read access.
  4. Exit Registry Editor and restart Windows.

For more explanation on this process see http://support.microsoft.com/?kbid=153183.

Maoni has written a good insight on process to identify performance issues.

I’ve seen enough people who looked really hard at some performance counters (often not correct ones) or some other data and asked tons of questions such as “is this allocation rate too high? It looks too high to me.” or “my gen1 size is too big, right? It seems big…”, before they have enough evidence to even justify such an investigation and questions.

For people who are required to investigate performance related issues, especially when the deadline is close, my advice is “understand the problem before you try to find a solution”. Determine what to look at based on evidence, not based on your lack of knowledge in the area unless you already exhausted areas that you do know about. Before you ask questions related to GC, ask yourself if you think GC is actually the problem. If you can’t answer that question it really is not a good use of your time to ask questions related to GC.

She brings up two fundamentals

  1. Having knowledge about fundamentals - memory and CPU really helps.
  2. Knowing what your performance requirements are is a must.

When you’ve decided that there is a problem, dig into it instead of guess what might be wrong. Example - If your app is using too much memory, look at who is using the memory. If you’ve decided that the managed heap is using too much memory, look at why. Managed heap using too much memory generally means you survive too much in your app. Look at what is holding on to those survivors.

Ever wanted to see data visually in excel without going through the complex wizard? Juice analysis has posted one of the neatest hacks to get graphical representation of data.

excel_rept2.gif

The hack uses simple function REPT.

Microsoft Excel 12 has similar functionality and is super easy to use.

See Lightweight data exploration in Excel for more detail.

This gets even better look at More on Excel in-cell graphing has additional information on this hack. They even have a great excel sheet demonstrating all the hacks mentioned in this article.

Ever got lost in reading the call stack trace from .NET profiler? I do in the recursive calls to functions. Other times I just want to get a feel of the application call stack to identify which areas should I be focusing on.

Josh Williams has written an amazing tool to help you out here. Performance Console starts where profiler ends. Once you have exported data from profiler into .csv files (using command line or export functionality), you can point the tool to the directory and the tool will generate a nice output for you.

The command line is very intuitive and you can do operations like show me call stack for all the functions where inclusive time is more than 5% and BTW, collapse all the un interesting nodes (e.g. where symbols are not present).

I have been using this tool for a few weeks now and believe me it has made my life so much simpler. It is certainly very useful in finding any regressions in the code.

So what does a developer dealing with unmanaged code do? Josh mentions in Rico's post introducing Performance Console that it works very well with unmanaged code as well.

Josh has written an introduction to the tool but you need to read the (short but very useful) documentation that comes with the download.

Give it a try!

Check out the CLR Profiler team's blog. It is a collection for all the great articles profiler team is writing. Some of my favorites are -

Enjoy!

Slava has written a great post on the information available in SQL Server DMV - sys.dm_os_sys_info. This is a great resource for finding Windows system information, such as the CPU tick count, hyperthread ratio, physical and virtual memory, as well as the worker and scheduler counts.

My favorites are -

-- How many sockets does my machine have
select cpu_count/hyperthread_ratio AS sockets
from sys.dm_os_sys_info

-- Is my machine hyper threaded?
-- Unfortunately you can’t derive this information using this DMV today

-- How many either cores or logical CPU share the same socket?
select hyperthread_ratio AS cores_or_logical_cpus_per_socket
from sys.dm_os_sys_info

You can find more information about this DMV at http://msdn2.microsoft.com/en-us/library/ms175048....

By default, SQL Server 2005 Express Edition and SQL Server 2005 Developer Edition do not allow remote connections. To configure SQL Server 2005 to allow remote connections, complete all the following steps:

  • Enable remote connections on the instance of SQL Server that you want to connect to from a remote computer.
  • Turn on the SQL Server Browser service.
  • Configure the firewall to allow network traffic that is related to SQL Server and to the SQL Server Browser service.

See How to configure SQL Server 2005 to allow remote connections for more details.

Very Interesting observation about CLR assemblies...

http://www.sqljunkies.com/WebLog/ktegels/archive/2006/03/06/18506.aspx

In our internal discussion forum for SQL Server, somebody asked this question.  Here is the answer (Thanks G2)

Temporary tables & table variables differ in semantics, capability and query performance. Which one is appropriate for you will depend on your circumstances.
 
Temporary table operations run under the user transaction (user controls commit/rollback.) Table variable operations run under a system transaction -- statement success/error controls commit/rollback.
 
Temporary tables allow altering of tables after create (create index, add/alter/drop columns etc) Table variables do not allow alter after declare.
 
Temporary tables leverages statistics based query recompile. Table variables do not.
 
A simple rule of thumb would be to use table variables for small results/simple queries & use temporary tables for larger results/complex queries (assuming semantics & capability differences do not matter.)

One of the biggest pain points for SQL Server 2000 was for looking at the plan of a SQL Statement, you have to execute the same. As per my understanding SQL Server 2000 did not expose any mechanism for this. So the only way to look at the plan of the query was to capture the queries used by application using profiler and then run them in Query Analyzer. This was a nasty process and sometimes it was not possible to use this mechanism if the query involved dependency on some objects not present in the query under consideration, for instance temporary tables.

In SQL Server 2005, this analyis has been made possible by presenting the plan in XML format. For looking at the plan of the query in XML format, execute the following command -

-- Find Plan for a Query
--
select query_plan
from sys.DM_EXEC_QUERY_PLAN(0x0600050059571120B8414C0B000000000000000000000000)

For better understanding of DMV DM_EXEC_QUERY_PLAN, see http://msdn2.microsoft.com/en-us/library/ms189747.aspx.

Being able to look at the plan after the query has executed is a big help. But I would have preferred to see this plan in graphical format. How do I do that?

SQL Server 2005 also supports a mecanism to so that. To accomplish this, store the xml plan in a file and change the extension of the file to .sqlplan. Now open this file in SQL Server Management Studio and VOLA you can see the plan in graphical format.

More Posts Next page »
 
Page view tracker