One of the first things I do when I begin to troubleshoot poor performance on a SQL instance is collect wait information. I want to see where the requests that are executed on the instance are spending their time. Knowing where the instance spends the most time waiting allows me to focus my tuning efforts on the most-impactful areas. This methodology is commonly called “Waits and Queues” and is already covered very well in several places:
The following script queries dm_os_wait_stats and returns totals for each wait type over the last 5 minutes. Running this while the instance is suffering from poor performance will help you understand where your instance has been waiting during this time. Once you know the time it spends on different waits, you can determine where your best tuning opportunities lie.
-- DROP TABLE #Waits SELECT GETDATE() AS 'snapshot_time', * INTO #Waits FROM sys.dm_os_wait_stats
-- Wait for 5 minutes, though this can be changed to any timeframe 'HH:MM:SS' WAITFOR DELAY '00:05:00'
INSERT INTO #Waits SELECT GETDATE(), * FROM sys.dm_os_wait_stats
-- Select the top waits during the time period SELECT TOP 50 ws1.wait_type , DATEDIFF(SECOND,ws1.snapshot_time, ws2.snapshot_time) AS 'snapshot_duration_sec' , ws2.waiting_tasks_count - ws1.waiting_tasks_count AS 'waiting_tasks_count_dif' , ws2.wait_time_ms - ws1.wait_time_ms AS 'wait_time_ms_dif' , (CAST(ws2.wait_time_ms AS money) - ws1.wait_time_ms) / DATEDIFF(SECOND,ws1.snapshot_time, ws2.snapshot_time) AS 'wait_time_ms_per_second' , (100 * (ws2.wait_time_ms - ws1.wait_time_ms) / SUM(CAST(ws2.wait_time_ms AS money) - ws1.wait_time_ms) OVER ()) AS 'percent_of_total_waits' , ws2.signal_wait_time_ms - ws1.signal_wait_time_ms AS 'signal_wait_time_ms_dif' FROM dbo.#Waits ws1 JOIN dbo.#Waits ws2 ON ws1.wait_type = ws2.wait_type WHERE ws1.snapshot_time = (SELECT MIN(snapshot_time) FROM #Waits) AND ws2.snapshot_time = (SELECT MAX(snapshot_time) FROM #Waits) ORDER BY wait_time_ms_Dif DESC
The definitions for the various wait types are documented in the BOL entry for dm_os_wait_stats. Some of the more-common types are explained at the CSS SQL blog and at Paul Randal’s blog.
There are a couple of important things worth mentioning here. First, the wait time totals returned by this script only reflect the time period during which the script was run. This seems obvious when you say it out loud, but keep it in mind when capturing your data. You want to be sure that you run the script during a period of poor performance and “typical” user traffic. Second, remember that the wait totals returned by this script are for the entire instance. Specific statements or batches may be performing poorly for entirely different reasons than the instance. If you’re focusing on a statement or small group of statements then you should focus your analysis on those statements.
(Cross posted to http://www.houseofsql.com/2012/01/31/5-minute-wait-script/ )