To get better performance from SQL Server queries, you need to be able to read a query execution plan and determine if that plan needs to be optimized. An execution plan is the method that the optimizer will follow to execute the submitted query, and it is built and compiled (stored away for later reuse) on the first query execution. The execution plan tells you which indexes are being used, how they are being used, and what other processing is being done on the data before the statement is complete. Scan and seek operations reflect the processing path that is taken over the logical data, whether on disk or in memory. They are very good indicators of areas to optimize for missing indexes (in the case of scans) or of good index usage (in the case of seeks).
SQL Server Management Studio can display a flow diagram of an execution plan that contains information about what operations have been selected to execute the query, some of the statistics that were used to make that decision, and even suggestions on where statistics or indexes are missing and how much of an impact that makes on the execution cost. To see a graphical execution plan for a query, click Include Actual Execution Plan on the Query menu in SQL Server Management Studio. When you execute the query, there will be a diagram in the Execution plan tab of the result. Note that graphical plans are read from right to left, and if you right-click or hover over an operation, detailed information about that operation is displayed.
The following graphic is a screenshot of a query to return all the Sales Orders rows in the AdventureWorks sample database that have an order date of 03/12/2004. You can download the AdventureWorks sample database at AdventureWorks2008 on CodePlex.
You see that there is a scan operation on the SalesOrderHeader table, indicating that each row is being read. Also note that the optimizer is actually telling you that this is probably not an optimal plan. The text in green (only partially shown) is suggesting an index to add to optimize this query and is indicating that it could bring an approximately 73 percent improvement.
Scans are potentially expensive operations, as they indicate that rows are being read from a data structure one at a time and then evaluated for some condition. Generally, it is optimal to seek within an index to a required row or range of rows rather than to read each and every row. So, I always investigate every scan operation that I see displayed in an execution plan.
Note that sometimes the scan operation is the best choice. An example of this is where small tables, fewer than about 500 rows, are being queried. In most cases, a clustered index scan will be selected even when there is a nonclustered index that might seem a better candidate. So, verify the row count of a table in a clustered index scan before you try to optimize it with more indexes.
Most often, the scan is the result of not having a good index to support the query. In the query above, if you right-click the text in green and then click Missing Index Details, a new screen will open with a script to create the missing index. In the case above, the following code is suggested.
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] ON [Sales].[SalesOrderHeader] ([OrderDate])
If you create this index (replace [<Name of Missing Index, sysname,>] with an index name) and then rerun the query, you should see the following graphical execution plan.
This plan shows a seek operation on the SalesOrderHeader table, as there is a supporting index on the OrderDate column, and only the rows from the index that meet the criteria of the date that is indicated will be read from the disk and processed.
Next time, I'll write about how to determine how much I/O is being used by a scan operation vs. a seek operation.
Sharon Bjeletich is a database architect and performance and scalability expert who has worked with SQL Server since version 4.2. She is a former senior program manager in the Microsoft SQL Server product group and a former technical officer for the World Health Organization. Currently, she is an independent consultant at SQL Scalability.