Improving Your I/O Performance - SQL Azure Team Blog - Site Home - MSDN Blogs

Improving Your I/O Performance

Improving Your I/O Performance

Rate This
  • Comments 23

As a DBA I have done a lot of work improving I/O performance for on-premise SQL Server installations. Usually it involves tweaking the storage system, balancing databases across RAID arrays, or expanding the count of files that the tempdb is using; these are all common techniques of SQL Server DBA. However, how do you improve your I/O performance when you are not in charge of the storage subsystem, like in the case of SQL Azure? You focus on how your queries use the I/O and improve the queries. This blog post will talk about how to detect queries which use a high amount of I/O and how to increase the performance of your I/O on SQL Azure.

Detecting Excessive I/O Usage

With SQL Azure, just like SQL Server, I/O is a bottleneck in getting great query performance. Before you can make any changes, you first thing have to be able to do is detect which queries are having trouble.

This Transact-SQL returns the top 25 slowest queries:

SELECT TOP 25
    q.[text],
    (total_logical_reads/execution_count) AS avg_logical_reads, 
    (total_logical_writes/execution_count) AS avg_logical_writes, 
    (total_physical_reads/execution_count) AS avg_phys_reads, 
     Execution_count
FROM sys.dm_exec_query_stats   
    cross apply sys.dm_exec_sql_text(plan_handle) AS q 
ORDER BY
 (total_logical_reads + total_logical_writes) DESC

The output looks like this:

clip_image001

You can modify the ORDER BY clause in the statement above to get just the slowest queries for writes, or the slowest for reads.

Changing the Query

With SQL Azure, it is Microsoft’s job to maintain the data center, the servers, handle the storage, and optimize the performance of storage. There is nothing you can do to make the I/O faster from a storage subsystem perspective. However, if you can reduce the amount of read and writes to storage, you can increase the performance of your queries.

Reading Too Much Data

One way to overuse I/O is to read data that you are never going to use. A great example of this is:

SELECT * FROM [Table]

This query reads all the columns and all the rows from the [Table]. Here is how you can improve that:

  • Use a WHERE clause to reduce the number of rows to just the ones that you need for your scenario
  • Explicitly name the columns you need from the tables, which hopefully will be less than all of them

Create Covered Indexes

Once you have reduced the number of columns you are returning for each query, you can focus on creating non-clustered covered indexes for the queries that have the most read I/O. Covered indexes are indexes that contain all the columns in the query as part of the index, this includes the columns in the WHERE clause. Note that there might be several covered indexes involved in a single query, since the query might join many tables, each potentially with a covered index. You can determine what columns should go into the index by examining SQL Azure’s execution plan for the index. More information about be found in the MSDN article: SQL Server Optimization.

Just a note, non-clustered indexes (what you make when you do a covered index) reduce the performance of your writes. This is because on insertion or updates, the indexes need to be updated. So you need to balance your covered index creation with the ratio of reads and writes to your database. Databases with a disproportion amount of reads to writes gain more performance from covered indexes.

Summary

I barely touch on the ways to reduce your I/O usage and increase your query performance by modifying your queries. As a rule of thumb the techniques that you find on the MSDN for query optimization for on-premise SQL Server installations should work for SQL Azure. The point I am trying to make is that if your query I/O usage is high, focus on optimizing your queries, Microsoft is doing a great job of optimizing the storage subsystem behind SQL Azure.

Do you have questions, concerns, comments? Post them below and we will try to address them.

  • Some more ways to improve storage efficiency and reduce I/Os:  

    1. Filtered indexes reduce the size of index tree inside SqlServer.

    2. Sparse columns reduce disk storage inside SqlServer. Sparse columns can have filtered indexes on condition like (FilterColumn is not null) which will speed up queries along with storage.

    3. Enabling Data Compression at page level/ row level will make SqlServer store more rows in one data page.

    4. On big tables with many indexes, having a smaller clustered index key (an int as opposed to GUID) will reduce all index tree size of all non-clustered indexes and in turn improve performance on queries that use these indexes.

    If billing is based on amount of disk space used and/or number of IOs, the above should help you a bit by making effective use of SqlServer storage.

  • I would recommend to fine tune your queries in development environment before putting it on Cloud. MS cloud DB will have same limitations as your in-hour DB. Generate execution plan, fix issues , re-verify and after full satisfaction upload at cloud.

  • We have a visual basic programme that is giving us SQL server errors but we cannot seem to see any error in the server event log. Is there a way to tell if a query or stored procedure has gone over the SQL timeout period as we suspect this may be the cause?

  • Does SQL Azure support Indexed views ?

  • Wayne,

    Is it possible to elaborate on your statement:

    "Microsoft is doing a great job of optimizing the storage subsystem behind SQL Azure."

    Can you share some details about the storage subsystem - specs etc and what is being done to further optimize it. For example, we know that an Azure role instance comes with x compute power, Y amount of memory and so on.

    Thanks,

  • Ashish: Yes, see: msdn.microsoft.com/.../ee336275.aspx

  • John:  We have a white paper in the works on this subject to be posted shortly.  That topic is longer than a blog post or the comment section of the blog.

  • Hi Wayne,

    Sorry for coming late to the party, but this blog's comments are the closest to an answer I've found.

    I'm trying to follow all the guidelines at azurescope.cloudapp.net/BestPractices , and it talks a bit about concurrency limits.

    My question is, how do databases on the same azure sql server affect each other, if at all? From reading what you said earlier, I imagine each will be considered a separate tenant and will be allowed what it needs in terms of resources regardless of the server groupings I have them under in my windows azure management, correct? If not, what bandwidth limits will they have (as in how will database A throttle database B if both are located on the same sql azure server)?

    Thank you very much for the great topics btw!

Page 2 of 2 (23 items) 12
Leave a Comment
  • Please add 5 and 6 and type the answer here:
  • Post