Both SQL Server and SQL Azure have a procedure cache which is used to improve the performance of queries on the server. This blog post will talk about how the procedure cache works on SQL Azure.
SQL Azure has a pool of memory that is used to store both execution plans and data buffers. This pool of memory is used for all databases on the physical machine, regardless of owner of the database. Even though the pool is across all databases on the machine, no one can see execution plans that they do not own. The percentage of the pool allocated to either execution plans or data buffers fluctuates dynamically, depending on the state of the system. The part of the memory pool that is used to store execution plans is referred to as the procedure cache.
When any SQL statement is executed in SQL Azure, the relational engine first looks through the procedure cache to verify that an existing execution plan for the same SQL statement exists. SQL Azure reuses any existing plan it finds, saving the overhead of recompiling the SQL statement. If no existing execution plan exists, SQL Azure generates a new execution plan for the query.
An instance of a database running on SQL Azure is called a replica; there are three replicas of any one database running at a given time. One replica is considered the primary replica, all the read and write queries go to this replica. The other two replicas are considered secondary, any data written to the primary replica is also written to the secondary replicas. If the primary replica fails, needs to be cycled for updates, or there is a load balancing operation, then a secondary replica is promoted to the primary.
Each replica resides on different physical machines which are on separate fully redundant racks. Because, procedure cache is on a per server basis, if the primary replica fails/cycled, when the secondary is promoted there are no query plans for that database in the procedure cache.
SQL Azure keeps track of performance statistics for every query plan in the cache. You can view these statistics for your database queries using the dynamic managed view (DMV) sys.dm_exec_query_stats like so:
For a more details example, see this blog post, I showed how to find queries with poor I/O performance by querying sys.dm_exec_query_stats.
sys.dm_exec_query_stats only reports statistics for the primary replica for your SQL Azure database. If a secondary is promoted to the primary, the results from sys.dm_exec_query_stats might be much different, just seconds later. Queries in the cache might suddenly not be there, or execution counts could be smaller, typically you would see them grow over time.
Execution plans remain in the procedure cache as long as there is enough memory to store them. When memory pressure exists, SQL Azure uses a cost-based approach to determine which execution plans to remove from the procedure cache.
SQL Azure removes plans from the cache, regardless of the plan owner; all plans across all databases existing on the server are evaluated for removal. There is no a portion of the memory pool set aside for each database. In other words, the procedure plan cache is optimized for the benefit of the machine not that individual database.
SQL Azure currently doesn’t support DBCC FREEPROCCACHE (Transact-SQL), so you cannot manually remove an execution plan from the cache. However, if you make changes to the to a table or view referenced by the query (ALTER TABLE and ALTER VIEW) the plan will be removed from the cache.
In many ways the procedure cache work much like SQL Server; SQL Azure is built on top of SQL Server. However, there are some differences and I hope you understand them better now. Do you have questions, concerns, comments? Post them below and we will try to address them.
What if another database or databases on the same server have many inefficient stored procedures? Doesn't that reduce the liklyhood that some of my moderately expensive procedures will be cached thus reducing the performance of my procedures? Is there any balancing to ensure the databases on the same server as a hog of a database get a certain minimal level of caching to keep them performant as well?
Michael: Caching is down for the overall performance of the server, there is no minimal level of caching per database. However, the procedures are not removed from the cache based on how inefficient they are it is based on the query compile cost. Balancing is performed on each replica, and replicas are moved based on database usage compared to other databases on the machine. So in essence your database is balanced to have a share of the resources; however the procedure cache is not reserved per database.
This is a great series of posts on SQL Azure explaining various aspects logically organized. Thanks for this and keep them going :)
Couple of things:
1. I would like to understand the basics of replica. Why we need it? Are the other two instances created automatically? Any link or artical which details on this would be a great input.
2. I think for on-premise version of SQL Server, buffer pool and procedure cache are set at an instance level but not for specific database and same is the way how even SQL Azure works. Am I missing something when I talk about buffer pool and cache management for on-premise version of SQL Server here?
The same sql works 10 to 100 times faster as ad-hoc query than as a procedure. This problem has plagued SQL Server since the beginning but it is far worse on Azure. Are you aware of this, and do you have any suggestions?