Database Snapshots (http://msdn2.microsoft.com/en-us/library/ms175158.aspx) in SQL Server 2005 Enterprise Edition allow a read-only point-in-time view of data. Recently we encountered an interesting scenario with a customer implementation. The customer creates a database snapshot. Then a query is executed on the database snapshot. The first time a set of queries are executed, they take 17 minutes. Every subsequent execution of the same queries takes only 20 seconds. Obviously, the first execution required reading the data from the disk (physical reads) and takes a longer time, whereas, the second and subsequent execution of the query finds the data in the buffer pool (logical reads), and therefore takes less amount of time. There is nothing unusual here.
What is interesting is that right before querying the database snapshot, the source database was queried for exactly the same data, so essentially, the source database pages required to satisfy the query are already in the buffer pool, by the time we execute the first query on the database snapshot. So, the customer expected that, since the query against the database snapshot actually reads data from the source database (in this case, since the data has not changed after the database snapshot was created), it should have used the pages in the buffer pool, instead of doing the physical reads. But it didn’t.
The reason lies in the way the queries work on a database snapshot. In this blog we will discuss some of the internals that will throw some light on this. This blog assumes an understanding of database snapshots.
We will analyze the behavior by creating a database snapshot on the AdventureWorks database:
CREATE DATABASE [AdventureWorksSnapshot] ON
AS SNAPSHOT OF [AdventureWorks];
So, for the example discussed here, AdventureWorks is the “source database” and AdventureWorksSnapshot is the “database snapshot”. Note that the database snapshot has its own data file (AdventureWorks_DataSnapshot.mdf), which is a Windows sparse file (refer to Books Online for more details).
For our test, no changes have been done to the source database after the database snapshot was created. Therefore, the sparse file of the database snapshot is empty. At this time, if you run a query on the database snapshot, SQL Server will read the data from the data file of the source database.
Now, let’s execute a query on the source database. But before we do that we need to clear the buffer cache by issuing dbcc dropcleanbuffers.
Let’s also turn on the io and time statistics.
set statistics io on
set statistics time on
Now, let’s execute the following query one time:
select * from AdventureWorks.Sales.SalesOrderDetail d
inner join AdventureWorks.Sales.SalesOrderHeader h
The io and time statistics gives us the following output:
SQL Server parse and compile time:
CPU time = 62 ms, elapsed time = 364 ms.
(359 row(s) affected)
Table 'SalesOrderDetail'. Scan count 12, logical reads 41, physical reads 18, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 38, physical reads 15, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 813 ms.
The above results indicate that physical reads were performed on the database files to retrieve the result of the query. Now let’s execute the same query one more time:
CPU time = 0 ms, elapsed time = 1 ms.
Table 'SalesOrderDetail'. Scan count 12, logical reads 41, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 38, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CPU time = 0 ms, elapsed time = 207 ms.
The second execution of the query didn’t need any physical reads because after the first execution, the required pages are already cached in the buffer pool.
Now, let’s execute the same query on the database snapshot AdventureWorksSnapshot, and observe the io and time statistics.
select * from AdventureWorksSnapshot.Sales.SalesOrderDetail d
inner join AdventureWorksSnapshot.Sales.SalesOrderHeader h
CPU time = 31 ms, elapsed time = 398 ms.
Table 'SalesOrderDetail'. Scan count 12, logical reads 41, physical reads 22, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 38, physical reads 16, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CPU time = 15 ms, elapsed time = 300 ms.
Note that the first query on the database snapshot required physical reads. Since no data in the source database was changed after the database snapshot was created, the sparse file is empty, and therefore, the data is retrieved from the source database. And, since, the required pages of the source database are already cached in the buffer pool, the customer expected that there is no need to perform any physical reads. But there were physical reads, why?
A peek into the internals of how database snapshots work will reveal the mystery.
When a database snapshot is created, an in-memory bitmap is maintained to indicate whether a page has been changed in the source database or not. If the page is changed, then the original page has been copied to the database snapshot’s sparse file. If the page is not changed, the original page is still there in the source database. While executing a query, this bitmap is referenced to find out whether a particular page is to be retrieved from the database snapshot’s sparse file or from the source database’s data file.
The database snapshot is just another database from the SQL Server engine point of view. Each database maintains its own cache in the buffer pool. Data pages cached in the buffer pool for one database cannot be copied or transferred to the cache managed by another database (or database snapshot) in the same SQL Server instance. Therefore, when we execute a query the first time on the database snapshot, the database snapshot has no pages in the buffer pool, and therefore decides to read from disk. Even though the source database has already read these pages into the buffer pool it still performs physical reads against the data files of the source database, and places the pages in the buffer pool associated with the database snapshot.
The number of data pages for each database read into the buffer pool is found by querying the sys.dm_os_buffer_descriptors.
select db_name(database_id) [DB Name], database_id, count(*) [cached pages count] from sys.dm_os_buffer_descriptors
group by database_id
set statistics io off
The output we get:
DB Name database_id cached pages count
----------------------- ----------- ------------------
AdventureWorks 6 560
AdventureWorksSnapshot 7 116
Once the pages are read from disk, they are put in the buffer pool for the database snapshot, and the subsequent queries that need these pages will be able to use the cache. Let’s execute the same query again on the database snapshot to verify this.
CPU time = 0 ms, elapsed time = 176 ms.
As you can see, the second execution of the query didn’t do any physical reads.
The queries against the source and snapshot database demonstrate that the data pages are maintained in the buffer pool independently of each database. Fundamentally, reading any database’s data pages that do not exist in the buffer pool will resolve to disk. Databases do not share data pages in the buffer pool. In the case of a snapshot database where the queried data pages are not in the buffer pool, the in-memory bitmap is checked to determine if the data pages are in the sparse files for the snapshot database or in the source databases files. The source database’s data pages in the buffer pool will never be read or copied by a snapshot database.
Okay, I think the customer here is right: It should be a little bit more intelligent - or lets rephrase this - there is room for further optimisation. This may be how its working (great-insight) currently and gives ideas on how we could get a few more umph for our bucks.
1) If the sparse file is empty, why not go and check the buffer cache of the database and treat them as the same thing?
--> Easy optimisation 1 - prehaps aim at Developer/Standard Edition.
2) In the case of a snapshot database where the queried data pages are not in the buffer pool, the in-memory bitmap is checked to determine if the data pages are in the sparse files for the snapshot database or in the source databases files --> If its found that its still in the origional database, check the buffer execution plan in the original database and execute from that plan instead of generating a new plan.
--> Slightly harder logic required : Enterprise edition kind of function. As this allows checking at the table level initially until greater granularity can be used.
I can imagine how DB007's suggestions would add complexity to the logic to such a level that bugs are introduced. However, I think that there's an easier,cleaner and safer way to optimize this kind of situation.
1. Query executed in snapshot DB needs a page, so a logical I/O is initiated.
2. Page is not in cache for snapshot DB, so a request for physical I/O is mage.
3. Now, the new logic kicks in. The request for physical I/O is intercepted and a check is made to see if the same page happens to be in the cache of some other DB.
4. If that is the case (and the page in cache is not dirty), the physical I/O is carried out by copying the 8K bytes of that page from Db1's cache to Db2's cache. Otherwise, the physical I/O is performed as it's currently done.