On extremely rare occasions in SQL Server 2005, when the following conditions are true:
1. You have a database configured for database mirroring
2. You have replication configured on the same database
3. You are running the database mirroring monitor
4. The SQL Agent job created by the database mirroring monitor is enabled
It was possible to encounter a situation where tempdb grew suddenly and significantly, leading to all the complications that such a situation causes, including but not limited to:
Msg 1105, Level 17, State 2, Line 3
Could not allocate space for object xxxxxxx in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
Error: 17053, Severity: 16, State: 1.
C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\tempdb.mdf: Operating system error 112(There is not enough space on the disk.) encountered.
Error: 1101, Severity: 17, State: 2.
Could not allocate a new page for database 'tempdb' because of insufficient disk space in filegroup 'PRIMARY'. Create the necessary space by dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
If you were able to capture this situation and have some monitoring running, you would notice the following event captured in profiler:
An execution of the following which is the main entry point stored procedure for the mirroring monitor:
exec sys.sp_dbmmonitorupdate @database_name
followed by an execution of :
exec @retcode = sys.sp_dbmmonitorMSgetthelatestlsn @database_name, @end_of_log_lsn output
which was the last procedure executed by the mirroring monitor before tempdb started to grow. If you were to actually trace the details of this procedure, you could see the statements which it executes. What you will see is that last statement actually executed before tempdb starts filling is this:
dbcc dbtable(blog_mirror) with tableresults, no_infomsgs
Being that this procedure is actually held in the resource database (the others are in msdb so you can take a look easily if you want to), it is protected from normal viewing, so I won't reproduce it here, but it does exactly what it's name would imply, it gets the latest LSN and then uses this to do some calculations of amount of time required to synchronize and other such functions.
The problem lies here in that in extremely rare occurrences this procedure can cause huge tempdb growth due to the way it calculates the latest LSN using dbcc dbtable (an undocumented internal DBCC command) and stores it's temporary calculations in a temp table. Unfortunately the only way to resolve this problem in SQL Server 2005 is to disable the mirroring monitor job in SQL Agent. You should only consider or need to do this though if you are definitely experiencing this specific problem, i.e. you have all the pre-requisite conditions and you have a profiler trace proving that the dbcc command was the command executed before tempdb filled up.
However the good news is that in SQL Server 2008 we made a design change to collect this data in a far more efficient way. If you look at the results returned from a query of the sys.database_mirroring DMV in SQL 2008:
you will note 2 columns have been added to the output:
Meaning that the information that the mirroring monitor requires is now available in a much simpler format. Therefore if you look at the SQL 2008 version of the msdb stored procedure sys.sp_dbmmonitorupdate, you will note that the execution of sys.sp_dbmmonitorMSgetthelatestlsn has been replaced with a simple select from the new DMV columns:
select @role = (mirroring_role - 1),
@status = mirroring_state,
@witness_status = mirroring_witness_state,
@failover_lsn = mirroring_failover_lsn,
@end_of_log_lsn = mirroring_end_of_log_lsn
from sys.database_mirroring where database_id = @database_id
which is both simpler, faster and avoids any possibility of the tempdb problem occurring.
Graham thanks for the post from 2008. there was very little else out there. I was investigating this problem for the third time in a couple of months in our installation and found the culprit was indeed sp_dbmmonitorresults. It basically ran for over 2 hours until I killed it taking Tempdb from 5 - 83 Gigs. Problem is, its now Jan 2011 and I am still running on SQL 2005 for a few more months in Production.
Was this problem confirmed in SQL 2005? Do you know if there is a Hotfix?
Is there also a relationship between this issue and Read Committed Sanpshot which makes extensive use of Tempdb?
Is the work around? Turning this Job off will leave me without Database mirroring monitoring statistics which I use to generate Alerts.
Sounds to me like the solution in 2008 could have been applied to 2005?
Any info you can provide would be great.
Hi Ray, I'm not aware of it being related to isolation level in addition to the other symptoms described, but then I haven't seen your exact data. The specific problem was really the use of DBCC DBTABLE execution to get a temporary table populated and this is where we made a fix. Thoeretically other conditions could cause the results of DBCC DBTABLE to grow in addition to those I described, but I don't have the conclusive data to prove this one way or the other. I've browsed the bug again and we did make a fix into SQL 2008 as you are aware, but this has not been backported to SQL 2005 as the occurences were so rare and there is a work around. The reason that we didn't push it back is that there are several other code changes which needed to go in around the stored procedure change which made it a high risk change that could only be pushed into a future version. I've double checked the latest version of the procedure pushed out with SP4 and it still contains the original code. Therefore as I see it you will have to disable the monitoring to be certain of avoiding this. If you want to discuss in more detail please feel free to email me directly via the contact section of the blog. Thanks - Graham
I just experienced this and was able to kill the process doing the dbcc. That stopped the tempdb growth and the mirroring job continued to run on its normal schedule. Also the mirroring appears to be intact.