Recently, I worked on an interesting case where customer after upgrading an instance of SQL Server 2008 with SP1, the SQL server 2008 services and resources went into a stopped and failed state respectively. The services might appear to come online briefly for few seconds but will fail eventually. Let me share more about it and how I got that fixed.
Here is the environment details
Windows server 2003 R2 SP2 64 bit and versions and editions
SQL Server 2008 64 bit and all versions and editions
Clustered or Non clustered Instance of SQL Server 2008
· Scanning through the SQL Server error log after SQL Server 2008 was upgraded to SP1, could get below details.
Error: 5133, Severity: 16, State: 1
Directory lookup for the file "<path>\MSSQL10.<Instancename>\MSSQL\Data\temp_MS_AgentSigningCertificate_database.mdf" failed with the operating system error 3(The system cannot find the path specified.).
Error: 1802, Severity: 16, State: 1.
CREATE DATABASE failed. Some file names listed could not be created. Check related errors.
· The following errors accompany the above errors. These errors signify that the upgrade script used to upgrade Master database was not executed successfully.
Error: 912, Severity: 21, State: 2.
Script level upgrade for database 'master' failed because upgrade step 'sqlagent100_msdb_upgrade.sql' encountered error 598, state 1, severity 25. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the 'master' database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.
Error: 3417, Severity: 21, State: 3.
Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.
· Additionally, we might see the following errors in the Application event logs
MSSQLSERVER Error Failover 19019 N/A <Server Name> "[sqsrvres] CheckServiceAlive: Service is dead
SQLSERVERAGENT Information Service Control 102 N/A <Server Name> SQLServerAgent service successfully stopped.
MSSQLSERVER Error Failover 19019 N/A <Server Name> "[sqsrvres] OnlineThread: Error 1 bringing resource online.
MSSQLSERVER Error Failover 19019 N/A <Server Name> "[sqsrvres] OnlineThread: service stopped while waiting for QP.
· SQL Server setup creates a database with the data file named temp_MS_AgentSigningCertificate_database.mdf during the process of installation
· The above error is returned when SQL Server is not able to create that database in the default data path because the system cannot find the path specified
· There could be various reasons as to why the path cannot be found. One of the situations could be that the path is invalid.
· Firstly, we can check if this is a valid path. Since we are seeing an operating system error 3 the path is definitely not valid
· Please Note: We will not find this file anywhere on the machine as this is a temporary database that is created only for the sake of setup and will not persist after the setup
· The registry Key that we can check for the Default Data Path is
o HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.<Instance Name>\Setup\SQLDataRoot
· If this key has an erroneous path we can change that and restart the services for SQL Server to solve the issue
· Else if this key has the correct data path and still the error message shows an invalid path then we can visit the following registry hive
o HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.<Instance Name>\MSSQLServer
· I was able to identify the above registry hive by simply searching the registry for some portion of the path given in the error
· We might find keys like "backupdirectory", "defaultdata" and "defaultlog" having the invalid path in them. The most obvious ones that we need to change are "defaultdata" and "defaultlog"
· We have to change their values to the one we see under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.<Instance Name>\Setup\SQLDataRoot
· Now we should be able to start the services without any issues
· We can also look into the Errorlog and see if the upgrade script for the Master database was executed successfully.
Alternate way to identify the Default data path is by going to the Server properties in Management studio -> Database Settings -> Find the path under Database default locations. But this is only possible while the services are online and if we can connect to SQL server
Regards,Bharath Kumar SE, Microsoft SQL Server
Reviewed by,Akbar Farishta and Balmukund LakhaniTL, Microsoft SQL Server support
and is this not a BUG?
why cant you check for the validity of the path before you start to create a DB in that path?
And finally the error- "Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it.".. how misleading. Atleast stop misreporting. Check outr on the net how many people actually ended up rebuildign the systemDB to realize that its not the master corruption.
I agree with "Dude" here. Microsoft should implement checks before the user start the this/any similar process.
This is outstanding and invaluable information.
We encountered this exact issue after using MS Updates to install SQL 2008 SP1 (x86). In our case, the DefaultData and DefaultLog registry entries had a typo in the path ("SQLBD" instead of "SQLDB"). Since the installation was entirely unattended, we're unaware of how the typo could have occurred, but as soon as we corrected the entries, the service started without issue.
Thanks very much for your due diligence.
Man you saved my live! I was praying to solve this. Thanks.
Totally agree with "Dude". This should be a bug and fixed. Very misleading and leads to think that the master DB is corrupted, etc.
SQL Server team should design the product installation package to be more intelligent than this. I'm not exactly buying a antivirus program for my home computer from nitrosoft here. This is enterprise software!
Thanks Bharath!, u helped me too, huh, when i saw sql server down i was shocked and when searched error i found your post first. Thanks very much!
This post saved my bacon!! I upgraded SQL Server 2008 to SP2 on a Sunday afternoon with batch processing slated to started in 4 hours and the doggone server wouldn't come up, displaying exactly the error messages displayed above. The problem ended up being an extra space at end of the default data and log values. Once I removed the spaces - in the registry - MSSQLSERVER and MSSQLAGENT started right up and all was well.
SQL 2008: SP1, SP2, SP3 (all the CUs) - The issue persists.
R2 - SP1 (and all the CUs)- The issue persists.
Such a lame bug. I am sure the code fix is easy.. but guess Bharath + Akbar are too bizi to get it fixed.
worked thank you very much!
It worked!!! Well done. MS please fix the error messages atleast :(
Thank you. This blog helped me with the exact same problem after deploying 2008 SP3 to our clustered instance. However, in my case, the registry values were correct. For some reason, my folder defined for the Log directory was gone. I have no idea how that happened since it was obviously there for SQL to start up after the initial RTM installation.
Thank you, thank you, thank you!!
"We have to change their values to the one we see under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.<Instance Name>\Setup\SQLDataRoot"
Could we not also change the value in this "HKLM\..\Setup\SQLDataRoot" to be equal to the value in "HKLM\..\MSSQLServer"? (This is the opposite "direction" from what you suggest.)