Windows Server 2008 introduces significant changes to the Windows Cluster Service. The new cluster security model is one of the most important of these changes, and it requires seasoned SQL Server administrators to update their old and trusted knowledge.
A few days ago I ran into a problem while installing a SQL Server 2005 cluster on Windows Server 2008. The SQL Server installation seemed to complete successfully on both cluster nodes; however, the SQL Server service would not start as a clustered service. As soon as the installation completed, the clustered database resource failed over to the second cluster node and then failed completely.
The setup log file Summary.txt (located by default in the C:\Program Files\Microsoft SQL Server\90\Setup Bootstrap\LOG folder) showed no installation problems, and the SQL Server ERRORLOG did not contain any useful information about the error:
2008-07-09 14:33:10.81 Server Microsoft SQL Server 2005 - 9.00.1399.06 (X64) Oct 14 2005 00:35:21 Copyright (c) 1988-2005 Microsoft Corporation Enterprise Edition (64-bit) on Windows NT 6.0 (Build 6001: Service Pack 1)
2008-07-09 14:33:39.39 spid5s SQL Server is terminating in response to a 'stop' request from Service Control Manager. This is an informational message only. No user action is required.
We checked that all the installation requirements for SQL Server 2005 were satisfied in the environment; the report from the new "Validate Cluster Configuration Wizard", recently run on the cluster, did not show any errors either. Since many cluster installation errors in SQL Server are caused by incorrect settings at the Windows cluster level, I decided to look into the cluster.log file for any possible hint. You can find guidelines for working with the cluster.log file, as well as a procedure to generate it on Windows Server 2008, in my previous post.
Reviewing the cluster.log proved to be a good idea:
10:29:14.021 INFO [RES] Network Name <SQL Network Name (C2K5NET1)>: Failed to find a computer account for C2K5NET1. Attempting to create one on DC \\dccorp.contoso.lab.
000003f4.00001108::2008/07/10-10:29:14.021 INFO [RES] Network Name <SQL Network Name (C2K5NET1)>: Trying NetUserAdd() to create computer account C2K5NET1 on DC \\dccorp.contoso.lab in default Computers container
000003f4.00001108::2008/07/10-10:29:14.037 ERR [RES] Network Name <SQL Network Name (C2K5NET1)>: Unable to create computer account C2K5NET1 on DC \\dccorp.contoso.lab, in default Computers container, status 5
000003f4.00001108::2008/07/10-10:29:14.053 ERR [RHS] Online for resource SQL Network Name (C2K5NET1) failed.
It was clear that something was trying to create the computer account for the SQL Server Network Name on the DC and was failing in the attempt with status 5, the Win32 error code for "Access is denied".
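When a cluster.log is long, it helps to pull out only the error entries and translate any Win32 status code they carry. The following is a minimal sketch, assuming entries look like the excerpt above (a level keyword, a bracketed component such as [RES] or [RHS], then the message). The function name find_errors and the small status table are illustrative, not part of any Microsoft tool.

```python
import re

# Win32 status codes worth translating; 5 is ERROR_ACCESS_DENIED.
WIN32_STATUS = {5: "Access is denied"}

# Assumed entry shape, based on the excerpt: "<prefix> LEVEL [COMPONENT] message".
ENTRY = re.compile(r"\b(?P<level>INFO|WARN|ERR)\s+\[(?P<component>\w+)\]\s+(?P<message>.*)")
STATUS = re.compile(r"status (\d+)")


def find_errors(lines):
    """Return (component, message, decoded status or None) for each ERR entry."""
    errors = []
    for line in lines:
        match = ENTRY.search(line)
        if match is None or match.group("level") != "ERR":
            continue
        message = match.group("message").strip()
        status = STATUS.search(message)
        meaning = WIN32_STATUS.get(int(status.group(1))) if status else None
        errors.append((match.group("component"), message, meaning))
    return errors


# Entries copied from the cluster.log excerpt in this post.
SAMPLE = [
    r"000003f4.00001108::2008/07/10-10:29:14.021 INFO [RES] Network Name <SQL Network Name (C2K5NET1)>: Trying NetUserAdd() to create computer account C2K5NET1 on DC \\dccorp.contoso.lab in default Computers container",
    r"000003f4.00001108::2008/07/10-10:29:14.037 ERR [RES] Network Name <SQL Network Name (C2K5NET1)>: Unable to create computer account C2K5NET1 on DC \\dccorp.contoso.lab, in default Computers container, status 5",
    r"000003f4.00001108::2008/07/10-10:29:14.053 ERR [RHS] Online for resource SQL Network Name (C2K5NET1) failed.",
]

for component, message, meaning in find_errors(SAMPLE):
    print(component, "|", message, "|", meaning)
```

To run it against a real file, pass an open file object instead of SAMPLE; the regular expression may need adjusting if your log uses a different layout.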
A look into the System event log also showed the same error (XML data is excluded):
Log Name: System
Source: Microsoft-Windows-FailoverClustering
Date: 09-Jul-08 10:47:21 AM
Event ID: 1194
Task Category: (19)
Level: Error
Keywords:
User: SYSTEM
Computer: CNODE2.contoso.lab
Description: The description for Event ID 1194 from source Microsoft-Windows-FailoverClustering cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
SQL Network Name (C2K5NET1) contoso.lab Unable to create computer account Access is denied.
The event log entry was indeed of great help since it included an Event ID (1194). A quick search of the Microsoft Support Web site turned up detailed information for this event. According to this information, each clustered service or application on Windows Server 2008 (with the exception of a Hyper-V virtual machine) is associated with a computer account in Active Directory. This account, called a Virtual Computer Object (VCO), is created during installation by the High Availability wizard. If the VCO cannot be created for any reason, the clustered service won't work.
This VCO is created by the cluster account, known as the Cluster Name Object (CNO), a different computer account that has the same name as the cluster and is created by the initial Create Cluster wizard. These relationships are better explained in the following diagram:
If the cluster account (CNO) does not have the "Create account" permission in Active Directory, the VCO (the computer account for the service or application) will not be created. In our case the VCO did not exist in the directory, and a quick look at the CNO security properties showed that the "Create account" right was missing. We added the "Create account" permission and provisioned the VCO manually as described in the TechNet Library. After these changes the SQL Server clustered service started successfully; we didn't need to reinstall or change anything else at the cluster level.
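The dependency between the two accounts can be summarized as a small model. This is only a simplified sketch of the sequence seen in cluster.log (look for the computer account, try to create it, fail with access denied), not how the Cluster Service is implemented. The Directory class, the function name, and the CNO name "SQLCLUSTER" are hypothetical; the real permissions involved are richer than a single flag.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Toy stand-in for Active Directory (not a real AD API)."""
    computer_accounts: set = field(default_factory=set)
    # Accounts holding the "Create account" permission described in the post.
    can_create_accounts: set = field(default_factory=set)


def bring_network_name_online(directory: Directory, cno: str, vco: str) -> str:
    """Mimic the sequence seen in cluster.log for a Network Name resource."""
    if vco in directory.computer_accounts:
        return "online (existing computer account found)"
    # No account found: the cluster tries to create it using the CNO identity.
    if cno in directory.can_create_accounts:
        directory.computer_accounts.add(vco)
        return "online (computer account created)"
    return "failed (status 5: access denied)"


# "SQLCLUSTER" is a hypothetical CNO name; C2K5NET1 is the VCO from the log.
ad = Directory(computer_accounts={"SQLCLUSTER"})
print(bring_network_name_online(ad, "SQLCLUSTER", "C2K5NET1"))

# Fix from the post: grant the permission and pre-provision the VCO.
ad.can_create_accounts.add("SQLCLUSTER")
ad.computer_accounts.add("C2K5NET1")
print(bring_network_name_online(ad, "SQLCLUSTER", "C2K5NET1"))
```

The first call reproduces the failure described above; the second succeeds because the account already exists when the resource comes online.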
You can find more information about the new failover cluster security model in Windows Server 2008 in the following Knowledge Base article:
Description of the failover cluster security model in Windows Server 2008, http://support.microsoft.com/default.aspx?scid=kb;EN-US;947049