How to Troubleshoot Create Cluster failures in Windows Server 2012

In this blog, I will outline the steps to troubleshoot “Create Cluster” failures.

Steps when Troubleshooting “Create Cluster” Failures

Step 1: Run the Cluster Validation Tool

The cluster validation tool runs a suite of tests to verify that your hardware and settings are compatible with failover clustering. Running the Validate tool is the first thing to do when troubleshooting, and something you should do every time you create a cluster. To run cluster validation:

    1.       Open the Failover Cluster Manager snap-in (CluAdmin.msc).

    2.       Select Validate Cluster.

             Note: You can also use the Failover Clustering Windows PowerShell® cmdlet, Test-Cluster, to validate your cluster.

    3.       Navigate to the C:\Windows\Cluster\Reports directory and open the validation report (.MHT file).

    4.       Review any tests that report as Failed or Warning.
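
The same validation can be scripted. The following is a minimal sketch, assuming the FailoverClusters module is available and that reports are saved as .mht files in the folder named in step 3. The function name Get-LatestClusterReport and the node names are placeholders chosen for illustration, not part of the product.

```powershell
# Return the most recently written report file in a folder.
# Get-LatestClusterReport is a hypothetical helper name.
function Get-LatestClusterReport {
    param(
        [string]$Path   = 'C:\Windows\Cluster\Reports',
        [string]$Filter = '*.mht'
    )
    Get-ChildItem -Path $Path -Filter $Filter |
        Sort-Object LastWriteTime -Descending |
        Select-Object -First 1
}

# Usage on a prospective node (node names are placeholders):
#   Import-Module FailoverClusters
#   Test-Cluster -Node 'Node1', 'Node2'
#   Invoke-Item (Get-LatestClusterReport).FullName
```

Sorting by LastWriteTime avoids having to guess the timestamped file name of the newest report.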

             

            The validation summary provides a starting point to drill down further into the failure.  For instance, in the example below we can detect an invalid Windows Firewall Configuration.

             

            It is also useful to investigate the warnings flagged by validation. For example, the Active Directory Configuration test warning below flags a potential cluster creation problem:

            Step 2: Examine the CreateCluster.mht file

            If you cannot create a cluster even though all validation tests pass, the next step is to examine the CreateCluster.mht file. This file is created during the cluster creation process, whether you use the “Create Cluster” wizard in Failover Cluster Manager or the New-Cluster Failover Clustering Windows PowerShell® cmdlet. The file can be found in the following location: C:\Windows\Cluster\Reports\CreateCluster.mht

            The admin-level logging in the CreateCluster.mht file can help you determine the step at which the cluster creation process failed. For example, in the CreateCluster.mht snippet below, you can infer that there was a problem with configuring a Cluster Name Object for the cluster.
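
As a rough sketch of how this looks from Windows PowerShell, the snippet below attempts to create a cluster with the New-Cluster cmdlet and opens the report if the attempt fails. The cluster name, node names, and IP address are placeholders, and the report path is the one given above.

```powershell
Import-Module FailoverClusters

# Placeholder values; substitute your own cluster name, nodes, and address.
$clusterName = 'Test-Cluster01'
$nodes       = 'Node1', 'Node2'
$address     = '192.168.1.50'

try {
    # -ErrorAction Stop turns a failure into an exception we can catch.
    New-Cluster -Name $clusterName -Node $nodes -StaticAddress $address -ErrorAction Stop
}
catch {
    Write-Warning "Create Cluster failed: $($_.Exception.Message)"

    # Open the creation report if one was written.
    $report = 'C:\Windows\Cluster\Reports\CreateCluster.mht'
    if (Test-Path $report) { Invoke-Item $report }
}
```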

            Step 3: Turn on Cluster API debug tracing

            If you are unable to pinpoint the root cause of the failure from either the validation report or the Create Cluster log, you can enable verbose debug logging. Debug tracing can be turned on with the following steps:

            1.      Open Event Viewer (eventvwr.msc)

            2.      Click View then “Show Analytic and Debug Logs”

            3.      Browse down to Applications and Services Logs \ Microsoft \ Windows \ FailoverClustering-Client \ Diagnostic

            4.      Right-click on Diagnostic and select “Enable Log”

            5.      Attempt to create a cluster

            6.      Right-click on Diagnostic and select “Disable Log”.

            Note: The debug tracing will be generated to the Diagnostic channel and viewable only after you disable logging.

            7.      Left-click on Diagnostic to view the logging captured.
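
Once the log has been disabled, the captured events can also be read from Windows PowerShell rather than Event Viewer. This is a small sketch; the channel name is an assumption derived from the Event Viewer path in step 3.

```powershell
# Channel name assumed from the Event Viewer path in step 3.
$channel = 'Microsoft-Windows-FailoverClustering-Client/Diagnostic'

# Analytic and debug channels must be read oldest-first, hence -Oldest.
Get-WinEvent -LogName $channel -Oldest |
    Select-Object TimeCreated, Id, LevelDisplayName, Message |
    Format-List
```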

            The following are examples of events written to the Diagnostic channel when cluster creation fails because the Cluster Name Object cannot be added to the clusterou container. In this case, the cluster administrator does not have the Read All Properties permission on the organizational unit (OU) in Active Directory.

            Step 3b: Turn on Cluster API event log tracing programmatically

            You can also turn on Cluster API event log tracing programmatically. The debug information obtained is the same as in Step 3, but you can set it up from a script. To configure it:

            1.       Run: logman start clusapiLogs -p {a82fda5d-745f-409c-b0fe-18ae0678a0e0} -o clusapi.etl -ets

            2.       Attempt to create a cluster

            3.       Run: logman stop clusapiLogs -ets

            4.       Run: tracerpt clusapi.etl -of CSV -o c:\report.csv

            5.       Open the generated Comma Separated Value (CSV) dump file and examine the User Data column for potential issues. Note that the ‘-o’ parameter determines where the CSV dump file is generated.  

            The following are some examples of Cluster API event log traces found for a “create cluster” failure.

            CreateCluster: Create cluster test-33 will be using a Read-Write DC \\subhatt-VM1.subhattcluster.com.

            CreateClusterNameCOIfNotExists: Failed to create computer object test-33 on DC \\subhatt-VM1.subhattcluster.com with OU ou=clusterou

            CreateCluster: Create cluster failed with exception. Error = 8202

            msg: Failed to create cluster name test-33 on DC \\subhatt-VM1.subhattcluster.com. Error 8202.
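
Trace lines like the ones above can be pulled out of a large CSV dump with a simple text search. The sketch below is one way to do it; the function name Find-TraceFailure and the default pattern are illustrative choices, and the search runs over the raw lines of the file so that it does not depend on how the columns are laid out.

```powershell
# Return every line of a trace dump that mentions a failure or an error.
# Find-TraceFailure is a hypothetical helper name; Select-String is
# case-insensitive by default, so 'Failed' and 'failed' both match.
function Find-TraceFailure {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [string]$Pattern = 'fail|error'
    )
    Select-String -Path $Path -Pattern $Pattern |
        ForEach-Object { $_.Line }
}

# Usage:
#   Find-TraceFailure -Path C:\report.csv
#
# To look up the system message text for a numeric error code such as 8202:
#   net helpmsg 8202
```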

            Step 4: Generate the cluster.log file

            The cluster log provides verbose logging for the cluster service and allows advanced troubleshooting. The cluster log can be generated even when cluster creation fails by specifying the node to collect the log on. You can generate the cluster log using the Failover Clustering Windows PowerShell® cmdlet Get-ClusterLog:

            Get-ClusterLog -Node <CreateClusterNode>

            Note:

            ·         The default verbosity level for the cluster log is 3, which is sufficient for most debugging purposes. If this level does not capture the data you need, you can increase it:

            o   On a Windows PowerShell® console run: (Get-Cluster).ClusterLogLevel = 5

            o   Level 5 generates a large volume of log output, so restore the default level once troubleshooting is complete.

            ·         The cluster log can be generated in local time using Failover Clustering Windows PowerShell®:

            o   Get-ClusterLog -UseLocalTime
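
Putting these notes together, a session might look like the sketch below. It assumes Get-Cluster can reach a cluster from the console you are using; the node name and destination folder are placeholders.

```powershell
Import-Module FailoverClusters

# Placeholder values: the node to collect from and a folder for the log.
$node        = 'Node1'
$destination = 'C:\Temp\ClusterLogs'
New-Item -ItemType Directory -Path $destination -Force | Out-Null

# Remember the current verbosity so it can be restored afterwards.
$cluster  = Get-Cluster
$previous = $cluster.ClusterLogLevel
$cluster.ClusterLogLevel = 5

try {
    # Reproduce the problem here, then collect the log in local time.
    Get-ClusterLog -Node $node -Destination $destination -UseLocalTime
}
finally {
    # Restore the original level even if log collection fails.
    $cluster.ClusterLogLevel = $previous
}
```

The try/finally block is there so that the higher verbosity is not left on by accident.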

            Bonus Tip:

            The number one reason for create cluster failures is misconfigured permissions in Active Directory, which cause failures while creating the Cluster Name Object (CNO).

            Review: “How to Create a Cluster in a Restrictive Active Directory Environment”

             “Failover Cluster Step-by-Step Guide: Configuring Accounts in Active Directory”

            Did you really review the links above? Here’s a quick test… How would you fix the following “Create Cluster” errors?

            1.       An enabled computer account (object) for <cno> was found.

            Answer:

            1.       Verify that the cluster name you are attempting to use for the new cluster is not already being used by a cluster in production. If it is, you should choose another name for the cluster. In other words, you need to ensure that you can take over the computer name with no adverse repercussions.

            2.       On the Domain Controller, launch the Active Directory Users and Computers snap-in (type dsa.msc)

            3.       Navigate to the OU that contains the cluster name you are trying to use. In this case you are searching for “Test-8”. You might have to search multiple OUs to find the conflicting cluster name.

            4.       Delete the existing Cluster Name Object (CNO), “Test-8”, or disable it by right-clicking on the CNO and selecting Disable.
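
If the Active Directory module for Windows PowerShell is available, the conflicting account can also be located without browsing OU by OU. This is a sketch under that assumption; it uses the “Test-8” name from the example, and the -WhatIf switch keeps the disable step as a dry run.

```powershell
Import-Module ActiveDirectory

$cnoName = 'Test-8'   # cluster name from the example above

# Search the domain for a computer account with this name.
$cno = Get-ADComputer -Filter "Name -eq '$cnoName'"

if ($cno) {
    # Show where the account lives and whether it is enabled.
    $cno | Select-Object Name, Enabled, DistinguishedName

    # Dry run only. Remove -WhatIf once you are sure the name is not
    # in use by a production cluster.
    $cno | Disable-ADAccount -WhatIf
}
else {
    Write-Output "No computer account named $cnoName was found."
}
```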

             

            2.       You do not have permissions to create a computer account (object) in Active Directory

            Answer:

            1.       On the Domain Controller, launch the Active Directory Users and Computers snap-in (type dsa.msc)

            2.       On the View menu, make sure that Advanced Features is selected

            3.       Navigate to the OU you are trying to create your Cluster Name Object (CNO) in. By default this will be the same OU as that of the node you are trying to create a cluster from.

            4.       Right-click on the OU and select Properties and then the Security tab.

            5.       Ensure that the Cluster Administrator has Create all child objects permissions

            6.       Select the Advanced tab

            7.       Click Add, type the name of the cluster administrator account for the Principal

            8.       In the Permission container dialog box, locate the Create Computer objects and Read All Properties permissions, and make sure that the Allow check box is selected for each one.
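
To check the result of these steps from a console, the access entries on the OU can be listed with the Active Directory module. The sketch below is read-only; the distinguished name and the account name are placeholders that you would replace with your own.

```powershell
Import-Module ActiveDirectory   # also provides the AD: drive

# Placeholder values: the OU's distinguished name and the
# cluster administrator account.
$ou    = 'OU=clusterou,DC=example,DC=com'
$admin = 'EXAMPLE\clusteradmin'

# List the access entries on the OU that name the administrator directly.
# Look for CreateChild and ReadProperty in ActiveDirectoryRights.
(Get-Acl -Path "AD:\$ou").Access |
    Where-Object { $_.IdentityReference.Value -eq $admin } |
    Select-Object IdentityReference, AccessControlType,
                  ActiveDirectoryRights, ObjectType, IsInherited
```

Permissions granted through a group will appear under the group's name rather than the account's, so an empty result does not by itself prove that access is missing.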

            A final note: In this blog I have focused on “Create Cluster” failures. However, the same troubleshooting steps can also be used for “Add node” failures (failures encountered while adding a node to a cluster).

            Thanks!

            Subhasish Bhattacharya                                                                                                               
            Program Manager                                                                                          
            Clustering & High Availability                                                                                      
            Microsoft           

            Comments
            • When I run Test-Cluster in PowerShell on a Windows Server 2012 RC cluster node (to be), the result shows a path for the Validation report that is like "C:\Users\[userID]\AppData\Local\Temp\1". Shouldn't the displayed path correlate to the default in C:\Windows\Cluster\Reports instead?

            • Hi Jimmy, The validation report is copied to both locations. The temporary location indicated and also the Cluster\Reports folder. Thanks!

            • I assign create computer object and read all properties permissions to CNO. But after I hit OK and leave that dialog box, it unchecks create computer object automatically. It does retain the read all properties permission though. Any idea?

            • When I run the validation configuration, it is not creating any verification report and also Validation Data For Node Set files under Windows\Cluster\Reports\.... and thus my sql installation is failing. Though the validation is throwing only one warning but no errors, verification report is not getting saved under reports folder and the next time, I try to open the view verification report, it asks me to run the validation configuration again as it cannot find one. Please help me with this. Thanks!

            • Are you running on a client or a node?  If you are running on a client are you an administrator of that machine?  The report is built in a temp directory for the running user during the validation run, and is then copied to %windir%\cluster\reports on all nodes and the client.  If the running user lacks access to these folders then the copy would fail… Some check steps for you:

              1. Verify that you are logged onto a WS2012 system or a Win8 client with RSAT installed.

              2. Verify that you are logged onto the system that you are running validation from with a domain account, and the domain account is in the Administrators group on each of the cluster nodes.

              3. Verify the ACLs for the %windir%\cluster\reports directory.  Compare them to a cluster that is working correctly.  
