SharePoint Strategery

Best used when *strategy* breaks down... (blog by Brian Pendergrass, Microsoft SharePoint - Premier Field Engineer)

SP2010: Troubleshooting ServerID Mismatch (Deleting Components)

In another post, I described a scenario where removing and re-joining a SharePoint server that hosts a Search component leads to an inconsistency between the ServerID referenced by the applicable Search component(s) and the ServerID of that server's object in the SharePoint farm Configuration. If you identify a ServerID mismatch, the recommendation is to remove (or move to another server) any component with a mismatched ServerID, then (optionally) re-add the component(s) to the original server(s).
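
As a rough illustration of what identifying a mismatch involves, the sketch below compares the ServerId recorded on each component with the Id of the farm server of the same name. The helper name Get-ServerIdMismatch is hypothetical, the sample values are made up, and it assumes the component objects expose ServerName and ServerId properties and the server objects expose Name and Id.

```powershell
# Hypothetical helper: report components whose ServerId differs from the Id
# of the farm server with the same name. Assumes ServerName/ServerId on the
# components and Name/Id on the servers.
function Get-ServerIdMismatch {
    param([object[]]$Component, [object[]]$Server)
    foreach ($c in $Component) {
        $s = $Server | Where-Object { $_.Name -eq $c.ServerName } | Select-Object -First 1
        if ($s -and ("$($c.ServerId)" -ne "$($s.Id)")) { $c }
    }
}

# Demonstration with made-up objects (no farm required)
$servers = @(New-Object PSObject -Property @{ Name = 'APP01'; Id = '11111111-1111-1111-1111-111111111111' })
$components = @(
    (New-Object PSObject -Property @{ Name = 'crawl-0'; ServerName = 'APP01'; ServerId = '11111111-1111-1111-1111-111111111111' }),
    (New-Object PSObject -Property @{ Name = 'crawl-1'; ServerName = 'APP01'; ServerId = '22222222-2222-2222-2222-222222222222' })
)
$mismatched = @(Get-ServerIdMismatch -Component $components -Server $servers)
$mismatched | ForEach-Object { $_.Name }

# On a farm server, the inputs might instead be gathered like this (untested assumption):
#   $comps = @($SSA.CrawlTopologies | ForEach-Object { $_.CrawlComponents })
#   Get-ServerIdMismatch -Component $comps -Server @(Get-SPServer)
```

Matching on server name is only one possible choice; if the component's ServerName is recorded differently from the farm's server name in your environment, the comparison would need adjusting.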

But what happens when you cannot delete the impacted component?

First, on both the server hosting the Search Administration Component and any impacted server, look in ULS for the "Application Server Administration Service Timer Job" (job-application-server-admin-service), focusing on the runs shortly after a topology change is invoked.

This Timer Job runs once a minute and is probably the most important job you never knew applied to Search. This particular Timer Job (job-application-server-admin-service) is used by Search to provision, mount, and dismount components, and to perform any task that requires local administrative rights. When problems occur with topology-related actions, you will typically find any registry, network share, folder creation, and COM-related errors within this block.

When there is a ServerID mismatch, you'll find messages that resemble the following in the ULS of the Search Admin Component: 

03/25/2013 12:20:55.68 OWSTIMER.EXE (0x06A8)  0x034C  SharePoint Foundation
   Monitoring     nasq    Medium
   Entering monitored scope (Timer Job job-application-server-admin-service)
03/25/2013 12:20:55.68 OWSTIMER.EXE (0x06A8)  0x034C  SharePoint Server Search
   Administration dkd5    High
   synchronizing search service instance
03/25/2013 12:20:55.68 OWSTIMER.EXE (0x06A8)  0x034C  SharePoint Server Search
   Administration eff0    High
   synchronizing search data access service instance   
03/25/2013 12:20:56.78 OWSTIMER.EXE (0x06A8)  0x034C  SharePoint Server Search
   Administration fel1    High
   Unable to find server 441318c0-476e-4c7e-af76-34d14b5c7067 
03/25/2013 12:20:56.84 OWSTIMER.EXE (0x06A8)  0x034C  SharePoint Foundation
   Monitoring     b4ly    Medium 
   Leaving Monitored Scope (Timer Job job-application-server-admin-service)
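
To make the "Unable to find server" message easier to act on, the following sketch extracts the GUID from such messages so it can be compared with the servers the farm knows about. Get-MissingServerId is a hypothetical helper name; the commented lines showing how the messages might be collected on a farm server are an untested assumption.

```powershell
# Hypothetical helper: pull the GUID out of "Unable to find server <guid>" messages.
function Get-MissingServerId {
    param([string[]]$Message)
    $pattern = 'Unable to find server\s+([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})'
    foreach ($m in $Message) {
        if ($m -match $pattern) { [guid]($Matches[1]) }
    }
}

# Example using the message from the log excerpt above
$missing = Get-MissingServerId -Message 'Unable to find server 441318c0-476e-4c7e-af76-34d14b5c7067'
$missing

# On a farm server, the messages might come from the ULS (untested assumption):
#   $msgs = Get-SPLogEvent -StartTime (Get-Date).AddMinutes(-30) |
#           Where-Object { $_.EventID -eq 'fel1' } | ForEach-Object { $_.Message }
#   Get-SPServer | Where-Object { $_.Id -eq $missing }   # no output: no farm server has that Id
```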

If the ULS doesn't provide any additional insights (other than confirmation that the server could not be found), check if you see output such as the following in PowerShell:

$SSA = Get-SPEnterpriseSearchServiceApplication "Name-of-your-SSA"
$SSA.CrawlTopologies   #or $SSA.QueryTopologies 
     Id              : 925018ed-02db-4704-8473-5f07e306a8d1 
     CrawlComponents : {104a537d-1f93-4ed6-8fc7-3441110f9c55-crawl-0} 
     State           : Active 
     ActivationError : 

     Id              : 77f93a93-6914-4e33-baa5-932b54800382 
     CrawlComponents : {104a537d-1f93-4ed6-8fc7-3441110f9c55-crawl-1} 
     State           : Activating 
     ActivationError : Object reference not set to an instance of an object

Alternatively, you may find the Topology state indefinitely stuck in “Deactivating”, such as:

     Id              : 4d162109-1572-43d5-94a9-c3556bda3bb3
     CrawlComponents : {104a537d-1f93-4ed6-8fc7-3441110f9c55-crawl-1}
     State           : Deactivating
     ActivationError : 
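
The stuck states shown in the output above can be picked out with a small filter. This is a minimal sketch: Get-StuckTopology is a hypothetical helper, and it assumes only that each topology object has a State property whose text form is "Activating" or "Deactivating" when stuck. The sample objects reuse two of the Ids from the output above.

```powershell
# Hypothetical helper: return topologies whose State is Activating or Deactivating.
function Get-StuckTopology {
    param([object[]]$Topology)
    # "$($_.State)" converts the State to text, whether it is a string or an enum value
    $Topology | Where-Object { @('Activating', 'Deactivating') -contains "$($_.State)" }
}

# Demonstration with stand-in objects (no farm required)
$sample = @(
    (New-Object PSObject -Property @{ Id = '925018ed-02db-4704-8473-5f07e306a8d1'; State = 'Active' }),
    (New-Object PSObject -Property @{ Id = '77f93a93-6914-4e33-baa5-932b54800382'; State = 'Activating' })
)
$stuck = @(Get-StuckTopology -Topology $sample)
$stuck | ForEach-Object { "{0} is {1}" -f $_.Id, $_.State }

# On a farm server, both collections could be checked together (untested assumption):
#   Get-StuckTopology -Topology (@($SSA.CrawlTopologies) + @($SSA.QueryTopologies))
```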

 

  • To resolve any Crawl or Query Topologies stuck in "Activating", run:
    $SSA = Get-SPEnterpriseSearchServiceApplication "Name-of-your-SSA"
    foreach ($oldTopo in ($SSA.CrawlTopologies | where {$_.State -eq "Activating" })) {$oldTopo.CancelActivate() }
    • If this fails or has not completed within ~30 minutes, run:
      $SSA.RefreshComponents()
      • Note: After running this, you will initially have problems connecting to the Search Administration page.
      • Within a few minutes, all of the healthy components should return and the page will be accessible again (the mismatched components will likely remain offline/disabled).
      • When the healthy components all return, try to cancel the activation again…
    • If the $SSA.RefreshComponents() also fails, try shutting down the Server hosting the impacted Component(s). This causes the Search Admin to move the applicable Components to a Disabled state, which should also unblock many problems associated with the ServerID mismatch. In other words, the Search Admin has code that specifically handles a server being unreachable, and these steps use that behavior as a workaround for the topology blockers:
      • Shut down the Server hosting the impacted Component (e.g. Crawl Components)
      • After 60 minutes, the Search Admin Component should move the Component(s) on this shut down Server to a Disabled state (based on the DisableInterval registry key)
      • To verify the state of the Component, run:  $ssa.CrawlTopologies.ActiveTopology.CrawlComponents
      • Once the Component moves to a disabled state, delete the Component from the Topology before restarting the server (otherwise the Component will move out of the Disabled state and attempt to once again come online)
        • Note: If this is an impacted Query Component, you may have to lower the SSA's MinimumReadyQueryComponentsPerPartition property depending on your particular topology configuration. For example, assuming the property's default value of 2 and an Index Partition with 3 Query Components, the SSA can abandon a Query Component should it go offline for a period longer than that defined by the SSA's TimeBeforeAbandoningQueryComponent property (by default, 60 minutes). However, should a second Query Component become unresponsive, the SSA could not abandon it because doing so would take the number of "Ready" Query Components for this partition below the MinimumReadyQueryComponentsPerPartition threshold.
      • If this still fails, it's probably worth opening a support case to further troubleshoot the specifics of your scenario, but an index reset may be required at this point. 
  • Resolving any Crawl or Query Topologies stuck in "Deactivating" typically requires an Index reset, but again, it's probably worth opening a support case to further troubleshoot the specifics of your scenario.
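
The arithmetic behind the MinimumReadyQueryComponentsPerPartition note in the steps above can be sketched as follows. This is a simplified model of the rule as described there, not the SSA's actual logic, and Test-CanAbandonQueryComponent is a hypothetical helper name.

```powershell
# Simplified model: a Query Component can only be abandoned if the partition
# still has at least the minimum number of Ready components afterwards.
function Test-CanAbandonQueryComponent {
    param([int]$ReadyAfterAbandon, [int]$MinimumReady = 2)
    return ($ReadyAfterAbandon -ge $MinimumReady)
}

# Partition with 3 Query Components and the default minimum of 2
$first  = Test-CanAbandonQueryComponent -ReadyAfterAbandon 2   # one offline, two still Ready
$second = Test-CanAbandonQueryComponent -ReadyAfterAbandon 1   # two offline, one still Ready
"First failure can be abandoned:  $first"
"Second failure can be abandoned: $second"

# Lowering the minimum to 1 would allow the second component to be abandoned
$lowered = Test-CanAbandonQueryComponent -ReadyAfterAbandon 1 -MinimumReady 1
```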

Once the "Activating" or "Deactivating" Topology has been unblocked (and there is only an "Active" Crawl and Query Topology along with zero or more in an "Inactive" state), try many of the same steps above once again to move the impacted Components to a Disabled state. After the Component is disabled, the steps to delete it should complete without timing out:

  • $SSA.RefreshComponents()
  • Shut down the Server hosting the impacted Component (e.g. Crawl Component)
  • After 60 minutes, the Search Admin Component should move the Component(s) on this shut down Server to a Disabled state
  • To verify the state of the Component, run:  $ssa.CrawlTopologies.ActiveTopology.CrawlComponents

Once a Component moves to the Disabled state, you should be able to delete the Component (before restarting the server) without the timeout occurring.
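
Before attempting the delete, it may help to confirm that every Component on the shut-down server has reached the Disabled state. The sketch below is one way to express that check. Get-ComponentNotDisabled is a hypothetical helper, the server names are made up, and it assumes the component objects expose ServerName and State properties.

```powershell
# Hypothetical helper: list components on the given server that are not yet Disabled.
# An empty result suggests the components on that server are ready to be deleted.
function Get-ComponentNotDisabled {
    param([object[]]$Component, [string]$ServerName)
    $Component | Where-Object { ($_.ServerName -eq $ServerName) -and ("$($_.State)" -ne 'Disabled') }
}

# Demonstration with stand-in objects (no farm required)
$sample = @(
    (New-Object PSObject -Property @{ Name = 'crawl-0'; ServerName = 'APP01'; State = 'Ready' }),
    (New-Object PSObject -Property @{ Name = 'crawl-1'; ServerName = 'APP02'; State = 'Disabled' })
)
$pendingApp02 = @(Get-ComponentNotDisabled -Component $sample -ServerName 'APP02')
$pendingApp01 = @(Get-ComponentNotDisabled -Component $sample -ServerName 'APP01')
"APP02 components not yet Disabled: $($pendingApp02.Count)"
"APP01 components not yet Disabled: $($pendingApp01.Count)"

# On a farm server, the input could be the collection used in the verification step above:
#   Get-ComponentNotDisabled -Component @($ssa.CrawlTopologies.ActiveTopology.CrawlComponents) -ServerName 'APP02'
```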
