I'm using forms or kerb auth and search/crawl (indexing) isn't working

<Updated Jan 10 with latest info from the testers.>

Got this question from a reader:

Question: The consultant that helped us develop the SharePoint architecture plan said that all web applications that have sites that need to be crawled need to use NTLM as the "default" authentication method and then can be extended using Kerberos in the Intranet zone.  He said that this was because the crawler could not crawl a Kerberos site, only the NTLM address.

In my research to learn SharePoint I have not run across this requirement in any other location. No one else ever mentions it. Was the consultant off base? Or is it just not mentioned often?

Answer: The consultant was right based on the information on TechNet. On Jan 9th the testers narrowed the issue down to custom ports, although that is still debated by some MVPs who attest that they have it working (more on that when those results are verified). If you use standard ports with Kerberos and set up your SPNs correctly, you can get indexing to work.

If you are testing Kerberos authentication on a single server where SQL Server, SQL Express, or the Windows Internal Database engine is installed locally, this hasn't been seen as an issue. More details will come from the team. In the coming months you should see more content on crawling and authentication as the docs on TechNet are updated.

There is an issue which prevents the correct start address from being created if the default zone uses digest authentication. The timer still picks it as valid even though it is not. The workaround has been to make the default zone NTLM and put digest on some other zone.

These are the docs that are currently misleading and will be updated:

http://technet2.microsoft.com/Office/en-us/library/40117fda-70a0-4e3d-8cd3-0def768da16c1033.mspx?mfr=true

Start with the section titled:  “Order in which the crawler accesses zones”

When planning the zones for a Web application, consider the polling order in which the crawler accesses zones when attempting to authenticate. The polling order is important, because if the crawler encounters a zone configured to use Kerberos or digest authentication, authentication fails and the crawler does not attempt to access the next zone in the polling order. If this occurs, the crawler will not crawl content on that Web application.

<etc>

Planning zones for your authentication design

If you plan to implement more than one authentication method for a Web application by using zones, use the following guidelines:

• Use the default zone to implement your most secure authentication settings. If a request cannot be associated with a specific zone, the authentication settings and other security policies of the default zone are applied. The default zone is the zone that is created when you initially create a Web application. Typically, the most secure authentication settings are designed for end-user access. Consequently, the default zone will likely be the zone that is accessed by end users.

• Use the minimum number of zones that is required by the application. Each zone is associated with a new IIS site and domain for accessing the Web application. Only add new access points when these are required.

• If you want content within the Web application to be included in search results, ensure that at least one zone is configured to use NTLM authentication. NTLM authentication is required by the index component to crawl content. Do not create a dedicated zone for the index component unless necessary.
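The crawler behavior described in the quoted TechNet text can be sketched in a few lines of Python. This is only a model of the documented rule, not SharePoint code: the function name `first_crawlable_zone`, the zone names in `order`, and the choice to skip zones that use any other method (such as forms) are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Per the quoted guidance, hitting either of these stops the crawler outright.
BLOCKING_METHODS = {"kerberos", "digest"}


def first_crawlable_zone(polling_order: List[str],
                         zone_auth: Dict[str, str]) -> Optional[str]:
    """Return the zone the crawler would authenticate against, or None.

    polling_order: zone names in the order the crawler tries them.
    zone_auth: authentication method configured for each zone that exists.
    """
    for zone in polling_order:
        method = zone_auth.get(zone)
        if method is None:
            continue  # zone not configured for this web application
        method = method.lower()
        if method in BLOCKING_METHODS:
            # Authentication fails and the next zone is never attempted.
            return None
        if method == "ntlm":
            return zone
        # Assumption: any other method (for example forms) is skipped.
    return None


# Illustrative polling order only; check the TechNet article for the real one.
order = ["Default", "Intranet", "Internet", "Custom", "Extranet"]

kerberos_default = {"Default": "Kerberos", "Intranet": "NTLM"}
ntlm_default = {"Default": "NTLM", "Intranet": "Kerberos"}
forms_default = {"Default": "Forms", "Custom": "NTLM"}

print(first_crawlable_zone(order, kerberos_default))  # None
print(first_crawlable_zone(order, ntlm_default))      # Default
print(first_crawlable_zone(order, forms_default))     # Custom
```

In this model the first configuration is never crawled even though an NTLM zone exists, because the Kerberos default zone is polled first. That is the reason for the advice to put NTLM on the default zone and Kerberos on an extended zone.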

Troy: Yeah, we explicitly documented that Search cannot crawl over Kerberos authentication. 

Additional thoughts by Joel:

I can imagine people asking about the overhead of a second web app for each of the already existing content web apps. My recommendation would be to consolidate the app pools for all of the secondary "crawling only" web apps into one and set it to idle when not in use. The overhead of those other web apps is really in the app pool and hence the worker processes, so by consolidating them you can quite quickly eliminate the additional overhead.

Not sure if it's clear anywhere, but in the past I've seen people get burned by it... Make sure you always extend all web apps on all web front ends in the farm. The timer service should take care of it, but just in case... So when you're concerned about the web apps, make sure you're consistent about the app pool configuration across your WFEs. If you're targeting your indexing at one WFE, or having your index server double as the WFE it crawls, then shutting down the crawl-only web app on the servers that aren't using it will save resources. Hope this additional tip is useful and clear.

Let me explain it one more time with an example to make sure it's clear. Let's say you've got a medium farm: 2 load-balanced WFE/Query servers, an Index server, and a SQL cluster. Let's say you're using forms and create additional web apps for all the web apps you want to index. You make the Index server also a web front end. All of the web apps should be consistent across all 3 servers (2 load-balanced WFE/Query, 1 WFE/Index). On each of these you consolidate the crawl-only web apps to use a single app pool that is configured to shut down the worker process when idle on the WFEs. Hence, you don't have the overhead of worker processes that don't do anything. When you get worried about having lots of web apps, the concern should really be focused on the worker processes. That's where your resources, memory and CPU, are consumed.
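To put rough numbers on that, here is a small Python model of worker processes on a single server. It is a deliberate simplification rather than how IIS actually schedules processes: the `AppPool` class, the pool and web app names, and the rule that a pool without idle shutdown always holds a worker process are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class AppPool:
    name: str
    idle_shutdown: bool  # True if the worker process stops when the pool is idle


def running_pools(app_to_pool: Dict[str, AppPool],
                  apps_with_traffic: Set[str]) -> Set[str]:
    """Names of app pools holding a worker process on one server.

    Simplified rule: a pool runs if any of its web apps receives requests,
    or if it is not configured to shut down when idle.
    """
    running = set()
    for app, pool in app_to_pool.items():
        if app in apps_with_traffic or not pool.idle_shutdown:
            running.add(pool.name)
    return running


# Hypothetical content web apps, each with a crawl-only twin.
separate = {
    "portal": AppPool("PortalPool", False),
    "teams": AppPool("TeamsPool", False),
    "portal-crawl": AppPool("PortalCrawlPool", False),
    "teams-crawl": AppPool("TeamsCrawlPool", False),
}

shared_crawl_pool = AppPool("CrawlPool", True)
consolidated = {
    "portal": AppPool("PortalPool", False),
    "teams": AppPool("TeamsPool", False),
    "portal-crawl": shared_crawl_pool,
    "teams-crawl": shared_crawl_pool,
}

wfe_traffic = {"portal", "teams"}                  # end users only
index_traffic = {"portal-crawl", "teams-crawl"}    # crawler only

print(len(running_pools(separate, wfe_traffic)))        # 4
print(len(running_pools(consolidated, wfe_traffic)))    # 2
print(len(running_pools(consolidated, index_traffic)))  # 3
```

Under these assumptions a load-balanced WFE drops from four worker processes to two, and the shared crawl pool only runs on the server the crawler actually hits.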

Published Wednesday, January 02, 2008 11:29 AM by joelo

Comments

Wednesday, January 02, 2008 7:42 PM by Mel

# re: I'm using forms auth and search/crawl (indexing) isn't working

Thursday, January 03, 2008 11:07 AM by Steven Fowler

# re: I'm using forms auth and search/crawl (indexing) isn't working

In a recent engagement I extended the web application to a new zone using Windows authentication. This allows the site to be crawled with the service account, and results will appear in your FBA-protected web application.

Thursday, January 03, 2008 6:13 PM by Cody

# re: I'm using forms auth and search/crawl (indexing) isn't working

Does this link (scroll down to "Known Issues") contradict what you're saying above?

http://technet2.microsoft.com/Office/f/?en-us/library/f136c3bd-14e6-4f3f-b7ef-36581e515c6f1033.mspx

Thursday, January 03, 2008 10:33 PM by Russ

# re: I'm using forms auth and search/crawl (indexing) isn't working

Unfortunately Basic+SSL doesn't work either, it has to be NTLM.

As a hoster, we extend the web app to ensure the crawler can auth via NTLM otherwise no content gets crawled.

Saturday, January 05, 2008 2:38 PM by Joel Oleson's Blog SharePoint Land

# Apologies to Dan and a couple of other house keeping items

I did a post a few days ago about Kerb/Forms authentication and Indexing . Since the post on Jan 2nd

Sunday, January 06, 2008 8:08 PM by Ben Curry

# re: I'm using forms or kerb auth and search/crawl (indexing) isn't working

I have instructions for setting up this second App here:

http://mindsharpblogs.com/ben/archive/2007/10/26/3305.aspx

Thursday, January 10, 2008 2:43 PM by Joel Oleson's Blog SharePoint Land

# Crawling and Kerberos the saga continues

Exciting times, here's another chapter in the Crawling and Kerberos saga. A few days ago I apologized
