Moved from http://blogs.msdn.com/vijgang
This post describes the steps you need to get SharePoint to crawl a site protected by forms-based authentication (FBA), the issues you may face, and how to resolve them.
We start by downloading addrule.exe to the SharePoint server. This tool is available at SharePoint Server 2007 Tool: Add/Edit Crawl Rules with Form/Cookie Credentials.
We then create an XML file to feed to addrule.exe. The format of the XML file is documented in "Searching Sites Protected by Forms Authentication with Enterprise Search in SharePoint Server 2007" and also in "SharePoint Server 2007 Tool: Add/Edit Crawl Rules with Form/Cookie Credentials".
I created this sample XML file:
<rule>
  <path>http://fbasite/*</path>
  <type>FORMS</type>
  <auth_url>http://fbasite/_layouts/login.aspx?ReturnUrl=%2f</auth_url>
  <login_type>POST</login_type>
  <parameters>
    <param name="UserName">administrator</param>
    <param name="password">mypassword!</param>
    <param name="login" public="true">Sign In</param>
    <param name="__EVENTVALIDATION" public="true">/wEWBQKLhuipCQLE96mtBQLLtsPBAgLkkP7MCgK/lZyyB9CK4YpD9xxOo46u87JbhTsQ5AkW</param>
  </parameters>
  <error_pages>
    <error_page>/layouts/login.aspx</error_page>
  </error_pages>
</rule>
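If you would rather not hand-edit the file, a rule shaped like the sample above can be generated with a few lines of Python, which takes care of quoting and escaping for you. This is only a sketch: the element and attribute names are copied from the sample, while the build_rule helper and its arguments are made up for illustration and are not part of addrule.exe.

```python
import xml.etree.ElementTree as ET


def build_rule(path, auth_url, params, error_pages, public=()):
    """Build a <rule> element shaped like the sample rule (hypothetical helper).

    params:  dict of form field name -> value
    public:  field names that should carry public="true", as in the sample
    """
    rule = ET.Element("rule")
    ET.SubElement(rule, "path").text = path
    ET.SubElement(rule, "type").text = "FORMS"
    ET.SubElement(rule, "auth_url").text = auth_url
    ET.SubElement(rule, "login_type").text = "POST"

    parameters = ET.SubElement(rule, "parameters")
    for name, value in params.items():
        param = ET.SubElement(parameters, "param", name=name)
        if name in public:
            param.set("public", "true")
        param.text = value

    pages = ET.SubElement(rule, "error_pages")
    for page in error_pages:
        ET.SubElement(pages, "error_page").text = page
    return rule


# Values below are the ones from the sample rule.
rule = build_rule(
    path="http://fbasite/*",
    auth_url="http://fbasite/_layouts/login.aspx?ReturnUrl=%2f",
    params={
        "UserName": "administrator",
        "password": "mypassword!",
        "login": "Sign In",
    },
    error_pages=["/layouts/login.aspx"],
    public={"login"},
)
xml_text = ET.tostring(rule, encoding="unicode")
print(xml_text)
```

Writing xml_text to myfba.xml gives a file with the same layout as the hand-written sample.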
I then create a content source that points to the http://fbasite URL.
Then I run the command "addrule.exe myfba.xml" to create a crawl rule in the SharePoint SSP search settings.
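Before running addrule.exe it can be worth checking that the XML file is at least well-formed, since a single missing quote in a hand-edited file is easy to overlook. The sketch below does that with Python's standard library. The check_rule_xml function is a hypothetical helper, and treating the elements from the sample as required is an assumption of this sketch, not something taken from the addrule documentation.

```python
import xml.etree.ElementTree as ET


def check_rule_xml(text):
    """Return a list of problems found in rule XML text (empty if none)."""
    try:
        rule = ET.fromstring(text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    # Element names are taken from the sample rule; treating them as
    # required is an assumption of this sketch.
    for tag in ("path", "type", "auth_url", "login_type", "parameters"):
        if rule.find(tag) is None:
            problems.append(f"missing <{tag}>")
    for param in rule.iter("param"):
        if not param.get("name"):
            problems.append("<param> without a name attribute")
    return problems


good = (
    "<rule><path>http://fbasite/*</path><type>FORMS</type>"
    "<auth_url>http://fbasite/_layouts/login.aspx?ReturnUrl=%2f</auth_url>"
    "<login_type>POST</login_type>"
    '<parameters><param name="UserName">administrator</param></parameters>'
    "</rule>"
)
# Same kind of slip as an unclosed attribute quote.
bad = '<rule><parameters><param name="login" public="true>Sign In</param></parameters></rule>'

print(check_rule_xml(good))
print(check_rule_xml(bad))
```

To check a real file, read it first, for example check_rule_xml(open("myfba.xml", encoding="utf-8").read()).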
Many standard FBA sites will work using these steps, but some might still fail. I found two reasons why this can happen: parameters missing from the XML file, and the URL encoding of the parameter values.
The solution is the same for both issues: use Fiddler, a wonderful tool that is an HTTP debugging proxy. It lets us see the HTTP traffic between the client and the server.
So to fix this, we log in to the FBA site from a browser while Fiddler is capturing, and then look at the login request it recorded.
In the captured request, the section in red is the interesting part. These are the parameters that the browser sends to log you in, and this is exactly what SharePoint needs to log in to the FBA site. If you look at the formatting of this text, it is a single string of name=value pairs joined by &.
So if we copy each and every parameter in that string, with its respective value, to the addrule XML file, SharePoint should log in the same way you logged in using the browser.
By copying all the parameters we resolve both issues: the missing parameters and the URL encoding.
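Copying a long parameter string by hand is tedious, so here is a rough Python sketch that splits a captured form body into param lines ready to paste into the XML file. The body_to_params helper and the captured string are made up for illustration; real values come from your own Fiddler trace. By default the values are kept exactly as captured. The decode flag is only there so you can compare the URL-decoded form, since which form a given site needs is something to confirm against your own trace.

```python
from urllib.parse import unquote_plus
from xml.sax.saxutils import escape, quoteattr


def body_to_params(body, decode=False):
    """Turn a captured form POST body into <param> lines (hypothetical helper).

    body:   the name=value&name=value string copied from the request
    decode: if True, URL-decode names and values before writing them out
    """
    lines = []
    for pair in body.split("&"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        if decode:
            name, value = unquote_plus(name), unquote_plus(value)
        # quoteattr/escape keep the output valid XML even if a value
        # contains characters such as & or <.
        lines.append(f"<param name={quoteattr(name)}>{escape(value)}</param>")
    return lines


# Hypothetical captured body, not taken from a real trace.
captured = "UserName=administrator&password=mypassword%21&login=Sign+In"

for line in body_to_params(captured):
    print(line)
for line in body_to_params(captured, decode=True):
    print(line)
```

Any public="true" attributes still need to be added by hand, as in the sample rule.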
NOTE: If the site is a SharePoint FBA site, the recommendation is to extend and map the site to an NTLM site and then crawl the NTLM site. The article "Prepare to crawl host-named sites that use forms authentication" talks about this. But if you would still like to crawl the SharePoint site using the FBA credentials, you need to make this additional configuration change in the crawl rule.
Now your SharePoint site will get crawled as a standard HTTP site using the FBA credentials, but note that you would miss a lot of SharePoint related functionality.
Note: If you are using Microsoft Search Server 2008, you actually have a UI that simplifies this process. There is an update planned to bring the MSS features into MOSS. Once that is done, the UI should take care of finding the parameters needed to create the crawl rule.