This morning I was answering a DL question. The question was: "How do I control what content Search crawls? There are some paths (URLs) which I do not want crawled.

I was wondering if there is a way to configure MOSS Search to exclude the paths and library names from the search results?"

YES! You can easily control which content is crawled using the following technique.

Create a new page, Index.html, and use it to control what gets crawled. Details:

 

1.     Create a new path with only one page in it (Index.html)

2.     Add links to all the paths (URLs) you want crawled to the Index.html page. [Do not include paths you do not want crawled.]

3.     Use the Index.html page URL to define the content source. Specify the newly created path where the Index file is (say, http://MyServer/Search/Index.html):

    • Content Source Type (select Web Sites radio button) 
    • Start Address (type the Index.html page URL)

4.     In the crawl settings, use custom settings to control server hops and page depth:

    • Choose the radio button option "Custom - specify page depth and server hops"
    • Use the Limit Page Depth and Limit Server Hops options to control which content is crawled
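
If the list of paths is long, the Index.html page from steps 1 and 2 could be generated rather than typed by hand. Here is a minimal sketch in Python; the function name build_index_page and the two example URLs are made up for illustration and are not part of MOSS.

```python
from html import escape


def build_index_page(urls, title="Crawl index"):
    """Return HTML for a page that links to every URL in `urls`."""
    links = "\n".join(
        f'    <li><a href="{escape(u, quote=True)}">{escape(u)}</a></li>'
        for u in urls
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        "  <ul>\n"
        f"{links}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


# Hypothetical paths to include; anything left out of this list gets no link.
urls_to_crawl = [
    "http://MyServer/sites/TeamA/Shared%20Documents/",
    "http://MyServer/sites/TeamB/default.aspx",
]

index_html = build_index_page(urls_to_crawl)
print(index_html)
```

Save the printed output as Index.html in the new path (for example http://MyServer/Search/Index.html) and use that URL as the start address.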

This way you have full control over the content that is crawled.
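
To see why a page depth limit matters with this technique, here is a small simulation in Python. It is only a sketch: it assumes page depth counts the number of links followed from the start address, it does not model server hops at all, and the link graph is invented. Check the product documentation for how the MOSS crawler actually counts depth.

```python
from collections import deque


def reachable(links, start, max_depth):
    """Return the pages reached from `start` within `max_depth` link hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached, do not follow this page's links
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen


# Hypothetical link graph: Index.html links only to the wanted paths,
# but one of those pages links onward to content we do not want.
links = {
    "Index.html": ["/TeamA", "/TeamB"],
    "/TeamA": ["/TeamA/Docs", "/Private"],
    "/TeamB": [],
}

depth_1 = reachable(links, "Index.html", 1)
depth_2 = reachable(links, "Index.html", 2)
print(sorted(depth_1))
print(sorted(depth_2))
```

Under that assumption, a depth of 1 stops at the pages listed in Index.html, while a depth of 2 also picks up whatever those pages link to, including /Private in this made-up graph.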