This morning I was answering a question on a DL. The question was: "How do I control which content Search crawls? There are some paths (URLs) that I do not want crawled.
I was wondering if there is a way to configure MOSS Search to exclude certain paths and library names from the search results?"
YES! You can easily control which content is crawled using the following technique.
Create a new page, Index.html, and use it to control what gets crawled. Details:
1. Create a new path that contains only one page (Index.html).
2. Add links to all the paths (URLs) you want crawled to the Index.html page. [Do not include paths you do not want crawled.]
3. Use the Index.html page URL to define the Content Catalog, that is, specify the newly created path where the Index file lives (say http://MyServer/Search/Index.html).
4. In the crawler settings, use custom settings to control the number of server hops and the page depth.
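This way you have full control over the content that is crawled, because the crawler only follows the links listed on Index.html, up to the hop and depth limits you set.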
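
If the list of paths is long, the Index.html page from step 2 can be generated rather than typed by hand. Here is a minimal sketch in Python. The function name build_index_html and the site URLs in the example are hypothetical, made up only for illustration; substitute your own paths. It simply produces a plain page with one link per URL and does nothing MOSS-specific.

```python
from html import escape


def build_index_html(urls, title="Crawl seed page"):
    """Return a minimal HTML page containing one link per URL to be crawled."""
    # Escape each URL so characters such as & or " do not break the markup.
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(url)}</a></li>'
        for url in urls
    )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical paths: list only what you want crawled and leave out the rest.
    paths_to_crawl = [
        "http://MyServer/sites/HR/",
        "http://MyServer/sites/Projects/Documents/",
    ]
    print(build_index_html(paths_to_crawl))
```

Save the output as Index.html in the path created in step 1, then point the crawler at that page as described in steps 3 and 4.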