
Some Useful MOSS Search Development Related Articles

Hi all,

I recently put together an email listing some of the articles available for developers working with MOSS Search. Here is my quick list:

 

Best Practices: Writing SQL Syntax Queries for Relevant Results in Enterprise Search --> http://msdn2.microsoft.com/en-us/library/bb219479.aspx

Creating Search Queries Programmatically by Using the Search Object Model in SharePoint Server 2007 --> http://msdn2.microsoft.com/en-us/library/bb626127.aspx

Creating Search Queries Programmatically by Using the Search Web Service in SharePoint Server 2007 --> http://msdn2.microsoft.com/en-us/library/bb625950.aspx

Evaluating and Customizing Search Relevance in SharePoint Server 2007 --> http://msdn2.microsoft.com/en-us/library/bb499682.aspx

Book Excerpts "Chapter 3: Customizing and Extending the Microsoft Office SharePoint 2007 Search (Part 1 of 2)" --> http://msdn2.microsoft.com/en-us/library/bb608302.aspx

Book Excerpts "Chapter 3: Customizing and Extending the Microsoft Office SharePoint 2007 Search (Part 2 of 2)" --> http://msdn2.microsoft.com/en-us/library/bb608305.aspx

Creating a Custom Search Page and Tabs in the Search Center of SharePoint Server --> http://msdn2.microsoft.com/en-us/library/bb428855.aspx

Creating and Exposing Managed Properties in the Advanced Search Page of SharePoint Server Enterprise Search --> http://msdn2.microsoft.com/en-us/library/bb428648.aspx

Creating and Exposing Search Scopes in SharePoint Server 2007 Enterprise Search --> http://msdn2.microsoft.com/en-us/library/bb428856.aspx

Displaying Search Results in a Grid View in SharePoint Server 2007 --> http://msdn2.microsoft.com/en-us/library/ms497338.aspx

Exposing Enterprise Search in SharePoint Server 2007 by Using Internet Explorer 7 and the Office Research Pane --> http://msdn2.microsoft.com/en-us/library/bb625970.aspx

Creating Custom Enterprise Search Web Parts in SharePoint Server 2007 --> http://msdn2.microsoft.com/en-us/library/bb871647.aspx

Customizing Search Results with Custom XSLTs in SharePoint Server 2007 --> http://msdn2.microsoft.com/en-us/library/bb896018.aspx

Searching in MOSS in the SDK --> http://msdn2.microsoft.com/en-us/library/ms497338.aspx

Class Library reference in MOSS SDK --> http://msdn2.microsoft.com/en-us/library/ms577961.aspx

 

And others :-)

Mike

Posted by miketag | 3 Comments

Do a Full Crawl after installing MOSS SP1

Hi all,

I have heard questions about whether a full crawl is needed after installing MOSS Service Pack 1. I did not see anything in the SP1 documentation saying that a full crawl is required or recommended.

But after talking to some of the experts, I was told that a full crawl is recommended but not required. SP1 includes performance improvements that use information MOSS collects during a full crawl to make incremental crawls much faster. So you do not have to do a full crawl after you move to SP1, and things will be fine, but you will not see those performance improvements until you do.

Cheers
Mike

Posted by miketag | 2 Comments

Force a Full Crawl for a Site

Hey all,

So I was doing some work with a customer and was asked how to get the effect of a Full Crawl on a single site while running only Incremental Crawls. I was told about a quick trick that worked in my test. I am not saying that this is the right way, or the recommended way, or the best way to do this, but it is a cool thing that worked for me, so take it with a grain of salt :-)

  1. In the site's Site Settings, open Search Visibility and set it to NO, so that the site is not available to be searched by MOSS.
  2. Perform an Incremental Crawl. This site's content will no longer be available in search results.
  3. Go back into the site and change the setting above to YES.
  4. Perform an Incremental Crawl again. This Incremental Crawl will now act as a Full Crawl on the site :-)

This is helpful in situations where you cannot wait for a Full Crawl but can run a few Incremental Crawls in the meantime. The site gets fully crawled even though you are only running Incremental Crawls. Pretty cool :-)

Give it a try and let me know if it works for you.

Cheers
Mike

Posted by miketag | 4 Comments

Tafiti goes Shared Source

Hi all,

Earlier this year, Microsoft debuted www.tafiti.com. Tafiti, which means “do research” in Swahili, is an experimental Web site that explores the intersection of two trends: the specialization of search and richer experiences on the Web.

Tafiti was very successful, and today Microsoft is announcing the release of the Tafiti Search Visualization source code to CodePlex, which means any developer can download, modify, and use the code royalty free (see the MS-PL license for all the details).

Tafiti has been released via CodePlex as a Windows Live Quick Application, one of a set of demos that developers can download and use as reference implementations or starter kits for the Windows Live Platform. Microsoft hopes developers will enhance the Windows Live Quick Apps via the CodePlex community development platform.

You can see Tafiti running at http://tafiti.mslivelabs.com (or see the original with the Halo 3 skin). To understand how to use it, read the walkthrough or watch the video (4.5 minutes).

There is a wealth of information in this codebase such as:

  • Reusable Silverlight Controls

  • Code which wraps the Live Search SOAP API

  • Windows Live ID Web Authentication implementation

For more information see the announcement on dev.live.com or on Angus Logan's blog.

Thanks
Mike

 

Posted by miketag | 1 Comments

MOSS and WSS 3.0 SP1 Released

Hi all,

Just wanted to make sure that everyone is aware of the release of MOSS SP1 and WSS 3.0 SP1:

For all the links related to this, see http://blogs.msdn.com/sharepoint/archive/2007/12/11/announcing-the-release-of-wss-3-0-sp1-and-office-sharepoint-server-2007-sp1.aspx 

Thanks
Mike

 

Posted by miketag | 0 Comments

Announcing Microsoft Search Server 2008 Express

Hi all,

I wanted to make sure you saw the following sites:

Announcement on this news at the SharePoint Blog: http://blogs.msdn.com/sharepoint/archive/2007/11/06/announcing-microsoft-search-server-2008-express.aspx

Microsoft Enterprise Search site: http://www.microsoft.com/enterprisesearch/

Microsoft Search Server 2008: http://www.microsoft.com/enterprisesearch/serverproducts/searchserver/default.aspx

Microsoft Search Server Express 2008 [Free]: http://www.microsoft.com/enterprisesearch/serverproducts/searchserverexpress/default.aspx

Cheers
Mike

 

Posted by miketag | 1 Comments

Customizing and Extending the Microsoft Office SharePoint 2007 Search Excerpts

Hi all,

I just wanted to make sure that everyone saw these two excerpts from Inside Microsoft Office SharePoint Server 2007 by Patrick Tisseghem, from Microsoft Press.

These two excerpts can be found at:

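Chapter 3: Customizing and Extending the Microsoft Office SharePoint 2007 Search (Part 1 of 2) --> http://msdn2.microsoft.com/en-us/library/bb608302.aspx

Chapter 3: Customizing and Extending the Microsoft Office SharePoint 2007 Search (Part 2 of 2) --> http://msdn2.microsoft.com/en-us/library/bb608305.aspx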

Thanks
Mike
 

Posted by miketag | 1 Comments

Technical Whitepaper on Microsoft Deployed SharePoint Server 2007 Search

Hi all,

I just wanted to make sure that everyone saw this technical whitepaper on how Microsoft deployed SharePoint Server 2007 Search internally:

"This document shares the experiences of Microsoft teams in deploying Enterprise Search at Microsoft. Because of the significant amount of knowledge that these teams gained, the experience provides relevant guidance to organizations that want to help improve employee productivity, improve the quality of work produced, reduce duplication of efforts, and take advantage of the financial investment in digital assets by deploying Enterprise Search."

This paper can be found at: http://technet.microsoft.com/en-us/library/bb735129.aspx

Thanks
Mike

 

Posted by miketag | 1 Comments

Announcing the Microsoft Business Data Catalog Definition Editor for MOSS 2007

Hi all,

Just want to make sure that you saw the announcement for the new Microsoft Business Data Catalog Definition Editor for MOSS 2007. The BDC Definition Editor tool is now available in the latest version of the MOSS SDK at http://www.microsoft.com/downloads/details.aspx?familyid=6d94e307-67d9-41ac-b2d6-0074d6286fa9&displaylang=en

This tool makes it easier to set up BDC connections in MOSS, which in turn lets you search across your back-end applications and databases.

For more info, please see http://blogs.msdn.com/sharepoint/archive/2007/08/22/announcing-the-microsoft-business-data-catalog-definition-editor-for-microsoft-office-sharepoint-server-2007.aspx

Thanks
Mike

 

Posted by miketag | 0 Comments

Reasons for a Full Crawl

Hi all,

I have been asked a few times why MOSS Search would need to do a full crawl. The following information is taken from one of the whitepapers on TechNet and does a good job of explaining this:

Reasons for an SSP administrator to do a full crawl include:

  • One or more QFEs or service packs were installed on servers in the farm. See the instructions for the hotfix or service pack to find out whether it requires a full crawl.
  • An SSP administrator added a new managed property.
  • To re-index ASPX pages on Windows SharePoint Services 3.0 or Office SharePoint Server 2007 sites.

    Note: The crawler cannot discover when ASPX pages on Windows SharePoint Services 3.0 or Office SharePoint Server 2007 sites have changed. Because of this, incremental crawls do not re-index views or home pages when individual list items are deleted. We recommend that you periodically do full crawls of sites that contain ASPX files to ensure that these pages are re-indexed.

  • To resolve consecutive incremental crawl failures. In rare cases, if an incremental crawl fails one hundred consecutive times at any level in a repository, the index server removes the affected content from the index.
  • One or more crawl rules have been added or modified.
  • To repair a corrupted index.


The system does a full crawl even when an incremental crawl is requested under the following circumstances:

  • An SSP administrator stopped the previous crawl.
  • A content database was restored.
  • A full crawl of the site has never been done.
  • To repair a corrupted index. Depending upon the severity of the corruption, the system might attempt to perform a full crawl if corruption is detected in the index.
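
As a rough planning aid, the second list above can be read as a simple rule: an incremental request turns into a full crawl when any of the listed conditions holds. The sketch below encodes only that rule. The function and parameter names are hypothetical and are not part of any MOSS API, and the sketch says nothing about how the crawler itself is implemented.

```python
def effective_crawl_type(requested,
                         previous_crawl_stopped=False,
                         content_db_restored=False,
                         never_fully_crawled=False,
                         index_corruption_detected=False):
    """Return 'full' or 'incremental' for a requested crawl.

    Hypothetical helper that mirrors the list above; it does not call MOSS.
    """
    if requested not in ("full", "incremental"):
        raise ValueError("requested must be 'full' or 'incremental'")
    if requested == "full":
        return "full"
    # Any of these conditions turns an incremental request into a full crawl.
    # Corruption is treated as a trigger here, although the whitepaper says the
    # system only "might" attempt a full crawl, depending on severity.
    forced = (previous_crawl_stopped or content_db_restored
              or never_fully_crawled or index_corruption_detected)
    return "full" if forced else "incremental"


print(effective_crawl_type("incremental"))
print(effective_crawl_type("incremental", content_db_restored=True))
```

This is handy mostly for crawl scheduling: if one of these conditions applies, budget time for a full crawl even if the schedule only says incremental.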


For more information on this, please see this article at TechNet: http://technet2.microsoft.com/Office/en-us/library/f32cb02e-e396-46c5-a65a-e1b045152b6b1033.mspx?mfr=true

Thanks
Mike

Posted by miketag | 3 Comments

Estimate MOSS Search Disk Space Requirements:

Hi all,

I have been asked many times about storage requirements around MOSS Search. By this I mean, what estimates can be made on the disk space requirements for Index server, Query Server and Database Server.

Below I have pasted the relevant content from the TechNet whitepaper. I pulled it out because many people miss this information:

Index server disk space requirements:

To estimate the index server disk space requirements, we recommend that you use the following calculations:
 
  Size of data crawled = Y
 
  Size of index on index server = a range of 5% through 12% * Y = X
 
  Initial disk space = 2.5*X.
 

A large amount of index server disk capacity is required to accommodate backups, which must reside on the same disk as the index, and to accommodate the merge process when crawled data is merged with the index.
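
To make the arithmetic concrete, here is a minimal sketch of the index server calculation above. The function name, the parameter names, and the 1,000 GB example figure are my own assumptions for illustration; only the 5% to 12% range and the 2.5 multiplier come from the whitepaper.

```python
def index_server_disk_estimate(crawled_gb, low_ratio=0.05, high_ratio=0.12,
                               headroom=2.5):
    """Return (low, high) initial disk space in GB for the index server.

    crawled_gb is Y, the size of the crawled data. The index size X is
    taken as 5% to 12% of Y, and the initial disk space as 2.5 * X.
    """
    if crawled_gb < 0:
        raise ValueError("crawled_gb must be non-negative")
    index_low = crawled_gb * low_ratio     # smallest expected index size X
    index_high = crawled_gb * high_ratio   # largest expected index size X
    return index_low * headroom, index_high * headroom


# Hypothetical example: 1,000 GB of crawled content.
low, high = index_server_disk_estimate(1000)
print(f"Initial index server disk space: {low:.0f} GB to {high:.0f} GB")
```

Because the result is a range, planning for the upper end is the safer choice when you are not sure how metadata-heavy your content is.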

Note: The volume of crawled data can differ based on the content source. A content source is a set of options that you can use to specify what type of content is crawled, what URLs to crawl, and how deep and when to crawl.

For example, if the content source specifies file-share content, the index size can be up to 30 percent of the size of the content.

Content Index Sizing:

You can estimate the size of the content index with the following equation:

Index size = Average size of document * number of documents * 4 x 10^-10 GB.

Note that this equation is intended only to establish a starting-point estimate. Real-world results may vary widely based on the size of documents being indexed, and how much metadata is being indexed during a crawl operation.

Query server disk space requirements:

Content indexes are propagated from the index server to every query server in the farm. The full index is propagated to the query servers during the query server initialization phase, and incremental changes in the index are propagated on a continual basis. The merging process requires more disk space than what is required to accommodate the index itself.

Given a content index size of X, we recommend that initial disk space be at least 2.5*X for every content index on each query server in the farm.

Database server disk space requirements:

The search database that stores the metadata for the search system requires more disk space than the index. This is especially the case if you crawl many SharePoint sites, which are very rich in metadata.

To estimate disk space requirements for the search database, use the following guideline: For an index size of X, we recommend initial disk space of 4*X for the hard disk that contains the search database.

Note: When a farm contains only site collections, sites, lists, and document libraries, and no external content such as documents stored on file shares, the typical size of the index is approximately 1-5 percent of the size of the content database. If there are no document libraries in the farm, the typical size of the index is approximately one percent of the size of the content database. The actual size of the index relative to the content database varies depending on the size and type of the documents stored in the farm.

For more information, please see the whitepaper this content was taken from: http://technet2.microsoft.com/Office/en-us/library/5465aa2b-aec3-4b87-bce0-8601ff20615e1033.mspx?mfr=true 
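
The query server and database guidelines can be worked out the same way once you have an index size X. The sketch below assumes a single content index; the function name, dictionary keys, and the example numbers (a 100 GB index and two query servers) are hypothetical, while the 2.5 and 4 multipliers are the ones quoted above.

```python
def search_farm_disk_plan(index_gb, query_servers=1):
    """Estimate initial disk space in GB for query servers and the search database.

    index_gb is X, the size of one content index. Each query server needs at
    least 2.5 * X for that index, and the disk holding the search database
    needs 4 * X.
    """
    if index_gb < 0 or query_servers < 1:
        raise ValueError("index_gb must be >= 0 and query_servers >= 1")
    per_query_server = 2.5 * index_gb
    return {
        "per_query_server_gb": per_query_server,
        "all_query_servers_gb": per_query_server * query_servers,
        "search_database_gb": 4 * index_gb,
    }


# Hypothetical example: a 100 GB content index propagated to two query servers.
for name, gigabytes in search_farm_disk_plan(100, query_servers=2).items():
    print(f"{name}: {gigabytes:.0f}")
```

If an SSP farm hosts more than one content index, run the calculation once per index and add the results for each server.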

Hope that helps,
Mike

Posted by miketag | 4 Comments

Recommended Guidelines for Search Boundaries

Hi all,

I have been asked a lot for guidelines around some of the boundaries for MOSS Search. Here is a quick table that does a good job on this:

 

Search object | Guidelines for acceptable performance | Notes
Search indexes | One per SSP; maximum of 20 per farm | MOSS supports one content index per SSP. Given that we recommend a maximum of 20 SSPs per farm, a maximum of 20 content indexes is supported. Note that an SSP can be associated with only one index server and one content index. However, an index server can be associated with multiple SSPs and have a content index for each SSP.
Indexed documents | 50,000,000 per content index | MOSS supports 50 million documents per index server.
Content sources | 500 per SSP | This is a hard limit enforced by the system.
Start Addresses | 500 per content source | This is a hard limit enforced by the system.
Alerts | 1,000,000 per SSP | This is the tested limit.
Scopes | 200 per site | This is a recommended limit per site. We recommend a maximum of 100 scope rules per scope.
Display groups | 25 per site | These are used for a grouped display of scopes through the user interface.
Crawl rules | 10,000 per SSP | We recommend a maximum of 10,000 crawl rules irrespective of type.
Keywords | 15,000 per site | We recommend a maximum of 10 Best Bets and five synonyms per keyword.
Crawled properties | 500,000 per SSP | These are properties that are discovered during a crawl.
Managed properties | 100,000 per SSP | These are properties used by the search system in queries. Crawled properties are mapped to managed properties. We recommend a maximum of 100 mappings per managed property.
Authoritative pages | 200 per relevance level | This is the maximum number of sites in each of the four relevance levels.
Results removal | 100 | This is the maximum recommended number of URLs that should be removed from the system in one operation.
Crawl logs | 50,000,000 | Number of individual log entries in the crawl log.
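
If you keep your planned numbers in a spreadsheet or script, a quick check against the table can catch problems early. The sketch below copies a handful of the guideline values from the table into a dictionary and reports any planned value that exceeds them. The key names and the sample plan are hypothetical, and nothing here reads from a real MOSS farm.

```python
# A subset of the guideline values from the table above.
# The key names are made up for this sketch.
RECOMMENDED_LIMITS = {
    "indexed_documents_per_content_index": 50_000_000,
    "content_sources_per_ssp": 500,
    "start_addresses_per_content_source": 500,
    "scopes_per_site": 200,
    "crawl_rules_per_ssp": 10_000,
    "managed_properties_per_ssp": 100_000,
}


def find_exceeded_limits(planned):
    """Return (name, planned_value, limit) for each planned value over its guideline."""
    exceeded = []
    for name, value in planned.items():
        if name not in RECOMMENDED_LIMITS:
            raise KeyError(f"No guideline recorded for {name!r}")
        limit = RECOMMENDED_LIMITS[name]
        if value > limit:
            exceeded.append((name, value, limit))
    return exceeded


# Hypothetical deployment plan.
planned = {
    "content_sources_per_ssp": 120,
    "scopes_per_site": 250,
    "crawl_rules_per_ssp": 8_000,
}
for name, value, limit in find_exceeded_limits(planned):
    print(f"{name}: planned {value}, guideline {limit}")
```

Keep in mind that some rows in the table are hard limits enforced by the system and others are only recommendations, so an exceeded value does not always mean the same thing.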


For more information, see the "Plan for Software Boundaries" at: http://technet2.microsoft.com/Office/en-us/library/6a13cd9f-4b44-40d6-85aa-c70a8e5c34fe1033.mspx?mfr=true

Hope that helps
Mike

 

Posted by miketag | 3 Comments

Configure MOSS to crawl Lotus Notes

Hi all,

 

I have been asked about how MOSS can be configured to crawl Lotus Notes. Below are some links that may be useful:

 

Configure MOSS 2007 to crawl Lotus Notes:
http://technet2.microsoft.com/Office/en-us/library/82c7c354-6347-4ae8-b5f8-7d0cdfe432401033.mspx

 

Configuring the Lotus Notes Protocol Handler:

http://office.microsoft.com/en-us/sharepointportaladmin/HA011603581033.aspx

 

Information about installing the Lotus Notes Protocol Handler for SharePoint Portal Server 2003:
http://support.microsoft.com/default.aspx/kb/830971

 

Secure crawls of Lotus Notes with SharePoint:

http://blogs.msdn.com/edhild/articles/473060.aspx

 

Search and Index Lotus Notes:
http://www.sharepointblogs.com/helloitsliam/archive/2007/01/09/17654.aspx

 

Displaying Correct Titles of Lotus Notes Documents in SharePoint Search Results:

http://meiyinglim.blogspot.com/2007/01/displaying-correct-titles-of-lotus.html

 

Use MOSS 2007 to index a Lotus Notes Database:

http://meiyinglim.blogspot.com/2007/01/using-sharepoint-2007-to-index-lotus.html

  

How to Configure Search to honor Lotus Notes Security Settings:

http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q288816

 

Hope it helps

Mike

Posted by miketag | 0 Comments

SQL Syntax Queries in MOSS Search

Hi all,

I have been asked a few times for best practices on writing SQL Syntax Queries for MOSS Search. There are great articles in the MOSS SDK and on MSDN on best practices. In case folks are not aware of these, here are a couple of quick links:

Best Practices: Writing SQL Syntax Queries for Relevant Results in Enterprise Search: http://msdn2.microsoft.com/en-us/library/bb219479.aspx

Enterprise Search SQL Syntax Reference: http://msdn2.microsoft.com/en-us/library/ms493660.aspx

Hope that helps
Mike

Posted by miketag | 3 Comments

Planning your MOSS Search Team

Hi all,

I have been asked many times by other consultants and customers how to plan the MOSS Search team. Before planning and rolling out a MOSS deployment, you need to know what your Search planning and operations team might need to look like.

There are guidelines on Microsoft's TechNet site on what the Search planning team might look like, what the Search process might include, and how to plan your Search operations and deployment team. You can find this information at: http://technet2.microsoft.com/Office/en-us/library/b4b9ceae-3837-4a79-8097-d6381500e4401033.mspx

There is also a handy worksheet for customers to track who their Search administrators are for a MOSS deployment, along with details such as their name, user or group account, the names of the SSPs they will administer, and other useful information. I recommend you download this worksheet and use it for your MOSS Search deployment. You can find it at http://go.microsoft.com/fwlink/?LinkId=73621&clcid=0x409

Hope that helps
Mike

Posted by miketag | 4 Comments