
“Fix” SharePoint 302 redirect problem by IIS7 and URL Rewrite

Recently, in my spare time, I’ve been helping friends get their Internet-facing SharePoint site up and running. Since the site is public, the first thing they need to consider is SEO. That brings up a well-known problem: SharePoint uses a 302 temporary redirect instead of a 301. If you search for “sharepoint 302” you will find a lot of articles about the problem.

Here’s the default SharePoint site. The request is temporarily redirected (302) to Pages/default.aspx. Search engine bots don’t like that; they prefer a permanent 301 redirect. So this is BAD.

snap006 

How to solve it? I’m an IT Pro, and I don’t want to deal with a custom redirect HttpModule. God knows what will happen if that custom code messes up my sites! So are there any other options?

Luckily, I installed SharePoint on Windows Server 2008, so I can use IIS7 features. I downloaded the URL Rewrite module from http://www.iis.net/extensions/URLRewrite, installed it, and started to configure the redirect.

snap007

Choose your site in IIS Manager, click URL Rewrite, and create a new blank rule.

Use Regular Expressions to match ^$ (which means “empty”, i.e. a request for the site root with nothing after it). Set Action Type to Redirect, add the redirect URL (by default this should be Pages/default.aspx), and set the redirect type to Permanent (301). You are all set!
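To make the rule a bit more concrete, here is a rough Python sketch that mimics its matching logic. It assumes that URL Rewrite hands the rule the request path without the leading slash, so a request for the site root arrives as an empty string. The function name apply_rule is made up for this illustration and is not part of IIS.

```python
import re

# Pattern and redirect target taken from the rule described above.
ROOT_PATTERN = re.compile(r"^$")
REDIRECT_TARGET = "Pages/default.aspx"


def apply_rule(path):
    """Return (status, location) if the rule fires, otherwise None.

    `path` is the request path without the leading slash (assumption).
    """
    if ROOT_PATTERN.search(path):
        return 301, REDIRECT_TARGET
    return None


print(apply_rule(""))                    # site root: the rule fires
print(apply_rule("Pages/default.aspx"))  # any other path: the rule is skipped
```

Because the pattern only matches the empty path, requests for real pages pass through untouched.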

snap005

 

 

 

Now, clear browser cache and revisit the site:

snap004

It is 301 now! Pretty easy, isn’t it?

The URL Rewrite module is great. If you are a regex guru, you can also create more complex rules to fit your site.

Posted by opal | 0 Comments

Upgrade Checker in SP2 – Behind the Scene

Following the previous post, Upgrade Checker in SP2 – prepare your way to SharePoint Server 2010, here are the details of what the upgrade checker checks.

Where are the upgrade checker rules?

The upgrade checker rules can be found at

X:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\CONFIG\PreUpgradeCheck

By default, there are two rule files, one for WSS (WSSPreUpgradeCheck.XML) and one for MOSS (OssPreUpgradeCheck.XML). You can create your own rule files and put them into this directory. The checker will load them automatically.

How to use upgrade checker?

A simple answer is, run

"X:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\BIN\STSADM.EXE" -o preupgradecheck

X is the drive letter where SharePoint is installed.

There are a few options for this operation. For example, you can use “-rulefiles rulefilename” to specify which rule file to check, and “-localonly” to check only the rules marked as LocalOnly. This can help in certain scenarios.
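If you want to script the check, a small helper can assemble the command line for you. This is only a sketch: the function name build_preupgradecheck_command is my own, and it uses just the two options mentioned above. It only builds the argument list; on a real server you would pass the result to something like subprocess.run from an elevated prompt.

```python
def build_preupgradecheck_command(drive, rulefiles=None, localonly=False):
    """Build the STSADM argument list for the preupgradecheck operation."""
    stsadm = (drive + r":\Program Files\Common Files\Microsoft Shared"
              r"\Web Server Extensions\12\BIN\STSADM.EXE")
    command = [stsadm, "-o", "preupgradecheck"]
    if rulefiles:
        # Restrict the run to one rule file.
        command += ["-rulefiles", rulefiles]
    if localonly:
        # Run only the rules marked as LocalOnly.
        command.append("-localonly")
    return command


print(build_preupgradecheck_command("C"))
print(build_preupgradecheck_command("C", rulefiles="WSSPreUpgradeCheck.XML", localonly=True))
```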

The syntax of upgrade checker can be found here:

Preupgradecheck: Stsadm operation (Windows SharePoint Services)

How does upgrade checker check my farm?

By calling the object model. You can verify this by opening the rule files in an XML editor yourself. Microsoft.SharePoint.Administration.Health is responsible for most of the rules. Here’s an example that checks the OS requirements.

<Setting Key1="Microsoft.SharePoint.Administration.Health.OSPrerequisite" Key2="LocalOnly">

If you are familiar with BestPracticeAnalyzer, you can also find these:

<ObjectProcessor Name="Group" Assembly="BPA.Common.dll" Class="Microsoft.WindowsServerSystem.BestPracticesAnalyzer.Common.GroupObjectProcessor" />
<ObjectProcessor Name="Registry" Assembly="BPA.ConfigCollector.dll" Class="Microsoft.WindowsServerSystem.BestPracticesAnalyzer.Extensions.RegistryObjectProcessor" />
<ObjectProcessor Name="SQL" Assembly="BPA.ConfigCollector.dll" Class="Microsoft.WindowsServerSystem.BestPracticesAnalyzer.Extensions.SQLObjectProcessor" />
<ObjectProcessor Name="WMI" Assembly="BPA.ConfigCollector.dll" Class="Microsoft.WindowsServerSystem.BestPracticesAnalyzer.Extensions.WMIObjectProcessor" />

These are used to help check group, registry, SQL and WMI objects.
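If you would rather not read the XML by eye, a few lines of Python can list the object processors for you. This is a minimal sketch: the two ObjectProcessor entries are copied from above, but the Sample root element is hypothetical and only there so the fragment parses on its own. The real rule files have their own surrounding structure.

```python
import xml.etree.ElementTree as ET

# Two entries copied from the rule file, wrapped in a hypothetical <Sample> root.
FRAGMENT = """
<Sample>
  <ObjectProcessor Name="Registry" Assembly="BPA.ConfigCollector.dll" Class="Microsoft.WindowsServerSystem.BestPracticesAnalyzer.Extensions.RegistryObjectProcessor" />
  <ObjectProcessor Name="SQL" Assembly="BPA.ConfigCollector.dll" Class="Microsoft.WindowsServerSystem.BestPracticesAnalyzer.Extensions.SQLObjectProcessor" />
</Sample>
"""


def list_object_processors(xml_text):
    """Map each ObjectProcessor name to the assembly that implements it."""
    root = ET.fromstring(xml_text)
    return {el.get("Name"): el.get("Assembly") for el in root.iter("ObjectProcessor")}


print(list_object_processors(FRAGMENT))
```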

There are two rule types, Information and Error. What’s the difference?

Information rules check the server farm for certain configurations that need to be considered during the upgrade process. The configurations checked here will not stop you from upgrading, but you might need to follow the advice in order to upgrade the farm. These rules also give you a summary of the farm, to help you estimate the time needed for the upgrade. For example, the UpgradeType rule checks your farm for eligible upgrade methods, and ServerInfo lists all the server names in the farm.

Error rules check whether anything is wrong that could prevent things from being upgraded. For example, your server does not meet the Windows Server 2008 x64 requirement, or there are orphaned objects in your farm that would not be part of the upgrade process.

Any explanation for the rules shipped with SP2?

You can also refer to TechNet article here for WSS rules:

Pre-upgrade scanning and reporting for future releases (Windows SharePoint Services)

There’s not enough detail in the document, so I borrowed their nice table and added my own comments here:

  • ServerInfo
    Description: All servers that are running SharePoint bits in the farm. Basically this is just a list of servers.

  • FarmInfo
    Description: The components of this farm. “Components” here means the number of servers, web apps, content DBs, and site collections in your farm. A sample report is here:
    1 servers
    3 web applications
    3 content databases, approximately total size = 108199936 bytes
    4 Site collections

  • UpgradeType
    Description: The upgrade types supported by the farm. For most server farms, two methods will be available: In-place Upgrade and Content Database Attach. Content Database Attach (also called DB Attach in some materials) is the recommended way to upgrade.

  • SiteTemplates
    Description: This farm uses the following site definitions. This rule will list all the site defs in the farm, sample here (WSS+Search Server):
    name = STS, language = 1033, template id = 1, count = 1, status = Internal
    name = MPS, language = 1033, template id = 2, count = 0, status = Internal
    name = CENTRALADMIN, language = 1033, template id = 3, count = 1, status = Internal
    name = WIKI, language = 1033, template id = 4, count = 0, status = Internal
    name = BLOG, language = 1033, template id = 9, count = 0, status = Internal
    name = OSRV, language = 1033, template id = 40, count = 1, status = Installed
    name = SRCHCENTERLITE, language = 1033, template id = 90, count = 1, status = Installed

  • Features
    Description: The features installed on the farm. This would be a big list for every feature you installed on the farm. Sample:
    Name = [S2SearchAdmin], Feature id = [2b1e4cbf-b5ba-48a4-926a-37100ad77dee], Reference count = [1], Scope = [Site], Status = [Installed]

  • LanguagePacks
    Description: The language packs required for the farm. If you have any other language packs installed on your farm, you will need to install the new SharePoint 2010 language packs after the upgrade process.

  • AAMURLs
    Description: AAM URLs within the current environment to be considered when upgrading. It will list all AAMs, sample:
    name = [Central Administration], zone = [Default], public Url = http://iws1:2000, internal Url = http://iws1:2000
    name = [SharePoint - 80], zone = [Default], public Url = http://iws1, internal Url = http://iws1
    name = [SharePoint - 80], zone = [Internet], public Url = http://www.mssearch.cn, internal Url = http://www.mssearch.cn

  • OSType
    Description: This server machine in the farm does not have the 64-bit edition of Windows Server 2008 or later installed. I assume that you already know the system requirements of SharePoint Server 2010; if you don’t, please refer to Richard’s post here: Announcing SharePoint Server 2010 Preliminary System Requirements

  • DatabaseSchema
    Description: Content databases are modified by user, and cannot be upgraded.
    Sometimes these things do happen, especially with an incorrect patch process. For example, I know of an admin who patched the farm database but didn’t patch the other servers in the farm, so they stopped working. What he did next was directly modify the database schema version back to an older one! You should NEVER do this. Direct modification of a SharePoint content DB should always be avoided.

  • DataOrphan
    Description: Content databases contain orphans. This is reported when items have no relationship with their parent. For example, corruption in a content DB can leave a site with no web, or a list with no parent. The STSADM operation databaserepair will be suggested to find and fix the errors.

  • SiteOrphan
    Description: Some sites cannot be referenced properly. Sometimes site collections are not in the site map, and these cannot be upgraded. This can happen when you have duplicate URLs/host headers. You can detach the content DB or delete the site collection to fix this.

  • UnfinishedGradualUpgrade
    Description: This farm is currently being upgraded by using the gradual upgrade process.
    If there are still some V2 sites (WSS v2 and SPS2003) inside the content DB that were not upgraded properly, you need to finish this process first.

  • MissingWebConfig
    Description: This Web site does not have a web.config file. This definitely is a problem, so you may need to copy a web.config there.

  • InvalidHostNames
    Description: Invalid host names found. This actually checks whether there are any references to “http://localhost”. You need to change these to something that makes sense.

  • InvalidServiceAccount
    Description: The application pool account must be fixed. “NT AUTHORITY\LOCAL SYSTEM” and “NT AUTHORITY\LOCAL SERVICE” should not be used as the app pool account.

  • DatabaseReadOnly
    Description: Databases in this farm are configured as read-only and the upgrade will fail unless they are configured as read-write. Of course.

  • WYukonLargeDatabase
    Description: Databases in this farm are hosted on the Windows Internal Database and are larger than 4 gigabytes. (Windows Internal Database uses SQL Server technology as a relational data store for Windows roles and features only, such as Windows SharePoint Services, Active Directory Rights Management Services, UDDI Services, Windows Server Update Services, and Windows System Resources Manager.)

  • WYukonLargeSiteCollection
    Description: Site collections in this farm are hosted on the Windows Internal Database and are larger than 4 gigabytes.

There are two additional rules for MOSS that check search-related settings: server names, content sources, number of indexed files, index size, search DB size, and so on. You can figure them out yourself.

Upgrade Checker in SP2 – prepare your way to SharePoint Server 2010

The upgrade checker, an stsadm operation in MOSS/WSS SP2, is very useful. It checks the server farm for system requirements and database health against a list of rules. The rules can also be extended.

To use upgrade checker, first open a command line prompt, and run

"X:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\BIN\STSADM.EXE" -o preupgradecheck

(X is the drive letter where you install SharePoint)

Make sure you run the command prompt as administrator. Otherwise access will be denied.

snap016

You can see a list of rules checked by the operation. I will write a separate post about the details of each rule.

SearchContentSourcesInfo
SearchInfo
ServerInfo
FarmInfo
UpgradeTypes
SiteDefinitionInfo
LanguagePackInfo
FeatureInfo
AamUrls
LargeList
CustomListViewInfo
CustomFieldTypeInfo
CustomWorkflowActionsFileInfo
ModifiedWebConfigWorkflowAuthorizedTypesInfo
ModifiedWorkflowActionsFileInfo
DisabledWorkFlowsInfo
OSPrerequisite
WindowsInternalDatabaseMigration
WindowsInternalDatabaseSite
MissingWebConfig
ReadOnlyDatabase
InvalidDatabaseSchema
ContentOrphan
SiteOrphan
PendingUpgrade
InvalidServiceAccount
InvalidHostName

A successful run could show the following:

 snap018

Hey, we got an “OSPrerequisite… Failed” here. So let’s take a look at the report.

The report will give you the following information:

Search content sources and start addresses 

Office Server Search topology information 

Servers in the current farm 

The components from this farm 

Supported upgrade types 

Site Definition Information 

Language pack information 

Feature Information 

Alternate Access Mapping Url(s) within the current environment that should be considered when upgrading. 

Lists and Libraries 

Customized field types that will not be upgraded

Windows SharePoint Services Search topology information 

And also the failed items it checked.

In my case, this machine is still on 32-bit Windows Server 2003, so it does not meet the requirements of SharePoint Server 2010, which needs to be installed on Windows Server 2008 x64.

Failed : This server machine in the farm does not have Windows Server 2008 or higher 64 bit edition installed.

Upgrading to Windows SharePoint Services 4.0 requires Windows Server 2008 or higher 64 bit edition.
Please upgrade the server machines in your farm to Windows Server 2008 64 bit edition, or create a new farm and attach the content from this farm. For more information about this rule, see KB article 954770 in the rule article list at http://go.microsoft.com/fwlink/?LinkID=120257.

snap045 

I will explain the detail of the checker in another post later.

Update: the post is here: http://blogs.msdn.com/opal/archive/2009/05/12/upgrade-checker-in-sp2-behind-the-scene.aspx

Install MOSS 2007 & WSS 3.0 on Windows Server 2008 R2 – you will need SP2 slipstream

The Windows Server 2008 R2 RC became available several days ago. You may ask: what if I want to install WSS/MOSS on Windows Server 2008 R2? Is that supported?

The answer: WSS/MOSS RTM and SP1 are not supported on WS2008R2, but with SP2 they are. If you try to run the installer without SP2 slipstreamed, it will be blocked and you cannot continue. Meanwhile, if you want to use SQL Server 2008, you will also need to apply SQL Server 2008 SP1 after installation.

snap014

So a slipstream build of WSS and MOSS SP2 is required. The WSS SP2 slipstream build can be found here: x86 x64. There’s no slipstream build for MOSS, so you need to create your own. Here’s a quick guide:

Remove everything inside the Updates folder of your MOSS installation directory. Download both the WSS and MOSS SP2 packages, extract them at the command line using the /extract:drive\path option, and then put all the extracted files into the Updates folder. Delete Wsssetup.dll; this is important, otherwise only WSS SP2 will be installed.

More details can be found on TechNet.

 

With SP2 slipstreamed, you can now run the installer without any problem. After installation, the site version will be 12.0.0.6421.

snap015 

Windows Server 2008 SP2 is also supported by MOSS/WSS SP2.

PDF iFilter Battle, second round

If you still remember the last round of our PDF iFilter battle, FoxIT won it. In this round, we bring in another challenger: TET PDF iFilter. It is also available for x86 and x64, is free for non-commercial desktop use, and needs a license for server installation.

So here's the new result for file set II:

 

iFilter | File Number | Total File Size (MB) | Avg File Size (MB) | Crawl Time (m:s) | Crawl Time (s) | Files Per Second | Success | Error
FoxIT | 2676 | 2406 | 0.90 | 7:46 | 466 | 5.74 | 2759 | 0
Adobe | 2676 | 2406 | 0.90 | 40:58 | 2458 | 1.09 | 2757 | 2
TET | 2676 | 2406 | 0.90 | 13:48 | 828 | 3.23 | 2752 | 0

 

I also obtained an archive of People's Daily from 2001 to 2006: ~20,000 PDF files, 13.4 GB in total. This set was tested on an 8-core Xeon box.

 

 

iFilter | File Number | Total File Size (MB) | Avg File Size (MB) | Crawl Time (h:m:s) | Crawl Time (s) | Files Per Second | Success | Error
FoxIT | 19890 | 13793 | 0.69 | 00:30:53 | 1853 | 10.73 | 19884 | 7
Adobe | 19890 | 13793 | 0.69 | 05:19:04 | 19144 | 1.03 | 19887 | 4
TET | 19890 | 13793 | 0.69 | 01:40:09 | 6009 | 3.31 | 19879 | 12

 

And a licensing comparison for production (USD):

iFilter | Desktop | Server | 1-2 Cores Per Server | 4 Cores Per Server | 8+ Cores Per Server
Adobe | Free | Free | Free | Free | Free
Foxit | Free | Not Free | 329.99 | 589.97 | 1109.93
TET | $119 for commercial usage | Not Free | 595 | 595 | 595

 

Summary

It is good to see another vendor join this market. TET showed good performance, although it is still behind Foxit. But because it is licensed per server rather than per core, it would cost less than Foxit on a typical two-way quad-core box.
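To put numbers on that last point, here is a small sketch using the server prices from the licensing comparison. The tier lookup is my own simplification: it assumes a server pays the smallest listed Foxit tier that covers its core count, which the price list itself does not spell out for core counts like 3 or 6.

```python
# Foxit server prices in USD: (maximum cores covered, price).
FOXIT_TIERS = [(2, 329.99), (4, 589.97)]
FOXIT_8_PLUS = 1109.93       # top listed tier; also used for 5-7 cores (assumption)
TET_PER_SERVER = 595.00      # flat price per server


def foxit_cost(cores):
    """Pick the smallest listed Foxit tier that covers the core count."""
    for max_cores, price in FOXIT_TIERS:
        if cores <= max_cores:
            return price
    return FOXIT_8_PLUS


cores = 2 * 4  # two-way quad-core box
print("Foxit:", foxit_cost(cores), "TET:", TET_PER_SERVER)
```

On a 2-core server the picture flips, and Foxit is the cheaper of the two.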

PDF iFilter Battle! FoxIT vs.. Adobe, 64bit version

After such a long time, Adobe has finally released a 64-bit version of its PDF iFilter!

http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025

“In response to customer requests, Adobe is releasing Adobe PDF iFilter 9 for 64-bit platforms, which will allow searching PDF files on Microsoft® Windows® 64-bit platforms for applications such as Microsoft Office SharePoint Server 2007, Microsoft Exchange Server 2007, and Microsoft SQL Server 2005.”

But, what about performance? How does it compare with FoxIT 64bit PDF iFilter?

My friend Deb Haldar did a performance test last year on the 32-bit iFilters. You can find the result here: FOXIT vs.. Adobe PDF IFilter [ 32-bit only ]. In short, the FoxIT 32-bit PDF iFilter is more than 4 times faster than the Adobe one.

Will the story change in the 64-bit age?

I picked two sets of PDF files. Set I contains ~1000 PDF files, 1.7 GB in total. Set II contains ~2600 files, 2.4 GB in total. The language mix is 30% Chinese and 70% US English. The hardware is a two-way dual-core Xeon at 3.4 GHz with 4 GB of RAM. SharePoint was patched with the October CU. Here are the results.

 

iFilter | File Set | File Number | Total File Size (MB) | Avg File Size (MB) | Crawl Time (m:s) | Crawl Time (s) | Files Per Second | Success | Error
FoxIT | I | 1041 | 1751 | 1.68 | 6:02 | 362 | 2.88 | 1064 | 0
Adobe | I | 1041 | 1751 | 1.68 | 30:03 | 1803 | 0.58 | 1063 | 1
FoxIT | II | 2676 | 2406 | 0.90 | 7:46 | 466 | 5.74 | 2759 | 0
Adobe | II | 2676 | 2406 | 0.90 | 40:58 | 2458 | 1.09 | 2757 | 2

On average, the FoxIT x64 PDF iFilter is still ~5 times faster than the Adobe one. But FoxIT charges 330 USD for a 2-core machine, while the Adobe PDF iFilter is free. So if PDF indexing is key to your business, go with FoxIT to get much better performance. If not, the Adobe PDF iFilter is fine for simple and basic needs.
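If you want to check the arithmetic behind the table, the derived columns follow directly from the file count and the crawl time. Here is a short sketch; the helper names are mine, and the figures are the ones from the result table above.

```python
def to_seconds(crawl_time):
    """Convert an 'm:s' or 'h:m:s' string to seconds."""
    seconds = 0
    for part in crawl_time.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def files_per_second(file_count, crawl_time):
    """Throughput rounded to two decimals, as in the table."""
    return round(file_count / to_seconds(crawl_time), 2)


# File count and crawl times (m:s) for each file set, from the table.
RESULTS = {
    "I": {"files": 1041, "FoxIT": "6:02", "Adobe": "30:03"},
    "II": {"files": 2676, "FoxIT": "7:46", "Adobe": "40:58"},
}

for name, row in RESULTS.items():
    speedup = to_seconds(row["Adobe"]) / to_seconds(row["FoxIT"])
    print("Set", name,
          "FoxIT files/s:", files_per_second(row["files"], row["FoxIT"]),
          "Adobe files/s:", files_per_second(row["files"], row["Adobe"]),
          "speedup: %.2f" % speedup)
```

The speedup comes out at roughly 5x for both file sets, which is where the “~5 times faster” figure comes from.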

Important: Check MS08-067 and Apply the Update!

This vulnerability is rated “Critical”, and nearly all Windows products are affected.

Although it was reported privately to Microsoft and no exploit code has leaked so far, it is always safer to take action immediately. If you don’t, hackers and worms might later be able to attack your machines through the RPC service from the Internet and take full control of them.

If automatic updates are turned on, you will receive the update now. Apply it and restart.

For IT Pros, you need to check this for details:

http://www.microsoft.com/technet/security/Bulletin/MS08-067.mspx

In case you need to download the files manually:

Operating System | Maximum Security Impact | Aggregate Severity Rating | Bulletins Replaced by this Update
Microsoft Windows 2000 Service Pack 4 | Remote Code Execution | Critical | MS06-040
Windows XP Service Pack 2 | Remote Code Execution | Critical | MS06-040
Windows XP Service Pack 3 | Remote Code Execution | Critical | None
Windows XP Professional x64 Edition | Remote Code Execution | Critical | MS06-040
Windows XP Professional x64 Edition Service Pack 2 | Remote Code Execution | Critical | None
Windows Server 2003 Service Pack 1 | Remote Code Execution | Critical | MS06-040
Windows Server 2003 Service Pack 2 | Remote Code Execution | Critical | None
Windows Server 2003 x64 Edition | Remote Code Execution | Critical | MS06-040
Windows Server 2003 x64 Edition Service Pack 2 | Remote Code Execution | Critical | None
Windows Server 2003 with SP1 for Itanium-based Systems | Remote Code Execution | Critical | MS06-040
Windows Server 2003 with SP2 for Itanium-based Systems | Remote Code Execution | Critical | None
Windows Vista and Windows Vista Service Pack 1 | Remote Code Execution | Important | None
Windows Vista x64 Edition and Windows Vista x64 Edition Service Pack 1 | Remote Code Execution | Important | None
Windows Server 2008 for 32-bit Systems* | Remote Code Execution | Important | None
Windows Server 2008 for x64-based Systems* | Remote Code Execution | Important | None
Windows Server 2008 for Itanium-based Systems | Remote Code Execution | Important | None

 

Although this is only a security fix for the OS, as a SharePoint developer/administrator you are always responsible for security issues. So let’s prevent problems before they happen.


Search Suggestions in IE8 with SharePoint/Search Server

IE8 Beta 2 has been out for a while. Although using it with SharePoint products is not supported yet (there’s no chance of supporting a beta product), you can still try it out. It introduces a couple of new features, such as WebSlices, Accelerators and Search Suggestions.

Search Suggestions! Wouldn’t it be cool to make your intranet SharePoint portal a Search Provider with this lovely suggestion feature?

By default, however, you can make SharePoint a Search Provider, but there is no way to add a suggestion feature.

So here it comes: an update to the “Search As You Type (SAYT)” CodePlex project, with Search Suggestions working in IE8! I also included a small green “S” logo icon file for it, all free :)

Since it takes time to update the CodePlex project, I’ll put the update here for now.

1. Install SAYT on your SharePoint Server/Search Server as instructed. Do some test searches, to make sure it works.

2. Download new update from here:

http://cid-8007edf5c56fc334.skydrive.live.com/self.aspx/Public/SharePointSearchIE8.zip

3. Extract the zip file, copy all files to the directory where you put GetInfo.aspx in the first step, and overwrite the existing files.

4. Modify ssprovider.xml as needed. Replace SharePointSearchCenter and SAYTUrl with your own ones.

5. Use IE8 to navigate to add.html and add the search provider.

6. Choose the green “S” provider and try it out!

You can add a sample provider from http://www.mssearch.cn:8099/add.aspx, and try typing “search server” to see the result.

2008-9-4 17-20-07

Disclaimer: This code is not supported by Microsoft. If you have problems, leave a comment here. The SAYT CodePlex project will be updated later to include this feature.

IE 8.0 Beta 2 with SharePoint, what’s the story?

Today Internet Explorer 8.0 Beta 2 was released. If you tried Beta 1 with Office SharePoint Server 2007 or Windows SharePoint Services 3.0, you may have noticed that some features, such as the Content Editor Web Part, did not function properly, even in IE7 mode.

But IE7 mode should behave the same as the original IE7 and not cause new problems. After several months of bug fixes and feature improvements, that problem is now gone.

There are also some changes between IE8 Beta 1 and Beta 2. For example, there is no IE7 mode button anymore; it has been replaced by a “Compatibility View” button next to the refresh button, and it is not turned on by default.

So what will happen if I use IE8 Beta 2 to access a SharePoint site?

That depends on the master page of your site. IE8 Beta 2 checks the DOCTYPE and meta tags to determine whether to use Compatibility View. If the page declares no DOCTYPE, IE8 Beta 2 uses this mode by default, and you will not see the button at all. This is the case for the default master pages that come with SharePoint Server, so everything should work by default.

It is quite common to use your own modified themes in master pages. So what if a DOCTYPE is present in the pages? Will it cause any problems?

Maybe. If you want to ensure everything works, I recommend adding a meta tag to your master pages:

<meta http-equiv="X-UA-Compatible" content="IE=7" />  (You can take a look at www.msn.com to get the idea)

When IE8 Beta 2 reads this line, it automatically switches to Compatibility View and everything renders correctly. This can be seen as a temporary solution. IE8 native mode will be supported in a future SharePoint v3 service pack.
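If you have many customized pages, you may want to check them in bulk for the two things IE8 Beta 2 looks at. Below is a minimal sketch using Python's standard html.parser. It is meant to be run against the rendered HTML of a page (not the raw .master file with its ASP.NET directives), and the class and function names are my own.

```python
from html.parser import HTMLParser


class CompatibilityScanner(HTMLParser):
    """Record whether a page has a DOCTYPE and an X-UA-Compatible meta tag."""

    def __init__(self):
        super().__init__()
        self.has_doctype = False
        self.ua_compatible = None

    def handle_decl(self, decl):
        if decl.lower().startswith("doctype"):
            self.has_doctype = True

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        http_equiv = (attributes.get("http-equiv") or "").lower()
        if tag == "meta" and http_equiv == "x-ua-compatible":
            self.ua_compatible = attributes.get("content")


def scan(html):
    """Return (has_doctype, X-UA-Compatible content or None)."""
    scanner = CompatibilityScanner()
    scanner.feed(html)
    return scanner.has_doctype, scanner.ua_compatible


page = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
        '<html><head><meta http-equiv="X-UA-Compatible" content="IE=7" />'
        '</head><body></body></html>')
print(scan(page))
```

A page that reports a DOCTYPE but no X-UA-Compatible value is the case worth testing in IE8 Beta 2.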


Troubleshooting Windows Search CPU Consuming Problem

I have heard a lot of complaints from my friends and the community that desktop search interferes with their daily use of the computer. These applications, such as Windows Search (also known as WDS, Windows Desktop Search, and the built-in search engine in Vista/2008) and Google Desktop Search, index your files while the computer is idle. In theory, this should not affect your PC’s performance. However, sometimes you may notice that CPU usage goes up to 100% (or 50% on a dual-core system, with one core at 100%) for quite a few minutes, even when you are busy working in some application. This not only degrades the performance of your current applications, but also shortens battery life if you are on a laptop, and causes annoying fan noise.

So it is really a nasty problem. When you bring up Task Manager, you can see that something related to Search is eating your CPU. In my experience, the process name is SearchFilterHost.exe. Let’s take a look at it with Sysinternals’ Process Explorer to understand the relationship between the processes.

snap099

This is a child process of Windows Search. Without thinking much, I killed the process; after a short while it restarted, again with 50%+ CPU usage. Nasty. It’s quite hard to diagnose a Windows Search problem because it does not tell you what it is working on (except to those who can read minidumps and trace into the process, who are really a minority among us IT guys). But from its name I know it is a filter host, and together with the other child process in this tree, I can tell that the two processes are doing the search work: one handles the protocol of the file, the other the filter of the file.

The filter daemon is a common part of the Microsoft Search architecture. SharePoint, SQL and Windows Search use this daemon to load iFilters and extract information from different types of files. If this process takes a lot of CPU, it is quite possible that an iFilter is having a problem, most likely because it encountered something it cannot process.

Let’s take a look at the threads of SearchFilterHost. You can see that one of the threads, in RPCRT4.dll, is consuming the CPU. RPC stands for Remote Procedure Call; this is another piece of evidence.

snap100

Now I suspect that a corrupt or malformed file is causing the problem. Because it is corrupt, the iFilter might not be able to process it correctly, and the busy loop drains all the processing power of the core. But with no activity log from Windows Search, how can I find out which file caused the problem?

Process Monitor is the tool this time. It is also from Sysinternals, as a combination of and replacement for FILEMON/REGMON. Run it with a filter that includes all related processes, monitor only file activity, and wait for the problem to reappear.

snap102

After a short while, CPU usage goes up again. Stop the capture, and take a look at the log.

snap103

Only SearchIndexer is working, and this has already lasted for quite a while. This is abnormal, because it should load the protocol daemon and filter daemon to process different files. It is another piece of evidence pointing to a corrupt file. Now scroll up and try to find the last file it accessed.

snap101

Now it is clear: the indexer loaded “KurzfassungvonInhalt.docx” into memory and got stuck there for a few minutes. That file should be the root cause of the problem.

At first I had no idea why this file was on my hard disk. Then I remembered that it was sent to me by a friend in Germany; she told me she had a Word document that she could not open any more, and asked me to try to fix it if possible. If you open this file in Word, you will see an error message.

snap105

I removed the file, and the CPU usage problem went away.

In a similar case, I observed that some doc files produced by WPS Pro edition (an Office clone in China; its personal edition does not have the problem) caused the same problem. These files can be opened in MS Word, but cannot be processed by the iFilter. I have no experience with doc files from OpenOffice, but these cases might help you identify the cause if you are suffering from the same problem.

Process Explorer and Process Monitor can be downloaded from http://technet.microsoft.com/en-us/sysinternals/default.aspx, or www.sysinternals.com. Don’t capture too many events with Process Monitor at one time, otherwise you will run out of RAM.
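Scrolling through a long capture by hand gets tedious, so here is a rough sketch that finds the last file a process touched in a Process Monitor log saved as CSV. It assumes the export has “Process Name” and “Path” columns; check the header row of your own export. The sample rows and paths are invented for the illustration (only the file name comes from my case), and last_file_touched is my own helper name.

```python
import csv
import io


def last_file_touched(csv_text, process_name):
    """Return the Path of the last event logged for the given process."""
    last_path = None
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Process Name"].lower() == process_name.lower() and row["Path"]:
            last_path = row["Path"]
    return last_path


# Invented sample in the assumed export layout.
SAMPLE = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:00:01","SearchIndexer.exe","1234","ReadFile","C:\Docs\notes.txt","SUCCESS",""
"10:00:02","SearchIndexer.exe","1234","ReadFile","C:\Docs\KurzfassungvonInhalt.docx","SUCCESS",""
"10:00:03","explorer.exe","900","ReadFile","C:\Docs\other.txt","SUCCESS",""
'''

print(last_file_touched(SAMPLE, "SearchIndexer.exe"))
```

For a real capture, read the exported file and pass its text to the same function.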

SharePoint Search - Lotus Notes Indexing Best Practice

Many people have asked for best practices or a guide to properly maintaining the Lotus Notes indexing function in SharePoint Search. So here it is. This is not an official guide, but our experience from several large customers. I will write it in a Q/A format, so you can jump to the question that applies to your problem.

Q1. How many Lotus Notes content sources can I crawl at the same time?
A1: One content source per Domino server. If all of your content is on a single Domino server, you have to crawl the content sources one by one. But if you have several Domino servers to index, you can index them at the same time. This is a limitation of the IBM Lotus Notes C++ API, so you may need to schedule the crawls of these content sources carefully.

Q2. How many Lotus Notes content sources should I crawl at the same time?
A2: The only difference from the first question is CAN versus SHOULD. There should be a limit on this number, but what is it? I don’t have a direct answer, because the number depends on many factors: hardware performance, memory usage, network latency and bandwidth, and so on. For recent hardware with 8 GB of RAM, I would recommend 3, with scheduled memory recycling, which we will talk about later.

Q3. I have a Notes database indexed, but how come a full crawl takes nearly the same time as an incremental crawl?
A3: During an incremental crawl, the SharePoint search engine checks the LastModifiedTime property of the target documents/items to determine whether the target object should be fully retrieved into the index. For certain content sources, however, this property is not retrieved, or is mapped to something else by mistake, so the engine has to fetch all the content to check whether anything has changed. I’m looking into a possible solution for this problem, and will post an update if I find something.

Q4. Should I use x86 or x64 for Lotus Notes indexing?
A4: Because of a limitation of the IBM Notes C++ API, the Notes Protocol Handler can only run on an x86 box. However, you can still use x64 query servers and WFEs. Remember: x64 and x86 boxes should not be mixed within the same tier, but you can have an x86 indexer tier with x64 query and x64 WFE tiers; this is the recommended setup for Notes search in SharePoint 2007/Search Server 2008. (IBM recently released an x64 version of their API, but it’s impossible to make the current NotesPH work with it because many things changed.)

Q5. You mentioned memory recycling. What does that mean?
A5: Due to x86 limitations, the memory per process is limited. And because we are calling the Notes client through the API, it’s quite possible that the MSSEARCH/MSSDMN processes will hit the memory limit after crawling a large number of documents. So I recommend recycling these processes at regular intervals. This can prevent the crawl from getting stuck. To do this, you might need to write your own scheduling program with the SharePoint search administration APIs and restart the osearch service when needed. I will also add this function to SharePoint Search Admin 0.81 and later in a few days.

Q6. Any ideas about security trimming support? What should I do on the Domino side?
A6: You can use Lotus Notes users and groups to control security, and map them to AD users to achieve security trimming of search results in SharePoint. But it is generally advised not to use Lotus Notes Roles for security control, as there is no corresponding concept in Active Directory.

Q7. To be added.
A7:

Btw, I’m moving to a new position in IW PMG, as a Technical Product Manager driving SharePoint IT Pro readiness. So in the future, more topics like SharePoint Governance will appear on this blog :).

SharePoint Search Admin 0.80b – Now with Keywords and Bestbets Backup and Restore Feature

It has been a long time since the last update. In this version of SharePoint Search Admin, I added a very popular request: Keywords and Bestbets.

I marked it as a “Beta” because there are still some things I didn’t finish due to time constraints, so I disabled the “Add Keyword” and “Add Bestbet” buttons. They are not the focus, because adding them through SSA cannot provide a better experience than the default SharePoint UI.

Let’s take a look at the interface.

snap087

Still ugly :) I promise I’ll improve it.

With this new feature you can remove an individual keyword, remove all keywords with a single click, and back up and restore the whole keywords/bestbets collection in a few seconds.

Here is a sample XML file generated by the SSA keywords backup:

snap086

Why does the clock under Windows 2K/XP/2K3 show “four quick seconds and one slow second”?

If you click the clock in the bottom-right corner of Windows, it opens a small animated clock that shows the seconds. But if you watch it carefully for a few moments, you will notice that four seconds in a row are the same length, while the fifth one is much longer. What happened?

The reason is quite interesting. As we know, in Windows C++ code we use WM_TIMER messages to trigger many timed events. The resolution of WM_TIMER is about 50 ms, which is not good enough for time-critical real-time processing. But for a clock, it is enough. The clock application uses the following code to set the timer (you can also find this code in the VC98 CD-ROM samples, if you still have them, which I doubt):

SetTimer (hWnd, TimerID, OPEN_TLEN, 0L);

OPEN_TLEN is the length of the timer interval, and it is a constant. If you look at clock.h, you will find its value, which is 450.

What does this 450 mean? It means the timer is triggered every 450 ms; each time, it detects whether the time has changed and redraws the clock.

So we can draw a small table to make it clear. The first row is the trigger number, and the second row is the elapsed time in milliseconds:

Trigger      1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22
Time (ms)  450  900 1350 1800 2250 2700 3150 3600 4050 4500 4950 5400 5850 6300 6750 7200 7650 8100 8550 9000 9450 9900

You will notice that within the first second the timer is triggered twice (at 450 and 900 ms), so the clock is updated on the third trigger (1350 ms). The same holds for the 2nd, 3rd and 4th seconds. But within the fifth second the timer is triggered three times (at 4050, 4500 and 4950 ms), and the clock is not updated until the fourth trigger (5400 ms).

So this is why the fifth second takes much longer than the other four: the second hand moves at 1350, 2250, 3150 and 4050 ms, 900 ms apart, and then not again until 5400 ms, a gap of 1350 ms.
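The arithmetic is easy to reproduce with a short simulation. This is only a sketch of the reasoning, not the clock’s real code: it assumes the timer starts exactly on a second boundary, fires every 450 ms without jitter, and that the handler redraws whenever the whole-second value has changed. The function name `update_times` is made up for this example.

```python
TIMER_MS = 450  # OPEN_TLEN from clock.h, as quoted above


def update_times(ticks, period_ms=TIMER_MS):
    """Return the tick times (ms) at which the displayed second changes."""
    shown = 0      # second currently drawn on the clock face
    updates = []
    for n in range(1, ticks + 1):
        now = n * period_ms
        second = now // 1000
        if second != shown:  # the handler sees a new second and redraws
            shown = second
            updates.append(now)
    return updates


times = update_times(22)
gaps = [later - earlier for earlier, later in zip(times, times[1:])]
print(times)  # [1350, 2250, 3150, 4050, 5400, 6300, 7200, 8100, 9000]
print(gaps)   # [900, 900, 900, 1350, 900, 900, 900, 900]
```

Under these assumptions, most displayed seconds last 900 ms and every so often one lasts 1350 ms, which matches the “quick, quick, quick, slow” rhythm you see on screen.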

Why does this have to be 450? Well, I’m neither a developer nor a program manager, so I don’t know the reasoning behind the decision. But I guess it was a trade-off between performance and resources. After all, how many people use that clock as an accurate timer?

BTW – the clock application under Vista/2K8 is completely rewritten, so you may not have that problem. But if you watch it for a minute, you will still notice a quite “quick” second. :)

Posted by opal | 2 Comments

Crash when loading/encoding video by Windows Media Encoder

If you are a Zune lover and want to use free tools to make high-quality WMVs for your Zune, you may have run into the same problem as me: after you make an AVI backup of your DVD and try to convert it to WMV to put it on your Zune, wmencoder just crashes and reports something like “faulting module wmenceng.dll” in the event log. (BTW, the Zune player program does not do an excellent job with my video collection, especially the old files. I used to be a video encoding fan years ago.)

Whether you use the GUI or the command-line script (WMCmd.vbs), the result is the same.

So what happened?

The problem is caused by an incompatibility between wmencoder and your video/audio codec. Some third-party DirectShow decoders have poorly written interfaces and therefore produce abnormal streams that wmencoder cannot identify. (You may ask why it crashes instead of reporting an error; well, sometimes a crash is the better way to prevent further attacks.) Most of the time, this is caused by certain versions of “ffdshow”.

So if you are suffering from this problem, first reinstall ffdshow or find another build of it, or just change the settings in the ffdshow configuration interface so that it won’t decode certain streams, such as mp3.

Then, problem solved.

And if you are on Vista, don’t forget to apply the Vista patch for Windows Media Encoder, otherwise it won’t work at all.

Posted by opal | 0 Comments

“Evil” way to federate search results through a password protected proxy

In real-world environments, companies sometimes use a password-protected proxy to give employees access to the Internet. Most of the time it uses basic authentication. In this kind of environment, the federated search web part of Microsoft Search Server 2008 will not work out of the box, because we only support proxies that are not password protected.

But is there a workaround?

Yes; otherwise, why would I be talking about it?

By “evil” I’m not referring to the definition used by a certain “not evil” company. My “evil” always means some kind of trick or hack, and you will love them because they really solve problems. BTW, in MMO RPGs I’m always a chaotic neutral character, but I like the evil ones; that describes me best.

The idea is to build a data tunnel through the password-protected proxy, so that we can map the external website to a local port and federate the search results from there. There are several applications that can do the job; here we will use HTTPort as an example.

Here are the steps!

1. Get a copy of HTTPort from www.htthost.com (the newest version is HTTPort 3.SNFM) and install it.

2. In the proxy configuration window, fill in your proxy server name and port, check “Proxy requires authentication”, and then enter the username and password you use to access this proxy.

snap047

3. Find the domain name of the RSS feed website you want to federate. In this example, we are using Live Search China, whose domain name is “cnweb.search.live.com”.

snap058

4. Click the Port mapping tab of HTTPort and add a new port tunnel. Fill in a local port (for example, 991), then fill in the remote host name and port.

snap054

5. Switch back to the Proxy tab and press the Start button in the lower right corner.

snap053

6. Check whether it works in your browser: replace the domain name of your RSS feed with 127.0.0.1:991. If everything goes well, and you are lucky enough, the RSS feed will be there and you can make Search Server federate it through this new local URL!

snap056

7. Just a note: not every service can be federated like this. If the target website performs additional security checks (Yahoo search, for example), the RSS feed cannot be fetched through such a tunnel. In that case you have to consider other ways, or spend some time improving this evil hack :).

HTTPort is free software written by Dmitry Dvoinikov.
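The URL swap in step 6 can also be checked from a script instead of a browser. The sketch below only rewrites the host part of a feed URL so that it points at the local end of the tunnel and then fetches it. The helper names and the feed path in the usage note are hypothetical; only the domain name and port 991 come from the steps above.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def to_tunnel_url(feed_url, local_port, local_host="127.0.0.1"):
    """Swap the host of feed_url for the local end of the port mapping."""
    parts = urlsplit(feed_url)
    # Only the network location changes; the scheme, path and query
    # string are passed through the tunnel untouched.
    return urlunsplit(parts._replace(netloc=f"{local_host}:{local_port}"))


def fetch_through_tunnel(feed_url, local_port, timeout=10):
    """Download the feed via the local tunnel port and return the raw bytes."""
    with urlopen(to_tunnel_url(feed_url, local_port), timeout=timeout) as response:
        return response.read()
```

For example, `to_tunnel_url("http://cnweb.search.live.com/results.aspx?q=sharepoint&format=rss", 991)` gives `http://127.0.0.1:991/results.aspx?q=sharepoint&format=rss` (the path and query here are made up for illustration). That rewritten address is the kind of URL you would then hand to Search Server as the federated location.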
