Leonid Lyublinski released the second major version of Faceted Search today. It is a great tool that can instantly improve the user experience of your SharePoint Server or Search Server.
It provides the following benefits:
and in version 2, he enhanced:
I suggest that everyone who uses SharePoint and Search Server install this free, open source tool. It's really great.
http://www.codeplex.com/FacetedSearch
Live demo (lower right corner):
http://www.wssdemo.com/Search/Pages/results.aspx?k=search
THE BOSS said: MOSS should not be called MOSS; it should only be referred to as Microsoft Office SharePoint Server, or SharePoint Server for short.
It's really strange, and it's hard to change my habit of typing MOSS and SPS. I think this change may be good for newbies to SharePoint: they don't know what MOSS is or what SPS is. Even a Microsoft employee may not understand the difference. One of the marketing managers in my team always uses the term "SPS" in her customer invitation letters. I had to explain to her every time that "SPS" stands for "SharePoint Portal Server 2003" and "MOSS" for "Microsoft Office SharePoint Server 2007", but she could never remember, or I guess she just didn't want to remember. So this change may help in such situations.
But what about Microsoft Search Server? I think those had better remain MSS and MSSE.
And... my CodePlex project will change its name to SharePoint Search Admin!
http://www.codeplex.com/searchadmin
How foolish I am...
1. On the MOSS box, CPU usage stays very high for several hours. The target system may also suffer from low performance.
This can happen in several situations, especially after you have changed the crawler impact rules. By default, MOSS/MSS requests 8 files at a time from a single server; you can raise this to 64 at most. But remember: although this sometimes helps crawl speed, it hurts the performance of both MOSS and the target systems. So unless there is a special need, do not set this value too high. For low-performance servers, you may want to increase the interval between two file requests instead.
Meanwhile, crawl schedules should be adjusted so that target systems are not impacted during business hours.
2. The crawl takes too long. Only ~30,000 files can be crawled per hour.
Check for the bottleneck first. You can use a monitoring program to watch bandwidth, CPU usage, SQL box performance... But don't forget to check your NIC. Let's say you have a 100 Mbit/s connection to the intranet. On average you get 8~10 MB per second, which means 480~600 MB per minute, or roughly 29~36 GB per hour. Allowing for other factors, the real figure is somewhat under 30 GB per hour.
Then take a look at the content you are crawling. If the average file size is about 1 MB, which is very common for a mixed set of PPT/DOC/XLS files, you can of course only crawl about 30,000 files per hour.
So increasing your network bandwidth is a key to crawl speed.
Sometimes there is nothing wrong with the MOSS box and nothing wrong with your network bandwidth; the target system is simply too slow, for example an old Domino server. In this case please refer to point 1.
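The bandwidth arithmetic in point 2 can be written down as a quick back-of-the-envelope calculation. This is only a rough sketch in Python: the function name and the efficiency factor are my own assumptions for illustration (0.64 corresponds to about 8 MB/s usable on a 100 Mbit/s link), and it ignores CPU, SQL and target-system limits entirely.

```python
def crawl_files_per_hour(link_mbit_per_s, efficiency, avg_file_mbytes):
    """Estimate how many files per hour fit through one network link.

    efficiency is an assumed fraction of the raw link speed that is
    actually usable for file content (hypothetical, tune for your network).
    """
    # Convert the link speed from megabits to megabytes per second (8 bits per byte).
    raw_mbytes_per_s = link_mbit_per_s / 8
    # Scale down by the assumed efficiency to allow for protocol overhead.
    usable_mbytes_per_s = raw_mbytes_per_s * efficiency
    # One hour is 3600 seconds.
    mbytes_per_hour = usable_mbytes_per_s * 3600
    return mbytes_per_hour / avg_file_mbytes


if __name__ == "__main__":
    # 100 Mbit/s link, about 8 MB/s usable, 1 MB average file size.
    estimate = crawl_files_per_hour(100, 0.64, 1.0)
    print(f"about {estimate:,.0f} files per hour")
```

With these inputs the estimate comes out a little under 30,000 files per hour, which matches the figure above. Changing only the link speed shows how directly the ceiling depends on bandwidth.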
To be continued...
*If you are looking for something to transcode audio, please search for LAME or BeSweet. You can also take a look at one of my old works: http://paradiso.cn/converter/any2wav.htm
*If you are looking for command-line video encoding, please try mencoder, ffmpeg, etc. You can look for help in the Doom9 forum.
The tool I talk about here is only for TEXT encoding problems.
Well, this is a pretty simple and stupid tool. It contains no more than 10 lines of useful C# code, and the performance is not very good. But sometimes, when you have to deal with stupid problems, you need a tool like this. I like GNU's iconv, but there's no good port for Windows.
So I had to write one for my own use.
http://cid-8007edf5c56fc334.skydrive.live.com/self.aspx/Public/ec.rar
Usage: ec inputfile outputfile [input Encoding] [output Encoding]
No wildcard support, but you can use a simple trick in the command shell.
For example, to convert all XML files in every sub-directory from GB2312 to UTF8, type the following (inside a batch file, write %%i instead of %i):
for /R %i in (*.xml) do (ec "%i" "%i" GB2312 UTF8)
Then, job done.
Another way is to use PowerShell.
PS C:\temp> $a = type gb2312.txt
PS C:\temp> out-file -filepath utf8.txt -inputobject $a -encoding utf8
It's also very easy, but sometimes you cannot control the whole process...
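For readers without either tool at hand, the same idea (decode with the input encoding, encode with the output encoding, walk the sub-directories) can be sketched in a few lines of Python. This is not the source of ec, just an illustration; the function names are made up for this example, and it assumes each file fits in memory and really is in the stated input encoding.

```python
from pathlib import Path


def convert_encoding(src, dst, src_encoding="gb2312", dst_encoding="utf-8"):
    """Re-encode one text file; src and dst may be the same path."""
    # Read everything as bytes first, so converting a file in place is safe
    # and the original line endings are preserved.
    text = Path(src).read_bytes().decode(src_encoding)
    Path(dst).write_bytes(text.encode(dst_encoding))


def convert_tree(root, pattern="*.xml", src_encoding="gb2312", dst_encoding="utf-8"):
    """Re-encode every matching file under root, including sub-directories."""
    converted = []
    for path in sorted(Path(root).rglob(pattern)):
        convert_encoding(path, path, src_encoding, dst_encoding)
        converted.append(path)
    return converted
```

Calling convert_tree(".") from the top directory does roughly what the for /R loop does. A file that is not valid in the input encoding raises UnicodeDecodeError instead of being silently damaged. Note that an XML declaration such as encoding="GB2312" inside the file is not rewritten by this sketch.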