January, 2008

  • Jie Li's GeekWorld

    Faceted Search 2.0 for SharePoint/Search Server


    Leonid Lyublinski released the second major version of Faceted Search today. It is a great tool that can instantly improve the user experience of SharePoint Server and Search Server!

    It provides the following benefits:

  • Grouping search results by facet
  • Displaying a total number of hits per facet value
  • Refining search results by facet value
  • Update of the facet menu based on refined search criteria
  • Displaying the search criteria in the Bread Crumbs
  • Ability to exclude the chosen facet from the search criteria
  • Flexibility of the Faceted search configuration and its consistency with MOSS administration

    In version 2, he added the following enhancements:

  • Multi-threaded processing. The first thread runs synchronously for up to 500 facets, while the second thread runs asynchronously against up to ~30,000 facets
  • Client-side refresh (not AJAX based) that updates only the Facets web part without a page refresh
  • Web part connection to pass Facet Settings to the Bread Crumbs
  • Extended facet schema that now supports:
    • Facet icons. A default icon per facet name, complemented by an icon per facet value
    • Friendly names for facet values
    • Built-in wildcard match, e.g. FileExtension="ASPX?ID%" will match all "ASPX?ID=" records (useful for exclusions)
    • Exclusions. A facet can be excluded when its values match a pattern
    • Improved search syntax, with added support for sentences and quoted phrases

    I would suggest that everyone who uses SharePoint or Search Server install this free, open source tool. It's really great.


    Live demo (lower right corner)


    No more MOSS or SPS, only SharePoint Server


    THE BOSS said: MOSS should not be called MOSS; it should only be referred to as Microsoft Office SharePoint Server, or SharePoint Server for short.

    It feels really strange, and it's hard to change my habit of typing MOSS and SPS. I think this change may be good for newcomers to SharePoint, who don't know what MOSS and SPS are. Even a Microsoft employee may not understand the difference. One of the marketing managers on my team always uses the term "SPS" in her customer invitation letters. I had to explain to her every time that "SPS" stands for "SharePoint Portal Server 2003" and "MOSS" for "Microsoft Office SharePoint Server 2007", but she couldn't remember at all, or I guess she just didn't want to remember. So this change may help in such situations.

    But what about Microsoft Search Server? I think it's better for those to remain MSS and MSSE.

    And...my CodePlex project will change its name to SharePoint Search Admin!


    How foolish I am...

    What may happen when I crawl MILLIONS of files in MOSS/MSS? Part I


    1. On the MOSS box, CPU usage stays very high for several hours. The target system may also suffer from low performance.

    This can happen in several situations, especially after you change the crawl impact rules. By default, MOSS/MSS requests 8 files at a time from a single server; you can raise this to 64 at most. But remember, although this can sometimes help crawl speed, it hurts the performance of both MOSS and the target systems. So unless there is a special need, do not set this value too high. For low-performance servers, you may want to increase the interval between two file requests.

    Meanwhile, crawl schedules should be adjusted so that the target system is not impacted during business hours.
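
To make the trade-off behind the crawl impact settings concrete, here is a minimal Python sketch of a throttled fetch loop. It is not how MOSS/MSS is implemented; `crawl`, `fetch`, `simultaneous_requests`, and `delay_seconds` are hypothetical names chosen for illustration. The default of 8 comes from the text above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def crawl(paths, fetch, simultaneous_requests=8, delay_seconds=0.0):
    """Fetch every path with at most `simultaneous_requests` in flight.

    `fetch` is a caller-supplied function (hypothetical) that retrieves one
    document. `delay_seconds` is an optional pause after each fetch, which
    eases the load on a slow target at the cost of crawl speed.
    """
    def polite_fetch(path):
        result = fetch(path)
        time.sleep(delay_seconds)
        return result

    # The pool size caps how many requests hit the target at the same time.
    with ThreadPoolExecutor(max_workers=simultaneous_requests) as pool:
        # pool.map returns results in the same order as `paths`.
        return list(pool.map(polite_fetch, paths))
```

A larger `simultaneous_requests` finishes sooner but loads both machines more heavily; a larger `delay_seconds` does the opposite.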

    2. The crawl takes too long. Only ~30,000 files can be crawled per hour.

    Check for the bottleneck first. You can use monitoring programs to watch the bandwidth, CPU usage, SQL box performance... But don't forget to check your NIC. Let's say you have a 100 Mbit/s connection to the intranet. On average you can get 8~10 MB per second, which means 480~600 MB per minute, or 29~36 GB per hour. Taking other factors into account, the real figure is somewhat under 30 GB.

    Then take a look at the content you are crawling. If the average size of your files is about 1 MB, which is very common for a mixed set of PPT/DOC/XLS files, you can of course only crawl about 30,000 files per hour.
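
The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 8~10 MB/s throughput and the 1 MB average file size are the figures from the text, and `files_per_hour` is a hypothetical helper name.

```python
def files_per_hour(throughput_mb_per_s, avg_file_mb):
    """Upper bound on files crawled per hour when the network is the bottleneck."""
    mb_per_hour = throughput_mb_per_s * 3600  # 3600 seconds in an hour
    return mb_per_hour / avg_file_mb


for throughput in (8, 10):
    # 8 MB/s -> 28800 files/hour, 10 MB/s -> 36000 files/hour at 1 MB per file
    print(throughput, "MB/s ->", round(files_per_hour(throughput, 1.0)), "files/hour")
```

These are upper bounds; protocol overhead and processing time on both ends push the real number lower.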

    So increasing your network bandwidth is key to crawl speed.

    Sometimes nothing is wrong with the MOSS box and nothing is wrong with your network bandwidth; the target system is simply too slow, for example an old Domino server. In that case, please refer to point 1.

    To be continued...

    A Command Line Encoding Converter in .NET


    *If you are looking for something to transcode audio, please search Google for LAME or BeSweet. You can also take a look at one of my old works: http://paradiso.cn/converter/any2wav.htm

    *If you are looking for command-line video encoding, please try mencoder, ffmpeg, etc. You can look for help in the Doom9 forum.

    The tool I talk about here is only for TEXT encoding problems.

    Well, this is a pretty simple and stupid tool. It contains no more than 10 lines of useful C# code, and its performance is not very good. But sometimes, when you have to deal with stupid problems, you need such a tool. I like GNU's iconv, but there's no good port for Windows.

    So I had to write one for my own use.


    Usage: ec inputfile outputfile [input Encoding] [output Encoding]
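
For readers who want to see the idea, here is a rough Python equivalent of what such a tool does. It is not the C# source of ec; the function names and the default encodings (GB2312 in, UTF-8 out) are assumptions made for this sketch.

```python
def convert(input_path, output_path, input_encoding="gb2312", output_encoding="utf-8"):
    """Re-encode a text file.

    The whole input is read before the output is opened, so input_path and
    output_path may point to the same file. newline="" keeps line endings
    exactly as they are.
    """
    with open(input_path, "r", encoding=input_encoding, newline="") as src:
        text = src.read()
    with open(output_path, "w", encoding=output_encoding, newline="") as dst:
        dst.write(text)


def main(argv):
    """argv: inputfile outputfile [input encoding] [output encoding]"""
    if len(argv) < 2:
        print("Usage: inputfile outputfile [input Encoding] [output Encoding]")
        return 1
    convert(*argv[:4])
    return 0
```

Calling `main(["in.xml", "out.xml", "GB2312", "UTF8"])` would convert one file; the file names here are placeholders. Reading the whole file into memory keeps the code short but is slow for very large files.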

    There is no wildcard support, but you can work around that with a simple trick in the command shell.

    For example, to convert all XML files in every subdirectory from GB2312 to UTF8, type the following at the command prompt (inside a batch file, write %%i instead of %i):

    for /R %i in (*.xml) do (ec "%i" "%i" GB2312 UTF8)

    Then, job done.

    Another way is to use PowerShell.

    PS C:\temp> $a = type gb2312.txt
    PS C:\temp> out-file -filepath utf8.txt -inputobject $a -encoding utf8

    It's also very easy, but sometimes you cannot control the whole process...
