Engineering Windows 7

Welcome to our blog dedicated to the engineering of Microsoft Windows 7

Windows Desktop Search

One of the points of feedback has been about disabling services and optionally installing components—we’ve talked about our goals in this area in previous posts.  A key driver around wanting this type of control (but not the only driver) is a perception around performance and resource consumption of various platform components.  A goal of Windows is to provide a reliable and consistent platform for developers—one where they can count on system services as being available, as well as a set of OS features that all customers have the potential to benefit from.  At the same time we must do so in a way that is efficient in system resource usage—efficient enough so the benefit outweighs the cost.  We recognize that some percentage of customers believe solving this equation can only be done manually—much like some believe that the best car performance can only come from manual transmission.  For this post we’re going to look into the desktop search functionality from the perspective of the work we’re doing as both a broadly available platform component and to provide the rich end-user functionality, and also look at the engineering tradeoffs involved and techniques we use to build a great solution for everyone.  Chris McConnell, a principal SDE on the Find and Organize team, contributed this post.  --Steven

Are you one of those folks who believes that search indexing is the cause of your drive light flashing like mad? Do you believe this is the reason you’re getting skooled when playing first person shooters with friends? If so, this blog post is for you! The Find and Organize team owns the ‘Windows Search’ service, which we simply refer to as the ‘indexer’. A refrain that we hear from some Vista power-users is they want to disable the indexer because they believe it is eating up precious system resources on their PC, offering little in return. Per our telemetry data, at most about 1.5% of Vista users disable the indexing service, and we believe that this perception is one motivator for doing so.

The goal of this blog post is to clarify the role of the indexer and highlight some of the work that has been done to make sure the indexer uses system resources responsibly. Let’s start by talking about the function of the indexing service – what is it for? why should you leave it running?

Why Index?

Today’s PCs are filled with many rich types of files, such as documents, photos, music, videos, and so on. The number of files people have on their PC is growing at a rapid pace, making it harder and harder for them to find what they’re looking for, no matter how organized their files may (or may not) be. Increasingly, these files contain a good deal of structure, with metadata properties which describe their contents. A typical music file contains properties which describe the artist, album name, year of release, genre, duration of the song, and others which can be very useful when searching for music.

Although search indexing technologies date back to the early days of Windows, with Windows Vista Microsoft introduced a consumer operating system that brought this functionality to mainstream users more prominently. Prior to Vista, searching was pretty rudimentary – often a brute force crawl through the files on your machine, looking only at simple file properties such as file name, date modified, and size, or an application specific index of application specific data. Within Windows, a more comprehensive search option allowed you to also examine the contents of the files, but this wasn’t widely used. It was fairly basic functionality – it treated all files just the same, without tapping into the rich metadata properties available in the files.

In Windows Vista, the indexing service is on by default and includes expanded support in terms of the number of file formats and properties which are indexed. The indexer watches specific folders on your PC and catalogues their contents to facilitate fast searching of those files. When Windows indexes your music files, it also knows how to extract the music-specific properties which you’re most likely to search for. This enables more powerful searches and richer views over your files that weren’t possible before. But this indexing doesn’t come free, and this is where engineering gets interesting. There’s a non-zero cost (in terms of system resources) that has to be paid to enable this functionality, and there are trade-offs involved in when and how you pay that price. This is not unique to indexing; all features have this cost-benefit tradeoff.

Trade-Offs

Many search solutions follow(ed) the traditional “grep” model, which means every search reads all of the files you want to search. In this case, you paid with your time as you waited for the search to execute. The more files you searched, the longer you waited each time you searched. If you wanted to perform the same search again, you would “pay” again. And the value you were getting in return wasn’t very good since the search functionality wasn’t particularly powerful. With Windows Vista, the indexer tries to read all of your files before you search so that when you search, it’s generally quicker and more responsive. This requires the indexer to scan all of your files just once initially, and not each and every time you perform a search. If a file changes, the indexer receives a notification (a “push” event) so that it can read that file again. When the indexer reads a file, it extracts the pertinent information about the file to enable more powerful searches and views. The challenge is to do this quickly enough that the index is always up to date and ready for you to search, while not impacting the performance of your system in a negative way. This is always a balancing act requiring trade-offs, and there are a number of things the indexer does to maintain its standing as a good Windows citizen while working to make sure that the index is always up-to-date.
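To make the trade-off concrete, here is a minimal Python sketch contrasting the two models. It is only an illustration, not how the Windows indexer is implemented: the in-memory `files` dictionary, the `TinyIndex` class, and the `on_file_changed` method (standing in for the “push” notification) are all hypothetical names made up for this example.

```python
from collections import defaultdict

# Hypothetical in-memory "files": path -> text content.
files = {
    "notes/todo.txt": "buy milk and call the bank",
    "docs/report.txt": "quarterly bank report draft",
    "music/readme.txt": "playlist for the road trip",
}


def grep_search(files, word):
    """Brute-force model: read every file on every search."""
    return sorted(
        path for path, text in files.items() if word in text.lower().split()
    )


class TinyIndex:
    """Index model: read each file once, then answer queries from the index."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of paths containing it
        self.words_by_path = {}           # path -> words recorded at last read

    def index_file(self, path, text):
        # Drop stale entries first so a re-index reflects the new content.
        for word in self.words_by_path.get(path, ()):
            self.postings[word].discard(path)
        words = set(text.lower().split())
        for word in words:
            self.postings[word].add(path)
        self.words_by_path[path] = words

    def on_file_changed(self, path, new_text):
        """Stand-in for the 'push' event: re-read only the changed file."""
        self.index_file(path, new_text)

    def search(self, word):
        # A lookup, not a scan: cost does not grow with the number of files read.
        return sorted(self.postings.get(word.lower(), ()))


# Pay the indexing cost once, up front.
index = TinyIndex()
for path, text in files.items():
    index.index_file(path, text)
```

Both `grep_search(files, "bank")` and `index.search("bank")` find the same two files, but the first re-reads everything on each call while the second is a dictionary lookup. Calling `index.on_file_changed(...)` for a single path updates the index without touching the other files.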

A Model Citizen

A lot of work has gone into making the indexer be a model Windows citizen. We’ve written an extensive whitepaper on the issue, but it’s worth covering some of the highlights here. First and foremost, the indexer only monitors certain folders, which limits the amount of work it needs to do to just those files that you’re most likely to search. The indexer also “backs off” when you are actively using your PC. It indexes files more slowly, or stops entirely depending on the level of activity on the PC. When the indexer is reading files it uses low priority I/O and CPU and immediately releases the file if another application needs access.
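The “back off” behavior can be pictured with a small Python sketch. This is a toy policy under stated assumptions, not the indexer’s actual logic: the function names, the two inputs (seconds since the last user input and a CPU busy fraction), and every threshold below are invented for illustration. Low priority I/O is an operating system facility and is not modeled here.

```python
def indexing_delay(seconds_since_input, cpu_busy_fraction):
    """Return seconds to wait before indexing the next item, or None to pause.

    All thresholds are hypothetical values chosen for this example.
    """
    if seconds_since_input < 5 or cpu_busy_fraction > 0.8:
        return None  # user is active or the machine is busy: stop entirely
    if seconds_since_input < 60 or cpu_busy_fraction > 0.4:
        return 1.0   # some recent activity: index slowly
    return 0.0       # idle: index at full speed


def run_indexer(pending, samples):
    """Process pending items, consulting one activity sample per step.

    `samples` is a list of (seconds_since_input, cpu_busy_fraction) tuples.
    Returns the items indexed and the total delay that would have been slept.
    """
    indexed, waited = [], 0.0
    for idle_for, cpu in samples:
        if not pending:
            break
        delay = indexing_delay(idle_for, cpu)
        if delay is None:
            continue      # back off completely for this step
        waited += delay   # a real loop would sleep here instead
        indexed.append(pending.pop(0))
    return indexed, waited
```

With samples describing an active user, then a recently active one, then an idle machine, `run_indexer` skips the first step, indexes one item slowly, and indexes the next at full speed.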

It’s critical that we get all of these issues right for the indexer, because it’s not only important for the features that our team builds (like Windows Search), but it’s important to the Windows platform as a whole. There are a host of applications which require the ability to search file contents on the PC. Imagine if each one of those applications built their own version of the indexer! Even if all of these applications did a great job, there would be a lot of unnecessary and redundant activity happening on your PC. Every time you saved one of your documents there would be a flurry of activity as these different indexers rushed to read the new version. To combat that, the indexer is designed to do this work for any application which might choose to use it and provide an open platform and API with flexibility and extensibility for developers. The API is designed to be flexible enough to meet needs across the Windows ecosystem. Out of the box, the indexer has knowledge of about 200 common file types, cataloging nearly 400 different properties by default. And there is support for applications to add new file types and properties at any time. Applications can also add support for indexing of data types that aren’t file-based at all, like your e-mail. Just a few of the applications that are leveraging the indexer today are Microsoft Office Outlook and OneNote, Lotus Notes, Windows Live Photo Gallery, Internet Explorer 8, and Google Desktop Search. As with all extensible systems, developers often find creative uses for system services. One example of this is the way the Tablet PC components leverage the index contents to improve handwriting accuracy.
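One way to picture this kind of extensibility is a registry that maps file types to property extractors, so that a single shared indexer can learn about new formats without being rewritten. The Python sketch below is an analogy only and does not use the real Windows Search API: `register_handler`, `extract_properties`, the `.song` extension, and its “key=value;key=value” tag format are all hypothetical.

```python
import os

# File extension -> function(raw_text) -> dict of extracted properties.
handlers = {}


def register_handler(extension, func):
    """Let an application teach the shared indexer about a new file type."""
    handlers[extension.lower()] = func


def extract_properties(path, raw):
    """Return basic properties plus whatever a registered handler adds."""
    ext = os.path.splitext(path)[1].lower()
    props = {"name": os.path.basename(path)}
    handler = handlers.get(ext)
    if handler:
        props.update(handler(raw))
    return props


def music_handler(raw):
    # Assumes a made-up "key=value;key=value" tag format for illustration.
    return dict(item.split("=", 1) for item in raw.split(";") if "=" in item)


register_handler(".song", music_handler)
```

A file with a registered type gets rich properties such as artist and album, while an unknown type falls back to the basic ones. Every application that registers a handler shares the same single pass over the file.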

Constantly Improving

We’re constantly working to improve the indexer’s performance and reliability. Version 3 shipped in Windows Vista.  Major improvements in this version included:

  • The indexer runs as a system service rather than as a per-user process.  This minimizes impact in multi-user scenarios: for example, a single catalog per system reduces catalog size and prevents the same content from being re-indexed over and over.  The robust nature of services brings an additional benefit.
  • The indexer employs low priority I/O to minimize impact of indexing on responsiveness of PC.  Before Windows Vista, all I/O was treated equally.

We’ve already released Windows Search version 4 as an enhancement to either Windows XP or Vista which goes even further in terms of performance and stability improvements, such as:

  • Significant improvements across the board for queries which involve sorting, filtering or grouping. Example improvements on Vista include:
    1. Retrieving all results while sorting or grouping has been improved; typical queries are up to 38% faster.
    2. CPU time has been reduced by 80%
    3. Memory usage has been reduced by 20%
  • Load on Exchange servers is reduced over 95% when Outlook is running in online mode.  With previous versions of Windows Search, large numbers of Outlook clients running in online mode could easily overwhelm the Exchange server.
  • Reliability improvements including:
    1. We made a number of fixes to address user-reported situations that previously caused indexing to stop working.
    2. We improved the indexer’s ability to both prevent and recover from index corruptions.  Now, when catalog corruption is detected it is always rebuilt automatically – previously this only happened in certain cases.
    3. We added new logging and events to help track down and fix reliability issues.

And we’ve done even more to improve performance and reliability for the indexer in Windows 7 which you’ll soon see at the PDC. If you still believe that the indexer is giving you trouble, we’ve got a few things for you to try:

  • Download and install Windows Search 4 (on Vista or XP).
  • Download and install the Indexer Gadget from the Windows Live Gadget Gallery (Vista only). This gadget was written by one of our team members, and gives you a quick way to view the number of items indexed. It also allows you to pause indexing, or to make it run full-speed (without backing off).
  • If you’re one of those people who like to get under the hood of the car and poke around the engine, you can use the Windows Task Manager and/or Resource Monitor to monitor the following processes: SearchIndexer, SearchFilterHost, SearchProtocolHost.
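If you prefer a script to a GUI, the same three processes can be checked from the command line. The Python sketch below is one possible approach, not an official tool: it assumes the processes appear with an `.exe` suffix and that the `tasklist /FO CSV /NH` output has the image name in the first column and memory usage in the fifth. The helper names are made up for this example.

```python
import csv
import io
import subprocess

# Assumed image names for the three processes named above.
SEARCH_PROCESSES = {
    "searchindexer.exe",
    "searchfilterhost.exe",
    "searchprotocolhost.exe",
}


def parse_tasklist_csv(text):
    """Return {image name: memory in KB} for the search processes in `text`."""
    usage = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 5 or row[0].lower() not in SEARCH_PROCESSES:
            continue
        # Memory looks like "45,312 K"; keep only the digits.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        usage[row[0]] = int(digits) if digits else 0
    return usage


def query_tasklist():
    """Windows only: run tasklist and parse its CSV output."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist_csv(out)
```

Calling `query_tasklist()` while you work gives a rough picture of whether these processes are present and how much memory they hold. If more than one instance of a process is running, this sketch keeps only the last one listed.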

If you feel as though your system is slow, and you suspect the indexer is the culprit, watch the gadget as you work with your PC. Is the number of indexed items changing significantly when you’re experiencing problems? If you pause the indexer, does your system recover? We’re always looking to make our search experience better, so if you are still running into issues, we want to hear about them. Send your feedback to idx-help@microsoft.com.

Chris McConnell

Find and Organize

  • @ pndragon

    Many people have large hard drives with thousands of files, and search does well. Search has been in Windows for a long time, and it won't ever go away, not now with 500GB and even 1TB hard drives.

  • I personally dislike desktop search. While I don't have any problems with performance or space issue, I do have a problem with security. There is always a slightly higher chance that security is compromised by extra services or that my files can potentially be known from the index.

    The main reason why I don't like using desktop search is how rarely I search. And whenever I do search, I usually restrict it to a specific folder so there's really no problem with time.

    While I can see why some people would like this feature, personally, I would prefer an option while installing Windows to remove desktop search (and all the other Windows programs and services I never use, like Help, Calendar, Mail, Media Player, Messenger, etc.) I have seen nLite, and while it doesn't seem to work properly for me, it's close to what I desire from Windows.

  • Sorry, but I'm still too mad about you guys using WSUS to push Windows Search on all my machines to even think about downloading Windows Search 4.

  • @marcinw -- We think we've been trying to write about topics critical to a modern OS--the launching of programs, managing windows, booting the OS, graphics subsystem, and storage are all things I believe you would find as "core" parts of a modern OS beyond processes, scheduling, memory management, and so on (all of which are darn important as well and also part of an OS).  

    In the first post we described how the PDC and WinHEC are the events where we will provide the in-depth look at the features of Windows 7 (and attendees will get the code).  This blog set out to "provide context over the next 2+ months with regular posts about the behind the scenes development of the release and continue through the release of the product."  

    Once code is out there we will start to blog more regularly about the details of the release that are supported in the code.

    I hope that makes sense,

    --Steven

  • I think many of the people who disable the search service are those same people who read those dodgy "optimize your PC" guides which recommend you disable all kinds of services (I've seen ones which recommend *disabling* "Windows Installer"!)

  • Windows will always have problems until a total rewrite comes into play.

    By rewrite I mean starting off with brand new code from scratch.

  • Ha, yeah, because a rewrite from scratch ALWAYS makes things better...

  • I'm a big fan of fast search, but there are a couple of things on Vista that I'd like to see improved.

    1. I really hate attempting to double-click on search results, only to have the one I'm clicking on change position in the list before I hit it with the mouse, so I launch the wrong file. They should just come up in whatever order the search finds them, and I'll resort later if needed.

    2. Search folders: These are too slow (and I'm on a quad-cpu x64 rig). I've set up a custom one as a sidebar link to pull up my recent project folders so I don't have to navigate through a drive full of them. However, it takes so long to pull this up, that I can probably find it faster without. Is it possible to have the system run this periodically in the background and cache the results so it would be instant? (then update it on a click only if the results change? or something?) Macs have a way of seeing recently accessed folders from open/save dialogs that is basically instant. It would be nice if Windows had a way of doing that.

    3. That annoying "Searches might be slow" warning is really irritating on every single search. That should really go away, so if I don't want to index a folder I'm not pestered to do so.

  • haha, gonzc900, stop reading lame tech blogs and go find something else to do... this 'rewrite' rubbish has been going around for years...

    Anyway, personally I love search, but can definitely see the need for improvement, although most of that has been covered in the above comments.

    What I would like to see more than anything is instant search results when searching just the start menu and user accounts.

    If I'm trying to locate a folder somewhere I'm okay with waiting, but when I want to launch Word I just wanna push start, type 'word' and launch it, no waiting for 3-5 seconds...

    Also a little more intuitiveness would be nice as well... for example search learning misspellings in a similar way as Launchy does... so if I type 'wrod' I still get 'Word' found...

    Also things like if you type "uninstall programs" you won't get anything, you have to type "programs and features" or "features" whatever.... the point is it's probably not the most user-friendly... for people just wanting to uninstall programs typing that would make sense...

  • Maybe you can shed some light on this:

    I have some excel code that runs for about 20 seconds, in office 2003 (about 33 seconds in office 2007, but that’s another story)

    While it’s running, I run russinovich’s process monitor to see what’s happening. Well, out of 15,438 events, 7103 of the events are search related. 6317 are related to searchindexer.exe. I thought it was supposed to be dormant while the computer was busy?

    You want the logfile, I’ll gladly send it to whoever wants to analyze it.

  • Excuse me OT!

    Name of Windows 7 =

    http://windowsvistablog.com/blogs/windowsvista/archive/2008/10/13/introducing-windows-7.aspx

    Nice!! I love this name, thanks :D

  • I for one was not very happy with the indexing and search features in XP. When Vista came out I was fairly happy with the improved performance.

    My only hesitation with using WDS is, as others have mentioned, the GUI. While the performance has been a bonus and decreased the time/resources spent finding files, I have often found the interface a pain when trying more advanced searches.

    Trying to explain how to use this feature to less computer-savvy clients has often left them confused.

    I am looking forward to seeing a more user friendly interface with streamlined access to advanced functions in Windows 7.

  • The Indexer Gadget is one of a select few that are actually useful. The only problem I have with it is that I've never had a problem with the indexing and have had heaps of problems with Superfetch (more to follow). This gadget has wet my lips and left me wanting one for Superfetch.

    Superfetch doesn't appear to have a concept of file sizes from my experience. I have 2GB of RAM, of which typically 1GB is used for caching. Superfetch trawls my hard drives and notices a folder I accessed once the day before. The folder has 4 DVD ISO's and decides to try to cache 2 of them. Using the Resource Monitor I can tell that the service is reading the full 6GB of ISO's into memory, noticing the cache is full, and writes 6GB worth of data to pagefile.sys. I'm assuming the remaining cache in RAM is the tail end of one of the ISO's which is not particularly useful.

    I was running a couple of virtual machines and tried to launch a third, and noticed a huge performance hit as I ran out of RAM. The system slowed to a crawl as I was thrashing to pagefile.sys. After eventually being able to kill off one of the virtual machines, I expected the system to become more responsive. I watched in horror as Superfetch came along and started filling up the RAM that was just freed instead of having the working set of data in pagefile.sys being pulled back into memory.

    Superfetch has caused more problems than it is worth for me. I would like to be able to turn it off, but I have been unsuccessful in finding a solution to stop it. Disabling the service doesn't appear to have an effect. I can see how it would be useful when there are a few small commonly used program files and a small amount of data, but it doesn't seem to do a good job of deciding what to load. Disk IO is a precious resource, it shouldn't be unnecessarily abused in order to speed up other operations.

    Steven, can you try to make sure that a Superfetch article is done sometime soon?

  • I have to second the hate on Superfetch.

    After I first installed Vista I noticed that the hdd was constantly thrashing.  CONSTANTLY, as in every minute of every day, nothing but hdd activity.

    The level of activity made the system responsiveness incredibly unpredictable.  Sometimes I could open an application right away, other times the hdd would just thrash a bit harder for 20 seconds while I waited.

    At first I figured Vista was just indexing everything and it would eventually settle down.  But no ... a month later and the madness continued.

    So I started disabling services.  I first disabled the index service.  Then I disabled superfetch.  Suddenly the hdd stopped thrashing, and the sluggishness was gone.

    Months later I re-enabled the indexing service because I needed to search my inbox in Outlook.  Much to my surprise, the hdd didn't start thrashing and the sluggishness did not return.  So I've left it on, somewhat happier that I can now search for things.

    So for the love of Pete, could someone please fix superfetch?  :)

  • Right, indexing is actually a nice thing. I would like to see a better UI for those panels: dockable, pinnable, with shortcuts, etc.

    But also would be good if you post an article about the accessories you have in windows. i.e. Notepad, Wordpad, Calculator, Paint, system tools etc.

    They were ridiculously outdated even in XP, not to mention Vista. Ugly Win 95 UI and lacking modern features. OS-wise there are tons of features that should be accessible from everywhere.

    I understand that big revenue comes from Office, but Notepad, Wordpad is a joke.
