Notes on comments.
Welcome to our blog dedicated to the engineering of Microsoft Windows 7
The discussion and email about desktop search offered an opportunity for us to have a deeper architectural discussion about engineering Windows 7. There were a number of comments suggesting alternate implementation methods so we thought we’d discuss another approach and the various pros and cons associated with it. It offers a good example of the engineering balance we are striving for with Windows 7. Chris McConnell wrote this follow-up. --Steven (See you at the PDC in a week!)
Thanks for all the great feedback on our first blog post on Windows Desktop Search. I’ve summarized a number of points that have been made and added some comments about the architectural choices we have made and why.
As some posters have pointed out, one possible implementation is to integrate indexing with the file system so that updating a file immediately updates the indices. Windows Desktop Search takes a different approach. There are two aspects of file system integration: knowing when a file changes, and actually updating the indices before a file is considered “closed” and available. On an NTFS file system, the indexer is notified whenever a file changes. The indexer never scans the NTFS file system except during the initial index. It is on the second point, updating the indices immediately when a file is closed, that we made a different choice. Updating immediately has the benefit that a file is not available until it is indexed, but it also comes with a number of potential disadvantages. We chose to decouple indexing from file system operations because it allows for more flexibility while still being almost real-time. Here are some of the benefits we see in the approach we took:
A number of people expressed frustration with the lack of an advanced query UI. Microsoft has many advanced query user-interfaces in many products, but these are generally focused on well-defined query languages (SQL) or on specific domains (like the Advanced Find in Outlook). With Vista we wanted to address the query problem in a manner more familiar to people today—a single edit control. Our implementation supports a rich query language within that edit control. This is the same approach people are familiar with for web searching for both standard and advanced queries.
We had two observations that led to this approach:
In designing Vista we incorporated the feedback that it is desirable to do precise queries. The approach taken in Vista was to support a rich query language which allows all properties and a fairly natural syntax. For example, typing “from:gerald sent:today” will find all email from “Gerald” sent today! The big issue is that people do not know the query language. In Windows 7, we have focused on helping people see how to use the query language in context. For now, you can see the following for some information on Vista’s query syntax. Much of this syntax and experience is similar to the web search that we all use today.
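To make the “property:value” idea concrete, here is a minimal sketch of how a query typed into a single edit control could be split into property filters, free-text terms, and quoted phrases. This is not the actual Vista parser or its full grammar. The function name `parse_query` and the returned structure are hypothetical, chosen only for illustration.

```python
import re

# Either a double-quoted phrase or a run of non-space characters.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def parse_query(query):
    """Split a query into (filters, terms, phrases).

    Hypothetical helper for illustration only; the real query
    language supports far more than this sketch does.
    """
    filters = {}   # property name -> list of requested values
    terms = []     # bare words, matched against any property
    phrases = []   # quoted text, matched as an exact phrase
    for phrase, word in _TOKEN.findall(query):
        if phrase:
            phrases.append(phrase)
            continue
        if not word:
            continue  # an empty pair of quotes carries no information
        name, sep, value = word.partition(":")
        if sep and name and value:
            filters.setdefault(name.lower(), []).append(value)
        else:
            terms.append(word)
    return filters, terms, phrases


if __name__ == "__main__":
    print(parse_query("from:gerald sent:today"))
    print(parse_query('budget "foo net" from:gerald'))
```

In this sketch a token only counts as a property filter when it has text on both sides of the colon, so a stray “:” or a trailing “from:” falls back to being an ordinary search term.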
A number of comments were about substring matches in filenames, which we do not currently support. This is part of the overall discussion about advanced queries. In order to efficiently execute queries, the indexer builds indices that are based on individual words. In Vista we introduced “searching as you type” to our search UI. Under the hood this is implemented as prefix matches on the indexed words. So when you type ‘foo’, we look for all terms that start with those letters, including ‘food’ and ‘football’. Even more interesting, if you type ‘foo net’ we will match on items that have the words ‘food’ and ‘network’ in them. (If what you really want is to match the phrase “foo net”, then typing those words inside quotes will do that, which is another example of advanced query syntax.) We have focused primarily on searching for terms found in any property, but there is no question that filenames are special. In recognition of that we support suffix queries on filenames. If you type ‘*food’ then we will return files that end in ‘food’, like “GoodFood”. We do this by reversing the filename and then indexing it as a word. For example, the reverse filename of “GoodFood” would be “dooFdooG”, which we index as a word. The suffix query ‘*food’ is transformed into a prefix query ‘doof*’ over the reverse filename index. Clever, no? So we support prefix matches for all properties and suffix matches for filenames, but we do not support substring matches.
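The prefix and reversed-filename tricks described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the indexer's real data structures: the names `WordIndex`, `build_indexes`, and `search` are hypothetical, matching is case-insensitive, the indexed items are bare filenames without extensions, and each index is just a sorted list searched with binary search.

```python
import re
from bisect import bisect_left, insort


class WordIndex:
    """Sorted (term, item) pairs that support prefix lookups."""

    def __init__(self):
        self._entries = []

    def add(self, term, item):
        insort(self._entries, (term.lower(), item))

    def prefix(self, prefix):
        prefix = prefix.lower()
        # (prefix,) sorts before every (term, item) whose term starts with prefix.
        start = bisect_left(self._entries, (prefix,))
        found = set()
        for term, item in self._entries[start:]:
            if not term.startswith(prefix):
                break
            found.add(item)
        return found


def build_indexes(filenames):
    words = WordIndex()           # individual words, for prefix queries
    reversed_names = WordIndex()  # whole filename reversed, for suffix queries
    for name in filenames:
        for word in re.findall(r"\w+", name):
            words.add(word, name)
        reversed_names.add(name[::-1], name)
    return words, reversed_names


def search(query, words, reversed_names):
    """Every token must match; '*abc' is a suffix query on the filename."""
    results = None
    for token in query.split():
        if token.startswith("*"):
            # Suffix query becomes a prefix query over reversed filenames.
            hits = reversed_names.prefix(token[1:][::-1])
        else:
            hits = words.prefix(token)
        results = hits if results is None else results & hits
    return results or set()


if __name__ == "__main__":
    w, r = build_indexes(["GoodFood", "food network", "football scores"])
    print(sorted(search("foo", w, r)))      # ['food network', 'football scores']
    print(sorted(search("foo net", w, r)))  # ['food network']
    print(sorted(search("*food", w, r)))    # ['GoodFood']
    print(sorted(search("oodF", w, r)))     # [] (substring match is unsupported)
```

The last query shows why substring matches fall outside this design: “oodF” is neither the start of an indexed word nor the start of a reversed filename, so neither sorted index can find it without scanning every entry.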
A number of comments focused on improving performance and citizenship, and we definitely agree with this input. We are always striving to make Windows do more with fewer resources. For those who have turned off indexing altogether, we hope that our continued improvements will make you reconsider. Even if you organize all of your files and don’t find search useful for files, perhaps you will find start menu search, email search or Internet Explorer 8 address bar search useful. We have worked hard at improving performance and citizenship across Windows. Some of this progress is visible in WS4 and soon in Windows 7. We have improved along all of our dimensions, including indexing cost, battery life, citizenship, query speed and scrolling speed. We have some tremendous tools that help us track down performance problems. If you want to help, please contact firstname.lastname@example.org and we will tell you how to collect performance traces we can analyze so that we can continue to make improvements.
Find and Organize
I use WDS 4 on XP.
The SearchIndexer process sometimes takes a significant amount of memory. When indexing is complete and the system is idle, the average is about 10,000 KB. It sometimes peaks at close to 63,000 KB (even when indexing is complete).
It never drops below 3,000 KB, even when I am using the system heavily.
Is this normal behavior?
Any way to control this?
I installed Search 4 on XP after the previous post on search. I did not know about the file name search error (an error in my opinion), and then it took longer to get to the old working Search Companion.
I played with it and it worked quite nicely (query language logic etc.), but as I am searching file names and the content of recently changed files most of the time, I uninstalled it. Now the working search is back.
"Windows is all about supporting diversity".
I hope the old search stays an option to support guys like me who don't mind waiting a few seconds for a basic search to find substrings in file names. 100% hit.
Also supply enough APIs so that third parties can implement alternative "unwieldy" search GUIs.
Oh, and I had trouble finding a place to configure it; I expected that to be available in the search interface. Later I noticed yet another new icon cluttering the notification area.