Engineering Windows 7

Welcome to our blog dedicated to the engineering of Microsoft Windows 7

Follow-up: Windows Desktop Search

The discussion and email about desktop search offered an opportunity for us to have a deeper architectural discussion about engineering Windows 7.  There were a number of comments suggesting alternate implementation methods so we thought we’d discuss another approach and the various pros and cons associated with it.  It offers a good example of the engineering balance we are striving for with Windows 7.  Chris McConnell wrote this follow-up.  --Steven (See you at the PDC in a week!)

Thanks for all the great feedback on our first blog post on Windows Desktop Search.  I’ve summarized a number of points that have been made and added some comments about the architectural choices we have made and why.

Integration with the File System

As some posters have pointed out, one possible implementation is to integrate indexing with the file system so that updating a file immediately updates the indices.  Windows Desktop Search takes a different approach.  There are two aspects of file system integration: knowing when a file changes, and actually updating the indices before a file is considered “closed” and available.  On an NTFS file system, the indexer is notified whenever a file changes.  The indexer never scans the NTFS file system except during the initial index.  It is on the second point, updating the indices immediately when a file is closed, that we made a different choice.  Updating immediately has the benefit that a file is not available until it is indexed, but it also comes with a number of potential disadvantages.  We chose to decouple indexing from file system operations because it allows for more flexibility while still being almost real-time.  Here are some of the benefits we see in the approach we took:

  1. Fewer resources are used.  Inverted indices are global.  An inverted index maps from a word found in a property to a list of every document that contains that word.  Indexing a single file requires updating an index for every single unique word found in the file.   A single document might then update a very large number of individual indices.  Making these changes and committing them with the same robustness found on individual files would be very expensive.  The design of the indexer allows scheduling and aggregating these changes so that much less work is done overall—that means less CPU and less disk I/O.  The system can be more robust because indexing doesn’t only happen when a file is closed—and it can even be retried if necessary.
  2. File system operations are prioritized over indexing.  Getting files robustly updated and available is necessary for applications to use them.  We don’t want to delay that availability by forcing the cost of indexing into file close operations.   Searching over files is important, but is less important than actually working with files.  We wouldn’t want applications to decide individually if the indexer should be turned on or off just because they were seeking the best performance with respect to the file system.
  3. There are lots of file types.  Microsoft supplies extractors (IFilter/IPropertyHandler) for many common file types as part of Windows.  There are many other file types as well so it is important to allow non-Microsoft developers to write their own extractors.  In Vista (and Windows 7), these extractors run in a locked down process that ensures that they are secure and do not affect the performance of the whole system.  If indexing had to happen before a file was available, then an extractor could impact (intentionally or not) all file system operations.  
  4. Some files are more valuable to index than others.  If indexing happened when a file is closed, then there would be no control over the order in which files are indexed.  Decoupling allows prioritizing indexing some files over others.  For example, searching for music is much more likely than searching for binary files.  If both music files and binary files have changed, then the indexer ensures it indexes the music files first.  Some files are not worth indexing at all for most people.  Several comments suggested that we should index the whole drive.  We can do that, and for those who would find it valuable it is easy to add folders to be indexed.  (You can also remove them, but that is much less common so that is controlled through the control panel “Indexing Options.”)  For most people indexing system files is just a cost: they would never search for them and would be confused if they showed up as the result of a search.
  5. Not everything is a file in a single file system.  Windows is all about supporting diversity.  There are many different file systems like FAT32 and CDFS and we would like to be able to search over those as well.  If we integrated with only NTFS, then we would still have to have a loosely coupled system for other file systems.  Many applications also have databases optimized for their own needs.  For example, Outlook has a database of email.  If only files were indexed, then the email in the database could not be indexed unless Outlook either compromised its experience by using files only, or complicated its implementation by duplicating everything in both the file system and the database.
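
To make the trade-off in points 1 and 4 concrete, here is a minimal sketch of a decoupled indexer: a change notification only queues work, and a later batch step updates the global inverted index in priority order.  This is an illustration under stated assumptions, not how Windows Search is implemented.  The class name `DeferredIndexer`, the `PRIORITY` table and its values are all hypothetical, and removing stale entries when a file is re-indexed is left out.

```python
import heapq
import os
from collections import defaultdict

# Hypothetical priorities (lower number = indexed sooner); not real Windows values.
PRIORITY = {".mp3": 0, ".docx": 1, ".dll": 9}
DEFAULT_PRIORITY = 5


class DeferredIndexer:
    """Toy inverted index whose updates are decoupled from file changes."""

    def __init__(self):
        self.index = defaultdict(set)  # word -> set of paths containing that word
        self.pending = []              # heap of (priority, sequence, path, text)
        self.sequence = 0              # tie-breaker that keeps arrival order

    def notify_changed(self, path, text):
        """Record that a file changed.  Cheap: no index is touched here."""
        ext = os.path.splitext(path)[1].lower()
        priority = PRIORITY.get(ext, DEFAULT_PRIORITY)
        heapq.heappush(self.pending, (priority, self.sequence, path, text))
        self.sequence += 1

    def run_batch(self, max_items):
        """Index up to max_items queued files, most valuable first."""
        done = []
        while self.pending and len(done) < max_items:
            _, _, path, text = heapq.heappop(self.pending)
            # One posting-list update per unique word in the file.
            for word in set(text.lower().split()):
                self.index[word].add(path)
            done.append(path)
        return done

    def search(self, word):
        """Return the sorted paths whose indexed text contains word."""
        return sorted(self.index.get(word.lower(), ()))
```

In this sketch a file is usable as soon as `notify_changed` returns, and it becomes searchable only after a later `run_batch` call, which is the "almost real-time" behavior described above.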

Advanced Queries

A number of people expressed frustration with the lack of an advanced query UI.  Microsoft has many advanced query user-interfaces in many products, but these are generally focused on well-defined query languages (SQL) or on specific domains (like the Advanced Find in Outlook).  With Vista we wanted to address the query problem in a manner more familiar to people today—a single edit control.  Our implementation supports a rich query language within that edit control.  This is the same approach people are familiar with for web searching for both standard and advanced queries.

We had two observations that led to this approach:

  1. The most important part of a search is the search terms.  Usually a single term is enough (and as we know from web searching, the majority of searches are one or two words).  And for refinement the file system tools of thumbnails, sorting, and/or type ahead can be used to narrow the search.
  2. It is reasonable to consider a design for an advanced query UI covering property based search, but it will generally be unwieldy for all but the bravest people.  As we mentioned, Windows Search covers over 300 properties by default so if you show every property then the UI is unusable.  If we only show the most commonly used properties then how do you handle all of the other properties?  Would properties be grouped by the common application or by attributes such as times, names, file attributes, etc.?  Some of you might value the Outlook Advanced Find… interface, but there you see some of the challenges and that is within a specific domain where the grouping of related properties probably can be understood.

In designing Vista we incorporated the feedback that it is desirable to do precise queries.  The approach taken in Vista was to support a rich query language which allows all properties and a fairly natural syntax.  For example, typing “from:gerald sent:today” will find all email from “Gerald” sent today!  The big issue is that people do not know the query language.  In Windows 7, we have focused on helping people see how to use the query language in context.  For now, you can see the following for some information on Vista’s query syntax.  Much of this syntax and experience is similar to web search that we all use today.
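
As a rough illustration of how a query such as “from:gerald sent:today” can live inside a single edit control, the sketch below splits a query string into free-text terms and property filters.  The function name `parse_query` is hypothetical and the grammar is deliberately simplified; the real query language supports far more (quoted phrases, operators, date ranges) than this sketch handles.

```python
def parse_query(query):
    """Split a query into free-text terms and property:value filters.

    Simplified assumption: a token is a property filter only when it has
    the shape name:value with both parts non-empty.
    """
    terms = []
    filters = {}
    for token in query.split():
        name, sep, value = token.partition(":")
        if sep and name and value:
            filters[name.lower()] = value
        else:
            terms.append(token)
    return terms, filters
```

For example, `parse_query("budget from:gerald sent:today")` yields the term `budget` plus the filters `from=gerald` and `sent=today`, which a search engine could then apply to the matching indexed properties.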

A number of comments were about substring matches in filenames, which we do not currently support.  This is part of the overall discussion about advanced queries.  In order to efficiently execute queries, the indexer builds indices that are based on individual words.  In Vista we introduced “searching as you type” to our search UI.  Under the hood this is implemented as prefix matches on the indexed words.  So when you type ‘foo’, we look for all terms that start with those letters, including ‘food’ and ‘football’.  Even more interesting, if you type ‘foo net’ we will match on items that have the words ‘food’ and ‘network’ in them.  (If what you really want is to match the phrase “foo net” then typing those words inside quotes will do that, which is another example of advanced query syntax.)  We have focused primarily on searching for terms found in any property, but there is no question that filenames are special.  In recognition of that we support suffix queries on filenames.  If you type ‘*food’ then we will return files that end in ‘food’ like “GoodFood”.  We do this by reversing the filename and then indexing it as a word.  For example, the reverse filename of “GoodFood” would be “dooFdooG”, which we index as a word.  The suffix query ‘*food’ is transformed into a prefix query ‘doof*’ over the reverse filename index.  Clever, no?  So we support prefix matches for all properties and suffix matches for filenames, but we do not support substring matches.
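
The reversed-filename trick can be sketched in a few lines.  The code below is only an illustration: `WordIndex` and `suffix_matches` are hypothetical names, a sorted list with binary search stands in for the real word index, and everything is lowercased on the assumption that matching is case-insensitive.

```python
import bisect


class WordIndex:
    """Sorted list of lowercased words supporting prefix lookup."""

    def __init__(self, words):
        self.words = sorted({w.lower() for w in words})

    def prefix(self, p):
        """Return every indexed word that starts with p."""
        p = p.lower()
        # Binary search to the first word >= p, then walk while it still matches.
        i = bisect.bisect_left(self.words, p)
        matches = []
        while i < len(self.words) and self.words[i].startswith(p):
            matches.append(self.words[i])
            i += 1
        return matches


def suffix_matches(filenames, pattern):
    """Answer a '*suffix' query as a prefix query over reversed filenames."""
    reversed_index = WordIndex(name[::-1] for name in filenames)
    needle = pattern.lstrip("*")[::-1]  # '*food' -> 'doof'
    # Reverse the hits again to recover the (lowercased) filenames.
    return [hit[::-1] for hit in reversed_index.prefix(needle)]
```

The same prefix machinery serves both query types, which is why prefix and suffix matches are cheap here while a substring match would need a different kind of index.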

Performance and Citizenship

A number of comments focused on improving performance and citizenship, and we definitely agree with this input.  We are always striving to make Windows do more with fewer resources.  For those who have turned off indexing altogether, we hope that our continued improvements will make you reconsider.  Even if you organize all of your files and don’t find search useful for files, perhaps you will find Start menu search, email search or Internet Explorer 8 address bar search useful.  We have worked hard at improving performance and citizenship across Windows.  Some of this progress is visible in WS4 and soon in Windows 7.  We have improved along all of our dimensions including indexing cost, battery life, citizenship, query speed and scrolling speed.  We have some tremendous tools that help us track down performance problems.  If you want to help, please contact idx-help@microsoft.com and we will tell you how to collect performance traces we can analyze so that we can continue to make improvements.

Chris McConnell

Find and Organize

  • @WindowsFanboy

    Sure??

    http://img139.imageshack.us/my.php?image=senzanomelu1.png

  • I think it would be great if there was a shortcut for creating new folders.

    Thank you for listening !! Keep up the good work :)

  • I've watched several videos of the new docking feature in Windows 7 where moving a window to the right or left side of the screen resizes it to 50% horizontal and 100% vertical on the side of the screen you move it to.  Honestly, this seems like a very bad idea.  As an example, consider the following scenario:

    You are working on something, and there are 3 windows that you need to see.  Your screen is large enough to see 2.5 of them at once (and maybe you are even ok just seeing the 3rd window on the left side to see if there are any changes while you work.)  You try to move the third window out of the way (naturally, part way off the side of the screen), but instead of moving out of the way, it resizes itself and takes half your screen!  Try to move it out of the way again, and the same thing will happen!

    I'm on a very large display and I still move things out of the way like this all the time.  (Not that I don't even more on smaller displays.)

    Another scenario:

    You really do have two windows that you want to see at once, and you want the two of them combined to fill the whole screen.  But one needs to be 2/3 the size of the screen to show what you need to see without scrolling, and one needs 1/3.  Now you move the windows to the sides, and you must first shrink the one that needs to be smaller to get it out of the way so they don't overlap, then resize the larger one (hoping that you resized the smaller one enough.)

    I can think of almost no situation in which a 50/50 split would be the right layout for my windows.  

    On the other hand, the new taskbar is the biggest improvement I've seen to the Windows interface since 95 came out!  This design appears to be much more scalable and efficient, and is also more pleasant to look at.  This is a good start.

  • I think docking is a very good idea. Not sure how W7 would work with more than 2 windows, but if it is intelligent enough that cannot be a problem.

    Let's say after dock you can resize the windows just like in Visual Studio or Eclipse. Also (un)pinning the window is a good thing to have.

  • @UserOfManyOperatingSystems -

    The window snapping feature is only triggered when the *mouse* cursor hits the side of the window.  Therefore, you can still drag windows partially off-screen just as you could before.  I think you'll find it is very difficult to accidentally trigger the snap.

    The 50/50 split has been very useful for me.  I use it regularly when dragging files from Explorer to SmartFTP, or when writing a blog post in Live Writer and having IE open to a page I'm quoting on the other side.

  • @A_Pickle and WindowsFanboy  

    In Windows 7, the Games folder is now enumerated and searchable in the Start menu.

  • @bpaddock

    Right.  I saw that.  But I can imagine that that would cause frequent resizing.  Think about it this way:  The side of your screen is one of the largest targets on your desktop.  It's almost impossible to miss.  So now, if you want to move a window part way off a screen, you must select an area of the window you are moving, drag it right (or left), and then consciously try to avoid touching the side of your screen with your mouse pointer!  That means that you must choose, in advance, the portion of your window to click on to drag it that is farther to the left than what you want to have showing, because otherwise, you will touch the side of the screen and resize.  If you guess wrong, you either get an unexpected resize, or you have to drop the window, find another portion of it to click on, and try again.  Even if you guess right, you can't just pick the right point and slide your window over and use the screen edge to stop you in the right place.  You have to slow down and stop when you are *almost* there.

    Basically, this makes an infrequently used operation easy while making a frequently used operation much harder, and it makes some of the largest targets on your screen areas that you have to be careful to avoid if you DON'T want to resize the screen.  This doesn't strike me as good interface design.

    A better approach might be to drag into one of the corners on the side that you want.  Corners are harder to hit unintentionally, but are easy to hit when you want to.

    An even better approach might be to add window decoration widgets on the title bar that perform the operation with one click.  This would even fit in well with the other title bar widgets, since those also perform window sizing operations.

    Another great approach would be to enable focus follows mouse and raise window with focus by default.  I have this enabled on all OSs that I use that support it.  This makes it much easier to work with overlapping windows, because you don't have to click on each window to activate it.  

    You could even combine this with dragging in the following way:

    1) Select object you want to drag from window 1

    2) Drag it onto window 2

    3) Window 2 raises after some reasonable amount of time (I'd suggest 300 ms) to make it easier to see what you are dragging on to.

    4) Drop object onto Window 2

    The OS X Finder actually does this, and it works rather well.

  • 1,2,3,4

    Yes, Windows has done the same since '98, I think. Snapping is useful when you need to work on multiple documents at the same time, reading one and editing the other. It is easier to D&D if they are side by side. (i.e. you don't have to wait 300 ms, just drop it)

    I think the snapping feature is nice and useful. I do like it in every application I use.

    I would like to see hot-keys for file management between the snapped windows. I hate when I always have to get the mouse for renaming and copying

  • @mystere

    Starting with the beta and on to the final version, Windows 7 will have the new taskbar with the "peek" button on the right side. This makes it super easy to peek at the gadgets - just drag your mouse into the bottom right of the screen. If you want to interact with one of the gadgets, just click while your mouse is there (it does the same thing as show desktop). It's no harder than moving your eyes to glance at the gadgets when they're on the sidebar.

  • @lyesmith:

    Russinovich's virtual desktop is fast (I've used it), but it has a major limitation: you can't move windows between desktops.  The problem is that in his implementation, each program is bound to a logical Windows desktop.  Each desktop contains some set of programs, and windows are children of programs.  A program can only draw a window on its own desktop.  So you can't, say, have Firefox open a window on Desktop 1, and then have the same Firefox instance open a window on Desktop 2.  This is a critical feature for any virtual desktop, so while Russinovich has made a good effort, it doesn't currently meet real world usage needs.

    Even worse, Russinovich's implementation can't be shut down before the computer is shut down, or programs running on other desktops will be stranded on those desktops, with no way to access them.

    On Linux and OS X, these operations are performed trivially.

    The lack of a *good* virtual desktop implementation is one of the major usability problems with Windows.

  • In Windows 7, search is so well integrated throughout the operating system (even in more apps than Vista), why can't it be implemented for Gpedit.msc/Local Group Policy?

  • @Fanboy

    You obviously don't understand the use case.  It's not a matter of "easy", it's a matter of conscious action versus subconscious action.  I glance at my gadgets all the time subconsciously without distracting me from my work.  Moving my mouse to click a button or pressing a hotkey does distract me from my work, especially if my work is two 30 inch monitors over.

  • The Start Menu search in Windows Search 4 behaves very strangely...

    I have a folder in My Documents which is called Budgets, inside it there are many other folders like Budgets 2008, Budgets 2009, Budgets Revised and so on.

    If I click start and type in searchbox a single word Budgets, I do get returned some deeper folders called like Budgets Revised and others, but I never get the one I want: the overall top folder Budgets. Why?

  • I would like to get back control of the Search function!

    Although I hate the Ribbon in Office, I'd love it for Search and in Windows Explorer!

    Search in Vista has so many weird problems e.g. you put in a data CD and do a search for a file and you get "No items match your search" presumably because CDs aren't indexed. So you look for "Advanced Search" to search in non-indexed locations and it's nowhere to be seen!

    Also half the Folder Options in Windows Explorer are ambiguous and don't even work! I like files to be shown in details view or list view (but with sorting categories like file name and date modified). You click "Apply to all folders" or any of the other ambiguous options and experiment with different combinations (e.g. "Remember each folder's view settings" checked and unchecked) and sometimes it seems to work then it doesn't.

    Finally, the Pictures folder always comes up with icons instead of details, and the Music folder comes up with all this rubbish about artists name and ratings that is a total disaster area if you actually want to find something.

    Bottom line: Give me more control please! You have no right to presume what I want.

  • Where's the DFS support? Can't use Windows Search (Desktop search) - really can't use Windows 7 libraries without that support.

    BTW Needs integration with Search Server as well. At the core of this last point: One index please in Windows Server System.
