
The Search Developer Story in SharePoint 2010 - Query Interfaces

SharePoint 2010 includes a number of features that make the platform easier to use for developers. Improved Visual Studio integration, the addition of LINQ to the SharePoint platform, sandboxing for deployment, and the new developer dashboard are just a few examples of how developing and deploying SharePoint solutions has become much easier.

 

As a member of the enterprise search development team that has worked to bring FAST Search into SharePoint 2010, I can tell you that a lot has also been done to benefit developers of search-based solutions. SharePoint 2010 Search and the new FAST Search for SharePoint 2010 have been designed to share a common platform so that search developers can integrate with both SharePoint Search and FAST Search for SharePoint 2010 using the same query side interfaces. This means developers don’t have to learn new APIs or programming models, but can leverage the same object models, services and a common query language for both products.

 

SharePoint developers and architects implementing search-driven applications should understand the available integration options. Depending on requirements, tools, and preferences, one can choose from among several integration points, including a brand new object model in SharePoint 2010. Here’s a list of the different integration points with a brief description of each:

 

The Federation Object Model (OM)

This is a new search object model in SharePoint 2010. It provides a unified interface for querying against different locations (search providers), giving developers of search-driven Web Parts a way to implement end-user experiences that are independent of the underlying search engine. The object model also allows for combining and merging results from different search providers. Out-of-box Web Parts in SharePoint 2010 are based on this OM, and SharePoint 2010 ships with three types of locations: SharePoint Search, FAST Search, and OpenSearch. The Federation OM is also extensible, should you want or need to implement a custom search location outside of the supported types.
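To make the idea of combining results from several locations concrete, here is a minimal sketch in Python. It is not the Federation OM API; the function name, the `url` key, and the round-robin strategy are all assumptions chosen for illustration. It interleaves hits from each location and drops duplicates by URL.

```python
from itertools import zip_longest


def merge_round_robin(*result_lists):
    """Interleave hits from several locations, keeping the first copy of each URL.

    Hypothetical helper: each hit is a dict with at least a "url" key.
    """
    seen = set()
    merged = []
    # zip_longest yields one "tier" per rank position across all locations,
    # padding shorter lists with None.
    for tier in zip_longest(*result_lists):
        for hit in tier:
            if hit is None or hit["url"] in seen:
                continue
            seen.add(hit["url"])
            merged.append(hit)
    return merged


sharepoint_hits = [{"url": "http://intranet/a"}, {"url": "http://intranet/b"}]
opensearch_hits = [{"url": "http://intranet/b"}, {"url": "http://example.com/c"}]
merged = merge_round_robin(sharepoint_hits, opensearch_hits)
```

Round-robin is only one possible merge policy; a real application might instead keep each location's results in separate blocks on the page.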

The Query Web Service

This is the integration point for applications outside of your SharePoint environment, such as standalone, non-web based applications, or Silverlight applications running in a browser. The Query Web Service is a SOAP based ASMX web service, and supports a number of operations, including:

  • Querying and getting search results
  • Getting query suggestions
  • Getting meta data, e.g. a list of managed properties

The same schema is shared for SharePoint Search and FAST Search, and both products support the same operations. For querying, clients can easily switch the search provider by setting a ResultsProvider element in the request XML. A number of extensions are available for FAST Search, e.g. refinement results, advanced sorting using a formula, issuing queries using the FAST Query Language.
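As a rough sketch of what switching providers in the request XML could look like, the Python below builds a small query document in which only one element changes between engines. The namespace, the surrounding element names, and the provider values are assumptions made for illustration; the provider element name is taken from the text above and is passed in as a parameter so it can be adjusted to match the actual Query Web Service schema.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace and element layout are illustrative only; check them
# against the Query Web Service schema before use.
QUERY_NS = "urn:Microsoft.Search.Query"


def build_query_packet(query_text, provider, provider_element="ResultsProvider"):
    """Return request XML that names the search provider to query."""
    ET.register_namespace("", QUERY_NS)
    packet = ET.Element(f"{{{QUERY_NS}}}QueryPacket")
    query = ET.SubElement(packet, f"{{{QUERY_NS}}}Query")
    context = ET.SubElement(query, f"{{{QUERY_NS}}}Context")
    text = ET.SubElement(context, f"{{{QUERY_NS}}}QueryText")
    text.set("type", "STRING")
    text.text = query_text
    # Switching engines means changing only this one element's value.
    ET.SubElement(query, f"{{{QUERY_NS}}}{provider_element}").text = provider
    return ET.tostring(packet, encoding="unicode")


# Hypothetical provider values, used here only to show the switch.
fast_request = build_query_packet("annual report", "FASTSearch")
sharepoint_request = build_query_packet("annual report", "SharepointSearch")
```

The rest of the request stays identical for both engines, which is the point of the shared schema.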

The Query RSS Feed

Certain scenarios, like simple mashups, may need only a simple search result list. The RSS feed is an alternative, lightweight integration point for supplying applications outside of SharePoint with a simple RSS result list. The Search Center - the default search front-end in SharePoint 2010 - includes a link to a query-based RSS feed. Switching the engine to the RSS format is done by simply setting a URL provider. Because of its intended simplicity, there are some limitations to what can be returned and customized in the query RSS feed. The object models or the web service integration scenarios are recommended for more advanced applications.
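For a simple mashup, consuming such a feed takes very little code. The sketch below parses a generic RSS 2.0 document with Python's standard library; the sample feed is made up, and the actual feed returned by the Search Center may carry additional elements.

```python
import xml.etree.ElementTree as ET

# Made-up sample; a real query feed would be fetched over HTTP.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Search results</title>
<item><title>Quarterly report</title><link>http://example.com/q1</link></item>
<item><title>Budget plan</title><link>http://example.com/budget</link></item>
</channel></rss>"""


def parse_result_feed(rss_text):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


results = parse_result_feed(SAMPLE_FEED)
```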

The Query Object Model

This is the lowest level object model, used by the Federation object model, the Query Web Service and the Query RSS feed. Both SharePoint Search and FAST Search support the KeywordQuery object in this object model. While the Federation OM returns XML (to Web Parts), the Query OM returns data types.

The Search Web Parts

Search Web Parts in SharePoint 2010 are common to SharePoint Search and FAST Search, and are now based on the common Federation OM. The Web Parts on a page communicate through a shared Query Manager, a central component of the Federation OM. This makes adding new Web Parts that interact with existing Web Parts simpler than before. For example, a new Tag Cloud Web Part for visualizing the query results can use the shared Query Manager to get results. Developers will also be able to extend the out-of-box Web Parts, as they are now public in SharePoint 2010 (no longer sealed).

The Common Query Language

Both SharePoint Search and FAST Search support the Keyword Query Language syntax. This is the default query language for both products, and the end-user language supported from the Web Parts in the search centers (including the advanced search page). 
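As a small illustration, a keyword query typically combines free-text terms with property restrictions of the form property:value. The helper below is a hypothetical convenience function, not part of any SharePoint API, and it does not escape quotes inside values.

```python
def kql(terms, **property_filters):
    """Build a keyword query string from free-text terms and property filters.

    Hypothetical helper: values are wrapped in double quotes and are assumed
    not to contain quotes themselves.
    """
    parts = list(terms)
    for name, value in property_filters.items():
        parts.append(f'{name}:"{value}"')
    return " ".join(parts)


query = kql(["search", "roadmap"], author="Nate")
```

Because both products accept the same syntax, a string built this way can be sent to either engine unchanged.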

FAST Search Extensions

FAST Search has a number of extensions beyond the standard SharePoint Search that are available on both the Federation and Query object models, as well as on the Query Web Service. Some examples are:

    • The FAST Query Language, which supports advanced query operators like XRANK for dynamic (query time) term weighting and ranking.
    • Deep refiners over the whole result set, and the possibility of adding refiners over any managed property.
    • Advanced sorting using managed properties or a query-time sort formula.
    • Advanced duplicate trimming, with the ability to specify a custom property on which to base duplicate comparisons.
    • “Similar documents” matching.
    • FAST Search Admin Object Model for promoting documents or assigning visual best bets to query keywords/phrases.
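To give a feel for query-time ranking with XRANK, the sketch below assembles an FQL string in which documents must match the main terms and receive a boost when they also match the extra terms. The helper names are hypothetical, and the quoting rules are a simplifying assumption; consult the FQL reference for the exact syntax and escaping.

```python
def fql_string(term):
    """Wrap a term in an FQL string() expression (assumed backslash escaping)."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'string("{escaped}")'


def fql_group(operator, terms):
    """Combine several terms with and(...) or or(...); pass a single term through."""
    if len(terms) == 1:
        return fql_string(terms[0])
    return f"{operator}(" + ", ".join(fql_string(t) for t in terms) + ")"


def fql_xrank(match_terms, boost_terms, boost=100):
    """Match on match_terms; add `boost` rank points for hits on boost_terms."""
    return (
        f"xrank({fql_group('and', match_terms)}, "
        f"{fql_group('or', boost_terms)}, boost={boost})"
    )


fql = fql_xrank(["sharepoint", "search"], ["fast"], boost=500)
```

The boost terms do not change which documents match; they only change the order in which matches are ranked.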

Building powerful search applications is easier than ever in SharePoint 2010. FAST Search is now integrated into the SharePoint platform and developers of search-driven solutions and applications can leverage a common platform and common APIs for both SharePoint Search and FAST Search. This means applications can be built to support both search engines and then extended if and when desired to take advantage of the more advanced features available with FAST Search, such as dynamic ranking, flexible sort formulae, or deep refiners for insight into your full result set.

 

Arnt Schøning, Senior Development Engineer | Microsoft Enterprise Search Group

(On Twitter as @aschoning)

 

 

 

 

Enterprise Search and Bing Services – Part 2: Bing Geo-coding Structured/Unstructured Text for Search

Web-based mapping services from Microsoft and others have been around for a number of years, but all are naturally dependent on data that contains well defined geo-coordinate information – usually latitude and longitude pairs. But what if you have content without clearly defined geo-tags? For example, news stories may contain references to places around the world, but they don’t always come pre-tagged with geo-coordinate information for these places. Is it possible to infer this information and still be able to use a map for search results?

The short answer is yes. By combining mapping services like the Bing Maps Web Services SDK 1.0 with information extraction techniques available in more advanced enterprise search platforms, we can create interesting geo-aware search applications even against content that contains no explicit geo-coordinate data.

Let’s take a look at a simple example. Specifically, we’ll look at an example that takes advantage of the FAST Enterprise Search Platform (ESP) and its ability to automatically recognize words and phrases by type (entities) within full text. For this application, we care about the “location” extractor. Once location information is extracted, we can use the geocoding abilities of the Bing Maps SDK to tag each document with the appropriate geo-coordinate information. For the front-end, we’ll use a Silverlight 3 prototype UI based on the Bing Maps Silverlight Control.

The steps for pulling this example together are:

1.      Crawl a news source (unstructured text) and retrieve sample news articles (using the FAST ESP Enterprise Crawler).

2.      Tag each document with extracted locations that are identified as most important in each article (using FAST ESP’s built-in location entity extractor).

3.      Create a geocoder client (sample code) using Bing Maps SDK to submit extracted locations, and retrieve lat/long.

4.      Call the geocode client from your content processor to submit each extracted location, retrieve the geodata, and tag the document with lat/long (using the FAST ESP content processing pipeline).

5.      Finally, tie this all up in a nice UI using Bing Maps Silverlight Control.

A more detailed “how to” with some sample code is posted here.
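Steps 3 and 4 can be sketched independently of any particular product. The Python below is a minimal illustration, not the FAST ESP pipeline API or the Bing Maps SDK: the document is a plain dict, the `locations` and `geo` field names are invented, and the geocoder is a stand-in function that a real implementation would replace with a call to the geocoding service.

```python
from typing import Callable, Dict, Optional, Tuple

LatLong = Tuple[float, float]


def make_cached_geocoder(geocode: Callable[[str], Optional[LatLong]]):
    """Wrap a geocoder so each distinct place name is looked up only once."""
    cache: Dict[str, Optional[LatLong]] = {}

    def lookup(place: str) -> Optional[LatLong]:
        key = place.strip().lower()
        if key not in cache:
            cache[key] = geocode(place)
        return cache[key]

    return lookup


def tag_document(doc: dict, geocode: Callable[[str], Optional[LatLong]]) -> dict:
    """Return a copy of doc with a 'geo' entry for each resolvable location."""
    tagged = dict(doc)
    geo = []
    for place in doc.get("locations", []):
        coords = geocode(place)
        if coords is not None:  # skip places the geocoder cannot resolve
            geo.append({"place": place, "lat": coords[0], "long": coords[1]})
    tagged["geo"] = geo
    return tagged


# Stand-in for the remote geocoding service (hypothetical data).
FAKE_GAZETTEER = {"oslo": (59.91, 10.75), "mexico city": (19.43, -99.13)}


def fake_geocode(place: str) -> Optional[LatLong]:
    return FAKE_GAZETTEER.get(place.strip().lower())


article = {"id": 1, "locations": ["Oslo", "Atlantis"]}
tagged_article = tag_document(article, make_cached_geocoder(fake_geocode))
```

Caching matters here because news articles mention the same places over and over, and each uncached lookup is a round trip to a remote service.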

Below is a screen shot of the resulting example app showing a query for “Swine Flu”.  The larger the red pin the hotter that location is for that query in the target time range we searched:

Bing Map and Search
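One simple way to derive pin sizes like those in the screenshot is to count how many matching articles mention each place and scale the pin linearly against the busiest location. This is an assumed approach for illustration, not necessarily how the prototype computes them; the `places` field and the pixel bounds are hypothetical.

```python
from collections import Counter


def pin_sizes(result_docs, min_px=6, max_px=30):
    """Map each place to a pin size, scaled by how many results mention it."""
    counts = Counter(
        place for doc in result_docs for place in set(doc.get("places", []))
    )
    if not counts:
        return {}
    top = max(counts.values())
    return {
        place: min_px + (max_px - min_px) * n / top for place, n in counts.items()
    }


hits = [{"places": ["Mexico City"]}, {"places": ["Mexico City", "Oslo"]}]
sizes = pin_sizes(hits)
```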

This UI can also be used to track newly arriving news articles (aka a live feed) and to visualize where the emerging hot spots in the news are. Below is a video of a simulated and accelerated day (5 seconds = 1 hour) showing incoming news articles and how news spreads around the world through the course of a day.

 

This is just one simple example of extending search results to a map where geo coded data is not readily available. I hope it helps to inspire other ideas for combining search with the functionality in the Bing Map SDK.

 

Cem Aykan

Senior Consultant

Microsoft Enterprise Search Group

 

 

 

 

FAST meets SharePoint - What's Coming in Search for SharePoint 2010

Last week was the 2009 SharePoint Conference in Las Vegas. The sold-out attendance of 7400 doubled the number from the previous SharePoint conference 1 ½ years ago. This is not too surprising given the incredible momentum of SharePoint and the fact that much of the event was dedicated to disclosure of the highly anticipated SharePoint 2010 release. Surprising or not, it was gratifying for us search guys to see the level of interest in the new search capabilities being disclosed for 2010. Several of the search-specific break-out sessions had as many people in the audience (>1000) as the entire attendance of our FASTforward’09 search conference in February earlier this year.

At that FASTforward conference in February, Microsoft announced plans to deliver enterprise search targeting two general solution areas: 1) business productivity applications, where the emphasis is on search driving employee efficiency, and 2) Internet business applications, where search is used to drive customer service and revenue. The disclosure of the new search options in SharePoint 2010 at last week’s SharePoint Conference amounts to the first deliverable of this strategy.

SharePoint as a whole has evolved from the original content management and portal platform of earlier releases into a complete “business collaboration platform”, and there are *a lot* of enhancements and new capabilities in SharePoint 2010. I won’t even attempt to summarize them all here. Instead, check out Jeff Teper’s blog post from early last week, which provides a remarkably good summary of everything that’s coming in SharePoint 2010. As Jeff points out in his blog, search is just one of several major categories of capabilities in SharePoint 2010, but “enterprise search is a big investment area for Microsoft” and an area where “we’ve added depth at all levels in 2010”.

There are two main enterprise search options coming with the SharePoint 2010 release:

1)      SharePoint Server 2010 Search – the out-of-the-box SharePoint search for enterprise deployments.

2)      FAST Search Server 2010 for SharePoint – a brand new add-on product based on the FAST search technology that combines the best of FAST’s high-end search capabilities with the best of SharePoint.

SharePoint Server 2010 Search represents an important upgrade to the existing search for SharePoint, while FAST Search for SharePoint 2010 is a completely new offering and the first new product based on the FAST technology since FAST was acquired by Microsoft in April 2008. Customers and partners familiar with search in previous versions of SharePoint will see many important improvements in 2010, regardless of which product they deploy. For example, there is a new People Search feature for expertise identification and search-driven collaboration, to name just one (see Jeff’s post for a good summary of these general improvements).

FAST Search for SharePoint 2010 adds a whole new level of search capabilities that are a superset of what comes in the out-of-the-box SharePoint 2010 Search option. Since there are now two search options in 2010, it’s useful to understand what is unique in FAST Search for SharePoint and when you might consider using it over the out-of-the-box SharePoint 2010 search. With that in mind, here are my 10 reasons to consider FAST Search for SharePoint 2010:

1)      Content Processing Pipeline

 

For people familiar with the FAST Enterprise Search Platform (ESP), the good news is that the most valued capabilities of ESP have been brought into FAST Search for SharePoint 2010 and made easier to access and deploy through tight integration with the SharePoint management and development tools.  The open framework in FAST ESP for creating custom content processing pipelines is a good example. Since it was first introduced in version 3 way back in 2002, FAST customers and partners have leveraged advanced content processing and advanced linguistic features to create a wide variety of novel search applications. This highly valued aspect of the FAST ESP will be available in FAST Search for SharePoint and has been architected and enhanced to take advantage of the SharePoint management interfaces and development tools like PowerShell.

 

2)      Meta-data Extraction

 

Meta-data is used in search for faceted refinement, relevancy tuning, targeted queries (e.g. search only the authors field), and other general techniques designed to improve findability. The problem is that unstructured documents are often devoid of useful meta-data. The ability to automatically extract meta-data to create useful structure on otherwise unstructured documents is a feature of FAST ESP that will also be available in FAST Search for SharePoint 2010. Importantly, FAST Search for SharePoint 2010 takes advantage of simple administrative tools and the concept of “managed properties” in SharePoint to support adding custom meta-data extractors very quickly.

 

3)      Structured Data Search

 

Structured data search is possible with both search options in SharePoint 2010, but FAST Search for SharePoint 2010 adds an extra level of sophistication for searching data that contains numbers, dates, and other encoded and structured information. To start, the full FAST Query Language (FQL) is available to application developers who want the richness and expressiveness that FQL provides. This includes support for numeric and date data types, formula-based query operations, term weighting with the XRANK operator, and much more. Also, integration with the new Microsoft Business Data Connectivity services in 2010 means that ingesting structured data from external Line of Business applications is much easier in FAST Search for SharePoint.

 

4)      “Deep” Refinement (Faceted Search)

Previously only available in SharePoint search through 3rd party add-ons, faceted search, called “refiners” in the default search interface (SharePoint Search Center), is now native in the out-of-box SharePoint 2010 search. FAST Search for SharePoint adds to this the ability to deliver faceted search across result sets of any size while retaining precise counts on the refinement facets. This is critical for research and analysis applications where precise counts on facets are important decision making criteria. (You can see examples of deep refiners on FAST ESP powered sites like scirus.com and dell.com.)
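To see why precise counts depend on how much of the result set is examined, consider this small Python sketch. It is a conceptual illustration only, with invented data and names: counting a facet over just the first few results (the `depth` parameter, an assumption for contrast) can miss values that a count over the full result set would report.

```python
from collections import Counter


def refiner_counts(results, prop, depth=None):
    """Count values of `prop`; depth=None scans the full result set."""
    scope = results if depth is None else results[:depth]
    return Counter(doc[prop] for doc in scope if prop in doc)


# Invented result set: two presentations ranked first, then three documents.
results = [{"type": "ppt"}, {"type": "ppt"},
           {"type": "doc"}, {"type": "doc"}, {"type": "doc"}]

top_two_only = refiner_counts(results, "type", depth=2)
full_set = refiner_counts(results, "type")
```

Here the count over the top two results reports only presentations, while the count over the full set shows that documents are actually the larger group.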

5)      Visual Search (Document Thumbnails and Previews)

 

Visual document thumbnails and previewer Web Parts will be out-of-the-box with FAST Search for SharePoint 2010 to help users more quickly judge what is relevant in a search result list. This includes a graphical previewer for PowerPoint presentations based on Microsoft Silverlight that allows users to quickly find the “one slide” of interest without having to open up the entire presentation.

 

6)      Advanced linguistics

The quality of search against text data is highly dependent on the ability to apply the right language-specific processing techniques. FAST Search for SharePoint 2010 builds on the FAST ESP heritage and Microsoft tools to include advanced language processing (linguistics) for dozens of languages, including optimized processing for Chinese/Japanese/Korean.

7)      Visual best bets

SharePoint already supports the concept of search Best Bets – managed results delivered with the search for specific queries. FAST Search for SharePoint adds to this the ability to render visual best bets in the form of images and even videos. Management of search best bets, both standard and visual, is through the standard SharePoint administrative console.

8)      Best-in-class development platform

FAST Search for SharePoint 2010 builds on the comprehensive development framework of SharePoint 2010. The customization options range from configuring out-of-the-box search behavior (best bets) and user interface controls (Web Parts), to extending existing functionality using public Web Part code and SharePoint Designer, to creating brand new components and functionality with the available APIs. For FAST ESP aficionados, no compromises have been made in the area of extensibility with FAST Search for SharePoint, but many of the customizations in ESP are now much easier to do.

9)      Custom search experiences (per user/profile)

 

FAST Search for SharePoint 2010 includes the same level of relevancy tuning available to FAST ESP. It will be possible, as it is in ESP, to create custom relevancy models tuned to differences in content sources, application needs, and user contexts. User context simply means that different users can have different search “contexts” that enable experiences optimized for their specific business needs. User context can be used to set the search sources, relevance rank profile, linguistic processing features, and other search features by user or user group. In an enterprise search setting, this means that a Sales Director does not have to see the exact same results as a Product Designer for a given query, even if they are searching the same sources.
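Conceptually, a user context is a set of search settings selected by who the user is. The sketch below shows that idea with plain Python dictionaries; the group names, setting keys, and values are all hypothetical and do not reflect the product's actual configuration model.

```python
# Hypothetical settings; real user contexts are configured in the product.
DEFAULT_CONTEXT = {"sources": ["intranet"], "rank_profile": "default"}

CONTEXTS_BY_GROUP = {
    "sales": {"sources": ["intranet", "crm"], "rank_profile": "customer_focus"},
    "design": {"rank_profile": "spec_focus"},
}


def resolve_context(user_groups):
    """Start from the defaults and apply each matching group's overrides in order."""
    context = dict(DEFAULT_CONTEXT)
    for group in user_groups:
        context.update(CONTEXTS_BY_GROUP.get(group, {}))
    return context


sales_director = resolve_context(["sales"])
product_designer = resolve_context(["design"])
```

The same query sent with these two contexts would search different sources or rank with a different profile, which is how the Sales Director and the Product Designer can end up with different results.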

 

10)  Extreme Scale and Performance

 

Scale and performance of the out-of-the-box SharePoint 2010 Search has been dramatically improved – with proven scalability to 100 million documents and more. For FAST Search for SharePoint 2010, the exact same scale-out model that exists in FAST ESP has been preserved to enable extremes of content (e.g. number of documents to search), queries (e.g. the number of queries or query rate), or both. This means search solutions that can support billions of documents and thousands of queries per second.

 

There is much to like about what is coming with search in SharePoint 2010 and more information than I’m able to share in one blog post. You can add to the list above the general benefits of search enhanced by all the other tools and services of the SharePoint platform - including content management, communication and collaboration, business intelligence, system administration and monitoring, application development, and so on. As I’ve pointed out in previous posts, search doesn’t exist in a vacuum, and the ability to integrate and interoperate with other business productivity functions is critical to actually acting on a search result. From this point of view, SharePoint, with its compendium of integrated services, simply makes search better.

The first public beta for SharePoint 2010 will be available in a few weeks. This will include beta bits for the standard search in SharePoint 2010 and FAST Search for SharePoint 2010. I hope you’ll be able to try out these new search products and features. In the meantime, you can learn more about what’s coming in search for SharePoint 2010 by going to the SharePoint 2010 preview site at http://sharepoint2010.microsoft.com/.

Nate

Enterprise Search and Bing Services – Part 1: The Bing Translator

(In May of this year, Microsoft launched its Bing search service for the Web. While Bing has shown steady growth in the Web search market, it’s not well known that Bing also includes a collection of services that can be accessed programmatically to enhance enterprise applications. This is the first of a series of guest posts that explore how to combine some of these Bing services with Microsoft enterprise search offerings.  nt)

 

If your organization has customers or employees in multiple regions around the world, chances are you have the need to search across content in multiple languages. Earlier this year Microsoft Research announced the Bing Translator AJAX API (http://www.microsofttranslator.com/dev/ajax/) – an interface that enables developers to integrate the translator into any Web-based application. In this article we will take a high level look at how to integrate the Bing Translator with enterprise search – specifically with the FAST Enterprise Search Platform (ESP).

The Bing Translator AJAX API is a remote service that currently supports 20 different languages. The API features include auto detection of language as well as translation between any two languages. For applications with secure data, the API supports the HTTPS protocol for secure connections over the Internet.

The screenshot below shows one example of a FAST ESP powered search application with results containing documents from different languages.

Cross language search

In this example, not only is the user’s query translated and expanded to include other languages (French, German, and Chinese), but the user has the ability to translate the teasers or the entire document using the Bing Translator. The search results also include query highlighting for each of the multiple translations of the query. Finally, the user can use the slider bar (or the visual navigator) to favor documents written in certain languages. Any slider action causes the result set to update automatically. The relevance control behind this slider widget is actually a feature of FAST ESP, but it shows another way of surfacing cross-lingual search.

There are various ways to display and expose query translation features to an end user, and the example above is just one. While this example applies query translation automatically, it’s better, in our experience, to allow the user to select it as an option. Alternatively, the application can display translated results in separate tabs.

Integrating with FAST ESP

This example implementation integrating the Bing Translator was done as a Query Transformer (QT) in the FAST Query Result Server (QRServer). Depending on the query, the query transformer can also suggest query translations to the user. Implementing the translator as a query transformer means it can be used with any of the supported ESP search APIs, across multiple UX implementations, and is platform independent.

query = office hours, translate = on, language = en

The query transformer changes the above input query to the multiple variations of the terms based on its original language. The input language is known since the users are usually coming from a regional portal or have a language preference set in their browser. The QT gets back multiple terms from the Bing Translator by connecting to the API through remote services (over HTTP). The original query is then expanded to search for all translated terms. All query terms are cached to minimize traffic going over the wire. Any other FAST query time linguistics features, like stemming, spell checking, and synonyms, will still apply on the translated terms.
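The expand-and-cache behavior described above can be sketched in a few lines of Python. This is not the QRServer query transformer interface or the Bing Translator API: the translation function is a stand-in, the language list is an assumption, and the quoted OR syntax of the expanded query is only illustrative.

```python
def make_query_expander(translate, target_langs):
    """Return a function that expands a query with its cached translations."""
    cache = {}

    def expand(query, source_lang):
        variants = [query]
        for lang in target_langs:
            if lang == source_lang:
                continue
            key = (query, source_lang, lang)
            if key not in cache:  # only go over the wire on a cache miss
                cache[key] = translate(query, source_lang, lang)
            translated = cache[key]
            if translated and translated not in variants:
                variants.append(translated)
        # Search for the original phrase or any of its translations.
        return " OR ".join(f'"{v}"' for v in variants)

    return expand


# Stand-in for the remote translation service (hypothetical data).
FAKE_TRANSLATIONS = {
    ("office hours", "fr"): "heures de bureau",
    ("office hours", "de"): "Sprechzeiten",
}
service_calls = []


def fake_translate(text, source_lang, target_lang):
    service_calls.append((text, target_lang))
    return FAKE_TRANSLATIONS.get((text, target_lang))


expand = make_query_expander(fake_translate, ["en", "fr", "de"])
first = expand("office hours", "en")
second = expand("office hours", "en")  # served entirely from the cache
```

Repeating the same query does not trigger any further calls to the translation function, which mirrors the goal of minimizing traffic to the remote service.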

This is just one example of integrating enterprise search with Bing services. If you’re interested in including this particular capability in your search application, or you have any more questions, please feel free to reach out to me at Runar.Olsen@Microsoft.com.

Runar Olsen, Senior Architect | Microsoft Enterprise Search Group, Services

Searching for Virtue - Virtuous Cycles as a Model for Successful Search Implementations

I like design patterns. I like the idea that there are right ways to do things and wrong ways (anti-patterns). Of course I understand that the world is not so black and white, but collecting and cataloging the collected wisdom of what works and what doesn’t when designing software systems seems like a very good idea to me.

I’ve written about this before.  A couple of months ago, I blogged about the increased interest in HCIR and best practice patterns for search user experience (UX). In that post I wrote:

Having a set of discrete and generic (UX) patterns is helpful, but even better will be having best practice patterns that are oriented toward specific business processes where search is used. Understanding these meta patterns in enterprise search is especially important in order to understand user experience differences between search for Research, search for eCommerce, search for Customer Service, search for eDiscovery, etc... Some of these differences are in the search features themselves, others are in how search interfaces with other non-search features and workflow (e.g. shopping carts in eCommerce or communication tools for collaborative research).

I’ve bolded one of the sentences in the excerpt above because it is a lead-in to the topic of this post.  

Most work on design patterns for search has focused on techniques for how people search, or methods for improving findability. This is important and relevant work, but is missing, imo, the higher order patterns that help us evaluate these applications in the context of why people search, not just how. More to the point, the question I’m looking to answer is whether there is a simple model we can use to test if an application of search is likely to be successful or not – that is, whether it will optimally help the user accomplish his or her task.

Enter “virtuous cycles”.

Virtuous Cycles Defined

I first heard the phrase virtuous cycles used in the context of information systems from Chris Pratley, General Manager of Microsoft Office Labs. The idea Chris has promoted is that users work with information within a “virtuous cycle” of Consumption, Creation, and Connection (see diagram below) and that designing for virtuous cycles is a key to the adoption and success of information systems.

Virtuous Cycle for Information Systems

I liked this model from the moment I first saw it. Not just because it’s simple and memorable, unlike other general “process models” I’ve seen, but because it’s actually useful.  In contrast, I remember the various cycles for “knowledge management” that were the rage 10 years ago. I’ve personally never felt that those diagrams were useful to anyone but the KM consultants who developed them.

The idea of a virtuous cycle for information systems is that an application is more likely to be used and successful if it helps a user to easily go from 1) consuming information to 2) creating new information from what they consume to 3) connecting this information with other users. Importantly, mechanisms for these steps to repeat indefinitely help to ensure that the application continues to get used. This reinforcement is what makes a virtuous cycle.

Why is the virtuous cycle model useful?

It’s useful because, with it, an application can at least be subjectively tested for whether it includes unwanted obstacles to any of the steps in the cycle. These obstacles are what prevent the user from easily completing his or her task, and if you know what the obstacles are you can fix them. The virtuous cycle idea gives you a model to help you recognize them.

Search and Virtuous Cycles

Since this is a search blog, I have to connect this idea of virtuous cycles to search.  I could argue that I already made this connection, at least partially, in a previous post about Actionable Search a few months ago. The point I made then was that people don’t search for the sake of searching; they search to accomplish a task or to achieve an outcome of some sort.

The Bing guys get this. Bing has been optimized to support common user tasks on the Web, with the initial release focused on eCommerce transactions (e.g. searching for digital cameras). Bing knows that when people shop online, they want to do more than just read through a list of blue links pointing to product pages. Part of the retail experience online includes product and price comparisons, inspecting (visualizing) the product, adding desired products to a shopping cart, and making a purchase.

Some of these things are part of the Bing experience, but there are limits to how far Bing, as a general Web search engine, can go. You can’t within Bing itself, for example, add products to a shopping cart, complete a product purchase transaction, or email what you find directly to a friend. Rather, that’s done at the individual eTail shopping sites that Bing makes searchable or using browser features*.

The applicability of virtuous cycles is much more obvious when viewed in the context of focused enterprise search applications. Research, in particular, is a general application that is ripe for virtuous cycles.  

*Many of the “actions” you might take on an item found in a Bing search can be handled directly or indirectly by the browser and do not necessarily need to be in the Bing application itself. For example, all major browsers support the notion of quickly emailing a link to a friend or group, so the “Connect” step is represented there.  IE 8 goes even further by supporting “Accelerators” – custom actions that can be applied to any text snippet on a Web page. For example, an accelerator exists for directly sharing information on a Web page to Facebook.

Example: Virtuous Cycle for Research

Research is a general workflow that maps very neatly to the Consume, Create, and Connect steps of the virtuous cycle model. It is also an example of a general process that clearly includes search, but that also includes more than just the steps of search and retrieval. Just finding something during research is not the end game.

Some of us remember the note cards we were told to use in school when we did research reports. The idea then was to write down on the cards any facts you found during your research, along with information about the source of each fact. Once collected, you would then synthesize the facts into a report. The technology may have changed, but the process hasn’t.  The sequence of general steps in research still goes something like this:

-          Search for and consume information about your research topic

-          Gather and synthesize the facts, include some of your own interpretation, and create a report of your findings

-          Publish, share, communicate, or otherwise connect your report with other people or content

The virtuousness of the cycle comes when your report, and the information and facts it contains, becomes something that you and others can use (consume) later to create new reports, insights, and knowledge - thereby starting the cycle over again. Wash, rinse, and repeat.

When Chris P talked about the virtuous cycle model for information systems, he referenced search as a step in the Consumption stage, but I’ll go further and say that search, or more generally, search-related technologies are enablers of each step in a virtuous cycle and that designing search with the entire cycle in mind is a generalized way of designing for search success.  For research, this means that search and related capabilities are relevant to helping find information (information retrieval) and discover, collect, and synthesize facts (text mining or information discovery). Search capabilities can even help communicate a report by identifying potentially interested colleagues (collaborative filtering and recommendation systems).

Connecting with the Microsoft Enterprise Search Vision

By now it should be clear that the “virtuous cycle” model involves capabilities that can include not just search but content authoring and management, collaboration, business intelligence, and many other IT disciplines. The Microsoft vision for enterprise search is for capabilities that are pervasive, intrinsic to, and supportive of every business process. This vision, combined with the virtuous cycle model, calls for both getting search right and for getting the integration points between search and other capabilities, like content management, right. This perspective is shaping Microsoft’s enterprise search roadmap for both “productivity” applications, like Research in its various forms, and customer-facing applications, like online retail and customer service.  The advantage that Microsoft has over other enterprise search providers in pursuing this vision is a complete “stack” of the capabilities that address each major step in the virtuous cycle model.

You can test the virtuous cycle idea yourself. In your enterprise search application (or any search application), what capabilities, search or other, are there to help the user through a virtuous cycle of information consumption, creation, and connection? What discrete capabilities are missing that you wish you had?

Nate

 

Coping with Hype in Enterprise Search Marketing

Not long ago, I was invited to participate in a customer’s annual conference. It was an amazing experience. I’ve been to conferences of all sorts, but I confess I’ve never attended an event quite like this one. Let’s just say that I’m used to... well... less energetic IT conferences. This particular company is *extremely* good at marketing and really understands the power of hype. The combination of pounding dance music, an elaborate stage setup, spectacular lighting, and, most importantly, well-crafted and super-hyped product announcements had the 20,000+ attendees in a frenzy.

Now, before you start thinking that I just insulted this customer’s business by using the word “hype” twice in describing their event, understand that I mean it as a sincere compliment. Hype, short for “hyperbole”, means “deliberate or extravagant exaggeration” and is a well-established and ancient promotional technique. Let’s face it, marketing hype has become so fundamental to our attention economy that, with a nod to Joel Grey and Liza Minnelli, we might say that hype, not money, makes the world go ‘round. Successful businesses, like this customer of ours, know how to walk the fine line between powerful marketing messages that attract customers and ridiculous exaggerations that turn them off.

 

After I got back from that conference, I started thinking about hype and its use in enterprise search marketing. I looked at what my own company has produced in the way of marketing material and then took a look at some competitors’ sites. I found nothing particularly outrageous. Some of us are more prone to hyperbole in our marketing than others, but relative to other industries, we are pretty tame and rather typical for the IT industry, I think.

Even so, I thought it would be interesting to look at the more common examples of the hyped-up statements I found and to try to offer my own translations. A translation looks something like this:

“He’s as big as a house!” (hyperbole)

Really means…

 “He’s a large man.” (translation)

So, here goes. The top 5 hyperbole statements in enterprise search marketing and my translations are below. I’ve paraphrased them, so don’t bother trying to do a Web search to find the sources. You may still find something if you do, but it’ll be a pure coincidence.

1.       “Access Any Content Source”

Really means…

We provide an application programming interface (API) that you can use to write content feeding applications for our platform. You can use this to submit documents, database records, or any other types of information objects in order to make them searchable, as long as the objects you send in comply with the API’s protocol and are in a format that our platform can recognize and translate.

We may also provide, directly or through partners, a set of content source “connectors” or “adaptors” to standards-based (e.g. Web, file system, database) and proprietary information (e.g. Lotus Notes, MSFT SharePoint) sources. Some of these connectors may work in a way that simplifies keeping the search engine in sync with the source. That is, they may be configured to periodically look for new, changed, and deleted records or documents in the source. Fewer of these connectors will work with the native update mechanism of the source system so that the search engine can be notified of additions, changes, and deletions the moment they happen. Lastly, still fewer of these connectors will respect and pass through to the search system the native access control information of the source so that searchers don’t see things they’re not supposed to see.

2.       “Infinite Scalability”

Really means…

Our enterprise search platform has been designed to scale, theoretically, to an infinite amount of content. This is because of an architecture that can be distributed across multiple machines and, therefore, can grow to include more capacity by adding hardware. You should not conclude, however, that it can scale up cheaply or easily. To get to very large volumes of content and queries, you may have to buy a lot of hardware and data center space. If the platform supports any features beyond very simple search, you may also find that turning on these different features impacts how effectively it scales – that is, what you can get out of your hardware investment. And it may or may not include consideration for scaling on both the content side and the query side, so if you grow the amount of content your system is serving, it’s possible your query capacity (the number of queries your system can handle within a unit of time) may actually decrease. (See previous post on search system scaling.)

3.       “Access to Any File Type”

Really means…

We have file format converters for a great many different file types, but then most vendors can legitimately claim support for more than 300 formats, including all the major office suite document formats and associated versions. In practice, most enterprises with intranet search application requirements will care about only a dozen or so of these – Web or HTML/XML documents, Microsoft Office formats, Adobe Acrobat (PDF), and a few more.

Image file formats can be searched if they have associated metadata, or if you incorporate optical character recognition (OCR) capabilities for scanned document images. The OCR feature may be provided directly or through a partner. Similarly, audio and video content may be searchable through metadata, or, if the ability to search full-text transcripts is desired but transcripts aren’t available, through speech-to-text conversion technology.

4.       “Support for over XX languages”

Really means…

We are very confident that our platform can handle the XX languages that use the standard character encoding sets we can handle, even if we maybe haven’t tested every one of those languages. However, more involved linguistic processing for things like synonyms, spell checking, entity extraction, and other advanced features may only be available for a small subset of languages and we may or may not provide 3rd party options or the tools to build these capabilities yourself for languages that we don’t cover.

5.        “Language Independent”

(A variant on the last one, but worth its own translation, I think.)

Really means…

The core of our platform is based on a statistical model for calculating the relevancy of items in a search result. Because it’s completely statistical, it is theoretically language independent. However, the quality of the search results is improved by, and some features depend on, knowing the language of the content and the query. For example, one of the most basic things a search engine might do is break the text in a document down to individual words and/or phrases in order to build an efficient retrieval model (index). Since the rules for how to do this “tokenization” vary by language (e.g. in Japanese, you can’t always rely on whitespace as a separator for words), it is helpful to have language-specific extensions. So, while you may be able to use the system with any language, it may not work very well and your users who depend on that language may complain – bitterly.
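To make the tokenization point concrete, here is a minimal Python sketch. It is an illustration only, not how any particular product works: the function names and sample sentences are made up for this example, and the character-bigram fallback is just one commonly used language-neutral approach.

```python
def whitespace_tokenize(text: str) -> list[str]:
    """Naive tokenizer: split on runs of whitespace."""
    return text.split()


def char_bigrams(text: str) -> list[str]:
    """Language-neutral fallback: overlapping two-character tokens."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


# Illustrative sentences (assumptions for this sketch).
english = "search engines index words"
japanese = "検索エンジンは単語を索引付けします"  # similar meaning, written without spaces

english_tokens = whitespace_tokenize(english)
japanese_tokens = whitespace_tokenize(japanese)

print(english_tokens)              # each English word becomes its own token
print(japanese_tokens)             # the whole sentence stays one undivided token
print(char_bigrams(japanese)[:3])  # bigrams at least give the index something to match
```

Whitespace splitting gives usable terms for English but leaves the Japanese sentence as a single token, so a query for one word inside it would not match. Bigrams avoid that failure, though a real language-specific segmenter will usually produce better terms.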

 Bonus – “100% Precision (or Accuracy)”

(Ok, I admit that I’ve only heard of a vendor making this particular claim once or twice and I have never seen it in writing, but it’s my personal favorite. I don’t know quite how to translate it, but maybe it’s like this… )

Really means…

Our search software has God-like powers. It can not only read your mind and understand your information need beyond what you can comprehend yourself, but it has access to all known and unknown sources of information in the Universe.

 

There you have a few examples - ones you may see, or hear, in one form or another in enterprise search marketing (except for that last one, hopefully). If you know of other examples, please share them. My hope in providing these translations is that maybe they will give potential buyers something to think about when sifting through marketing literature.

Nate

Posted by ntreloar | 2 Comments

A Focus on Search User Experience

It’s happening…  slowly … but it’s happening. 

Attention in search is finally shifting from a focus on low-level features and relevancy models to looking at the whole user experience for information access. I, for one, am very glad to see this trend. Of all the enterprise technologies out there, few are planted so squarely at the interface of humans and machines as search. And yet, for many users, the search input box and a list of blue links is still the pinnacle of a search user experience – a user interface model that hasn’t changed appreciably in over 10 years. There is room for improvement.

So what exactly is happening?  I’ll point out three things:

1)      New Books on Search User Interfaces and User Experiences

Three recently published books have put some focus on human-computer interaction (HCI) and search. The first book by Ryen White, from Microsoft Research, and Resa Roth, published earlier this year, covers the topic of exploratory search. From the abstract:

Exploratory search has gained prominence in recent years. There is an increased interest from the information retrieval, information science, and human-computer interaction communities in moving beyond the traditional turn-taking interaction model supported by major Web search engines, and toward support for human intelligence amplification and information use.

The second book, Daniel Tunkelang’s on faceted search, looks at a particular interaction pattern that is now a mainstay of most commercial search platforms. Daniel, co-founder and Chief Scientist at Endeca, can speak with some authority on the topic of faceted search since his company was essentially built on the idea.

The third book, and perhaps the most ambitious, is from Marti Hearst, a respected researcher in information retrieval and text mining at UC Berkeley, who has recently released for online reading (print version coming in September) a comprehensive review of search user interface research.

A general theme, with the first two books especially, is on user models for search that are interactive and iterative. They address, in part, the fact that users are not very precise in communicating their information needs in an ad hoc query. While there is some evidence that keyword queries are getting longer, the oft-referenced 2.3 term average query length still demands user experiences that don’t just try to deliver the best possible results on the first attempt, but that can help the user ask a better question through contextual navigation, iterative feedback and refinement options.

2)      New and Evolving Examples Online

Beyond the three more academic works above, there is also evidence that commercial search applications are focusing more on search-based user experience. In a post last month, I referenced a couple of Microsoft/FAST customers, Oodle and Globrix, who have put a particular emphasis on user experiences built completely around search. Other sites, like Getty Images’ Catalyst search, take advantage of the uniqueness of the domain (image search) to create rich and engaging experiences built on search.

On the wider Web, Microsoft launched the Bing “decision engine” in May with query disambiguation features built in. Even Google has relaxed its keep-it-simple position by adding search options to enrich the user experience. Compared to domain-specific enterprise search applications, the Web search engines are just beginning to dip their toes in the water, but otherwise the same theme exists:  search user experiences that are more interactive, iterative, and conversational.

3)      Search UI Design Patterns

Finally, the past couple of years have seen efforts to formalize UI design patterns for search. Peter Morville has championed this idea and posted a nice compendium of discrete search patterns with example screen shots on Flickr (also see his wiki). The idea of cataloging UI patterns for search is so that the good patterns - those that have been proven to work well and to result in a positive user experience - can be promoted and reused. There is also the concept of “anti-patterns”, or patterns that have been shown to have a neutral or negative impact on user experience.   (As an aside, Peter’s catalog of patterns focuses on GUI patterns – many of which will be familiar even to non-practitioners. In my post on search and Natural User Interfaces, or NUIs, I mentioned that these new “touch and gesture” UIs do not have established patterns yet for search. It is truly a greenfield and it will be interesting to see what patterns emerge.)

 

As I said, I’m a fan of this focus on user experience in search and also of the formalization of best practice design patterns. I’d like to see it all go a little further, though. Having a set of discrete and generic patterns is helpful, but even better will be having best practice patterns that are oriented toward specific business processes where search is used. Understanding these meta patterns in enterprise search is especially important in order to understand user experience differences between search for Research, search for eCommerce, search for Customer Service, search for eDiscovery, etc. Some of these differences are in the search features themselves; others are in how search interfaces with other non-search features and workflow (e.g. shopping carts in eCommerce or communication tools for collaborative research).  For example, the “product comparison” view is something common in eCommerce applications and, while not obviously a search UI element, its rendering is dependent on search results.

In time, I expect these meta patterns to evolve into user experience and UI templates (customizable) that will help organizations quickly stand up search front-ends that take into consideration not just how people search (functional patterns), but why people search (process patterns).

Nate

Observations from the Text Analytics Summit 2009

One of the hard parts about organizing a conference like the 5th annual Text Analytics Summit, held last week in Boston, must be selecting the industry case studies. Text analytics is a highly specialized, but broad reaching topic that has applications in life sciences, financial service, legal, retail, government, media, and entertainment, to name a few. Any one of these industries could have filled the conference with interesting examples.

As it was, most of the case studies and vendor briefings at this conference were about Voice of the Customer or Market Intelligence. I suspect that some attendees might have preferred a little more variety in the cases presented. The absence of any government case studies, for example, was conspicuous, but understandable given the special nature of that domain. We’d all probably have needed security clearances to attend those sessions anyway.  Overall, I appreciated the more commercial/consumer focus and felt that the conference organizers did a great job of finding representative examples and balancing the practical (vendor briefings and case studies) with the theoretical.

As a first time attendee to the conference, I was interested in just getting the lay of the land in text analytics, but I was also interested to learn how people were answering the “what’s next” question. It came up several times over the 2 days during Q&A and panel sessions and there were different takes, but I paid close attention to three, in particular, that resonated with my own observations looking through the lens of enterprise search.

Trend 1:   ETL-like Tools

Ok, this is not really a trend in text analytics, but it is one in enterprise search that is informed by text and data analytics.

Many of the vendors at the conference demonstrated graphical tools designed to simplify the process of building text analysis “pipelines”. These tools look very much like the Extract, Transform, and Load (ETL) tools that have been around for many years in the data integration world. The difference is that the text analysis versions of these tools focus on operations for handling unstructured text. For example, named entity recognition is a common text analytics task for automatically recognizing and tagging things like person names, company names, and locations in text.

This ETL “pattern” exists in enterprise search, as well, where information must be extracted from a source repository (e.g. an email archive), transformed into an enhanced, canonical representation (e.g. annotated XML), and loaded into a database or index for searching.  The demand for graphical tools to manage the ETL process for search has not been as high as for text or data analysis. I think this is partly because, for search applications, it is usually a one-time setup process and not an iterative modeling exercise as it is with text analytics. It may also be because historically the operations performed on content before it’s indexed for search have not been as sophisticated as the operations performed for in-depth text analytics.
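The extract, transform, and load steps for search can be sketched in a few lines of Python. This is a toy under stated assumptions: the `Document` class, the in-memory "archive", the regex-based email "entity extractor", and the dictionary-based inverted index are all hypothetical stand-ins for what a real content pipeline and index would do.

```python
import re
from dataclasses import dataclass, field

# Toy "entity extractor": a simple pattern for email addresses (an assumption,
# far less capable than a real named entity recognizer).
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class Document:
    doc_id: str
    text: str
    annotations: dict = field(default_factory=dict)


def extract(source: dict[str, str]) -> list[Document]:
    """Extract: pull raw records out of a source repository (here, a dict)."""
    return [Document(doc_id, text) for doc_id, text in source.items()]


def transform(doc: Document) -> Document:
    """Transform: enrich the document with annotations before indexing."""
    doc.annotations["emails"] = EMAIL_PATTERN.findall(doc.text)
    return doc


def load(docs: list[Document], index: dict[str, set[str]]) -> None:
    """Load: add each document's terms to a simple inverted index."""
    for doc in docs:
        for term in re.findall(r"\w+", doc.text.lower()):
            index.setdefault(term, set()).add(doc.doc_id)


# Hypothetical two-message email archive.
archive = {
    "msg-1": "Please send the contract to alice@example.com today.",
    "msg-2": "The contract review is scheduled for Friday.",
}

index: dict[str, set[str]] = {}
docs = [transform(d) for d in extract(archive)]
load(docs, index)

print(sorted(index["contract"]))       # documents containing the term "contract"
print(docs[0].annotations["emails"])   # annotations added during the transform step
```

The design point is that each stage is a separate, replaceable function, which is what makes graphical ETL-style tooling possible: a console only has to let you reorder or swap the transform steps.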

This is changing.  To start, extensible pipeline processing frameworks that incorporate advanced text analysis capabilities have become more common in enterprise search products. By now, most mainstream enterprise search platforms include entity extractors, for example. We are also seeing more ETL-like graphical consoles for managing content integration and analysis.

The adoption of these tools and techniques for enterprise search is motivated, in part, by a desire to more easily harness text analytics features that increase search precision and create richer search experiences. It’s also the case that, while text analytics shares a heritage more with information retrieval (search) than with business intelligence (BI), it includes technologies relevant to both and sits smack in the middle of the convergence between these two spaces. Sue Feldman and Hadley Reynolds of IDC reinforced this role of text analytics by describing it as a cornerstone of Unified Information Access during their Market Report at the conference. Given this, it shouldn’t be surprising to see that, as text analytic tools and concepts have found their way into BI applications, traditional BI tools and concepts, like ETL, are finding a place within enterprise search.

Trend 2:  Empowering the End User

Another topic that popped up at various times during the conference was the challenge of delivering the richness of text analysis tools to users other than specially trained analysts. As with traditional BI tools, many text analysis tools assume a trained user or “analyst” capable of designing sophisticated workflows or advanced analytical models. One question posed to a speaker after he finished describing his text mining process was “when do you think you’ll be out of your job?”  - meaning, when will the tools be so easy to use that your end users won’t need you to do their investigation for them?  

I’m sure this exact question was asked at a conference of professional research librarians some 15-20 years ago - back when online search services and later Internet search engines were becoming easier and easier to use and obviating the need for “professional searchers”. The answer is likely the same, too. There will always be specialists and “power users”, but as the tools become easier to use, end users will become more empowered to do their own increasingly advanced analysis.

In practice, we are seeing more applications that combine conventional search with advanced text analytics in ways that bring a more powerful search experience to relatively unsophisticated end users. Silobreaker.com is a clever site that combines the richness of text analytics within what is fundamentally a news search application. Unlike other news search sites, Silobreaker offers options and tools that help to uncover and discover interesting and potentially novel connections and patterns in the news. There are still some usability challenges with a consumer site like Silobreaker, but I like it as an example of ad hoc search converging with iterative knowledge discovery.

The trend toward empowering users with more than just a search box and list of blue links also reaches into less “analytical” consumer applications. Two examples are www.oodle.com and www.globrix.com. Both sites show the power of applying analytics to both structured and textual data (classifieds in the case of Oodle, real estate postings in the case of Globrix) in what are otherwise fundamentally search applications.  

Trend 3:  Taking Sentiment Analysis to the next level

Sentiment analysis is the ability to recognize the mood, opinion, or intent of a writer by analyzing written text. It is sometimes called the “thumbs up, thumbs down” problem because the most common application is establishing whether a writer is positive or negative on a particular subject. In this form, it is often used to analyze written product reviews (see this example on Microsoft’s new Bing Web search).

Sentiment was a much mentioned topic at the conference. This is not surprising given the focus on Voice of the Customer and Market Intelligence – two areas where accurately establishing the sentiment of customers and consumers toward products, services, and brand is highly desirable. One of the presenters at the conference was Roddy Lindsay from Facebook. I missed that session, but it doesn’t take much imagination to appreciate the possible applications for text analytics and sentiment analysis, in particular, with the information available on Facebook and other social networking platforms.

Every vendor present had something to show or say on the subject of sentiment analysis, but all the panelists in the vendor-only panel acknowledged the difficulties of increasing the precision of sentiment classification. Currently, the number tossed around is 80%. That is, a sentiment classifier will get it right about 80% of the time compared to human judgments. This number is higher in some applications - for example, when analyzing short, strongly opinionated product reviews. It is lower when analyzing longer pieces of text where just fixing the subject can be difficult – like this blog post.

Progress is being made, though. The first step has been a shift away from “document-level” sentiment to “topic-level” sentiment. This allows sentiment classification to be more accurate when confronting documents, like this post, that touch on and offer opinion on multiple topics. It also helps with more concrete problems like the ones represented in this sentence:

“Acme’s new P40 digital camera has a good viewer, but its controls are awkward.”

While it’s relatively easy for a human, it takes some heavy linguistic lifting for a machine to recognize that the sentiment of this opinion is directed not just at Acme or at the P40 digital camera, but specifically at the viewer (positive sentiment) and the controls (negative sentiment). It’s even trickier establishing what the word “its” refers to in the second part of the sentence. Is it the Acme P40 itself, or just the viewer?
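As a rough illustration of topic-level (as opposed to document-level) sentiment, the Python sketch below splits the example sentence into clauses and attaches a polarity to whatever aspect term appears in each clause. Everything here is an assumption made for the example: the tiny word lists, the aspect vocabulary, and the clause-splitting rule. It also sidesteps the hard part entirely, since it makes no attempt to resolve what “its” refers to.

```python
import re

# Hypothetical, hand-picked vocabularies for this one example.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"awkward", "poor", "bad"}
ASPECTS = {"viewer", "controls"}


def aspect_sentiment(sentence: str) -> dict[str, str]:
    """Assign a polarity to each known aspect term, clause by clause."""
    results: dict[str, str] = {}
    # Treat commas and the word "but" as clause boundaries.
    for clause in re.split(r",|\bbut\b", sentence.lower()):
        words = re.findall(r"[a-z0-9']+", clause)
        if any(w in POSITIVE for w in words):
            polarity = "positive"
        elif any(w in NEGATIVE for w in words):
            polarity = "negative"
        else:
            continue  # no opinion word in this clause
        for word in words:
            if word in ASPECTS:
                results[word] = polarity
    return results


review = "Acme's new P40 digital camera has a good viewer, but its controls are awkward."
sentiments = aspect_sentiment(review)
print(sentiments)  # viewer is tagged positive, controls negative
```

A document-level classifier would have to collapse this sentence into a single label, which is exactly the loss of information that the shift to topic-level sentiment is meant to avoid.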

Sentiment is admittedly a niche topic, even within text analytics, but getting it right matters a lot for enterprise search applications in eCommerce (think product reviews), Market Intelligence (reputation tracking and competitive intelligence), eDiscovery, and Government Intelligence. One presenter suggested that all the remaining hard problems in sentiment analysis will be solved, at least academically, in a couple years. It will be interesting to see how soon these improvements surface in real-life applications.

Nate

Thinking Big – Search Scale and Performance on a Budget

I recently came across Paul Nelson’s informative post on search scalability. I don’t know how long it’s been up there, but reading it made me think of customers I’ve spoken with recently who are looking to scale up their search deployments, but, due to tight budgets, want to do so without simply buying more hardware.

Paul focuses on document count as the main consideration for architecting scalable search, saying:

There is really only one dimension of size: The total count of documents in the system.

He goes on to describe several useful strategies for scaling search for “large” systems – those with document counts of >500 million. Importantly, imo, he also points out that even medium sized systems (10-100 million docs) will have special scaling needs depending on their performance requirements:

If these systems have any kind of query or index performance requirements — for example, it is a public web site with 10-30 queries per second, or that new documents arrive at a rate of 10 documents per second — then you will likely need an array of machines to handle your needs.   

I mostly want to reinforce and build on this second point. Effectively scaling search means getting the most out of your search infrastructure (i.e. maximizing the number of documents per unit of hardware), but scale and performance are two sides of the same coin, and whether a system can squeeze ten thousand or ten billion documents on a machine, it must still satisfy the application’s performance requirements.

If you can’t just add hardware, what then? Well, there are still options for getting more capacity out of a search system that provides the right level of control for optimization and tuning. Understanding these options requires understanding how search system performance is measured and the associated trade-offs that exist. Paul alludes to some of these trade-offs, but it’s worth providing a few more details and examples to drive this point home.

Search System Performance Metrics

Metrics for search system performance typically fall into two categories: query performance and indexing performance. In turn, these categories each have two measures associated with them:

Query performance

·         Query latency (or response time) – the time it takes for a query to be processed and results to be returned.

·         Query rate – the rate at which the system can process queries. Usually measured in queries per second (or QPS).

Indexing performance*

·         Indexing latency – the time it takes for a document to be indexed and made available to search.

·         Indexing rate – the rate at which the system can process and index documents. Measured in documents per second.

*Indexing performance assumes systems that actually create an index or some other sort of database optimized for information retrieval. This rules out “federated search” engines, which rely on other systems to create and manage these indices.

There are some variations on these measurements. For example, you can track average or peak values for each.  Document count per node (where a node = a Processing/Memory/Storage unit on a network) impacts all of these measures, but there’s a balance between query performance and index performance that also influences how many documents you can squeeze onto a single node.  The perhaps obvious explanation is that the more system resources you allocate to serve query performance, the fewer resources you’ll have available for indexing, and vice versa.
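The four measures above, along with their average and peak variations, can be computed from simple timing data. The Python sketch below is a minimal illustration with made-up numbers. The `summarize` function and `PerformanceSummary` class are hypothetical names, and a real system would collect these values from its own logs or monitoring tools.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class PerformanceSummary:
    average_latency_s: float
    peak_latency_s: float
    rate_per_s: float


def summarize(latencies_s: list[float], window_s: float) -> PerformanceSummary:
    """Summarize latency (average and peak) and rate for events seen in a time window."""
    if not latencies_s or window_s <= 0:
        raise ValueError("need at least one measurement and a positive window")
    return PerformanceSummary(
        average_latency_s=mean(latencies_s),
        peak_latency_s=max(latencies_s),
        rate_per_s=len(latencies_s) / window_s,
    )


# Hypothetical data: five queries observed during a 2-second window.
query_summary = summarize([0.12, 0.08, 0.30, 0.10, 0.15], window_s=2.0)

# Hypothetical data: four documents indexed during a 10-second window, where
# each latency is the time from submission until the document was searchable.
indexing_summary = summarize([4.0, 6.0, 5.0, 5.0], window_s=10.0)

print(query_summary)     # query latency (average, peak) and query rate in QPS
print(indexing_summary)  # indexing latency (average, peak) and documents per second
```

The same summary works for both categories because query and indexing performance share the same two dimensions: how long each operation takes, and how many operations complete per unit of time.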

Applications with rapidly changing content or with very time sensitive data place high demand on indexing performance. Other applications, like highly trafficked Web sites, place high demand on query performance. Different applications place different demands on scalability depending on the performance requirements across these dimensions. To take a specific example, consider an eDiscovery application that provides search across 100s of millions of archived emails. The query rate and indexing latency requirements for this type of application are typically lower than what a reasonably popular social networking site with an equivalent document count might see. As a result, eDiscovery search applications are able to squeeze more documents per node than highly trafficked Web sites – even if they serve the same total number of documents.

For another comparison, large eCommerce sites can have extreme query performance requirements - in some cases handling several thousand queries per second during peak traffic times, while still delivering sub-second responses. Even with these extreme query requirements, these sites can have relatively modest indexing performance requirements when compared to, say, financial news applications where content “freshness” and, so, low index latency are a priority.

Impact of Features

An often neglected factor that impacts performance is feature set. Features like faceted searching, results clustering, automatic query completion, and advanced query operators can each add incremental overhead to indexing performance, query performance, or both, depending on the feature and the system. For example, queries used for eDiscovery are sometimes crafted by teams of lawyers. This can result in queries made up of dozens or even hundreds of carefully selected search terms combined in a maze of (also carefully selected) Boolean, proximity, and other types of search operators.

I remember one FAST partner describing how their legacy eDiscovery tool (built on relational database technology) took up to 2 weeks to process a particularly long and complex query. Needless to say, they were delighted when we demonstrated the same query taking only a few seconds. It was not sub-second, but the point is that they would have been happy with this particular query if it came back in a few hours. In fact, our conversations on optimization included whether we could squeeze more capacity (docs per node) by relaxing the query response time requirements to 10-15 seconds for these queries in their application.

Some search systems are better (faster) than others, but parsing and evaluating very long and complex queries will generally take more cycles and resources than the usual 1 or 2 term ad hoc query. Relative to absolute document count, the individual impact on performance and scale of any one feature may be small, but taken as a whole and for certain applications, like the one in the example above, they can represent meaningful tuning options.

Know Your Options

The moral of the story is that getting enterprise search scale and performance right for large systems can be somewhat nuanced - especially if you’re on a tight budget. If you’ve embarked on, or are about to embark on, a large-scale enterprise search project, make sure you understand these performance considerations. Best-of-breed enterprise search platforms support many tuning strategies that factor in all the key dimensions of search performance and scale. Read your system’s deployment guide (if it comes with one) to understand these options.

Lastly, if you’re not sure if your project has what might be considered demanding scale or performance requirements, consider getting some expert advice. Below are some good online forums you can tap for expert advice and to get a sense for whether your system might be considered “demanding”.

http://tech.groups.yahoo.com/group/search_dev/  (Search Engine Developers group on Yahoo)

http://www.linkedin.com/groups?gid=161594  (Enterprise Search Engine Professionals on LinkedIn)

Nate

Posted by ntreloar | 3 Comments

Actionable Search – From What to Why?

Day 1 at the Enterprise Search Summit in NYC is wrapping up and I’ve just listened to Lisa Denissen from Shearman & Sterling talk about Actionable Search. Actionable search is a key tenet of Microsoft’s enterprise search strategy, so it was good to see promotion of the concept.

For many organizations, just adding basic, no-frills search to an intranet can have a big impact on employee productivity, but to really create an optimal search experience it helps to understand the processes that drive users to search in the first place. Too often search is treated as an end unto itself, without consideration for the larger processes that it ultimately serves. Users care about finding relevant information, sure, but they care even more about using that information to complete tasks and achieve outcomes. These tasks and desired outcomes are what ultimately define success for an enterprise search application and, it may be argued, for any type of search app.

Understanding what motivates people to search means going beyond capturing requirements like “I need to be able to search all of Product Marketing’s PowerPoints” to addressing more precise needs like  “I need to quickly assemble targeted presentations for sales prospects based on existing marketing material”. This second statement doesn’t sound like a search problem, but it speaks clearly to a desired outcome (“targeted presentations”) and the word “quickly” suggests that search may offer some help here. Importantly, the statement also focuses on the question of why the user is searching, not just what they hope to find.

The phrase “actionable search” refers to the idea that items in search results can be directly acted on in a way that moves the searcher toward completion of a specific task – an outcome. While general Web search engines have us accustomed to result sets that contain only links to relevant Web pages, the richness of applications and content types in the enterprise and on targeted Internet sites promises a bit more than just a blue link. eCommerce sites have supported actionable search for years by allowing users to directly add items from a search result to a shopping cart. Facebook provides contextual actions directly from its general search results that let you Join Groups, Add Friends, Join Events, or Send Messages. To take the earlier example, once a relevant PowerPoint presentation is found, an actionable search experience would offer the user help with the next steps of finding the right individual slide and then quickly incorporating that slide into their work-in-progress presentation.

One argument for enterprise search starts with the question “What good is enterprise content management and collaboration if you can’t easily find the information you create, manage, and share?” We might switch the question around and ask, “What good is enterprise search if you can’t easily act on the information you find?” Actionable search promises to close this gap between information access and outcomes.

Nate

Search and Natural User Interfaces - Part 2

In my first post on this subject last week, I referred to a scene in the movie “Minority Report” as a visionary example of a natural user interface (NUI) and, more to the theme of this blog, a visionary example of ad hoc search within a NUI. I realize that I didn’t offer a definition of NUIs in that post, so, before I go back to the search connection, here’s a quick primer.

NUIs Defined 

Natural user interfaces or NUIs rely on natural expressions like touches and gestures to directly and intuitively control the experience of a software application. The word “natural” means that the interaction is not controlled through an artificial device, like a mouse or keyboard. (I take this to imply that a Nintendo Wii is not an example of a NUI, since there are still artificial controllers involved. Other opinions and thoughts on this are welcomed).

NUIs have been described as the next evolutionary step in human-computer interaction – the successor to graphical user interfaces (GUIs), which succeeded command line interfaces (CLIs), which succeeded physical input devices like card readers. Touch screens on hand-held devices are the most common examples of NUIs, but there are a number of other emerging NUI platforms and technologies. This article on touch computing from PC Magazine offers a catalog of some of the systems currently available.

Microsoft Surface 

One of the technologies mentioned in the PC Magazine story is Microsoft Surface.  Microsoft Surface is a Windows powered device in the form factor of a table - a coffee table, if you will - with a surface that supports touch and gesture interaction. There are other NUI platforms, but there are a couple things that make Microsoft Surface different and interesting.

First, the Microsoft Surface form factor and interface are designed to allow multiple users to interact with the device at the same time. The interface can detect and track dozens of touch points simultaneously. It can even recognize the orientation of fingerprints and infer, in turn, the physical orientation of a user relative to the display. Because of these capabilities, many applications created for Microsoft Surface emphasize multi-user collaboration and interaction – for example, there are multi-user games and other collaborative consumer applications for things like music and picture sharing.

Second, Microsoft Surface devices have built-in cameras that can not only track touches and gestures, but can also recognize digitally tagged objects and initiate specific actions when these objects are placed on the table. For example, Infusion Development has created an application designed to enhance the doctor-patient consultation experience. By placing a tagged card on Microsoft Surface, doctors can access and use interactive cardiac images, dynamic charts and clinical documents to help explain medical conditions and procedures to their patients.

NUIs:  Where’s the Search?

I was wowed by my first experience with Microsoft Surface - as many are when they first get a chance to play with one - but being a search guy, I looked for applications that included some sort of search function. Of the NUI applications I’ve seen to date, whether on Microsoft Surface or in other NUI technologies, very few provide true ad hoc search. In one or two examples I’ve seen, a virtual keyboard is used to enter search terms and traditional GUI search metaphors are used to render search results. More often, though, finding information requires the user to navigate through some pre-defined structure. Even this TouchWall demo by Bill Gates from last year’s CEO Summit focused on navigation. Where’s the search?

I’ll grant that structural navigation metaphors in NUIs are really cool and work pretty well. For example, I’ve seen a medical app that allows you to visually navigate a representation of the human body to explore different anatomical concepts. You can tap on the virtual head to explore the brain and then drill down further to learn about neurons. It looks like a fun and interesting way to explore human anatomy, but the problem with this navigation-only approach is that if you don’t happen to know that neurons are in the brain, it will take you a while to find them. It is browsing, not ad hoc search and, as we learned from the Yahoo Directory experience back in the 90s, people tend to prefer searching over browsing.

A Prototype and a Request

At our FASTforward’09 user conference in Las Vegas in February, we showed a prototype application, built in collaboration with a very sharp team of developers at EMC Consulting, which brought together ad hoc search and the natural user interface experience of Microsoft Surface. You can see a short video of this demo here, or the longer keynote presentation from the event here.

 

When Mark Stone, Global Enterprise Search Lead at EMC Consulting, and I first conceived this demo, we were inspired by three things:

1)      The dramatic growth and potential of NUI technologies, particularly Microsoft Surface.

2)      The dearth of search examples in all these NUI applications.

3)      The potential for creating transformative user experiences that combine search and NUIs.

You can judge for yourself how successful the team was in combining ad hoc search with Microsoft Surface by looking at the demos, but one thing is for sure: we were in uncharted waters when building this app. The user interface patterns for search within a NUI are not well established. Even without considering search, building user interfaces in Microsoft Surface requires setting aside the old GUI models and learning brand new patterns and metaphors. As for search in a NUI, well, what are the equivalents to the search box, the search result list, navigation facets, document links, and all the interaction patterns around these “controls”? How can we use a 3rd dimension (“depth”) and what role does “zoom” play in search? Working within a NUI environment even challenges the basic containers of information. Should you first show documents, or just extracted facts and information summaries? All these questions and more came up during the development of this prototype. Some of the answers are now known, or at least we have a better feel for the right direction to go, but others require more research and experimentation.

There is an opportunity here, and a challenge to be met by the search community. NUIs are here to stay and are demanding new patterns for true ad hoc search that satisfy the intuitive and natural interaction requirements of these environments. Reverting to browsing metaphors is not the answer; nor is simply recreating the GUI patterns of keyword search boxes and lists of blue links.

I’m very interested in this topic and am on a hunt for any good examples of true search within NUIs. If you know of an example, please send whatever pointer you can - links to demo videos, screen shots, academic papers, … anything. You can respond to this post or email me directly.

 

I look forward to seeing your examples and will summarize what I find in a future post.

In the meantime, I feel like we need a new name for search interfaces within NUIs. I like the phrase “Natural Search Interface” used by the Microsoft Germany Partner site in reference to the Microsoft/EMC Consulting prototype. I’ll use that.

Nate

Search and Natural User Interfaces (NUIs) - Part 1

About five years ago or so, I participated in a conference panel where the question was asked: “What will search interfaces look like 20 years from now?” I had just seen Steven Spielberg’s sci-fi film “Minority Report” starring Tom Cruise, so I referred to the scene where Cruise’s character is interacting with a futuristic-looking visual display and using appropriately dramatic gestures to grab, spin, shrink, expand, and otherwise manipulate the various news stories and images floating on the display.

I heard later that Spielberg, while developing the script for the film, had consulted a number of futurists to create as realistic a picture of the year 2054 as possible (from the point of view of those futurists at least). Interestingly, over the past several years, that scene has become a conceptual benchmark for so-called natural user interfaces (NUIs), to the point where if you search for “minority report” in your favorite Web video search engine you’re as likely to find examples of prototype NUI products as you are trailers for the actual film. It’s not a stretch, imo, to say that the film has inspired and perhaps even accelerated advancements in NUI products and technology.

There are now many good and real examples of NUIs and even some actual products that come close to the vision in "Minority Report", but despite the impact the film appears to have had on the development of NUIs, there is a very strong connection to search that gets overlooked. Cruise’s character in that scene is searching. His various gestures and other contortions are queries, navigation, and refinements intended to help him find answers and collect information. Granted, the depiction is not quite up to the vision of the smooth-voiced computer on Star Trek, but it’s a step beyond the keyboard and mouse and, if you look past the theatrics, I think it paints a realistic view of not just the future of natural user interfaces, but of the type of natural search-driven user interfaces we will be seeing soon… in much less than 20 years’ time.

Nate

One Year with Microsoft – a FAST Perspective

After years of writing customer proposals, internal memoranda, and various stuffily formal documents, it feels like a luxury to be able to just write what I think about enterprise search.  It’s actually part of my job these days and I’m looking forward to sharing a perspective from 13 years in the industry – the past 6 years with FAST and, most recently, with Microsoft.

As a reminder, it’s been more than a year since the original offer came down from Microsoft to acquire FAST. To be precise, the bid was announced on January 8th, 2008 and the deal closed on April 25th, 2008. The FAST team now makes up a large part of the new Enterprise Search Group (ESG) within the Microsoft Business Division (MBD) – the division that makes SharePoint, the Office line of products, Exchange, etc.

When I get asked about my reaction to the FAST acquisition by Microsoft, I tend to point out that, while those of us in the business have always understood the value of search, nothing says “Attaboy!” like having the largest software company in the world take notice. Maybe we could ask why it took so long, but even if you didn’t happen to work at FAST, you can’t help but feel that Microsoft’s move is validation of our growing corner of the IT industry.

I admit that the answer above, while maybe heartwarming, doesn’t get to the core of what people really want to know. Not surprisingly, folks are more interested in Microsoft’s vision for enterprise search and plans for the FAST people, products, partners, and customers than they are in my emotions. Now, with a year under my belt at Microsoft, I have a few more insights to offer than just the initial “nice validation” response.

In his keynote presentation at FASTforward’09 in February, Kirk Koenigsbauer addressed three key topics related to Microsoft’s interest in enterprise search (a transcript of Kirk’s keynote can be found here). These were:

·         Commitment (to enterprise search)

·         Vision

·         Product Plans

These topics provide a useful framework for sharing my own observations.

Commitment

There are a number of anecdotal facts that point to Microsoft’s commitment to being a leader in enterprise search. Kirk shared a few of these in his keynote – things like the percentage of Microsoft Research investment going to search (approx. 15%), the size of the Enterprise Search Group R&D organization (several hundred engineers and growing), and of course the investment itself to acquire FAST (US$1.2B). There are other supporting data points, like the announcement of Oslo (FAST’s headquarters) as a key R&D center for business search.

Any one of these facts is a strong indication of Microsoft’s ambitions in this space, but my take is that the evidence of Microsoft’s commitment to search comes from more than these metrics or executive statements. It comes from a growing grass roots interest in search across all of Microsoft.  For example, I often get a question like this from customers and partners:

“Have you guys talked with the folks over in Microsoft’s <product name> team?”

…and then…

“Man, you should, because FAST technology added to what they’re doing would be a powerful combination.”

The usual answer is, yes, we’ve talked to the <product name> team and, yes, there are some very interesting ideas and even some specific activity that we mostly can’t talk about yet. In fact, what’s been most interesting and fun for us former FAST folks is the breadth of technologies that we can now include in our conversations with customers and partners. SharePoint is the “hero SKU”, as we say here, and the combination of FAST search with the capabilities of SharePoint makes for an impressive offering for both intranet and Internet applications that are focused on helping people consume and use information.  It’s not a leap to recognize that Microsoft has something to offer at almost every level of an IT solution “stack” complementing the capabilities of both SharePoint and search – from the operating system to application development tools and even cloud-based services. To put it in perspective, ask yourself how many companies offer both a world class enterprise search platform and a world class relational database.

To be honest, search is such a generally valued concept and the possibilities are so compelling when it’s combined with other Microsoft products and technology that it’s all we can do to stay focused on our main priorities. It’s a good problem.

Vision

At some point prior to the acquisition, the Microsoft enterprise search team came to a vision of search that matched what we had developed at FAST. Specifically, that search is more than just a search box and a list of blue document links, but represents a set of capabilities that are enabling new ways to engage users by creating personalized, conversational experiences that cater to the way people prefer to consume and interact with information. This vision was behind the principal theme for the FASTforward’09 conference this past February – “Engage Your Users”.

Whether the original Microsoft team came to this vision independently or after talking to FAST folks (ego would like to think the latter) is less important than the fact that it is now a shared vision throughout the Microsoft Enterprise Search Group and is shaping how we are investing in product development. It’s also a vision that is permeating into other areas within Microsoft. For example, I recently had a chance to apply this way of thinking about search to some other very interesting Microsoft technology, Microsoft Surface, but that’s a topic for another post.

Product Plans

At FASTforward’09 we announced our plans to target enterprise search in two areas: 

·         Business productivity – applications inside the firewall where, in particular, SharePoint provides the framework for content management and collaboration.

·         Internet business – “outside the firewall” applications for attracting, retaining, and otherwise monetizing customers.

The intention is to have a common search platform supporting both of these general markets and to include application-specific capabilities and templates that are unique to each. FAST had already started down this path. For example, FAST AdMomentum is an ad platform that interoperates with search and is relevant to monetization strategies in Internet Businesses, but is not as obvious a fit for inside-the-firewall apps.

This relatively straightforward strategy and message was very important to get out to the FAST customer base, especially given that Internet Businesses have made up well over half of FAST’s business to date. Also, most industry pundits will tell you that the requirements for search inside the corporate firewall are simply different than search in consumer facing applications. Even so, what’s so promising to me about this strategy is that there are opportunities to “bleed” capabilities between these two application spaces. We saw this “consumerization” of search features happen more than once at FAST. Features that we initially designed for consumer search found their way into intranet search deployments (one simple example is the “best bets” concept like the one found in SharePoint). The opposite has also happened. Now, consider the capabilities in SharePoint, which is already powering many consumer facing Web sites, and you can see where this can lead.

There you have it, my first post for the Microsoft Enterprise Search Blog. Look for more posts from me in this general category of enterprise search vision and strategy. I welcome all comments on this and future entries.

Next up – Search plus Natural User Interfaces.

Nate

Posted by ntreloar | 2 Comments

Microsoft Presents FASTforward’09: Engage Your Users

The Mirage, Las Vegas, Feb 9-11

Since its inaugural conference in 2006, FASTforward has been a venue for thought leadership and innovation in the field of search. This year, FASTforward’09 is the industry’s largest business and technology conference dedicated to search-driven innovation. Join the discussion! At FASTforward’09, we explore how businesses are responding – and evolving – in the face of rapid technological change and the growing demands for user control. As The User Revolution continues, we examine search’s critical role in helping companies engage their users. This year’s conference will also highlight Microsoft’s vision for enterprise search technology.

New this year, a SharePoint technology track covering Enterprise Search, Social Computing, Enterprise Content Management and more!  Other tracks include:

  • Monetization via Search (customer-facing)
  • Productivity via Search (internal enterprise)
  • FAST technology
  • Partner Solutions

Top Ten Reasons Why You Should Attend FASTforward’09:

1. Uncover new opportunities for using search

2. Hear what others have done with search technology

3. Learn industry best practices for search

4. Hear the Microsoft vision for search and FAST

5. Learn how SharePoint and FAST products are positioned

6. Gain insight on integration plans for SharePoint and FAST products

7. Understand how partners can help

8. Obtain access to Microsoft and FAST executives and industry luminaries

9. Network with colleagues

10. Attend convenient pre-conference technical training

Come spend three days with us at the Mirage in Las Vegas learning from industry thought leaders, customers, partners, and our own Microsoft experts!

Learn more at FASTforward’09. Register before January 9 and receive $400 off of the full registration fee. See you there!

Microsoft positioned in the Leaders Quadrant of the 2008 Information Access Magic Quadrant

We’ve got great news to share! Last month, Gartner published the 2008 Magic Quadrant for Information Access Technology, and Microsoft was positioned in the Leaders Quadrant. Since the completion of the acquisition, we’ve worked incredibly hard to communicate and demonstrate a combined vision and strategy to our customers and partners. It’s good to know we’re heading in the right direction!

When I talk with customers about search, it’s clear that organizations have very different needs. In fact, many people tell me that even within an organization the one-size-fits-all approach just doesn’t work. So over the last year, we’ve announced some bold moves designed to create a compelling portfolio of search applications. With the addition of Search Server Express and the acquisition of FAST, we now have a product line-up designed to meet a broad range of business needs:

  • Some departments or small organizations need search that is quick and easy to set up; we offer Microsoft Search Server Express as a free download so that you can get it up and running in about 30 minutes. We’re excited to see customers like St. Jude Medical and Urbis having quick successes with Express. We’re also seeing partners, such as StartReady, build solutions around Search Server Express to create a search appliance.
  • Many organizations need search as an integral part of a business productivity infrastructure; Search in Microsoft Office SharePoint Server is integrated with other key SharePoint productivity workloads such as portals, collaboration, ECM, business processes and BI. Customers like McCann Worldgroup and Jones Lang LaSalle are all deriving productivity increases with better search in SharePoint. In particular, both companies are promoting collaboration and leveraging in-house experts with people search enhanced by user profiles in MySites.
  • Some organizations face business problems that demand high-end search; FAST ESP offers best-in-class search with extreme scalability, query performance, and other advanced capabilities for sophisticated customer-facing or inside-the-firewall applications. For example, Aerotek and TEKsystems, two of the world’s largest staffing companies, deliver job searching to more than 1.3 million users. In more than 164 million queries, greater than 99.5% of query results came back in less than 2 seconds. For inside-the-firewall productivity, they index more than 10 million complex candidate records with low latency during high volume index updates. We’re also excited to see Pfizer pushing the envelope with an Enterprise Collaboration Framework driven by FAST ESP on top of SharePoint.
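A figure like “greater than 99.5% of query results came back in less than 2 seconds” is a statement about the share of queries that finish under a latency threshold. As a rough sketch of how such a figure could be checked against your own query logs, here is a small Python example. The function names and the sample latencies are hypothetical and made up for illustration; they are not taken from FAST ESP, SharePoint, or the customer deployments mentioned above.

```python
from typing import Sequence


def fraction_within(latencies_s: Sequence[float], threshold_s: float) -> float:
    """Return the share of queries that finished in under threshold_s seconds."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    fast = sum(1 for t in latencies_s if t < threshold_s)
    return fast / len(latencies_s)


def meets_target(latencies_s: Sequence[float],
                 threshold_s: float = 2.0,
                 target: float = 0.995) -> bool:
    """True if more than `target` of the queries beat the threshold."""
    return fraction_within(latencies_s, threshold_s) > target


# Hypothetical sample: 996 fast queries and 4 slow ones (values in seconds).
sample = [0.4] * 996 + [3.5] * 4
print(fraction_within(sample, 2.0))
print(meets_target(sample))
```

With real logs you would feed in measured response times per query; the threshold and target defaults simply mirror the numbers quoted in the bullet above.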

While our “Leaders Quadrant” position in the Magic Quadrant is an important milestone, we still think of this as the very beginning of our journey. We’re continuing to combine our deep technical expertise with our broad reach to deliver exciting innovations to the market – so you can and should expect great things to come. Stay tuned!

Kirk Koenigsbauer
General Manager,
SharePoint Business Group

Magic Quadrant for Information Access Technology (Gartner Research, Sept. 30, 2008) Microsoft is positioned in the Leaders Quadrant of Gartner, Inc.'s 2008 Magic Quadrant for Information Access Technology. This report assesses vendors with capabilities that go beyond enterprise search to encompass a range of technologies. Their capabilities include search; federated search; content classification, categorization and clustering; fact and entity extraction; taxonomy creation and management; information presentation (for example, visualization) to support analysis and understanding; and desktop search to address user-controlled repositories in order to locate and "invoke" documents, data, e-mail and intelligence.

The Magic Quadrant is copyrighted 2008 by Gartner, Inc. and is reused with permission. The Magic Quadrant is a graphical representation of a marketplace at and for a specific time period. It depicts Gartner's analysis of how certain vendors measure against criteria for that marketplace, as defined by Gartner. Gartner does not endorse any vendor, product or service depicted in the Magic Quadrant, and does not advise technology users to select only those vendors placed in the "Leaders" quadrant. The Magic Quadrant is intended solely as a research tool, and is not meant to be a specific guide to action. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
