In today’s post, we’re going to dig into a very useful feature of the Windows Phone 8.0 speech platform, specifically how to use wildcards with the VoiceCommandService.

As those of you following along already know, in our first post, we built a WP8 speech app, called “Search On”, enabling users to “Search On Amazon” by voice for anything they wanted. In the second post, we added the ability to search on 15+ more sites by incorporating a PhraseList of many popular web sites.

Today, we’ll continue adding to our “Search On” speech app, enabling our users to search any web site at all, even when it’s not one of the 15+ built-in sites we know how to search. Specifically, we’ll extend the app to support the following scenarios:

User: "Search On a web site"
Phone: "What site would you like to search?"

User: "Whitehouse Dot Gov"
Phone: "Search Whitehouse.gov for what?"

User: "Election results"
Phone: "Searching Whitehouse.gov for Election results"

… and, similarly, if the user doesn’t know if we know about the site name or not …

User: "Search On Whitehouse Dot Gov"
Phone: "What site would you like to search?"

User: "Whitehouse Dot Gov"
Phone: "Search Whitehouse.gov for what?"

User: "Election results"
Phone: "Searching Whitehouse.gov for Election results"

In the 2nd example above, it’s better to help the user complete her task than to just beep at her, or complain that we don’t know what to do at all. We DO know at least part of the task: She wants to search a web site, but we just don’t know which site she said…

Now, let’s dig in and see how WP8’s VoiceCommandService can help us achieve this…

Let’s start with the Voice Command Definition file, vcd.xml. We’re going to add a new Command, named “searchSiteCatchAll”. We’ll add a few <ListenFor> elements for that Command, that will “capture” all the websites that we don’t know how to search yet.

    <Command Name="searchSiteCatchAll">
      <Example>a web site</Example>
      <ListenFor>[a] [specific] [web] site</ListenFor>
      <ListenFor>{*} </ListenFor>
      <ListenFor>{*} {dotComOrNet} </ListenFor>
      <Feedback>What site would you like to search?</Feedback>
      <Navigate Target="MainPage.xaml" />
    </Command>

Remember how we used a PhraseList in yesterday’s post to contain a list of websites? Then, we added “{siteToSearch}” as the content of the <ListenFor> element in the <Command> element? Right? Well, the VoiceCommandService wildcard feature, or garbage rule as it’s sometimes called, is very similar. Instead of referring to a PhraseList by Label, just use an asterisk inside the same squiggly brackets we put around the PhraseList Label. So it’s “{*}” instead of “{siteToSearch}”.

Now, you can see that I also added a <ListenFor> that doesn’t use the wildcard. That will allow the user to say things like, “Search On a website”, or “Search On a specific site”. That way, if the user already knows we don’t know about the site, she can get into the same flow, more deterministically.

OK, now you’ll see that there’s another new PhraseList being referred to here, named “dotComOrNet”. Let’s add that PhraseList now:

    <PhraseList Label="dotComOrNet" Disambiguate="false">
      <Item>dot com</Item>
      <Item>dot net</Item>
      <Item>dot org</Item>
      <Item>dot gov</Item>
    </PhraseList>

We use that as the “tail” of the ListenFor to help the recognizer on WP8 choose our application instead of the built-in search experience in WP8. Without this “tail”, because our application name is phonetically very similar to a built-in command’s verb, “Search”, the 1st party experience will get picked more often than we’d expect. I’ll save describing why that is for another day, though (for the very curious, it’s because of some internal weighting we’ve done to ensure the built-in WP8 speech features can be accessed more reliably, even if/when poorly designed 3rd party applications with phonetically similar voice command phrases might be on the phone).

Also, did you notice the “Disambiguate="false"” part? What’s that do? Good question … Basically, it tells the VoiceCommandService to not worry about asking the user to pick between the different choices if there’s ever any ambiguity. For these items, we don’t really need to know exactly which one they said. We’ll take a look at this more closely in a future post.

OK. Now, I’ve also updated the “searchSite” command to take advantage of this “tail”. Combined with the catch-all command, it looks like this:

    <Command Name="searchSiteCatchAll">
      <Example>a web site</Example>
      <ListenFor>[a] [specific] [web] site</ListenFor>
      <ListenFor>{*} </ListenFor>
      <ListenFor>{*} {dotComOrNet} </ListenFor>
      <Feedback>What site would you like to search?</Feedback>
      <Navigate Target="MainPage.xaml" />
    </Command>

    <Command Name="searchSite">
      <Example>Amazon, Bing, Facebook, Twitter, ...</Example>
      <ListenFor>{siteToSearch} </ListenFor>
      <ListenFor>{siteToSearch} {dotComOrNet}</ListenFor>
      <Feedback>Search on {siteToSearch} for what?</Feedback>
      <Navigate Target="MainPage.xaml" />
    </Command>

On to the code!

First, let’s look at the HandleVoiceCommand function.

    private void HandleVoiceCommand(IDictionary<string, string> queryString)
    {
        switch (queryString["voiceCommandName"])
        {
            case "searchSite":
            case "searchSiteCatchAll":
                SearchSiteVoiceCommand(queryString);
                break;
        }
    }

Now it’s set up to handle more than one voice command, but for now, it’s going to do the same thing for both searchSite and searchSiteCatchAll commands. For that to work, though, we’ll have to also change the SearchSiteVoiceCommand method to look like this:

    private async void SearchSiteVoiceCommand(IDictionary<string, string> queryString)
    {
        string siteName, findText;

        if (null != (siteName = await GetSiteName(queryString)) &&
            null != (findText = await GetTextToFindOnSite(queryString, siteName)))
        {
            await Speak(string.Format("Searching {0} for {1}", siteName, findText));

            string siteUrlTemplate = GetUrlTemplateFromSiteName(siteName);
            NavigateToUrl(string.Format(siteUrlTemplate, findText, siteName));
        }
    }

Basically, we’ll get the site name, from a new method called GetSiteName, and we’ll get the text to find from a new method as well, called GetTextToFindOnSite. Let’s take a look at both of those methods:

    private async Task<string> GetSiteName(IDictionary<string, string> queryString)
    {
        return queryString.ContainsKey("siteToSearch")
            ? queryString["siteToSearch"]
            : NormalizeDotComSuffixes(
                    await RecognizeTextFromWebSearchGrammar("Ex. \"msdn blogs\""));
    }

    private async Task<string> GetTextToFindOnSite(IDictionary<string, string> queryString, string siteName)
    {
        if (!queryString.ContainsKey("siteToSearch"))
        {
            await Speak(string.Format("Search {0} for what?", siteName));
        }

        return await RecognizeTextFromWebSearchGrammar("Ex. \"electronics\"");
    }

In the first method, GetSiteName, we’ll check the dictionary, and if it has the siteToSearch, we’ll use that. But, if it doesn’t, we’ll get that site name from the same method we were getting the text to search for before, RecognizeTextFromWebSearchGrammar. Previously, though, that method had a hard coded example string, but now that we’re using it in two different ways, we’ll pass the example text string in as a parameter.
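
If you'd like to poke at that fallback logic without a phone or emulator, one option is to pass the recognizer in as a delegate. The sketch below is only an illustration: the SiteNameSketch class and the recognize parameter are hypothetical names, not part of the Search On app, and it leaves out the NormalizeDotComSuffixes call that the real method makes.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical, off-device version of the GetSiteName fallback.
public static class SiteNameSketch
{
    // 'recognize' stands in for RecognizeTextFromWebSearchGrammar;
    // it receives the example text and returns the recognized string.
    public static async Task<string> GetSiteName(
        IDictionary<string, string> queryString,
        Func<string, Task<string>> recognize)
    {
        string site;
        if (queryString.TryGetValue("siteToSearch", out site))
        {
            // The PhraseList matched, so the site name is already known.
            return site;
        }

        // No "siteToSearch" key (catch-all command, or a non-voice
        // launch), so ask the user for the site name.
        return await recognize("Ex. \"msdn blogs\"");
    }
}
```

With a dictionary that contains "siteToSearch", the delegate is never called; with an empty dictionary, whatever the delegate returns becomes the site name.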

The second new method, GetTextToFindOnSite, similarly calls off to RecognizeTextFromWebSearchGrammar, but it also optionally prompts the user. This is because in the case of the “searchSite” voice command, the prompt will have already been played by the VoiceCommandService, using that command’s <Feedback> element. For “searchSiteCatchAll”, however, the prompt from the VoiceCommandService asks which site to search. So, once the user responds to that, we’ll need to prompt them for the text they’d like to search for. Clear as mud? Well, hopefully it’s at least as clear as chicken broth. :-)

The actual implementation of RecognizeTextFromWebSearchGrammar is almost identical; only two lines are different: line 1 has a new parameter being passed in, the exampleText, and line 9, where we use that instead of the hard coded example from before.

   1:  private async Task<string> RecognizeTextFromWebSearchGrammar(string exampleText)
   2:  {
   3:      string text = null;
   4:      try
   5:      {
   6:          SpeechRecognizerUI sr = new SpeechRecognizerUI();
   7:          sr.Recognizer.Grammars.AddGrammarFromPredefinedType("web", SpeechPredefinedGrammar.WebSearch);
   8:          sr.Settings.ListenText = "Listening...";
   9:          sr.Settings.ExampleText = exampleText;
  10:          sr.Settings.ReadoutEnabled = false;
  11:          sr.Settings.ShowConfirmation = false;
  12:   
  13:          SpeechRecognitionUIResult result = await sr.RecognizeWithUIAsync();
  14:          if (result != null && 
  15:              result.ResultStatus == SpeechRecognitionUIStatus.Succeeded &&
  16:              result.RecognitionResult != null &&
  17:              result.RecognitionResult.TextConfidence != SpeechRecognitionConfidence.Rejected)
  18:          {
  19:              text = result.RecognitionResult.Text;
  20:          }
  21:      }
  22:      catch 
  23:      {
  24:      }
  25:      return text;
  26:  }

OK… The next change we’re going to make is to the GetUrlTemplateFromSiteName method. Previously, this was just looking up the site name in the dictionary of strings for our 15+ sites, and if it couldn’t find it, it would use a default URL template. It still does that, but now it also uses the “site:____.___” feature of Bing search, as long as it thinks that the siteName is actually a web site. I’m checking that in a fairly lame fashion for this example (is the 4th character from the end a dot?), but it works pretty well. Feel free to do a better check here yourself (see lines 9-12). :-)

   1:  private string GetUrlTemplateFromSiteName(string siteName)
   2:  {
   3:      string url = null;
   4:   
   5:      if (_siteUrlTemplateDictionary.ContainsKey(siteName))
   6:      {
   7:          url = _siteUrlTemplateDictionary[siteName];
   8:      }
   9:      else if (siteName.Length <= 4 || siteName[siteName.Length - 4] != '.')
  10:      {
  11:          url = string.Format(_defaultUrlTemplateForUnknownSites, siteName + "%20{0}");
  12:      }
  13:      else
  14:      {
  15:          url = string.Format(_defaultUrlTemplateForUnknownSites, "site:" + siteName + "%20{0}");
  16:      }
  17:   
  18:      return url;
  19:  }
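
If you do want a somewhat stricter check, here is one possible sketch. Everything in it is an assumption made for illustration and not part of the Search On app: the SiteCheckSketch class, the LooksLikeWebSite and BuildBingUrl names, and the regular expression itself. It also escapes the whole query with Uri.EscapeDataString, which the listing does not do.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: a stricter "is this a web site?" test.
public static class SiteCheckSketch
{
    // One or more dot-separated labels followed by a suffix of at
    // least two letters, e.g. "whitehouse.gov" or "bbc.co.uk".
    private static readonly Regex HostLike = new Regex(
        @"\A[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,}\z",
        RegexOptions.IgnoreCase);

    public static bool LooksLikeWebSite(string siteName)
    {
        return siteName != null && HostLike.IsMatch(siteName);
    }

    // Builds a Bing query, adding "site:" only for host-like names.
    public static string BuildBingUrl(string siteName, string findText)
    {
        string query = LooksLikeWebSite(siteName)
            ? "site:" + siteName + " " + findText
            : siteName + " " + findText;

        // Escape spaces, colons, and so on so the URL stays valid.
        return "http://m.bing.com/search?q=" + Uri.EscapeDataString(query);
    }
}
```

For example, BuildBingUrl("whitehouse.gov", "election results") gives http://m.bing.com/search?q=site%3Awhitehouse.gov%20election%20results, while a name like "msdn blogs" is sent as a plain search with no "site:" prefix.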

Last substantive change coming up … In our new GetSiteName method up above, I snuck in a call to a 3rd new method, NormalizeDotComSuffixes. It, well … Normalizes the site name we get back from voice search by trying to convert the tail of the string from things like “ dot com” and “dot O. R. G.” into “.com” and “.org”. That’ll be pretty important to make the new “site:____.___” Bing site specific search feature work in our app. Don’t forget to add a “using System.Text.RegularExpressions;” up at the top as well.

    private string NormalizeDotComSuffixes(string input)
    {
        Regex re1 = new Regex(@"\s+dot\s+(?<suffix>(com|net|org|gov))\z");
        if (input != null && re1.IsMatch(input))
        {
            input = re1.Replace(input, @".${suffix}").Replace(" ", "");
        }

        Regex re2 = new Regex(@"\s+dot\s+(?<l1>[A-Z])[.]\s+(?<l2>[A-Z])[.]\s+(?<l3>[A-Z])[.]\z");
        if (input != null && re2.IsMatch(input))
        {
            input = re2.Replace(input, @".${l1}${l2}${l3}").Replace(" ", "");
            input = input.Substring(0, input.Length - 4) + input.Substring(input.Length - 4).ToLower();
        }

        return input;
    }
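
One thing to keep in mind: both patterns are case-sensitive, so a string like “Whitehouse Dot Gov” (capital D, capital G) would pass through unchanged. Whether the recognizer ever returns text cased like that isn't something this post establishes, so treat the following as a hedged sketch and not a needed fix. The SuffixNormalizerSketch class and its Normalize method are hypothetical names, and the sketch only covers the first pattern (“dot com” style), not the spelled-out “dot O. R. G.” form.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical case-insensitive variant of the first normalization step.
public static class SuffixNormalizerSketch
{
    // Trailing spoken suffix such as " dot com" or " Dot Gov".
    private static readonly Regex SpokenSuffix = new Regex(
        @"\s+dot\s+(?<suffix>com|net|org|gov)\z",
        RegexOptions.IgnoreCase);

    public static string Normalize(string input)
    {
        if (input == null)
        {
            return null;
        }

        Match m = SpokenSuffix.Match(input);
        if (!m.Success)
        {
            // Nothing that looks like a spoken suffix; leave it alone.
            return input;
        }

        // Everything before the match is the site name; drop spaces
        // so "white house" becomes "whitehouse".
        string host = input.Substring(0, m.Index).Replace(" ", "");
        return host + "." + m.Groups["suffix"].Value.ToLowerInvariant();
    }
}
```

With this variant, “Whitehouse Dot Gov” becomes “Whitehouse.gov” and “white house dot com” becomes “whitehouse.com”, while a phrase with no spoken suffix, such as “msdn blogs”, is returned as-is.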

OK … Now, here’s the final change before we can test out our 3rd version of “Search On” …

Remember when the app was launched from the Start menu, or in any other non-VoiceCommandService type of way yesterday? When that happened, we did a search over to my blog. That’s neat and all, but … Now that we have the ability to ask the user what site they want to search, let’s just go ahead and start there in the conversation if our app is launched in a non-VoiceCommand kind of way.

To do that, we’ll make a super simple change to OnNavigatedTo (see line 11) and add a new function, called HandleNonVoiceCommandInvocation:

   1:  protected override void OnNavigatedTo(NavigationEventArgs e)
   2:  {
   3:      base.OnNavigatedTo(e);
   4:   
   5:      if (e.NavigationMode == NavigationMode.New && NavigationContext.QueryString.ContainsKey("voiceCommandName"))
   6:      {
   7:          HandleVoiceCommand(NavigationContext.QueryString);
   8:      }
   9:      else if (e.NavigationMode == NavigationMode.New)
  10:      {
  11:          HandleNonVoiceCommandInvocation();
  12:      }
  13:      else if (e.NavigationMode == NavigationMode.Back && !System.Diagnostics.Debugger.IsAttached)
  14:      {
  15:          NavigationService.GoBack();
  16:      }
  17:  }
  18:   
  19:  private async void HandleNonVoiceCommandInvocation()
  20:  {
  21:      await Speak("What site would you like to search?");
  22:      SearchSiteVoiceCommand(new Dictionary<string, string>());
  23:  }

Basically, we’ll just prompt the user, and then do a site search just like we do when we get a “searchSiteCatchAll” voice command. That’s it.
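
With three ways into the app now, it can help to write down which prompts the app itself has to speak on each path (the VoiceCommandService speaks the <Feedback> prompt on its own). The sketch below is just that bookkeeping as a pure function; the PromptFlowSketch class and AppPrompts method are hypothetical names and are not part of the app.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical summary of which prompts the app speaks on each path.
public static class PromptFlowSketch
{
    // Returns the prompts spoken by the app (not by the
    // VoiceCommandService), in the order they would be played.
    public static List<string> AppPrompts(
        IDictionary<string, string> queryString, string recognizedSite)
    {
        var prompts = new List<string>();

        // Non-voice launch: nobody has asked for the site yet.
        if (!queryString.ContainsKey("voiceCommandName"))
        {
            prompts.Add("What site would you like to search?");
        }

        // No PhraseList match: <Feedback> did not ask for the search text.
        if (!queryString.ContainsKey("siteToSearch"))
        {
            prompts.Add(string.Format("Search {0} for what?", recognizedSite));
        }

        return prompts;
    }
}
```

So a “searchSite” launch needs no app prompts, a “searchSiteCatchAll” launch needs one, and a launch from the Start menu (an empty dictionary) needs both.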

OK. Time to give it a run! Press F5 and let’s take it for a spin.

You: "Whitehouse Dot Gov"
Phone: "Search Whitehouse.gov for what?"

You: "Election results"
Phone: "Searching Whitehouse.gov for Election results"

Did it work? It did for me. Yeah!

Well, what about the voice command path into the code? Let’s try that out too … Press and hold the “Start” button and follow along below…

User: "Search On a web site"
Phone: "What site would you like to search?"

User: "Whitehouse Dot Gov"
Phone: "Search Whitehouse.gov for what?"

User: "Election results"
Phone: "Searching Whitehouse.gov for Election results"

… and, similarly …

User: "Search On Whitehouse Dot Gov"
Phone: "What site would you like to search?"

User: "Whitehouse Dot Gov"
Phone: "Search Whitehouse.gov for what?"

User: "Election results"
Phone: "Searching Whitehouse.gov for Election results"

Did those both work too?

OK. So that’s it for today… What’d we learn? Hopefully, you learned a few things:

  • You can use {*} very similarly to how you use {phraseListLabel}, without defining the list
  • You can have more than one <Command> element in the <CommandSet>
  • You can use the “Disambiguate” attribute on your <PhraseList> elements to tell the VoiceCommandService to not worry about clarifying which one the user said, in cases of ambiguity.
  • And … Last but not least … You’ve learned that I really enjoy typing three dots in a row in all my posts … Like that … And like this … :-)

Keep scrolling down for the full listings for our two most important source files. Catch you later!

vcd.xml

<?xml version="1.0" encoding="utf-8"?>
 
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.0">
 
  <CommandSet xml:lang="en-US">
 
    <CommandPrefix>Search On</CommandPrefix>
    <Example>Amazon, Bing, Facebook, Twitter, ...</Example>
 
    <Command Name="searchSiteCatchAll">
      <Example>a web site</Example>
      <ListenFor>[a] [specific] [web] site</ListenFor>
      <ListenFor>{*} </ListenFor>
      <ListenFor>{*} {dotComOrNet} </ListenFor>
      <Feedback>What site would you like to search?</Feedback>
      <Navigate Target="MainPage.xaml" />
    </Command>
 
    <Command Name="searchSite">
      <Example>Amazon, Bing, Facebook, Twitter, ...</Example>
      <ListenFor>{siteToSearch} </ListenFor>
      <ListenFor>{siteToSearch} {dotComOrNet}</ListenFor>
      <Feedback>Search on {siteToSearch} for what?</Feedback>
      <Navigate Target="MainPage.xaml" />
    </Command>
 
    <PhraseList Label="siteToSearch">
      <Item>Amazon</Item>
      <Item>Bing</Item>
      <Item>CNN</Item>
      <Item>Dictionary.com</Item>
      <Item>Ebay</Item>
      <Item>Facebook</Item>
      <Item>Google</Item>
      <Item>Hulu</Item>
      <Item>IMDB</Item>
      <Item>Kayak</Item>
      <Item>Linked In</Item>
      <Item>MSN</Item>
      <Item>Netflix</Item>
      <Item>Twitter</Item>
      <Item>Weather.com</Item>
      <Item>YouTube</Item>
      <Item>Zillow</Item>
    </PhraseList>
 
    <PhraseList Label="dotComOrNet" Disambiguate="false">
      <Item>dot com</Item>
      <Item>dot net</Item>
      <Item>dot org</Item>
      <Item>dot gov</Item>
    </PhraseList>
 
  </CommandSet>
 
</VoiceCommands>

MainPage.xaml.cs

using System;
using System.Collections.Generic;
using System.Net;
using System.Windows;
using System.Windows.Navigation;
using Microsoft.Phone.Controls;
using Windows.Phone.Speech.VoiceCommands;
using Windows.Phone.Speech.Recognition;
using Windows.Phone.Speech.Synthesis;
using System.Threading.Tasks;
using Microsoft.Phone.Tasks;
using System.Text.RegularExpressions;
 
namespace SearchOn
{
    public partial class MainPage : PhoneApplicationPage
    {
        public MainPage()
        {
            InitializeComponent();
 
            VoiceCommandService.InstallCommandSetsFromFileAsync(new Uri("ms-appx:///vcd.xml"));
        }
 
        protected override void OnNavigatedTo(NavigationEventArgs e)
        {
            base.OnNavigatedTo(e);
 
            if (e.NavigationMode == NavigationMode.New && NavigationContext.QueryString.ContainsKey("voiceCommandName"))
            {
                HandleVoiceCommand(NavigationContext.QueryString);
            }
            else if (e.NavigationMode == NavigationMode.New)
            {
                HandleNonVoiceCommandInvocation();
            }
            else if (e.NavigationMode == NavigationMode.Back && !System.Diagnostics.Debugger.IsAttached)
            {
                NavigationService.GoBack();
            }
        }
 
        private async void HandleNonVoiceCommandInvocation()
        {
            await Speak("What site would you like to search?");
            SearchSiteVoiceCommand(new Dictionary<string, string>());
        }
 
        private void HandleVoiceCommand(IDictionary<string, string> queryString)
        {
            switch (queryString["voiceCommandName"])
            {
                case "searchSite":
                case "searchSiteCatchAll":
                    SearchSiteVoiceCommand(queryString);
                    break;
            }
        }
 
        private async void SearchSiteVoiceCommand(IDictionary<string, string> queryString)
        {
            string siteName, findText;
 
            if (null != (siteName = await GetSiteName(queryString)) &&
                null != (findText = await GetTextToFindOnSite(queryString, siteName)))
            {
                await Speak(string.Format("Searching {0} for {1}", siteName, findText));
 
                string siteUrlTemplate = GetUrlTemplateFromSiteName(siteName);
                NavigateToUrl(string.Format(siteUrlTemplate, findText, siteName));
            }
        }
 
        private async Task<string> GetSiteName(IDictionary<string, string> queryString)
        {
            return queryString.ContainsKey("siteToSearch")
                ? queryString["siteToSearch"]
                : NormalizeDotComSuffixes(
                        await RecognizeTextFromWebSearchGrammar("Ex. \"msdn blogs\""));
        }
 
        private async Task<string> GetTextToFindOnSite(IDictionary<string, string> queryString, string siteName)
        {
            if (!queryString.ContainsKey("siteToSearch"))
            {
                await Speak(string.Format("Search {0} for what?", siteName));
            }
 
            return await RecognizeTextFromWebSearchGrammar("Ex. \"electronics\"");
        }
 
        private async Task<string> RecognizeTextFromWebSearchGrammar(string exampleText)
        {
            string text = null;
            try
            {
                SpeechRecognizerUI sr = new SpeechRecognizerUI();
                sr.Recognizer.Grammars.AddGrammarFromPredefinedType("web", SpeechPredefinedGrammar.WebSearch);
                sr.Settings.ListenText = "Listening...";
                sr.Settings.ExampleText = exampleText;
                sr.Settings.ReadoutEnabled = false;
                sr.Settings.ShowConfirmation = false;
 
                SpeechRecognitionUIResult result = await sr.RecognizeWithUIAsync();
                if (result != null && 
                    result.ResultStatus == SpeechRecognitionUIStatus.Succeeded &&
                    result.RecognitionResult != null &&
                    result.RecognitionResult.TextConfidence != SpeechRecognitionConfidence.Rejected)
                {
                    text = result.RecognitionResult.Text;
                }
            }
            catch 
            {
            }
            return text;
        }
 
        private async Task Speak(string text)
        {
            SpeechSynthesizer tts = new SpeechSynthesizer();
            await tts.SpeakTextAsync(text);
        }
 
        private void NavigateToUrl(string url)
        {
            WebBrowserTask task = new WebBrowserTask();
            task.Uri = new Uri(url, UriKind.Absolute);
            task.Show();
        }
 
        private string GetUrlTemplateFromSiteName(string siteName)
        {
            string url = null;
 
            if (_siteUrlTemplateDictionary.ContainsKey(siteName))
            {
                url = _siteUrlTemplateDictionary[siteName];
            }
            else if (siteName.Length <= 4 || siteName[siteName.Length - 4] != '.')
            {
                url = string.Format(_defaultUrlTemplateForUnknownSites, siteName + "%20{0}");
            }
            else
            {
                url = string.Format(_defaultUrlTemplateForUnknownSites, "site:" + siteName + "%20{0}");
            }
 
            return url;
        }
 
        private string NormalizeDotComSuffixes(string input)
        {
            Regex re1 = new Regex(@"\s+dot\s+(?<suffix>(com|net|org|gov))\z");
            if (input != null && re1.IsMatch(input))
            {
                input = re1.Replace(input, @".${suffix}").Replace(" ", "");
            }
 
            Regex re2 = new Regex(@"\s+dot\s+(?<l1>[A-Z])[.]\s+(?<l2>[A-Z])[.]\s+(?<l3>[A-Z])[.]\z");
            if (input != null && re2.IsMatch(input))
            {
                input = re2.Replace(input, @".${l1}${l2}${l3}").Replace(" ", "");
                input = input.Substring(0, input.Length - 4) + input.Substring(input.Length - 4).ToLower();
            }
 
            return input;
        }
 
        private string _defaultUrlTemplateForUnknownSites = "http://m.bing.com/search?q={0}";
 
        private Dictionary<string, string> _siteUrlTemplateDictionary = new Dictionary<string, string>()
        {
            { "Amazon", "http://www.amazon.com/gp/aw/s/ref=is_box_?k={0}" },
            { "Bing", "http://www.bing.com/?q={0}" },
            { "CNN", "http://www.cnn.com/search/?query={0}" },
            { "Dictionary.com", "http://dictionary.reference.com/browse/{0}" },
            { "Ebay", "http://www.ebay.com/sch/i.html?_nkw={0}" },
            { "Facebook", "http://www.facebook.com/search/results.php?q={0}" },
            { "Google", "https://www.google.com/search?hl=en&q={0}" },
            { "Hulu", "http://www.hulu.com/#!search?q={0}" },
            { "IMDB", "http://www.imdb.com/find?s=all&q={0}" },
            { "Linked In", "http://www.linkedin.com/search/fpsearch?keywords={0}" },
            { "MSN", "http://www.bing.com/search?scope=msn&q={0}" },
            { "Twitter", "http://twitter.com/search/{0}" },
            { "Weather.com", "http://www.weather.com/info/sitesearch?q={0}" },
            { "YouTube", "http://www.youtube.com/results?search_query={0}" },
            { "Zillow", "http://www.zillow.com/homes/{0}/" },
        };
    }
}