Kinect Toolbox 1.1 : Template based posture detector and Voice Commander - Eternal Coding - HTML5 / JavaScript / 3D development - Site Home - MSDN Blogs

Kinect Toolbox 1.1 : Template based posture detector and Voice Commander


 


In a previous article, I introduced the Kinect Toolbox: http://blogs.msdn.com/b/eternalcoding/archive/2011/07/04/gestures-and-tools-for-kinect.aspx


Kinect Toolbox v1.1 is now out and this new version adds support for some cool features:

  • Templated posture detector
  • Voice Commander
  • NuGet package

You can find the toolbox here: http://kinecttoolbox.codeplex.com or you can grab it using NuGet: http://nuget.org/List/Packages/KinectToolbox

Templated posture detector

Using the same algorithm as TemplatedGestureDetector, you can now use a learning machine and a matching system to detect postures. In the sample attached to the toolbox, I detect the “T” posture (i.e. when your body is shaped like the letter T).

To do that, I developed a new class, TemplatedPostureDetector, which uses an internal learning machine (like the gesture detector):

public class TemplatedPostureDetector : PostureDetector
{
    const float Epsilon = 0.02f;
    const float MinimalScore = 0.95f;
    const float MinimalSize = 0.1f;
    readonly LearningMachine learningMachine;
    readonly string postureName;

    public LearningMachine LearningMachine
    {
        get { return learningMachine; }
    }

    public TemplatedPostureDetector(string postureName, Stream kbStream) : base(4)
    {
        this.postureName = postureName;
        learningMachine = new LearningMachine(kbStream);
    }

    public override void TrackPostures(ReplaySkeletonData skeleton)
    {
        if (LearningMachine.Match(skeleton.Joints.ToListOfVector2(), Epsilon, MinimalScore, MinimalSize))
            RaisePostureDetected(postureName);
    }

    public void AddTemplate(ReplaySkeletonData skeleton)
    {
        RecordedPath recordedPath = new RecordedPath(skeleton.Joints.Count);

        recordedPath.Points.AddRange(skeleton.Joints.ToListOfVector2());

        LearningMachine.AddPath(recordedPath);
    }

    public void SaveState(Stream kbStream)
    {
        LearningMachine.Persist(kbStream);
    }
}

To use this class, we only need to instantiate it and give it some templates (using the [Capture T] button or using a previously saved file). After that, the class can track postures for each skeleton it receives:

// Open (or create) the file holding the previously saved templates
Stream recordStream = File.Open(letterT_KBPath, FileMode.OpenOrCreate);
templatePostureDetector = new TemplatedPostureDetector("T", recordStream);
templatePostureDetector.PostureDetected += templatePostureDetector_PostureDetected;

// Call this for each skeleton received from the sensor
templatePostureDetector.TrackPostures(skeleton);

// Raised when the tracked skeleton matches one of the templates
void templatePostureDetector_PostureDetected(string posture)
{
    MessageBox.Show("Give me a......." + posture);
}
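
To give an intuition of what "matching a skeleton against templates" can look like, here is a minimal, self-contained sketch. It is not the toolbox's LearningMachine: the Point2 and PostureMatchSketch names are hypothetical, and the scoring rule (one minus the mean distance between corresponding points after normalization) is an assumption chosen for simplicity. It only illustrates the idea of a minimal score threshold such as MinimalScore.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only; they are not part of Kinect Toolbox.
public struct Point2
{
    public readonly float X;
    public readonly float Y;

    public Point2(float x, float y)
    {
        X = x;
        Y = y;
    }
}

public static class PostureMatchSketch
{
    // Centers the points on their centroid and scales them so that the
    // larger side of their bounding box has length 1.
    public static List<Point2> Normalize(IList<Point2> points)
    {
        float centerX = points.Average(p => p.X);
        float centerY = points.Average(p => p.Y);
        float width = points.Max(p => p.X) - points.Min(p => p.X);
        float height = points.Max(p => p.Y) - points.Min(p => p.Y);
        float size = Math.Max(width, height);

        if (size <= 0f)
            size = 1f; // Degenerate input: all points are identical.

        return points
            .Select(p => new Point2((p.X - centerX) / size, (p.Y - centerY) / size))
            .ToList();
    }

    // Returns 1 when both point lists are identical after normalization;
    // the score decreases as the mean point-to-point distance grows.
    public static float Score(IList<Point2> candidate, IList<Point2> template)
    {
        if (candidate.Count != template.Count || candidate.Count == 0)
            return 0f;

        List<Point2> a = Normalize(candidate);
        List<Point2> b = Normalize(template);

        double totalDistance = 0;
        for (int i = 0; i < a.Count; i++)
        {
            double dx = a[i].X - b[i].X;
            double dy = a[i].Y - b[i].Y;
            totalDistance += Math.Sqrt(dx * dx + dy * dy);
        }

        return (float)(1.0 - totalDistance / a.Count);
    }

    // True when at least one template reaches the minimal score.
    public static bool Match(IList<Point2> candidate,
                             IEnumerable<IList<Point2>> templates,
                             float minimalScore)
    {
        return templates.Any(t => Score(candidate, t) >= minimalScore);
    }
}
```

Because the points are normalized first, a person standing closer to or further from the sensor, or slightly off-center, should still produce a score close to 1 against a template of the same posture. In this sketch, a call such as PostureMatchSketch.Match(points, templates, 0.95f) plays the role that LearningMachine.Match plays in the detector above.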

Voice Commander

One thing worth noting when you develop with Kinect is that you spend a lot of your time getting up and sitting down :). In the previous article, I introduced the replay system, which is very useful for recording a Kinect session.

But when you are alone, even recording is painful because you cannot be in front of the sensor and in front of your keyboard at the same time to start and stop the recording.

So here enters the Voice Commander (tadam!!). This class takes a list of words and raises an event when it detects one of them (using the microphone array of the sensor). For example, you can use the “record” and “stop” orders to start and stop a recording session while you stay in front of the sensor!

The code is really simple (thanks to Kinect for Windows SDK and Microsoft Speech Platform SDK):

public class VoiceCommander
{
    const string RecognizerId = "SR_MS_en-US_Kinect_10.0";
    Thread workingThread;
    readonly Choices choices;
    bool isRunning;

    public event Action<string> OrderDetected;

    public VoiceCommander(params string[] orders)
    {
        choices = new Choices();
        choices.Add(orders);
    }

    public void Start()
    {
        workingThread = new Thread(Record);
        workingThread.IsBackground = true;
        workingThread.SetApartmentState(ApartmentState.MTA);
        workingThread.Start();  
    }

    void Record()
    {
        using (KinectAudioSource source = new KinectAudioSource
        {
            FeatureMode = true,
            AutomaticGainControl = false,
            SystemMode = SystemMode.OptibeamArrayOnly
        })
        {
            RecognizerInfo recognizerInfo = SpeechRecognitionEngine.InstalledRecognizers().Where(r => r.Id == RecognizerId).FirstOrDefault();

            if (recognizerInfo == null)
                return;

            SpeechRecognitionEngine speechRecognitionEngine = new SpeechRecognitionEngine(recognizerInfo.Id);

            var gb = new GrammarBuilder {Culture = recognizerInfo.Culture};
            gb.Append(choices);

            var grammar = new Grammar(gb);

            speechRecognitionEngine.LoadGrammar(grammar);
            using (Stream sourceStream = source.Start())
            {
                speechRecognitionEngine.SetInputToAudioStream(sourceStream, new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));

                isRunning = true;
                while (isRunning)
                {
                    RecognitionResult result = speechRecognitionEngine.Recognize();

                    if (result != null && OrderDetected != null && result.Confidence > 0.7)
                        OrderDetected(result.Text);
                }
            }
        }
    }

    public void Stop()
    {
        isRunning = false;
    }
}

Using this class is really simple:

voiceCommander = new VoiceCommander("record", "stop");
voiceCommander.OrderDetected += voiceCommander_OrderDetected;

voiceCommander.Start();

void voiceCommander_OrderDetected(string order)
{
    Dispatcher.Invoke(new Action(() =>
    {
        if (audioControl.IsChecked == false)
            return;

        switch (order)
        {
            case "record":
                DirectRecord(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "kinectRecord" + Guid.NewGuid() + ".replay"));
                break;
            case "stop":
                StopRecord();
                break;
        }
    }));
}
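
When the list of orders grows, the switch statement can be replaced by a small lookup table. The following is only a sketch under that assumption: OrderDispatcher is a hypothetical helper, not part of the toolbox, and it simply maps each order to an action, ignoring case.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; it is not part of Kinect Toolbox.
public class OrderDispatcher
{
    // Case-insensitive table: order text -> action to run.
    readonly Dictionary<string, Action> handlers =
        new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase);

    // The registered orders, e.g. to pass to the VoiceCommander constructor.
    public IEnumerable<string> Orders
    {
        get { return handlers.Keys; }
    }

    public void Register(string order, Action handler)
    {
        handlers[order] = handler;
    }

    // Returns true when a handler was found and executed.
    public bool Dispatch(string order)
    {
        Action handler;
        if (order == null || !handlers.TryGetValue(order, out handler))
            return false;

        handler();
        return true;
    }
}
```

With such a helper, the handler above could register StopRecord for "stop" once at startup and then reduce its body to a single dispatcher.Dispatch(order) call inside Dispatcher.Invoke. The same list of orders (dispatcher.Orders) could also feed the VoiceCommander constructor, so that the grammar and the handlers cannot drift apart.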

Conclusion

With Kinect Toolbox 1.1, you have a set of tools to help you develop fun and powerful applications with Kinect for Windows SDK!

Comments
  • Hi David,

    I have an issue when I run Kinect Toolkit 1.1:

    If I run it on a 64-bit machine, I get:

       Could not load file or assembly 'INuiInstanceHelper, Version=1.0.0.10, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. An attempt was made to load a program with an incorrect format.

    If I run it on an x86 machine, I get:

       Speech Recognition is not available on this system. SAPI and Speech Recognition engines cannot be found.

    Could you advise something?

    Thank you

    Regards, Anatoly

  • @Anatoly: I think these are installation issues. For your first point, you may not have installed the 64-bit Kinect SDK. For your x86 machine, you may not have installed SAPI and the speech recognition engine.

  • I have the same problem as Anatoly on my 64-bit machine. I have tried everything--uninstalling and reinstalling the SDK, removing INuiInstanceHelper from the GAC and adding it again--everything. Nothing works. Please help!

  • Hi Mike, can you double-check that:

    - You installed the latest version of the Microsoft Speech Platform SDK

    - You installed the latest 64-bit version of the Kinect SDK

    Can you try on a 32-bit system too?

  • Hi, thanks for the quick response! I actually removed all Speech-Related code from the projects, because according to the programming guide (research.microsoft.com/.../ProgrammingGuide_KinectSDK.docx), the Speech components are x86 only. Unfortunately I don't have a 32-bit system I can try it out on... :/

  • You can try to compile the project using x86 target.

  • I already tried that. Same error... :/

  • I just noticed one difference in the error, actually: it says "PublicKeyToken=null" instead of "PublicKeyToken=31bf3856ad364e35", and the message "An attempt was made to load a program with an incorrect format." is no longer present.

  • I managed to find a 32-bit computer to test it on, and now the sample app doesn't give me that error on startup. However, it still gives me the same problem I had on the 64-bit machine, that it crashes when I click Capture Circle and then Stop Recording...

  • Can you debug to see the exception you get when you click on Capture Circle?

  • OK, I just installed VS Express 2010, as well as all of the Speech components and the 32-bit SDK, on the 32-bit machine. Now when I run the sample app I get the message about INuiInstanceHelper again, and upon clicking on "Capture Circle" I get the following error:

    System.NullReferenceException was unhandled

     Message=Object reference not set to an instance of an object.

     Source=GesturesViewer

     StackTrace:

          at GesturesViewer.MainWindow.recordCircle_Click(Object sender, RoutedEventArgs e) in C:\Users\klein\Downloads\Kinect.Toolkit\Kinect.Toolkit\GesturesViewer\MainWindow.Gestures.cs:line 23.
          at System.Windows.RoutedEventHandlerInfo.InvokeHandler(Object target, RoutedEventArgs routedEventArgs)
          at System.Windows.EventRoute.InvokeHandlersImpl(Object source, RoutedEventArgs args, Boolean reRaised)
          at System.Windows.UIElement.RaiseEventImpl(DependencyObject sender, RoutedEventArgs args)
          at System.Windows.UIElement.RaiseEvent(RoutedEventArgs e)
          at System.Windows.Controls.Primitives.ButtonBase.OnClick()
          at System.Windows.Controls.Button.OnClick()
          at System.Windows.Controls.Primitives.ButtonBase.OnMouseLeftButtonUp(MouseButtonEventArgs e)
          at System.Windows.UIElement.OnMouseLeftButtonUpThunk(Object sender, MouseButtonEventArgs e)
          at System.Windows.Input.MouseButtonEventArgs.InvokeEventHandler(Delegate genericHandler, Object genericTarget)
          at System.Windows.RoutedEventArgs.InvokeHandler(Delegate handler, Object target)
          at System.Windows.RoutedEventHandlerInfo.InvokeHandler(Object target, RoutedEventArgs routedEventArgs)
          at System.Windows.EventRoute.InvokeHandlersImpl(Object source, RoutedEventArgs args, Boolean reRaised)
          at System.Windows.UIElement.ReRaiseEventAs(DependencyObject sender, RoutedEventArgs args, RoutedEvent newEvent)
          at System.Windows.UIElement.OnMouseUpThunk(Object sender, MouseButtonEventArgs e)
          at System.Windows.Input.MouseButtonEventArgs.InvokeEventHandler(Delegate genericHandler, Object genericTarget)
          at System.Windows.RoutedEventArgs.InvokeHandler(Delegate handler, Object target)
          at System.Windows.RoutedEventHandlerInfo.InvokeHandler(Object target, RoutedEventArgs routedEventArgs)
          at System.Windows.EventRoute.InvokeHandlersImpl(Object source, RoutedEventArgs args, Boolean reRaised)
          at System.Windows.UIElement.RaiseEventImpl(DependencyObject sender, RoutedEventArgs args)
          at System.Windows.UIElement.RaiseTrustedEvent(RoutedEventArgs args)
          at System.Windows.UIElement.RaiseEvent(RoutedEventArgs args, Boolean trusted)
          at System.Windows.Input.InputManager.ProcessStagingArea()
          at System.Windows.Input.InputManager.ProcessInput(InputEventArgs input)
          at System.Windows.Input.InputProviderSite.ReportInput(InputReport inputReport)
          at System.Windows.Interop.HwndMouseInputProvider.ReportInput(IntPtr hwnd, InputMode mode, Int32 timestamp, RawMouseActions actions, Int32 x, Int32 y, Int32 wheel)
          at System.Windows.Interop.HwndMouseInputProvider.FilterMessage(IntPtr hwnd, WindowMessage msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
          at System.Windows.Interop.HwndSource.InputFilterMessage(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
          at MS.Win32.HwndWrapper.WndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
          at MS.Win32.HwndSubclass.DispatcherCallbackOperation(Object o)
          at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
          at MS.Internal.Threading.ExceptionFilterHelper.TryCatchWhen(Object source, Delegate method, Object args, Int32 numArgs, Delegate catchHandler)
          at System.Windows.Threading.Dispatcher.InvokeImpl(DispatcherPriority priority, TimeSpan timeout, Delegate method, Object args, Int32 numArgs)
          at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam)
          at MS.Win32.UnsafeNativeMethods.DispatchMessage(MSG& msg)
          at System.Windows.Threading.Dispatcher.PushFrameImpl(DispatcherFrame frame)
          at System.Windows.Threading.Dispatcher.PushFrame(DispatcherFrame frame)
          at System.Windows.Threading.Dispatcher.Run()
          at System.Windows.Application.RunDispatcher(Object ignore)
          at System.Windows.Application.RunInternal(Window window)
          at System.Windows.Application.Run(Window window)
          at System.Windows.Application.Run()
          at GesturesViewer.App.Main() in C:\Users\klein\Downloads\Kinect.Toolkit\Kinect.Toolkit\GesturesViewer\obj\x86\Debug\App.g.cs:line 0.
          at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
          at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
          at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
          at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
          at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
          at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
          at System.Threading.ThreadHelper.ThreadStart()

     InnerException:

  • The error on Circle_Click is expected, since the toolbox failed to initialize completely.

    I just tried on my x64 computer and everything worked well :(

    I only had to set the target to x86

  • Can you tell me on which line you get the first error?

  • It's line 23 in MainWindow.Gestures.cs. I seem to recall someone having issues related to DirectX. I'll try (re)installing that on this machine next...

  • Is this the first error you get?
