• Kinect for Windows Product Blog

    Super Bowl ad showcases Kinect for Windows during surgery

    It’s not often that we get to tell the Kinect for Windows story to millions of people at the same time, so being featured in a commercial during Super Bowl XLVIII on Sunday, February 2, was a thrill. That Microsoft Super Bowl ad showed how technology is #empowering lives, including how GestSure, a Kinect for Windows solution, is helping surgeons and their patients. Harnessing the power of Kinect for Windows to understand and respond to users’ movements, GestSure allows surgeons to use hand motions to study a patient’s medical images (X-rays as well as MRI and CT scans) on monitors in the operating room. This eliminates the need for the surgeon to physically manipulate the images with a mouse or keyboard, so the surgery can continue unimpeded: the doctor no longer has to leave the operating room to view images and then spend time scrubbing back in. The result is a better flow of surgery and better care for patients.

    See more stories that celebrate what technology can do, including a short video that showcases GestSure.

    Kinect for Windows Team

  • Kinect for Windows Product Blog

    Mysteries of Kinect for Windows Face Tracking output explained

    Since the release of Kinect for Windows version 1.5, developers have been able to use the Face Tracking software development kit (SDK) to create applications that can track human faces in real time. Figure 1, an illustration from the Face Tracking documentation, displays 87 of the points used to track the face. Thirteen points are not illustrated here—more on those points later.

    Figure 1: Tracked Points

    You have questions...

    Based on feedback we received via comments and forum posts, it is clear there is some confusion regarding the face tracking points and the data values found when using the SDK sample code. The managed sample, FaceTrackingBasics-WPF, demonstrates how to visualize mesh data by displaying a 3D model representation on top of the color camera image.

    Figure 2: Screenshot from FaceTrackingBasics-WPF

    By exploring this sample’s source code, you will find a set of helper functions under the Microsoft.Kinect.Toolkit.FaceTracking project, in particular GetProjected3DShape(). Many have found that this function returns a collection of 121 values. Additionally, some have also found an enum, called “FeaturePoint”, that includes 70 items.

    We have answers...

    As you can see, we have two main sets of numbers that don't seem to add up. That's because the SDK provides two distinct sets of values:

    1. 3D Shape Points (mesh representation of the face): 121
    2. Tracked Points: 87 + 13

    The 3D Shape Points (121 of them) are the mesh vertices that make a 3D face model based on the Candide-3 wireframe.

    Figure 3: Wireframe of the Candide-3 model (image from http://www.icg.isy.liu.se/candide/img/candide3_rot128.gif)

    These vertices are morphed by the FaceTracking APIs to align with the face. The GetProjected3DShape method returns the vertices as a Vector3DF[] array. These values can be enumerated by name using the "FeaturePoint" list (for example, TopSkull, LeftCornerMouth, or OuterTopRightPupil). Figure 4 shows these values superimposed on top of the color frame.
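
    As a minimal, hypothetical sketch (not taken from the sample, and assuming a valid FaceTrackFrame named frame from the tracking loop), a single vertex can be looked up by its FeaturePoint name:

        // Hypothetical sketch: read one of the 121 projected mesh vertices by name.
        // Assumes "frame" is a valid FaceTrackFrame returned by the face tracker.
        var projectedShape = frame.GetProjected3DShape();      // collection of 121 mesh vertices
        var topSkull = projectedShape[FeaturePoint.TopSkull];  // address a vertex by its FeaturePoint name
        // topSkull.X and topSkull.Y are the vertex coordinates projected onto the color frame (Figure 4)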

    Figure 4: Feature Point index mapped on mesh model

    To get the 100 tracked points mentioned above, we need to dive more deeply into the APIs. The managed APIs provide an FtInterop.cs file that defines an interface, IFTResult, which contains a Get2DShapePoints function. FtInterop is a wrapper for the native library that exposes its functionality to managed languages. Users of the unmanaged C++ API may have already seen this and figured it out. Get2DShapePoints is the function that will provide the 100 tracked points.

    If we have a look at the function, it doesn’t seem to be useful to a managed code developer:

    // STDMETHOD(Get2DShapePoints)(THIS_ FT_VECTOR2D** ppPoints, UINT* pPointCount) PURE;
    void Get2DShapePoints(out IntPtr pointsPtr, out uint pointCount);

    To get a better idea of how you can get a collection of points from IntPtr, we need to dive into the unmanaged function:

    /// <summary>
    /// Returns 2D (X,Y) coordinates of the key points on the aligned face in video frame coordinates.
    /// </summary>
    /// <param name="ppPoints">Array of 2D points (as FT_VECTOR2D).</param>
    /// <param name="pPointCount">Number of elements in ppPoints.</param>
    /// <returns>If the method succeeds, the return value is S_OK. If the method fails, the return value can be E_POINTER.</returns>
    STDMETHOD(Get2DShapePoints)(THIS_ FT_VECTOR2D** ppPoints, UINT* pPointCount) PURE; 

    The function will give us a pointer to the FT_VECTOR2D array. To consume the data from the pointer, we have to create a new function for use with managed code.

    The managed code

    First, you need to create an array to contain the data that is copied to managed memory. Since FT_VECTOR2D is an unmanaged structure, to marshal the data to the managed wrapper, we must have an equivalent data type to match. The managed version of this structure is PointF (structure that uses floats for x and y).

    Now that we have a data type, we need to convert IntPtr to PointF[]. Searching the code, we see that the FaceTrackFrame class wraps the IFTResult object. This also contains the GetProjected3DShape() function we used before, so this is a good candidate to add a new function, GetShapePoints. It will look something like this:

    // populates an array for the ShapePoints
    public void GetShapePoints(ref Vector2DF[] vector2DF)
    {
         // get the 2D tracked shapes
         IntPtr ptBuffer = IntPtr.Zero;
         uint ptCount = 0;
         this.faceTrackingResultPtr.Get2DShapePoints(out ptBuffer, out ptCount);
          if (ptCount == 0)
          {
              vector2DF = null;
              return;
          }

          // create a managed array to hold the values
          if (vector2DF == null || vector2DF.Length != ptCount)
          {
              vector2DF = new Vector2DF[ptCount];
          }

         ulong sizeInBytes = (ulong)Marshal.SizeOf(typeof(Vector2DF));
         for (ulong i = 0; i < ptCount; i++)
         {
             vector2DF[i] = (Vector2DF)Marshal.PtrToStructure((IntPtr)((ulong)ptBuffer + (i * sizeInBytes)), typeof(Vector2DF));
         }
     } 

    To ensure we are using the data correctly, we refer to the documentation on Get2DShapePoints:

    IFTResult::Get2DShapePoints Method gets the (x,y) coordinates of the key points on the aligned face in video frame coordinates.

    The PointF values represent the mapped values on the color image. Since we know the data matches the color frame, there is no need to apply any additional coordinate mapping. You can call the function to get the data, which should align with the color image coordinates.
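
    Here is a short, hypothetical usage sketch of the wrapper function defined above (again assuming a valid FaceTrackFrame named frame):

        // Hypothetical usage of the GetShapePoints wrapper shown above.
        Vector2DF[] shapePoints = null;
        frame.GetShapePoints(ref shapePoints);
        if (shapePoints != null)
        {
            foreach (var point in shapePoints)
            {
                // point.X and point.Y are already in color-image pixel coordinates,
                // so they can be drawn directly on top of the color frame.
            }
        }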

    The Sample Code

    The modified version of FaceTrackingBasics-WPF is available in the sample code that can be downloaded from CodePlex. It has been modified to allow you to display the feature points (by name or by index value) and to toggle the mesh drawing. Because of the way WPF renders, performance can suffer on machines with lower-end graphics cards, so I recommend that you enable these options only one at a time. If your UI becomes unresponsive, you can block the sensor with your hand to prevent the capture of face-tracking data. Since the application will not detect any tracked face, it will not render any points, giving you the opportunity to reset the features you enabled by using the UI controls.

    Figure 5: ShapePoints mapped around the face

    As you can see in Figure 5, the additional 13 points are the center of the eyes, the tip of the nose, and the areas above the eyebrows on the forehead. Once you enable a feature and tracking begins, you can zoom into the center and see the values more clearly.

    A summary of the changes:

    MainWindow.xaml/.cs:

    • UI changes to enable slider and draw selections

     

    FaceTrackingViewer.cs:

    • Added a Grid control – used for the UI elements
    • Modified the constructor to initialize grid
    • Modified the OnAllFrameReady event
      • For any tracked skeleton, create a canvas, add it to the grid, and use it as the parent for the label controls

    public partial class FaceTrackingViewer : UserControl, IDisposable
    {
         private Grid grid;

         public FaceTrackingViewer()
         {
             this.InitializeComponent();

             // add grid to the layout
             this.grid = new Grid();
             this.grid.Background = Brushes.Transparent;
             this.Content = this.grid;
         }

         private void OnAllFramesReady(object sender, AllFramesReadyEventArgs allFramesReadyEventArgs)
         {
             ...
              // We want to keep a record of any skeleton, tracked or untracked.
             if (!this.trackedSkeletons.ContainsKey(skeleton.TrackingId))
             {
                 // create a new canvas for each tracker
                 Canvas canvas = new Canvas();
                 canvas.Background = Brushes.Transparent;
                 this.grid.Children.Add( canvas );
                
                 this.trackedSkeletons.Add(skeleton.TrackingId, new SkeletonFaceTracker(canvas));
             }
             ...
         }
    }

    SkeletonFaceTracker class changes:

    • New properties and fields: DrawFaceMesh, DrawShapePoints, DrawFeaturePoints, featurePoints, lastDrawFeaturePoints, shapePoints, labelControls, Canvas
    • New functions: FindTextControl, UpdateTextControls, RemoveAllFromCanvas, SetShapePointsLocations, SetFeaturePointsLocations
    • Added a constructor to keep track of the parent control
    • Changed the DrawFaceModel function to draw based on what data was selected
    • Updated the OnFrameReady event to recalculate the positions of the drawn elements
      • If DrawShapePoints is selected, then we call our new function

    private class SkeletonFaceTracker : IDisposable
    {
    ...
        // properties to toggle rendering 3D mesh, shape points and feature points
        public bool DrawFaceMesh { get; set; }
        public bool DrawShapePoints { get; set; }
        public DrawFeaturePoint DrawFeaturePoints { get; set; }

        // defined array for the feature points
        private Array featurePoints;
        private DrawFeaturePoint lastDrawFeaturePoints;

        // array for Points to be used in shape points rendering
        private PointF[] shapePoints;

        // map to hold the label controls for the overlay
        private Dictionary<string, Label> labelControls;

        // canvas control for new text rendering
        private Canvas Canvas;

        // canvas is passed in for every instance
        public SkeletonFaceTracker(Canvas canvas)
        {
            this.Canvas = canvas;
        }

        public void DrawFaceModel(DrawingContext drawingContext)
        {
            ...
            // only draw if selected
            if (this.DrawFaceMesh && this.facePoints != null)
            {
                ...
            }
        }

        internal void OnFrameReady(KinectSensor kinectSensor, ColorImageFormat colorImageFormat, byte[] colorImage, DepthImageFormat depthImageFormat, short[] depthImage, Skeleton skeletonOfInterest)
        {
            ...
            if (this.lastFaceTrackSucceeded)
            {
                ...
                if (this.DrawFaceMesh || this.DrawFeaturePoints != DrawFeaturePoint.None)
                {
                    this.facePoints = frame.GetProjected3DShape();
                }

                // get the shape points array
                if (this.DrawShapePoints)
                {
                    this.shapePoints = frame.GetShapePoints();
                }
            }

            // draw/remove the components
            SetFeaturePointsLocations();
            SetShapePointsLocations();
        }

        ...
    }

    Pulling it all together...

    As we have seen, the Face Tracking SDK makes the following data available:

    • Shape Points: the 100 tracked points used to track the face
    • Mesh Data: the 121 vertices of the 3D model, returned by the GetProjected3DShape() function
    • FeaturePoints: named vertices on the 3D model that play a significant role in face tracking

    To get the shape point data, we have to extend the current managed wrapper with a new function that will handle the interop with the native API.

    Carmine Sirignano
    Developer Support Escalation Engineer
    Kinect for Windows


  • Kinect for Windows Product Blog

    Mirror, mirror, on the screen…who’s the fairest on the scene?

    One of the highlights of the recent Consumer Electronics Show was the three-dimensional augmented reality Beauty Mirror shown by ModiFace, a leader in virtual makeover technology. With help from a Kinect for Windows sensor and a PC, the Beauty Mirror enables customers to view the simulated effects of skin-care and beauty products and anti-aging procedures from all angles of their face. It also allows customers to compare the before-and-after results side-by-side. This proprietary technology can simulate the application of blushes, lipsticks, eye shadows, and other makeup products. Moreover, it can display the impact of such anti-aging procedures as dark spot correction, facelift, browlift, cheek volume enhancement, and jaw contouring. According to Parham Aarabi, the CEO of ModiFace, “the Kinect for Windows sensor captures the customer’s image, enabling the creation of real-time, full-3D simulations that utilize ModiFace's patented photorealistic skin and shade simulation technology.”

    Kinect for Windows Team

  • Kinect for Windows Product Blog

    Kinect for Windows sets tone at high-tech library

    Interactive media wall, Jerry Falwell Library, Liberty University
    Photos by Kevin Manguiob, Liberty University (left and center), and Lizzy Benson, Liberty University (right)

    Liberty University, one of the world’s largest online education providers, celebrated the grand opening of its new Jerry Falwell Library on January 15. The $50 million, 170,000-square-foot library hosts cutting-edge, interactive features, including a 24-by-11-foot media wall constructed of 198 micro tiles and equipped with Microsoft Kinect for Windows motion sensing technology. The media wall displays animated visualizations made up of photos from Liberty students and staff. Kinect for Windows enables library visitors to use gestures to grab the photos and view additional details. In order to meet the challenge of providing interaction across the whole width of the wall, Liberty University partnered with InfoStrat, a Kinect for Windows partner located in Washington D.C., to develop the visualizations and create a custom service that enables the use of three Kinect sensors simultaneously.

    Kinect for Windows Team

  • Kinect for Windows Product Blog

    Jintronix makes rehabilitation more convenient, fun, and affordable with Kinect for Windows

    A stroke can be a devastating experience, leaving the patient with serious physical impairments and beset by concerns for the future. Today, that future is much brighter, as stroke rehabilitation has made enormous strides. Now, Jintronix offers a significant advance to help stroke patients restore their physical functions: an affordable motion-capture system for physical rehabilitation that uses Microsoft Kinect for Windows.

    Jintronix offers a significant advance to help stroke patients restore their physical functions
    The folks at Montreal- and Seattle-based Jintronix are tackling three major issues related to rehabilitation. First, and most important, they are working to improve patients’ compliance with their rehabilitation regimens, since up to 65% of patients fail to adhere fully, or at all, to their programs.[1] In addition, they are addressing the lack of accessibility and the high cost associated with rehabilitation. If you have just had a stroke, even getting to the clinic is a challenge, and the cost of hiring a private physical therapist to come to your home is too high for most people.

    Consider Jane, a 57-year-old patient. After experiencing a stroke eight months ago, she now has difficulty moving the entire right side of her body. Like most stroke victims, Jane faces one to three weekly therapy sessions for up to two years. Unable to drive, she depends on her daughter to get her to these sessions; unable to work, she worries about the $100 fee per visit, as she has exhausted her insurance coverage. If that weren’t enough, Jane also must exercise for hours daily just to maintain her mobility. Unfortunately, these exercises are very repetitive, and Jane finds it difficult to motivate herself to do them. 

    Jintronix tackles all of these issues by providing patients with fun, “gamified” exercises that accelerate recovery and increase adherence. In addition, Jintronix gives patients immediate feedback, which ensures that they perform their movements correctly. This is critical when the patient is exercising at home.

    For clinicians and insurers, Jintronix monitors and collects data remotely to measure compliance and provides critical information on how to customize the patient’s regimen. Thus patients can conveniently and consistently get treatment between clinic visits, from the comfort of their own homes, with results transmitted directly to their therapist. This has been shown to be an effective method for delivering care, and for people living in remote areas, this type of tele-rehabilitation has the potential to be a real game changer.[2] Moreover, a growing shortage of trained clinicians—the shortfall in the United States was estimated to be 13,500 in 2013 and is expected to grow to 31,000 by 2016—means that more and more patients will be reliant on home rehab[3].

    Motion capture lies at the heart of Jintronix. The first-generation Kinect for Windows camera can track 20 points on the body with no need for the patient to wear physical sensors, enabling Jintronix to track the patient’s position in three-dimensional space at 30 frames per second. Behind the scenes, Jintronix uses the data captured by the sensor to track such metrics as the speed and fluidity of patients’ movement. It also records patients’ compensation patterns, such as leaning the trunk forward to reach an object instead of extending the arm normally.

    Jintronix then uses this data to place patients in an interactive game environment that’s built around rehabilitation exercises. For example, in the game Fish Frenzy, the patient's hand controls the movement of an on-screen fish, moving it to capture food objects that are placed around the screen in a specific therapeutic pattern, like a rectangle or a figure eight.

    There are other rehab systems out there that use motion capture, but they often require sensor gloves or other proprietary hardware that take a lot of training and supervision to use, or they depend on rigging an entire room with expensive cameras or placing lots of sensors on the body. “Thanks to Kinect for Windows, Jintronix doesn’t require any extra hardware, cameras, or body sensors, which keeps the price affordable,” says Shawn Errunza, CEO of Jintronix. “That low price point is extremely important,” notes Errunza, “as we want to see our system in the home of every patient who needs neurological and orthopedic rehab.”

    Jintronix developed the system by working closely with leaders in the field of physical rehabilitation, such as Dr. Mindy Levin, professor of physical and occupational therapy at McGill University. With strong support both on the research and clinical sides, the company designed a system that can serve a variety of patients in addition to post-stroke victims—good news for the nearly 36 million individuals suffering from physical disabilities in the United States[4].

    What’s more, Jintronix is a potential boon to the elderly, as it has been shown that seniors can reduce the risk of injury due to falls by 35% by following specific exercise programs.  Unfortunately, most home rehab regimens fail to engage such patients. A recent study of elderly patients found that less than 10 percent reported doing their prescribed physical therapy exercises five days a week (which is considered full adherence), and more than a third reported zero days of compliance.

    Jintronix is currently in closed beta testing in five countries, involving more than 150 patients at 60 clinics and hospitals, including DaVinci Physical Therapy in the Seattle area and the Gingras Lindsay Rehabilitation Hospital in Montreal. According to Errunza, “preliminary results show that the fun factor of our activities has a tangible effect on patients’ motivation to stay engaged in their therapy.”

    Jintronix is working to remove all the major barriers to physical rehabilitation by making a system that is fun, simple to use, and affordable. Jintronix demonstrates the potential of natural user interfaces (NUI) to make technology simpler and more effective—and the ability of Kinect for Windows to help high tech meet essential human needs.

    The Kinect for Windows Team

    1 http://physiotherapy.org.nz/assets/Professional-dev/Journal/2003-July/July03commentary.pdf

    2 http://www.ncbi.nlm.nih.gov/pubmed/23319181

    3 http://www.apta.org/PTinMotion/NewsNow/2013/10/21/PTDemand/

    4 http://ptjournal.apta.org/content/86/3/401.full

  • Kinect for Windows Product Blog

    Kinect for Windows shines at the 2014 NRF Convention

    This week, some 30,000 retailers from around the world descended on New York’s Javits Center for the 2014 edition of the National Retail Federation’s Annual Convention and Expo, better known as “Retail’s BIG Show.” With an exhibit space covering nearly four football fields and featuring more than 500 vendors, an exhibitor could easily have been overlooked—but not when its exhibit displayed retailing innovations that use the power of the Microsoft Kinect for Windows sensor and SDK. Here are some of the Kinect experiences that attracted attention on the exhibit floor.

    Visitors at the Kinect for Windows demo station

    NEC Corporation of America demonstrated a “smart shelf” application that makes the most of valuable retail space by tailoring the messaging on digital signage to fit the shopper. At the heart of this system is Kinect for Windows, which discerns shoppers who are interested in the display and uses analytics to determine such consumer attributes as age, gender, and level of engagement. On the back end, the data captured by Kinect is delivered to a dashboard where it can be further mined for business intelligence. Allen Ganz, a senior account development manager at NEC, praises the Kinect-based solution, noting that it “provides unprecedented actionable insights for retailers and brands at the point-of-purchase decision.”

    Razorfish displayed two different Kinect-based scenarios, both of which highlight an immersive consumer experience that’s integrated across devices. The first scenario engages potential customers by involving them in a Kinect-driven beach soccer game. In this dual-screen experience, one customer has the role of striker, and uses his or her body movements—captured by the Kinect for Windows sensor—to dribble the ball and then kick it toward the goal. The other customer assumes the role of goalie; his or her avatar appears on the second display and its actions are controlled by the customer’s movements—again captured via the Kinect for Windows sensor—as he or she tries to block the shot. Customers who succeed accumulate points that can be redeemed for a real (not virtual) beverage from a connected vending machine. Customers can work up a sweat in this game, so the beverage is a much-appreciated reward. But the real reward goes to the retailer, as this compelling, gamified experience creates unique opportunities for sales associates to connect with the shoppers.

    The second scenario from Razorfish also featured a beach theme. This sample experience is intended to take place in a surf shop, where customers design their own customized surfboard by using a Microsoft Surface. Then they use a Kinect-enabled digital signage application to capture images of the customized board against the background of one of the world’s top beaches. This image is immediately printed as a postcard, and a second copy is sent to the customer in an email. Here, too, the real goal is to engage customers, pulling them into an immersive experience that is personal, mobile, and social.

    Razorfish demos their customized surfboard scenario

    Above all, the Razorfish experiences help create a bond between the customer and a brand. “Kinect enables consumers to directly interact personally with a brand, resulting in a greater sense of brand loyalty,” notes Corey Schuman, a senior technical architect at Razorfish.

    Yet another compelling Kinect-enabled customer experience was demonstrated by FaceCake, whose Swivel application turns the computer into a virtual dressing room where a shopper can try on clothes and accessories with a simple click. The customer poses in front of a Kinect for Windows sensor, which captures his or her image. Then the shopper selects items from a photo display of clothing and accessories, and the application displays the shopper “wearing” the selected items. So, a curious shopper can try on, say, various dress styles until she finds one she likes. Then she can add a necklace, scarf, or handbag to create an entire ensemble. She can even split the screen to compare her options, showing side-by-side images of the same dress accessorized with different hats. And yes, this app works for male shoppers, too.

    The common theme in all these Kinect-enabled retail applications is customer engagement. Imagine seeing a digital sign respond to you personally, or getting involved in the creation of your own product or ideal ensemble. If you’re a customer, these are the kinds of interactive experiences that draw you in. In a world where every retailer is looking for new ways to attract and connect with customers, Kinect for Windows is engaging customers and helping them learn more about the products. The upshot is a satisfied customer who's made a stronger connection during their shopping experience, and a healthier bottom line for the retailer.

    The Kinect for Windows Team

  • Kinect for Windows Product Blog

    Robotic control gets a boost

    From home vacuum cleaners to automotive welders to precision surgical aids, robots are making their way into every facet of our lives. Now NASA, no stranger to the world of robotics, is using the latest version of Kinect for Windows to control an off-the-shelf robotic arm remotely. Engineers at NASA’s Jet Propulsion Laboratory have created a system that links together an Oculus Rift head-mounted display with the gesture-recognition capabilities of Kinect for Windows, enabling a person to manipulate the robotic arm with great precision simply by gesturing his or her real arm. The ultimate goal is to enhance the control of robots that are far from Earth, like the Mars rovers, but NASA is also exploring partnerships with companies that seek to perfect the technology for Earth-bound applications.

    Kinect for Windows Team

  • Kinect for Windows Product Blog

    Kinect for Windows expands its developer preview program

    Last June, we announced that we would be hosting a limited, exclusive developer preview program for Kinect for Windows v2 prior to its general availability in the summer (northern hemisphere) of 2014. And a few weeks ago, we began shipping Kinect for Windows v2 Developer Preview kits to thousands of participants all over the world.

    It’s been exciting to hear from so many developers as they take their maiden voyage with Microsoft’s new generation NUI technology. We’ve seen early unboxing videos that were recorded all over the world, from London to Tokyo. We’ve heard about some promising early experiments that are taking advantage of the higher resolution data and the ability to see six people.  People have told us about early success with the new sensor’s ability to track the tips of hands and thumbs.  And some developers have even described how easy it’s been to port their v1 apps to the new APIs.

    Kinect for Windows v2 Developer Preview kit
    (Photo courtesy of Vladimir Kolesnikov [@vladkol], a developer preview program participant)

    But we’ve also heard from many people who were not able to secure a place in the program and are eager to get their hands on the Kinect for Windows v2 sensor and SDK as soon as possible. For everyone who has been hoping and waiting, we’re pleased to announce that we are expanding the program so that more of you can participate!

    We are creating 500 additional developer preview kits for people who have great ideas they want to bring to life with the Kinect for Windows sensor and SDK. Like before, the program is open to professional developers, students, researchers, artists, and other creative individuals.

    The program fee is US$399 (or local equivalent) and offers the following benefits:

    • Direct access to the Kinect for Windows engineering team via a private forum and exclusive webcasts
    • Early SDK access (alpha, beta, and any updates along the way to release)
    • Private access to all API and sample documentation
    • A pre-release version of the new generation sensor
    • A final, released sensor at launch next summer (northern hemisphere)

    Applications must be completed and submitted by January 31, 2014, at 9:00 A.M. (Pacific Time), but don’t wait until then to apply! We will award positions in the program on a rolling basis to qualified applicants. Once all 500 kits have been awarded, the application process will be closed.

    Learn more and apply now

    The Kinect for Windows Team

  • Kinect for Windows Product Blog

    Using WebGL to Render Kinect Webserver Image Data

    Our 1.8 release includes a sample called WebserverBasics-WPF that shows how HTML5 web applications can leverage our webserver component and JavaScript API to get data from Kinect sensors. Among other things, the client-side code demonstrates how to bind Kinect image streams (specifically, the user viewer and background removal streams) to HTML canvas elements in order to render the sequence of images as they arrive so that users see a video feed of Kinect data.

    In WebserverBasics-WPF, the images are processed by first sending the image data arriving from the server to a web worker thread, which then copies the data pixel-by-pixel into a canvas image data structure that can then be rendered via the canvas “2d” context. This solution was adequate for our needs, processing more than the minimum required 30 frames per second of Kinect image data. However, when we attempted to display data from more than one image stream (e.g., the user viewer and background removal streams) simultaneously, we would start dropping image frames and see a stutter in the video feed. Even when displaying only one image stream at a time, there was a noticeable load added to the computer's CPU, which reduced the amount of multitasking the system could perform.
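
    For context, the “2d”-context path looks roughly like the sketch below (a simplified illustration, not the sample’s exact code; canvasElement, imageBuffer, width, and height are assumed to come from the worker/stream plumbing):

    // Simplified sketch of the canvas "2d" rendering path used by WebserverBasics-WPF.
    var context2d = canvasElement.getContext("2d");
    var imageData = context2d.createImageData(width, height);
    imageData.data.set(new Uint8ClampedArray(imageBuffer)); // copy the RGBA bytes produced by the worker
    context2d.putImageData(imageData, 0, 0);                // render the frame on the canvas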

    Now that the latest version of every major browser supports the WebGL API, we can get even better performance without requiring a dedicated background worker thread (which can occupy a full CPU core). While you should definitely test on your own hardware, using WebGL gave me over a 3x improvement in image fps capability, and I don’t even have a high-end GPU!
    Also, once we are using WebGL, it is very easy to apply additional image processing or perform other kinds of tasks without adding latency or burdening the CPU. For example, we can use convolution kernels to perform edge detection to process this image
    Original

    and obtain this image:
    EdgeDetected

    So let’s send some Kinect data over to WebGL!

    Getting Started

    1. Make sure you have the Kinect for Windows v1.8 SDK and Toolkit installed
    2. Make sure you have a WebGL-compatible web browser installed
    3. Get the WebserverWebGL sample code, project and solution from CodePlex. To compile this sample you will also need Microsoft.Samples.Kinect.Webserver (also available via CodePlex and Toolkit Browser) and Microsoft.Kinect.Toolkit components (available via Toolkit Browser).


    Note: The entirety of this post focuses on JavaScript code that runs on the browser client. Also, this post is meant to provide a quick overview on how to use WebGL functionality to render Kinect data. For a more comprehensive tutorial on WebGL itself you can go here or here.

    Encapsulating WebGL Functionality

    WebGL requires a non-trivial amount of setup code, so to avoid cluttering the main sample code in SamplePage.html, we defined a KinectWebGLHelper object constructor in the KinectWebGLHelper.js file. This object exposes three functions:

    • bindStreamToCanvas(DOMString streamName, HTMLCanvasElement canvas)
      Binds the specified canvas element to the specified image stream.
      This function mirrors the KinectUIAdapter.bindStreamToCanvas function, but uses the canvas “webgl” context rather than the “2d” context.
    • unbindStreamFromCanvas(DOMString streamName)
      Unbinds the specified image stream from the previously bound canvas element, if any.
      This function mirrors the KinectUIAdapter.unbindStreamToCanvas function.
    • getMetadata(DOMString streamName)
      Allows clients to access the “webgl” context managed by the KinectWebGLHelper object.


    The code modifications (relative to the code in WebserverBasics-WPF) necessary to get the code in SamplePage.html to start using this helper object are fairly minimal. We replaced

    uiAdapter.bindStreamToCanvas(
        Kinect.USERVIEWER_STREAM_NAME,
        userViewerCanvasElement);
    uiAdapter.bindStreamToCanvas(
        Kinect.BACKGROUNDREMOVAL_STREAM_NAME,
        backgroundRemovalCanvasElement);

    with

    glHelper = new KinectWebGLHelper(sensor); 
    glHelper.bindStreamToCanvas(
        Kinect.USERVIEWER_STREAM_NAME,
        userViewerCanvasElement);
    glHelper.bindStreamToCanvas(
        Kinect.BACKGROUNDREMOVAL_STREAM_NAME,
        backgroundRemovalCanvasElement);

    Additionally, we used the glHelper instance for general housekeeping such as clearing the canvas state whenever it’s supposed to become invisible.

    The KinectWebGLHelper further encapsulates the logic to actually set up and manipulate the WebGL context within an “ImageMetadata” object constructor.

    Setting up the WebGL context

    The first step is to get the webgl context and set up the clear color. For historical reasons, some browsers still use “experimental-webgl” rather than “webgl” as the WebGL context name:

    var contextAttributes = { premultipliedAlpha: true }; 
    var glContext = imageCanvas.getContext('webgl', contextAttributes) ||
        imageCanvas.getContext('experimental-webgl', contextAttributes);
    glContext.clearColor(0.0, 0.0, 0.0, 0.0); // Set clear color to black, fully transparent

    Defining a Vertex Shader

    Next we define a geometry to render, plus a corresponding vertex shader. When rendering a 3D scene to a 2D screen, the vertex shader would typically transform a 3D world-space coordinate into a 2D screen-space coordinate. Since we’re rendering 2D image data coming from a Kinect sensor onto a 2D screen, however, we just need to define a rectangle using 2D coordinates and map the Kinect image onto this rectangle as a texture, so the vertex shader ends up being pretty simple:

    // vertices representing entire viewport as two triangles which make up the whole 
    // rectangle, in post-projection/clipspace coordinates
    var VIEWPORT_VERTICES = new Float32Array([
        -1.0, -1.0,
        1.0, -1.0,
        -1.0, 1.0,
        -1.0, 1.0,
        1.0, -1.0,
        1.0, 1.0]);
    var NUM_VIEWPORT_VERTICES = VIEWPORT_VERTICES.length / 2;
    // Texture coordinates corresponding to each viewport vertex
    var VERTEX_TEXTURE_COORDS = new Float32Array([
        0.0, 1.0,
        1.0, 1.0,
        0.0, 0.0,
        0.0, 0.0,
        1.0, 1.0,
        1.0, 0.0]);

    var vertexShader = createShaderFromSource(
        glContext.VERTEX_SHADER,
    "\
    attribute vec2 aPosition;\
    attribute vec2 aTextureCoord;\
    \
    varying highp vec2 vTextureCoord;\
    \
    void main() {\
         gl_Position = vec4(aPosition, 0, 1);\
         vTextureCoord = aTextureCoord;\
    }");

    We specify the shader program as a literal string that gets compiled by the WebGL context. Note that you could instead choose to get the shader code from the server as a separate resource from a designated URI.
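
    As a hypothetical illustration of that alternative (the path "shaders/kinect.vert" is assumed, not part of the sample), the shader source could be fetched from the server and passed to the same helper:

    // Hypothetical alternative: load the vertex shader source from a separate URI.
    var shaderRequest = new XMLHttpRequest();
    shaderRequest.open("GET", "shaders/kinect.vert", false); // synchronous only to keep the sketch short
    shaderRequest.send();
    var vertexShader = createShaderFromSource(
        glContext.VERTEX_SHADER,
        shaderRequest.responseText);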
    We also need to let the shader know how to find its input data:

    var positionAttribute = glContext.getAttribLocation(
        program,
        "aPosition");
    glContext.enableVertexAttribArray(positionAttribute);

    var textureCoordAttribute = glContext.getAttribLocation(
        program,
        "aTextureCoord");
    glContext.enableVertexAttribArray(textureCoordAttribute);

    // Create a buffer used to represent whole set of viewport vertices
    var vertexBuffer = glContext.createBuffer();
    glContext.bindBuffer(
        glContext.ARRAY_BUFFER,
        vertexBuffer);
    glContext.bufferData(
        glContext.ARRAY_BUFFER,
       VIEWPORT_VERTICES,
        glContext.STATIC_DRAW);
    glContext.vertexAttribPointer(
        positionAttribute,
        2,
        glContext.FLOAT,
        false,
        0,
        0);

    // Create a buffer used to represent whole set of vertex texture coordinates
    var textureCoordinateBuffer = glContext.createBuffer();
    glContext.bindBuffer(
        glContext.ARRAY_BUFFER,
        textureCoordinateBuffer);
    glContext.bufferData(
        glContext.ARRAY_BUFFER,
        VERTEX_TEXTURE_COORDS,
        glContext.STATIC_DRAW);
    glContext.vertexAttribPointer(textureCoordAttribute,
        2,
        glContext.FLOAT,
        false,
        0,
        0);

    // Create a texture to contain images from Kinect server
    // Note: TEXTURE_MIN_FILTER, TEXTURE_WRAP_S and TEXTURE_WRAP_T parameters need to be set
    // so we can handle textures whose width and height are not a power of 2.
    var texture = glContext.createTexture();
    glContext.bindTexture(
        glContext.TEXTURE_2D,
        texture);
    glContext.texParameteri(
        glContext.TEXTURE_2D,
        glContext.TEXTURE_MAG_FILTER,
        glContext.LINEAR);
    glContext.texParameteri(
        glContext.TEXTURE_2D,
        glContext.TEXTURE_MIN_FILTER,
        glContext.LINEAR);
    glContext.texParameteri(
        glContext.TEXTURE_2D,
        glContext.TEXTURE_WRAP_S,
        glContext.CLAMP_TO_EDGE);
    glContext.texParameteri(
        glContext.TEXTURE_2D,
        glContext.TEXTURE_WRAP_T,
        glContext.CLAMP_TO_EDGE);
    glContext.bindTexture(
        glContext.TEXTURE_2D,
        null);

    Defining a Fragment Shader

    Fragment Shaders (also known as Pixel Shaders) are used to compute the appropriate color for each geometry fragment. This is where we’ll sample color values from our texture and also apply the chosen convolution kernel to process the image.

    // Convolution kernel weights (blurring effect by default) 
    var CONVOLUTION_KERNEL_WEIGHTS = new Float32Array([
        1, 1, 1,
        1, 1, 1,
        1, 1, 1]);
    var TOTAL_WEIGHT = 0;
    for (var i = 0; i < CONVOLUTION_KERNEL_WEIGHTS.length; ++i) {
        TOTAL_WEIGHT += CONVOLUTION_KERNEL_WEIGHTS[i];
    }

    var fragmentShader = createShaderFromSource(
        glContext.FRAGMENT_SHADER,
        "\
        precision mediump float;\
        \
        varying highp vec2 vTextureCoord;\
        \
        uniform sampler2D uSampler;\
        uniform float uWeights[9];\
        uniform float uTotalWeight;\
        \
        /* Each sampled texture coordinate is 2 pixels apart rather than 1,\
        to make filter effects more noticeable. */ \
        const float xInc = 2.0/640.0;\
        const float yInc = 2.0/480.0;\
        const int numElements = 9;\
        const int numCols = 3;\
        \
        void main() {\
            vec4 centerColor = texture2D(uSampler, vTextureCoord);\
            vec4 totalColor = vec4(0,0,0,0);\
            \
            for (int i = 0; i < numElements; i++) {\
                int iRow = i / numCols;\
                int iCol = i - (numCols * iRow);\
                float xOff = float(iCol - 1) * xInc;\
                float yOff = float(iRow - 1) * yInc;\
                vec4 colorComponent = texture2D(\
                    uSampler,\
                    vec2(vTextureCoord.x+xOff, vTextureCoord.y+yOff));\
                totalColor += (uWeights[i] * colorComponent);\
            }\
            \
            float effectiveWeight = uTotalWeight;\
            if (uTotalWeight <= 0.0) {\
                effectiveWeight = 1.0;\
            }\
            /* Premultiply colors with alpha component for center pixel. */\
            gl_FragColor = vec4(\
                totalColor.rgb * centerColor.a / effectiveWeight,\
                centerColor.a);\
        }");

    Again, we specify the shader program as a literal string that gets compiled by the WebGL context and, again, we need to let the shader know how to find its input data:

    // Associate the uniform texture sampler with TEXTURE0 slot 
    var textureSamplerUniform = glContext.getUniformLocation(
        program,
        "uSampler");
    glContext.uniform1i(textureSamplerUniform, 0);

    // Since we're only using one single texture, we just make TEXTURE0 the active one
    // at all times
    glContext.activeTexture(glContext.TEXTURE0);

    Drawing Kinect Image Data to Canvas

    After getting the WebGL context ready to receive data from Kinect, we still need to let WebGL know whenever we have a new image to be rendered. So, every time that the KinectWebGLHelper object receives a valid image frame from the KinectSensor object, it calls the ImageMetadata.processImageData function, which looks like this:

    this.processImageData = function(imageBuffer, width, height) { 
        if ((width != metadata.width) || (height != metadata.height)) {
            // Whenever the image width or height changes, update tracked metadata and canvas
            // viewport dimensions.
            this.width = width;
            this.height = height;
            this.canvas.width = width;
            this.canvas.height = height;
            glContext.viewport(0, 0, width, height);
        }

        glContext.bindTexture(
            glContext.TEXTURE_2D,
            texture);
        glContext.texImage2D(
            glContext.TEXTURE_2D,
            0,
            glContext.RGBA,
            width,
            height,
            0,
            glContext.RGBA,
            glContext.UNSIGNED_BYTE,
            new Uint8Array(imageBuffer));

        glContext.drawArrays(
            glContext.TRIANGLES,
            0,
            NUM_VIEWPORT_VERTICES);
        glContext.bindTexture(
            glContext.TEXTURE_2D,
            null);
    };

    Customizing the Processing Effect Applied to Kinect Image

    You might have noticed while reading this post that the default value for CONVOLUTION_KERNEL_WEIGHTS provided by this WebGL sample maps to the following 3x3 convolution kernel:

    1 1 1
    1 1 1
    1 1 1

    and this corresponds to a blurring effect. The following table shows additional examples of effects that can be achieved using 3x3 convolution kernels:

        

    Effect            Kernel Weights      Resulting Image

    Original           0  0  0            Original_sample
                       0  1  0
                       0  0  0

    Blurring           1  1  1            Blurred_sample
                       1  1  1
                       1  1  1

    Sharpening         0 -1  0            Sharpened_sample
                      -1  5 -1
                       0 -1  0

    Edge Detection    -1  0  1            EdgeDetected_sample
                      -2  0  2
                      -1  0  1

    It is very easy to experiment with different weights of a 3x3 kernel to apply these and other effects: just change the CONVOLUTION_KERNEL_WEIGHTS coefficients and reload the application in the browser. Other kernel sizes can also be supported by changing the fragment shader and its associated setup code.
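
    For example, to try the sharpening effect from the table above, replace the default weights with the sharpening kernel and reload the page:

    // Sharpening kernel from the table above (replaces the default all-ones blur kernel);
    // TOTAL_WEIGHT is recomputed from these values by the existing setup code.
    var CONVOLUTION_KERNEL_WEIGHTS = new Float32Array([
         0, -1,  0,
        -1,  5, -1,
         0, -1,  0]);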

    Summary

    The new WebserverWebGL sample is very similar in user experience to WebserverBasics-WPF, but because it uses the WebGL API to leverage the power of your GPU, your web applications can perform powerful kinds of Kinect data processing without burdening the CPU or adding latency to the user experience. We didn't add WebGL functionality previously because it was only recently that WebGL became supported in all major browsers. If you're not sure whether your clients will have a WebGL-compatible browser but still want to guarantee that they can display image-stream data, you should implement a hybrid approach that uses the "webgl" canvas context when available and falls back to the "2d" context otherwise.
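
    A minimal sketch of that hybrid approach, reusing the names from the binding code earlier in this post (sensor, uiAdapter, userViewerCanvasElement), might look like this:

    // Detect WebGL support once, then choose the appropriate binding path.
    var testCanvas = document.createElement("canvas");
    var hasWebGL = !!(testCanvas.getContext("webgl") ||
                      testCanvas.getContext("experimental-webgl"));

    if (hasWebGL) {
        // GPU path: render Kinect image streams through the WebGL helper
        var glHelper = new KinectWebGLHelper(sensor);
        glHelper.bindStreamToCanvas(Kinect.USERVIEWER_STREAM_NAME, userViewerCanvasElement);
    } else {
        // Fallback: use the "2d"-context binding from WebserverBasics-WPF
        uiAdapter.bindStreamToCanvas(Kinect.USERVIEWER_STREAM_NAME, userViewerCanvasElement);
    }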

    Happy coding!


    Eddy Escardo-Raffo
    Senior Software Development Engineer
    Kinect for Windows

  • Kinect for Windows Product Blog

    Thousands of developers are participating in Kinect for Windows v2 Developer Preview—starting today

    In addition to being a great day for Xbox One, today is also a great day for Kinect for Windows. We have started delivering Kinect for Windows v2 Developer Preview kits to program participants. The Developer Preview includes a pre-release Kinect for Windows v2 sensor, access to the new generation Kinect for Windows software development kit (SDK), as well as ongoing updates and access to private program forums. Participants will also receive a Kinect for Windows v2 sensor when they become available next summer (northern hemisphere).

    Microsoft is committed to making the Kinect for Windows sensor and SDK available early to qualifying developers and designers so they can prepare to have their new-generation applications ready in time for general availability next summer. We continue to see a groundswell for Kinect for Windows. We received thousands of applications for this program and selected participants based on the applicants’ expertise, passion, and the raw creativity of their ideas. We are impressed by the caliber of the applications we received and look forward to seeing the innovative NUI experiences our Developer Preview customers will create.

    The new Kinect for Windows v2 sensor will feature the core capabilities of the new Kinect for Xbox One sensor. With the first version of Kinect for Xbox 360, developers and businesses saw the potential to apply the technology beyond gaming—in many different computing environments. Microsoft believes that the opportunities for revolutionizing computing experiences will be even greater with this new sensor. The benefits will raise the bar and accelerate the development of NUI applications across multiple industries, from retail and manufacturing to healthcare, education, communications, and more:

    Real Vision
    Kinect Real Vision technology dramatically expands its field of view for greater line of sight. An all-new active IR camera enables it to see in the dark. And by using advanced three-dimensional geometry, it can even tell if you’re standing off balance.

    Real Motion
    Kinect Real Motion technology tracks even the slightest gestures. So a simple squeeze of your hand results in precise control over an application, whether you’re standing up or sitting down.

    Real Voice
    Kinect Real Voice technology focuses on the sounds that matter. Thanks to an all-new multi-microphone array, the advanced noise isolation capability lets the sensor know who to listen to, even in a crowded space.

    2014 will be exciting, to say the least. We will keep you updated as the Developer Preview program evolves and we get closer to the Kinect for Windows v2 worldwide launch next summer. Additionally, you can follow the progress of the early adopter community by keeping an eye on the #k4wdev hashtag and by following us (@kinectwindows).

    The Kinect for Windows Team
