• Kinect for Windows Product Blog

    Joshua Blake on Kinect for Windows and the Natural User Interface Revolution (Part 3)

    • 0 Comments

     

    The following blog post was guest authored by Kinect for Windows (K4W) MVP, Joshua Blake. Josh is the Technical Director of the InfoStrat Advanced Technology Group in Washington, D.C. where he and his team work on cutting-edge Kinect and NUI projects for their clients. You can find him on twitter @joshblake or at his blog, http://nui.joshland.org.

    Josh recently recorded several videos for our Kinect for Windows Developer Center. This is the third of three posts he will be contributing this month to the blog.


     

    In part 1, I shared videos covering the core natural user interface concepts and a sample application that I use to control presentations called Kinect PowerPoint Control. In part 2, I shared two more advanced sample applications: Kinect Weather Map and Face Fusion. In this post, I’m going to share videos that show some of the real-life applications that my team and I created for one of our clients. I’ll also provide some additional detail about how and why we created a custom object tracking interaction. These applications put my NUI concepts into action and show what is possible with Kinect for Windows.

     

    Making it fun to learn

    Our client, Kaplan Early Learning Company, sells teaching resources focused on early childhood education. Kaplan approached us with an interest in creating a series of educational applications for preschool and kindergarten-aged children designed to teach one of several core skills such as basic patterns, spelling simple words, shapes, and spatial relationships. While talking to Kaplan, we learned they had a goal of improving student engagement and excitement while making core skills fun to learn.

    We suggested using Kinect for Windows because it would allow the students to not just interact with the activity but also be immersed in virtual worlds and use their bodies and physical objects for interacting. Kaplan loved the idea and we began creating the applications. After a few iterations of design and development, testing with real students, and feedback, we shipped the final builds of four applications to Kaplan earlier this summer. Kaplan is now selling these applications bundled with a Kinect for Windows sensor in their catalog as Kaplan Move-NG.

    The Kinect for Windows team and I created the videos embedded below to discuss our approach to addressing challenges involved in designing these applications and to demonstrate the core parts of three of the Move-NG applications.

     

    Designing early childhood education apps for Kaplan

    In the video below, I discuss InfoStrat’s guiding principles to creating great applications for Kinect as well as some of the specific challenges we faced creating applications that are fun and exciting for young children while being educational and fitting in a classroom environment. In the next section below the video, read on for additional discussion and three more videos showing the actual applications.

     

    Real-world K4W apps: Designing early childhood education apps for Kaplan (7:32)


    One of the key points covered in this video is that when designing a NUI application, we have to consider the context in which the application will be used. In the education space, especially in early childhood education, this context often includes both teachers and students, so we have to design the applications with both types of users in mind. Here are a few of the questions we thought about while designing these apps for Kaplan:

    • When will the teacher use the app and when will the students use the app?
    • Will the teacher be more comfortable using the mouse or the Kinect for specific tasks? Which input device is most appropriate for each task?
    • Will non-technical teachers understand how to set up the space and use the application? Does there need to be a special setup screen to help the teacher configure the classroom space?
    • How will the teachers and students interact while the application is running?
    • How long would it take to give every student a turn in a typical size classroom?
    • What is the social context in the classroom, and what unwritten social behavior rules can we take into account to simplify the application design?
    • Will the user interaction work with both adults and the youngest children?
    • Will the user interaction work across the various ways children respond to visual cues and voice prompts?
    • Is the application fun?
    • Do students across the entire target age group understand what to do with minimal or no additional prompts from the teacher?

     

    And most importantly:

    • Does the design satisfy the educational goals set for the application?

     

    As you can imagine, finding a solution to all of these questions was quite a challenge. We took an iterative approach and tested with real children in the target age range as often as possible. Fortunately, my three daughters are in the target age range so I could do quick tests at home almost daily and get feedback. We also sent early builds to Kaplan to get a broader range of feedback from their educators and additional children.

    In several cases, we created a prototype of a design or interaction that worked well for ourselves as adults, but failed completely when tested with children. Sometimes the problem was the data from the children’s smaller bodies had more noise. Other times the problem was that the children just didn’t understand what they were supposed to do, even with prompting, guidance, or demonstration. It was particularly challenging when a concept worked with older kindergarten kids but was too complex for the youngest of the preschooler age range. In those cases there was a cognitive development milestone in the age range that the design relied upon and we simply had to find another solution. I will share an example of this near the end of this post.

     

    Kaplan Move-NG application and behind-the-scenes videos

    The next three videos each cover one of the Kaplan Move-NG applications. The videos introduce the educational goal of the app and show a demonstration of the core interaction. In addition, I discuss the design challenges mentioned above as well as implementation details such as what parts of the Kinect for Windows SDK we used, how we created a particular interaction, or how feedback from student testing affected the application design. These videos should give you a quick overview of the apps as well as a behind-the-scenes view into what went into the designs.  I hope sharing our experience will help you create better applications which incorporate the interactivity and fun of Kinect.

     

    Real-world K4W apps: Kaplan Move-NG Patterns (6:28)


    Real-world K4W apps: Kaplan Move-NG Where Am I (5:57)


    Real-world K4W apps: Kaplan Move-NG Word Pop (7:41)


    Object tracking as a natural interaction

    The last video above showed Word Pop, which has the unique feature of letting the user spell words by catching letters with a physical basket (or box). In the video, I showed how we created a custom basket tracker by transforming the Kinect depth data. (My technique was inspired by Kyle McDonald’s work at the Art && Code 2011 conference, as shown at 1:43 in his festival demonstration.) Figure 1 shows the basket tracker developer UI as shown in the Word Pop video. In this section, I’m going to give a little more detail on how this basket tracker works and what led to this design.

    Figure 1: The basket tracker developer UI used internally during development of Word Pop. The left image in the interface shows the background removed user and basket, with a rectangle drawn around the basket. The right image shows a visualization of how the application is transforming the depth data.

    To find the basket, we excluded the background and user’s torso from the depth image and then applied the Sobel operator. This produces a gradient value representing the curvature at each point. We mark pixels with low curvature as flat pixels, shown in white in figure 1. The curvature threshold value for determining flat pixels was found empirically.
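    Here is a minimal sketch of that flat-pixel pass, assuming the depth frame has already been unpacked to per-pixel distances in millimeters with the background and torso pixels zeroed out; the threshold constant and array layout are illustrative, not the exact values used in Word Pop.

        // Mark "flat" (low-gradient) pixels by thresholding the Sobel response of the depth image.
        static bool[,] FindFlatPixels(short[] depthMm, int width, int height, int flatThreshold)
        {
            var isFlat = new bool[height, width];

            for (int y = 1; y < height - 1; y++)
            {
                for (int x = 1; x < width - 1; x++)
                {
                    int i = y * width + x;

                    // 3x3 Sobel kernels applied to the depth values.
                    int gx = -depthMm[i - width - 1] + depthMm[i - width + 1]
                             - 2 * depthMm[i - 1] + 2 * depthMm[i + 1]
                             - depthMm[i + width - 1] + depthMm[i + width + 1];

                    int gy = -depthMm[i - width - 1] - 2 * depthMm[i - width] - depthMm[i - width + 1]
                             + depthMm[i + width - 1] + 2 * depthMm[i + width] + depthMm[i + width + 1];

                    // A small gradient magnitude means the surface is locally flat,
                    // like the bottom of the basket.
                    long magnitudeSquared = (long)gx * gx + (long)gy * gy;
                    isFlat[y, x] = magnitudeSquared < (long)flatThreshold * flatThreshold;
                }
            }

            return isFlat;
        }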

    The outline of the basket is determined by using histograms of flat pixels across the horizontal and vertical dimensions, shown along the top and left edges of the right image in figure 1. The largest continuous area of flat pixels in each dimension is assumed to be the basket. The basket area is expanded slightly, smoothed across frames, and then the application hit tests this area against the letters falling from the sky to determine when the student has caught a letter.
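    A sketch of that bounding step is below. It takes the flat-pixel mask from the previous snippet, builds per-column and per-row counts, and keeps the largest continuous run in each dimension; the minCount heuristic and the Int32Rect return type are illustrative assumptions rather than the Word Pop implementation.

        // using System.Windows; (for Int32Rect)

        // Estimate the basket's bounding box from the flat-pixel mask using
        // horizontal and vertical histograms of flat pixels.
        static Int32Rect FindBasketBounds(bool[,] isFlat, int minCount)
        {
            int height = isFlat.GetLength(0);
            int width = isFlat.GetLength(1);

            var columnCounts = new int[width];   // flat pixels per column
            var rowCounts = new int[height];     // flat pixels per row

            for (int y = 0; y < height; y++)
            {
                for (int x = 0; x < width; x++)
                {
                    if (isFlat[y, x])
                    {
                        columnCounts[x]++;
                        rowCounts[y]++;
                    }
                }
            }

            int left, right, top, bottom;
            LargestRun(columnCounts, minCount, out left, out right);
            LargestRun(rowCounts, minCount, out top, out bottom);

            return new Int32Rect(left, top, right - left + 1, bottom - top + 1);
        }

        // Longest contiguous run of histogram bins whose count is at least minCount.
        static void LargestRun(int[] counts, int minCount, out int bestStart, out int bestEnd)
        {
            bestStart = 0;
            bestEnd = 0;
            int runStart = -1;

            for (int i = 0; i <= counts.Length; i++)
            {
                bool above = i < counts.Length && counts[i] >= minCount;

                if (above && runStart < 0)
                {
                    runStart = i;
                }
                else if (!above && runStart >= 0)
                {
                    if (i - runStart > bestEnd - bestStart + 1)
                    {
                        bestStart = runStart;
                        bestEnd = i - 1;
                    }

                    runStart = -1;
                }
            }
        }

    The resulting rectangle can then be padded, smoothed across frames, and hit tested against the falling letters exactly as described above.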

    In testing, we found this implementation to be robust even when the user moves the basket around quickly or holds it out at the end of one arm. In particular, we did not need to depend upon skeleton tracking, which was often interrupted by the basket itself.

    One of our early Word Pop prototypes used hand-based interaction with skeleton tracking, but this was challenging for the youngest children in the target age range to use or understand. For example, given a prompt of “touch the letter M”, my three-year-old would always run to the computer screen to touch the “M” physically rather than moving her mirror image avatar to touch it. On the other hand, my seven-year-old used the avatar without a problem, illustrating the cognitive development milestone challenge I mentioned earlier. When we added the basket, skeleton tracking data became worse, but we could easily track the interactions of even the youngest children. Since “catching” with the basket has only one physical interpretation – using the avatar image – the younger kids started interacting without trouble.

    The basket in Word Pop was a very simple and natural interaction that the children immediately understood. This may seem like a basic point, but it is a perfect example of what makes Kinect unique and important: Kinect lets the computer see and understand our real world, instead of us having to learn and understand the computer. In this case, the Kinect let the children reuse a skill they already had – catching things in baskets – and focus on the fun and educational aspects of the application, rather than being distracted by learning a complex interface.

    I hope you enjoyed a look behind-the-scenes of our design process and seeing how we approached the challenge of designing fun and educational Kinect applications for young children. Thanks to Ben Lower for giving me the opportunity to record the videos in this post and the previous installments. Please feel free to comment or contact me if you have any questions or feedback on anything in this series. (Don’t forget to check out part 1 and part 2 if you haven’t seen those posts and videos already.)

    Thanks for reading (and watching)!

    -Josh

    @joshblake | joshb@infostrat.com | mobile +1 (703) 946-7176 | http://nui.joshland.org

  • Kinect for Windows Product Blog

    Joshua Blake on Kinect and the Natural User Interface Revolution (Part 2)

    • 0 Comments

     

    The following blog post was guest authored by K4W MVP, Joshua Blake. Josh is the Technical Director of the InfoStrat Advanced Technology Group in Washington, D.C., where he and his team work on cutting-edge Kinect and NUI projects for their clients. You can find him on Twitter @joshblake or at his blog, http://nui.joshland.org.

    Josh recently recorded several videos for our Kinect for Windows Developer Center. This is the second of three posts he will be contributing this month to the blog.

     


     

    In part 1, I shared videos covering the core natural user interface concepts and a sample application that I use to control presentations called Kinect PowerPoint Control. In this post, I’m going to share videos of two more of my sample applications, one of which is brand new and has never been seen before publicly!

    When I present at conferences or workshops about Kinect, I usually demonstrate several sample applications that I’ve developed. These demos enable me to illustrate how various NUI design scenarios and challenges are addressed by features of the Kinect. This helps the audience see the Kinect in action and gets them thinking about the important design concepts used in NUI. (See the Introduction to Natural User Interfaces and Kinect video in part 1 for more on NUI design concepts.)

    Below, you will find overviews and videos of two of my open source sample applications: Kinect Weather Map and Face Fusion. I use Kinect Weather Map in most developer presentations and will be using the new Face Fusion application for future presentations.

    I must point out that these are still samples - they are perhaps 80% solutions to the problems they approach and lack the polish (and complexity!) of a production system. This means they still have rough edges in places, but it also means the code is easier for a developer to look through and learn from.

    Kinect Weather Map

    This application lets you play the role of a broadcast meteorologist and puts your image in front of a live, animated weather map. Unlike a broadcast meteorologist, you won’t need a green screen or any special background due to the magic of the Kinect! The application demonstrates background removal, custom gesture recognition, and a gesture design that is appropriate to this particular scenario. This project source code is available under an open source license at http://kinectweather.codeplex.com/.
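    As a taste of how the background removal works, here is a minimal sketch using the SDK 1.x CoordinateMapper and the player index carried in the depth data (the skeleton stream must be enabled for player indexes to be populated); the fixed frame formats and buffer handling are simplified assumptions, and the project source handles them more carefully.

        // using Microsoft.Kinect;

        // Make background pixels transparent: keep a BGRA color pixel only when the
        // depth pixel it maps to belongs to a tracked player (PlayerIndex > 0).
        static void RemoveBackground(
            KinectSensor sensor,
            byte[] colorBgra,                // 640x480 BGRA color frame
            DepthImagePixel[] depthPixels,   // 640x480 depth frame
            int depthWidth,
            int depthHeight)
        {
            var depthPoints = new DepthImagePoint[640 * 480];

            // For every color pixel, find the matching depth-space coordinate.
            sensor.CoordinateMapper.MapColorFrameToDepthFrame(
                ColorImageFormat.RgbResolution640x480Fps30,
                DepthImageFormat.Resolution640x480Fps30,
                depthPixels,
                depthPoints);

            for (int i = 0; i < depthPoints.Length; i++)
            {
                DepthImagePoint p = depthPoints[i];
                bool onPlayer = p.X >= 0 && p.X < depthWidth && p.Y >= 0 && p.Y < depthHeight
                                && depthPixels[p.Y * depthWidth + p.X].PlayerIndex > 0;

                if (!onPlayer)
                {
                    colorBgra[i * 4 + 3] = 0;   // zero the alpha channel to hide this pixel
                }
            }
        }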

    Here are three videos covering different aspects of the Kinect Weather Map sample application:

     

    Community Sample: Kinect Weather Map – Design (1:45)

     

    Community Sample: Kinect Weather Map – Code Walkthrough – Gestures (2:40)

     

    Community Sample: Kinect Weather Map – Code Walkthrough – Background Removal (5:40)

     

    I made Kinect Weather Map a while ago, but it still works great for presentations and is a good reference for new developers for getting started with real-time image manipulation and background removal. This next application, though, is brand new and I have not shown it publicly until today!

     

    Face Fusion

    The Kinect for Windows SDK recently added the much-awaited Kinect Fusion feature. Kinect Fusion lets you integrate data across multiple frames to create a 3D model, which can be exported to a 3D editing program or used in a 3D printer. As a side-effect, Kinect Fusion also tracks the position of the Kinect relative to the reconstruction volume, the 3D box in the real-world that is being scanned.

    I wanted to try out Kinect Fusion so I was thinking about what might make an interesting application. Most of the Kinect Fusion demos so far have been variations of scanning a room or small area by moving the Kinect around with your hands. Some demos scanned a person, but required a second person to move the Kinect around. This made me think – what would it take to scan yourself without needing a second person? Voila! Face Fusion is born.

    Face Fusion lets you make a 3D scan of your own head using a fixed Kinect sensor. You don’t need anyone else to help you and you don’t need to move the Kinect at all. All you need to do is turn your head while in view of the sensor. This project source code is available under an open source license at http://facefusion.codeplex.com.

    Here are two videos walking through Face Fusion’s design and important parts of the source code. Watch them first to see what the application does, then join me again below for a more detailed discussion of a few technical and user experience design challenges.

     

    Community Sample: Face Fusion – Design (2:34)

     

     

    Community Sample: Face Fusion – Code Walkthrough (4:29)

     

    I’m pretty satisfied with how Face Fusion ended up in terms of the ease of use and discoverability. In fact, at one point while setting up to record these videos, I took a break but left the application running. While I wasn’t looking, the camera operator snuck over and started using the application himself and successfully scanned his own head. He wasn’t a Kinect developer and didn’t have any training except watching me practice once or twice. This made me happy for two reasons: it was easy to learn, and it worked for someone besides myself!

     

    Face Fusion Challenges

    Making the application work well and be easy to use and learn isn’t as easy as it sounds though. In this section I’m going to share a few of the challenges I came across and how I solved them.

    Figure 1: Cropped screenshots from Face Fusion: Left, the depth image showing a user and background, with the head and neck joints highlighted with circles. Right, the Kinect Fusion residual image illustrates which pixels from this depth frame were used in Kinect Fusion.

     

    Scanning just the head

    Kinect Fusion tracks the movement of the sensor (or the movement of the reconstruction volume – it is all relative!) by matching up the new data to previously scanned data. The position of the sensor relative to the reconstruction volume is critical because that is how it knows where to add the new depth data in the scan. Redundant data from non-moving objects just reinforces the scan and improves the quality. On the other hand, anything that changes or moves during the scan is slowly dissolved and re-scanned in the new position.

    This is great for scanning an entire room when everything is fixed, but doesn’t work if you scan your body and turn your head. Kinect Fusion tends to lock onto your shoulders and torso, or anything else visible around you, while your head just dissolves away and you don’t get fully scanned. The solution here was to reduce the size of the reconstruction volume from “entire room” to “just enough to fit your head” and then center the reconstruction volume on your head using Kinect SDK skeleton tracking.

    Kinect Fusion ignores everything outside of the real-world reconstruction volume, even if it is visible in the depth image. This causes Kinect Fusion to only track the relative motion between your head and the sensor. The sensor can now be left in one location and the user can move more freely and naturally because the shoulders and torso are not in the volume.
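    As a rough illustration of the skeleton-tracking half of that solution, the snippet below picks the tracked head joint so a small, head-sized volume can be centered on it. The mapping of that point into Kinect Fusion's world-to-volume transform, and the actual volume size, live in the Face Fusion source, so treat the numbers here as placeholders.

        // using Microsoft.Kinect;

        // Find the tracked user's head position (in meters, relative to the sensor)
        // so a head-sized reconstruction volume can be centered on it.
        static bool TryGetHeadCenter(Skeleton[] skeletons, out SkeletonPoint headCenter)
        {
            headCenter = new SkeletonPoint();

            foreach (Skeleton skeleton in skeletons)
            {
                if (skeleton.TrackingState != SkeletonTrackingState.Tracked)
                {
                    continue;
                }

                Joint head = skeleton.Joints[JointType.Head];
                if (head.TrackingState == JointTrackingState.Tracked)
                {
                    headCenter = head.Position;
                    return true;
                }
            }

            return false;
        }

        // A cube roughly 0.4 m on a side centered on this point is "just enough to
        // fit your head" while leaving the shoulders and torso outside the volume.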

     

    Figure 2: A cropped screenshot of Face Fusion scanning only the user’s head in real-time. The Face Fusion application uses Kinect Fusion to render the reconstruction volume. Colors represent different surface normal directions.

     

    Controlling the application

    Since the scanning process requires the user to stand (or sit) in view of the sensor rather than at the computer, it is difficult to use mouse or touch to control the scan. This is a perfect scenario for voice control! The user can say “Fusion Start”, “Fusion Pause”, or “Fusion Reset” to control the scan process without needing to look at the screen or be near the computer. (Starting the scan just starts streaming data to Kinect Fusion, while resetting the scan clears the data and resets the reconstruction volume.)
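    A minimal sketch of that voice plumbing is below, following the pattern from the SDK speech samples. The recognizerInfo lookup, the sensor variable, the confidence threshold, and the StartScan/PauseScan/ResetScan methods are assumed context rather than the exact Face Fusion code.

        // using System.IO; using Microsoft.Kinect;
        // using Microsoft.Speech.AudioFormat; using Microsoft.Speech.Recognition;

        // Wire the three scan commands to the Kinect microphone array.
        var commands = new Choices("fusion start", "fusion pause", "fusion reset");
        var grammarBuilder = new GrammarBuilder { Culture = recognizerInfo.Culture };
        grammarBuilder.Append(commands);

        var recognizer = new SpeechRecognitionEngine(recognizerInfo.Id);
        recognizer.LoadGrammar(new Grammar(grammarBuilder));

        recognizer.SpeechRecognized += (s, args) =>
        {
            if (args.Result.Confidence < 0.7)
            {
                return;   // ignore low-confidence recognitions
            }

            switch (args.Result.Text)
            {
                case "fusion start": StartScan(); break;
                case "fusion pause": PauseScan(); break;
                case "fusion reset": ResetScan(); break;
            }
        };

        // Feed the recognizer from the Kinect audio source.
        Stream audioStream = sensor.AudioSource.Start();
        recognizer.SetInputToAudioStream(
            audioStream,
            new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
        recognizer.RecognizeAsync(RecognizeMode.Multiple);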

    Voice control was a huge help, but I found that when testing the application, I still tended to try to watch the screen during the scan to see how the scan was doing and if the scan had lost tracking. This affected my ability to turn my head far enough for a good scan. If I ignored the screen and slowly turned all the way around, I would often find the scan had failed early on because I moved too quickly and I wasted all that time for nothing. I realized that in this interaction, we needed to have both control of the scan through voice and feedback on the scan progress and quality through non-visual means. Both channels in the interaction are critical.

     

    Figure 3: A cropped Face Fusion screenshot showing the application affirming that it heard the user’s command, top left. In the top right, the KinectSensorChooserUI control shows the microphone icon so the user knows the application is listening.

     

    Providing feedback to the user

    Since I already had voice recognition, one approach might have been to use speech synthesis to let the computer guide the user through the scan. I quickly realized this would be difficult to implement and would be a sub-optimal solution. Speech is discrete but the scan progress is continuous. Mapping the scan progress to speech would be challenging.

    At some point I got the idea of making the computer sing, instead. Maybe the pitch or tonality could provide a continuous audio communication channel. I tried making a sine wave generator using the NAudio open source project and bending the pitch based upon the average error in the Kinect Fusion residual image. After testing a prototype, I figured out that this worked well; it greatly improved my confidence in the scan progress without seeing the screen. Even better, it gave me more feedback than I had before so I knew when to move or hold still, resulting in better scan results!

    Face Fusion plays a pleasant triad chord when it has fully integrated the current view of the user's head, and otherwise continuously slides a single note up to an octave downward based upon the average residual error. This continuous feedback lets me decide how far to turn my head and when I should stop to let it catch up. This is easier to understand when you hear it, so I encourage you to watch the Face Fusion videos above. Better yet, download the code and try it yourself!
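    Here is a small sketch of that audio feedback idea using NAudio's ISampleProvider. The frequency range and the error-to-pitch mapping are illustrative guesses rather than the exact curve Face Fusion uses, and the "in sync" chord is omitted for brevity.

        // using System; using NAudio.Wave;

        // A continuously playing sine tone whose pitch slides downward (up to an
        // octave) as the average Kinect Fusion residual error grows.
        public class ScanFeedbackTone : ISampleProvider
        {
            private const float BaseFrequency = 440f;   // pitch when alignment is good
            private readonly WaveFormat format = WaveFormat.CreateIeeeFloatWaveFormat(44100, 1);
            private double phase;
            private volatile float frequency = BaseFrequency;

            public WaveFormat WaveFormat
            {
                get { return this.format; }
            }

            // Call once per processed frame with the average residual error (0 = perfect).
            public void SetResidualError(float averageError)
            {
                float clamped = Math.Max(0f, Math.Min(1f, averageError));
                this.frequency = BaseFrequency * (float)Math.Pow(2.0, -clamped);
            }

            public int Read(float[] buffer, int offset, int count)
            {
                double phaseStep = 2.0 * Math.PI * this.frequency / this.format.SampleRate;
                for (int i = 0; i < count; i++)
                {
                    buffer[offset + i] = 0.2f * (float)Math.Sin(this.phase);
                    this.phase += phaseStep;
                }

                return count;   // always produce samples so the tone keeps playing
            }
        }

        // Usage: var tone = new ScanFeedbackTone();
        //        var waveOut = new WaveOut();
        //        waveOut.Init(new SampleToWaveProvider(tone));
        //        waveOut.Play();
        //        // ...then call tone.SetResidualError(...) from the Fusion processing loop.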

    The end result may be a little silly to listen to at first, but if you try it out you’ll find that you are having an interesting non-verbal conversation with the application through the Kinect sensor – you moving your head in specific ways and it responding with sound. It helps you get the job done without needing a second person.

    This continuous audio feedback technique would also be useful for other Kinect Fusion applications where you move the sensor with your hands. It would let you focus on the object being scanned rather than looking away at a display.

    Figure 4: A sequence of three cropped Face Fusion screenshots showing the complete scan. When the user pauses the scan by saying “Kinect Pause”, Face Fusion rotates the scan rendering for user review.

     

    Keep watching this blog this next week for part three, where I will share one more group of videos that break down the designs of several early-childhood educational applications we created for our client, Kaplan Early Learning Company. Those videos will take you behind-the-scenes on our design process and show you how we approached the various challenging aspects of designing fun and educational Kinect applications for small children.

    -Josh

    @joshblake | joshb@infostrat.com | mobile +1 (703) 946-7176 | http://nui.joshland.org

  • Kinect for Windows Product Blog

    Turn any surface into a touch screen with Ubi Interactive and Kinect for Windows

    • 0 Comments

    The following blog post was guest authored by Anup Chathoth, co-founder and CEO of Ubi Interactive.

    Ubi Interactive is a Seattle startup that was one of 11 companies from around the world selected to take part in a three-month Microsoft Kinect Accelerator program in the spring of 2012. Since then, the company has developed the software with more than 100 users and is now accepting orders for the software.

     

    Patrick Wirtz, an innovation manager for The Walsh Group, spends most of his time implementing technology that will enhance Walsh’s ability to work with clients. It’s a vital role at The Walsh Group, a general building construction organization founded in 1898 that has invested more than US$450 Million in capital equipment and regularly employs more than 5,000 engineers and skilled tradespeople.

    "It’s a powerful piece of technology," says Patrick Wirtz, shown here using Ubi in The Walsh Group offices."It’s a powerful piece of technology," says Patrick Wirtz, shown here using Ubi in The Walsh Group
    offices. By setting up interactive 3-D blueprints on the walls, Walsh gives clients the ability
    to explore, virtually, a future building or facility.

    In the construction industry, building information modeling (BIM) is a critical component of presentations to clients. BIM allows construction companies like The Walsh Group to represent the functional characteristics of a facility digitally. While this is mostly effective, Wirtz wanted something that would really “wow” his clients. He wanted a way for them to not only see the drawings, but to bring the buildings to life by allowing clients to explore the blueprints themselves.

    Ubi's interactive display being used during a presentation in a Microsoft conference room

    Wirtz found the solution he had been seeking when he stumbled upon an article about Ubi. At Ubi Interactive, we provide the technology to transform any surface into an interactive touch screen. All the user needs is a computer running our software, a projector, and the Kinect for Windows sensor. Immediately, Wirtz knew Ubi was something he wanted to implement at Walsh: “I contacted the guys at Ubi and told them I am very interested in purchasing the product.” Wirtz was excited about the software and flew out to Seattle for a demo.

    After interacting with the software, Wirtz was convinced that this technology could help The Walsh Group. “Ubi is futuristic-like technology,” he noted—but a technology that he and his colleagues are able to use today. Wirtz immediately saw the potential: Walsh’s building information models could now be interactive displays. Instead of merely presenting drawings to clients, Walsh can now set up an interactive 3-D blueprint on the wall. Clients can walk up to the blueprint and discover what the building will look like by touching and interacting with the display. In use at Walsh headquarters since June 2012, Ubi Interactive brings client engagement to an entirely new level.

    Similarly, Evan Collins, a recent graduate of California Polytechnic State University, used the Ubi software as part of an architecture show he organized. The exhibition showcased 20 interactive displays that allowed the fifth-year architecture students to present their thesis projects in a way that was captivating to audience members. Collins said the interactive displays, “…allowed audience members to choose what content they interacted with instead of listening to a static slideshow presentation.”

    Twenty Ubi Interactive displays at California Polytechnic State University

    Wirtz’s and Collins’ cases are just two ways that people are currently using Ubi. Because the solution is so affordable, people from a wide range of industries have found useful applications for the Ubi software. Wirtz said, “I didn’t want to spend $10,000. I already had a projector and a computer. All I needed to purchase was the software and a $250 Kinect for Windows sensor. With this small investment, I can now turn any surface into a touch screen. It’s a powerful piece of technology.”

    In addition to small- and mid-sized companies, several Fortune 500 enterprises like Microsoft and Intel are also using the software in their conference rooms. And the use of the technology goes beyond conference rooms:

    • Ubi Interactive makes it possible for teachers to instruct classes in an interactive lecture hall.
    • Shoppers can access product information on a store’s window front, even after hours.
    • Recipes can be projected onto kitchen countertops without having to worry about getting anything dirty.
    • Children can use their entire bedroom wall to play interactive games like Angry Birds.
    • The possibilities are endless.

    At Ubi Interactive, it is our goal to make the world a more interactive place. We want human collaboration and information to be just one finger touch away, no matter where you are. By making it possible to turn any surface into a touch screen, we eliminate the need for screen hardware and thereby reduce the cost and extend the possibilities of enabling interactive displays in places where they were not previously feasible—such as on walls in public spaces. Our technology has the potential to revolutionize the way people live their lives around the globe. After private beta evaluation with more than 50 organizations, the Ubi software is now available for ordering at ubi-interactive.com.

    Anup Chathoth
    Co-Founder and CEO, Ubi Interactive


  • Kinect for Windows Product Blog

    Joshua Blake on Kinect and the Natural User Interface Revolution (Part 1)

    • 0 Comments

    The following blog post was guest authored by K4W MVP, Joshua Blake. Josh is the Technical Director of the InfoStrat Advanced Technology Group in Washington, D.C., where he and his team work on cutting-edge Kinect and NUI projects for their clients. You can find him on Twitter @joshblake or at his blog, http://nui.joshland.org.

    Josh recently recorded several videos for our Kinect for Windows Developer Center and will be contributing three posts this month to the blog.



    I’ve been doing full-time natural user interface (NUI) design and development since 2008:  starting with multi-touch apps for the original Microsoft Surface (now called “PixelSense") and most-recently creating touch-free apps using Kinect. Over this time, I have learned a great deal about what it takes to create great natural user interfaces, regardless of the input or output device.

    One of the easiest ways to get involved with natural user interfaces is by learning to create applications for the Kinect for Windows sensor, which has an important role to play in the NUI revolution. It is inexpensive enough to be affordable to almost any developer, yet it allows our computers to see, hear, and understand the real world much as we understand it. It isn’t enough to just mash up new sensors with existing software, though. In order to reach the true potential of the Kinect, we need to learn what makes a user interface truly ‘natural’.

    The Kinect for Windows team generously offered to record several videos of me sharing my thoughts on natural user interface and Kinect design and development. Today, you can watch the first three of these videos on the Kinect for Windows Developer Center.

    Introduction to Natural User Interfaces and Kinect

    In this video, I present the most important ideas and concepts that every natural user interface designer or developer must know and give concrete examples of the ideas from Kinect development. This video covers: what natural user interfaces are, what ideas to consider when designing a natural user interface, and the difference between gestures and manipulations.

     

     

    Kinect PowerPoint Control

    This pair of videos covers my Kinect PowerPoint Control sample project. The “Design” video quickly demonstrates the features of the application, and the “Code Walkthrough” video explains the most important parts of the code. The project source code is available under an open source license at https://kinectpowerpoint.codeplex.com.

    I use this app all the time to control my PowerPoint presentations (such as the Intro to NUI video above) with Kinect. The app demonstrates the bare minimum code required to do simple custom gesture recognition using Kinect skeleton data and how to respond to basic voice commands using Kinect speech recognition. I have found many Kinect developers have trouble getting started with gesture recognition, so the features in the sample are kept minimal on purpose so that the code is easy to read and learn from.
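    To give a flavor of what "bare minimum" gesture code looks like, here is a sketch of a simple skeleton check in which raising the right hand above the head advances the slide. The specific gesture, threshold, and SendKeys call are illustrative; see the Kinect PowerPoint Control source for the gestures and voice commands the sample actually uses.

        // using Microsoft.Kinect; (SendKeys requires a reference to System.Windows.Forms)

        private bool handWasRaised;

        // Treat "right hand higher than the head" as a trigger for the next slide.
        private void SensorSkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
        {
            using (SkeletonFrame frame = e.OpenSkeletonFrame())
            {
                if (frame == null)
                {
                    return;
                }

                var skeletons = new Skeleton[frame.SkeletonArrayLength];
                frame.CopySkeletonDataTo(skeletons);

                foreach (Skeleton skeleton in skeletons)
                {
                    if (skeleton.TrackingState != SkeletonTrackingState.Tracked)
                    {
                        continue;
                    }

                    float headY = skeleton.Joints[JointType.Head].Position.Y;
                    float handY = skeleton.Joints[JointType.HandRight].Position.Y;
                    bool raised = handY > headY + 0.1f;   // hand roughly 10 cm above the head

                    if (raised && !this.handWasRaised)
                    {
                        System.Windows.Forms.SendKeys.SendWait("{RIGHT}");   // advance the slide
                    }

                    this.handWasRaised = raised;
                    break;   // only use the first tracked skeleton
                }
            }
        }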

     

    Community Sample: Kinect PowerPoint Control – Design (1:41)

     

     

    Community Sample: Kinect PowerPoint Control – Code Walkthrough (3:30)

     

     

    Keep watching this blog next week for part two, where I will share more videos showing advanced features of the Kinect SDK.  I will introduce two more of my sample Kinect projects including a completely new, previously unpublished sample that uses Kinect Fusion for object scanning.

    -Josh

    @joshblake | joshb@infostrat.com | mobile +1 (703) 946-7176 | http://nui.joshland.org

  • Kinect for Windows Product Blog

    Join Now, BUILD for Tomorrow

    • 3 Comments

    Today at Microsoft BUILD 2013, we made two important announcements for our Kinect for Windows developer community.

    First, starting today, developers can apply for a place in our upcoming developer kit program. This program will give participants exclusive early access to everything they need to start building applications for the recently-announced new generation Kinect for Windows sensor, including a pre-release version of the new sensor hardware and software development kit (SDK) in November, and a replacement unit of the final sensor hardware and firmware when it is publicly available next year. The cost for the program will be US$399 (or local equivalent). Applications must be received by July 31 and successful applicants will be notified and charged in August. Interested developers are strongly encouraged to apply early, as spots are very limited and demand is already great for the new sensor. Review complete program details and apply for the program.


    The upcoming Kinect for Windows SDK 1.8 will include more realistic color capture with Kinect Fusion.

    Additionally, in September we will again refresh the Kinect for Windows SDK with several exciting updates including:

    • The ability to extract the user from the background in real time
    • The ability to develop Kinect for Windows desktop applications by using HTML5/JavaScript
    • Enhancements to Kinect Fusion, including capture of color data and improvements to tracking robustness and accuracy

    The feature enhancements will enable even better Kinect for Windows-based applications for businesses and end users, and the convenience of HTML5 will make it easier for developers to build leading-edge touch-free experiences.

    This will be the fourth significant update to the Kinect for Windows SDK since we launched 17 months ago. We are committed to continuing to improve the existing Kinect for Windows platform as we prepare to release the new generation Kinect for Windows sensor and SDK.  If you aren’t already using Kinect for Windows to develop touch-free solutions, now is a great time to start. Join us as we continue to make technology easier to use and more intuitive for everyone.

    Bob Heddle
    Director, Kinect for Windows


  • Kinect for Windows Product Blog

    Using Kinect Interactions to Create a Slider Control

    • 8 Comments

    In the 1.7 release, the Kinect for Windows Toolkit added the "Interactions Framework," which makes it easy to create Kinect-enabled applications in WPF that use buttons and grip scrolling.  What may not be obvious from the Toolkit samples is that creating new controls for this framework is easy and straightforward.  To demonstrate this, I’m going to introduce a slider control that can be used with Kinect for Windows to “scrub” video or for other things like turning the volume up to eleven.

    A solution containing the control code and a sample app is in the .zip file below.

    Look Before You Leap

    Before jumping right in and writing a brand new WPF control, it's good to see if other solutions will meet your needs.  Most WPF controls are designed to be lookless.  That is, everything about the visual appearance of the control is defined in XAML, as opposed to using C# code.  So if it's just the layout of things in the control, transitions, or animations that you need to be different, changing the control template will likely suit your needs.  If you want the behavior of multiple controls combined into a reusable component, then a UserControl may do what you want.

    Kinect HandPointers

    HandPointers are the abstraction that the Interactions Framework provides to tell the UI where the user's hands are and what state they are in.  In the WPF layer, the API for HandPointers resembles the API for the mouse where possible.  Unlike the mouse, there is typically more than one hand pointer active at a time, since more than one hand is visible to the Kinect sensor at a time.  In the controls that are in the toolkit (KinectCursorVisualizer, KinectTileButton, KinectScrollViewer, etc.) only the primary hand pointer of the primary user is used.  However, your control will still get events for all the other hand pointers.  As a result, there is code in the event handlers to respond only to the primary user's primary hand.

    KinectRegion Events

    KinectRegion is the main component to look to when adding Kinect Interactions functionality to a WPF control.  All the WPF controls that are descendants of the KinectRegion will receive HandPointer* events as the HandPointers are used.  For example, when a hand pointer moves into the control's boundaries, the control will receive a KinectRegion.HandPointerEnter event.  If you've handled mouse events before, many of the KinectRegion events will feel familiar. 

    KinectRegion events - http://msdn.microsoft.com/en-us/library/microsoft.kinect.toolkit.controls.kinectregion_events.aspx

    Handling KinectRegion Events in the Slider

    The slider control handles KinectRegion events to allow the user to grip and drag the thumb of the slider.  When a control "captures" a hand pointer, it means that all the events of the captured hand pointer will be sent to that control until capture is released.  A general guideline for implementing control interactions is that a control should always capture hand pointer input events while the user is interacting with it; otherwise, it will miss many of the events it needs to function properly.

    The state diagram below gives the basic states of the control and what causes the state transitions.  The key thing to note is that the transitions in and out of dragging are caused by capture changing.  So that leads to the question, what causes capture to change?

    The control takes capture when it gets a grip event.  That will put the control into the dragging state until capture is released.  Capture can be released for a number of reasons.  Most commonly it is released when the control receives a GripRelease event indicating the user opened their hand.  It can also be released if we lose track of the hand.  This can happen when the hand moves too far outside the bounds of the KinectRegion.
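    Here is a minimal sketch of those transitions, assuming the routed event fields and HandPointer members exposed by Microsoft.Kinect.Toolkit.Controls (HandPointerGripEvent, HandPointer.Capture, HandPointer.Captured, GetPosition, and the primary-user/primary-hand flags); check the toolkit documentation linked above and the attached solution for the exact API.  Deriving from the standard WPF Slider is a simplification for brevity.

        // using System; using System.Windows; using System.Windows.Controls;
        // using Microsoft.Kinect.Toolkit.Controls;

        public class KinectSliderSketch : Slider
        {
            public KinectSliderSketch()
            {
                this.AddHandler(KinectRegion.HandPointerGripEvent,
                    new EventHandler<HandPointerEventArgs>(this.OnGrip));
                this.AddHandler(KinectRegion.HandPointerGripReleaseEvent,
                    new EventHandler<HandPointerEventArgs>(this.OnGripRelease));
                this.AddHandler(KinectRegion.HandPointerMoveEvent,
                    new EventHandler<HandPointerEventArgs>(this.OnMove));
            }

            private void OnGrip(object sender, HandPointerEventArgs e)
            {
                // Only respond to the primary user's primary hand.
                if (!e.HandPointer.IsPrimaryUser || !e.HandPointer.IsPrimaryHandOfUser)
                {
                    return;
                }

                // Taking capture moves the control into the dragging state: every event
                // from this hand pointer now routes here until capture is released.
                e.HandPointer.Capture(this);
                e.Handled = true;
            }

            private void OnMove(object sender, HandPointerEventArgs e)
            {
                if (e.HandPointer.Captured != this)
                {
                    return;   // not dragging
                }

                // Map the hand's horizontal position within the control to a value.
                Point position = e.HandPointer.GetPosition(this);
                double fraction = Math.Max(0.0, Math.Min(1.0, position.X / this.ActualWidth));
                this.Value = this.Minimum + fraction * (this.Maximum - this.Minimum);
            }

            private void OnGripRelease(object sender, HandPointerEventArgs e)
            {
                // Releasing capture returns the control to its idle state; capture is
                // also lost automatically if tracking of the hand is lost.
                if (e.HandPointer.Captured == this)
                {
                    e.HandPointer.Capture(null);
                    e.Handled = true;
                }
            }
        }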

    Expanding the Hit Area of the Control 

    This control was originally designed to control video playback.  The design of the UI was such that we wanted to put the control at the bottom of the UI but allow the user to grab anywhere to move the playback position.  The slider supports this by allowing the app to specify a different WPF UIElement to which the hover and grip handlers are attached.  See the KinectSlider.GripEventTarget property.  This uses WPF's ability to register event handlers on elements other than the control itself.

    Things Missing

    While this control works and could actually be used in a real application, it is far from complete in a WPF sense.  It does not implement an automation peer so accessibility is limited.  While touch and keyboard usage may work a little, it is not fully supported.  Focus visuals, visuals for all the Slider permutations, and support for multiple themes are missing.

    Resources for Building WPF Controls

    Books and other resources we use to build controls include:

    WPF 4 Unleashed http://www.informit.com/store/wpf-4-unleashed-9780672331190

    WPF Control Development Unleashed - http://www.informit.com/store/wpf-control-development-unleashed-building-advanced-9780672330339

    WPF source code - http://referencesource.microsoft.com/

    Retemplating WPF controls - http://msdn.microsoft.com/en-us/magazine/cc163497.aspx


  • Kinect for Windows Product Blog

    nsquared releases three new Kinect for Windows-based applications

    • 0 Comments

    The following blog post was guest authored by Celeste Humphrey, business development consultant at nsquared, and Dr. Neil Roodyn, director of nsquared.

    A company that is passionate about learning, technology, and creating awesome user experiences, nsquared has developed three new applications that take advantage of Kinect for Windows to provide users with interactive, natural user interface experiences. nsquared is located in Sydney, Australia.


    At nsquared, we believe that vision-based interaction is the future of computing. The excitement we see in the technology industry regarding touch and tablet computing is a harbinger of the changes that are coming as smarter computer vision systems evolve.

    Kinect for Windows has provided us with the tools to create some truly amazing products for education, hospitality, and events.

    Education: nsquared sky spelling

    We are excited to announce nsquared sky spelling, our first Kinect for Windows-based educational game. This new application, aimed at children aged 4 to 12, makes it fun for children to learn to spell in an interactive and collaborative environment. Each child selects a character or vehicle, such as a dragon, a biplane, or a butterfly, and then flies as that character through the sky to capture letters that complete the spelling of various words. The skeleton recognition capabilities of the Kinect for Windows sensor and software development kit (SDK) track the movement of the children as they stretch out their arms as wings to navigate their character through hoops alongside their wingman (another player). The color camera in the Kinect for Windows sensor allows each child to add their photo, thereby personalizing their experience.

    nsquared sky spelling

    Hospitality: nsquared hotel kiosk

    The nsquared hotel kiosk augments the concierge function in a hotel by providing guidance to hotel guests through an intuitive, interactive experience. Guests can browse through images and videos of activities, explore locations on a map, and find out what's happening with a live event calendar. It also provides live weather updates and has customizable themes. The nsquared hotel kiosk uses the new gestures supported in the Kinect for Windows SDK 1.7, enabling users to use a “grip” gesture to drag content across the screen and a “push” gesture to select content. With its fun user interface, this informative kiosk provides guests an interactive alternative to the old brochure rack.

    Kinect for Windows technology enables nsquared to provide an interactive kiosk experience for less than half the price of a similar sized touchscreen (see note).

    nsquared hotel kiosk

    Events: nsquared media viewer

    The new nsquared media viewer application is a great way to explore interactive content in almost any environment. Designed for building lobbies, experience centers, events, and corporate locations, the nsquared media viewer enables you to display images and video by category in a stylish, customizable carousel. Easy to use, anyone can walk up and start browsing in seconds.

    In addition to taking advantage of key features of the Kinect for Windows sensor and SDK, nsquared media viewer utilizes Windows Azure,  allowing clients to view reports about the usage of the screen and the content displayed.

    nsquared media viewer

    Kinect for Windows technology has made it possible for nsquared to create applications that allow people to interact with content in amazing new ways, helping us take a step towards our collective future of richer vision-based computing systems.

    Celeste Humphrey, business development consultant, and
    Dr. Neil Roodyn, director, nsquared


     

    ____________
    Note: Based on the price of 65-inch touch overlay at approximately US$900 compared to the cost of a Kinect for Windows sensor at approximately US$250. For integrated touch solutions, the price can be far higher.

  • Kinect for Windows Product Blog

    The New Generation Kinect for Windows Sensor is Coming Next Year

    • 31 Comments

    The all-new active-infrared capabilities allow the new sensor to work in nearly any lighting condition. This makes it possible for developers to build apps with enhanced recognition of facial features, hand position, and more.

    By now, most of you likely have heard about the new Kinect sensor that Microsoft will deliver as part of Xbox One later this year.

    Today, I am pleased to announce that Microsoft will also deliver a new generation Kinect for Windows sensor next year. We’re continuing our commitment to equipping businesses and organizations with the latest natural technology from Microsoft so that they, in turn, can develop and deploy innovative touch-free applications for their businesses and customers. A new Kinect for Windows sensor and software development kit (SDK) are core to that commitment.

    Both the new Kinect sensor and the new Kinect for Windows sensor are being built on a shared set of technologies. Just as the new Kinect sensor will bring opportunities for revolutionizing gaming and entertainment, the new Kinect for Windows sensor will revolutionize computing experiences. The precision and intuitive responsiveness that the new platform provides will accelerate the development of voice and gesture experiences on computers.

    Some of the key capabilities of the new Kinect sensor include:

    • Higher fidelity
      The new sensor includes a high-definition (HD) color camera as well as a new noise-isolating multi-microphone array that filters ambient sounds to recognize natural speaking voices even in crowded rooms. Also included is Microsoft’s proprietary Time-of-Flight technology, which measures the time it takes individual photons to rebound off an object or person to create unprecedented accuracy and precision. All of this means that the new sensor recognizes precise motions and details, such as slight wrist rotation, body position, and even the wrinkles in your clothes. The Kinect for Windows community will benefit from the sensor’s enhanced fidelity, which will allow developers to create highly accurate solutions that see a person’s form better than ever, track objects and environments with greater detail, and understand voice commands in noisier settings than before.

    The enhanced fidelity and depth perception of the new Kinect sensor will allow developers to create apps that see a person's form better, track objects with greater detail, and understand voice commands in noisier settings.

    • Expanded field of view
      The expanded field of view accommodates a multitude of differently sized rooms, minimizing the need to modify existing room configurations and opening up new solution-development opportunities. The combination of the new sensor’s higher fidelity plus expanded field of view will give businesses the tools they need to create truly untethered, natural computing experiences such as clicker-free presentation scenarios, more dynamic simulation and training solutions, up-close interactions, more fluid gesture recognition for quick interactions on the go, and much more.
          
    • Improved skeletal tracking
      The new sensor tracks more points on the human body than previously, including the tip of the hand and thumb, and tracks six skeletons at once. This not only yields more accurate skeletal tracking, it opens up a range of new scenarios, including improved “avateering,” the ability to develop enhanced rehabilitation and physical fitness solutions, and the possibility to create new experiences in public spaces—such as retail—where multiple users can participate simultaneously.

    The new sensor tracks more points on the human body than previously, including the tip of the hand and thumb, and tracks six skeletons at once. This opens up a range of new scenarios, from improved "avateering" to experiences in which multiple users can participate simultaneously.
      

    • New active infrared (IR)
      The all-new active-IR capabilities allow the new sensor to work in nearly any lighting condition and, in essence, give businesses access to a new fourth sensor: audio, depth, color…and now active IR. This will offer developers better built-in recognition capabilities in different real-world settings—independent of the lighting conditions—including the sensor’s ability to recognize facial features, hand position, and more. 

    I’m sure many of you want to know more. Stay tuned; at BUILD 2013 in June, we’ll share details about how developers and designers can begin to prepare to adopt these new technologies so that their apps and experiences are ready for general availability next year.

    A new Kinect for Windows era is coming: an era of unprecedented responsiveness and precision.

    Bob Heddle
    Director, Kinect for Windows


     
    Photos in this blog by STEPHEN BRASHEAR/Invision for Microsoft/AP Images

     

  • Kinect for Windows Product Blog

    Reflexion Health advancing physical therapy with Kinect for Windows

    • 0 Comments

    Reflexion Health, founded with technology developed at the West Health Institute, realized years ago that assessing physical therapy outcomes is difficult for a variety of reasons, and took on the challenge of designing a solution to help increase the success rates of rehabilitation from physical injury.

    In 2011, the Reflexion team approached the Orthopedic Surgery Department of the Naval Medical Center San Diego to help test their new Rehabilitation Measurement Tool (RMT). This software solution was developed to make physical therapy more engaging, efficient, and successful. By using the Kinect for Windows sensor and software development kit (SDK), the RMT allows clinicians to measure patient progress. Patients often do much of their therapy alone and because they can lack immediate feedback from therapists, it can be difficult for them to be certain that they are performing the exercises in a manner that will provide them with optimal benefits. The RMT can indicate if exercises were performed properly, how frequently they were performed, and give patients real-time feedback.

    Reflexion Health's Kinect for Windows-based tool helps measure how patients respond to physical therapy.

    “Kinect for Windows helps motivate patients to do physical therapy—and the data set we gather when they use the RMT is becoming valuable to demonstrate what form of therapy is most effective, what types of patients react better to what type of therapy, and how to best deliver that therapy. Those questions have vexed people for a long time,” says Dr. Ravi Komatireddy, co-founder at Reflexion Health.

    The proprietary RMT software engages patients with avatars and educational information, and a Kinect for Windows sensor tracks a patient’s range of motion and other clinical data. This valuable information helps therapists customize and deliver therapy plans to patients.

    “RMT is a breakthrough that can change how physical therapy is delivered,” Spencer Hutchins, co-founder and CEO of Reflexion Health says. “Kinect for Windows helps us build a repository of information so we can answer rigorous questions about patient care in a quantitative way.” Ultimately, Reflexion Health has demonstrated how software could be prescribed—similarly to pharmaceuticals and medical devices—and how it could possibly lower the cost of healthcare.

    More information about RMT and the clinical trials conducted by the Naval Medical Center can be found in the newly released case study.

    Kinect for Windows team


     

  • Kinect for Windows Product Blog

    Using Kinect InteractionStream Outside of WPF

    • 3 Comments

    Last month, with the release of version 1.7 of our SDK and toolkit, we introduced something called the InteractionStream.  Included in this release were two new samples called Controls Basics and Interaction Gallery which, among other things, show how to use the new InteractionStream along with new interactions like Press and Grip.  Both of these new samples are written using managed code (C#) and WPF.

    One question I’ve been hearing from developers is, “I don’t want to use WPF but I still want to use InteractionStream with managed code.  How do I do this?”  In this post I’m going to show how to do exactly that.  I’m going to take it to the extreme by removing the UI layer completely: we’ll use a C# console app.

    The way our application will work is summarized in the diagram below:

    [Diagram: sensor initialization, FrameReady event handlers for depth, skeleton, and interaction frames, and the program's IInteractionClient hit testing]

     

    There are a few things to note here:

    1. Upon starting the program, we initialize our sensor, interactions, and create FrameReady event handlers.
    2. Our sensor is generating data for every frame.  We use our FrameReady event handlers to respond and handle depth, skeleton, and interaction frames.
    3. The program implements the IInteractionClient interface, which requires us to implement a method called GetInteractionInfoAtLocation. This method gives us back information about interactions happening with a particular user at a specified location:
      public InteractionInfo GetInteractionInfoAtLocation(int skeletonTrackingId, InteractionHandType handType, double x, double y)
      {
          var interactionInfo = new InteractionInfo
          {
              IsPressTarget = false,
              IsGripTarget = false
          };

          // Map coordinates from [0.0,1.0] coordinates to UI-relative coordinates
          double xUI = x * InteractionRegionWidth;
          double yUI = y * InteractionRegionHeight;

          var uiElement = this.PerformHitTest(xUI, yUI);

          if (uiElement != null)
          {
              interactionInfo.IsPressTarget = true;

              // If UI framework uses strings as button IDs, use string hash code as ID
              interactionInfo.PressTargetControlId = uiElement.Id.GetHashCode();

              // Designate center of button to be the press attraction point
              //// TODO: Create your own logic to assign press attraction points if center
              //// TODO: is not always the desired attraction point.
              interactionInfo.PressAttractionPointX = ((uiElement.Left + uiElement.Right) / 2.0) / InteractionRegionWidth;
              interactionInfo.PressAttractionPointY = ((uiElement.Top + uiElement.Bottom) / 2.0) / InteractionRegionHeight;
          }

          return interactionInfo;
      }
    4. The other noteworthy part of our program is in the InteractionFrameReady method.  This is where we process information about our users, route our UI events, handle things like Grip and GripRelease, etc.  A rough sketch of such a handler follows this list.
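    Here is a rough sketch of what such an InteractionFrameReady handler can look like in a console app. It assumes the InteractionStream has already been created with this class as its IInteractionClient and is being fed depth and skeleton frames as described in steps 1 and 2; the interaction type and member names shown are a sketch of the SDK 1.7 API, so treat the downloadable sample below as the authoritative version.

      // using System; using Microsoft.Kinect.Toolkit.Interaction;

      private UserInfo[] userInfos;

      private void InteractionFrameReady(object sender, InteractionFrameReadyEventArgs e)
      {
          using (InteractionFrame frame = e.OpenInteractionFrame())
          {
              if (frame == null)
              {
                  return;
              }

              if (this.userInfos == null)
              {
                  this.userInfos = new UserInfo[InteractionFrame.UserInfoArrayLength];
              }

              frame.CopyInteractionDataTo(this.userInfos);
          }

          foreach (UserInfo userInfo in this.userInfos)
          {
              if (userInfo.SkeletonTrackingId == 0)
              {
                  continue;   // this slot is not tracking a user
              }

              foreach (InteractionHandPointer hand in userInfo.HandPointers)
              {
                  switch (hand.HandEventType)
                  {
                      case InteractionHandEventType.Grip:
                          Console.WriteLine("User {0}: {1} hand gripped at ({2:F2}, {3:F2})",
                              userInfo.SkeletonTrackingId, hand.HandType, hand.X, hand.Y);
                          break;

                      case InteractionHandEventType.GripRelease:
                          Console.WriteLine("User {0}: {1} hand released",
                              userInfo.SkeletonTrackingId, hand.HandType);
                          break;
                  }
              }
          }
      }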

     

    I’ve posted some sample code that you may download and use to get started using InteractionStream in your own managed apps.  The code is loaded with tips in the comments that should get you started down the path of using our interactions in your own apps.  Thanks to Eddy Escardo Raffo on my team for writing the sample console app.

    Ben

    @benlower | kinectninja@microsoft.com | mobile: +1 (206) 659-NINJA (6465)
