Today, we have the first part of a two-part blog posted by program managers in Beijing and Redmond respectively—first up, Guobin Wu:
I consider myself incredibly lucky to be the program manager of the Kinect Sign Language Translator project. There are more than 20 million people in China who are hard of hearing, and an estimated 360 million such people around the world, so this project has immense potential to generate positive social impact worldwide.
Opening new doors of communication for sign language users
I clearly remember the extensive effort we put into writing the proposal. We knew that Prof. Xilin Chen of the Chinese Academy of Sciences had been researching sign language recognition technology for 10 years, and he was eager to try the Kinect technology, which offered a very favorable price-to-performance ratio. Ming Zhou, a principal researcher from Microsoft Research Asia, had a good working relationship with Prof. Chen, and it was on Ming’s strong recommendation that we submitted the sign language translator proposal in response to Stewart Tansley’s call for Kinect projects.
During the first six months, we focused mainly on Chinese sign language data collection and labeling. Prof. Chen’s team worked closely with Prof. Hanjing Li of the special education school at Beijing Union University. The first step was to recruit two or three of Prof. Li’s students who are deaf to be part of the project. One candidate in particular stood out: Dandan Yin. We were moved when, during the interview, she told us, “When I was a child, my dream was to create a machine to help people who can’t hear.”
The next milestone was to build a sign language recognition system. The team has published many papers that explain the technical details, but what I want to stress here is the collaborative nature of the project. Every month, we had a team meeting to review the progress and plan our next steps. Experts from a host of disciplines—language modeling, translation, computer vision, speech recognition, 3D modeling, and special education—contributed to the system design.
Our system is still a research prototype. It is progressing from recognizing isolated words signed by a specific person (translator mode) to understanding continuous communication from any competent signer (communication mode). Our current prototype can successfully produce good results for translator mode, and we are diligently working to overcome the technology hurdles so that the system can reliably understand and interpret in communication mode. And while we’re solving those challenges, we are also starting to build up the system’s vocabulary of American Sign Language gestures, which are different from those of Chinese Sign Language.
We’ve had the good fortune to demo the system at both the Microsoft Research Faculty Summit and the Microsoft company meeting this year. Dandan attended both events and displayed her professionalism as a signer. After the Faculty Summit in July, she emotionally thanked Microsoft for turning her dream into reality. I was nearly moved to tears by our reception during the company meeting, the first one that I’d ever attended in person. And I was thrilled to hear thundering applause when Dandan communicated with a hearing employee by using our system.
Since these demos, the project has received much attention from researchers and the deaf community, especially in the United States. We expect that more and more researchers from different disciplines and different countries will collaboratively build on the prototype, so that the Kinect Sign Language Translator system will ultimately benefit the global community of those who are deaf or hard of hearing. The sign language project is a great example of selecting the right technical project with the right innovative partners, and applying effort and perseverance over the years. It has been a wonderful, multidisciplinary, collaborative effort, and I’m honored and proud to be involved.
—Guobin Wu, Research Program Manager, Microsoft Research Asia
Today, we have the second part of a two-part blog posted by program managers in Beijing and Redmond respectively—second up, Stewart Tansley:
When Microsoft Research shipped the first official Kinect for Windows software development kit (SDK) beta in June 2011, it was both an ending and a beginning for me. The thrilling accomplishment of rapidly and successfully designing and engineering the SDK was behind us, but now the development and supporting teams had returned to their normal research work, and I was left to consider how best to showcase the research potential of Kinect technology beyond gaming.
Since Kinect’s launch in November 2010, investigators from all quarters had been experimenting with the system in imaginative and diverse applications. There was very little chance of devising some stand-out new application that no one had thought of—since so many ideas were already in play. So I decided to find the best of the current projects and “double down” on them.
We did not issue a public global call, because so many people were already proactively experimenting with Kinect technology. Instead, we turned to the Microsoft Research labs around the world and asked them to submit their best Kinect collaborations with the academic world, thus bringing together professors and our best researchers, as we normally do in Microsoft Research Connections.
We whittled twelve outstanding proposals down to five finalists and picked the best three for additional funding and support. One of those three was the Kinect Sign Language Translator, a collaboration among Microsoft Research Asia, the Chinese Academy of Sciences, and Beijing Union University.
Incredibly, the Beijing-based team delivered a demonstration model in less than six months, and I first saw it run in October 2012, in Tianjin. Only hours earlier, I had watched a seminal on-stage demo of simultaneous speech translation, during which Microsoft Research’s then leader, Rick Rashid, spoke English into a machine learning system that produced a pitch-perfect Chinese translation, all in real time, on stage before 2,000 Chinese students. It was a "Star Trek" moment. We are living in the future!
Equally inspiring though, and far away from the crowds, I watched the diminutive and delightful Dandan Yin gesture to the Kinect device connected to the early sign language translator prototype—and words appeared on the screen! I saw magic that day, and not just on stage.
Nine months later, in July 2013, we were excited to host Dandan at the annual Microsoft Research Faculty Summit in Redmond—her first trip outside China. We were thrilled with the response by people both attending and watching the Summit. The sign language translator and Dandan made the front page of the Seattle Times and were widely covered by Internet news sites.
We knew we had to make a full video of the system to share it with others and take the work further. Over a couple of sweltering days in late July (yes, Seattle does get hot sunny days!), we showed the system to Microsoft employees. It continued to capture the imagination, including that of Microsoft employees who are deaf.
We got the chance to demonstrate the system at the Microsoft annual company meeting in September 2013—center stage, with 18,000 in-person attendees and more than 60,000 watching online worldwide. This allowed us to bring Dandan and the Chinese research team back to Seattle, and it gave us the opportunity to complete our video.
That week, we all went back into the studio and, through a long, hard day, shot the remaining pieces of the story, explaining how the system could one day transform the lives of millions of people around the world who are deaf or hard of hearing, and the lives of all of us.
I hope you enjoy the video and are inspired by it as much as we are.
We look forward to making this technology a reality for all! We would love to hear your comments.
—Stewart Tansley, Director, Microsoft Research Connections
The Lab of Things (LoT) may sound like something you’d find in a sci-fi movie, but it is a lot more practical than that: it’s a research platform that makes it easy to deploy interconnected devices in multiple homes, then share your individual research data with other investigators, turning it all into a large-scale study. The LoT thus enhances field studies in such diverse disciplines as healthcare, energy management, and home automation. It not only makes deployment and monitoring easier—it also simplifies the analysis of experimental data and promotes sharing of data, code, and study participants, further lowering the barrier to evaluating ideas in a diverse set of environments where people live, work, or play.
One key to the success of the LoT is the involvement of the academic research community in developing extensions to the LoT infrastructure. These extensions can be in the form of drivers, applications, and cloud components such as analytics.
Shortly after we released the LoT in July of this year, a group of students from University College London (UCL) started poking around the code and got inspired: they’ve developed an analytics engine to scrutinize data collected from experiments and research applications running on the LoT. And this is no slouch of an engine, either. Among other things, it:
Watch the video: Students develop analytics engine for the Lab of Things
The analytical models provided by the UCL Lab of Things Analytics Engine allow the user to evaluate usage patterns of devices, compare data sets, and find anomalies. The engine also has the capability to run custom R scripts, thereby enabling users to employ statistical models beyond those directly implemented in the engine.
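The post does not describe how the engine's models work internally, so the following is only a rough illustration of what "finding anomalies" in device data can mean, not the UCL engine's method. It is a minimal Python sketch that flags readings whose z-score exceeds a threshold. The function name `find_anomalies`, the threshold, and the sample readings are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(readings, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    Illustrative only: a simple z-score rule, assumed here for the sketch.
    """
    if len(readings) < 2:
        return []  # sample standard deviation needs at least two points
    mu = mean(readings)
    sigma = stdev(readings)
    if sigma == 0:
        return []  # all readings identical, nothing stands out
    return [
        (i, x)
        for i, x in enumerate(readings)
        if abs(x - mu) / sigma > threshold
    ]


# Hypothetical hourly power readings (watts) from a single home device.
readings = [60, 62, 59, 61, 60, 300, 58, 61]
print(find_anomalies(readings))
```

With these made-up readings, only the 300-watt spike is flagged. A real study would likely need a more robust statistic, or one of the custom R scripts the engine supports.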
If you are interested in the LoT and running data analytics using the analytics engine, visit the Lab of Things site and the analytics engine CodePlex site.
—Arjmand Samuel, Senior Research Program Manager, Microsoft Research Connections