Have you ever found yourself waiting for results from your Internet search engine? Oh, sure, search for Kim Kardashian and the results come flying back at warp speed. But queries with vague terms are often automatically reformulated into complex queries that may take significantly longer to provide results.
Achieving a consistently fast response time, regardless of the obscurity of the search term, is a challenging goal, one that requires the combined efforts of experts in engineering systems, operational data, distributed systems, machine learning, and performance optimization. Recently, Microsoft Research joined forces with Pohang University of Science and Technology (POSTECH) in Korea to tackle this challenge, and together, they’ve attained promising results.
Leading the collaboration are Professor Seung-won Hwang from POSTECH and researchers Sameh Elnikety and Yuxiong He from Microsoft Research.
Professor Seung-won Hwang participated in the 2014 Korea Day event at Microsoft Research Asia.
The goal of the collaborative project is to improve Bing search results. Even a few search queries that take too long to process (known as tail queries) can undermine user satisfaction and have a negative impact on revenues. To reduce the latency of tail queries, the collaborative team must predict whether a query will take a long time to process and therefore needs extra resources, such as selective parallelization, to resolve it quickly.
The team received the best paper runner-up award at WSDM 2015 in February. Pictured are Prof. Seung-won Hwang and Saehoon Kim from POSTECH (second and fourth from left), Yuxiong He from Microsoft Research (third from left), and WSDM program committee chairs.
The collaborative team has developed techniques that first identify and then accelerate tail queries, thereby improving server throughput by more than 70% in experimental trials. For example, by using past query logs, the team has developed a predictor that spots tail queries with a high rate of accuracy (98.9%). Those time-consuming queries are then handled by a resource manager that the team has perfected, which allocates additional hardware resources to the troublesome queries. These new techniques have been presented at top-tier conferences, including SIGIR 2014 and WSDM 2015, where the work received the best paper runner-up award.
As shown in this diagram, the predictor identifies time-consuming queries, which then are allocated additional hardware resources by the resource manager.
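The predict-then-allocate flow described above can be sketched in a few lines of Python. This is only an illustration, not the team's implementation: the feature names, the hand-set linear weights, the 50 ms threshold, and the core counts are all assumptions made for the example, whereas the real predictor is learned from past query logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryFeatures:
    """Hypothetical features a latency predictor might look at."""
    num_terms: int        # number of terms in the (reformulated) query
    total_postings: int   # summed posting-list lengths for those terms


def predict_latency_ms(features: QueryFeatures) -> float:
    """Estimate execution time with an illustrative linear model.

    The weights are made up for this sketch; a real predictor would be
    trained on query logs.
    """
    return 2.0 + 0.5 * features.num_terms + 0.0001 * features.total_postings


def choose_parallelism(predicted_ms: float,
                       free_cores: int,
                       threshold_ms: float = 50.0,
                       max_degree: int = 4) -> int:
    """Resource manager: parallelize only queries predicted to be slow."""
    if predicted_ms <= threshold_ms:
        return 1  # short query: run sequentially, save cores for others
    # Long (tail) query: use extra cores, bounded by what is available.
    return max(1, min(max_degree, free_cores))


short_query = QueryFeatures(num_terms=2, total_postings=10_000)
tail_query = QueryFeatures(num_terms=6, total_postings=2_000_000)

short_degree = choose_parallelism(predict_latency_ms(short_query), free_cores=4)
tail_degree = choose_parallelism(predict_latency_ms(tail_query), free_cores=4)
```

Parallelizing only the queries that are predicted to be slow keeps cores free for the many short queries, which is the intuition behind selective parallelization.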
The search engine project is part of a larger program sponsored by the Korea Government Collaboration Program with the Korean Ministry of Science, ICT, and Future Planning (MSIP). Through this program, some of Professor Hwang’s doctoral students have worked as interns at Microsoft Research; later, during a sabbatical, the professor herself came to Microsoft Research as a visiting scientist.
Professor Hwang praises the program for exposing students to production-scale system problems, and calls it a great opportunity to work with top-notch researchers and to publish in top-tier conferences. The benefits of the program are mutual, as Microsoft researcher Yuxiong He points out. “The complementary knowledge and skill sets of the team members have empowered us to solve important practical problems for Microsoft and the entire IT industry,” she observes.
Sameh Elnikety also highly praised the program: “From my personal experience, this program has a positive impact to all involved: students get excellent training, faculty members work on important practical problems, and researchers collaborate with top faculty members, resulting in useful publications and tech transfers.”
Professor Hwang continues to collaborate with Microsoft Research to improve search results. The team’s next challenge is to optimize the tools to better handle queries generated from mobile devices—queries that often involve searching through geo-tagged datasets. And they’re making headway: Professor Hwang will be demonstrating a geo-tagged query optimizer at the upcoming 2015 Korea Day at Microsoft Research Asia, once again showing the power of academic-industry collaboration.
—Miran Lee, Principal Research Program Manager, Microsoft Research Asia
Data science offers the potential to revolutionize areas as disparate as commerce, healthcare, cybersecurity, and politics. To make progress in these areas, we must also make progress in computer science. Specifically, we at Microsoft Research believe that the best solution to a diverse set of problems is a diverse group of technically trained experts.
Cultivating such a broad base of expertise is at the heart of the Microsoft Research Data Science Summer School, an eight-week program that introduces undergraduate students in the New York City area to large-scale data analysis and is committed to increasing diversity in computer science. The summer school therefore encourages applications from women, minorities, individuals with disabilities, and students from smaller, resource-constrained colleges.
This year’s summer school will run from June 15 to August 7, 2015. Apply online for the 2015 summer school; please note that the application deadline is April 17.
All applicants must:
The school will choose eight upper-level NYC undergraduate students who come from race, gender, and socioeconomic groups that are traditionally under-represented in computer science, or whose schools’ resources don’t meet students’ demands. Our intent is to give these young women and men a head start in their computing careers. Selected applicants will receive a laptop and a $5,000 stipend. More importantly, they will be introduced to the key tools and techniques for working with large data sets. The instruction will focus on how these tools can help solve actual problems, and will provide hands-on experience with real-world data, which is often far messier than the prepackaged data sets typically used in college courses.
The first four weeks of the summer school will introduce the students to practical tools for acquiring and interacting with data from online sources, methods from applied statistics for exploring data, and simple but effective tools from machine learning for modeling data. This will include scripting on the command line and statistical modeling in R. The course will contain a morning lecture and discussion followed by group and individual lab work in the afternoon.
The final four weeks will focus on two group research projects with mentor check-ins. Both groups will learn to apply technical tools to answer substantive scientific questions, and each will share its findings by producing a technical report, a demonstration, or both. These projects should serve as a key differentiator for graduate school applications and for those seeking research jobs, and a particularly successful project could lead to a scientific publication and/or recognition at a major conference. For example, last year’s summer school projects were accepted to the 2014 KDD Workshop on Data Science for Social Good and were recognized during the poster session at the 2015 ACM Richard Tapia Celebration of Diversity in Computing, the Association for Computing Machinery’s premier diversity event.
In addition to increasing diversity in computer science, the Microsoft Research Data Science Summer School also fosters long-term interactions between Microsoft Research and talented young students from the New York City area. Our over-arching goal is to get the students excited about computer science and to show them the creative, research side of the discipline, which they may not have encountered in their classes. In the process, we hope to prepare them for future careers in computer science.
—Jake Hofman & Justin Rao, Researchers, Microsoft Research New York City
Since its introduction in 2013, the Lab of Things (LoT) has captured the imagination of researchers, who are using this flexible platform for experimental research with connected devices. During the past six months, we’ve updated and added features to the LoT. We’ve also seen LoT adopted in the classroom and used for some interesting research projects. We would like to share a few of these projects with you, and hope that they will inspire you to try using the Lab of Things for your own research.
Bringing auditory messages to people who are deaf or hard of hearing
The oven timer beeps, the doorbell rings, the smoke alarm blares: our homes are full of devices that deliver important messages via sound. But to people who cannot hear them, those acoustic messages remain undelivered. The Sound Choice team, whose members are students at the University of Washington, set out to solve this predicament. Using LoT, the student researchers integrated auditory data from a network of home sensors and processed the information in real time. The system then relayed the information to a wearable smartwatch that translated the message into tactile and visual output.
Monitoring elderly community residents
Many older people prefer to “age in place,” remaining in their own home as long as possible. But this poses serious problems for elderly folks living on their own. What happens if they fall or suffer a stroke? Who would know? While attending the University of Washington as a visiting scholar, Christian Bock, a student at Germany’s Heidelberg University, developed an experimental system for monitoring elderly people who live alone. His prototype uses three sensors (one in the kitchen, one on the refrigerator door, and one on the front door) to monitor the movements of the elderly resident. LoT links the devices together and stores the data in the Microsoft Azure cloud, where it is analyzed for signs of inactivity that could indicate an injury or illness. The data could be shared with family or community caregivers, who could then intervene in the event of an apparent medical problem.
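To make the idea of checking sensor data for signs of inactivity concrete, here is a minimal Python sketch. It is not the prototype's actual analysis, which the project description does not detail: the event format, the sensor names, and the six-hour threshold are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def find_inactivity_gaps(events, max_gap, now):
    """Return (start, end) pairs where no sensor fired for longer than max_gap.

    events  -- list of (sensor_name, timestamp) tuples, in any order
    max_gap -- timedelta; longer silences are reported (assumed threshold)
    now     -- current time, so the silence since the last event is checked too

    With no events at all, nothing is reported; a real system would
    probably want to treat that case as an alert of its own.
    """
    times = sorted(timestamp for _, timestamp in events)
    gaps = []
    # Pair each event with the next one; the last event is paired with `now`.
    for earlier, later in zip(times, times[1:] + [now]):
        if later - earlier > max_gap:
            gaps.append((earlier, later))
    return gaps


# Hypothetical readings from the three sensors on a single day.
day = datetime(2015, 3, 1)
events = [
    ("kitchen", day.replace(hour=8, minute=0)),
    ("refrigerator_door", day.replace(hour=8, minute=5)),
    ("front_door", day.replace(hour=12, minute=30)),
]

gaps = find_inactivity_gaps(
    events,
    max_gap=timedelta(hours=6),
    now=day.replace(hour=21, minute=0),
)
```

In this example the only silence longer than six hours runs from the last front-door event at 12:30 until 21:00, which is the kind of gap that could prompt a caregiver to check in.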
Learning about the Internet of Things
Home sensors connected via LoT are just part of the much broader Internet of Things (IoT), that vast array of sensors in our houses, cars, stores, offices, and public spaces. It’s vital that researchers understand how to use the IoT as they design new systems. And what better place to start than by mastering the LoT? That was the conclusion of the faculty at Korea’s Kookmin University, whose Smart Embedded System Lab has been equipped with a comprehensive IoT curriculum based on the LoT platform. Students will use this curriculum to complete final projects across many different departments.
Evaluating smart home apps
At another Korean university, the Daegu Gyeongbuk Institute of Science and Technology, project lead Minsu Jo and his classmates are using LoT to understand the nature of people’s everyday activities in a home setting. To do so, they’re evaluating several smart home scenarios in their lab, which has been equipped with a variety of homelike sets and sensors, and they’re employing a home dashboard that lets users review and control the various apps. If you understand Korean, you’ll want to check out their video, which provides a high-level introduction to LoT and shows off some of their research.
Using LoT for teaching
As the foregoing examples show, people are using LoT as a teaching and research tool at universities around the world, and many of the student projects have been highly creative and potentially useful. See more LoT-based student projects and teaching materials, including university-level class curricula.
Integrating with Microsoft Azure services
Recently, we have added two samples to CodePlex that demonstrate how you can send LoT sensor data to the cloud via some powerful but easy-to-use Azure services. The first sample shows how to use the Azure Mobile Services SDK to write data to a SQL Azure database from a LoT application. The second sample demonstrates how to integrate LoT with (1) Azure Event Hubs, which enables your app to process massive amounts of sensor data, and (2) Azure Stream Analytics, which lets you process complex event data in a low-latency, readily available, and highly scalable cloud environment.
Now that you’ve learned about just a few of the creative and noteworthy ways that students and researchers are using the LoT platform, we hope that you’ll download the latest version and start deploying your research studies.
—Arjmand Samuel, Senior Research Program Manager, Microsoft Research