More than half of the world’s population now lives in cities and suburbs, and as just about any of these billions of people can tell you, urban traffic can be a nightmare. Cars stack up bumper-to-bumper, clogging our highways, jangling our nerves, taxing our patience, polluting our air, and taking a toll on our productivity. In short, traffic jams impair our emotional, physical, and economic wellbeing.
A study by the Brazilian National Association of Public Transport showed that the country’s traffic exacted an economic toll of about US$7.2 million in 1998. Unfortunately, it’s only getting worse; there are now about three times as many vehicles in Brazil, making traffic exponentially worse, according to Fernando de Oliveira Pessoa, a traffic expert in Belo Horizonte, Brazil’s sixth-largest city.
Microsoft Research has joined forces with the Federal University of Minas Gerais, home to one of Brazil’s foremost computer science programs, to tackle the seemingly intractable problem of traffic jams. The immediate objective of this research is to predict traffic conditions over the next 15 minutes to an hour, so that drivers can be forewarned of likely traffic snarls.
The aptly named Traffic Prediction Project plans to combine all available traffic data (both historical and current information gleaned from transportation departments, Bing traffic maps, road cameras and sensors, and the social networks of the drivers themselves) to create a solution that gets motorists from point A to point B with minimal stop-and-go. Two aspects of the project are unique: its use of historical data and its use of information from social networks.
By using algorithms to process all these data, the project team intends to predict traffic jams accurately so that drivers can make smart, real-time choices, like taking an alternative route, using public transit, or maybe even just postponing a trip. The predictions should also be invaluable to traffic planners, especially when they are working to accommodate traffic from special events and when planning for future transportation needs.
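To make the idea of combining historical and current data concrete, here is a minimal sketch in Python. It is not the project's model, which the post does not describe. The function names, the 0.6 weighting, and the 20 km/h jam threshold are all assumptions chosen purely for illustration.

```python
from statistics import mean


def predict_speed(historical_speeds, current_speed, weight_current=0.6):
    """Blend the current reading with the historical average for a road segment.

    Hypothetical illustration: weight_current is an assumed tuning knob,
    not a value taken from the Traffic Prediction Project.
    """
    if not historical_speeds:
        # With no history to draw on, fall back to the current reading alone.
        return float(current_speed)
    historical_avg = mean(historical_speeds)
    return weight_current * current_speed + (1 - weight_current) * historical_avg


def is_jam_likely(predicted_speed_kmh, jam_threshold_kmh=20.0):
    """Flag a likely jam when the predicted speed falls below an assumed threshold."""
    return predicted_speed_kmh < jam_threshold_kmh


if __name__ == "__main__":
    # Speeds (km/h) seen at this time of day on earlier days, plus a live reading.
    predicted = predict_speed([30, 30], current_speed=5)
    print(predicted, is_jam_likely(predicted))
```

A real service would replace the fixed weighting with a model trained on the data sources listed above, but the shape of the calculation, history plus live signal in and a forecast out, stays the same.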
Achieving reliable predictions will involve processing terabytes of data, which is why the researchers are using Microsoft Azure as the platform for the service. The exceptional scalability, immense storage capacity, and prodigious computational power of Microsoft Azure make it the perfect resource for this data-intensive project. And because Microsoft Azure is cloud-based, running the Traffic Prediction service on Azure makes it accessible to all users, in real time, all of the time.
To date, the researchers have tested their prediction model in some of the world’s most traffic-challenged cities: New York, Los Angeles, London, and Chicago. The model achieved a prediction accuracy of 80 percent, and that was based on using only traffic-flow data. The researchers expect the accuracy to increase to 90 percent when traffic incidents and data from social networks are folded in.
So the next time your highway resembles a long, thin parking lot, you might calm yourself by contemplating how Microsoft Azure and the Traffic Prediction Project might help you avoid such tie-ups in the future.
—Juliana Salles, Senior Program Manager, Microsoft Research
Have you ever found yourself waiting for results from your Internet search engine? Oh, sure, search for Kim Kardashian and the results come flying back at warp speed. But queries with vague terms are often automatically reformulated into complex queries that may take significantly longer to provide results.
Achieving a consistently fast response time, regardless of the obscurity of the search term, is a challenging goal, one that requires the combined efforts of experts in engineering systems, operational data, distributed systems, machine learning, and performance optimization. Recently, Microsoft Research joined forces with Pohang University of Science and Technology (POSTECH) in Korea to tackle this challenge, and together, they’ve attained promising results.
Leading the collaboration are Professor Seung-won Hwang from POSTECH and researchers Sameh Elnikety and Yuxiong He from Microsoft Research.
Professor Seung-won Hwang participated in the 2014 Korea Day event at Microsoft Research Asia.
The goal of the collaborative project is to improve Bing search results. Even a few search queries that take too long to process (known as tail queries) can undermine user satisfaction and have a negative impact on revenues. To reduce the latency of tail queries, the collaborative team must predict whether a query will take a long time to process and will therefore need extra resources, such as selective parallelization, to be resolved quickly.
The team received the best paper runner-up award at WSDM 2015 in February. Pictured are Prof. Seung-won Hwang and Saehoon Kim from POSTECH (second and fourth from left), Yuxiong He from Microsoft Research (third from left), and WSDM program committee chairs.
The collaborative team has developed techniques that first identify and then accelerate tail queries, thereby improving server throughput by more than 70% in experimental trials. For example, by using past query logs, the team has developed a predictor that spots tail queries with a high rate of accuracy (98.9%). Those time-consuming queries are then handled by a resource manager that the team has perfected, which allocates additional hardware resources to the troublesome queries. These new techniques have been presented at top-tier conferences, including SIGIR 2014 and WSDM 2015, where the work received the best paper runner-up award.
Diagram: The predictor identifies time-consuming queries, and the resource manager then allocates additional hardware resources to them.
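The predict-then-allocate flow described above can be sketched in a few lines of Python. This is an illustrative toy, not the team's implementation. The query features, the hand-set weights in `predict_latency_ms`, the 100 ms threshold, and the thread cap are all assumptions. The actual predictor is learned from past query logs.

```python
from dataclasses import dataclass


@dataclass
class Query:
    text: str
    num_terms: int   # hypothetical feature: number of terms in the query
    rewritten: bool  # hypothetical feature: query was automatically reformulated


def predict_latency_ms(query: Query) -> float:
    """Toy stand-in for a learned predictor; the weights are made up."""
    return 20.0 + 15.0 * query.num_terms + (120.0 if query.rewritten else 0.0)


def allocate_threads(query: Query, tail_threshold_ms: float = 100.0,
                     max_threads: int = 4) -> int:
    """Give predicted tail queries extra threads; run all others sequentially."""
    predicted = predict_latency_ms(query)
    if predicted <= tail_threshold_ms:
        return 1  # fast query: parallelizing it would waste resources
    # Scale the degree of parallelism with predicted latency, up to a cap.
    return min(max_threads, 1 + int(predicted // tail_threshold_ms))


if __name__ == "__main__":
    for q in (Query("kim kardashian", 2, False),
              Query("obscure reformulated query", 6, True)):
        print(q.text, predict_latency_ms(q), allocate_threads(q))
```

Parallelizing only the queries predicted to be slow is what keeps the approach cheap: most queries still run on a single thread, so the extra hardware goes only to the few that would otherwise stall.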
The search engine project is part of a larger program sponsored by the Korea Government Collaboration Program with the Korean Ministry of Science, ICT, and Future Planning (MSIP). Through this program, some of Professor Hwang’s doctoral students have worked as interns at Microsoft Research; later, during a sabbatical, the professor herself came to Microsoft Research as a visiting scientist.
Professor Hwang praises the program for exposing students to production-scale system problems, and calls it a great opportunity to work with top-notch researchers and to publish in top-tier conferences. The benefits of the program are mutual, as Microsoft researcher Yuxiong He points out. “The complementary knowledge and skill sets of the team members have empowered us to solve important practical problems for Microsoft and the entire IT industry,” she observes.
Sameh Elnikety also highly praised the program: “From my personal experience, this program has a positive impact to all involved: students get excellent training, faculty members work on important practical problems, and researchers collaborate with top faculty members, resulting in useful publications and tech transfers.”
Professor Hwang continues to collaborate with Microsoft Research to improve search results. The team’s next challenge is to optimize the tools to better handle queries generated from mobile devices—queries that often involve searching through geo-tagged datasets. And they’re making headway: Professor Hwang will be demonstrating a geo-tagged query optimizer at the upcoming 2015 Korea Day at Microsoft Research Asia, once again showing the power of academic-industry collaboration.
—Miran Lee, Principal Research Program Manager, Microsoft Research Asia
Data science offers the potential to revolutionize areas as disparate as commerce, healthcare, cybersecurity, and politics. To make progress in these areas, we must also make progress in computer science. Specifically, we at Microsoft Research believe that the best solution to a diverse set of problems is a diverse group of technically trained experts.
Cultivating such a broad base of expertise is at the heart of the Microsoft Research Data Science Summer School, an eight-week program that introduces large-scale data analysis to undergraduate students in the New York City area and is committed to increasing diversity in computer science. The summer school therefore encourages applications from women, minorities, individuals with disabilities, and students from smaller, resource-constrained colleges.
This year’s summer school will run from June 15, 2015 to August 7, 2015. Apply online for the 2015 summer school; please note that the application deadline is April 17.
All applicants must:
The school will choose eight upper-level NYC undergraduate students who come from race, gender, and socioeconomic groups that are traditionally under-represented in computer science, or whose schools’ resources don’t meet students’ demands. Our intent is to give these young women and men a head start in their computing careers. Selected applicants will receive a laptop and a $5,000 stipend. More importantly, they will be introduced to the key tools and techniques for working with large data sets. The instruction will focus on how these tools can help solve actual problems, and will provide hands-on experience with real-world data, which is often far messier than the prepackaged data sets typically used in college courses.
The first four weeks of the summer school will introduce the students to practical tools for acquiring and interacting with data from online sources, methods from applied statistics for exploring data, and simple but effective tools from machine learning for modeling data. This will include scripting on the command line and statistical modeling in R. The course will contain a morning lecture and discussion followed by group and individual lab work in the afternoon.
The final four weeks will focus on two group research projects with mentor check-ins. Both groups will learn to apply technical tools to answer substantive scientific questions, and each will share its findings by producing a technical report, a demonstration, or both. These projects should serve as a key differentiator for graduate school applications and for those seeking research jobs, and a particularly successful project could lead to a scientific publication and/or recognition at a major conference. For example, last year’s summer school projects were accepted to the 2014 KDD Workshop on Data Science for Social Good and were recognized during the poster session at the 2015 ACM Richard Tapia Celebration of Diversity in Computing, the Association for Computing Machinery’s premier diversity event.
In addition to increasing diversity in computer science, the Microsoft Research Data Science Summer School also fosters long-term interactions between Microsoft Research and talented young students from the New York City area. Our overarching goal is to get the students excited about computer science and to show them the creative, research side of the discipline, which they may not have encountered in their classes. In the process, we hope to prepare them for future careers in computer science.
—Jake Hofman & Justin Rao, Researchers, Microsoft Research New York City