On a gloomy day in December 2009, an international panel of experts met at the unlikely venue of a football stadium on the outskirts of Oxford, U.K. The panel, chaired by Dan Atkins, who recently stepped down as director of the National Science Foundation's (NSF's) Office of Cyberinfrastructure, convened to review the achievements of the U.K. e-Science Programme, which I had the privilege of directing from 2001 to 2005, before I joined Microsoft Research. The venue for the review was chosen to overlap with the 2009 U.K. e-Science All Hands Meeting (AHM). These AHMs were begun in 2002, and their continuation three years after the end of the program's formal funding is, for me, a testament to the passion and strength of the multidisciplinary e-science community that we created. I was present in Oxford to discuss my management and organization of the e-Science Core Programme, and I was curious to see what impression the achievements of U.K. e-science would make on this distinguished panel.
The review was organized by the U.K. Engineering and Physical Sciences Research Council (EPSRC), which had set up a punishing schedule for the panel: a virtually nonstop series of interviews and visits, with hardly a moment to breathe. The results of the review have now been published on the EPSRC website, and I was delighted that the panel had concluded "that the U.K. e-Science Program is in a world-leading position along the path of building a U.K. Foundation for the Transformative Enhancement of Research and Innovation." They further declared that "the U.K. has created a 'jewel', a pioneering, vital activity of enormous strategic importance to the pursuit of scientific knowledge and the support of allied learning."
The report concluded with recommendations for action by the United Kingdom and included a plea for the need to support "Crossing the Chasm" between research prototypes and mainstream cyberinfrastructure. Atkins recently presented a summary of the international panel's conclusions to the NSF Advisory Committee for Cyberinfrastructure.
The NSF's Office of Cyberinfrastructure is developing a detailed implementation plan for U.S. cyberinfrastructure. Ideally, the NSF will take some of the good things from the U.K. e-science experience and avoid some of those that proved less successful!
-Tony Hey, corporate vice president, External Research, a division of Microsoft Research
“If I have seen further, it is by standing on the shoulders of giants.”—Sir Isaac Newton
Standing on the shoulders of giants is a metaphor we often use to describe how research advances. More than an aphorism, it is a mindset that we ingrain in students when they start graduate school: take the time to understand the current state of the art before attempting to advance it further. Having to justify, during your PhD defense, why you have reinvented the wheel is not a comfortable situation to be in. Moreover, the value of truly reproducible research is reinforced every time a paper is retracted because its results cannot be reproduced, or every time that promising academic research, such as the pursuit of important new drugs, fails to meet the test of reproducibility.
Of course, to truly learn from work that has preceded yours, you need access to it. How can you build on the latest research if you don’t know its details? Thankfully, open access (OA) is making it easier to find research papers, and Microsoft Research is committed to OA. Though it’s a good start, OA articles only contain words and pictures. What about the data, software, input parameters, and everything else needed to reproduce the research?
While research software provides the potential for better reproducibility, most people agree that we are some way from achieving this. It’s not just a matter of throwing your source code online. Even though tools such as GitHub provide excellent sharing and versioning, it is up to the researcher or developer to make sure the code can not only be re-run but also be understood by others. There are still technical issues to overcome, but the social ones are even harder to tackle. The development of scientific software and researchers’ selection of which software to use and reuse are intertwined. We at Microsoft Research are concerned with this; see “Troubling Trends in Scientific Software” in the May 17, 2013, issue of Science magazine.
Kenji Takeda talks about reproducible research and the cloud at CW14. Photo: Tim Parkinson, CC-BY
This year’s Collaboration Workshop (CW14), run by the Software Sustainability Institute (SSI), brought together like-minded innovators from a broad spectrum of the research world (researchers, software developers, managers, funders, and more) to explore the role of software in reproducible research. This theme couldn’t have been timelier, and I was excited to take part in this dynamic event again with a talk on reproducible research and the cloud. The “unconference” format, where the agenda is driven by attendees’ participation, was perfect for exploring the many issues around reproducible research and software. So, too, was the eclectic make-up of the attendees, so unlike that at more conventional conferences.
Hack Day winners receive Windows 8.1 tablets for Open Source Health Check. Left to right: Arfon Smith (GitHub), Kenji Takeda (Microsoft Research), James Spencer (Imperial College), Clyde Fare (Imperial College), Ling Ge (Imperial College), Mark Basham (DIAMOND), Robin Wilson (University of Southampton), Neil Chue-Hong (Director, SSI), Shoaib Sufi (SSI)
Instead of leaving after two days, many participants stayed on for Hack Day, a hackathon that challenged them to create real solutions to problems that surfaced at the workshop. Eight team leaders pitched their ideas to the crowd, and the researchers and software developers literally voted with their feet to join their favorite team. The ideas were impressively diverse: scraping the web to catalogue scientific software citations, extending GitHub to natively visualize scientific data, and assessing research code quality online. We made sure that teams were able to use Microsoft Azure to quickly set up websites, Linux virtual machines, and processing back-ends to build their solutions.
Arfon Smith from GitHub and I served as judges, and we had a tough time choosing a winning project. After much back-and-forth, we awarded the honor to the Open Source Health Check team, which created an elegant and genuinely usable service that combines some of the best practices discussed during the workshop. Their prototype runs a checklist on any GitHub repository to make sure that it incorporates the critical components for reproducibility, including documentation, an explicit license, and a citation file. The team worked furiously to implement this, including deploying it on Microsoft Azure and integrating it with the GitHub API, to demonstrate a complete online working system.
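To give a flavor of what such a checklist might look like, here is a minimal sketch in Python. It is not the team's code, which worked through the GitHub API. This version simply inspects a local checkout of a repository, and the checklist items, the file names that satisfy them, and the function name `health_check` are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical checklist: each item maps to the top-level file-name stems
# (lowercased, extension removed) that we assume would satisfy it.
CHECKLIST = {
    "documentation": ("readme",),
    "license": ("license", "licence", "copying"),
    "citation": ("citation",),
}


def health_check(repo_dir):
    """Report which checklist items a local repository checkout satisfies.

    Only top-level files are inspected; the result maps each checklist
    item to True (a matching file exists) or False (none was found).
    """
    stems = {p.stem.lower() for p in Path(repo_dir).iterdir() if p.is_file()}
    return {
        item: any(name in stems for name in names)
        for item, names in CHECKLIST.items()
    }


if __name__ == "__main__":
    # Check the current directory and print a simple pass/fail report.
    for item, ok in health_check(".").items():
        print(f"{'PASS' if ok else 'FAIL'}  {item}")
```

Run from the root of a checkout, the script prints one PASS or FAIL line per item. A real service would need to look further, for example at documentation kept in a subdirectory, but even a check this small makes missing pieces visible.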
Recomputation.org aims to make computational experiments easily reproducible decades into the future.
In addition to our role at CW14, Microsoft Research is delighted to be supporting teams working on new approaches to scientific reproducibility as part of our Microsoft Azure for Research program.
While we still have not achieved truly reproducible research, CW14 proved that the community is dedicated to improving the situation, and cloud computing has an increasingly important role to play in enabling reproducible research.
—Kenji Takeda, Solutions Architect and Technical Manager, Microsoft Research Connections
Sitting on a plane heading back to the Pacific Northwest, I’m reflecting on the week I just spent in Minneapolis—a week of inspiration and impact at the Grace Hopper Celebration (GHC) of Women in Computing. I’m thinking about the pertinence of this year’s GHC theme, “Think Big, Drive Forward,” and how our 260-strong contingent of Microsoft employees carried that message forward. Wearing t-shirts emblazoned with the word “Innovator,” my fellow Softies and I strove to support and inspire the next generation of women computer scientists.
Aspirations in Computing Dinner Celebration at Grace Hopper.
It was invigorating to hear from Microsoft leaders Julie Larson-Green and Jacky Wright, as they, along with Maria Klawe, a Microsoft board member and president of Harvey Mudd College, informed conference attendees about career paths, technical leadership, and the future of women at Microsoft. Seeing young professionals’ eyes light up upon hearing that women comprise 29 percent of our senior leadership team, I could sense a renewed interest in careers at Microsoft.
Microsoft’s senior technical women and executives also held closed-door sessions for the company’s GHC attendees, encouraging them to drive their careers forward and be the new spirit of our company. This message took on even greater resonance, among both the Microsoft and general attendees, when it was announced that Microsoft had just been named the most inspiring American company by Forbes magazine.
While such accolades are great, we know that for our company to continue to lead technological innovations and succeed in our transformative vision of “One Microsoft,” we will need more gender diversity on our research teams. Moreover, we can build those diverse teams only if the female talent is available, which means that we need to increase the number of women who are pursuing advanced degrees in computer science. We need to take direct action, like that of my fellow researchers—A. J. Brush, Jaeyeon Jung, Jaime Teevan, and Kathryn McKinley—who spent the conference helping PhD attendees prepare their poster presentations, find their dream jobs, publish their research, and pursue career opportunities.
But attracting more women to computing is an enormous task, one that is beyond the capabilities of any one company alone. Fortunately, the country’s top computer science institutions have banded together in the National Center for Women & Information Technology Academic Alliance (NCWIT AA), a broad partnership that includes academic, nonprofit, government, and industry members. These institutions will help us truly grow the pipeline of women innovators, which is why Microsoft Research is pleased to offer them project start-up assistance through the MSR NCWIT AA Seed Fund. The seed funds are designated for initiatives that recruit and retain women in computing and IT.
My favorite part of the conference is spending time with the winners of the NCWIT Aspirations in Computing Award. This award recognizes female high school students who have the potential to become amazing computer scientists. These young women run summer camps to excite middle school girls about computer science through Aspire IT. We were excited to support this year’s camp leaders with Surface devices and Kodu Touch, which exposed young women to game development. On Wednesday we hosted a special session with past winners and Microsoft executives, and on Friday night we honored 60 winners across the United States at meet-up sessions in 12 of our Microsoft retail stores.
Pictured from left to right: Kinect aspiration winner Rochelle Willard from USC with Rane Johnson-Stempson and Rico Malvar from Microsoft Research.
On Saturday, we ended the conference by challenging attendees to “think big and drive forward” change in disaster response during the Grace Hopper Open Source Day. Free and open source software (FOSS) usage is becoming widespread, but learning how to contribute to an existing FOSS project or to release a new open source application can be daunting. Open Source Day enabled participants to spend time coding for an existing FOSS project or to get help starting their own community-developed software project. Our Microsoft Disaster Response Team led a group of young women working to create open source applications for disaster response.
This year’s GHC inspired not only me, but 4,600 other attendees, exciting us all to change the future of technology and women in computing. If every attendee encouraged and mentored just one budding female computer scientist, we could almost double the number of women studying computer science today at US universities. I am extremely optimistic we will make a difference, and I can’t wait to see the technology innovations that women will drive.
—Rane Johnson-Stempson, Director, Education and Scholarly Communication, Microsoft Research Connections