“If I have seen further, it is by standing on the shoulders of giants.”—Sir Isaac Newton
Standing on the shoulders of giants is a metaphor we often use to describe how research advances. More than an aphorism, it is a mindset that we ingrain in students when they start graduate school: take the time to understand the current state of the art before attempting to advance it further. Having to justify why you have reinvented the wheel during your PhD defense is not a comfortable situation to be in. Moreover, the value of truly reproducible research is reinforced every time a paper is retracted because its results cannot be reproduced, or every time that promising academic research—such as pursuit of important new drugs—fails to meet the test of reproducibility.
Of course, to truly learn from work that has preceded yours, you need access to it. How can you build on the latest research if you don’t know its details? Thankfully, open access (OA) is making it easier to find research papers, and Microsoft Research is committed to OA. Though it’s a good start, OA articles only contain words and pictures. What about the data, software, input parameters, and everything else needed to reproduce the research?
While research software provides the potential for better reproducibility, most people agree that we are some way from achieving this. It’s not just a matter of throwing your source code online. Even though tools such as GitHub provide excellent sharing and versioning, it is up to the researcher or developer to make sure the code can not only be re-run but also be understood by others. There are still technical issues to overcome, but the social ones are even harder to tackle. The development of scientific software and researchers’ choices about which software to use and reuse are intertwined. We at Microsoft Research are concerned with this; see “Troubling Trends in Scientific Software” in the May 17, 2013, issue of Science magazine.
Kenji Takeda talks about reproducible research and the cloud at CW14. Photo: Tim Parkinson, CC-BY
This year’s Collaboration Workshop (CW14), run by the Software Sustainability Institute (SSI), brought together like-minded innovators from a broad spectrum of the research world (researchers, software developers, managers, funders, and more) to explore the role of software in reproducible research. This theme couldn’t have been timelier, and I was excited to take part in this dynamic event again with a talk on reproducible research and the cloud. The “unconference” format, in which the agenda is driven by attendees’ participation, was perfect for exploring the many issues around reproducible research and software. So, too, was the eclectic make-up of the attendees, so unlike that at more conventional conferences.
Hack Day winners receive Windows 8.1 tablets for Open Source Health Check. Left to right: Arfon Smith (GitHub), Kenji Takeda (Microsoft Research), James Spencer (Imperial College), Clyde Fare (Imperial College), Ling Ge (Imperial College), Mark Basham (DIAMOND), Robin Wilson (University of Southampton), Neil Chue-Hong (Director, SSI), Shoaib Sufi (SSI)
Instead of leaving after two days, many participants stayed on for Hack Day, a hackathon that challenged them to create real solutions to problems surfaced at the workshop. Eight team leaders pitched their ideas to the crowd, and the researchers and software developers literally voted with their feet to join their favorite team. The ideas were impressively diverse: scraping the web to catalogue scientific software citations, extending GitHub to natively visualize scientific data, and assessing research code quality online, to name a few. We made sure that teams were able to use Microsoft Azure to quickly set up websites, Linux virtual machines, and processing back-ends to build their solutions.
Arfon Smith from GitHub and I served as judges, and we had a tough time choosing a winning project. After much back-and-forth, we awarded the honor to the Open Source Health Check team, which created an elegant and genuinely usable service that combines some of the best practices discussed during the workshop. Their prototype runs a checklist on any GitHub repository to make sure that it incorporates the critical components for reproducibility, including documentation, an explicit license, and a citation file. The team worked furiously to implement this, including deploying it on Microsoft Azure and integrating it with the GitHub API, to demonstrate a complete online working system.
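To give a flavor of what such a checklist might look like, here is a minimal sketch in Python. It is not the Open Source Health Check team’s code: the function names, the checklist table, and the file-name prefixes are all assumptions made for illustration, and it inspects a plain list of file names (or a local folder) rather than calling the GitHub API.

```python
from pathlib import Path

# Hypothetical checklist (an assumption, not the team's actual rules):
# each item is satisfied by any top-level file whose lowercase name
# starts with one of the listed prefixes.
CHECKLIST = {
    "documentation": ("readme",),
    "license": ("license", "licence", "copying"),
    "citation": ("citation",),
}


def check_filenames(filenames):
    """Return {checklist item: True/False} for a list of file names."""
    names = [name.lower() for name in filenames]
    return {
        item: any(name.startswith(prefixes) for name in names)
        for item, prefixes in CHECKLIST.items()
    }


def health_check(repo_dir):
    """Run the checklist against the top-level files of a local checkout."""
    files = [p.name for p in Path(repo_dir).iterdir() if p.is_file()]
    return check_filenames(files)


if __name__ == "__main__":
    report = check_filenames(["README.md", "LICENSE", "analysis.py"])
    for item, passed in report.items():
        print(f"{item}: {'ok' if passed else 'missing'}")
```

A real service would presumably look deeper than file names (for example, checking that the license is a recognized one), but even a simple pass like this surfaces the most common omissions.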
Recomputation.org aims to make computational experiments easily reproducible decades into the future.
In addition to our role at CW14, Microsoft Research is delighted to be supporting teams working on new approaches to scientific reproducibility as part of our Microsoft Azure for Research program:
While we still have not achieved truly reproducible research, CW14 proved that the community is dedicated to improving the situation, and cloud computing has an increasingly important role to play in enabling reproducible research.
—Kenji Takeda, Solutions Architect and Technical Manager, Microsoft Research Connections
Most parents want their children to have access to the best educational opportunities at schools with broad, enriching curricula. Students attending such schools may find themselves challenged with finding sufficient time to study any one subject adequately—in or out of the classroom. MyCloud, an innovative e-learning platform developed by Microsoft Research Asia, helps solve this problem by providing students and teachers with an interactive space for collaboration, exploration, and enrichment.
Students from Nan Chiau Primary School in Singapore, which has been using the MyCloud e-learning platform since 2011 for Chinese language instruction.
Originally developed to assist in teaching Chinese to students in Singapore, MyCloud is a web-based, interactive platform that allows teachers and students to extend learning beyond the classroom. Students can use a tablet, smartphone, laptop, or desktop computer to access the platform and complete assignments. Having grown up with technology, today’s students are very comfortable with using it in an educational setting; in fact, the high-tech aspect of this innovative platform captures students’ attention and interest, promoting engagement and fueling an intrinsic motivation to excel academically.
By using MyCloud, teachers can upload assigned lessons directly, knowing that their students can readily access the assignments and easily follow the instructions. Teachers can also upload supplemental activities and lessons that complement and expand upon the material covered during limited classroom time. These supplemental activities not only broaden and enhance course content but also let students learn at their own pace, because the student is in control of how and when they use MyCloud. Students can take uploaded tests on the e-learning platform to assess their progress. The audio component is particularly valuable to students who are learning a foreign language, as they can practice their speaking and listening skills and readily learn new vocabulary. Video uploads will soon be added to promote students’ learning even further.
The value of this e-learning platform is evident at Nan Chiau Primary School in Singapore, which has been using it since 2011 for Chinese language instruction. Teachers at Nan Chiau understand that students must complete time-consuming exercises to learn Chinese vocabulary and tonal inflections, but the time allotted for classroom instruction is limited. MyCloud has allowed the students to pursue their mastery of Chinese on their own time and at their own pace, which respects the fact that individuals learn at different rates and enhances students’ enjoyment of learning. Students have shown increased proficiency in the Chinese language as a result of using the e-learning platform, and Microsoft’s partnership with Nan Chiau Primary School demonstrates how schools can successfully use this technology to enhance learning and empower students.
—Winnie Cui, Microsoft Research Asia, Senior UR Manager
Scientists around the world are striving tirelessly to monitor and model the environment—to understand the intricate workings of our ecosystem—so that policymakers can make informed decisions that lead to a sustainable future for “spaceship Earth.” This research involves using the thousands of available environmental datasets, on everything from agriculture and biodiversity to climate and the oceans. But finding, browsing, choosing, and downloading the right data can be ridiculously hard, even for the experts.
What if finding environmental data were as simple as clicking on a map?
Enter FetchClimate, a tool that makes locating environmental information as easy as searching for a hotel or coffee shop online. Just draw a box around the geographic area you’re interested in, select the environmental information you want, and view the data on Bing Maps within seconds. What used to take researchers hours, days, or even weeks can now be done very quickly—by anyone. When possible, FetchClimate calculates data uncertainty, so you know how reliable the information is, and the tool allows you to specify precisely the size of the area and the period of time for your query.
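For readers who think in code, the idea of “a box, a variable, and a time period” can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Query` class, the `answer` function, and the observation tuples are hypothetical names invented here, not FetchClimate’s API, and the standard deviation is used as a crude stand-in for spread rather than the uncertainty calculation FetchClimate actually performs.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Query:
    """Hypothetical query: one variable, a lat/lon box, and a span of years."""
    variable: str
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float
    year_start: int
    year_end: int

    def __post_init__(self):
        # Reject boxes and periods that make no geographic or temporal sense.
        if not (-90 <= self.lat_min <= self.lat_max <= 90):
            raise ValueError("need -90 <= lat_min <= lat_max <= 90")
        if not (-180 <= self.lon_min <= self.lon_max <= 180):
            raise ValueError("need -180 <= lon_min <= lon_max <= 180")
        if self.year_start > self.year_end:
            raise ValueError("need year_start <= year_end")


def answer(query, observations):
    """Average the matching observations.

    `observations` is an iterable of (variable, lat, lon, year, value)
    tuples. Returns (mean, standard deviation), or None if nothing matches.
    """
    values = [
        value
        for variable, lat, lon, year, value in observations
        if variable == query.variable
        and query.lat_min <= lat <= query.lat_max
        and query.lon_min <= lon <= query.lon_max
        and query.year_start <= year <= query.year_end
    ]
    if not values:
        return None
    return mean(values), pstdev(values)
```

Reporting a spread alongside the average is the point of the exercise: a single number with no indication of its reliability is far less useful to a researcher or policymaker.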
FetchClimate runs in the cloud, on Microsoft Azure, meaning there is no physical limit on how much information can be added. You can not only look at historical climate data but also peer into the future, as we have included forecast data from the latest climate simulation experiments. For example, you can see what the predicted temperature or precipitation in your area will be in 2050.
Visualization of year-to-year precipitation averages in southern Asia
The Computational Ecology and Environmental Science group in Microsoft Research has spent several years developing FetchClimate, working with Moscow State University, which provided software development, and the DigiLab at the London College of Communication, which designed an interface that makes finding and understanding environmental information stress-free. So we’re excited to be releasing FetchClimate—in three different ways—for anyone to use for research, study, or just to satisfy their curiosity about our planet.
The deployment package will be attractive to individuals, research teams, national laboratories, and international collaborations who are used to dealing with geographical data and are keen to share it with colleagues and the outside world in a more dynamic way. For example, Ireland’s Marine Institute has created the Irish Digital Ocean–SMART Marine Research Platform to stimulate collaborative research across the marine sector. As Eoin O’Grady, Information Services & Development Manager at the Marine Institute, explains, “FetchClimate greatly simplifies access to scientific data, promoting reuse. We see it as an excellent way to share Irish marine research data, part of the Irish Digital Ocean, with a broad range of users in the marine community, to support research and innovation and as input into public information services.”
In addition, we are currently sponsoring a special Climate Data Initiative that offers grants of Microsoft Azure resources to help early adopters set up their own FetchClimate-powered services. Using the deployment package, you will be able to implement your own instance of FetchClimate, including your datasets and a web front end that is customized for your own site—and we’ll provide the space on Azure! If you would like to pursue this, please submit a proposal by June 15, 2014. We will be selecting 40 awardees from among these proposals.
We created FetchClimate as a way to turn data into actionable information, and to make that information easily available to the world. There are some exciting features that we haven’t discussed here (hint: what if you could upload a model, not just data?), and FetchClimate is just one of several exciting tools for environmental science that we are developing. All of these tools illustrate how, with a bit of imagination, we can begin to deliver research-as-a-service on Microsoft Azure. We hope these tools will help scientists, policymakers, and the public become more informed and better equipped to take care of our planet.
—Kenji Takeda, Solutions Architect and Technical Manager, Microsoft Research
—Kristin Tolle, Director of Environmental Science Infrastructure Development, Microsoft Research