Download Research Tools
Plant biologists in Brazil are working to develop a better understanding of tropical ecosystems—how they work and how they impact climate change, not only in the region, but worldwide. These researchers are dedicated and disciplined. They’re in the field from dawn to dusk, working through rain, wind, heat, and cold, applying all of their energy to understanding these complex ecosystems. This is intense observational work: they take copious notes and then, after grueling hours in the field, they return to their labs and flesh out their field notes in detail, striving to fully capture and make sense of what they observed. It all adds up to a long day that can take a toll on even the most committed researchers.
At the University of Campinas (better known as UNICAMP), computer science professors Ricardo Torres and Cecilia Baranauskas are exploring solutions that might help these overworked field researchers. The professors’ computer science students are creating environmental data-management apps that allow plant biologists to go to the field, observe the ecosystem, take notes by using digital devices, and then push that data to the cloud. (This work is an outgrowth of a project in e-phenology, which is supported by the Microsoft Research–FAPESP Institute for IT Research.)
Environmental data-management app for recording and sharing field observations
The environmental data-management apps should increase the precision and accuracy of the recorded data, eliminating the errors that often creep in during the transcription of handwritten notes. The ready availability of previously entered data will enable researchers in the field to easily compare new observations to past ones and to enter new information by updating a few spreadsheet cells. Moreover, pushing the data to the cloud will make it available to colleagues no matter where they are, enabling real-time collaboration between the researcher in the field and the team back in the lab.
With the goal of generating a variety of application ideas, the professors have split their computer science classes into multiple groups, each of which proposes a solution. Then they iterate. They talk with the plant biologists and accompany them to the field, in order to understand their needs. If all goes as planned, these students will devise applications that enable biologists to more fully record their observations in real time and preserve the record quickly, safely, and accessibly in the cloud. And what a nice convergence of high-tech computer science and shoe-leather biology that will be!
—Juliana Salles, Senior Research Program Manager, Microsoft Research Connections
In this era of big data, researchers are relying more and more on data mining to help them with their research. Researchers from nearly every field (not to mention businesses from almost every sector) are slicing, dicing, and sifting an exponentially growing mass of data, looking for patterns, trends, and insights. This is powerful stuff, and the essence of the data-intensive “fourth paradigm” of scientific inquiry.
Powerful, yes, but also complex. Data mining requires numerous steps: data understanding, data cleaning, model creation, and model comparison. Fortunately, there are new tools for Microsoft Excel that make each step simpler and combine them more seamlessly.
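To make those four steps concrete, here is a minimal sketch in plain Python. It is not how the Excel add-ins work internally. The toy data, the function names, and the choice of two simple models (a mean baseline and a least-squares line) are all assumptions made purely for illustration.

```python
from statistics import mean

# Hypothetical toy data: (x, y) readings; None marks a missing value.
raw = [(1, 2.1), (2, 3.9), (3, None), (4, 8.2), (5, 9.8),
       (6, 12.1), (None, 5.0), (7, 14.0), (8, 16.1)]

def understand(rows):
    """Data understanding: count the rows and the rows with a missing field."""
    missing = sum(1 for x, y in rows if x is None or y is None)
    return {"n_rows": len(rows), "n_missing": missing}

def clean(rows):
    """Data cleaning: drop every row that has a missing field."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def fit_baseline(rows):
    """Model creation (1): always predict the mean of the training y values."""
    y_bar = mean(y for _, y in rows)
    return lambda x: y_bar

def fit_line(rows):
    """Model creation (2): ordinary least-squares straight line."""
    x_bar = mean(x for x, _ in rows)
    y_bar = mean(y for _, y in rows)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in rows)
             / sum((x - x_bar) ** 2 for x, _ in rows))
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

def mse(model, rows):
    """Mean squared error of a model on a set of rows."""
    return mean((model(x) - y) ** 2 for x, y in rows)

def run_pipeline(rows):
    summary = understand(rows)
    data = clean(rows)
    train, test = data[:-2], data[-2:]  # hold out the last two rows
    models = {"baseline": fit_baseline(train), "line": fit_line(train)}
    # Model comparison: score each model on the held-out rows.
    scores = {name: mse(model, test) for name, model in models.items()}
    best = min(scores, key=scores.get)
    return summary, scores, best

summary, scores, best = run_pipeline(raw)
print(summary, scores, best)
```

Even in a toy like this, each step feeds the next, which is why tooling that chains the steps together saves effort.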
New add-ins for Microsoft Excel that simplify data mining are available to download.
These tools, collectively known as the Microsoft SQL Server 2012 SP1 Data Mining Add-ins for Office (just rolls off the tongue, yes?), are the product of a joint effort between the Data Mining SQL team and the Microsoft Research Machine Learning and Applied Statistics group. The tools are available for download.
Microsoft Data Mining Add-ins help you take advantage of SQL Server predictive analytics in Microsoft Excel and Microsoft Visio. The download includes the following components:
This integrated, comprehensive set of tools should make life simpler for anyone with big data to mine.
—David Heckerman, Distinguished Scientist, Microsoft Research
—Raman Iyer, Principal Group Manager (Development), SQL Server Business Intelligence, Microsoft Corporation
First to explain… no, there is no time. Let me sum up: you are a scientist with complex geospatial data visualization challenges. We at Microsoft Research have a solution for you and we’re enhancing this through the release of a software library called Narwhal. (We threw in some example applications as well.) The parent project is Layerscape and the geospatial stories are told by using the WorldWide Telescope visualization engine. The release of Narwhal is in line with our philosophy of “As long as we’re going to build some tools, let’s share them and save others having to re-invent.” Inconceivable! For more: read on!
Suppose you have some data that you’d like to look at… and it is complicated data. What do I mean by complicated? Perhaps you have a model of an electrical impulse travelling through a maze of 7,000 neurons. Or you have recovered the dive trajectories for the 43 Weddell seals you tagged last summer, or you just derived the magnetic field interactions between Jupiter and Callisto, or the Jaguar supercomputer has finally finished your solution for the thermodynamic structure of the Earth. Let’s run through the two questions that occur to the data visualizer—you—at a time like this: What format should my data be in? And how do I look at it?
WorldWide Telescope visualization of data on Puget Sound water flow
Unfortunately, there is as yet no single answer to these two questions; and to be fair, you probably already know what format your data is in (be it MATLAB, comma-separated values, NetCDF-CF, Microsoft Excel, or whatever). But because your data is complicated, you find it difficult to render and examine on your laptop. Well, we built WorldWide Telescope (WWT) to take advantage of your PC graphics card, and now you can look at 500,000 data points as they unfold in time; watch this tour to get the idea. The ability to see the data is just the beginning; we are painfully aware that even though you can see the data, there are lots of other tasks to perform before it is useful. That is why we built both the Layerscape website (to support content sharing) and the WorldWide Telescope Add-in for Excel (to help you import your data into WWT). You can learn about all of this at Layerscape.
So far, so good; but if you are really a technical programmer, you will see more potential here (more visualization power) than you can readily access by using Excel. In fact, you may want to connect directly from your software, which helps make sense of your data, to WWT, where that data will appear as pixels and lines and circles and polygons and moving sidewalks and drifting balloons and neural impulses and seal-dive trajectories and magnetic fields. Enter Narwhal: software that helps you organize your data and send it to WWT. Narwhal is in its first release, so it is not the ultimate solution, but it does take a big jump in that direction. To see what sorts of things Narwhal can help you do, take a look at this video.
To wrap this up: we are certain that visualization is a key to understanding data, and that humans, and specifically researchers, are increasingly good at deluging themselves with massive, complex, hard-to-understand datasets. At Microsoft Research we are both happy and fortunate to get to work on related tools: Layerscape, WorldWide Telescope, and the WWT Add-in for Excel… and now Narwhal. We hope that they find their way to the scientists and educators who need them, and we will continue to refine them, so watch this blog for updates.
—Rob Fatland, Senior Research Program Manager, Microsoft Research Connections