We’re going to start off this blog with a couple of basic posts explaining what Dryad and DryadLINQ are. I’m going to start by introducing Dryad, and Yuan will follow on with DryadLINQ.
Dryad is an execution engine that is optimized for certain types of distributed computation. It relies on its inputs being immutable and finite, and the outputs of a job are generally not available until the entire computation completes (i.e., it doesn’t support “infinite” streaming computations). Dryad is optimized for medium to large clusters (say, between 100 computers and 10,000 computers), and for long-running batch jobs rather than sub-second interactive lookups. It has mostly been used for data-intensive computations which can benefit from streaming data to or from many disks at once, and which can be broken into parallel stages that do not require very frequent communication. If a program can be broken into pieces that can do more than 20 or 30 seconds of IO or computation before needing to communicate, then it can probably be run efficiently on Dryad. You can get more details about Dryad from the papers and presentations at our project page. Later on we will have some posts talking about the design choices described above, and whether or not they could easily be relaxed.
The basic computational model that Dryad uses is a directed acyclic graph (DAG). This is a very flexible abstraction, and it permits efficient execution plans for a great many algorithms, including most optimized database-style queries, and iterative algorithms such as k-means, matrix power iteration, and graph traversal. Future posts will give lots of examples of such applications from domains like data mining, machine learning and computer vision. Most users of Dryad never construct these DAGs by hand, though: instead there is a higher-level language layer that compiles user-friendly syntax into the DAG form that Dryad can understand. Several programming layers have been written on top of Dryad, and the one this blog will mostly talk about is direct integration with .NET languages via LINQ, i.e., DryadLINQ. I will write a post later comparing the use of a DAG abstraction with MapReduce.
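To make the DAG idea a little more concrete, here is a toy single-machine sketch in Python. It is not Dryad's API: the function run_dag, the vertex names and the word-count example are all made up for illustration, and a real Dryad job would run its vertices as processes spread across a cluster rather than as function calls in one loop. The only point is that each vertex runs once, after everything upstream of it has finished.

```python
from collections import Counter
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(vertices, edges, inputs):
    """Run each vertex once, after all of its upstream vertices have finished.

    vertices: dict mapping a vertex name to a function that takes a list
              of upstream outputs (hypothetical interface, not Dryad's)
    edges:    list of (producer, consumer) name pairs
    inputs:   dict mapping each source vertex name to its initial data
    """
    preds = {name: [] for name in vertices}
    for producer, consumer in edges:
        preds[consumer].append(producer)

    results = {}
    # static_order() yields predecessors first and raises CycleError
    # if the graph is not acyclic.
    for name in TopologicalSorter(preds).static_order():
        if preds[name]:
            args = [results[p] for p in preds[name]]
        else:
            args = [inputs[name]]
        results[name] = vertices[name](args)
    return results


def count_words(args):
    """Source vertex: count the words in one partition of the input."""
    (lines,) = args
    return Counter(word for line in lines for word in line.split())


def merge_counts(args):
    """Downstream vertex: combine the per-partition counts."""
    total = Counter()
    for partial in args:
        total.update(partial)
    return total


# Two independent counting vertices feed a single merge vertex.
vertices = {"count0": count_words, "count1": count_words, "merge": merge_counts}
edges = [("count0", "merge"), ("count1", "merge")]
inputs = {"count0": ["a b a"], "count1": ["b c"]}

results = run_dag(vertices, edges, inputs)
print(dict(results["merge"]))
```

In this sketch the two counting vertices do not depend on each other, which is exactly the kind of independence that lets a distributed engine run them on different computers at the same time.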
As mentioned above, Dryad is mostly used as middleware below a high-level language layer. Below Dryad in turn is a cluster-management system that supports some low-level actions like starting a process on a remote computer, and one or more distributed storage systems that support “partitioned files,” i.e., datasets that have been split into chunks that can be striped across the computers in the cluster. Dryad was deliberately written with an abstraction layer hiding the cluster services and file systems so that it could be ported to different infrastructures. Most of the Dryad development has been done on top of a Microsoft internal cluster infrastructure called Cosmos that was developed by the Bing product group. More recently we (in collaboration with the HPC product group and Microsoft’s External Research team) ported Dryad to run on the HPC cluster services, and this port is available free for non-commercial use from this page on the Microsoft Research site. In the future we hope to make a port to the Azure cloud infrastructure available. The public release currently doesn’t have much of a distributed file system, and users have to semi-manually manage partitioned files (the user guide that comes as part of the download explains all this). We have a simple, lightweight distributed file system in the works that we are hoping to release before too long.
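For readers who haven't met the idea of a partitioned file before, the following Python sketch shows one simple way a dataset could be split into chunks and striped across computers. It is only an illustration under assumptions of my own: the stripe function, the fixed chunk size, the round-robin placement and the node names are all hypothetical, and it says nothing about how Cosmos or the HPC release actually lay out data.

```python
def stripe(records, chunk_size, computers):
    """Split records into fixed-size chunks and assign chunks round-robin.

    Returns a dict mapping each computer name to a list of
    (chunk_index, chunk) pairs. All names here are illustrative.
    """
    chunks = [records[i:i + chunk_size]
              for i in range(0, len(records), chunk_size)]
    placement = {name: [] for name in computers}
    for index, chunk in enumerate(chunks):
        owner = computers[index % len(computers)]
        placement[owner].append((index, chunk))
    return placement


# Ten records, three per chunk, spread over two hypothetical computers.
placement = stripe(list(range(10)), chunk_size=3,
                   computers=["node-a", "node-b"])
for name, owned in placement.items():
    print(name, owned)
```

Because each chunk lives on a known computer, a job can start one piece of work per chunk close to where the data is stored, which is what makes streaming from many disks at once possible.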
Dryad has been doing the heavy lifting for all of Bing’s data mining since mid-2006, and these days it is mostly programmed using a language called SCOPE that the Bing team developed. It has also spread to a lot of smaller clusters inside Microsoft, used for tasks like botnet detection, computational biology and computer vision. Our research lab in Silicon Valley has a cluster of around 250 computers that is shared by the 70 or so researchers here, and used both for research on the system and as a computational resource. Over half of the researchers and interns in our lab have used Dryad in some way over the past few years, and we hope with the availability of Dryad and DryadLINQ (download them here) that many more people will have the opportunity to try them out.