Question: What precocious five-year old is writing parallel code to make the most efficient use of multi-core processors?
Answer: The Barcelona Supercomputing Center (BSC)–Microsoft Research Centre in Barcelona, Spain, also known as BSCMSRC by those who enjoy trying to pronounce acronyms that contain no vowels.
From left to right: Andrew Blake, managing director, Microsoft Research Cambridge; Fabrizio Gagliardi, director, Microsoft Research Connections EMEA; Maria Ribera, dean of Barcelona School of Informatics; Rick Rashid, senior vice president of Microsoft Research; Antoni Giró, president, Rector of Technical University of Catalonia - Universitat Politècnica de Catalunya; and Mateo Valero, director, Barcelona Supercomputing Center
Okay, so it was a trick question. But the Centre, which celebrates its fifth anniversary on November 2, 2011, truly is a precocious operation, producing code that makes it easy for programmers to develop parallel-processing software. This is vital because everything—from smart phones and tablets, to PCs and supercomputers—is sprouting extra cores so users can do more. A joint venture of BSC and Microsoft Research, the BSCMSRC brings together the expertise of hardware and software researchers from BSC and software mavens from Microsoft Research.
One technology that the BSCMSRC researchers have been looking at is transactional memory (TM). TM makes it easier to write parallel programs that frequently share data, a task that otherwise requires complex and unwieldy code. The Centre has developed two sophisticated TM applications to date, QuakeTM and Atomic Quake. These applications, which are based on the open-source Quake game server, will be useful in evaluating TM-equipped chips. As part of the €4 million VELOX project funded by the European Commission, BSCMSRC has coordinated the development of a fully integrated TM system that includes hardware simulators, language runtime systems, and compiler support alongside the new TM applications.
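To give a feel for what a transaction looks like to the programmer, here is a toy software sketch in Python. It is not the VELOX system or anything taken from QuakeTM or Atomic Quake. The names `TVar`, `Transaction`, and `atomically` are invented for this illustration, and the commit step uses a simple version check, which is only one of several ways a TM system can detect conflicts.

```python
import threading


class TVar:
    """A shared variable plus a version counter bumped on every commit."""

    def __init__(self, value):
        self.value = value
        self.version = 0


# Serializes commits only; transaction bodies run without holding it.
_commit_lock = threading.Lock()


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version observed at first read
        self.writes = {}  # TVar -> value buffered until commit

    def read(self, tvar):
        if tvar in self.writes:
            return self.writes[tvar]
        self.reads.setdefault(tvar, tvar.version)
        return tvar.value

    def write(self, tvar, value):
        self.writes[tvar] = value


def atomically(body):
    """Run body(tx) repeatedly until it commits without a conflict."""
    while True:
        tx = Transaction()
        result = body(tx)
        with _commit_lock:
            # Commit only if nothing we read has changed since we read it.
            if all(tv.version == seen for tv, seen in tx.reads.items()):
                for tv, value in tx.writes.items():
                    tv.value = value
                    tv.version += 1
                return result
        # Another commit touched a variable we read: discard and retry.


def transfer(src, dst, amount):
    """Move `amount` between two shared balances as one atomic step."""
    def body(tx):
        tx.write(src, tx.read(src) - amount)
        tx.write(dst, tx.read(dst) + amount)
    atomically(body)


def run_demo():
    a, b = TVar(100), TVar(0)
    threads = [threading.Thread(target=transfer, args=(a, b, 1))
               for _ in range(50)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.value, b.value
```

The point of the sketch is the shape of `transfer`: the programmer states what must happen together and writes no per-variable locking, while the runtime detects conflicts and retries.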
BSCMSRC researchers have also developed a dataflow programming model called StarSs, in which data that is produced and consumed in applications automatically “flows” at program runtime. This frees the programmer from explicitly architecting data movements in the application and makes it much easier to develop software. BSCMSRC researchers are integrating the StarSs programming model with the Barrelfish research OS, a new message-passing, open-source operating system being developed by Microsoft Research and ETH Zurich.
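The dataflow idea can be illustrated with a small Python sketch. This is not the API of the Centre's programming model. The `Dataflow` class, its `task` decorator, and the item names are hypothetical, and the scheduler is deliberately naive: it runs every task whose inputs are ready, waits for that batch, and repeats.

```python
from concurrent.futures import ThreadPoolExecutor


class Dataflow:
    """Tasks declare the named data they consume and produce."""

    def __init__(self):
        self.tasks = []  # list of (function, input names, output name)

    def task(self, inputs, output):
        def register(func):
            self.tasks.append((func, tuple(inputs), output))
            return func
        return register

    def run(self, initial, workers=4):
        data = dict(initial)
        pending = list(self.tasks)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            while pending:
                # A task is ready once every input it names has been produced.
                ready = [t for t in pending
                         if all(name in data for name in t[1])]
                if not ready:
                    raise RuntimeError("some task inputs are never produced")
                futures = [(out, pool.submit(func, *(data[n] for n in ins)))
                           for func, ins, out in ready]
                for out, future in futures:
                    data[out] = future.result()
                pending = [t for t in pending if t not in ready]
        return data


flow = Dataflow()


@flow.task(inputs=["raw"], output="squares")
def square_all(raw):
    return [x * x for x in raw]


# Registered before "count" exists; the runtime still orders it correctly.
@flow.task(inputs=["squares", "count"], output="mean")
def mean(squares, count):
    return sum(squares) / count


@flow.task(inputs=["raw"], output="count")
def count(raw):
    return len(raw)


result = flow.run({"raw": [1, 2, 3, 4]})
```

Nothing in the task bodies says when to run or where the data comes from. The execution order falls out of the declared inputs and outputs, which is the property the paragraph above describes.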
“BSC’s expertise in computer architecture has been a great fit with our expertise in programming language implementation,” notes Tim Harris, senior researcher at Microsoft Research Cambridge. “This cross-disciplinary approach has led to proposals for new, general-purpose hardware features to accelerate the language runtime systems that underpin modern languages such as Haskell and C#.”
In marking the BSCMSRC’s fifth anniversary, BSC Director Mateo Valero commented, “I am proud of the impact of the work done by a very young team at the Centre in our five years of existence. With the multidisciplinary competences of our research personnel, the Centre is in a unique position to influence both hardware and software design. I am also very happy to see Microsoft Research being a major actor in our little Silicon Port at Barcelona in the Mediterranean.”
Fabrizio Gagliardi, Microsoft Research Connections director for Europe, the Middle East, and Africa—and Mateo’s counterpart in this adventure—adds, “Our collaboration with Mateo and his team of computer architects goes a long time back and was the foundation for this joint endeavor. I am very pleased and proud for the results of this collaboration and the resonance and the impact that this is having worldwide.”
—Kenji Takeda, Solutions Architect and Technical Manager, Microsoft Research Connections EMEA
The challenge of DNA sequencing is central to all genomics research, and while the technology has existed since the 1970s, today’s massively parallel sequencing instruments are capable of producing gigabytes of raw genomic data quickly and increasingly cheaply. Reconstruction of a DNA sequence from this data (for example, through de novo assembly) is a compute-intensive task, and experimentation has shown that data quantity is no substitute for quality when it comes to the accurate reconstruction of a DNA sequence. Unfortunately, not all sequencing technologies produce reliable and accurate results, and experimental data will always contain varying rates of error. Therefore, a preliminary quality control (QC) step is regularly employed to detect and counteract such sequencing errors.
The QC of sequencing results may range from simple manual filtering procedures to comprehensive automated solutions. To contribute to this area of QC tools development, we present Sequence Quality Control Studio (SeQCoS), a Microsoft .NET software suite that is designed to perform an array of QC evaluations and post-QC manipulation of sequencing data. SeQCoS generates a series of standard plots that illustrate the quality of the input data. These plots (saved in JPEG file format) provide information on commonly observed measurements, such as GC content (the proportion of guanine and cytosine nucleotide bases in a DNA sequence), and distribution of quality scores at position-specific and sequence-specific levels. In order to filter out poorly performing sequences, SeQCoS also conducts basic trimming and discarding functions to manipulate sequence files.
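For readers unfamiliar with these measurements, the following Python sketch shows roughly what they involve. It is not SeQCoS code (SeQCoS is a .NET application), and the function names and thresholds are assumptions chosen for illustration. Quality strings are assumed to use the common FASTQ convention in which each character's ASCII code minus 33 is the Phred score.

```python
def gc_content(sequence):
    """Proportion of G and C bases in a DNA sequence (0.0 for empty input)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def mean_quality_by_position(qualities, offset=33):
    """Average score at each read position across reads of any length."""
    totals, counts = [], []
    for quality in qualities:
        for i, score in enumerate(phred_scores(quality, offset)):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += score
            counts[i] += 1
    return [total / n for total, n in zip(totals, counts)]


def trim_and_filter(reads, min_score=20, min_length=4, offset=33):
    """Trim low-quality bases off the 3' end, then discard short reads.

    `reads` is a list of (sequence, quality) pairs; the thresholds are
    illustrative defaults, not values taken from SeQCoS.
    """
    kept = []
    for seq, qual in reads:
        end = len(qual)
        while end > 0 and ord(qual[end - 1]) - offset < min_score:
            end -= 1
        if end >= min_length:
            kept.append((seq[:end], qual[:end]))
    return kept
```

For example, `trim_and_filter([("ACGTAC", "IIII##")])` trims the two low-scoring trailing bases and keeps `("ACGT", "IIII")`, while a read such as `("ACGT", "I###")` is discarded because only one base survives trimming.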
At Microsoft Research, the Microsoft Biology Initiative team is collaborating with academic research groups in the sequencing of various organisms. To ensure that the sequenced sample is not contaminated by other strains or sequencing vectors, SeQCoS optionally integrates NCBI BLAST for PCs running the Windows operating system to search against a BLAST-formatted database. We provide a pre-formatted database of NCBI UniVec, a repository of vector sequences, adapters, linkers and PCR (polymerase chain reaction) primers that are used in DNA sequencing; however, researchers can use a different database if they prefer.
About the Tools
SeQCoS was written in C#, using the .NET Bio (formerly the Microsoft Biology Foundation [MBF]) bioinformatics toolkit and Sho, a data analysis and visualization application. It is freely available as open-source code under the Apache 2.0 license. Further details and software downloads are available from Sequence Quality Control Studio.
.NET Bio is a library of common bioinformatics functions (file parsers, algorithms, and web service connectors) that simplify the creation of bioinformatics applications on the .NET platform and is an open-source project that is freely available for academic and commercial use under the Apache 2.0 license. While this project was initiated by Microsoft Research, it is owned by the Outercurve Foundation, a non-profit organization, and is governed by a growing community of users and contributors.
—Kevin Ha, Microsoft Research Intern
On October 25, 2011, Microsoft Research Connections released an update to Zentity, a repository platform designed to manage research objects—such as journal articles, reports, datasets, projects, and people—as well as the relationships among them. Zentity supports arbitrary data models, and provides semantically rich functionality that enables users to find and visually explore interesting relationships among elements by using the Microsoft Silverlight PivotViewer control and Microsoft Research Visual Explorer.
With the 2.1 release, Zentity now includes the Resource Manager web user interface that provides better content management capabilities via easier ways to query the database, review and update records, and create and edit relationships among items. The Resource Manager will work with custom data models and even enables users to save searches for later use. Zentity 2.1 also offers the option to install a localized Spanish-language version of the software.
I would like to highlight and thank a few of our partners who have been working with a variety of institutions to customize their Zentity deployments.
Building Blocks has partnered with the UK Economic and Social Research Council (ESRC) to expose the ESRC’s catalog of research projects and their outputs. The ESRC catalog contains more than 100,000 research objects, including books and journal articles as well as research outcomes and impact reports. The PivotViewer control integrated into Zentity 2.1 provides a visually compelling yet simple way for end-users to browse, filter, and explore decades’ worth of ESRC grant data and to find relevant research reports.
In a case study on this project, Building Blocks wrote:
Zentity was seen as the ideal research repository solution as it can handle the complex data models, whilst also providing data access in many open formats. In addition the team designed a more intuitive and robust backend system to enable ESRC support teams to manage the submission of research outputs, reducing management overhead. The quality and consistency of the data was also improved by ensuring the internal workflows were more efficient and allowing integration with other academic data sources such as SHERPA/RoMEO.
Meanwhile, in Scotland, Company Net partnered with Queen Margaret University to create an online experience for the digital archive of content from the Homecoming Scotland 2009 events. A Scottish government initiative, Homecoming Scotland 2009 was a year-long celebration of Scottish culture and achievements. The archive site also uses the PivotViewer control to make it easy to pivot among the people, places, and events associated with the Homecoming Scotland 2009 celebrations.
And finally, working with a collection of researcher data and electronic theses and dissertations at the Jorge Tadeo Lozano University (UJTL) in Bogotá, Colombia, Microsoft Partner Softtek delivered a solution localized in Spanish and customized to the needs of the researchers and integrated into UJTL’s environment. In his Softtek blog, Antonio Macias writes:
Having partnered with Microsoft Research in the deployment of Zentity 2.0 has definitely been an enriching experience for us since, on one hand, we have demonstrated Softtek’s continuous commitment to deliver high-quality services while working jointly with a highly respected high-tech company like Microsoft. We have been exposed to emergent technologies that will shape our world in the next 5 or 10 years. Indeed this exposure will help us add a fresh perspective to the set of solutions that we already provide to our large base of customers.
Zentity 2.1 is freely available via download from Microsoft Research. I hope that you’ll give it a try, and if you are looking for partners to help on a deployment project, that you’ll use the Microsoft Partner Network.
—Alex Wade, Director for Scholarly Communication, Microsoft Research Connections