151.3 TFlops on 8,064 cores with 90.2 percent efficiency

Windows Azure now offers customers a cloud platform that can cost-effectively and reliably meet the needs of Big Compute. With a massively powerful and scalable infrastructure, new instance configurations, and the new Microsoft HPC Pack 2012, Windows Azure is designed to be the best platform for your Big Compute applications. In fact, we tested and validated the power of Windows Azure for Big Compute applications by running the LINPACK benchmark. The performance was so impressive (151.3 TFlops on 8,064 cores with 90.2 percent efficiency) that we submitted the results and have been certified as one of the Top 500 of the world's largest supercomputers.

Hardware for Big Compute

As part of our commitment to Big Compute, we are announcing hardware offerings designed to meet customers' needs for high performance computing. We will offer two high performance configurations: the first with 8 cores and 60 GB of RAM, and the second with 16 cores and 120 GB of RAM. Both configurations will also provide an InfiniBand network with RDMA for MPI applications.

The high performance configurations are virtual machines delivered on systems consisting of:

  • Dual Intel Sandy Bridge processors at 2.6 GHz
  • DDR3 1600 MHz RAM
  • 10 GigE network for storage and internet access
  • InfiniBand (IB) 40 Gbps network with RDMA

Our InfiniBand network supports remote direct memory access (RDMA) communication between compute nodes. For applications written to use the message passing interface (MPI) library, RDMA lets compute nodes read and write each other's memory directly, without involving the remote node's operating system or processor, so memory on multiple computers can effectively act as one pool of memory. Our RDMA solution provides near bare-metal performance (i.e., performance comparable to that of a physical machine) in the cloud, which is especially important for Big Compute applications.

The new high performance configurations with RDMA capability are ideal for HPC and other compute-intensive applications, such as engineering simulations and weather forecasting, that need to scale across multiple machines. Faster processors and a low-latency network mean that larger models can be run and simulations will complete faster.

LINPACK Benchmark

To demonstrate the performance capabilities of the Big Compute hardware, we ran the LINPACK benchmark, submitted the results, and have been certified as one of the Top 500 of the world's largest supercomputers. The LINPACK benchmark demonstrates a system's floating point computing power by measuring how fast it solves a dense n by n system of linear equations Ax = b, which is a common task in engineering. This approximates performance when solving real problems.

We achieved 151.3 TFlops on 8,064 cores with 90.2 percent efficiency. The efficiency number reflects how close the measured result comes to the system's theoretical peak performance, which is calculated as the number of cores times the clock frequency times the number of floating-point operations each core can perform per cycle. One of the factors that influences performance and efficiency in a compute cluster is the capability of the network interconnect. This is why we use InfiniBand with RDMA for Big Compute on Windows Azure.
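As a rough cross-check, assuming each Sandy Bridge core executes 8 double-precision floating-point operations per cycle with AVX, the theoretical peak of the cluster is about 8,064 cores × 2.6 GHz × 8 flops per cycle ≈ 167.7 TFlops, and 151.3 / 167.7 ≈ 0.902, which lines up with the 90.2 percent efficiency figure.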

Here is the output file from the LINPACK test showing our 151.3 teraflop result.

What’s impressive about this result is that it was achieved using Windows Server 2012 running in virtual machines hosted on Windows Azure with Hyper-V. Because of our efficient implementation, you can get the same performance for your high performance application running on Windows Azure as on a dedicated HPC cluster on-premises.

Windows Azure is the first public cloud provider to offer virtualized InfiniBand RDMA network capability for MPI applications. If your code is latency-sensitive, our cluster can send a 4-byte packet between machines in 2.1 microseconds. InfiniBand also delivers high throughput. This means that applications will scale better, with a faster time to result and lower cost.
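For readers who are newer to MPI, small-message latency like the figure above is typically measured with a simple ping-pong test between two processes. The sketch below is purely illustrative (it is not the harness behind our measurements) and assumes a standard MPI implementation, such as MS-MPI, with at least two ranks:

    /* pingpong.c: minimal MPI ping-pong latency sketch (illustrative only).
       Rank 0 bounces a 4-byte message off rank 1 and reports the average
       one-way latency. Run with at least two ranks, e.g. mpiexec -n 2 pingpong */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int iters = 1000;
        char payload[4] = {0};               /* 4-byte message, as in the figure above */

        MPI_Barrier(MPI_COMM_WORLD);         /* start both ranks together */
        double start = MPI_Wtime();

        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                /* send to rank 1, then wait for the echo */
                MPI_Send(payload, 4, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(payload, 4, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                /* echo the message back to rank 0 */
                MPI_Recv(payload, 4, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(payload, 4, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }

        if (rank == 0) {
            double elapsed = MPI_Wtime() - start;
            /* each iteration is a round trip, so divide by 2 * iters for one-way latency */
            printf("average one-way latency: %.2f microseconds\n",
                   elapsed / (2.0 * iters) * 1e6);
        }

        MPI_Finalize();
        return 0;
    }

On a low-latency RDMA network, this kind of test is what produces results in the low single-digit microsecond range.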

Application Performance

The chart below shows how the NAMD molecular dynamics simulation program scales across multiple cores running in Windows Azure with the newly announced configurations. We used 16-core instances for running the application, so runs on 32 or more cores require communication across the network. NAMD really shines on our RDMA network, and the solution time drops impressively as we add more cores.

How well a simulation scales depends on both the application and the specific model or problem being solved.
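One way to reason about this is Amdahl's law: if a fraction p of the computation can run in parallel, the best possible speedup on N cores is 1 / ((1 - p) + p / N). For example, a model in which 95 percent of the work parallelizes can never speed up by more than a factor of 20, however many cores are added, and communication overhead reduces the practical speedup further. These numbers are purely illustrative rather than NAMD measurements, but they show why both a low-latency network and a well-suited model matter.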

We are currently testing the high performance hardware with a select group of partners and will make it publicly available in 2013.

Windows Azure Support for Big Compute with Microsoft HPC Pack 2012

We began supporting Big Compute on Windows Azure two years ago. Big Compute applications require large amounts of compute power and typically run for many hours or days. Examples of Big Compute include modeling complex engineering problems, understanding financial risk, researching disease, simulating weather, transcoding media, and analyzing large data sets. Customers doing Big Compute are increasingly turning to the cloud to meet a growing need for compute power, because the cloud provides greater flexibility and economy than running all of the work on-premises.

In December 2010, the Microsoft HPC Pack first provided the capability to “burst” (i.e., instantly consume additional resources in the cloud to meet extreme demand in peak usage situations) from on-premises compute clusters to the cloud. This made it easy for customers to use Windows Azure to handle peak demand. HPC Pack took care of provisioning and scheduling jobs, and many customers saw immediate return on their investment by leveraging the always-on cloud compute resources in Windows Azure.

Today, we are pleased to announce the fourth release of our compute cluster solution since 2006. Microsoft HPC Pack 2012 is used to manage compute clusters with dedicated servers, part-time servers, desktop computers, and hybrid deployments with Windows Azure. Clusters can be entirely on-premises, can be extended to the cloud on a schedule or on demand, or can be entirely in the cloud and active only when needed.

The new release provides support for Windows Server 2012. Features include Windows Azure VPN integration for access to on-premises resources (such as license servers), new job execution controls for dependencies, new job scheduling policies for memory and cores, new monitoring tools, and utilities to help manage data staging.

Microsoft HPC Pack 2012 will be available in December 2012.

Big Compute on Windows Azure today

Windows Azure was designed from the beginning to support large-scale computation. With the Microsoft HPC Pack, or with their own applications, customers and partners can quickly bring up Big Compute environments with tens of thousands of cores. Customers are already putting these Windows Azure capabilities to the test, as the following examples of large-scale compute illustrate.

Risk Reporting for Solvency II Regulations

Milliman is one of the world's largest providers of actuarial and related products and services. Their MG-ALFA application is widely used by insurance and financial companies for risk modeling, and it integrates with the Microsoft HPC Pack to distribute calculations to HPC clusters or burst work to Windows Azure. To help insurance firms meet risk reporting for Solvency II regulations, Milliman also offers MG-ALFA as a service using Windows Azure. This enables their customers to perform complex risk calculations without any capital investment or management of an on-premises cluster. The solution from Milliman has been in production for over a year, with customers running it on up to 8,000 Windows Azure compute cores.

MG-ALFA can reliably scale to tens of thousands of Windows Azure cores. To test new models, Milliman used 45,500 Windows Azure compute cores to run 5,800 jobs with a 100 percent success rate in just over 24 hours. Because applications can run at such a large scale, customers get faster results and more certainty in the outcomes, since they no longer need to rely on approximations or proxy modeling methods. For many companies, complex and time-consuming projections have to be done each quarter. Without significant computing power, they either have to compromise on how long they wait for results or reduce the size of the model they are running. Windows Azure changes the equation.

The Cost of Insuring the World

Towers Watson is a global professional services company. Their MoSes financial modeling software applications are widely used by life insurance and annuity companies worldwide to develop new offerings and manage their financial risk. MoSes integrates with the Microsoft HPC Pack to distribute projects across a cluster that can also burst to Windows Azure. Last month, Towers Watson announced they are adopting Windows Azure as their preferred cloud platform.

One of Towers Watson's initial projects for the partnership was to test the scalability of the Windows Azure compute environment by modeling the cost of insuring the world. The team used MoSes to perform individual policy calculations on the cost of issuing whole life policies to all seven billion individuals on Earth. The calculations were repeated 1,000 times across risk-neutral economic scenarios. To finish in less time, MoSes used the HPC Pack to distribute the calculations in parallel across 50,000 compute cores in Windows Azure.

Towers Watson was impressed with their ability to complete 100,000 hours of computing in a couple of hours of real time. Insurance companies face increasing demands on the frequency and complexity of their financial modeling. This test demonstrated the extraordinary possibilities that Windows Azure brings to insurers. With Windows Azure, insurers can run their financial models with greater precision, speed and accuracy for enhanced management of risk and capital.     
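The arithmetic behind that result is simple: roughly 100,000 core-hours of work spread across 50,000 Windows Azure cores works out to about two hours of wall-clock time, assuming the work stays well distributed across the cores.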

Speeding up Genome Analysis

Cloud computing is expanding the horizons of science and helping us better understand the human genome and disease. One example is the genome-wide association study (GWAS), which identifies genetic markers that are associated with human disease.

David Heckerman and the eScience research group at Microsoft Research developed a new algorithm called FaST-LMM that can find new genetic relationships to diseases by analyzing data sets several orders of magnitude larger than was previously possible and by detecting more subtle signals in the data than before.

The research team turned to Windows Azure to help them test the application. They used the Microsoft HPC Pack with FaST-LMM on 27,000 compute cores on Windows Azure to analyze data from the Wellcome Trust study of the British population. They analyzed 63,524,915,020 pairs of genetic markers, looking for interactions among these markers for bipolar disease, coronary artery disease, hypertension, inflammatory bowel disease (Crohn’s disease), rheumatoid arthritis, and type I and type II diabetes.

More than 1,000,000 tasks were scheduled by the HPC Pack across 72 hours, consuming the equivalent of 1.9 million compute hours. The same computation would have taken 25 years to complete on a single 8-core server. The result: the ability to discover new associations between the genome and these diseases that could lead to breakthroughs in prevention and treatment.
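As a quick sanity check on those figures, 27,000 cores running for 72 hours is about 1.94 million core-hours, consistent with the 1.9 million compute hours cited, and concentrating that much work on a single 8-core server would take on the order of 240,000 hours of continuous computation, in line with the roughly 25 years quoted above.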

Researchers in this field will have free access to these results and will be able to independently validate their own labs' findings. They can also compute results for individual marker pairs on demand using the FaST-LMM algorithm, available free in the Windows Azure Data Marketplace.

Big Compute

With a massively powerful and scalable infrastructure, the new instance configurations, and the Microsoft HPC Pack 2012, Windows Azure is designed to be the best platform for your Big Compute applications.

We invite you to let us know about your Big Compute interests and applications by contacting us at bigcompute@microsoft.com.

     -  Bill Hilf, General Manager, Windows Azure Product Marketing.