HPC has never really appeared on my radar before. All this stuff about "Compute Clusters" and "Grid Computing" had pretty much passed me by - at least until the end of 2007, when two of my UK Finance ISVs started to get interested in it.

Here is my attempt to place it on your radar :-)

A swift intro to HPC, Computer Clusters and...

I think most of us are happy with the general idea of a Computer Cluster, even if we don't use the term. A Computer Cluster is a group of loosely coupled computers that work together closely so that in many respects they can be viewed as though they were a single computer.

There are several types of Computer Clusters out there:

  1. High Availability e.g. SQL Server Mirroring or Microsoft Cluster Services (MSCS) are examples many of us are familiar with.
  2. Load Balancing e.g. Network Load Balancing
  3. Grid - grids connect collections/clusters of computers to perform CPU-intensive computing tasks. Check Wikipedia for one definition - it is by no means THE definition :)

Which brings us to High Performance Computing, which refers to many things but is increasingly applied to Computer Clusters used to perform large amounts of processing by sharing the load across a cluster of machines (an implementation of option 3 above).

Which brings me to - Windows Compute Cluster Server 2003...

Microsoft entered the HPC market with Windows Compute Cluster Server 2003. There still remains a fair amount of confusion over what this actually is. Here is my attempt to clarify:

  • Windows Compute Cluster Server 2003 = Windows Server 2003 Compute Cluster Edition + the Compute Cluster Pack
  • Windows Server 2003 Compute Cluster Edition = a 64-bit Windows Server 2003 Standard Edition licensed for computationally intensive workloads (and therefore not licensed as a file server, a database server etc.). Nothing is different in the OS - this is just a different license.
  • Compute Cluster Pack = the tools and APIs which help you create applications to run in a cluster - SDKs, sample code and the Job Scheduler

In practice this means that we have two kinds of ISVs delivering HPC solutions:

  • ISVs only interested in Windows Server 2003 CCE because of the attractive (i.e. cheaper) licensing vs 64-bit Standard Edition. They use their own schedulers etc. You may well be an ISV that fits this model and can benefit from the license model of CCE. Check it out!
  • ISVs interested in using our SDK, tools and runtime to build HPC solutions

But... things are a-changing - Windows HPC Server 2008

Now you know roughly what Windows Compute Cluster Server 2003 is, you can start to wipe that name from your memory as it gets a shiny new name (and new features!) for 2008 - Windows HPC Server 2008. You will also see Microsoft marketing talking about HPC++ - and I thought adding a ++ had gone out of fashion ...

Windows HPC Server 2008 is currently at Beta 1 and was announced in Nov 2007. We have a decent technical overview document which describes the new features and the changes since Windows Compute Cluster Server 2003. With my "dev hat" on I will call out:

  • Integration with Windows Communication Foundation (WCF), allowing SOA application developers to harness the power of parallel computing offered by Windows HPC Server 2008
  • Support for an external database for the Job Scheduler
  • Support for the Open Grid Forum's HPC Basic Profile interface
  • Improved management console with support for Windows PowerShell scripting

Which leads me to ... what can you do with it?

The above may be a (moderately) interesting read - but what can you actually do with Windows HPC Server 2008 (or Windows Compute Cluster Server 2003)? The short answer is: processing that needs LOTS of CPU cycles and can in some form be partitioned. If you have a problem that can be solved by applying considerable computing power at modest cost, then a Compute Cluster is a candidate architecture.
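The "can be partitioned" part is the crux. A minimal, single-machine Python sketch of the pattern (local processes standing in for cluster nodes; a real cluster's job scheduler would farm the chunks out across machines instead):

```python
from multiprocessing import Pool

def simulate(scenario):
    # Stand-in for one CPU-heavy, independent unit of work -
    # e.g. a single pricing run or what-if calculation.
    return sum(i * scenario for i in range(100_000))

if __name__ == "__main__":
    scenarios = range(100)       # 100 independent pieces of the problem
    with Pool() as pool:         # one worker per local CPU core
        results = pool.map(simulate, scenarios)   # scatter, then gather
    print(len(results))
```

The same scatter/gather shape is what a cluster scheduler gives you at data-centre scale: each node takes a partition, crunches it, and returns its slice of the results.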

Using a Compute Cluster architecture gives you:

  • Mainframe class performance at a fraction of the cost
  • Low start-up (a cluster of one node?) and operational costs
  • Ability to easily grow your computing power - just add nodes
  • Option to reduce and re-purpose your computing power - just remove nodes
  • Avoidance of hardware "lock-in" - nodes do not have to be identical
  • High reliability - nodes can fail independently (if you plan for it!)

You can find case studies etc at http://www.microsoft.com/windowsserver2003/ccs/hpcplus.aspx 

And would I use it?

When I look back at solutions I worked on, I realise that Compute Clusters would have been a great fit for several of them. The one that sticks out was a manufacturing system that modelled an oil processing terminal, giving planners accurate throughput projections so the terminal could operate at maximum capacity whilst avoiding potentially very serious issues. The system was Unix based, but for the detailed modelling we relied on the planners' own Windows PCs. The server generated an Excel spreadsheet based on parameters set by the planners, and the spreadsheet was then run repeatedly on the PC for up to 100 different what-if scenarios, with the output pumped back to the Unix server. It could take many hours to run - hence every planner was given the fastest PC we could purchase. Yet for most of the working week the planners used these machines for little more than corporate email and document production.

If I were building this again I would use Windows HPC Server 2008. The computing power would live in hundreds of nodes, allowing complex what-ifs (of 100 scenarios or more) to return their results to planners in less than a minute. I would not be alone in doing this :)
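To make that concrete, here is a rough Python sketch of the redesigned flow (hypothetical names, with a stdlib process pool standing in for the cluster's job scheduler - not the actual Windows HPC Server API): the planner's parameters fan out as one task per what-if scenario, and the projections are gathered back.

```python
from concurrent.futures import ProcessPoolExecutor

def run_scenario(base_throughput, tweak_pct):
    # Hypothetical stand-in for one spreadsheet what-if run:
    # takes the planner's parameters, returns a projected throughput.
    return base_throughput * (1 + tweak_pct / 100)

def plan(base_throughput, tweaks):
    # Scatter one task per scenario, then gather the projections.
    with ProcessPoolExecutor() as ex:
        futures = [ex.submit(run_scenario, base_throughput, t) for t in tweaks]
        return [f.result() for f in futures]

if __name__ == "__main__":
    projections = plan(1000.0, range(100))   # 100 what-if scenarios
    print(max(projections))
```

On a cluster, each `run_scenario` call would become a task submitted to the scheduler and executed on whichever node is free - which is what turns "many hours on one fast PC" into "under a minute across hundreds of nodes".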

I will do two more posts on HPC.