I keep having this same conversation with customers and partners. Clustering is not a simple concept, and neither is BizTalk. Put them together and you seem to have a perfect storm. There's a lot of confusion, which I hope to dispel with this post.
Also, I would like to have a link to send people, just so that I don't have to have the conversation again :)
The Summary
FAQ
The Easy Part
I say that the SQL clustering is the "Easy Part" not because it is simple, but because I don't feel compelled to explain the nitty gritty details in this post. I will give some of the BizTalk specifics here, and dedicate the next post to SQL cluster configuration. For now, as a rule, go with SQL 2005 Active/Active/Passive.
The Somewhat Easy Part
Clustering the ENTSSO MSS (Master Secret Server) is pretty simple and the procedure is posted here. Basically, the MSS is just a running instance of the ENTSSO service that also takes on the responsibility of distributing the Master Secret key. This is the key that the ENTSSO services running on all the other machines need in order to read the encrypted information in the SSO database.
Ok, now the meat.
BizTalk hosts generally have instances on more than one box; as such, they are already highly available. This means that clustering a BizTalk host is not generally necessary to ensure high availability. If one instance goes down the others will pick up the work, with the out-of-the-box standard config. The problem comes in when you REALLY don't want to have multiple instances of a host running at the same time. The classic example of this is a receive port for an adapter that cannot gracefully handle two threads reading from it at the same time (e.g. FTP). It also happens when you need to be certain that all of the messages are picked up and processed sequentially and you don't want to code around the problem.
So you want an FTP receive port, but you cannot afford duplicate message reads. That means you can have only one instance of the host, but that kills the out-of-the-box high availability. Now you need to cluster that host instance. That is the only way to ensure that exactly one instance of the host will be running (so long as at least one of the BizTalk app servers is running).
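To make the duplicate-read problem concrete, here is a minimal toy sketch in Python. It is not BizTalk or cluster code: `FtpFolder`, `poll_cycle`, `clustered_owner` and the server names are all hypothetical, and the model simply assumes that every running instance lists the folder before any file is deleted, which is roughly what the lack of file locking on FTP allows.

```python
from dataclasses import dataclass, field


@dataclass
class FtpFolder:
    """Toy stand-in for an FTP directory; like FTP, it offers no file locking."""
    files: list = field(default_factory=list)


def poll_cycle(folder, running_instances):
    """Simulate one polling interval for every running instance of the receive host."""
    # Every instance takes its listing before anyone deletes, so listings overlap.
    listings = {name: list(folder.files) for name in running_instances}
    received = [(name, f) for name, listing in listings.items() for f in listing]
    if running_instances:
        folder.files.clear()  # files are deleted once they have been downloaded
    return received


def clustered_owner(preferred_order, nodes_up):
    """Return the single node running a clustered host instance, or None if all are down."""
    for node in preferred_order:
        if node in nodes_up:
            return node
    return None


# Two unclustered instances of the same host: every file is picked up twice.
duplicated = poll_cycle(FtpFolder(["a.xml", "b.xml"]), ["BTS01", "BTS02"])

# Clustered host: exactly one owner, both before and after BTS01 goes down.
owner_before = clustered_owner(["BTS01", "BTS02"], {"BTS01", "BTS02"})
owner_after = clustered_owner(["BTS01", "BTS02"], {"BTS02"})
clustered = poll_cycle(FtpFolder(["a.xml", "b.xml"]), [owner_after])
```

In this model `duplicated` holds four reads for two files, while `clustered` holds each file once, picked up by whichever server currently owns the host instance.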
The Very Weird Part
This is the one that people scratch their heads over. It does not happen much, and it's not really all that complex. But it is not intuitive, so it messes with people's heads.
For this to really make sense, I have to back up a step and explain how the ENTSSO service generally works in a multi-box BizTalk installation.
Figure 1 - Standard Multi-Box ENTSSO Strategy
Let's say that you have only 2 production servers. They will be running both SQL and BizTalk. Like a good boy/girl you decide to cluster SQL and the ENTSSO MSS.
Now you try to configure BizTalk host instances, and you find that they don't want to work quite right. That is because they have a dependency on the ENTSSO service; they always need it. Now that the ENTSSO service is clustered, only the host instances running on the active (ENTSSO) node actually have access to the service they need. All the other instances will fail because their dependency is not running.
Figure 2 - BizTalk Hosts with Broken Dependency
As an extra bonus you have messed up the high availability of the MSS as well. The clustering service detects that the host instances depend on the ENTSSO MSS service and probably will not allow it to fail over to other nodes.
...NICE...
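The broken dependency described above can be written down as a tiny Python model. This is only a sketch of the behaviour as this post describes it, not a query against the real cluster service; `host_instance_states` and the node names are hypothetical.

```python
def host_instance_states(nodes, entsso_owner):
    """Map each node to the state of its local, unclustered BizTalk host instance."""
    # Assumption taken from the post: a host instance depends on the ENTSSO
    # service on its own box, and a clustered ENTSSO service runs only on the
    # node that currently owns it.
    return {
        node: "running" if node == entsso_owner else "failed: ENTSSO dependency"
        for node in nodes
    }


# ENTSSO MSS active on NODE1: only the host instance on NODE1 can start.
states = host_instance_states(["NODE1", "NODE2"], entsso_owner="NODE1")

# If ENTSSO did move to NODE2, the problem would just move with it.
states_after_move = host_instance_states(["NODE1", "NODE2"], entsso_owner="NODE2")
```

Whichever node owns ENTSSO, the model always leaves one of the two boxes with a host instance that cannot run.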
Option 1 is to not cluster ENTSSO MSS in a 2 box scenario
Bottom line is that, if you only have budget for two prod boxes, you probably don't want to cluster the ENTSSO MSS. But this is really the only time I can think of that clustering that service is bad. If it is not clustered, you had better be a "Johnny on the spot" with MOM and implement a way to have MOM do a poor man's failover when it detects that the ENTSSO MSS server is down.
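The detection half of such a poor man's failover could look something like the Python sketch below. It is not MOM configuration, and it does not perform the promotion itself; `should_promote` and the threshold of three missed checks are assumptions made purely for illustration.

```python
def should_promote(heartbeats, threshold=3):
    """Return True once the most recent `threshold` health checks have all failed."""
    # heartbeats is a list of booleans, oldest first; True means the ENTSSO MSS
    # server answered that check. Requiring several consecutive misses avoids
    # promoting a new master secret server because of a single blip.
    if len(heartbeats) < threshold:
        return False
    return not any(heartbeats[-threshold:])


blip = should_promote([True, False, True, True])      # one miss, then recovery
outage = should_promote([True, True, False, False, False])  # three misses in a row
```

Here `blip` is False and `outage` is True. What happens after the decision (making another server the master secret server) is deliberately left out of the sketch.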
Option 2 is to also cluster all BizTalk host instances that depend on a clustered ENTSSO service.
If you really do want to cluster the ENTSSO MSS, then you have to cluster all of the BizTalk hosts as well. This ensures that as the ENTSSO service "flips" from one box to the next, the Microsoft Clustering Service can make the BizTalk host instances flip at the same time, so they don't come crashing down.
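As a rough illustration of Option 2, the Python sketch below models the clustered ENTSSO service and the clustered hosts as resources that always move together. `ClusterGroup`, the resource names and the node names are hypothetical, and treating them as one group is an assumption of this sketch rather than a statement of how your cluster must be laid out.

```python
from dataclasses import dataclass


@dataclass
class ClusterGroup:
    """Toy cluster group: every resource in it is owned by one node at a time."""
    resources: list
    owner: str

    def fail_over(self, surviving_nodes):
        """Move the whole group to the first surviving node other than the owner."""
        candidates = [n for n in surviving_nodes if n != self.owner]
        if not candidates:
            raise RuntimeError("no node available to take over the group")
        self.owner = candidates[0]

    def placement(self):
        """Return which node each resource is currently running on."""
        return {resource: self.owner for resource in self.resources}


group = ClusterGroup(
    resources=["ENTSSO MSS", "BizTalkHost_Receive", "BizTalkHost_Process"],
    owner="NODE1",
)
before = group.placement()
group.fail_over(["NODE2"])  # NODE1 has gone down
after = group.placement()
```

Because the hosts flip with ENTSSO, every host instance in `after` sits on the same node as the service it depends on.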