A few weeks ago, I was looking into running a MySQL Galera Cluster for a customer with a large Linux IAAS deployment on Azure.

Why that? There’s ClearDB, a Microsoft Partner that offers MySQL on Azure as SaaS (Software as a service), so you can go to https://www.cleardb.com/store/azure and pick your size. Or, if you want to run it on your own, you can pick a Ubuntu Linux Gallery image and type “apt-get install mysql-server” and that’s it, right? Well, not so fast…

ClearDB is a great offering for most customers that need a MySQL backend, but in this case, even the largest ClearDB offer was not sufficient.

So the customer followed the second path down, he created an IAAS VM (actually several VMs which each run an independent database server for different purposes) and configured his services to use these databases via the internal IP addresses of these servers. But there’s one problem with this approach: Occasionally, Azure needs to deploy patches to the host systems running these VMs. And occasionally, the Linux VMs also need patches that require a restart of the database server or a reboot of the machines. Whenever this happened, the customer site would be down for a few minutes. 

To avoid this occasional downtime, I teamed up with Oli Sennhauser, CTO at FromDual and my colleague Christian Geuer-Pollmann to set up a MySQL Galera Cluster on Azure.

Such a cluster consists of three MySQL VMs. Database connections can be handled by all three machines, so the DB (read) load is distributed as well. As long as two machines are up, the database service is available. Galera achieves this by implementing the replication of database write transactions. More information can be found on http://galeracluster.com/ and on https://launchpad.net/galera/ 

So, here’s the tl;dr version of what we did:

- Set up three Ubuntu 14.04 LTS IAAS VMs with fixed internal IP addresses
- Set up an Azure internal load balancer so that database clients have a single IP they connect to
- Installed mysql-server-wsrep-5.6.16-25.5 and galera-25.3.5 plus a few dependencies
- Configured galera on these three machines
- Added a bit of iptables magic, courtesy of FromDual, to the VMs to block access to the MySQL port while a database server is recovering. The internal load balancer then moves the clients to the other servers of the cluster in case one is down.
- And in order to keep this all neat and clean, we used Powershell to automate the Azure setup part.

0. Prerequisites

The fixed internal IP and the internal load balancer make use of features that were only added to the Azure virtual network quite recently. Chances are that if you configured an Azure virtual network a while ago, these function may not be available. So just configure a new virtual network for this.

Currently, some of these features can only be configured via Powershell. So you need a (windows) machine to run powershell on, if you don’t have one handy, just create a small (A1) Windows server machine in the Azure portal and use RDP to connect to it. Then install the Azure Powershell, see here.

And you should do a bit of planning ahead for your new virtual network. It should have sufficient IP addresses to host all your database clients, the three servers of the cluster and the additional IP input address of the load balancer. In this case, we used the 10.0.0.0/8 default setting but placed all the database servers in the 10.11.0.0/16 subnet.

1. Creating the machines and the internal load balancer

As said before, we scripted all this in powershell. And in order to keep the configuration apart from the actual commands, we set a bunch of variables in the header of our script that contain the actual settings. So when you see $servicename in the examples below, that is something we’re setting in this header.

The Load balancer is created by this powershell command:

Add-AzureInternalLoadBalancer -ServiceName $servicename -InternalLoadBalancerName $loadBalancerName –SubnetName $subnetname –StaticVNetIPAddress $loadBalancerIP

When running this command, we found that the service needs to be deployed before running this command. So in order to ensure this, we just created a small toy IAAS VM, then created the loadbalancer and the database VMs and then removed the toy VM again.

To configure a VM to use the internal load balancer, we add an endpoint to the VM configuration:

Add-AzureEndpoint `
            -Name mysql `
            -LocalPort 3306 `
            -PublicPort 3306 `
            -InternalLoadBalancerName $loadBalancerName `
            -Protocol tcp `
            -ProbePort 3306 `
            -ProbeProtocol "tcp" `
            -ProbeIntervalInSeconds 5 `
            -ProbeTimeoutInSeconds 11 `
            -LBSetName mysql

Since we have multiple Linux VMs in the same cloud service, we need to remove the standard SSH endpoint and create an individual SSH endpoint for each machine:

Remove-AzureEndpoint `
            -Name SSH `
            | `
Add-AzureEndpoint `
            -Name SSH `
            -LocalPort 22 `
            -PublicPort $externalSshPortNumber `
            -Protocol tcp `
            |`

And we want to use a static internal IP for each machine since we need to specifiy these IP adresses in the galera configuration:

Set-AzureSubnet -SubnetNames $subnetname `
            | `
Set-AzureStaticVNetIP -IPAddress $machineIpAddress `
            | `

We wrapped all this into a configuration function called GetCustomVM. So here’s the complete script:

   1: #
   2: # Set up three VMs for a Galera Cluster
   3: #
   4:  
   5:  
   6: # Azure Cmdlet Reference
   7: # http://msdn.microsoft.com/library/azure/jj554330.aspx
   8:  
   9: $subscriptionId     = "<your subscription ID here>"
  10: $imageLabel         = "Ubuntu Server 14.04 LTS"       # One from Get-AzureVMImage | select Label
  11: $datacenter         = "West Europe" # change this to your preferred data center, your VNET and storage account have to be set up there as well
  12: $adminuser          = "<your linux user name here>"
  13: $adminpass          = "<a linux password>"
  14: $instanceSize       = "ExtraSmall" # ExtraSmall,Small,Medium,Large,ExtraLarge,A5,A6,A7,A8,A9,Basic_A0,Basic_A1,Basic_A2,Basic_A3,Basic_A4
  15: $storageAccountName = "<the storage account name for the vm harddisk files>"
  16: $vnetname           = "<the name of your vnet>"
  17: $subnetname         = "<the name of the subnet for the database servers>"
  18:  
  19: $loadBalancerName   = "galera-ilb" # should be changed if there are multiple galera clusters  
  20: $loadBalancerIP     = "10.11.0.10"
  21:  
  22: $servicename        = "<your service name>" # all machines will be created in this service
  23: $availabilityset    = "galera-as" # should be changed if there are multiple galera clusters  
  24:  
  25: #
  26: # Calculate a bunch of properties
  27: #
  28: $subscriptionName = (Get-AzureSubscription | `
  29:     select SubscriptionName, SubscriptionId | `
  30:     Where-Object SubscriptionId -eq $subscriptionId | `
  31:     Select-Object SubscriptionName)[0].SubscriptionName
  32:  
  33: Select-AzureSubscription -SubscriptionName $subscriptionName -Current
  34:  
  35: $imageName = (Get-AzureVMImage | Where Label -eq $imageLabel | Sort-Object -Descending PublishedDate)[0].ImageName
  36:    
  37: $storageAccountKey = (Get-AzureStorageKey -StorageAccountName $storageAccountName).Primary
  38:  
  39: $storageContext = New-AzureStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey
  40:  
  41: #
  42: # Fix the local subscription object
  43: #
  44: Set-AzureSubscription -SubscriptionName $subscriptionName -CurrentStorageAccount $storageAccountName 
  45:  
  46:  
  47: #
  48: # This function encapsulates the configuration generation of a single new Galera VM
  49: #
  50: Function Get-CustomVM
  51: {
  52:     Param (
  53:         [string]$customVmName, 
  54:         [string]$machineIpAddress, 
  55:         [int]$externalSshPortNumber,
  56:         [string] $storageAccountName = $storageContext.StorageAccountName
  57:         )
  58:  
  59:     # 
  60:     # configure the VM object
  61:     #
  62:     $vm = New-AzureVMConfig `
  63:             -Name $customVmName `
  64:             -InstanceSize $instanceSize `
  65:             -ImageName $imageName `
  66:             -AvailabilitySetName $availabilityset `
  67:             -MediaLocation "https://$storageAccountName.blob.core.windows.net/vhds/$customVmName-OSDisk.vhd" `
  68:             -HostCaching "Read" `
  69:             | `
  70:         Add-AzureProvisioningConfig `
  71:             -Linux `
  72:             -LinuxUser $adminuser `
  73:             -Password $adminpass `
  74:             | `
  75:         Set-AzureSubnet -SubnetNames $subnetname `
  76:             | `
  77:         Set-AzureStaticVNetIP -IPAddress $machineIpAddress `
  78:             | `
  79:         Remove-AzureEndpoint `
  80:             -Name SSH `
  81:             | `
  82:         Add-AzureEndpoint `
  83:             -Name SSH `
  84:             -LocalPort 22 `
  85:             -PublicPort $externalSshPortNumber `
  86:             -Protocol tcp `
  87:             |`
  88:         Add-AzureEndpoint `
  89:             -Name mysql `
  90:             -LocalPort 3306 `
  91:             -PublicPort 3306 `
  92:             -InternalLoadBalancerName $loadBalancerName `
  93:             -Protocol tcp `
  94:             -ProbePort 3306 `
  95:             -ProbeProtocol "tcp" `
  96:             -ProbeIntervalInSeconds 5 `
  97:             -ProbeTimeoutInSeconds 11 `
  98:             -LBSetName mysql
  99:  
 100:     $vm
 101: }
 102:  
 103: #
 104: # 0. Create cloud service before instantiating internal load balancer
 105: #
 106: if ((Get-AzureService | where ServiceName -eq $servicename) -eq $null) {
 107:     Write-Host "Create cloud service"
 108:     New-AzureService -ServiceName $servicename -Location $datacenter
 109: }
 110:  
 111: #
 112: # 1. Create a dummyVM with an external endpoint so that the internal load balancer (which is in preview) is willing to be created
 113: #
 114: $dummyVM = New-AzureVMConfig -Name "placeholder" -InstanceSize ExtraSmall -ImageName $imageName `
 115:     -MediaLocation "https://$storageAccountName.blob.core.windows.net/vhds/dummy-OSDisk.vhd" -HostCaching "ReadWrite" `
 116:     | Add-AzureProvisioningConfig -Linux -LinuxUser $adminuser -Password $adminpass `
 117:     | Set-AzureSubnet -SubnetNames $subnetname `
 118:     | Set-AzureStaticVNetIP -IPAddress "10.0.1.200" 
 119:  
 120: New-AzureVM -ServiceName $servicename -VNetName $vnetname -VMs $dummyVM 
 121:  
 122: #
 123: # 2. Create the internal load balancer (no endpoints yet)
 124: #
 125: Add-AzureInternalLoadBalancer -ServiceName $servicename -InternalLoadBalancerName $loadBalancerName –SubnetName $subnetname –StaticVNetIPAddress $loadBalancerIP
 126: if ((Get-AzureInternalLoadBalancer -ServiceName $servicename) -ne $null) {
 127:     Write-Host "Created load balancer"
 128: }
 129:  
 130: #
 131: # 3. Create the cluster machines and hook them up to the ILB (without mentioning "-Location $datacenter -VNetName $vnetname ", because the $dummyVM pinned these already
 132: #
 133: $vm1 = Get-CustomVM -customVmName "galera-a" -machineIpAddress "10.11.0.11" -externalSshPortNumber 40011
 134: $vm2 = Get-CustomVM -customVmName "galera-b" -machineIpAddress "10.11.0.12" -externalSshPortNumber 40012
 135: $vm3 = Get-CustomVM -customVmName "galera-c" -machineIpAddress "10.11.0.13" -externalSshPortNumber 40013
 136: New-AzureVM -ServiceName $servicename -VMs $vm1,$vm2,$vm3
 137:  
 138: #
 139: # 4. Delete the dummyVM
 140: #
 141: Remove-AzureVM -ServiceName $servicename -Name $dummyVM.RoleName -DeleteVHD
 142:  

 

Now the load balancer and the three VMs are created.

2. Install and configure Galera on the three VMs

We took the galera .deb packages from lounchpad.net:

https://launchpad.net/codership-mysql/5.6/5.6.16-25.5/+download/mysql-server-wsrep-5.6.16-25.5-amd64.deb and https://launchpad.net/galera/3.x/25.3.5/+download/galera-25.3.5-amd64.deb 

In these packages, we found a few minor glitches that collided with the Ubuntu 14.04 LTS we installed them on. The first glitch was that mysql-server-wsrep-5.6.16-25.5-amd64.deb has a configured dependency on mysql-client. And Ubuntu sees this satisfied with the mysql-client-5.5 package it uses as default, but this creates a version conflict. So I downloaded the .deb and modified its dependency to point to mysql-client-5.6 by following http://ubuntuincident.wordpress.com/2010/10/27/modify-dependencies-of-a-deb-file/. The second glitch was the fact that the default my.cnf contains the path /var/log/mysql/error.log which does not exist on Ubuntu. This created the strange situation that the server process would not start but just create two mysterious entries in syslog. Running strace on the server process showed the path it was trying to access, and once I created it everything worked fine. Another glitch in the package was that is was missing an upstart script for mysql, instead it had just a classic /etc/init.d shell script which confused upstart.  So I took one from a standard mysql-server-5.6 package and everything worked out well.

The steps to set up Galera were:

$ apt-get install mysql-client-5.6
$ apt-get install libssl0.9.8
$ dpkg -i galera-25.3.5-amd64.deb
$ dpkg --force-depends -i mysql-server-wsrep-5.6.16-25.5-amd64.modified.deb
$ mkdir /var/log/mysql
$ chown mysql /var/log/mysql

and put the standard upstart script from mysql-server-5.6 into the upstart config directory.

The next part was to configure the galera cluster function. As you can see in the script above, we have created three machines with the internal IP addresses 10.11.0.11, 10.11.0.12 and 10.11.0.13. For this, we need to set a few things in the default my.cnf

Binlog_format=row
Default_storage_engine=InnoDB
Innodb_autoinc_lock_mode=2
Query_cache_type=0
Query_cache_size=0
Innodb_flush_log_at_trx_commit=0
Bind_address=0.0.0.0
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="<your cluster name here>"
Wsrep_sst_method=rsync

These settings are the same in all three machines. On each of the machines, we can now set a human readable node name, eg.g

wsrep_node_name='Node A'

In the next step, we configured the actual clustering, i.e., we told each machine where to find the replication partners.

On machine 10.11.0.11, we set the following line in my.cnf:

wsrep_cluster_address="gcomm://"

This allows this database node to come up even if there is no replication partner.

Then we started the server on 10.11.0.11.

Then we set the following line in my.cnf on 10.11.0.12:

wsrep_cluster_address="gcomm://10.11.0.11,10.11.0.13"

and started the server on 10.11.0.12

Then we set the following line in my.cnf on 10.11.0.13:

wsrep_cluster_address="gcomm://10.11.0.11,10.11.0.12"

and started the server on 10.11.0.12.

Now we went back to 10.11.0.11 and changed the line to:

wsrep_cluster_address="gcomm://10.11.0.12,10.11.0.13"

and restarted the server. Now the galera cluster was configured.

Instead of changing the configuration of the initial node twice, one can also directly start the server process and add the configuration setting to the command line, e.g. mysqld_safe wsrep_cluster_address=”gcomm://”. This is a good workaround if for whatever reason the cluster was fully shut down and needs to be brought up manually again.

Since the internal load balancer was already configured before, we can now use the ILB input IP address to connect to the cluster. So the clients use 10.11.0.10:3306 to connect to the cluster. And with each new TCP connection, the load balancer chooses one of the running nodes and connects the client to it.

There is one additional issue that may confuse clients in one specific situation. Imagine that one of the nodes just failed and is about to start up again. In this state, the database server can be accessed but does not yet have data replicated from the other nodes. In this state, although the clients can connect, all database commands will fail. If clients aren’t prepared to handle this situation, this may show up as database errors in applications. But there’s a solution: FromDual has implemented a small shell script that uses the Linux iptables firewall to deny access to the server while it is in this state. The load balancer then finds it cannot access the TCP port and reroutes the request to another running cluster node.

To run the script whenever a replication state change occurs, another line is added to my.cnf:

wsrep_notify_cmd = /usr/local/bin/block_galera_node.sh

The script and the instructions for setting this up can be found here:  http://fromdual.com/galera-cluster-for-mysql-and-hardware-load-balancer/ Don’t be alarmed by the fact that it talks about hardware load balancers, it works the same with the (software-based) Azure internal load balancer.

Hope this helps,

H.