This Blog will provide information about running SAP applications on SQL Server and Windows. The Blog is written by folks of Microsoft who are working with SAP and SQL Server for more than a decade.
In this part we want to talk about some scenarios where AlwaysOn and Log-Shipping is getting combined. The two scenarios we want to look into particularly would be:
Hence the target server in the DR site needs to get supplied by Log-Shipping with the transaction log backups performed in the main site on the current primary.
See the scenarios below graphically displayed
Graphic 1: Using Log-Shipping to distribute Tlog backups on the different nodes of the AlwaysOn configuration
Graphic 2: Using a combination of AlwaysOn and Log-Shipping. Supply DR site with Log-Shipping
As clear cut as Scenario #2 is, there is some point to scenario #1 as well. In scenario #1, we already supply the DR site with the data of the main site with AlwaysOn. Nevertheless, even with such a configuration, there might occur scenarios, where one needs to restore an image of the production SAP database in relatively short time for a point-in-time recovery. Reasons could be manual failures where data got deleted and now needs to be restored. Since one doesn’t know whether the infrastructure for such an activity is immediately available in the main site or DR site, it can be an advantage having the Tlog Backups and also full database backups available on both sides. Over the years we also encountered numerous cases of human errors (from deleting SQL Server data files instead of copying them to simply reformatting whole SAN volumes or reconfiguring SAN volumes) which suddenly demanded the need to restore the database to the latest possible point in time. Again one doesn’t know which site this requirement comes up. In a lot of cases, in larger companies, Central Backup solutions also might apply quite some delays restoring single backups. Hence there is a point having quite a few days Tlog backups in the main site, so that those are accessible by the primary and secondary and to have those Tlog Backups in the DR site as well.
Another point one needs to make some compromises is the location one uses to store the Tlog Backups in the main site. No doubt, the best would be to have a location on the SAN Array used by the primary and another copy would be on the SAN Array used by the secondary plus then one location on the Array in the DR site. From an availability point of view, the best alternative. No doubt. However a scenario which opens up the possibility to break the Tlog Backup Chain. Here is why:
Log-Shipping is a 3-part process of the tasks:
Assuming that we established scenario #1 or #2 with:
The following scenario could occur:
There is another impact of the scenario above occurring. Assume we indeed ended up with the situation above and we get the former primary, which failed, up and running again. However that replica would now be the secondary replica. The way we setup Log-Shipping we now would enable the copy job for Log-Shipping to copy the Tlog backups from the new primary which created them in a local volume. First thing the copy job of the former primary, now secondary replica would detect is that the content of the local directory does not only miss Tlog backups which have been executed on the new primary meanwhile. But it also realizes that there is one file in its own local directory which doesn’t exist on the new primary. This results in the side effect, that the COMPLETE Tlog Backup directory of the primary is getting copied again. Dependent on the Tlog Backup volume this could be many Gigabytes of data copied.
How to remedy this potential problem?
Measures one can use to minimize the risk of this issue of breaking the Tlog Chain coming up
For scenario #1, the first setup step would be to go to the primary
However we are only halfway done. Since we want to have the log-Shipping working as well in case of a failover between the two replicas in the main datacenter (sapdenali5 and sapdenali6 in our case), we need to setup a similar Log-Shipping configuration for the local secondary replica as well (in our case sapdenali6). This is best done in a situation where that replica is in the role of a primary. Hence a failover might be necessary. After that is done, setup a similar configuration as before.
Assuming we have this all setup and our sapdenali5 replica is in the primary role, the settings of the jobs should look like:
The LSBackup_E64 job performs the Tlog backup every minute. The backups are performed against a file share (first copy of Tlog Backups).
The LSCopy_sapdenali6_E64 job which also is active, basically copies the backups which have been created back to a local share (second local copy of the Tlog backups)
The settings of the local secondary would look like:
The only job enabled would be the Log-Shipping copy job which would copy the Tlog backups performed by the primary onto the local secondary replica.
The secondary replica of the DR site, would have these settings of jobs:
Since we got the secondary replica in the DR site supplied with AlwaysOn already, we can’t restore the Tlog Backups and hence can disable the jobs. We don’t want to delete them since those could come in handy for scenarios where we eventually need to reseed that replica with a new backup from the main datacenter.
Another job which is disabled is also one of the copy jobs. In the scenario we set up, where we copy the Tlog backups in the main datacenter to a file share, it doesn’t make any difference which of the nodes is running in the primary role. Both of them would create the backups on the file share. Means one of the copy jobs could be deleted as well.
As described above we basically got three copies of the Tlog backups in the main datacenter now. It is obvious that with disabling copy jobs on the primary and/or secondary, one can decide to have two or even only one copy only in the main datacenter.
Differences to scenario #2
The only real difference there is to setup scenario #2 is a change in the settings of the jobs created for Log-Shipping on the node which acts as Log-Shipping destination in the DR site. Remember in scenario #2, we have no AlwaysOn to supply the secondary in DR site with the changes. This needs to be done with Log-Shipping. As a result, we need, besides copying the Tlog Backups also restore those Tlog backups on this node which is not part of the AlwaysOn configuration.
The main site settings of the jobs remain the same.
On the log-Shipping destination in the DR site the job settings would need to look like:
We only need one of the backup jobs. Hence the second backup job which got created could as well be deleted. As mentioned earlier. This scenario also allows to define a delay in restore for the Log-Shipping destination.
Now one problem stays. In case of a failover, you would need to enable or disable some scheduled tasks/jobs in a manual way. Given the fact that there often enough happened failovers which were not even noticed, this is not so nice. Hence in the next part, we’ll see how one can automate this part of the story