INFORMATIONAL: SHEDDING LIGHT on VSS & VDI Backups in SQL Server

INFORMATIONAL: SHEDDING LIGHT on VSS & VDI Backups in SQL Server

Rate This
  • Comments 8

Right, ever heard or wondered about snapshot backups in SQL Server? You might have come across the Volume Shadow Service (VSS) backup option in SQL Server. In this post, I will try to explain what they are, how they work and what to check for when you run into issues with VSS backups failing.

Before we get into the nitty-gritty of VSS, let me lay down some of the terms you will need to know,

 

 

VSS

VSS is basically a framework that consists of a set of functions that enable applications to perform backups of volumes.

 

Writer

Writers are applications that store information on disk and that work together with providers through the VSS interface.

SQL Server has 2 writers available:-

 

1) MSDE Writer - MSDEWriter works with SQL Server 7.0, SQL Server 2000, and SQL Server 2005.

2) SQL Writer - SqlServerWriter only works with instances of SQL Server 2005.

 

Provider

Providers own the shadow copy data and instantiate the shadow copies.

 

VDI

SQL Server provides application programming interfaces (API's) that enable an ISV to integrate SQL Server into its products. These specifications are available publicly for third-party vendors to develop backup solutions. Some popular ones include Symantec NetBackup, SQL Litespeed, Legato etc.

 

 

You must be thinking why I need this backup solution when we have the option of Native SQL Backups. Well, the difference with SQL backups and VSS backups is that, through VSS we can backup the database files while SQL Server is running (and has a handle to the DB Files). The primary difference is that these are "Snapshot backups" i.e.  Backup of a file or folder on a specific volume at a specific point in time. So applications on a system can continue to write to the volumes while we do the backup.

 

Now you must be thinking how we can backup the file at a point in time when sqlserver itself has a handle to the file. Well, that’s the magic of VSS a.k.a Shadow Copy :)

 

 

How does it work?

 

One of the first phases of a snapshot creation process is "Backup Initialization." During this phase, the backup application does the following:

 

The backup application initializes the IVssBackupComponents interface.

The backup application calls the IVssGatherWriterMetadata API to perform metadata enumeration.

 

Here is where the default writer comes in. Whichever is the default Writer for SQLServer, that is used to connect to the instance and enumerate all the databases & gather its metadata (The default writer for SQL Server connects to *ALL* instances of SQL Server that are started on the local system).

The metadata is then returned to the backup application.

 

So potential issues that you could run into are when the connecting to the instance fails, the enumeration fails and hence you won’t be able to select any database for backups. Also if the SQL VSS writer service is not running, the SQLWriter is not listed and hence no SQL database files can be backed up.

 

Next up, from the metadata we form a "backup component" which is basically a list of what all to backup along with backup options. After this comes the Snapshot Initialization and finally the Snapshot Creation

 

In the snapshot creation phase, we do the following:

 

a) SQL Writer talks to SQL Server to prepare for a snapshot backup

b) Then all I/O for the database being backed up is frozen, and then we create the snapshot.

c) Once this is done, the I/O is resumed. This process is called "thaw".

 

We then take the snapshot and move it to a backup media if necessary. This phase is outside of SQL Server and in no way interferes with normal data operation. This is between the Backup Application and VSS. Once this is complete, we send a backup termination signal and the snapshot is deleted.

 

Here is a pictorial representation of a VDI Backup cycle     (courtesy sqlbackuprestore.com )

You do have some options when doing VDI backups.

BLOCKSIZE: specifies the physical block size (bytes)

MAXTRANSFERSIZE: Unit of transfer (bytes) to be used between SQL Server and the backup media. The minimum is 64KB up to 4MB.

BUFFERCOUNT: Number of I/O buffers to be used for backup operations.

So the calculation for the amount of memory used in SQL for a backup operation = Maxtransfersize * BufferCount. This memory comes from the non-Bpool region (popularly known as MTL J )

E.g. maxtransfersize = 65536 and BufferCount = 40 ==> 2.5 MB

Below is a sample snippet using VDI interface to do backups.

clip_image004

<Code snippet>

 

            backup->Devices->AddDevice(deviceSet->Name, DeviceType::VirtualDevice);

                        backup->Complete += gcnew ServerMessageEventHandler(this, &BackupFileContext::CompletionHandler);

                        backup->Information += gcnew ServerMessageEventHandler(this, &BackupFileContext::InformationHandler);

                        backup->NextMedia += gcnew ServerMessageEventHandler(this, &BackupFileContext::NextMediaHandler);

                        backup->PercentComplete += gcnew PercentCompleteEventHandler(this, &BackupFileContext::PercentageHandler);

                        backup->Database = this->databaseName;

                        backup->Action = (this->level == 'I')? BackupActionType::Log: BackupActionType::Database;

                        backup->Incremental = (this->level == 'D');

                        // backup->BlockSize = 512 * 128;

                        dbg(1, __LINE__, DBG_TEXT("database '{0}' checksum {1}"), this->databaseName, backup->Checksum);

                        try {

                                    backup->SqlBackupAsync(this->server);

                        }

 

</Code snippet>

 

 

We are making use of VDI with SMO to perform backup.

Specifications here: http://msdn.microsoft.com/en-us/library/microsoft.sqlserver.management.smo.backup_properties(SQL.90).aspx  

 

 

One thing to keep in mind when you are writing using SMO is that the SMO’s default Statement Timeout is 10 minutes.  You might want to set this value higher or to infinite (0), especially for large databases. If your database backup doesn’t complete in 10 minutes, you will get a timeout & the backup operation is aborted.

 

 

<snippet>

using Microsoft.SqlServer.Management.Smo;

ServerConnection server = new ServerConnection(serverName) /* suppose this is your connection object */

server.ConnectionContext.StatementTimeout = 0 //infinite statement timeout. Needed for large databases.

Try

{

backup->SqlBackupAsync(this->server);

}

</snippet>

 

You also have options of excluding some databases to be excluded from backups. By default databases that are in restoring/loading/suspect are excluded. Also excluded are tempdb and mssqlsystemresource databases. This "exclusion list" is one of the metadata that is gathered by the writer.

 

One way that you can control this is using the registry key:

 

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\BackupRestore\FilesNotToBackup

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\BackupRestore\FilesNotToSnapshot

 

To get a list of available writers run this from a command prompt:

vssadmin list providers

 

To get detailed writer metadata information, you can use the VShadow.exe that is available with the Platform SDK. Run this from a command prompt once you have VSSSDK installed

vshadow.exe –wm2

 

Some things to keep in mind:-

 

1) If the connection to SQL Server fails, the enumeration doesn’t happen so the Writer is not listed and the backups fail. Even NTBackup when not backing up a SQL database tries to connect if SQLWriter is available. So make sure the connectivity is working.

 

2) There is a sample tool available with SQL Server 2000 Sample which illustrates VDI backups. It’s called simple.exe.   This can be used to test if the SQL VSS component is functioning.

 

3) Anytime a VDI backup fails, you will get an event log entry like the following,

 

MSSQLSERVER      Error       18210    SERVERNAME     BackupVirtualDeviceFile::RequestDurableMedia: Flush failure on backup device 'RVX-{58702BA0-C160-4125-952E-CC21954404B7}'.

Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.)

 

Make careful note of the HResult code returned by the OS. (Error 995 in this case). The SignalAbort function is called to terminate and it gives the reason for abnormal termination. Also note where the abort was initiated from - client (backup app) or the server (SQL server).

 

So when the server receives an error code from IClientVirtualDevice::CompleteCommand, it will terminate the command and close the Virtual Device. Any buffers which have I/O being performed return VD_E_ABORT status this is what we see in the event log entry. All the subsequent commands are cancelled and VDI backup set is closed by invoking "CloseDevice"

 

OR

 

 

2005-07-26 16:00:05.49 backup    BACKUP failed to complete the command BACKUP log [AON_IDB_POSTDTS] TO VIRTUAL_DEVICE='GSB100'

WITH BLOCKSIZE = 65536 , BUFFERCOUNT = 40 , MAXTRANSFERSIZE = 262144 ,stats

 

You might get above message in the SQL Errorlog. Again this will also contain the HResult code returned by the backup operation.

 

I hope this was useful in de-mystifying some of the mysteries of VSS/VDI backups in SQL Server. As always stay tuned for more.

 

 

Regards,
Sudarshan N

Technical Lead | Microsoft SQL Server

 

 

Leave a Comment
  • Please add 4 and 7 and type the answer here:
  • Post
  • PingBack from http://microsoft-sharepoint.simplynetdev.com/informational-shedding-light-on-vss-vdi-backups-in-sql-server/

  • Every once in a while I see VDI backups failing and questions being asked as to whether the sqlvdi.dll

  • If you use a external or third party backup tool like NetBackup, BackupExec, or one of the many other

  • Thanks for this very informative article.  I have a question related to a security audit that I am doing on one of the several MSSQL Server instances that I am managing in a highly secure organization.

    I noticed that the MSSQL instance has VSS Writer Service enabled and running.  I also noticed that the Windows Server 2003 that is the OS on this server has the Volume Shadow service disabled.

    Does the VSS Writer service on SQL Server require Volume Shadow Win Server service to be enabled and running?  Sorry if this seems like a ridiculous question.

    Thanks.

    Zee - Atlanta. GA

    fredric.zimmerman@gdit.com

    Zee.ss.dba@gmail.com

  • Hi Fred,

    Thank you. Your question is very simple and makes sense. Small things like these make a big difference :)

    With SQL 2005 SP2 onwards, SQL VSS Service is configured to start automatically. Basically this is sqlwriter.exe. To actually create a VSS shadow copy/backup it needs to call some VSS API's for which Windows Volume Shadow service needs to be running. So yes, this needs to be running.

    So, if you want to backup to backup SQL databases as snapshot backups, it need Volume Shadow service to be running.

    e.g. NTBACKUP automatically starts of the system vss service to  do the backup if its stopped. So other backup applications also might have similar logic

    But, to start the SQL VSS service itself, system VSS is not required to be running. SQL VSS will start fine without it, but the backups will not happen.

    More on this here : http://technet.microsoft.com/en-us/library/cc966520.aspx

    HTH.

  • Thank you so much for the great article. we are using IBM Tivoli Manager to backup a SQL Database. we have successfully configured & Performed the VSS backup for one of the large database. Look like it is using the 'Copy-On-Write' (COW) function by default. where do need to change 'Copy-On-Write' to Full Copy/Split Mirror or Redirect-on-Write

    kindly help us

  • Nice Article Parikshit

  • Can anyone let me know How to fix below error:

    BackupIoRequest::ReportIoError: write failure on backup device '3f3041bd-470d-425f-b3c3-a9fd837a7e6d'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

    spid397      BackupVirtualDeviceFile::RequestDurableMedia: Flush failure on backup device '3f3041bd-470d-425f-b3c3-a9fd837a7e6d'. Operating system error 995(The I/O operation has been aborted because of either a thread exit or an application request.).

    SQL Version: SQL SERVER 2008 Enterprise Edition 64 bit

Page 1 of 1 (8 items)