<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>SQL Server Storage Engine : Service Level Agreements</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Service+Level+Agreements/default.aspx</link><description>Tags: Service Level Agreements</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>TechEd session video available...</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2007/06/27/teched-session-video-available.aspx</link><pubDate>Thu, 28 Jun 2007 03:16:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:3573553</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/3573553.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=3573553</wfw:commentRss><description>&lt;P&gt;The session I gave at TechEd this year on &lt;EM&gt;'Secrets of Fast Detection and Recovery from Database Corruptions'&lt;/EM&gt; was videotaped as part of the&amp;nbsp;&lt;EM&gt;Its Showtime!&lt;/EM&gt;&amp;nbsp;TechEd program. The video is now available to watch at &lt;A href="http://www.microsoft.com/emea/itsshowtime/sessionh.aspx?videoid=549"&gt;http://www.microsoft.com/emea/itsshowtime/sessionh.aspx?videoid=549&lt;/A&gt;. This is the same session I've been delivering to user groups around the world for the past few months but this time &lt;A class="" href="http://www.sqlskills.com/blogs/kimberly/" mce_href="http://www.sqlskills.com/blogs/kimberly/"&gt;Kimberly&lt;/A&gt; joined me and played demo-monkey :-)&lt;/P&gt;
&lt;P&gt;Enjoy!&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=3573553" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Disaster+Recovery/default.aspx">Disaster Recovery</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/DBCC+CHECKDB+Series/default.aspx">DBCC CHECKDB Series</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Conferences+2007/default.aspx">Conferences 2007</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Service+Level+Agreements/default.aspx">Service Level Agreements</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/VLDB+Maintenance/default.aspx">VLDB Maintenance</category></item><item><title>More on Service Level Agreements...</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2007/03/12/more-on-service-level-agreements.aspx</link><pubDate>Tue, 13 Mar 2007 03:35:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1868780</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/1868780.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=1868780</wfw:commentRss><description>&lt;P&gt;My recent post on &lt;A class="" href="http://blogs.msdn.com/sqlserverstorageengine/archive/2007/03/04/slas-what-slas.aspx" mce_href="http://blogs.msdn.com/sqlserverstorageengine/archive/2007/03/04/slas-what-slas.aspx"&gt;SLAs&lt;/A&gt;&amp;nbsp;prompted some interest and comments from readers so this is a follow-up to that post.&lt;/P&gt;
&lt;P&gt;What most people wanted was a list of some SLAs applicable to SQL Server - easier-said-than-done because a lot of SLAs depend on the application being serviced by the database. I had a poke about the web and have pulled together some examples to get you thinking about SLAs you may want to define and strive to meet, listed by category - this is definitely a non-exhaustive list!&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Hours of Operation&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Hours that the partition/table/database is available to users. The SLAs may be different for various parts of the database, depending on what applications need access to them. Why is it important to differentiate at various granularities? For example, depending on how the unit of data is used, it may require different &lt;EM&gt;maintenance&lt;/EM&gt; than other data and so knowing the availability SLA allows maintenance downtime to be planned.&lt;/LI&gt;
&lt;LI&gt;Hours reserved for planned downtime. Again, this may differ at various granularities of data.&lt;/LI&gt;
&lt;LI&gt;Amount of advance notice for extended downtime or other changes that affect users. For instance, when my bank upgraded its computer system last year, they gave a series of warnings over the preceding few months so that people weren't surprised.&lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;&lt;STRONG&gt;Service Availability&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Percentage of time SQL Server service is running and able to service connections.&lt;/LI&gt;
&lt;LI&gt;Percentage of time a particular partition/table/database is available for use (i.e. not exclusively locked for maintenance or restore).&lt;/LI&gt;&lt;/OL&gt;
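&lt;P&gt;To make an availability percentage concrete, it helps to turn it into a downtime budget. The following is a minimal Python sketch, not anything built into SQL Server: the function name, the 365-day reporting period, and the sample targets are all assumptions chosen for illustration.&lt;/P&gt;
&lt;PRE&gt;
```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year, measured around the clock


def allowed_downtime_minutes(availability_percent, period_hours=HOURS_PER_YEAR):
    """Return the downtime budget, in minutes, implied by an availability target."""
    if availability_percent > 100 or 0 > availability_percent:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = (100 - availability_percent) / 100
    return period_hours * 60 * unavailable_fraction


# Hypothetical targets; substitute the percentages from your own SLAs.
for target in (99.0, 99.9, 99.99):
    budget = allowed_downtime_minutes(target)
    print(f"{target}% over a year allows about {budget:.0f} minutes of downtime")
```
&lt;/PRE&gt;
&lt;P&gt;Whether planned maintenance counts against this budget depends on how the SLA is worded, so it is worth stating that explicitly when the SLA is defined.&lt;/P&gt;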
&lt;P&gt;&lt;STRONG&gt;System Performance&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Number of concurrent users the system supports.&lt;/LI&gt;
&lt;LI&gt;Number of transactions supported per unit of time.&lt;/LI&gt;
&lt;LI&gt;Acceptable level of performance, such as latency experienced by users for a variety of operations.&lt;/LI&gt;
&lt;LI&gt;Maximum time for an update to be replicated to various remote sites.&lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;&lt;STRONG&gt;Disaster Recovery&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Time allowed for recovery from each type of failure (e.g. accidental data deletion, database corruption, SQL Server crash, OS crash, server failure, site failure).&lt;/LI&gt;
&lt;LI&gt;Time it takes to bring critical data online (e.g. the read/write partitions of a sales database) such that operations can continue and less critical data can be recovered later.&lt;/LI&gt;
&lt;LI&gt;Time taken to recover data to the point of failure.&lt;/LI&gt;
&lt;LI&gt;Maximum acceptable data/transaction/work loss for various kinds of failures.&lt;/LI&gt;
&lt;LI&gt;Maximum time for application failover to a remote server/site.&lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;&lt;STRONG&gt;Support&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Methods available for application users to get help.&lt;/LI&gt;
&lt;LI&gt;Maximum time for a DBA to respond to, and to resolve, various types of problems.&lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;&lt;STRONG&gt;Other&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Maximum amount of space for user tables/databases.&lt;/LI&gt;
&lt;LI&gt;Number of users in specific roles.&lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;Some&amp;nbsp;other things to consider&amp;nbsp;are how you define an SLA - for example, in transactions per second or in commit latency that users experience - and the interplay between SLAs - for example, the commit latency SLA may be affected if the acceptable data loss SLA is zero and a solution such as synchronous database mirroring&amp;nbsp;or remote SAN mirroring is used.&lt;/P&gt;
&lt;P mce_keep="true"&gt;The bottom line is that&amp;nbsp;although it can be simple to&amp;nbsp;quickly define and announce a set of SLAs for a given application, it's very difficult to make sure that each is palatable to all involved, guarantee that each can be met, and allow easy diagnosis of the system to work out which component is failing when an SLA is not met. SLAs really need to be defined while a system is being designed, as retrofitting SLAs after the fact can be very time-consuming and costly.&lt;/P&gt;
&lt;P mce_keep="true"&gt;I welcome any comments or observations on this topic - I'll post on this once I get some more feedback.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1868780" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Disaster+Recovery/default.aspx">Disaster Recovery</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Service+Level+Agreements/default.aspx">Service Level Agreements</category></item><item><title>What are SLAs and why are they important?</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2007/03/04/slas-what-slas.aspx</link><pubDate>Sun, 04 Mar 2007 22:52:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1805710</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>5</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/1805710.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=1805710</wfw:commentRss><description>&lt;P&gt;&lt;EM&gt;(In the UK now hanging out with &lt;A class="" href="http://www.sqlskills.com/blogs/Kimberly" target=_blank mce_href="http://www.SQLskills.com/blogs/Kimberly"&gt;Kimberly&lt;/A&gt; and &lt;A class="" href="http://sqlblogcasts.com/blogs/tonyrogerson/default.aspx" mce_href="http://sqlblogcasts.com/blogs/tonyrogerson/default.aspx"&gt;Tony Rogerson&lt;/A&gt; before&amp;nbsp;teaching a Masterclass tomorrow in Reading. Then it's&amp;nbsp;off to Copenhagen for &lt;A class="" href="http://www.miracleas.dk/index.asp?page=168&amp;amp;page2=323" target=_blank mce_href="http://www.miracleas.dk/index.asp?page=168&amp;amp;page2=323"&gt;SQL Server Open World&lt;/A&gt;, with a little R&amp;amp;R in London beforehand and Copenhagen afterwards, before we fly back to the US on Sunday.&amp;nbsp;The weather here is actually better than in Seattle!)&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;I've had a bunch of feedback from the survey I sent out (still need more before posting any statistics though) and various things have jumped out at me. The most worrying is that many people either don't know what their SLAs are or have no idea whether they can meet them.&lt;/P&gt;
&lt;P&gt;Here are some questions around SLAs&amp;nbsp;- if you can't answer &lt;EM&gt;YES!&lt;/EM&gt; to all of them, then you may be in trouble.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Do you know what an SLA is?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;SLA = S&lt;/STRONG&gt;ervice &lt;STRONG&gt;L&lt;/STRONG&gt;evel&lt;STRONG&gt; A&lt;/STRONG&gt;greement. SLAs are agreements between you and your customers. If you're a DBA, then your customer is typically the company for whom you work. Examples of SLAs are:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;In the event of a corruption, or other disaster, the&amp;nbsp;maximum amount of data loss is the last 15 minutes of transactions.&lt;/LI&gt;
&lt;LI&gt;In the event of a corruption, or other disaster, the maximum amount of downtime the application can tolerate is 20 minutes.&lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;Usually, it's a combination of SLAs such as those above.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Do you know why SLAs are important?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Here's the catch - an SLA is really more than just an &lt;EM&gt;agreement&lt;/EM&gt; between you and your customers - it's more like a &lt;EM&gt;contract&lt;/EM&gt; that you're obligated to meet. This means that if you're a DBA with zero-downtime and zero-data-loss SLAs, you need to make sure that in the event of a corruption&amp;nbsp;you can actually meet those SLAs. The obvious thing is that if the SLAs cannot be met then the business will suffer downtime and data loss. The not-so-obvious thing is that if you're the one who agreed to the SLAs in the first place and, when disaster strikes, the capabilities of the system are far below the SLAs' requirements, then you could lose your job - resume/CV time - I've heard of it happening...&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Do you know your SLAs?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;You have to know what your SLAs are so you can make sure the system can meet them. Several DBAs I discussed this with don't know what their business' SLAs are, even though they&amp;nbsp;are responsible for making sure they are met. I find this&amp;nbsp;astounding - how can you sign up for meeting an SLA when you don't know what the SLA is? Especially if failing to meet the SLA could lead to resume/CV time...&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Do you think you can meet your SLAs?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The other reason to know your SLAs, of course, is so that you can correctly architect your system to meet them. There are a bunch of technologies you can use and strategies you can employ to work towards meeting your SLAs (well beyond the scope of this blog post but will be covered through the year). If you find that you can't meet your SLAs, you need to push back&amp;nbsp;on your management - otherwise you're setting yourself up for trouble when a disaster occurs and you can't meet the SLAs - you'll be held responsible.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Do you &lt;U&gt;&lt;EM&gt;know&lt;/EM&gt;&lt;/U&gt; you can meet your SLAs?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Your disaster recovery plan looks great on paper - but have you actually tried it? I know of one company that has a 15 minute downtime SLA for a 300+GB database but the DBA is relying on clusters to provide that for him. That won't work if the database is corrupt (remember a failover cluster has a single point of failure in its shared-nothing configuration - the disks) and needs to be restored from the last full backup... Another company I know of relies on database mirroring to fail over in the event of a disaster but has never tried it to see if their application fails over gracefully... You have to make sure you've practiced recovering from a disaster &lt;EM&gt;before&lt;/EM&gt; the first real disaster happens - you'll be amazed at the little things that are discovered (e.g. if the on-site backups are bad, how long will it take to get the copies brought in-house from the off-site location 100 miles away? Can you still meet your 15 minute downtime SLA in that case?)&lt;/P&gt;
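&lt;P&gt;A quick back-of-the-envelope check shows why the restore scenario above is worrying. The Python sketch below is only an illustration: the function names are hypothetical, the 100 MB/s restore rate is an invented figure (measure your own), and it ignores log restores and crash recovery, which would add further time.&lt;/P&gt;
&lt;PRE&gt;
```python
def restore_minutes(database_gb, restore_mb_per_sec, fetch_backup_minutes=0.0):
    """Estimate the minutes needed to fetch a full backup and restore it."""
    if 0 >= restore_mb_per_sec:
        raise ValueError("restore_mb_per_sec must be positive")
    copy_seconds = database_gb * 1024 / restore_mb_per_sec
    return fetch_backup_minutes + copy_seconds / 60


def meets_downtime_sla(estimated_minutes, sla_minutes):
    """Return True when the estimated recovery time fits inside the downtime SLA."""
    return sla_minutes >= estimated_minutes


# Figures from the post: a 300 GB database and a 15 minute downtime SLA.
# Assumption: a sustained restore rate of 100 MB/s and a backup already on-site.
estimate = restore_minutes(database_gb=300, restore_mb_per_sec=100)
print(f"Estimated restore: {estimate:.1f} minutes")
print("Meets 15 minute SLA:", meets_downtime_sla(estimate, sla_minutes=15))
```
&lt;/PRE&gt;
&lt;P&gt;Passing a non-zero &lt;EM&gt;fetch_backup_minutes&lt;/EM&gt; models the case where the backup first has to be brought in from the off-site location. An estimate like this is no substitute for actually practicing the restore.&lt;/P&gt;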
&lt;P&gt;&lt;STRONG&gt;Summary&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;As you can see from my short list of questions and answers above, it's vital that you understand your SLAs and know that you can meet them - your business (and job!) may depend on it. If you're having trouble, drop me a line (&lt;A href="mailto:prandal@microsoft.com" mce_href="mailto:prandal@microsoft.com"&gt;prandal@microsoft.com&lt;/A&gt;) and I'll see what I can do to help.&lt;/P&gt;
&lt;P&gt;PS Don't forget to check out the &lt;A class="" href="http://www.dotnetrocks.com/default.aspx?showNum=217" mce_href="http://www.dotnetrocks.com/default.aspx?showNum=217"&gt;.NET Rocks!&lt;/A&gt; show tomorrow!&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1805710" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Disaster+Recovery/default.aspx">Disaster Recovery</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Service+Level+Agreements/default.aspx">Service Level Agreements</category></item></channel></rss>