Welcome to MSDN Blogs Sign in | Join | Help

The SQL Server Wait Type Repository…

As part of my talk at the 2009 US PASS Summit here in Seattle called Inside SQL Server Wait Types, I’m creating this blog post as a reference point that can be used to find out information about wait types in SQL Server 2005 and 2008.  My hope is that if you have a question about a wait type you encounter with SQL Server 2005, 2008, or beyond, you will use this blog post as the starting point. This blog post should also be used as an enhancement to what you can find at the following SQL Server Books Online topic:

sys.dm_os_wait_stats (Transact-SQL)

Why use this blog post over other references? Because information in this blog post will be based on the the actual product source code and empirical testing of scenarios that cause these wait types.

This page is under construction but will contain the following types of information:

1) Table of all wait types as listed in sys.dm_os_wait_stats.

This table will contain the following types of information:

  1. types organized by categories
  2. Which types apply to SQL 2005, SQL 2008, or both
  3. More descriptions for how you might encounter these
  4. Tips on what to do (if anything)
  5. List of specific types that can be safely ignored

2) Links to details on some wait types

Some of these types deserve special treatment so we will provide links to more detailed scenarios for certain types including other blog posts, posts on this blog, whitepapers, etc

This include more details on how you can use various tools to see waits such as DMVs, Extended Events, Management Data Warehouse, Activity Monitor, and Performance Counters.

3) Other References on the topic

I want this blog post to be a living document and help feed enhancements to the SQL Server Books Online. So if you encounter a question on a wait type that is confusing or not listed in this post, please post a comment to this blog post or email psssql@microsoft.com. This isn’t to obtain “free support” for a problem but to be used to help enhance and create the right content for the community on this topic.

Look for the expansion of this blog post with these details in the next 3-4 weeks following the PASS conference.

 

Bob Ward
Microsoft

Reduce locking and other needs when updating data - Better Performance

The following pattern typically stems from an old practice used in SQL 4.x and 6.x days, before IDENTITY was introduced.

begin tran
declare @iVal int

select @iVal = iVal from CounterTable (HOLDLOCK) where CounterName = 'CaseNumber'

update CounterTable
set iVal = @iVal + 1
where CounterName = 'CaseNumber'

commit tran

return @iVal

This can be a dangerous construct.  Assume that the query is cancelled (attention) right after the select.  SQL Server treats this as a batch termination and does not execute other statements.   The application now holds a lock under the open transaction and without proper handling it leads to blocking.

One Statement Fix

declare @iVal int

update CounterTable
set @iVal = iVal = iVal + 1
where CounterName = 'CaseNumber'

return @iVal

SQL Server allows you to update a value and return the updated value into a variable.   A single update statement can do all the work.   If the query is cancelled by the application the update is atomic and will complete or rollback but the application has much better control over the lock and transaction scope.

Use this to help reduce blocking and possible deadlock issues.   Since the lookup only takes place one time it can also increase performance of you application.

 

[OCT 28, 2009] - Update

One of the great things about the blog is that I get to interact with other bright individuals.   From this blog I received feedback from our SQL MVP community and they pointed out some things.

The goals of my original post were:

1.   Point out the issue with bad application design leaving transaction open and holding locks

2.   Point out that the locking behavior can result in unexpected deadlocks

3.   Point out a simple way to avoid the issue.

The scenario I presented is limited to areas where you have appropriate primary keys that prevent updated from occurring in a way you might not expect.  The following blog posts points out some of the pitfalls that things like HOLDLOCK and concurrent insert statements can encounter as well.

http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/06/30/855.aspx

This lead to the following discussion (reduced for the blog post).

----- SQL MVP Message-----

The sp_getapplock is a no brainer for me - its application "tables" independent so does not lock out read only queries which is what happens if you use UPDLOCK, HOLDLOCK is horrible and causes deadlocks, other methods require SERIALIZABLE which again leads to deadlocks. 

Orphaned transactions should never happen because it should be done in a succinct stored procedure, leaving the xact open would be a programming error.

[RDORR]  This is not always true, if the logic is wrapped in a stored procedure it still does not prevent the query timeout case.   The example below shows how you can still have a transaction open after the procedure is cancelled from the client and you don't get a warning about the transaction count being different on entry and exit of the procedure.

use

tempdb

go

select

@@TRANCOUNT

go

create

procedure spTest

as

begin

begin tran

waitfor delay '01:00:00'

end

go

exec

spTest -- Cancel execution of this with a query timeout (attention)

go

select

@@TRANCOUNT -- still == 1

NOTE:  You still need logic in the application (like all should have) to handle activity after a cancel request to reset the proper state.

Ok - the begin / try; yep - that would work; however - you will require more complex logic to do the re-try; if you have gone the UPDLOCK, HOLDLOCK methods and have used begin try then the transaction you are in can only be rolled back so you are a bit stuck there too.

[RDORR]  I am saying you do the insert and not the select.   If the insert succeeds (you have proper PK of course) then there was not a row.  If it fails then there was a row.   This only requires a single lookup in the table and you can handle the @@ERROR condition.

I had a recent client who had the orphaned transaction problem that caused massive blocking - they had no trapping in place, this one was SQL 2000 so I got them to write something around sysprocesses; in 2005 and 2008 you can use the BLOCKED_PROCESS_REPORT - you could write into the activation stored proc on the event notifications queue to pick that specific event type up and if caused by a specific statement/process then actually issue a KILL  - that make sense? It's application dependent though and I tend to go with the on BLOCKED_PROCESS_REPORT (raised because the threshold in sp_configure is set to 5 seconds) just email me and let me deal with it...

-----RDORR Message-----

Devil's Advocate:  Why would you not have a PK on the table and just try the insert.   Use Try/Catch to eat the PK violation so you don't need all the locking?    I assume this is a simple sample and you want to gather other data and it is not this simple?

What would you suggest to the user about the attention problem?  We see customers get into trouble that after they acquire the lock (say app lock in your scenario) an attention hits and they don't issue a rollback and such.   So they now lock up everything.

-----SQL MVP Message-----

It's not really a good method for doing this because you can still get repeated values (see my blog: http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/06/30/855.aspx) which uses sp_getapplock instead as an alternate approach.

The problem is that unless you literally have a straigh UPDATE and WHERE without any other table references then the SELECT side of the UPDATE is done before applying the locks to then do the update so the locking goes a bit mad.

I think the answer you need to state that you must have a unique constraint so that you are only ever updating one row and can guarantee that - also you need to do if @@rowcount > 1 then raiserror; also that you should not use sub queries and other table references....

 Bob Dorr - Principal SQL Server Escalation Engineer

Should I have a SNAC with my cumulative update?

Here is a question we have received from several customers and internally at Microsoft:

Do I need to apply the “SNAC” package that comes with some of the cumulative updates?

Just so we are all on the same terms, SNAC refers to SQL Native Client. and basically contains our SQL Server “native code” data providers including OLE-DB and ODBC.

To understand more why people have asked this question, let’s look at a Cumulative Update 4 for SQL Server 2008 SP1 as found at: http://support.microsoft.com/kb/973602 as an example. Please note that there are some differences for SQL Server 2005 which I’ll talk about later in the blog post.

When you select

image

you are presented with the following choices (by default we detect your OS platform and present you with only packages that apply, but you can expand to get all platform packages):

image

The first package titled SQL_Server_2008_SP1_Cumulative_Update_4 is the “main” CU package you would install on a server or client with SQL Server components. The package I’ve highlighted is referred to as the “SNAC package”. So when presented with this screen you may be asking “Do I need to apply this separate package?” (there are other packages listed here and I’ll talk about them as well later in this blog post).

The answer is “it depends” (don’t you hate it when people say this). Consider these two scenarios:

1) You have SQL Server installed on a server or client tools installed on a client computer

The answer is NO because if you install the “main” cumulative update package you get the bits that come in the SNAC package. You may wonder why it matters whether SNAC is updated on the server itself especially if you don’t run any applications on your server that use SNAC. Linked Server Queries are one of the big reasons. A “default” SQL Server linked server is based on SNAC so this DLL (sqlncli.dll for 2005 and sqlncli10.dll for 2008) will be loaded in SQLSERVR.EXE to run a linked server query.

2) You don’t have SQL Server installed on a client or server but have an application that uses SNAC

The answer is YES. In fact, that is why we created a separate SNAC package. For customers who just need the SNAC update but don’t want to download and install the entire main CU package.

What would happen if you did try to install the main CU package on a machine where SQL Server is not installed but SNAC was installed? The answer is nothing more than if you had just installed the SNAC package (except the SQL Setup Support Files get installed). We detect what is already installed and only offer up to you the appropriate updates. In the case of a machine where SNAC is the only thing installed, you would see a screen like this from the main CU:

image

the feature that shows up is just called “Shared Features”. If you proceed with the rest of the install, it will only update sqlncl10.dll to the version with the cumulative update.

Why did we ship a separate package for SNAC if the main package can do the trick? It is all about size and redistribution. In the above screen if you select  Show Additional Information you can se the sizes of the packages:

image

As you can see the main package is ~232Mb while the SNAC package is ~7Mb. So for customers who have downloaded the SNAC redistributable package from the SQL Server 2008 Feature Pack, having a separate SNAC package for a cumulative update is critical. Any time you see a separate package in the Cumulative update list, it is primarily because a particular software component was updated that could be installed outside the normal SQL Server installation, such as a feature pack component.

Just so I can show the complete evidence here, let’s look at the SNAC files in each of these packages (read this post to find out more about how to use msiexec to extract files from an msi package).

If you extract the sqlncli.msi package (the SNAC package), you will find several files (mostly because we include the SDK lib files) but the one to drill in on is the sqlncli10.dll file. Here is my directory listing in Powershell for my extracted file:

image

Now if I extract the main package files. (since this is an .EXE file you will need to use a tool like winzip to extract all the files from the .EXE) I will get the same file. When you do this extraction, the sqlncli.msi file will be in the path <platform>/setup/<platform> such as x64/setup/x64:

image

This second set of files comes from the main CU package. You can see the sqlncli10.dll is the exact same date/time, file length, and version.

Here are some other important points regarding SNAC and SQL Server 2008 Cumulative Updates:

  • If you go look at Cumulative Update 1 and 2 for SQL Server 2008 SP1, you will notice there is no separate SNAC package. This is because no changes to SNAC were made for CU1 and CU2. Since there were no changes, there is no separate package and the main package doesn’t contain any updates to SNAC
  • The first change to SNAC made after SQL Server 2008 SP1 was in Cumulative Update 3.  So a SNAC package was included and updates in the main package exist.
  • This means from this point forward any cumulative update package for SQL Server 2008 SP1 will include a separate SNAC package and an update in the main package with an updated version even if no changes were made to SNAC. This is in line with how we do cumulative updates. This is why Cumulative Update 4 and 5  for SQL Server 2008 SP1 included a SNAC package and updates to SNAC in the main package (including a version that lines up with the engine for that
  • Cumulative Updates for SQL Server 2008 RTM behave the same as SQL Server 2008 SP1.

What about these other packages in SQL Server 2008 Cumulative Updates?. These are packages that are separable from the main CU package because they apply to different components related to SQL Server but installed through some other means such as a feature pack. As with SNAC, they only show up when an actual change is made. At that point, they will show up for each cumulative update package going forward for that service pack:

SQL_Server_2008_SP1_Cumulative_Update_NN_RSSharepoint – These are updates that are specific to the SQL Server 2008 Reporting Services Sharepoint Add-in that installs with Sharepoint Services. This package also exists for RTM CUs.

SQL_Server_2008_SP1_Cumulative_Update_NN_RRB2ClickOn – These are updates to the Report Builder 2.0 ClickOnce package.

 

What about SQL Server 2005?

SQL Server 2005 is very similar to SQL Server 2008 in that a separate package is provided as an option when changes are made to that component. But, a major difference for SNAC is that the SNAC updates ARE NOT included in the main package.

So on a SQL Server if you need SNAC updated, you must apply both the main and SNAC packages. This is something customers using linked servers should be very aware of.

Another difference is some other separate packages exist that are not in SQL Server 2008. Like SQL Server 2008, these exist to update components that can be installed outside of the normal SQL Server installation. They may not exist in all cumulative update packages because they will only appear when a change is made (or any CU afterwards):

SQL_Server_2005_SP3_Cumulative_Update_NN_SQLWriter – Updates for the Microsoft SQL Server VSS Writer Service used for SQL Server VSS backups. This package also exists for SP2 CUs.

SQL_Server_2005_SP3_Cumulative_Update_NN_RS_Sharepo – Like 2008, this is an update for the SQL Server 2005 Reporting Services Sharepoint Add-in. This package also exists for SP2 CUs.

SQL_Server_2005_SP3_Cumulative_Update_NN_XMO – This is an update to the Microsoft SQL Server 2005 Management Objects Collection as found in the SQL Server 2005 Feature Pack. This package also exists for SP2 CUs.

 

So for SNAC, here is a quick summary:

  1. If you have installed SNAC as a separate installation (say from the feature pack), then just apply the SNAC package. This applies to SQL Server 2005 or 2008
  2. If you have installed SQL Server 2008 on a client or server and want to make sure SNAC is updated, just apply the main cumulative update package. Any SNAC updates are included.
  3. if you have installed SQL Server 2005 on a client or server and want to make sure SNAC is updated, you need to apply the SNAC package in addition to the main cumulative update package

Bob Ward
Microsoft

Posted by psssql | 2 Comments

Reporting Services and the MS09-062 GDR (GDI+)

Updated 10/29/09:  Added section describing SQL Engine only servers at the bottom.

 

Brian Hartman did a great job outlining what the GDR is and why we had it.  You can read it here.  I’m not going to rehash that here.  What I wanted to do is outline some of the scenarios involving the GDR.  The main thing to be aware of is Windows 2000.  Read the section “The transition” in Brian’s blog.

Scenario 1:

In this scenario, we have a Windows 2000 Client using Reporting Manager to render the report.  The Report Server does not have the GDR applied.  The Windows 2000 Client get an error when they try to print - “Unable to load client print control” – once the kill bit has been released.  Clients on WinXP/Win2k3 or later will still be able to print fine.

Scen1

Scenario 2:

In this scenario, we have a Windows 2000 Client using Report Manager to render the report.  The Report Server does have the GDR applied.  In this case, everyone can print.

Scen2

Scenario 3:

In this scenario, the RS Server is patched, but we have introduced another layer.  This layer would either be the Report Viewer Control (Win/Web Control) or hitting a report through SharePoint Integration.  In both cases, there is a manual update you need to perform to apply the GDR for either the SharePoint box, or the application using the Report Viewer Control.  In this case, they have not been updated yet, but the RS Server has been updated.  When the Win2K clients go to print, they will receive “Unable to load client print control”.

Scen3

Scenario 4:

This scenario is the same as before, except we have updated the RVC and the SharePoint Add-in on the SharePoint Web Front End.  Any client should be able to print.

Scen4

How do I know if the GDR is applied?

The build number is the best way to tell if the GDR is applied. These build numbers include this GDR.  The KB Articles associated with the releases includes the files that get updated.

Release Build number
RS 2000 Service Pack 2 8.00.1067
RS 2005 Service Pack 2 9.00.3080
RS 2005 Service Pack 2 QFE (this will bring you up to Cumulative Update 15 as well) 9.00.3353
RS 2005 Service Pack 3 9.00.4053
RS 2005 Service Pack 3 QFE (This will bring you up to Cumulative Update 5) 9.00.4262

What do you mean by QFE in the table above?

By QFE, I mean that you are not running at base SP2 or SP3 for SQL 2005.  This means you have either applied a single hotfix, or have applied a Cumulative Update.  Because we are down that road, the GDR release is tied to whatever that last Cumulative Update was released which for SP2 was CU 15 and for SP3 was CU5.

Important Note: If you have more than Reporting Services on the server that you are applying the QFE version of the GDR to, it will also update any other services that you have on the box to that same build number (i.e. SQL Engine, Analysis Services, Integration Services, Tools, etc…).  SQL Native Client will not be updated.

This could have some ramifications, both for RS as well as any other SQL Service you are running.  For example, if you have SQL Server 2005 SP2 CU7 with the following services installed (RS and Engine), once you apply the GDR you will then be upgraded to CU15 level of fixes.  Every fix between CU7 and CU15 will be applied to both RS and Engine.

Why am I having to install this for just the SQL Engine?

We version bumped sqlservr.exe as part of the GDR.  No code change was actually made to the engine though.  As a result, if all you have is SQL Engine on your server, you will be flagged as needing this GDR if those conditions are met (based on your version as described above). As this GDR is specific to the RSClientPrint control and Reporting Services, your organization will have to determine if you want to apply it to your servers that only have SQL Engine and not Reporting Services.  Be aware about the Cumulative Update notes that were mentioned in the QFE topic above as that aspect will definitely affect the Engine based on the updates available in those cumulative fixes.

 

Adam W. Saxton | Microsoft SQL Server Escalation Services

Microsoft CSS is back at PASS again…

In little under a month (November 3-6), the US PASS 2009 Summit is being held in Seattle, Washington at the downtown Seattle Convention Center. The Microsoft CSS team has been speaking and working at PASS since 2003 and this year we are back again. i thought I would share with you our involvement at this year’s conference as you may be thinking about whether you should attend or if attending what presentations and activities you should go to:

Pre-Conference Seminar

This year we only have 1 pre-conference seminar (PASS this year is doing post-conference seminars but we are not participating in that). Adam Saxton, an escalation engineer in Texas, specializes in Reporting Services. Whenever I have any questions or issue on SSRS, I turn to Adam.

So he will be presenting a day long seminar called Tackling Top Reporting Services Issues. Adam in this session will explain, review, and demonstrate what he does for a living everyday. That is solve problems customers have with Reporting Services. This includes but not limited to:

  • How to look at server logs, configuration settings, and troubleshooting rendering issues
  • Report Design issues such as pagination
  • Issues with integration with Sharepoint
  • Kerberos security issues

If I know Adam, there will be plenty of demonstrations to show during the day. If you work with Reporting Services and want to avoid problems, know how to handle certain situations, or plan better for deployment, you won’t want to miss this session

Main Conference Talks

We have four main conference talks this year by CSS engineers. I’ve listed these in chronological order by date. Please note that in past conferences I’ve seen the agenda times and rooms change at the last minute, but this information is “as is” according to the posted schedule on the PASS website.

(DBA-500-SC) Inside SQL Server Wait Types – Bob Ward

Tues 11/3 10:15 – 11:45am   Room 3AB

This is a “500” level talk which means it can be very deep and internal focused.  However, if you have sees “wait types” show up in DMVs like sys.dm_exec_requests and wonder what some of these mean you should attend this talk. For example, I’ll be covering wait types you can generally ignore and new wait types for SQL Server 2008 called “pre-emptive” wait types. I’ll also talk about a plan for us to document all of the wait types (at least the ones that you as customers need documentation on). In this session, I’m not going to spend much time on some of the very common “waits” you encounter every day such as locks, page latches (although I will talk about non-BUF latches), and network I/O. I’ll be going into the more obscure or advanced wait types, but also talk about things like the history behind wait types and the underlying algorithm for how they work. Plenty of demos in this talk including DMVs and Extended Events. Unlike my talk  last year on memory, I’m not sure this year’s talk will be as some said a 600 level talk, but I have no doubt the Windows Debugger will pop-up somewhere in the discussion<g>

A very good complimentary session to attend is Andrew Kelly’s Capturing and Analyzing File & Wait Stats. This is where you will find a discussion on the common wait types. Andrew’s talk is scheduled right now for Thursday, 11/5 at 10:45am in Room 3AB

(DBA-X69-C) Implementing and Supporting SQL 2008 Failover Clustering - Shon Hauck

Tues 11/3 1:30-2:45pm   Room 4C1-2

Shon is the “goto” guy in SQL CSS for clustering and an escalation engineer based out of Charlotte, North Carolina. Before he joined our team he supported Windows clustering so he has knowledge of internals of both Windows and SQL Server clustering. Shon spent a great deal of time during the CTP launches of SQL Server 2008 giving the product team feedback on SQL Server 2008 clustering so he knows the “gotchas” before and after you deploy. If you have to support SQL Server 2008 in a clustered environment, this is a talk you need to attend. Shon always covers things in his talks that are hard to find in any online resource so I think you will come away with information you can use immediately when it comes to SQL failover clustering.

(BIA-X45-C) Top customer support issues in Analysis Services – John Sirmon

Wed 11/4 1:30-2:45pm  Room 2AB

Three or four years ago, all you would have seen from CSS at PASS are talks related to the SQL Sever engine. John’s talk shows how far we have come. John is an escalation engineer in our Charlotte office and brings a unique skill set that involves both Analysis Serves and Reporting Services. The full title of his talk is Top customer support issues in Analysis Services 2005/2008 and how to resolve them. This is the type of information that CSS uniquely brings to the table at PASS. We can present to you what other customers are having problems with our software and give you solutions in the form of a talk. So John here will cover things like MDX Query Performance problems, how to handle some of the more innocuous errors like “the operation has been cancelled”, and a discussion of common design mistakes that can affect performance. Whether you manage an SSAS deployment or design SSAS objects and applications there are plenty of good tips here to make your deployment and operations of SSAS more efficient.

(AD-X43-C) Troubleshooting applications accessing SQL Server - Abirami Iyer and Lakshmi Jonnakuti

Thurs 11/5 1:00-2:15pm   Room 613-614

Many people don’t know but our CSS team helps developers of data “based” applications as well as the normal “server” products. So if you develop database applications you will want to attend this session. DBAs can benefit from this session as well as Abi and Lakshmi will cover connectivity and authentication issues for applications accessing SQL Server. You will definitely want to see them demonstrate some of the tools they use everyday including BID Tracing, Network Monitor (netmon), SQL Profiler, and the Windows Debugger. Abi and Lakshmi are both in our escalation services team based near Seattle in our Sammamish office. Their focus is on Developer technologies which we often call internally “WebData”. So they both know our developer data access technologies and help developers all of the time. But they both know how our providers interact with SQL Server so this is a great talk to see the “end to end” story when it comes to client/server SQL Server issues.

The SQL Server Clinic

Last year we changed the brand name of the traditional SQL Server Service Center to the SQL Server First Aid Station. We focused our efforts on simply answering your questions about SQL Server technologies. This year we are going to change the brand name again and for a good reason. The past few years, we have found that we are sometimes pointing folks who have more “design” or “advisory” questions to the SQLCAT team. And the SQLCAT team found they were telling people with traditional “problems” to see CSS in the First Aid Station. So…. this year we will be in the same room together for something called the SQL Server Clinic.  And we also have moved the room to a more central location, Room 611 which is right in the “middle” of the hallways leading to the ballrooms on the 6th floor. I’ve been told our room will be larger than last year as well.

Our CSS staff will include our speakers from the conference as well other engineers from our Sammamish office rotating in each day. This is really a unique opportunity to interact with our CSS and SQLCAT teams like no other opportunity I’ve ever seen. We cannot guarantee this is like “getting a free case” from CSS, but we definitely can point you in the right direction. I think many of our engineers also like meeting our customers face to face given we spend most of our time interacting with them over the phone or email. The questions we get in the clinic range from “how does this work” to “I have a crash can you look at it?”. In some situations in the past, I have been able to use my laptop or the customer’s and demonstrate to them how to solve a particular problem on the spot or by the end of the conference. They don’t all work out that way but I would say just about everyone that comes to us a with a question or problems leaves with more than when they walked in the door.

The clinic will be open Tuesday through Thursday of the conference. We are still finalizing the hours but generally you should be able to talk to someone from a few minutes after the keynotes each day (we want to see them too<g>) until around 5 or 6pm.

 

Counting both US and Europe PASS conferences, this will be my 10th PASS Summit to speak and represent Microsoft. It is definitely something I look forward to every year. I would like to hear directly from many of you reading this that will be in attendance.. So please look me up after my talk, in the hallways, at lunch, at one of the evening events, or in the SQL Server Clinic. I’m open to striking up a conversation on anything related to SQL Server. Of course, sports is a big passion for me as well so if you want to also chat about your favorite sports team, it is always a great way to start a conversation with me. See you at PASS in November!

 

Bob Ward
Microsoft

Posted by psssql | 0 Comments
Filed under:

Reporting Services: What’s my version?

I was working on a case where I needed to try something on different Cumulative Updates (CU) to see what the result was in each.  Let me outline the version numbers for the Service Pack 1 release we have to date.

Release Version Number
SQL 2008 RTM 10.00.1600
SQL 2008 SP1 10.00.2531
SP1 Cumulative Update 1 10.00.2710
SP1 Cumulative Update 2 10.00.2714
SP1 Cumulative Update 3 10.00.2723
SP1 Cumulative Update 4 10.00.2734

After I had installed both CU1 and CU2, I noticed the following when I browsed to http://<server>/ReportServer.

rsserver

When I looked at the RS Server Log, I found the following:

rslog

This made me think that the CU did not apply.  However, when connecting to my SQL Engine on the same box via Management Studio, I see the following

ssms

This is the CU2 version.  The CU did apply.  So, why is the ReportServer page and the Log file indicating the SP1 base build number of 2531?  It is currently picking up the version number from ReportingServicesService.exe.  In the case of CU1 and CU2, the actual ReportingServicesService.exe file was not updated.  Only associated files were updated as part of that patch.  However, for CU3 and CU4 the EXE was updated.

You might be asking what the right approach is to determine your version of Reporting Services.  And, if you call Support, the one of the things we are going to want to know if what Version you currently have.  We typically look at the Report Server Page and the RS Logs to get that information.  The current exception to this would be if it shows the base SP1 version, in which case we will need to check some other files to validate that you are really on SP1 or on a CU.

Files

This is not the most ideal situation, but that is what is needed currently if you are looking for your version number.  Also be aware that even though CU3 and CU4 display the correct version, that doesn’t mean that CU5 will have the EXE updated.  The CU Articles list the file changes for that given patch.

kb

If you don’t see this listed in the CU KB Article, then that means it was not updated and your version display will not be accurate from the Report Server Page or the log file.

NOTE: Another caveat to this is Management Studio when connection to Reporting Services.  Currently it still displays the RTM version number (10.00.1600) and should NOT but used to determine the version of Reporting Services on your Server.

Adam W. Saxton | Microsoft SQL Server Escalation Services

Did your backup program/utility leave your SQL Server running in an squirrely scenario?

My colleges asked me if 'squirrely' is a technical term and for this post the answer is yes.  CSS is not going to deny support to customers but SQL Server was not tested in this scenario so you may have chased yourself up a tree, hence I use the term squirrely.

SQL Server 2005 introduced snapshot databases and modified DBCC to create secondary snapshot streams for online DBCC operations.   The online DBCC creates a secondary stream of the database files that is SPARSE.  CSS has found that if a 3rd party backups and utilities or NT Backup is used against the database files the SPARSE setting may get incorrectly, propagated to the parent stream.   In the case of DBCC this is the original database files(s).

Repro

  • Create Database MyDB
  • DBCC checkdb(MyDb)   -- Completed without error
  • Utility like NT Backup touches the database files  (Incorrectly makes SPARSE sticky on on the main file stream)

The next time the database is opened (recovery, restart, etc…) the sparse attribute it detected by SQL Server and the status shown in sys.database_files is updated to indicate is_sparse = TRUE.

use MyDB

select

is_sparse, * from sys.database_files

If the is_sparse is not equal to 0 it indicates that SQL Server is treating the file as a sparse file.  This causes alternate, inappropriate, code lines to be used in areas such as auto grow.  Future releases of SQL Server may contain additional messages in the error log when this situation is encountered.

The case I am working on today shows a primary database file (not a snapshot or secondary sparse stream) using the Windows API DeviceIoControl to zero the contents during an auto grow.   This is not the normal code line as the DeviceIoControl to zero the contents is only used for sparse files.   The MSDN documentation associated with this (http://msdn.microsoft.com/en-us/library/aa364597(VS.85).aspx - FSCTL_SET_ZERO_DATA) indicates that the processing may deallocate other locations in the file while handling the request.

"If the file is sparse or compressed, the NTFS file system may deallocate disk space in the file. This sets the range of bytes to zeroes (0) without extending the file size."

SQL Server does not support backup and restore of snapshot/sparse files so when the primary database is treated as sparse the SQL Server is running in untested situations and the support boundaries blur.

What Should I Do?

Run a query against each of your databases and look for the is_sparse <> 0.  For any files that are not snapshot databases you need to copy the data out of the file, drop the file, create a new file and load the data.  I.E.: Transfer your data to a new file.

Then determine what utility is touching the file and propagating the sparse attribute and configure it to avoid the SQL Server files.

 Bob Dorr - Principal SQL Server Escalation Engineer

 

[Sep 27 Update]

One of the things I love about the blog interactions.  I have had a great list of questions related to this blog already so I would like to add a Q/A section.

Are you saying that if any Snapshots exist when using these third party tools that it could cause the is_sparse flag to be turned on?

[[rdorr]] No it is a Windows based issue.  When DBCC runs it creates a sparse stream.   SQL destroys the stream at the end of DBCC but the sparse bit becomes ‘sticky’ and will get upgraded to the primary stream.

 

Is the problem because the snapshots exist _while_ those backups are taking place?

[[rdorr]] No, the DBCC just needs to be executed so at some time the secondary steam existed.   It does not apply to snapshot databases (Create database for snapshot)

 

In the case of the DBCC activity – is the problem only going to happen if the DBCC is taking place while the third party utility kicks in?

[[rdorr]] No it can occur after DBCC has completed successfully.

You talk about copying the contents out of one file and putting them into another file -- i'm assuming you are meaning a new FileGroup and moving the objects into a new filegroup to transfer all the data to the new filegroup.

 

[[rdorr]] Yes a new file in the same file group or a new file group.

 

The question I have there is that if it is the filegroup that contains the system objects for the database, how do we get that information over into the new file?

 

[[rdorr]] New database you can't move system objects.

 

 

Is it not possible to change the SPARSE attribute on the file system back so that SQL treats the file correctly?

 

[[rdorr]] Not cleanly.  If you have such a situation the file system is tracking it as a sparse file and we have all kinds of unknowns.

 

This sounds to me like a very dangerous situation that could easily result in dataloss.

 

[[rdorr]] Should not result in data loss.  NTFS tracks the correct allocations.  The problem is that sparse files are limited in size so you are running along and you can’t grow anymore or a backup may not restore.  This is where you get into possible data loss.

!!! NOTE !!! This brings up a good point.  After the problem is corrected you should take a full backup.

 

Is the problem only encountered during growths?   if so, should we disable growth until we can reset the file attribute or move the contents somehow to another file(group)?

[[rdorr]] Grow shows the behavior but the file is already getting tracked as sparse.

 

[Sep 29 Update] - HOW CAN I FIX THIS

Continued work on this has revealed SQL 2005 and SQL 2008 differences.    Windows has published various KB articles on how to change a file at the NTFS level from sparse to non-sparse.   The basics are a file copy.   Copy expands the destination file (does not retain the sparse attribute).

 

FIX STEPS for SQL 2008  

!!! WARNING !!!  Take appropriate backups and precautions when attempting these corrections.  A failed step in this process could render the database unusable and you may lose data.

  • Close the database files

ALTER DATABASE MyDB OFFLINE

  • Validate the sparse setting using an elevated admin cmd prompt  Examples shown below.

C:\Program Files\Microsoft SQL Server\MSSQL10.SQL2008\MSSQL\DATA>fsutil sparse queryflag dbTest.mdf

This file is NOT set as sparse

 

C:\Program Files\Microsoft SQL Server\MSSQL10.SQL2008\MSSQL\DATA>fsutil sparse queryflag dbTest.mdf

This file is set as sparse

Note: File that are 'set as sparse' that are NOT databases created using 'CREATE DATABASE FOR SNAPSHOT' should be corrected.

  • Rename the database file

ren MyDB.mdf  MyDBSparse.mdf

  • Copy the data into a new file.  (Note:  Do not copy with overwrite as copy will retain the sparse attribute)

copy MyDBSparse.mdf MyDb.mdf

  • Use FSUtil to validate all files for the database are no longer sparse.
  • Bring the database back online

    ALTER DATABASE MyDB ONLINE

VALIDATE

Validate that is_sparse returns the expected outcome.

use MyDB

select is_sparse, * from sys.database_files

BACKUP

Take a full SQL Server backup of the database and make sure other backups are no longer causing the issue to re-occur.

FIX STEPS for SQL 2005

Use the database copy wizard to copy the data into a new database or upgrade to SQL Server 2008 and use the steps outlined above.   I tested the sp_detach_db on SQL 2005 with an sp_attach_db on SQL 2008 and the is_sparse status can be restored to the expected running state

Posted by psssql | 0 Comments

If you use linked server queries, you need to read this….

If you use the linked server feature with SQL Server 2005 and 2008, please read through this carefully. We have discovered several problems that can result in memory leak(s). The conditions are a bit complicated so let me try to describe the problems, how you could be affected, and what action(s) you can take.

SQL Server 2008

  • ANY remote stored procedure execution will leak memory for each execution on the local server (the server where you initiated the remote procedure execution). The leak is not large for each execution (around 40 bytes) but over time this can add up. This specific problem involves the RETURN status of a stored procedure so there is no way to avoid it (even if you don’t use RETURN in your proc a return status is sent back to the client) if you execute remote stored procedures.
  • If you use sql_variant SQL Server data types you could also face a different memory leak (again on the local server only). These leaks can occur under the following conditions: 
    • If you use a sql_variant type for an OUTPUT parameter of a remote stored procedure you will leak around 40 bytes just for using a sql_variant OUTPUT parameter (the leak is per parameter). If the value of the parameter is of type char, varchar, nchar, nvarchar,binary, or varbinary, you will also leak the size of the value itself for each parameter. So a large character string over time could result in a considerable memory leak.
    • if you run a remote query that returns a result with a sql_variant column AND the value of the column is a char, varchar, nchar, nvarchar, binary, or varbinary value, you will leak the value of that variant for each row returned to the local server. This one has a potential to cause a fairly significant leak depending on how big the values are and how many rows are returned.

SQL Server 2005

  • You are only affected by the sql_variant problems listed above.

This problem occurs if the linked server query uses the the SQL Server Native Client (SNAC) provider (this will happen with both the SNAC9 and SNAC10 providers), or the Microsoft OLE-DB Provider for SQL Server. The problem is specific to how the SQL Server engine handles a SSVARIANT structure so only applies to SQL Server providers. If you add a linked server query through Management Studio and pick “SQL Server” you are using the SQL Server Native Client so you would be subject to this problem.

I’ve tried to plan head of time a FAQ on this topic to help fill in more details:

1) Are these memory leak problems fixed?

The first problem is..…but not completely (yet). The first problem for a leak of return status values is fixed in SQL Server 2008 CU3 for SP1 (See fix article http://support.microsoft.com/kb/971622). That problem doesn’t happen in SQL Server 2005 so no need for a fix with that version. We are also looking to create a fix for this in the next cumulative update for SQL Server 2008 RTM.

For the sql_variant problem, we are also looking to create fixes for SQL Server 2005 CU6 for SP3 and for SQL Server 2008 RTM and SP1 in a future cumulative update.

Please note that the fix for this problem is within the SQL Server Engine code, not in the SNAC code. Therefore, when you are applying a fix from the CU updates, you need to pick the complete update package. For SQL Server 2008 CU3 for SP1 this package is called SQL_Server_2008_SP1_Cumulative_Update_3. This package also includes any update to SNAC. The package called SQL_Server_2008_SP1_Cumulative_Update_3_SNAC just contains updates to the SNAC providers so don’t try to just pick this package to get a fix for this problem.

2) What can I do to avoid the problem if a fix is not available yet?

Aside from not running remote procedures or using sql_variant columns, there is no way to avoid the problem. For 32bit customers, you may be able to mitigate VAS issues this problem causes by running all linked server queries out of process.

3) What symptoms can I see due to this problem?

It is possible you will never see a problem depending on which condition of the leak(s) you are hitting. For example, for the “return status” leak, the leak is only around 40 bytes for each procedure execution. You may run for weeks or even months without this causing a problem. For 32bit users, this may cause you problems quicker because of the limited amount of VAS space. Running the linked server out of process can definitely help your situation for 32bit. Note the memory that is leaked here is in the default Windows heap for the SQLSERVR.EXE process, not in the Buffer Pool. This is why for a 32bit SQL Server installation the problem might result in memory errors because the VAS is limited (e.g.MemToLeave)

The problem was first reported to us because of a 32bit system that saw errors in the ERRORLOG like:

Date Time SPID AppDomain 1411 (mch.dbo[runtime].1516) is marked for unload due to memory pressure.
Date Time SPID AppDomain 1411 (mch.dbo[runtime].1516) unloaded.

These messages can occur due to memory pressure (typically VAS pressure for 32bit and physical memory pressure for 64bit). But there are other types of memory related errors that might occur due to a leak of heap memory.

As future fixes for this problem are released, I’ll update this blog post

 

Bob Ward
Microsoft

Posted by psssql | 0 Comments

You may not see the data you expect in Extended Event Ring Buffer Targets….

Not sure how many of you have started using the new Extended Events feature of SQL Server 2008. For those who have not, there are several good resources out there already for you mostly done by the community:

http://sqlblog.com/blogs/jonathan_kehayias/archive/2009/05/27/whitepaper-using-sql-server-2008-extended-events-published.aspx

http://sqlblog.com/blogs/adam_machanic/archive/2009/04/22/teched-coming-up-recommend-background-for-my-extended-events-talk.aspx

http://technet.microsoft.com/en-us/magazine/2009.01.sql2008.aspx

I also have a few blog posts on our CSS blog that may interest you as well:

http://blogs.msdn.com/psssql/archive/2008/07/15/supporting-sql-server-2008-the-system-health-session.aspx

http://blogs.msdn.com/psssql/archive/2009/04/30/why-should-i-use-extended-events-in-sql-server-2008.aspx

So enough plugging, what about this problem I refer to in the title? As we in CSS have used in XEvent (can’t help myself this is what we call this feature internally) and seen feedback from others in the community (See Jonathan’s Connect post at http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=432548), a limitation has been found with the “Ring Buffer” target.

If you have an event session setup for a ring_buffer target and the data you feed the target exceeds 4Mb, you may not be able to retrieve all XML nodes from the target data. This behavior can look like “torn XML” where some of the pieces of a captured extended event cannot be seen.

Let me be more specific about this problem and what I mean by the 4Mb limit. Data is retrieved for “in-memory” targets such as ring_buffer from querying the column target_data from sys.dm_xe_session_targets. An example query for the installed system_health session looks like this:

select s.name, CAST(t.target_data as xml)
from sys.dm_xe_session_targets t
join sys.dm_xe_sessions s
on t.event_session_address = s.address
and s.name = 'system_health'

If you look at the definition of the sys.dm_xe_session_target DMV in our documentation, the target_data column is defined as nvarchar(max). But in reality this column this column is limited to 4Mb of formatted XML text.

The data produced for an in-memory target is binary as it is produced and consumed within the engine. But when you query the DMV we format the internal data into XML. It is the formatted, expanded XML data that is limited to 4Mb.  Here is where it might be confusing. An XEvent target has a “max memory” for its buffering but that is for the binary version of the data. By default, this value is 4Mb, but you can change this when you create an event session. However, for “in-memory” targets you will never be able to see more than 4Mb of formatted XML. It is very possible for the binary data collected in an in-memory target to be less than 4MB but the formatted version of the XML exceeds 4Mb.

You are more likely to see an issue like this when setting up an event session where a lot of text is formatted from the target (Some possible examples are events like xml_deadlock_report, sql_statement_starting,…) or if you use actions such as sql_text.

There are a few solutions here should you encounter the problem.

  • Use a file target such as asynchronous_file_target or etw_classic_sync_target
  • Set the MAX_MEMORY of your target for the event session lower than 4Mb. This second solution is not something that will guarantee you won’t hit the problem because it is impossible to determine what is the right size to avoid the 4Mb limit of the formatted XML You will need to experiment with whatever events and actions you are using for your session. When using this method (and even if you didn’t limit your memory), you can query the DMV and save off the results to a file periodically.

This problem can occur for any “in-memory” target including ring_buffer, synchronous_bucketizer, asynchronous_bucketizer, pair_matching, and synchronous_event_counter since all of these targets require you to query the sys.dm_xe_session_target DMV to retrieve your data. Note that the probabilities are higher for a ring_buffer target since this is just a “raw feed” of the event data while the others listed here are aggregations of events.

Bob Ward
Microsoft

Posted by psssql | 0 Comments

How It Works: What are the RING_BUFFER_RESOURCE_MONITOR telling me?

The ring buffer records (which can be sent to XEvent) for Resource Monitor and Memory Broker are the key aspects to understanding RM.  The record is produced when a change is detected in state monitored by RM.   

CREATE EVENT SESSION RingBufferInfo

ON SERVER

    ADD EVENT sqlos.resource_monitor_ring_buffer_recorded,

    ADD EVENT sqlos.memory_broker_ring_buffer_recorded

    ADD TARGET package0.asynchronous_file_target

       (SET filename = N'E:\XEvent\RingBuffer.etx', metadatafile = N'E:\XEvent\RingBuffer.mta',

                        max_file_size = 50, max_rollover_files = 10)

    WITH (MAX_MEMORY=4MB, MAX_EVENT_SIZE=4MB,STARTUP_STATE = ON);

Allow me to breakdown the following record.    ( select * from sys.dm_os_ring_buffers where ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR'  )

<ResourceMonitor>

<Notification>RESOURCE_MEMPHYSICAL_LOW</Notification>  <---------------- Current notification state being broadcast to clerks

<IndicatorsProcess>2</IndicatorsProcess>   <-----------------------   Indicator applies to the process as low physical memory

<IndicatorsSystem>0</IndicatorsSystem> <-----------------------  0 means it is NOT a system wide indicator situation

<NodeId>0</NodeId>

<Effect type="APPLY_LOWPM" state="EFFECT_OFF" reversed="0">0</Effect>

<Effect type="APPLY_HIGHPM" state="EFFECT_IGNORE" reversed="0">128163281</Effect>

<Effect type="REVERT_HIGHPM" state="EFFECT_OFF" reversed="0">0</Effect>

</ResourceMonitor>  

The RM record has 3 major components showing various memory details.  

                The <RESOURCEMONITOR> portion of the record is handled from the local resource monitor.   You have a RM per scheduling node. 
                The <MEMORYNODE> details are from the memory node association or the RM.

                The <MEMORYRECORD> comes from global state information.  

The key for understanding why RM is running is in the ResourceMonitor section of the output.   This can be broken down into the Notification, Indicators and Effects.  

Notification

Considered the broadcasted notification state.      

RESOURCE_MEMPHYSICAL_HIGH - SQL can grow memory usage
RESOURCE_MEMPHYSICAL_LOW - System or internal physical memory - shrink
RESOURCE_MEM_STEADY
RESOURCE_MEMVIRTUAL_LOW – Virtual address range for SQL Server process is becoming exhausted. Commonly the largest free block is less than 4MB.

IndicatorsProcess

Process wide indicator using an |= of the following values 

IDX_MEMPHYSICAL_HIGH = 1
IDX_MEMPHYSICAL_LOW = 2
IDX_MEMVIRTUALL_LOW = 4

IndicatorsSystem

System wide indicator an |= of the following values

IDX_MEMPHYSICAL_HIGH = 1
IDX_MEMPHYSICAL_LOW = 2
IDX_MEMVIRTUALL_LOW = 4

It is considered a system indicator if the query routine returns TRUE.   SQL Server listens to the Windows physical memory notifications so it can be signaled when physical memory becomes low or available.


This state is often the windows memory notifications unless an override occurs because of the the EFFECT information.

Effect

Currently 3 types of effects exist so a row for each is produced.

 

Type = indicator type

State = current effect state  (ON, OFF or IGNORE are valid states)

Reserved= this maps to an applied state that toggles from 0 or 1 based on if the effect has been applied.  Applied indicates that the memory state has broadcast and we have achieved somewhat of a steady state for this indicator.  

Value = duration that the effect has been in the reported state.

NodeId

Memory Node association of the RM. 

 

The notification is the current RM state.  For example, once RM detects LOW PM state it is going to execute and attempt to shrink caches and other memory users.     

The indicators tell you if this is an internal or external condition.   System wide indicates the system had a hand in the indication and process means it was something internal that SQL Server detected.  

The Effects represent the state SQL thinks it is in by looking at the working set size and basic memory targets.   The EFFECT logic is used during system level checks.   For example the RM can look at the physical memory and it shows high but the effects indicate Windows paged SQL Server out so the high should be ignored so we don’t consume more memory and cause more paging.    The effects logic can be disabled using a startup trace flag –T8020 to avoid honoring the memory effects.   This might be an option on the servers to narrow down what can trigger RM to execute as long as the working sets of the SQL Server instances are running steady with the target memory but it should only be used for troubleshooting purposes.  

From the ring buffer example is stating that this is not a system wide issue but a process issue.   The effects are off or ignore so it is not a working set type of issue so what turns this on to low physical?  

There are memory broker decisions that come into play at this time.   What the information is is saying is that a cache or the cache’s predicted usage will exceed internal targets and we need to start doing cache cleanup.    By looking at the memory broker, ring buffer entries, you can see the behavior that broker it asking of RM.   Using the clock hands you can see the internal and external hand movement.   This will help you determine which caches are involved.    The dbcc memorystatus() show good details and last notification states as well.  

When the SHRINK appears this is when it could tell RM to help out and result in RM showing the ring buffer above of LOW PM.  

select * from sys.dm_os_memory_cache_clock_hands

 

MEMORYBROKER_FOR_CACHE (default)         Pages

---------------------------------------- -----------

Allocations                              1778

Rate                                     0

Target Allocations                       460798

Future Allocations                       0

Overall                                  482733

Last Notification                        1  ß---------------------------  Memory Broker NOTIFICATION

        ACTIONS (STABLE) = 0
        ACTIONS (GROW) = 1
        ACTIONS (SHRINK) = 2

 Bob Dorr - Principal SQL Server Escalation Engineer

Posted by psssql | 0 Comments

Fun with Locked Pages, AWE, Task Manager, and the Working Set…

I realize that the topic of “locked pages’ and AWE can be confusing. I don’t blame anyone for being confused on this. I also realize we have blogged and talked about this topic many times perhaps “beating it to death”. And I certainly know this is not really fun to anyone(but it made for a catchy title). But I still get questions both from customers and internally within Microsoft about these topics for both 32bit and 64bit SQL Server systems. So I thought a blog post that summarizes and clarifies a few points might be valuable. Maybe this will be the one resource that “turns on the light” for you. Many people respond well to a “FAQ” so I’ll create this topic in that format. I recommend you read through each FAQ one at time in order, because some of the answers need to be read first to understand the ones below it:

1. What is the difference between AWE on 32bit systems and “Locked Pages” on 64bit systems?

I think it is all about using the right terms. SQL Server introduced a “feature” called AWE before we even shipped a 64bit version of SQL Server. The concept was to extend the ability of SQL Server to address more memory than the limits of the virtual address space (VAS) on 32bit systems (which is 2Gb-3Gb). But the SQL team couldn’t do this alone. So the Windows team built an API to support this capability called Address Windowing Extensions.(AWE) If you want to go a step further and find out what is the difference between Physical Address Extensions (PAE) and AWE, read this resource.

One of the interesting effects for any application that uses the AWE API is that any memory allocated with these APIs is not part of the process working set. Therefore, Windows considered this memory as “locked” (i.e. cannot be paged to disk). Therefore, the user who launches an app that uses the AWE API must have the “Locked Pages in Memory” privilege (for example the service account for SQL Server) set.

So now the SQL team introduces a x64 edition of SQL Server with SQL Server 2005. Because a process running on x64 doesn’t have the same VAS limits as 32bit (the numbers are crazy here. Theoretically, a 64bit process has a 16 Exabyte address limit which Windows doesn’t even support yet. In fact, Windows limits VAS to 8TB but physical memory is limited to 2TB….today), there is no need to use any special APIs to address memory bigger than the VAS. In other words, SQL Server should be free to go back and just use plain ol’ VirtualAlloc() to allocate memory. Well….the developers of the product discovered that if they still use the AWE APIs to allocate memory even though it is not really needed, two things would happen:

  • A small performance gain occurs within the kernel. For more details, read this blog post from Slava Oks: http://blogs.msdn.com/slavao/archive/2005/04/29/413425.aspx
  • Just as with 32bit systems, any memory allocated using  the AWE API is not part of the working set and therefore cannot be paged to disk. Therefore it is considered “locked”.

Thus was born the concept most refer to as “locked pages” for 64bit SQL Server editions.

Now remember the requirement to use the AWE APIs for any Windows application? The user account running the process must have the “Locked Pages in Memory” privilege set. So in the 64bit SQL Server engine, if this privilege is set (and a SKU check. See a different FAQ below for more discussion on this), we internally use the AWE APIs to allocate memory. By using the AWE APIs, we have in effect enabled a “locked pages” feature for the SQL Server engine.

So in summary the AWE APIs for 32bit and 64bit SQL Server systems are used for different purposes. In 32bit it is really to extend memory access beyond 4Gb or to enable the AWE feature. For 64bit systems, it is to possibly gain performance and to “lock pages” for the buffer pool.

2. Do I need to use the “awe enabled” sp_configure option on 64bit systems for SQL Server to “lock pages”?

No. In fact, the code for SQL Server for 64bit systems ignores this sp_configure option. It is a “no-op” for 64bit SQL Server systems. You may ask why is this the case if I just told you that AWE APIs are used in 64bit SQL Server systems to “lock pages”?

The answer is based on the purpose for that sp_configure option. The purpose of this sp_configure option on 32bit systems is for the user to “enable” the “AWE” feature, which is I explained above is to extend the ability to reference memory > 4Gb. Now as I mentioned already in order to use the AWE APIs you must have the “Locked Pages in Memory” Privilege. So, when you try to use sp_configure to set ‘awe enabled’ on a 32bit we actually will fail this command if “Locked Pages in Memory” is not set.

3. Why is Task Manager not showing all of the memory allocated for SQL Server?

Imagine you walk up to a 64bit SQL Server installation and the customer tells you the computer has 8Gb of physical memory and SQL Server perfmon counters such as “Total Server Memory” show SQL Server is using around 3Gb of that. But you open up Task Manager on this Windows Server 2008 machine and look at the columns for SQLSERVR.EXE in Task Manager and see something like this:

image

You can see in this example, that the column for “Memory” for Task Manager shows only ~135Mb. But on this computer, if I look at Perfmon and show Total Server Memory, I see this:

image

Why the difference? Well, the most likely reason is that this SQL Server 64bit instance is using “locked pages” as I’ve described in an earlier question. Notice the name on the Task Manager column is called Memory (Private Working Set). Remember we also said that if SQL Server 64bit instances use “locked pages” this memory would not be part of the working set (because remember AWE APIs are used on 64bit to “lock” pages and that memory is not part of the working set). So since the locked pages are not part of the working set, they won’t appear in this column in Task Manager. On the Task Manager for Windows Server 2003, this column is called “Mem Usage” but it also reflects the working set of the process.

How can we prove this “difference” in memory is due to locked pages? Well, there are several ways to do this, but one way to easily see this in SQL Server 2008 is to query the sys.dm_os_process_memory DMV. On my computer where I saw this behavior I queried this DMV and saw these results:

image

You can see from this DMV that the column locked_page_allocations_kb is close to the Total Server Memory perfmon counter and shows that this memory is actually “locked”.

4. Now I know that SQL Server on x64 can use “Locked Pages”, what exactly is locked?

The simple, direct answer to this question is any memory allocated through the Buffer Pool Manager for SQL Server. But what is the “Buffer Pool Manager”? Does this mean only “database” pages are locked? Starting in SQL Server 2005, all memory allocations (in other words all access to the Windows API for memory) go through the SQLOS component of the engine. Any code in the SQL engine that need to allocate memory <= 8Kb use something called the “Single Page Allocator” (SPA) in SQLOS. It just so happens that SQLOS redirects any SPA requests to the Buffer Pool Manager. This is the same buffer pool code that has been used since SQL 7.0 to allocate memory. So any memory needed in the engine that wants “single pages” ultimately goes through the Buffer Pool. And… the Buffer Pool is the code that has the logic to lock pages via the AWE APIs.

So…any code using the Single Page Allocator (SPA) which uses the Buffer Pool manager, will have its memory “locked” if the Buffer Pool Manager is using locked pages. What code uses the Single Page Allocator? Aside from the what you may already guess which are database pages, you can find out what “type” of memory uses the SPA by querying the sys.dm_os_memory_clerks DMV. Look for any row where the single_pages_kb column is > 0.  If you run this query on your server you are likely to see clerk types of CACHESTORE_OBJCP and CACHESTORE_SQLCP. This procedure cache memory. I won’t list out all of them here but you can see several “types” of memory use SPA which means they use Buffer Pool which means this memory can be locked.

5. Does the Standard Edition of SQL Server 32bit support AWE? What about “Locked Pages” for 64bit?

I’ve seen some people ask me and others at Microsoft whether the Standard Edition of SQL Server for 32bit supports the “AWE” feature. I think the confusion is related to the fact that Standard Edition for SQL Server for 64bit until recently did not support locked pages.

So let’s dispel this myth here:

  • The Standard Edition of SQL Server for 2005 and 2008 32bit DOES support the AWE feature. One source to confirm this is at http://msdn.microsoft.com/en-us/library/cc645993.aspx. This lists “Dynamic AWE” as a feature support for both Standard and Enterprise Editions. As I wrote this blog I wanted to confirm this by going to the source..literally. I checked our code and our logic to “enable AWE” is based on Enterprise OR Standard Editions.
  • Up until recently the Standard Edition of SQL Server 2005 and 2008 64bit did NOT support the “locked pages” feature as I’ve described it earlier in this blog post. Based on customer feedback we changed this in recent cumulative updates for both 2005 and 2008.  Enabling this for Standard requires a trace flag and you can read more about this at http://blogs.msdn.com/psssql/archive/2009/05/19/an-update-for-standard-sku-support-for-locked-pages.aspx

6. How do I know if AWE is enabled on a 32bit SQL Server? How do I know if “Locked Pages” is working on a 64bit SQL Server?

Read this blog post first which gives details on the algorithm in the engine for enabling AWE or for enabling “Locked Pages” http://blogs.msdn.com/psssql/archive/2007/10/18/do-i-have-to-assign-the-lock-privilege-for-local-system.aspx.

For 32bit systems, one tip. If you think you set “awe enabled” to 1 and are not sure why you are not seeing Address Windowing Extensions is enabled in the ERRORLOG, it is likely you either didn’t run RECONFIGURE after changing “awe enabled” to 1 OR the “Lock Pages in Memory” privilege is not set. If you try to reconfigure after changing “awe enabled” to 1 and the “Lock Pages in Memory” privilege is not set, the RECONFIGURE command will fail with:

Msg 5845, Level 16, State 1, Line 1
Address Windowing Extensions (AWE) requires the 'lock pages in memory' privilege which is not currently present in the access token of the process.

 

I hope this information has helped you understand the concepts of AWE and “Locked Pages” with respect to SQL Server. I know this can be a confusing topic and I don’t blame anyone for asking question to understand this. If I see comments to this post or run into any common questions that make sense I’ll add this to the FAQ on this blog.

 

Bob Ward
Microsoft

Posted by psssql | 0 Comments
Filed under:

The Future of SQL Server Express Installation and Voting with Microsoft Connect

The SQL Product team that owns setup is looking to improve the SQL Express installation experience including application developers who look to embed SQL Server Express with the install of their application.

Peter Saddow is a Program Manager on this team and he and I go way back (Peter was the primary owner of hotfix releases way back in the 7.0 days when I first met him). Peter has created two URL links for you to complete a survey to help shape feedback for SQL Server Express Setup for SQL Server 2008 R2 and future releases. If you want to participate in these surveys go to these links:

If you have general feedback about SQL Server Express setup go here:

http://blogs.msdn.com/petersad/archive/2009/09/08/general-sql-server-express-survey.aspx

If you are a developer and have specific feedback on the embedded SQL Server Express installation experience go here:

http://blogs.msdn.com/petersad/archive/2009/09/08/embedding-sql-server-express.aspx

These surveys are linked to the Microsoft Connect website. This system is where most Microsoft Product teams get direct feedback from customers on product features, enhancements, and “what I wish for” type of ideas. You can also file bugs here you find without having to contact Microsoft CSS. All it requires is setting up a Windows Live account. One of the features I like about Connect is that you get to “vote” on existing items. So if you really want to push for a bug fix, idea, or feature, make your case, gather support from the community and have everyone vote on the idea.

I have seen some good and bad opinions on this site but I can tell you from personal experience the product teams do listen to this feedback. Any connect entry is automatically filed as a “bug” at Microsoft and the product teams are required to respond to every one of them. Furthermore, don’t think that you have to wait for an entire new release to see something happen. Some connect items are considered in SQL Server for Service Packs, depending on their complexity, risk, etc. The direct link for SQL Server is http://connect.microsoft.com/SQLServer

 

Bob Ward
Microsoft

Posted by psssql | 0 Comments
Filed under: , ,

Another update for the Support Policy for Virtualization for SQL Server….

We now have live an update to the KB article for the Support policy for virtualization for SQL Server at:

Support policy for Microsoft SQL Server products that are running in a hardware virtualization environment

I created this blog post, because there are some changes to the article I believe are important to our SQL Server customers:

  • Added in support for Windows Server 2008 R2 with Hyper-V and Hyper-V Server 2008 R2
  • Reformatted the article in several places to (hopefully) reduce questions and confusion on what is supported, not supported, and certain restrictions on support.
  • Added in statement for non-support of snapshots from any virtualization vendor including Hyper-V. It is very possible you could restore a virtual machine image snapshot where SQL Server was running and not see any problems. However, there are possible problems that can occur with SQL Server (and there is no finite list of problems to list out) in this situation. Therefore, we have made the statement that we do not support this. Again, you may restore from a snapshot and never see a problem. But, if you call Microsoft CSS and have an issue with SQL Server in a virtual machine that was restored from a snapshot, CSS will tell you this is not supported.
  • Added in the FAQ section that we support Quick and Live Migration. I knew we would get questions on these features, so we specifically call them out.
  • Added in a statement in the FAQ about support for non-Microsoft virtualization vendor features. In the SVVP program, this is called “additional functionality”.  Any of these additional functionality features are not part of the SVVP program and must be supported by that vendor. This includes any problems that may occur with SQL Server when using these features. This is nothing new, but I’ve seen questions on this so calling it out in the FAQ.

Bob Ward
Microsoft

Distributed Queries – Remote Login Permissions and Execution Plans

Recently I troubleshooted an issue where a distributed query was randomly getting a poor plan and slow performance where the join would not be remoted.  However, based on the actual data distribution and the join condition, you would expect the join to be remoted.  All of the remote servers were SQL Server, so we were not dealing with heterogeneous data sources.  Further investigation in to the plan also showed that there were some unexplained estimations higher up in the plan.

This all boiled down to a permissions issue with the login used on the remote servers.  Your initial reaction may be ‘how can that be a permissions issue!’.  Especially in view that the query runs successfully without error and retrieves data from the remote servers successfully.

The root cause is regarding the permissions required when the local server is gathering remote statistics.  When SQL Server compiles a distributed query that is accessing data in other SQL Servers, the local server will make calls to the remote server database(s) to gather statistics about the tables referenced by the query.  If you gather a Profiler trace on the remote server during one of these compiles (In addition to the RPC and Batch completed events, add the ‘Audit DBCC Event’ and the ‘Exception’ event), you will see calls like the following:

image

Note the LoginName is plain.  So the way I had my linked server configured is to connect to the remote server using the login plain.  In this case, the login plain had the necessary permissions to run dbcc show_statistics in my remote database.  So the optimizer was able to get the statistics it needed and create a good plan.

However, suppose my plain login is only a reader in the database.  Now see what the trace on the remote server shows:

 image

There is now an Exception event with “Error:  2557”.  What is this?  You can pull this directly from sys.messages:

select * from sys.messages
where message_id = 2557

And 2557 is “User '%.*ls' does not have permission to run DBCC %ls for object '%.*ls'.”

When this happens, the optimizer on the local server continues to compile the plan without the remote statistics, but will likely make very poor remoting decisions and develop a plan that may not appropriately remote joins or predicates.  In the plan you may see that estimates are far off from actual.

Since the permissions are tied to how you have your linked server security defined, you can see how this can introduce some very random plan behavior where sometimes you see a good plan and other times you see a bad plan, and may have trouble reproducing it consistently.  Say for example you have the linked server defined to impersonate the login.  Some logins have db_owner in the remote database and some logins do not.  If the plan is being cached and able for the optimizer to reuse, based on the login permissions at the time the plan was cached you may get a bad plan and then subsequent executions use this bad plan.  Then the plan is aged from cache, the next user that runs does have permissions on the remote server, now a good plan is cached and everyone now starts using the good plan.

The permission requirements for linked server in relation to performance are documented in SQL 2008 Books Online http://msdn.microsoft.com/en-us/library/ms175537.aspx.  However, the same guidelines also apply to SQL Server 2005. 

In a nutshell, for SQL Server 2005 and SQL Server 2008 - the remote login must have the permissions required to run DBCC SHOW_STATISTICS in the remote database http://msdn.microsoft.com/en-us/library/ms174384.aspx.

-HTH

Sarah Henwood

Microsoft SQL Server Escalation Services

Posted by psssql | 0 Comments

Come on 64bit so we can leave the mem….

I recently saw a question the other day about some errors that indicate memory pressure for the SQL Server Engine and how it might be related to the infamous “MemToLeave” for SQL Server. The problem with this situation was that the errors were happening on a 64bit version of SQL Server. So…I thought it made sense to make sure and dispel any questions or myths about “MemToLeave’ and its relationship to 64bit SQL Server.

Ready for the big answer?

There is no “MemToLeave” for the 64bit version of the SQL Server Engine!

Let me explain why….

This concept has to do with virtual address space (VAS) memory and not physical (or virtual memory) for the computer. We created the concept of “MemToLeave” memory for the virtual address space of a 32bit SQLSERVR.EXE because the virtual address space of a 32bit process in Windows is a limited resource. (2Gb by default, up to 3Gb with 4GT tuning options, and 4Gb in WoW64). And since the VAS is limited, the designers of the engine felt that by default the buffer pool should reserve as much of the VAS it could at startup (provided the amount of physical memory is 2Gb or higher. No sense reserving a big portion of the VAS if the amount of RAM is less than the max VAS for the process). So… since we are reserving most of the VAS for the Buffer Pool, we realized we can’t “take it all”. Why? Memory may be needed in the VAS for other “things” such as:

  • Thread Stacks – When a thread is created, reserved memory is required for a thread stack (this is why you might seen some type of error like 17189 SQL Server failed with error code 0xc0000000 to spawn a thread to process a new login or connection. Check the SQL Server error log and the Windows event logs for information about possible related problems. This error typically means the thread stack could not be reserved)
  • Heaps – Default Windows heap and any other heap created by a DLL
  • SQL Server Multi-Page Allocations (MPA) – The engine itself may allocate memory outside the Buffer Pool because the size required is bigger than a SQL page which is 8Kb.
  • Any other DLL that needs to use VirtualAlloc

What are the other “things” besides thread stacks and SQL MPA? Extended Stored Procedures, Linked Server Providers, COM objects, and host of other edge cases where DLLs get loaded in the process space of SQL Server.

The actual term “MemToLeave” comes from an internal variable name in the code that refers to the total amount of memory we “leave around” for these “things”.

The general algorithm at server startup calls for us to reserve this space using VirtualAlloc() and then after reserving the space for the buffer pool, we free this reservation. Now we have “left it” for these other “things” to reserve and commit memory. The general algorithm for the amount to “leave” is:

thread stack size * ‘max worker threads’ + “the value of –g startup parameter” (measured in Mb)

  • The thread stack size is platform dependent (for example 512Kb on x86 and 2Mb on x64)
  • ‘max worker threads’ when set to 0 is dynamic and based on number of CPUs. See BOL for details
  • The default for –g is 256 (which stands for 256Mb)

By now I hope you can guess that this algorithm and code are not needed for 64bit SQL Engine systems because the virtual address space for 64bit Windows applications are not as limited as with 32bit. In theory, the VAS for 64bit applications is 16 Exabytes.but in practical terms for Windows it is 8TB. (you should also note that the physical memory limit of Windows Server 2008 is 2TB so it is not even possible today to address the maximum VAS for a 64bit Windows application). Because of this, SQL Server doesn’t need to make any special VAS reservation at startup.  In fact, on 64bit SQL Server systems, the engine doesn’t reserve a large amount of memory for the buffer pool as with 32bit. So there is no need to leave any VAS space around since we don’t reserve most of it. You might be fooled on a x64 machine with 2Gb or greater if you monitor the “Virtual Bytes” of SQLSERVR.EXE right after startup. On my laptop with 3Gb of physical RAM, my virtual bytes was close to this number of 3Gb. But that is not the engine reserving space at startup. That is just the natural growth of the Buffer Pool when it allocates memory for overhead structures at startup such as SQLOS and lock manager. When we “grow” the Buffer Pool we reserve space in large, small number of blocks instead of small, larger number of blocks.

I hope this can help you and other bust the myth that “MemToLeave” exists on the x64 versions of SQL Server. It only applies to 32bit versions of SQL Server (even those 32bit versions running in Wow64).

Bob Ward
Microsoft

Posted by psssql | 0 Comments
Filed under:
More Posts Next page »
 
Page view tracker