Welcome to MSDN Blogs Sign in | Join | Help

SQL Server 2005 SP3 and Self-Service Hotfixes

If you have not seen or heard the news, we have announced our plans to release a service pack for SQL Server 2005.  If you want to read more about the announcement, look at this blog post

http://blogs.technet.com/dataplatforminsider/archive/2008/04/15/sql-server-2005-sp3-coming-soon.aspx

We are excited to see this in CSS because this will be an opportunity for the larger community to install a build that contains fixes to problems already addressed that you might not otherwise have. The plans are to have this out before the end of this calendar year. As we get closer to having a build released, I'll post more details about what will be in SP3.

Now let me talk some about hotfixes. Many of you have become familiar with our Incremental Services Model (ISM) which involves something called a Cumulative Update (CU). One of the complaints I have received over the years from customers is that they must call CSS to obtain any of these fixes. Premier customers have had the ability to download fixes through the Microsoft Premier Online (MPO) system. But, over the last few months we introduced a system where you could click a link in the article for these CU fixes. You were prompted to complete a form and then a person at Microsoft emailed you back information on how to download the hotfix. While this was a step forward in avoiding a phone call for something you know you already need, it was not quite a true self-service hotfix system (and sometimes it could take a day or so to get back the email). Well, I'm happy to announce that we have recently introduced a new system to automate this process.

Let me walk you through an example you can use today to download the latest CU build, CU #7 for 2005 SP2.

1) Find the KB article that talks about CU 7 by going to http://support.microsoft.com and searching for the KB article 949095 (you can also just search for the keywords "Cumulative update package 7" (use the quotes) off of live.com or support.microsoft.com). Here is the direct link to make it quicker if you want to try this out as you are reading this blog: http://support.microsoft.com/kb/949095/

2) You will notice when you bring up this article that there is a new option at the top to download the fix for this article

image

3) When you click this link, you are presented with a new form to select what you want to download

image

By default, we recognize the platform of your browser so will present the builds for that platform (in my case it was x64). However, if you want builds for a different platform just click on the link that says "Show hotfixes for all platforms". You may also want to select the link to the right of this that says "Show additional information" as I've done in this screenshot. It will show you the version number for this build and file size of the package.

4) Once you select the builds you want to download by selecting the checkboxes, you fill in your email address and the numbers in the picture, and click the "Request hotfix" button.

5) Now you are presented a screen notifying you that you will be getting an email automatically with the information on how to download the fix. Pay special attention to the name of the email account for the sender of this email. I say this because it may show up as Junk Mail (it did for me within seconds of getting this screen).

image

6) This is what the email will look like. You are provided a link to click to download the package containing the hotfix along with a password. If you selected multiple packages to download you will get a separate link and password for each. These passwords expire 7 days after you receive this email so you should download them immediately. If for some reason, you forget to do this and it expires, you will need to go back and download the fix again from the KB article. I've been told this system is designed to get you the email within seconds or minutes. For me it was seconds. It was so quick by the time I had looked at the previous screen, the email was in my Junk mail folder.

image

So as much as I want you to feel comfortable calling Microsoft CSS, if you just need to download one of our CU packages for 2005 SP2, no need. Just go to this article and download the fix yourself. As with any hotfix, I always recommend you look at the article and install this for problems you can confirm you are encountering. But some of you want to stay ahead of the game and test out the latest cumulative update so this is now a way for you to do that without having to call CSS.

We've just made these changes so there are more tweaks to this system. For example, not every individual hotfix article has this new link and there may be some issues with localized languages. I thought about waiting to post this blog when those were resolved, but I couldn't wait. So many people have asked me about this in the past, I wanted to tell you now. So as of today, you can do this for any Cumulative Update Package for 2005 SP2 we have shipped. I'll post more details in the coming weeks as we bring other fix articles on line. When I do this, I'll also post some more details behind our cumulative update packages. We do already have good content on this already. Chris Williams from our Release Services team (these are the folks that own everything about our hotfixes) has a good blog post on CU #7 at:

http://blogs.msdn.com/sqlreleaseservices/archive/2008/04/15/cumulative-update-7-for-sql-server-2005-service-pack-2-2.aspx

This is a good blog to read if you want to keep up with the latest fixes and cumulative update packages as they are released.

I hope you find this new system a time savings for you in your continuing effort to manage and support SQL Server.

 

Bob Ward

Microsoft

Posted by psssql | 4 Comments

How It Works: What is a Sleeping / Awaiting Command Session

This issue is as old as SQL Server.  In fact, it goes back to Sybase days but continues to fool and puzzle administrators.

A session with that status of sleeping / awaiting command is simply a client connection with no active query to the SQL Server. The table below shows the transitions from running to sleeping states for a session.

Connect Running
Connect Completed Sleeping / Awaiting Command
select @@VERSION Running
select completed Sleeping / Awaiting Command

The question usually arises around a session that is holding locks and its state is sleeping / awaiting command.  If the client has an open transaction and the client did not submit a commit or rollback command the state is sleeping / awaiting command.    I see this quite often with a procedure that times out.  

 

Create proc myProc

As

                Begin tran

                Update authors ….

                Waitfor delay ’10:00:00’   --- time out will occur here  (simulates long workload)

                rollback

go

 

When run from the client with a 30 second query timeout the transaction will remain open because the client indicated it wanted to ‘cancel execution' and do no further processing.   To get automatic rollback in this situation transaction abort must be enabled.  You now have an open transaction with a SPID sleeping/awaiting command.

The situation can be caused by many other variations but it is always a situation where the SQL Server is waiting for the next command from the client.   Outside a physical connection problem these are always application design issues.

Bob Dorr
SQL Server Senior Escalation Engineer

Changes introduced in CU#7 may impact Analysis Services backup and sync operations

Cumulative Update #7 for SQL Server includes a fix for Analysis Services backups to correct a problem with truncation of metadata files as described in KB Article 950968.  The fix involves changes to the locking levels used for backups, thereby avoiding the truncation issue which could have been encountered previously.   Unfortunately, the increased locking levels come at a price. 

Prior to this fix, a running backup was not treated as a blocking event for a processing commit, and thus would be allowed to run to completion, even if it meant that a processing operation could not commit for an extended period of time.  After the fix a backup is treated as a blocker for a commit operation.  Due to the way that lock compatibility operates in Analysis Services, once the commit operation is blocked, all incoming requests against objects in the database are also blocked. Analysis Services handles blocked commit requests via the ForceCommitTimeout server setting.  As a result after CU #7 is applied a backup operation will be cancelled if a commit of a processing operation is blocked for more than 30 seconds (default ForceCommitTimeout value) by the backup. 

If the backup is cancelled the following error message will be returned:

Server: The operation has been cancelled.

 (Microsoft.AnalysisServices)

 

For common situations this scenario can be avoided by scheduling database backups at a time that will not overlap with processing of objects in the database.  However, for some scenarios, such as proactive caching enabled partitions that process frequently, the frequent proactive caching processing operations will prevent a backup from ever completing successfully.  In this type of scenario you have several options available for generating a successful backup.

The first two options involve adjustments to the ForceCommitTimeout and/or CommitTimeout settings on the server. 

One option is to increase the ForceCommitTimeout value so that it is long enough to allow the backup to complete.  However, it is important to note that the blocked commit will also block any other operations on the database being backed up.  This essentially means that once a processing operation attempts to commit on an object in the database being backed up, the database will be unavailable for the remaining duration of the backup.

A second option is to set the CommitTimeout to a value other than the server default of 0 (infinite), while also increasing the ForceCommitTimeout such that it is greater than the CommitTimeout.  With this approach you are essentially setting the maximum amount of time that you are willing to wait for a backup or query to complete before cancelling the commit (CommitTimeout) for a process operation and allowing the backup or long running query to continue uninterrupted.  If you choose this approach, the commit of the processing operation will wait for up to the amount of time specified in CommitTimeout to elapse, possibly blocking access to the database for the specified time, and then rollback the processing operation if it was unable to commit within the allotted time.   Proactive Caching can automatically handle these processing failures and will reattempt processing at the next specified interval.  If you have other processing operations which may be impacted by this approach you will need to make sure that they have sufficiently robust retry logic to handle failures caused by long running queries.

Changes to these settings do not require server restarts to take place, so they can be changed prior to the backup and then reset back to the default values after the backup completes.

Another option is to temporarily disable proactive caching for the proactive caching enabled partitions in your database, making the backup, and then re-enabling proactive caching.  Unfortunately, there is no built in AMO command to disable proactive caching for an object.  To disable proactive caching you can either issue alter XMLA commands for the partitions to change the proactive caching settings, or alter the datasource for the proactive caching enabled partitions so it points to a non-existent RDBMS server.  The drawback to this approach is that the backup will contain objects which have been altered to settings other than those desired for actual production operations.

There may be other options, but these are the ones tested and discovered so far.

The change in locking levels and behavioral changes described above also apply to sync operations.  In the case of sync operations the same potential remedies apply.

 

Wayne Robertson

SQL Server Escalation Engineer

Posted by psssql | 3 Comments
Filed under:

How It Works: SQL Server Checkpoint (FlushCache) Outstanding I/O Target

I ran into an issue yesterday that brought up questions as to how the checkpoint's max outstanding, I/O target is determined.   There is different behavior across the various versions of SQL Server and it would take an entire white paper to fully explain the checkpoint process.  I am going to try to boil it down to the most relevant information.

I previously outlined some of the checkpoint behavior in the following white papers.

http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlIObasics.mspx
http://www.microsoft.com/technet/prodtechnol/sql/2005/iobasics.mspx

Checkpoint sweeps the hashed data buffers.  When it finds a dirty buffer it calls the routine (WriteMultiple).  WriteMultiple does hash lookups for adjacent, dirty pages.  The outstanding page count is incremented when a page write request is made.  When the write completes the value is decrement.

The I/Os are written asynchronously.  Checkpoint limits the number of I/Os it can post by comparing the number of outstanding requests with the target.   When the target is reached checkpoint waits.  Once an I/O completes Checkpoint can post another I/O.

Some high level pseudo-code helps.

Void FlushCache()
{

bufferPos = 0;
while(bufferPos < maxBuffers)
{

if(buf[bufferPos] is dirty)
{

pagesInMotion = WriteMultiple(buf[bufferPos]);
}

       SafeAdd(totalCheckpointPagesOutstanding, pagesInMotion); 

while (totalCheckpointPagesOutstanding >= maxOutstandingPages)

{

       WaitFor..(5 milli-seconds)

}

bufferPos++;

}

}

 

Void CheckpointIoCompletion(….)
{

       Check success of WriteOperation

       SafeDecrement(totalCheckpointPagesOutstanding);

}

Max Outstanding I/O Target

Version Max Outstanding I/Os for Checkpoint
SQL Server 2000 SP3 100
SQL Server 2000 SP4 64 * Logical Schedulers
SQL Server 2005 *Formula

As I stated there are other parts to FlushCache.  For example SQL Server 2000 SP3 and SP4 attempt to maintain a rate that meets the configured recovery interval.  This may include additional sleeps.

See: KB 906121: Checkpoint resumes the behavior that it exhibited before you installed SQL Server 2000 SP3 when you enable trace flag 828 on a computer that is running SQL Server 2000 SP4
http://support.microsoft.com/kb/906121

Generally speaking the loop will continue without additional waits.   This means that the I/O load during SQL Server 2000 checkpoints is held steady.   As soon as one of the I/Os complete Checkpoint can issue a new request. 

The reason I specifically point this out is that I have seen systems where some of the I/Os become stalled.   Lets say the first 50 I/Os are stalled and the last 50 complete.  Checkpoint puts out 50 more I/O requests that can again complete.   This can make the performance counters for average disk sec/transfer and disk queue length more challenging to interpret.

*Formula (SQL 2005)

Checkpoint (FlushCache) was significantly upgraded for SQL Server 2005.  It can honor a target (see manual checkpoint) and various other parameters can affect Checkpoint.   I am going to describe one of these behaviors.

The starting point for max outstanding I/O level is loosely calculated as:

  1. Try to checkpoint entire SQL Server memory footprint in 300 seconds assuming 20ms I/O target.   Divide total SQL Server commit buffers to determine the number buffer rate.
  2. Divide outcome of by the number of logical schedulers
  3. If outcome of is less than 20 default to 20
  4. Multiply outcome by 10  

    Note: Portions of this calculation are used to determine max outstanding lazy writes.

Each time SQL Server 2005 checkpoint executes the calculated target establishes the max outstanding I/O level.   Depending on the type of checkpoint this value is adjusted as the I/Os complete. 

SQL Server 2005 checkpoint adds duration tracking to the I/O requests.  As each I/O completes the duration is used to adjust a running I/O average.  Here is some pseudo-code showing a very high level running average calculation.   A running average is used to accommodate the 'immediate' state of the I/O path so Checkpoint can react quickly to the current state and not the historical states.

Void CheckpointIoCompletion(….)
{

       Check success of WriteOperation

             duration = Current Time - I/O Start Time

             runningAvg = (duration + (runningAvg * 7)) / 8;


       SafeDecrement(totalCheckpointPagesOutstanding);

}

Using this running average, Checkpoint adjusts the number of allowed max outstanding I/Os with a goal to keep I/O response time below 20ms.

 

Bob Dorr
SQL Server Senior Escalation Engineer

Posted by psssql | 0 Comments
Filed under: ,

Troubleshooting xp_cmdshell failures

This post assumes you have properly enabled the xp_cmdshell feature using the Surface Area Configuration tool and you have used Management Studio | Server Properties | Security to establish a proxy for xp_cmdshell execution.

Sample command:   master..xp_cmdshell 'whoami'
SQL Authentication: TestLogin   (Public in pubs)

When xp_cmdshell is executed it returns initialization errors with error message 15121. 

Msg 15121, Level 16, State 200, Procedure xp_cmdshell, Line 1
An error occurred during the execution of xp_cmdshell. A call to 'API' failed with error code: 'OS ERROR CODE'.

Use the API and OS ERROR CODE to troubleshoot problems.   The API and OS ERROR CODEs can be looked up on MSDN.  You can also use 'net helpmsg ####' to retrieve the text of the OS ERROR CODE.

Bad Password  (Server Properties | Security | Proxy does not validate password)
If password is changed outside of Management Studio (Computer Management) so it is not in sync with SQL.
If user was dropped and recreated from the system (Computer Management) so it has a SID change.

Msg 15121, Level 16, State 200, Procedure xp_cmdshell, Line 1
An error occurred during the execution of xp_cmdshell. A call to 'LogonUserW' failed with error code: '1326'.

 1326 = Logon failure: unknown user name or bad password.

User does not have login as BATCH rights (Local Security Policy | User Rights Assignement | Logon As A Batch Job)

Msg 15121, Level 16, State 200, Procedure xp_cmdshell, Line 1
An error occurred during the execution of xp_cmdshell. A call to 'LogonUserW' failed with error code: '1385'.

1385 = Logon failure: the user has not been granted the requested logon type at this computer.

Note: Drop the 'W' when searching  LoginUser not LoginUserW for the unicode versus ascii version LoginUserA

Success

master..xp_cmdshell 'whoami'

output
-----------------
dorrdell\testuser

Note: When it is called by a user that is not a member of the sysadmin fixed server role, xp_cmdshell connects to Windows by using the account name and password stored in the credential named ##xp_cmdshell_proxy_account##. If this proxy credential does not exist, xp_cmdshell will fail.

Bob Dorr
SQL Server Senior Escalation Engineer

Posted by psssql | 1 Comments
Filed under: ,

How It Works: File Streams Requires Integrated Security (Windows Authentication)

SQL Server authentication versus Windows Authentication seems to cause File Stream users confusion.   I thought this e-mail chain would help.

From: Robert Dorr
Subject: RE: OpenSqlFilestream problem

 

It is really not the SQL Server Service account that matters here.   SQL Server has to have access to the files in order to handle the file stream file groups.

 

The problem that external access to the files occur through a UNC like facility.  \\server\instance\....    Column.PathName    This goes through the Windows IOMgr path and to our RsFx kernel components.   The security context at this point in time is the user making the call to OpenSqlFileStream  that ends up calling NtCreateFile.   To validate the user can access the file under the transaction the user is impersonated and checked by SQL Server for proper transaction access.   Since the transaction was opened under mixed security the impersonation will fail. 

 

The entire loop has to be done under the same integrated security.

 

From: Another User
Subject: RE: OpenSqlFilestream problem

 

A SQL login means the FileStream is using the SQL Server service account and not the user's service account. It's quite possible the service account may have limited rights. e.g. if it's a local account, it will not be able to access files on a network share, etc.

 

 


From: Robert Dorr
Subject: OpenSqlFilestream problem


File Streams only support integrated security access.    The Windows HANDLE model does not understand SQL Server logins.

From: User
Subject: OpenSqlFilestream problem

 

  I make the call to OpenSqlFileStream I getting an “Invalid Handle” error. I see in sqlserver error log following message. 

 

The current user cannot use this FILESTREAM transaction context. To obtain a valid FILESTREAM transaction context, use GET_FILESTREAM_TRANSACTION_CONTEXT. 

<fsautils.cpp, 0347, NtStatusFromSqlError                            > SqlServer error code 3934 is converted to 0xc0000022  <  fsfsql.cpp, 0287, CFsaFstrSqlSession::OnFileOpen                  > CFsaFstrSqlSession::OnFileOpen () failed: Error 0xc0000022 (-1073741790) 

 

From sql profiler I captured the same error also.

 

Error: 3934, Severity: 14, State: 1.

The current user cannot use this FILESTREAM transaction context. To obtain a valid FILESTREAM transaction context, use GET_FILESTREAM_TRANSACTION_CONTEXT.

 

 

Switching the application connection string from  SQL Server login account , which is sysadmin, to Integrated Security resolves the problem. But using sqllogin is mandatory in this case. If anyone can help , I appreciate.

 

 

          //SqlConnection cxn = new SqlConnection("Data Source=(local);Initial Catalog=Northwind;Integrated Security=True;");   //It works

          SqlConnection cxn = new SqlConnection("Data Source=(local);Initial Catalog=Northwind;User ID=sa;Password=111;"); 

    cxn.Open();

          SqlTransaction txn = cxn.BeginTransaction();

    Guid DocId = Guid.NewGuid();

 

                               ……

                      //It works with Integrated Security but not sa

    SafeFileHandle handle = SqlNativeClient.OpenSqlFilestream(

                    sqlFilePath.Value,

                    SqlNativeClient.DESIRED_ACCESS_WRITE,

                    0,

                    transactionToken.Value,

                    (UInt32)transactionToken.Value.Length,

                    new SqlNativeClient.LARGE_INTEGER_SQL(0));

                 

                    //Invalid handle. Parameter name: handle

                  .......

 

Bob Dorr
SQL Server Senior Escalation Engineer

Posted by psssql | 1 Comments

European PASS Conference 2008

I'm leaving this Saturday for the Europe PASS Conference in Neuss, Germany. I have the following two sessions to present while at the conference:

1) Performance Toolset Workshop

This is day long pre-conference seminar on SQL Server Engine Performance. This session is all about the tools used by CSS to work on customer problems related to SQL Engine performance. This includes tools like Performance Dashboard Report, SQLDiag, SQLNexus, RML Utilities, and some interesting new perf tools in Windows Server 2008/Vista such as Resource Monitor.

While these are the tools of SQL Server 2005, I'll spend the last part part of the day talking about SQL Server 2008 including:

  • A New Activity Monitor for Management Studio
  • The Management Data Warehouse and Reports
  • Engine "Goodies" including new query hints, plan guide enhancements, and something called "query and plan fingerprint"
  • T-SQL Object Search and Debugging
  • Extended Events

2) Debugging SQL Server 2005 Toughest Problems

This is a talk I gave at last year's PASS Conference in Denver which include topics such as latches, access violations, memory internals, crashes, and odd wait_types.

I'll add more this time on SQL Server 2008 given where we are with the release including:

  • Extended Events
  • Preemptive Waits
  • Tempdb Checksum
  • New Memory DMVs
  • Powershell

If you don't happen to be traveling to Europe for the conference, have no fear. I'll be doing a regular series when I return from the conference on this blog on new supportability features coming out for SQL Server 2008 including these topics I have listed above. Look for these posts starting in late April/early May.

Bob Ward
Microsoft

Posted by psssql | 0 Comments
Filed under:

How It Works: SQL Server Page Allocations

 

When I drop an object why does SQL Server not immediately re-use those extents for a new object?  This is because the storage engine is optimized for speed to continue allocating new extents from available free space in the file rather than always going back in to routines to identify and reclaim previously dropped extents.

 

The storage engine maintains a pointer in the FCB for each file to point to the next available (uniform or mixed extent).  It continues to move this pointer along the file as objects and extents are allocated.  The storage engine does not constantly evaluate to move this pointer back to previously dropped extents  until it reaches the end of the file and is faced with an auto-grow or out of space.  Then routines will kick in to aggressively start going back through the file to reclaim available extents.  The reason why the storage engine does the allocations this way is primarily for speed and performance.  If the storage engine was always trying to go back through the file to look for previously available extents then allocation of new extents would be much slower and fragmented rather than continuing along the available space in the file.  The storage engine will only get aggressive to reclaim space when it has no other alternative and space at the end of the file is not sufficient and then go back through to see where it can find space to handle the new allocations.

 

As an example, say my database file looks like the following with free available extents at the end of the file and my FCB points to extent 101 as the next available uniform extent:

clip_image002

 

 

I then create a new table and insert some data.  The storage engine allocated extents 101-103 for my new data.  The FCB pointer was moved along and now points to extent 104 as the next available uniform extent:

 

clip_image002[9]

Then I do what I needed to do with that object data and decide to drop the object.  Extents 101-103 are no longer in use (there are other cleanup routines that go through and mark extents as available).  Note that the FCB pointer continues to point to extent 104 as the next available uniform extent:

 

clip_image002[11]

 

I then decide to create another table and insert some data.  The storage engine looks at the pointer for the next uniform extent and starts allocation for my new object at extent 104.  Extents 105 and 105 are allocated for my new object and the FCB next available uniform extent pointer is moved along to 106:

 

clip_image002[13]

 

Author: Sarah Henwood - SQL Server Escalation Services

Posted by: Bob Dorr - SQL Server Senior Escalation Engineer

Posted by psssql | 1 Comments
Filed under: ,

Microsoft to Increase Use of SQL Server 2005 Best Practices Analyzer (SQLPBA) - So Should You

I don't want you to be surprised when you see more use of SQLBPA.   It was 18 months ago when the development team approached the support team asking us to help identify issues for SQLBPA.   Several of us, including myself helped identify and rank a list of rules for SQLBPA that target the repeated, most difficult and critical cases.  

SQL Server 2005 Best Practices Analyzer
http://www.microsoft.com/downloads/details.aspx?FamilyID=da0531e4-e94c-4991-82fa-f0e3fbd05e63&DisplayLang=en

The Exchange team has the ExBPA utility and it is being used when issues are reported to Microsoft as well as for onsite engagements.   Since Exchange started using ExBPA for proactive engagements the rate of serious issues reported to Microsoft for Exchange has dropped 20+ percent.  We expect the SQLBPA rules to assist SQL Server customers in the same way. 

As a testament to the current SQLBPA rules I helped on 39 cases the week of Easter.   Of those 39 issues 8 would have been directly identified by the current SQLBPA rules and 5 other issue significantly reduced in scope. 

This week the SQL Server support team has increased the use of SQLBPA on support calls and will continue to do so to assist customers.   I would recommend to any DBA that they run the utility on their systems to identify possible issues before they lead to unwanted support calls.

We intend to continue adding more rules and policies to assist our customers.   We are working to provide new rules every 8 weeks to assist SQL Server users.  

SQL Server 2008 DMF  (Declarative Management Framework)

As we get closer to the release of SQL Server 2008 I should mention DMF in this conversation.   DMF is the, out-of-box, replacement for the SQLBPA utility.   It extends the capabilities beyond what SQLBPA can do and allows actions such as enforcement of policies/rules.   If you have not looked into the DMF capabilities I suggest you do so.   You could develop an entire series of your own best practices and enforcement's to meet the needs of your applications.

Bob Dorr
SQL Server Senior Escalation Engineer

Posted by psssql | 1 Comments
Filed under: ,

SQL Server 2005 Memory Limits and Related Questions

I had a very nice conversation this week with one of the blog readers.   The question related to adding more memory to their server.   It lead to an exchange of various links that I think you might find handy.  We discussed the operating system limitations and the SQL Server limitations.   Each of these are outlined in different locations.

Operating System Limits

The operating system establishes installed memory limitations by SKU.   SQL Server is limited by what the operating system presents.  

Memory Limits for Windows Releases
http://msdn2.microsoft.com/en-us/library/aa366778.aspx

SQL Server Limits

Memory Supported by the Editions of SQL Server 2005
http://msdn2.microsoft.com/en-us/library/ms143685.aspx

Updated: 15 September 2007

The following table specifies the maximum memory support for each edition of Microsoft SQL Server 2005.

SQL Server 2005 edition Maximum memory supported (32-bit) Maximum memory supported (64-bit)

Enterprise Edition

OS maximum1

OS maximum1

Developer Edition

OS maximum1

32 TB

Standard Edition

OS maximum1

32 TB

Workgroup Edition

3 GB

N/A2

SQL Server Express Edition

1 GB

N/A2

Evaluation Edition

OS maximum1

OS maximum1

1This edition of SQL Server 2005 will support the maximum memory supported by the operating system.

2This edition of SQL Server 2005 is not available for the 64-bit platform in this release.

PAE Bug

In the process of discussing the memory upgrade I mentioned that there is a known set of Windows PAE related bugs that can lead overall system instability.   In fact, I would go as far to say that I would even consider running SQL Server without the fixes.   Microsoft SQL Server Support has helped dozens and dozens of customers that encountered problems because of these bugs and limits how much debugging we will do on a system that has /PAE enabled or the operating system supports hot add memory when the fixes are not applied.   We have found that in many cases it is wasted effort for the customer and Microsoft to attempt to debug such a random natured problem.

These problems range from unexpected behaviors, exceptions to database corruption and data loss.  The bug looks like the system has a memory chip that is failing and returning incorrect data.   The bug causes incorrect physical page is mapped into the virtual address space of any process. 

Bob Ward and I have both mentioned this issue during SQL PASS and other presentations for a couple of years and we have added rules to the SQL Server 2005 BPA utility to flag systems that are unprotected from this bug.

If you are running Windows 2000 or Windows 2003 RTM you should update your server.

Windows 2000 fixes

[Post Win 2000 SP4]

838647  Access violations when you use the /PAE switch in Windows 2000

http://support.microsoft.com/default.aspx?scid=kb;EN-US;838647

[Post Win 2000 Sp3]

885272  You receive a "Stop 0x0000001a" Stop error message on your Windows 2000 Server-based computer that has Physical Address Extensions (PAE) enabled

http://support.microsoft.com/default.aspx?scid=kb;EN-US;885272

 

Windows 2003 fixes

[Included in Win 2003 SP1]

834628  Data is corrupted when PAE is enabled on a Windows Server 2003-based computer

http://support.microsoft.com/default.aspx?scid=kb;EN-US;834628

[Included in Win 2003 SP1]

895575  A process that runs in the Physical Address Extension (PAE) kernel may experience memory corruption in Windows Server 2003

http://support.microsoft.com/default.aspx?scid=kb;EN-US;895575

[Security Update for RTM]

840987  MS04-032: Security update for Microsoft Windows

http://support.microsoft.com/default.aspx?scid=kb;EN-US;840987

SQL Server 2005 Best Practices Analyzer
http://www.microsoft.com/downloads/details.aspx?FamilyID=da0531e4-e94c-4991-82fa-f0e3fbd05e63&DisplayLang=en

[APR 16 2008 REVISION - Bob Dorr and Suresh Kandoth]

The original post indicated you needed to move to Windows 2003 SP2 to avoid the PAE issue.   This was incorrect, I was thinking of some of the other issues we have encountered and I would still suggest Windows 2003 SP2 if you can't go to Windows 2008.

922658 SQL Server 2000 or SQL Server 2005 may temporarily stop responding on a Windows Server 2003 Service Pack 1-based computer
http://support.microsoft.com/default.aspx?scid=kb;EN-US;922658

919341 A program that uses structured exception handling may stop responding for several seconds in Microsoft Windows Server 2003 with SP1
http://support.microsoft.com/default.aspx?scid=kb;EN-US;919341


SQL Server Working Set Trim Problems? - Consider...
http://blogs.msdn.com/psssql/archive/2008/03/03/sql-server-working-set-trim-problems-consider.aspx


Bob Dorr
SQL Server Senior Escalation Engineer

SQLIOSim is "NOT" an I/O Performance Tuning Tool

This blog post is to again inform you "NOT" to consider SQLIOSim as an I/O performance evaluation tool!   Every other month or so I get a rash of questions about using SQLIOSim to validate I/O performance and I have to repeat the same information.

As I have stated in other posts and documents SQLIOSim and SQLIOStress were designed to test the I/O stability not performance characteristics.   SQLIOSim specifically includes random patterns as various stages of testing that will invalidate any performance comparison attempts.  This even includes one SQLIOSim run to another.

SQLIOSim also includes segments of unbounded I/O attempting to to push memory and various queues to extreme depths.   This too invalidates performance comparisons.

Don't consider SQLIOSim for performance benchmarking, instead use SQLIO Disk Subsystem Benchmark Tool: http://www.microsoft.com/downloads/details.aspx?familyid=9A8B005B-84E4-4F24-8D65-CB53442D9E19&displaylang=en 

Bob Dorr
Senior SQL Server Escalation Engineer

Posted by psssql | 2 Comments
Filed under: ,

How It Works: Non-Yielding Resource Monitor

I have addressed similar conditions error in a white paper: www.microsoft.com/technet/prodtechnol/sql/2005/diagandcorrecterrs.mspx

2008-01-20 19:01:26.11 Server      ***Stack Dump being sent to K:\MSSQL.3\MSSQL\LOG\SQLDump0001.txt

2008-01-20 19:01:26.11 Server      *

2008-01-20 19:01:26.11 Server      * BEGIN STACK DUMP:

2008-01-20 19:01:26.11 Server      *   01/20/08 19:01:26 spid 0

2008-01-20 19:01:26.11 Server      *

2008-01-20 19:01:26.11 Server      * Non-yielding Resource Monitor


SQL Server 2005 added the additional Resource Monitor progress check along with the 1788* series (17883, 17884, 17887, 17888) of checks.   Resource Monitor sets a start cache notification time when it begins processing actions against a specific cache.   When the cache processing is completed the time is cleared.

SchedulerMonitor executes (every 5 seconds) and checks to see if the Resource Monitor is processing a cache and for how long.   If the Resource Monitor has not returned to an idle state or advanced past a single cache in 60 seconds the message is logged and mini-dump is captured.

SQL Server 2005 can log this message falsely when cleaning up a large amount cache entries.   The most frequently seen false report by CSS is after a query with a huge lock list completes.   The lock lists are populated with many free lock resources that resource monitor can attempt to release.   If the work to release the entries takes longer than 60 seconds the message is produced but there is no real problem for the SQL Server.

Using the mini-dump and the Windows Debuggers with the public symbols, as outlined in the before mentioned white paper find the ResourceMonitor thread.   Here is an example of it maintaining a large lock free list.

...

sqlservr!commondelete+0x850

sqlservr!LockManager::Destroy+0x50

sqlservr!LockOwner::Destroy+0x20

sqlservr!SOS_ObjectStore::ObjectSubStore::DestroyObjects+0x260

sqlservr!SOS_ObjectStore::Notify+0xc0

ResourceMonitor::NotifyMemoryConsumers+0x580

ResourceMonitor::ResourceMonitorTask+0x220

...

Bug: There is a known bug that can cause the Resource Monitor to deadlock when processing entries associated with the security cache.   This has been corrected in SQL Server 2005 Service Pack 1.

In many instances this message is a warning of a negative influence from external factors.   The Resource Monitor runs on a hidden scheduler so it has limited impact on active connections.   If you can't determine the problem using the public symbols make sure you capture a full set of performance monitor counters during a problem occurrence and look closely at memory usage on the computer.

Bob Dorr
Senior SQL Server Escalation Engineer

How It Works: SQL Server 2005 DBCC Shrink* May Take Longer Than SQL Server 2000

SQL Server 2005 adds additional shrink logic to compress TEXT/IMAGE data, referred to as LOBs.     The dbcc shrink* code uses the same underlying methods that ALTER INDEX .... WITH (LOB_COMPACTION ON) uses to compact the LOB space in the database files.  

There are various stages of dbcc shrink* activity.   These stages are indicated in the sys.dm_exec_requests command column along with the progress indications.

 

Step

Command

Description

1

DbccSpaceReclaim

Clean up deferred allocations and purge empty extents preparing for data moves.  

2

DbccFilesCompact

Moves pages beyond the target to before the target and truncate file as required.

3

DbccLOBCompact

Compacting the LOB data.

 

A target extent is calculated based on the information supplied to the shrink* command.   If you have a file that is 1GB in size and you specify a target of 75%.   The target extent start at 750MB.   

image

 

Shrink  Steps

·         Cleans up space that is available in the entire file, such as deferred allocations.  This helps expose space in the lower portion of the file.