Sql Stuff

Ramblings of all kinds of Sql related tasks - usually related to customer questions, problems, etc. (that's what I do and all)

  • Non-Unique Clustered Index and Duplicate Value limits...

    I spent this past week with a customer giving a workshop covering Sql 2000 architecture and performance related information (interpreting query plans, index structures, etc., etc.), and during the course of the week a few good points came to light that I decided to blog about.  This first one has to do with the creation of non-unique clustered indexes...

    Many of you probably already realize that if you don't explicitly create a unique clustered index in Sql Server, the engine will unique-ify the clustered index for you anyhow, since it's used for pointers to data in nonclustered indexes. The engine will append a 4-byte value (when necessary) to any non-unique cluster key value as it's inserted into the cluster to unique-ify the key value within the cluster...there are a variety of places that cover this, including Books Online, and this isn't the topic of my blog post, but it's the basis for it, so I needed to ensure we covered that.

    Now, in the workshop I gave this last week, I mentioned how I had once upon a time tried to produce a situation where a Duplicate Key error is generated on a non-unique clustered index by simply creating a table with a single column, then clustering the table on this column (in a non-unique fashion), and then inserting a whole-lot-o-records with the same value. I mentioned in the workshop that I was unsuccessful at producing this type of situation, and attributed it to the fact that the 4-byte value was probably allowed to make use of any character, not just numeric values, which would increase the amount of duplicate key values that could be sustained on a single cluster exponentially compared to the case where the value could only be numeric data. Well, if any of you guys from the workshop are reading along, I WAS WRONG!!!!  You can in fact produce this type of scenario, and it's even easier than I originally thought before mistakenly thinking the 4-byte value could include characters in addition to numeric values...

    I started thinking about it more on the flight back from the customer, and then pulled up the script I used originally to try and produce the situation, and found that I had a logic error in my data insertion (nice one huh?)...in addition to this, I noticed on an internal DL that someone mentioned that the 4-byte value that is used to unique-ify a nonunique cluster key is in fact integer based, and doesn't include character data.

    Well, this should make it easier to reproduce then, assuming I can get the script right anyhow :-). After I produced the situtation, I was surprised to find that not only does the uniqueifier contain integer data only, but only positive integer data at that...so, given that we know a 4-byte integer can only contain values between -2^31 and 2^31-1 (-2,147,483,648 thru 2,147,483,647), which simple math tells us results in about 4,294,967,295 distinct values. I figured originally that I'd be able to produce a Duplicate Key error as soon as I tried inserting more than 4,294,967,295 records with the same key value - however, the Duplicate Key error is actually produced once you try and insert the 2,147,483,648th record, which indicates that the unique-ifier is actually made up of postive 4-byte integer values only...intersting huh?

    So, to produce the scenario I mention is pretty simple...just create yourself a table with a single column, then create a nonunique clustered index on the column, then start inserting data into the table using the same value continuously over and over and over and over and over until you get to just about 2,147,483,647 records...then, as soon as you try inserting the next one, you'll get an error as follows:

       Cannot insert duplicate key row in object 'dbo.ztblDupClusterKeyTest' with unique index 'ix_ztblDupClusterKeyTest_clust'.

    So, if you ever plan to have a scenario where you are deploying an application that will make use of a clustered table that contains more than 2,147,483,647 records per distinct cluster key value, you'll have to figure out an alternative design...this could include any of the following:

       1) Add additional keys to the cluster key
       2) Physically partition the data into seperate tables
      
          NOTE: Using partitioning with Sql 2005 wouldn't solve this problem, since the cluster is table-wide
         
       3) Use a heap
      
    Of course, this really isn't that much of a problem, since you probably won't see this scenario due to a variety of reasons, and not to mention that if it was a problem, we'd have known about it long ago from customers :-). But, nonetheless, something interesting...

    I've included the script I used to produce this situation below...a single run of the entire test took about 2.75 hours on my machine...by the time I was producing the error, my test table had 2,147,483,647 records and was about 45,000 MB in total size...

    On a side note, if anyone out there is working with record counts where you think this may be an issue, I'd be curious to hear from you...

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.


    ----------------------------------------------------------------------
    ------------------ CODE ONLY BELOW ------------------
    ----------------------------------------------------------------------

    -- 292,754,944 million rows to start with on my machine...
    select 1 as clustId
    into dbo.ztblDupClusterKeyTest
    from sys.columns a
    cross join sys.columns b
    cross join sys.columns c;
    go

    -- Cluster now...
    create clustered index ix_ztblDupClusterKeyTest_clust on dbo.ztblDupClusterKeyTest (clustId);
    go

    set nocount on;
    go

    -- Ok...now keep adding until we break...
    declare @i int;
    while 1=1 begin

     begin try
      -- about 10.17 million rows on my machine per iteration...
      insert dbo.ztblDupClusterKeyTest (clustId)
      select 1
      from sys.columns a
      cross join sys.columns b
      cross join (select top 23 column_id from sys.columns) c;
     end try
     begin catch
      print 'ERROR: ' + error_message();
      break;
     end catch

     -- Just to introduce a bit of a delay between iterations...
     checkpoint;
    end

  • Tracking index maintanence information...

    Similar to my post here (http://blogs.msdn.com/chadboyd/archive/2007/03/11/what-indexes-are-used-most-often-or-least-often-on-my-server.aspx) regarding index usage, you can use the following code below to get index operational information (exposed via the sys.dm_db_index_operational_stats dynamic management function in Sql 2005).  You can include table filters, specify ordering operations, and specify specific row numbers...correlate this information with other data from the usage data, missing index information, etc., and you've got quite a bit of insight into what parts of your system are working efficiently, those that aren't, those that can be improved, possibly removed, etc., etc.

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

    ----------------------------------------------------------------------
    ------------------ CODE ONLY BELOW ------------------
    ----------------------------------------------------------------------

    if object_id('dbo.fn_indexColumnList') > 0
     drop function dbo.fn_indexColumnList
    go

    create function dbo.fn_indexColumnList(@objectId int, @indexId int)
    returns nvarchar(max)
    as
    /*
    Returns a text-based list of column names, in key order, for the object/index combination passed
    */
    begin
     declare @colList nvarchar(max);
     set @colList = N'';

     -- First, get just the key columns...
     select @colList = @colList + case when len(@colList) > 0 then ',' else '' end + c.name
     from sys.index_columns ic with(nolock)
     join sys.columns c
     on  ic.object_id = c.object_id
     and  ic.column_id = c.column_id
     where  ic.object_id = @objectId
     and  ic.index_id = @indexId
     and  ic.key_ordinal > 0
     and  ic.is_included_column = 0
     order by ic.key_ordinal;
     
     -- Now append any included columns...
     if exists(select * from sys.index_columns where object_id = @objectId and index_id = @indexId and is_included_column > 0) begin
      set @colList = @colList + ' (';
      
      select @colList = @colList + c.name + ','
      from sys.index_columns ic with(nolock)
      join sys.columns c
      on  ic.object_id = c.object_id
      and  ic.column_id = c.column_id
      where  ic.object_id = @objectId
      and  ic.index_id = @indexId
      and  ic.is_included_column > 0;
      
      set @colList = @colList + '$$^^$$';
      set @colList = replace(@colList,',$$^^$$',')');
     end

    return @colList;
    end

    use master
    go

    if ((object_id('sp_indexOperationalInfo') is not null) and (objectproperty(object_id('sp_indexOperationalInfo'), 'IsProcedure') = 1))
     drop proc [dbo].sp_indexOperationalInfo
    go

    create proc [dbo].sp_indexOperationalInfo
     @tableName nvarchar(255) = null, -- Name of a specific table/view/object to retrieve index information for - if null/default/0,
              -- no specific table filter is used
     @rowcount int = null,    -- Value to limit the result set to (top x) - if not passed, all data is returned
     @order  nvarchar(100) = null -- sort to use for the batch request - can be a valid specification of columns in the query
             
    as

    /*

    NOTE: Use of this procedure requires the existence of the following procedures/functions as well:
     1.  dbo.fn_indexColumnList()


    exec dbo.sp_indexOperationalInfo '', 10, 's.row_lock_count'

    */

    set nocount on;
    set transaction isolation level read uncommitted;

    declare @sql   nvarchar(max),
      @databaseId  int,
      @objectId  int;

    -- Format incoming data
    select @sql = N'',
      @databaseId = db_id(),
      @rowcount = case when @rowcount > 0 then @rowcount else 0 end,
      @objectId = object_id(@tableName);

    if (charindex(' ', replace(@order,' desc','')) + charindex(char(9), @order) + charindex(char(13), @order) + charindex(char(10), @order) + charindex('-', @order) + charindex('/', @order)) > 0 begin
     raiserror ('ta ta ta...you can only use valid column names for ordering the list...', 17, 1);
     return;
    end

    select @sql = N'select ' + case when @rowcount > 0 then ' top (@rowcount) ' else '' end + '
          object_name(i.object_id) as objectName, isnull(i.name,''HEAP'') as indexName, i.type_desc as indexType,
          case when i.type_desc = ''HEAP'' then ''HEAP'' else dbo.fn_indexColumnList(i.object_id, i.index_id) end as columnList,
          sizeData.rowCnt as rowCnt, sizeData.totalSpaceMB as totalSpaceMB,
          partition_number as partitionNumber, leaf_insert_count as leafInsertCount, leaf_delete_count as leafDeleteCount,
          leaf_update_count as leafUpdateCount, nonleaf_insert_count as nonleafInsertCount, nonleaf_delete_count as nonleafDeleteCount,
          nonleaf_update_count as nonleafUpdateCount, leaf_allocation_count as leafAllocCount, nonleaf_allocation_count as nonleafAllocCount,
          leaf_page_merge_count as leafPageMergeCount, nonleaf_page_merge_count as nonleafPageMergeCount, range_scan_count as rangeScanCount,
          singleton_lookup_count as singletonLookupCount, lob_fetch_in_pages as lobFetchPages, lob_fetch_in_bytes as lobFetchBytes,
          lob_orphan_create_count as lobOrphanCreateCount, lob_orphan_insert_count as lobOrphanInsertCount,
          row_overflow_fetch_in_pages as rowOverflowFetchPages, row_overflow_fetch_in_bytes as rowOverflowFetchBytes,
          row_lock_count as rowLockCount, row_lock_wait_count as rowLockWaitCount, row_lock_wait_in_ms as rowLockWaitMS,
          page_lock_count as pageLockCount, page_lock_wait_count as pageLockWaitCount, page_lock_wait_in_ms as pageLockWaitMS,
          index_lock_promotion_attempt_count as indexLockPromotionAttemptCount, index_lock_promotion_count as indexLockPromotionCount,
          page_latch_wait_count as pageLatchWaitCount, page_latch_wait_in_ms as pageLatchWaitMS, page_io_latch_wait_count as pageIoLatchWaitCount,
          page_io_latch_wait_in_ms as pageIoLatchWaitMS
        from sys.indexes i
        join (
           select i.object_id as objectId, i.index_id as indexId,
             (sum(a.total_pages) * 8) / 1024 as totalSpaceMB,
             (select sum(p2.rows) from sys.partitions p2 with(nolock) where p2.object_id = i.object_id and p2.index_id = i.index_id) as rowCnt
           from sys.indexes i
           join sys.partitions p
           on  i.object_id = p.object_id
           and  i.index_id = p.index_id
           join sys.allocation_units a
           on  p.partition_id = a.container_id ' +
           case when len(@tableName) > 0 then ' where i.object_id = @objectId ' else '' end +
           'group by i.object_id, i.index_id, i.name
          ) sizeData
        on  i.object_id = sizeData.objectId
        and  i.index_id = sizeData.indexId
        left join sys.dm_db_index_operational_stats(@databaseId,' + case when len(@tableName) > 0 then '@objectId,' else 'default,' end + 'default,default) s
        on  i.object_id = s.object_id
        and  i.index_id = s.index_id
        and  s.database_id = @databaseId ' +
        case when len(@tableName) > 0 then ' where i.object_id = @objectId ' else '' end +
        case when len(@order) > 0 then 'order by ' + @order else '' end;
        

    exec sp_executesql @sql, N'@objectId nvarchar(255), @databaseId int, @rowcount int', @objectId, @databaseId, @rowcount;
    go

  • CMAP CodeCamp in Baltimore Area

    Well, I'm now also presenting at the CMAP Code Camp the week following the NOVA CodeCamp, so, if you're in the Baltimore/DC area (or even NOVA and can't make the NOVA CodeCamp the week before), I'll be presenting at least the same 2 sessions at the CMAP CodeCamp on April 21st at the UMBC Tech Center in Catonsville, MD and possibly giving a Chalk Talk or 2... 

    Here's the link to the CMAP site where you can register and get information on the group:

    http://www.cmap-online.org/

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

     

  • NOVA Code Camp in DC

    If you're in the DC/NOVA area (or even some parts of Maryland for that matter), I'll be presenting 2 sessions at the NOVA CodeCamp on April 14th at the Microsoft offices in Reston, VA.  My presentations will cover new development features in Sql Server 2005 including the CLR, WebServices, T-SQL enhancements, XML, Query Notifications, and ADO.NET enhancements with Sql 2005 (I think that's it anyhow).  If interested in attending, here's a link to the registration page for more information:

    https://www.clicktoattend.com/invitation.aspx?code=116036

    Code camps are great (IMHO) in that we don't spend a lot of time with fancy, pretty presentations, but instead just start digging in to code demo's, samples, questions, etc. on how to use this stuff.

    Stop by and say hi...

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

  • PAE and /3GB and AWE oh my...

    Many customers (and non-customers) are often confused about all the different memory configurations and options available on 32bit systems (64bit systems do not have so many considerations thanks to the large flat memory space and a VAS size of 16 terabytes as of my typing this)...so, here's an attempt to address each of these a bit and simplify everyone's understanding of them.

    Before we get started, much of what is going to be discussed is vastly over-simplified for sake of brevity and sanity, as trying to explain everything about everything would make for a very, very, very, very long blog posting...this one will be long enough trying to keep things as simplified as possible without leaving out any important parts (which I'm sure will still be missed somewhere, so by all means post a comment if something is missing or unclear)...

    First, a bit of a 32bit memory architecture primer for Windows:

       Windows implements a virtual memory system, which means the OS virtualizes all access to physical memory by
       user-mode processes (Sql Server is a user-mode process, as is all commodity software that runs on windows with
       the exception of things like device drivers, filter drivers, etc.) that run on the system and make use of standard
       memory allocation API's. A kernel mode process referred to as the "Virtual Memory Manager" is the sub-system
       responsible for handling direct memory access for requests to memory, translating virtual memory addresses from an
       application into physical addresses, and handling paging operations (moving data to/from physical RAM to disk, i.e .
       the page/swap file) as necessary based on many different considerations. By default, the 32bit OS can only 'see' and
       use up to 4gb of memory, as it uses a 32bit range to map physical memory space (with a valid range of 0x00000000 up
       to 0xFFFFFFFF, or 0 thru 4,294,967,295 in decimal format, which correlates to a maximum high range pointer of 4GB
       (divide 4294967296 slots (don't forget the 0 slot counts) by 1073741824 (the # of bytes per GB)).  Yes, this means
       that on a default 32bit system no matter how much physical memory you install on the system, the system would only
       ever use at most 4gb of it (kind of a waste if you have a 32bit server with 64GB of memory, huh?)...we'll get into
       how you can make use of more than 4GB on the system later.

       When a process is created on a 32bit Windows system, the OS allocates what is referred to as VAS (Virtual Address
       Space) to that process, and all user-mode processes have their own VAS. The processes' VAS is basically what that
       process can 'see' as memory - for all intents and purposes, the process thinks that this is a map of physical memory
       that it has access to - and this is ALL the memory it thinks it has access to (note that I didn't say that this is
       the only physical memory it can potentially access...more on that later...).  A processes' VAS in 32bit windows is 4GB
       in size, and a VAS in 32bit Windows is typically divided into 2 primary sections: 1 section that is available to the
       process, and 1 section reserved for the system/kernel.  In Windows, this range of memory is split up into the 2 following
       ranges by default:

          Low 2GB Range (0x00000000 through 0x7FFFFFFF) - available to the user-mode process
          High 2GB Range (0x80000000 through 0xFFFFFFFF) - reserved for the kernel/system
     
       So, though a VAS is 4GB in size, a user-mode process by default only has access to up to 2GB of that space - yes,
       this means that by default, a 32bit user-mode process can only ever access 2GB of memory at the most using standard API's.

       This is where a question usually arises - "If every user-mode process gets 4GB of VAS, and I have a machine with only
       2GB of memory, how in the world can I run so many applications concurrently on this machine????".  Good question - it all
       comes back to the fact that Windows implements a virtual memory system, and is also where the page/swap file comes into
       play. First off, just because a user-mode application is given 2GB of VAS space to use doesn't mean that amount of space
       has to be physically backed by memory/storage - this gets into the difference between reserving and committing space,
       which I'm not going to discuss here, but most OTS applications on a standard user computer use memory in the 10's of
       megabytes, not in the hundreds of megabytes or even gigabytes range. Second, if there isn't enough physical memory space
       available for the OS to store all the committed memory for all applications running on the machine, this is where the
       page/swap file comes in - the OS pushes data that is currently not in use by some application to the page file to make
       room in physical memory for currently needed data.  If data that has been pushed to the page file is later requested to
       be read by a given application, the VMM pulls the data back from the page file into physical memory, possibly causing
       other data in physical memory to be pushed out to the page file, updates page entries, and then grants the read request.
       This process of moving data to/from the page file in/out of physical memory is referred to as page faulting, of which
       there are many different types (hard, soft, etc.).

    Ok, so that was lots of fun, yes?  Now, let's try and tackle the remaining options in a logical order - I'm going to start with the /3GB switch (and it's related cousin the /USERVA switch).

       The /3GB and /USERVA switches are specified within the boot.ini file, and to take effect require a restart of the system.
       The /3GB and /USERVA switches affect the sizes of the 2 sections of the VAS as mentioned above - by default these 2
       sections are each 2GB in size each, the low range of which is used by the user-mode process, the high range used by
       the kernel system. Again, this means that a user-mode process (such as Sql Server) has the ability to directly
       address only up to 2GB of memory. The /3GB switch changes the split of the VAS from 2GB for each the user and
       system process to be instead 3GB for user processes and leaving only 1GB for the kernel/system processes. The /USERVA
       switch does the same exact thing, however you define the split by specifying a value between 2048 and 3096, which in
       effect becomes the size of the user-mode addressable portion of the VAS (for example, if you specify /USERVA 2560, you'll
       end up with a VAS split of 2.5 GB for the user-mode process and 1.5 GB for the system/kernel). Use of the /3GB and
       /USERVA switches does NOT depend on using /PAE or AWE - you do NOT need /PAE enabled to use these switches, and you
       don't need AWE enabled either - these options are mutually exclusive. You can use either of these switches on servers
       with <4GB of physical memory and still affect the addressable memory space for user mode applications and the kernel.
      
       So, using /3GB effectively changes the partition split for the VAS from what is mentioned above to the following:
      
          Low 3GB Range (0x00000000 through 0xBFFFFFFF) - available to the user-mode process
          High 1GB Range (0xC0000000 through 0xFFFFFFFF) - reserved for the kernel/system
         
       Using the /3GB switch does have some impact to be aware of - one important side-effect being that at most a server with
       /3GB enabled will be able to use 16GB of physical memory in total (due to the limited 1GB of kernel space limiting
       space to be able to store internal memory mapping structures). Additionally, an application must be compiled and linked
       using a special switch to take advantage of the increased addressable memory space. For more information on the /3GB
       switch and related requirements, see the following articles:
      
          Available switch options for the Windows XP and the Windows Server 2003 Boot.ini files
          http://support.microsoft.com/kb/833721
         
          Information on Application Use of 4GT RAM Tuning
          http://support.microsoft.com/kb/171793/

    Great, more fun.  Ok, next we'll discuss PAE -

       Similar to the /3GB and /USERVA switches, /PAE is an OS level boot.ini switch/option, not a Sql Server technology of
       any kind. PAE in a nutshell allows the OS to see more than 4GB of physical memory - remember above in the memory
       architecture overview how I mentioned that a 32bit Windows system by default would NEVER be able to see more than 4GB
       of memory due to the limitation of a 32bit pointer size? Well, PAE is the mechanism by which 32bit Windows systems can
       'see' and address above 4GB of physical memory - this is achieved by system level changes that allow mapping of 32bit
       pointers through to an equivalent 36bit physical memory location (allowing up to 64GB of physical memory, 2^36). One
       such system level change is the addition of an extra level in the MMU called the page directory pointer table. Adding
       the PAE switch has NO effect whatsoever on the VAS size in a 32bit system - it remains at 4GB in total, and the sizes of
       the 2 portions of the VAS remain exactly the same also (2GB each by default, unless /3GB or /USERVA is used as mentioned
       above). User-mode applications do not need to do anything special to take advantage of the increased visible memory
       available to the OS, because due to the fact that memory management is virtualized anyhow, nothing changes from the
       application perspective. If nothing else was changed on a given system, what enabling PAE would in effect do for a system
       (assuming >4GB of memory exists before and after the change that is) would be to allow the OS more physical memory space
       for storing data instead of having to page this data to disk (i.e. to the page/swap file). And again, similar to /3GB, PAE
       is NOT dependent on any of the other options (3GB, USERVA, AWE, etc.) - although they can be used together by all means
       with certain distinctions (like for example those mentioned above with /3GB).
      
    Ok, and finally, how does AWE fit in here -

       AWE is a bit different from /3GB and PAE in multiple respects. First, AWE is simply a mechanism to access/map memory
       from outside a processes' VAS into it's VAS and vice versa - it is not a type of memory, it is not an extension of
       memory, and it doesn't change any memory structures (this is a bit over-simplified, but again, not going to go much
       deeper than that here). Many people also think AWE is a Sql Server technology - it is not...in fact, it is a
       technology that Sql Server uses, just as many other applications can and do. Second, it is specific to a single
       process, it is not a shared/system wide change (multiple processes can use the AWE mechanism, but they do so
       independently of one another if they do so).
      
       So, what is AWE - AWE is basically a set of system API's that allow a process to access memory outside/larger than
       it's VAS; however, using AWE isn't direct memory access - AWE is a windowing technology by which a process basically
       uses physical memory outside it's VAS for storage of data, however to access this data, it has to map that data into
       it's VAS before accessing it - this is what the AWE API's do, they provide an interface by which a process can reserve
       memory outside it's VAS for storage, however to actually 'touch' the data stored in these memory locations, the process
       has to map that data into it's VAS to read it. For all intents and purposes, you can think of using AWE memory
       management API's as very similar to what happens in the kernel level memory manager when a soft-page fault occurs. Using
       the AWE mechanisms allow a process to allocate physical memory, then using these same API's allocate space within the
       processes' VAS to map to these physical memory locations. So although AWE allows you to access/use memory outside
       the processes' VAS, the process is still confined by the VAS in terms of being able to read/write data into memory
       mapped by AWE mechanisms.
      
    Phew...there, we made it through. Now, how does this all affect Sql Server? Well, I'm glad you asked, cause I'm not sick of typing yet at all...

    /3GB can benefit Sql Server in many scenarios where you are incurring VAS memory pressure. There are a couple different types of memory pressure (physical memory pressure and VAS memory pressure are the 2 biggest I guess), and you'll typically encounter VAS memory pressure in workloads that generate lots of hashing/sorting operations and/or very heavy hashing/sorting operations (Data Warehouse workloads for example). You can also encounter VAS memory pressure in scenarios where plan cache miss-use is occurring (perhaps do to ad-hoc or poor T-Sql), servers with large numbers of user connections, or cursor-based operations are occurring frequently. You also typically see this type of pressure more pronounced on servers with large data-caches that are using PAE/AWE mechanisms - the reason for this is simple: the only memory consumer in any version of Sql Server that can make use of AWE-mapped memory is the data cache. No other memory consumer can make use of AWE mapped memory, including your plan cache, connection cache, etc. The reason this becomes more profound on these large-cache systems is typically because the data sizes are larger in addition to the things mentioned above.  There are many other considerations to take into account before simply enabling a system with /3GB or /USERVA, this is just a start...here are some VERY GENERAL pointers you can use as guides for determining if /3GB or /USERVA will be useful in your environment, and also some things to look out for (again, by no means an end-all, be-all list of things to consider):

        1) Using /3GB limits the total physical memory that can used on a server to 16GB - this is a side-effect of/due to the
        decreased VAS space for the system/kernel, which is where memory-mapping structures are stored by the VMM. Therefore, if
        you plan to/want to use more than 16GB of memory on your server, you can't use /3GB.
       
        2) Though 16GB is a hard upper limit, most workloads will actually show decreased throughput on systems with 12GB of
        memory, and many on systems/workloads with as low as 8GB of memory. The use of /3GB is an advanced configuration that
        should be tested, understood, and documented within your environment/workload prior to implementing to ensure increased
        throughput.
       
        3) If /3GB or /USERVA is used, monitor "Memory:Free System Page Table Entries" perfmon counter closely - PTE's are one of
        the memory management structures used by the VMM to manage virtual-physical memory mappings, and typically requires an
        absolute minimum of 7,000 free entries for efficient system operation. Systems operating with insufficient Free PTE's are
        prone to major problems such as STOP errors and blue-screens. This is only one of the considerations for monitoring a system
        using /3GB - the kernel is responsible for many, many critical operations on a system. Typically if the kernel is starved
        for memory as a whole, some of the first systems to incur side-effects include network operations (dropped/lost packets) and
        general IO operations (network IO, storage IO).
       
        4) /3GB or /USERVA should only be used if you can properly determine that your workload is VAS-pressured (and none of the
        other considerations preclude the use as well) - if your workload is not under VAS pressure, or if the VAS is not your
        bottleneck, you will probably see minimal (if any) increased throughput.
       
    PAE can benefit Sql Server in obvious ways - if the OS can see more memory, than so can Sql Server (both directly via AWE mechanisms, and also indirectly via use of additional physical memory by the OS for multiple system processes). By far, the largest benefit here is for larger Sql systems that can benefit from additional memory space via AWE mechanisms.

    Similarly, AWE can benefit Sql Server in obvious ways - the more memory the server has to store data in (even if it isn't 'direct' memory access in a sense, it is still infinitely faster than disk-based access) the better things will typically operate. There are also additional considerations to take into account when working with a Sql Server system that implements AWE (especially with Sql 2000 and/or clustered systems), so ensure you have a good understanding of those prior to sprinting down this path as well.

    Finally, thanks goes out to many of my colleagues for assistance with reviewing/recommending/adding/collaborating on content to this particular post:

     Dwayne Lanclos
     Uttam Parui
     Ernie DeVore
     Pam Lahoud
     David Sheaffer
     Shen Xu
     Curtis Krumel

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

  • What indexes are used most often (or least often) on my server?

    A customer this past week was curious how to determine which indexes on their server were being used either very frequently or very infrequently. This type of information can be invaluable when tuning a server on both ends: finding the most frequently used structures can aid in tracking down hotspots in your system for update-based operations and also provide some guidance on where to target your performance tuning efforts; finding the least frequently used structures can help with finding unnecessary indexes or indexes which are used sometimes but the maintenance overhead associated with keeping them updated outweighs their benefit.

    With Sql 2000, determining this type of information was extremely difficult to say the least - it would have required capturing an extensive server trace and applying some very intense analytics to the captured data. With Sql 2005, this type of information is infinitely easier to find using the DMV's provided with the engine. I'm not going to go into a deep discussion on the DMV's themselves, as they've been talked about quite extensively elsewhere - I will however provide you some initial scripts that gather this type of information in a variety of ways.

    The new sys.dm_db_index_usage_stats DMV provides you with information on usage characteristics for indexes within your system including the number of user-based and system-based seeks, scans, lookups, and update operations executed against each index. User based counters are incremented due to user operations (like running a query) and system based counters are incremented due to system level operations (like stat updates). Seeks, scan, and lookups occur to satisfy read-based operations, whereas updates occur to satisfy write-based operations (insert/update/deletes).

    So, with this information, what types of things should I look for?  Good question, here's some thoughts:

        -  Sorting on user seeks, scans, and/or lookups (or a combination thereof) will provide you a list of the most heavily used indexes to satisfy user queries - this shows you generally the 'hottest' objects in your system (i.e. those that are queried most frequently).
       
        -  Sorting on user lookups will provide you a list of the cluster/heap structures that are used most frequently to satisfy bookmark-lookup type operations for additional data from a non-clustered index seek/scan. Combined with correlating this information against non-clustered index structures for the same object that have high user seek values can give you some insight into the non-clustered indexes that are probably used most frequently to satisfy user requests without enough data to 'cover' the request.
       
        -  Sorting on update operations will give you a list of indexes that are most frequently updated by user/system operations (insert/update/delete) - correlating high update operations with low seek/scan/lookup operations will give you clues as to indexes that are probably costing more to maintain than are useful for improved read-speed.
       
    The information contained in the DMV is not persisted across service starts/stops, so if you want to keep information from this (and other) DMV's, be sure to persist it elsewhere by capturing the data and storing in a table of some sort.

    Ok, so how can you query this data?  Well, here's a sample stored procedure that I use for querying against it...in addition to the information from the DMV itself, this procedure gives you the ability to sort in multiple ways, size information for each index (row counts, storage space), and column lists for each index (using a helper function also included). I'll let you play around with it if you like, or by all means just use it as a guide to come up with something of your own that suites you or your organization better...enjoy...

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

     

    ----------------------------------------------------------------------
    ------------------ CODE ONLY BELOW ------------------
    ----------------------------------------------------------------------

    if object_id('dbo.fn_indexColumnList') > 0
     drop function dbo.fn_indexColumnList
    go

    create function dbo.fn_indexColumnList(@objectId int, @indexId int)
    returns nvarchar(max)
    as
    /*
    Returns a text-based list of column names, in key order, for the object/index combination passed
    */
    begin
     declare @colList nvarchar(max);
     set @colList = N'';

     -- First, get just the key columns...
     select @colList = @colList + case when len(@colList) > 0 then ',' else '' end + c.name
     from sys.index_columns ic with(nolock)
     join sys.columns c
     on  ic.object_id = c.object_id
     and  ic.column_id = c.column_id
     where  ic.object_id = @objectId
     and  ic.index_id = @indexId
     and  ic.key_ordinal > 0
     and  ic.is_included_column = 0
     order by ic.key_ordinal;
     
     -- Now append any included columns...
     if exists(select * from sys.index_columns where object_id = @objectId and index_id = @indexId and is_included_column > 0) begin
      set @colList = @colList + ' (';
      
      select @colList = @colList + c.name + ','
      from sys.index_columns ic with(nolock)
      join sys.columns c
      on  ic.object_id = c.object_id
      and  ic.column_id = c.column_id
      where  ic.object_id = @objectId
      and  ic.index_id = @indexId
      and  ic.is_included_column > 0;
      
      set @colList = @colList + '$$^^$$';
      set @colList = replace(@colList,',$$^^$$',')');
     end

    return @colList;
    end

    use master
    go

    if ((object_id('sp_indexUsageInfo') is not null) and (objectproperty(object_id('sp_indexUsageInfo'), 'IsProcedure') = 1))
     drop proc [dbo].sp_indexUsageInfo
    go

    create proc [dbo].sp_indexUsageInfo
     @tableName nvarchar(255) = null, -- Name of a specific table/view/object to retrieve index usage information for - if null/default/0, no specific table filter is used
     @rowcount int = null,    -- Value to limit the result set to (top x) - if not passed, all data is returned
     @opts  int = 0     -- Bit flags that indicate what to return for the procedure call:
              -- 1 bit - if set, order of results will be descending instead of ascending (by default, we order showing least used first up to most used,
              --   if this bit is set, we do the opposite instead...
              -- 2 bit - if set, whatever sort order is used will be applied to system values instead of user values
    as

    /*

    NOTE: Use of this procedure requires the existence of the following procedures/functions as well:
     1.  dbo.fn_indexColumnList()

    -- Show all structures for the current db, in order of least user-based uses and then largest rowCount
    exec dbo.sp_indexUsageInfo default, default, default
    -- Only top 25
    exec dbo.sp_indexUsageInfo default, 25, default

    -- Show all structures for the current db, in order of most user-based uses and then largest rowCount
    exec dbo.sp_indexUsageInfo default, default, 1

    -- Show all structures for the current db, in order of least system-based uses and then largest rowCount
    exec dbo.sp_indexUsageInfo default, default, 2

    -- Show all structures for the current db, in order of most system-based uses and then largest rowCount
    exec dbo.sp_indexUsageInfo default, default, 3

    -- Show all structures for the current db, for a single tabled called 'tblBatches', in order of least user-based uses and then largest rowCount
    exec dbo.sp_indexUsageInfo 'tblBatches', default, default

    -- Show all structures for the current db, for a single tabled called 'tblBatches', in order of most user-based uses and then largest rowCount
    exec dbo.sp_indexUsageInfo 'tblBatches', default, 1
    */

    set nocount on;
    set transaction isolation level read uncommitted;

    declare @sql   nvarchar(max),
      @databaseId  int,
      @order   nvarchar(1000);

    -- Format incoming data
    select @opts = case when @opts > 0 then @opts else 0 end,
      @sql = N'',
      @databaseId = db_id(),
      @rowcount = case when @rowcount > 0 then @rowcount else 0 end;

    select @order = case
         when @opts & 3 = 3 then '(u.system_seeks + u.system_scans + u.system_lookups) desc, sizeData.rowCnt desc'
         when @opts & 3 = 2 then '(u.system_seeks + u.system_scans + u.system_lookups), sizeData.rowCnt desc'
         when @opts & 3 = 1 then '(u.user_seeks + u.user_scans + u.user_lookups) desc, sizeData.rowCnt desc'
         when @opts & 3 = 0 then '(u.user_seeks + u.user_scans + u.user_lookups), sizeData.rowCnt desc'
        end;
        
    select @sql = @sql + N'
     select ' + case when @rowcount > 0 then ' top (@rowcount) ' else '' end +
     '  object_name(i.object_id) as objectName, isnull(i.name,''HEAP'') as indexName, i.type_desc as indexType,
       case when i.type_desc = ''HEAP'' then ''HEAP'' else dbo.fn_indexColumnList(i.object_id, i.index_id) end as columnList,
       sizeData.rowCnt as rowCnt, sizeData.totalSpaceMB as totalSpaceMB, sizeData.usedSpaceMB as usedSpaceMB,
       u.user_seeks as userSeeks, u.user_scans as userScans, u.user_lookups as userLookups, u.user_updates as userUpdates,
       u.system_seeks as sysSeeks, u.system_scans as sysScans, u.system_lookups as sysLookups, u.system_updates as sysUpdates,
       datediff(minute,u.last_user_seek,getdate()) as minutesSinceLastUserSeek, datediff(minute,u.last_user_scan,getdate()) as minutesSinceLastUserScan,
       datediff(minute,u.last_user_lookup,getdate()) as minutesSinceLastUserLookup, datediff(minute,u.last_user_update,getdate()) as minutesSinceLastUserUpdate,
       datediff(minute,u.last_system_seek,getdate()) as minutesSinceLastSystemSeek, datediff(minute,u.last_system_scan,getdate()) as minutesSinceLastSystemScan,
       datediff(minute,u.last_system_lookup,getdate()) as minutesSinceLastSystemLookup, datediff(minute,u.last_system_update,getdate()) as minutesSinceLastSystemUpdate
     from sys.indexes i
     join (
        select i.object_id as objectId, i.index_id as indexId,
          (sum(a.total_pages) * 8) / 1024 as totalSpaceMB, (sum(a.used_pages) * 8) / 1024 as usedSpaceMB,
          (select sum(p2.rows) from sys.partitions p2 with(nolock) where p2.object_id = i.object_id and p2.index_id = i.index_id) as rowCnt
        from sys.indexes i
        join sys.partitions p
        on  i.object_id = p.object_id
        and  i.index_id = p.index_id
        join sys.allocation_units a
        on  p.partition_id = a.container_id ' +
        case when len(@tableName) > 0 then ' where i.object_id = object_id(@tableName) ' else '' end +
        'group by i.object_id, i.index_id, i.name
       ) sizeData
     on  i.object_id = sizeData.objectId
     and  i.index_id = sizeData.indexId
     left join sys.dm_db_index_usage_stats u
     on  u.object_id = i.object_id
     and  u.index_id = i.index_id
     and  u.database_id = @databaseId ' +
     case when len(@tableName) > 0 then ' where i.object_id = object_id(@tableName) ' else '' end +
     'order by ' + @order;

    exec sp_executesql @sql, N'@tableName nvarchar(255), @databaseId int, @rowcount int', @tableName, @databaseId, @rowcount;
    go

  • Sql 2005 Dynamic Memory Managment with AWE enabled

    Earlier this week I had a customer send me the following question:

       I have a server running the 32-bit edition of SQL 2005 SP1 with 16 GB of memory and
       8 CPUs. AWE is enabled and the max server memory setting is 12 GB. When the service
       is started, memory usage starts low and gradually increases as the workload builds
       until it reaches 12 GB where it stops (as I expect it to). However, once the workload
       slows down and usage drops, I notice CPU and Batch Requests drop to practically
       nothing, while memory usage remains constantly at 12 GB; there is no paging or memory
       thrashing, and there is nothing else running on the server. This is leading me to
       believe that Sql 2005 really doesn't do a good job of dynamic memory management as it
       claims it should with AWE enabled now. Why isn't Sql 2005 releasing un-used memory
       during low activity periods?

    First off, Sql 2005 does handle memory allocation much differently than Sql 2000 did, especially with AWE enabled. In Sql 2000, enabling AWE on your system basically caused the engine to grab a fixed memory space on startup and not release it back at any time (the actual determination on how much and what amount of memory it would grab depends on multiple considerations, which I won't go into here). Sql 2005 however introduces dynamic memory mangement even with AWE enabled (meaning you can use AWE and still have Sql grow/shrink it's memory footprint as it determines appropriate) - similarly, Sql 2005 also allows for dynamic memory management with locked pages, which is different from AWE. I won't go into these specifics here, but maybe I'll post a follow-up covering these topics.

    Back to the question at hand - what is going on in the above scenario. Well, just because we allow dynamic memory management with AWE now, this still doesn't change a basic principal of Sql's management of memory - this is actually the same no matter if you are using AWE or not - Sql won’t release memory just because usage drops; it will only release when informed to do so by a user change to the min/max configuration options, or if ‘asked’ to do so by the OS (i.e. the OS is showing that other applications need some memory space, or the OS is coming under memory pressure) - this is by design and as works as expected (also works as it always has).

    On startup, Sql will grow it’s memory space as appropriate to the configured max value if it can get it from the OS - if while the server is growing it’s memory space up to the max configuration value the OS is under pressure from other consumers, Sql won’t get to it’s max configured value for the same reasons that it will release it as mentioned above (this assumes there is enough pressure to preclude Sql from getting to it’s max configured value).

    The basic thought here is that, given that the job of Sql Server is to act as a data server, and in any enterpise-level deployment/edition if there isn’t memory pressure on the box, why would you want (or care for that matter) Sql to release reservations back?  That would slow the build-up of memory space again during workload ramp-up times, and as a result, slow user response times.

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

  • Sorting and comparing unicode and binary data - why do I get a weird sort order?

    A customer recently questioned me regarding a sorting issue they were having in the following scenario:

       My database is running under the "SQL_Latin1_General_CP1_CI_AS" Sql collation, but when I run
       the script below I seem to get incorrect sort results.  Why does the "abc-test" item come 
       immediately after the "abctest" item and not after "abctest1" as I would expect it to?

          create table #temp1 (
           ServerItem nvarchar(260) collate SQL_Latin1_General_CP1_CI_AS primary key clustered
          )

          insert #temp1
          select 'abctest'
          insert #temp1
          select 'abc-test'
          insert #temp1
          select 'abctest1'
          insert #temp1
          select 'abc-test1'

          select * from #temp1
          order by ServerItem

          drop table #temp1

          RESULTS:
         
          ServerItem
          ------------
          abctest
          abc-test
          abctest1
          abc-test1


       Is this right? Is this a bug?

    No, this is not a bug - and yes, this is correct and by design.  The reasoning lies in how certain SQL collations particularly handle sorting rules for unicode data...this collation in particular uses a word sort algorithm to sort the unicode data, which ignores punctuation characters (and hyphen is one of those, amoung many other such as apostrophe's, dialect characters, etc.). The simple rule to follow when comparing or sorting multiple unicode strings is to not assume that it is done character-by-character, as it isn't.

    A similar distinction occurs when using a Binary code page, like Latin1_General_BIN for example (popular amoung many cross-platform COTS applications). Did you know that if you are using a binary sort order that UPPERCASE letters will always sort before lowercase letters?  This is because binary sorting sorts based on the actual bit representation of the characters, and UPPERCASE letters have a lower ascii-code than that of lowercase letters.  Using a similar example as above, here is what you would see using a binary sort order:

       create table #temp1 (ServerItem nvarchar(260) collate Latin1_General_BIN primary key clustered)

       insert #temp1
       select 'abctest'
       insert #temp1
       select 'ABCTEST'
       insert #temp1
       select 'BBCTEST'
       insert #temp1
       select 'bbctest'
      
       select * from #temp1
       order by ServerItem

       drop table #temp1

       RESULTS:
      
       ServerItem
       -------------
       ABCTEST
       BBCTEST
       abctest
       bbctest

    See how BBCTEST comes before abctest?  You'd see the same type of result if you changed "BBCTEST" to just "Bbctest".

    Multiple things are taken into consideration when sorting/comparing data in Sql Server (and many other platforms) including your collation setting, code page, binary/non-binary, unicode vs. ascii data, etc. Be sure to understand differences between all characteristics of your system before deploying...


    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

  • How does AUTO_UPDATE_STATISTICS_ASYNC work with Sql 2005?

    A customer this week inquired about auto updating of statistics in Sql 2005, particularly with regards to how and what the new AUTO_UPDATE_STATISTICS_ASYNC option is and how it works...well, here you have it...

    With Sql Server 2005, the AUTO_UPDATE_STATISTICS_ASYNC option configures a given database to update statistics asynchronously vs. synchronously (as occurs without the option enabled, or in Sql 2000 as well). Typically, if a given query request triggers an auto stat updating event without this option set, the query will wait as stats are updated, then the query will be executed. If you set this option however, the query will be executed against the old/existing stats, while submitting a request of sorts in the background telling the engine to update the stats automatically as soon as possible, without holding up the existing query request(s). As soon as the background operation completes, new query requests will begin to use the new statistics information. One thing to note in particular is that this option only comes in to play for auto updating of stats, not manual updating or on-demand updating on request by a user.

    In many scenarios, this option can be enabled with little risk to negative side-effects that stale-statistics can often cause (poor query plans for example)...if your data distribution and workload typically have little impact on overall data distribution numbers, row counts and sizes, etc., then it is probably a great candidate for seeing nothing but improved throughput with this option enabled.

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

  • What is a hobt_id in Sql 2005?

    The same customer that promted the post of mine titled "How can I get sysindexes information in Sql 2005?" posed a follow-up question - what is the hobt_id column in the sys.partitions catalog view and how does it relate to other catalog views/structures?

    A very good question...first, hobt (pronounced "hobbit") stands for "Heap or B-Tree".  Sql Server uses Heap structures to store data when a given table has no cluster (i.e. a clustered index), and uses a B-Tree structure (B-Tree stands for "Balanced Tree", not binary tree as is a common misconception) for indexes (including clustered indexes). A Heap is a collection of data pages that have no specific order and are not linked in any manner (like clusters which are a doubly-linked list).

    So, a hobt in Sql Server 2005 is basically a collection of data/index pages - specifically, hobt is generally used to describe a subset of data/index pages within a single partition. Currently, the relationship between a partition and a hobt is ALWAYS 1-to-1 - meaning the hobt_id and partition_id value in the sys.partitions table are completely interchangable at the moment. In fact, you'll notice that the actual values for partition_id and hobt_id are always exactly the same currently in Sql 2005.  In some future release of Sql this may change, however there's no talk of it at this point to my knowledge anyhow.

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

  • How can I get sysindexes information in Sql 2005?

    A customer asked me today how they could retrieve information similar to that available in the "sysindexes" system table from Sql Server 2000 in Sql Server 2005 using the new catalog views.  Though in Sql 2005 there is a dbo.sysindexes and sys.sysindexes compatability view, it doesn't return detailed information, particularly when using partitioning, large object data (LOB data), and/or variable character data over 8060 bytes in a single row (which you can now do in Sql 2005 unlike in Sql 2000...see my prior post titled "Row sizes exceeding 8060 bytes in Sql 2005" for more information on this type of data).

    The reasons for having to move away from a single table approach to reporting this type of data in Sql 2005 revolve primarily around 2 new features in 2005: 1) partitioning 2) row-overflow data.  Partitioning allows you to split different portions of a given table/index into multiple segments (partitions) based on a partitioning column that can optionally be stored in multiple filegroups.  Given this, a single index can be made up of multiple partitions, each of which must have data sizes, page counts, etc. tracked and stored.  Row-overflow data is the methodology by which variable-length data in excess of 8060 bytes for a single row is stored in additional pages seperate from the row's primary data (in row).  Again, see my prior post mentioned above for more information on this topic.

    There are 3 catalog views in Sql 2005 that provide you the appropriate insight for this information, and they are:

        sys.indexes
      
            Contains a single row per index/heap in the database, with information such as name, index_id, type, uniqueness, etc.)
      
        sys.partitions
      
            Contains a single row per index per partition, with information such as the partition_id (unique per partition), object_id (object the partition belongs to), index_id (index the partition belongs to), and rows (count of rows in the given partition). At a minimum, there is ALWAYS at least 1 row per index entry in sys.partitions, even if you are not using partitioning at all. A single index can have up to 1,000 partitions currently.  This view also contains a column called hobt_id (hobt is pronounced "hobbit", and stands for "heap or b-tree") - for information on this column and it's meaning, see my post titled "What is a hobt_id in Sql 2005?".

        sys.allocation_units
     
            Contains a single row per partition per page type/allocation unit, with information such as allocation_unit_id, type, container_id, data_space_id, total_pages, used_pages, and data_pages.  An allocation unit in Sql 2005 is a collection of pages of a single type for a given partition/hobt.  A single partition could have as many as 3 different allocation units, 1 for each of the 3 page types in Sql 2005: In-row data (standard data/index pages), row-overflow data (variable length data for a given row in excess of 8060 bytes), and LOB-data (large object data such as text, ntext, image, any of the MAX data types, and CLR UDT's).  At a minimum, each partition/hobt will always have an allocation_unit record for In-row data.  The "container_id" value for a given allocation_unit relates to the partition_id value from sys.partitions (it also currently always relates to the hobt_id value from sys.partitions, but again, see my other post for information here :-)).

    Now that we understand what each of these views provide and their relationship with one another, we can easily write some simple queries to get the information we are looking for.

    To get the current total number of rows for the heap or clustered index in a table named 'tblTest' (which will be the number of rows in the table), try the following:

        select object_name(i.object_id) as objectName, i.name as indexName, sum(p.rows) as rowCnt
        from sys.indexes i
        join sys.partitions p
        on  i.object_id = p.object_id
        and  i.index_id = p.index_id
        where i.object_id = object_id('tblTest')
        and  i.index_id <= 1
        group by i.object_id, i.index_id, i.name

    To get the current total number of pages, used pages, and data pages for given heap/cluster in a table named 'tblTest', try the following (NOTE that this will include ALL 3 page types in sum):

        -- Total # of pages, used_pages, and data_pages for a given heap/clustered index
        select object_name(i.object_id) as objectName, i.name as indexName,
                sum(a.total_pages) as totalPages, sum(a.used_pages) as usedPages, sum(a.data_pages) as dataPages,
                (sum(a.total_pages) * 8) / 1024 as totalSpaceMB, (sum(a.used_pages) * 8) / 1024 as usedSpaceMB, (sum(a.data_pages) * 8) / 1024 as dataSpaceMB
        from sys.indexes i
        join sys.partitions p
        on  i.object_id = p.object_id
        and  i.index_id = p.index_id
        join sys.allocation_units a
        on  p.partition_id = a.container_id
        where i.object_id = object_id('tblTest')
        and  i.index_id <= 1
        group by i.object_id, i.index_id, i.name

    If you'd like the same information as above, however aggregated per page type (i.e. In-row data, Row-overflow data, LOB data), simply modify the query slightly as follows and you'll now get a single aggregate row per page type (with an added rollup bonus for totaling per group):

        -- Total # of pages, used_pages, and data_pages for a given heap/clustered index by page type
        select case when grouping(i.object_id) = 1 then '--- TOTAL ---' else object_name(i.object_id) end as objectName,
                case when grouping(i.name) = 1 then '--- TOTAL ---' else i.name end as indexName,
                case when grouping(a.type_desc) = 1 then '--- TOTAL ---' else a.type_desc end as pageType,
                sum(a.total_pages) as totalPages, sum(a.used_pages) as usedPages, sum(a.data_pages) as dataPages,
                (sum(a.total_pages) * 8) / 1024 as totalSpaceMB, (sum(a.used_pages) * 8) / 1024 as usedSpaceMB, (sum(a.data_pages) * 8) / 1024 as dataSpaceMB
        from sys.indexes i
        join sys.partitions p
        on  i.object_id = p.object_id
        and  i.index_id = p.index_id
        join sys.allocation_units a
        on  p.partition_id = a.container_id
        where i.object_id = object_id('tblTest')
        and  i.index_id <= 1
        group by i.object_id, i.name, a.type_desc with rollup

    You can obviously feel free to modify/expand on any of the above to form your own aggregations, forumlas, etc. Enjoy!

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

  • Row sizes exceeding 8060 bytes in Sql 2005

    A customer I work with questioned me today in regards to hearing that in Sql 2005 you can now have rows that exceed 8060kb in size, the limit of a single row in Sql 2000 (not counting text/blob data). Here's some information on how this works in 2005...

    In Sql 2000, if you try to create a table that has a possible total of data storage > 8060 byes, you'll see a warning like the following (use the sample create table statement to repro on your own instance if you like):

         create table zTest (id int not null, char1 varchar(8000) not null, char2 varchar(8000) not null)

         Warning: The table 'zTest' has been created but its maximum row size (16029) exceeds the maximum number of bytes per row (8060). INSERT or UPDATE of a row in this table will fail if the resulting row length exceeds 8060 bytes.

    The table is successfully created, but if you try running a statement that causes data in a single record to exceed the 8060 byte limit, you'll get an error like the following:

         insert zTest (id, char1, char2) select 1, replicate('a',8000), replicate('b',8000)

         Msg 511, Level 16, State 0, Line 1
         Cannot create a row of size 16017 which is greater than the allowable maximum of 8060.
         The statement has been terminated.

    In Sql 2005, this is no longer a problem for VARIABLE length column data that exceeds the 8060 byte threshold - you still have a maximum FIXED row size of 8060 bytes. To demonstrate, try running the following statement, which tries to create a table using fixed-length columns only, on a Sql 2005 instance of your own and you will still receive an error:

         create table zTest (id int not null, char1 char(8000) not null, char2 char(8000) not null)

         Msg 1701, Level 16, State 1, Line 1
         Creating or altering table 'zTest' failed because the minimum row size would be 16011, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.

    However, if you try to run the original create table statement from the Sql 2000 test above, not only do you not receive a warning on creation of the object:

         create table zTest (id int not null, char1 varchar(8000) not null, char2 varchar(8000) not null)
     
         Command(s) completed successfully.

    but, you also do not receive an error when you proceed to insert data into the table in exces of 8060 bytes:

         insert zTest (id, char1, char2) select 1, replicate('a',8000), replicate('b',8000)

         (1 row(s) affected)
     
    Clearly, with Sql 2005 you can now store rows that exceed 8060 bytes in size using variable length data types.

    This is made possible in Sql 2005 thanks to a new storage methodology called 'Row-overflow data'. With Sql 2000, a single row's data HAS to fit on a single page, however in Sql 2005, and variable length data in a single row that exceeds the available page-space is pushed into a seperate page, called a row-overflow page.  A pointer is left in the original page of In-row data for the record pointing to the row-overflow page(s) that contain the given column's data for the row. When a request is made to fetch the data for the given row, the engine reads through the in-row data as normal, recognizes the pointer to the overflow data, jumps to the Row-overflow page(s) to fetch the data, then returns to continue normal operations.  See my blog post titled "How can I get sysindexes information in Sql 2005?" for some information on how to retrieve meta-data information in regards to this row-overflow data for given objects.

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

  • SP2 Rollback requires complete reinstallation

    With the releast of SP2 for Sql 2005, a customer asked me today what they need to consider as a rollback plan when applying SP2 to an existing Sql 2005 instance. The answer is outlined in the ReadMe file for the Sql 2005 SP2 release - snipped and pasted blow for information:

     1.3 Uninstalling SQL Server 2005 SP2

      Once SQL Server 2005 SP2 has been applied, it cannot be removed without uninstalling the entire product.

      To remove SP2 and revert to the previous version of SQL Server 2005
      Using Add or Remove Programs, uninstall the instance of SQL Server 2005 .

      Reinstall SQL Server 2005.

      Apply any hotfixes that were previously installed.

      Note: 
       Additional steps are required to revert to the previous version of SQL Server 2005. For more information, see the SP2 Setup documentation. 

    Just something you may want to be aware of...here is a link to the entire SP2 readme:

    http://download.microsoft.com/download/2/B/5/2B5E5D37-9B17-423D-BC8F-B11ECD4195B4/ReadmeSQL2005SP2.htm


    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

  • NULL database name values? Database ID of 32,767? What is going on here?

    I had a customer inquire as to why they would see results from queries like the following where the database ID value was null in Sql Server 2005 DMV's:
     
      select distinct db_name(database_id) from sys.dm_os_buffer_descriptors b with(nolock);
     
     If you change the query to join to the sys.databases catalog view, you'll notice that the NULL database name values disappear, however so do the resulting records from the sys.dm_os_buffer_descriptors where the prior statement was returning a NULL db_name() value:
      
      select distinct d.name
      from sys.dm_os_buffer_descriptors b with(nolock)
      join sys.databases d with(nolock)
      on  b.database_id = d.database_id

     If you run the query without using the db_name() function, you'll notice that the database_id values for some of the entries in the sys.dm_os_buffer_descriptors DMV are the value of 32767...which I'm sure you realize that you don't have a matching user database for :-).
     
     What is going on here????  The answer is due to the new system database introduced in Sql Server 2005 called the RESOURCE database.  This database will always show a database_id value of 32767 when exposed within system DMVs or Catalog Views (note that it isn't exposed in all DMVs or Catalog Views, as evident by a simple query against the sys.databases Catalog View, where you won't see an entry for it). So, in this case, simply modify the first query to be something like this:
     
      select distinct case when database_id = 32767 then 'resourceDb' else db_name(database_id) end
      from sys.dm_os_buffer_descriptors b with(nolock);

     Or the 2nd query to something like this:
     
      select distinct case when b.database_id = 32767 then 'resourceDb' else d.name end
      from sys.dm_os_buffer_descriptors b with(nolock)
      left join sys.databases d with(nolock)
      on  b.database_id = d.database_id

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.

  • sys.dm_os_buffer_descriptors aggregations

    Lots of customers I visit are frequently interested in determining what objects/structures/files/etc. are consuming the largest amount of space at a given time (or over time) within the Sql Server buffer pool. In Sql 2000, this was a bit complicated to determine to say the least, however with Sql 2005's new dynamic management functions/views, it's become exponentially easier to gain this type of insight; additionally, it's also become easy to aggregate this information for use/display/reporting purposes.

     In this scenario, the use of the sys.dm_os_buffer_descriptors DMV optionally correlated against other catalog views within a given database will provide you all the information you need to get this type of information.  I'm not going to talk about the DMV or catalog views here, that's done many other places quite well, and also quite sufficiently within Books Online.  What I'm going to provide are a couple of utility procedures that wrap this functionallity for ease of use, providing options to aggregate the sum of information per structure/database, query on particular databases or all databases, etc.

     The first and simpler of 2 procedures I'll provide will give you aggregated information from the DMV rolled-up on a per database, file, and page type combination (page type being things like data pages vs. index pages vs. PFS pages vs. etc., etc.).  It's a simple procedure with no parameters and a single select statement with some grouping, rollup, and sorting...here's the code:

      use master
      go

      if ((object_id('sp_osbufferdescriptors_agg') is not null) and (objectproperty(object_id('sp_osbufferdescriptors_agg'), 'IsProcedure') = 1))
       drop proc [dbo].sp_osbufferdescriptors_agg
      go

      create proc [dbo].sp_osbufferdescriptors_agg
      as

      /*

      SAMPLE EXECUTION:
       exec sp_osbufferdescriptors_agg

      */

      set nocount on;
      set transaction isolation level read uncommitted;

      select  case when grouping(dbName) = 1 then '--- TOTAL ---' else dbName end as dbName,
         case when grouping(fileId) = 1 then '--- TOTAL ---' else fileId end as fileId,
         case when grouping(pageType) = 1 then '--- TOTAL ---' else pageType end as pageType,
         count(*) as countPages, sum(row_count) as sumRowCount, avg(row_count) as avgRowCount,
         sum(freeSpaceBytes) as sumFreeSpaceBytes, avg(freeSpaceBytes) as avgFreeSpaceBytes
      from  (select case when database_id = 32767 then 'resourceDb' else cast(db_name(database_id) as varchar(25)) end as dbName,
          cast(file_id as varchar(10)) as fileId, cast(page_type as varchar(25)) as pageType,
          row_count as row_count, free_space_in_bytes as freeSpaceBytes
         from sys.dm_os_buffer_descriptors bufferDescriptor with(nolock)) tmp
      group by dbName, fileId, pageType with rollup
      order by case when grouping(dbName) = 1 then 'zzzzzzzzzzzzzzzzzzzzzzzzzzz' else dbName end,
         case when grouping(fileId) = 1 then 'zzzzzzzzzzzzzzzzzzzzzzzzzzz' else fileId end,
         case when grouping(pageType) = 1 then 'zzzzzzzzzzzzzzzzzzzzzzzzzzz' else pageType end;
      go


     The second and more complex of the procedures provides more detailed information for each given database on the system - instead of providing only server-level information (like what database, file, etc. is consuming the buffer pool), it will dig into specific database(s) to provide more targetted information within the given database(s) in regards to specific indexes/tables/views/etc. that are chewing up the most space.  Optional parameters are included to target to a specific database, all databases on the system, system level only information, and return only a certain number of results. Here's the code:

    use master
    go

    if ((object_id('sp_osbufferdescriptors') is not null) and (objectproperty(object_id('sp_osbufferdescriptors'), 'IsProcedure') = 1))
     drop proc [dbo].sp_osbufferdescriptors
    go

    create proc [dbo].sp_osbufferdescriptors
     @top  int = 0,     -- Limits the result set to the top # specified - if null/default/0, all
               -- records are returned
     @opts  int = 0      -- Option values for execution - bit flags:
               --  <no opts> - If no opts are set, database level information is
               --   returned for the database context we're executing in
               --  1 bit - If set, system level os_buffer information is returned
               --   only - no db level information is returned
               -- 2 bit - If set, and the 1 bit is NOT set, all db specific
               --   information is gathered by iterating through all
               --   databases on the system and gathering info

    as

    /*

    NOTE: Use of this procedure requires the existence of the following procedures/functions as well:
     1.  <NA>

    -- Get database level information for the current db only
    exec sp_osbufferdescriptors;
    -- Only the top 20 results
    exec sp_osbufferdescriptors @top = 20;

    -- Get system level information only
    exec sp_osbufferdescriptors @opts = 1;
    -- Only top 5 results
    exec sp_osbufferdescriptors @top = 5, @opts = 1;

    -- Get database level information for all db's on the system
    exec sp_osbufferdescriptors @opts = 2;
    -- Only top 20 results
    exec sp_osbufferdescriptors @top = 20, @opts = 2;

    */

    set nocount on;
    set transaction isolation level read uncommitted;

    declare @sql nvarchar(4000);

    -- Format incoming data
    select @opts = isnull(@opts,0),
      @top = case when @top > 0 then @top else 0 end;

    -- If no options were specified, we get the data for the current db and exit
    if @opts = 0 begin
     -- Get largest buffer consumers for the given database
     select @sql = N'
      select ' + case when @top > 0 then N'top ' + cast(@top as nvarchar(20)) else N'' end + N' count(*) as bufferCount,
        db_name() as dbName,
        object_name(p.object_id) as objectName, isnull(i.name,''HEAP'') as indexName,
        max(p.partition_number) as partitionCount, max(p.row_count) as indexRowCount,
        sum(b.row_count) as loadedRowCount, max(p.in_row_used_page_count) as inRowUsedPages,
        max(p.in_row_data_page_count) as inRowDataPages, max(p.in_row_reserved_page_count) as inRowReservedPages,
        max(p.lob_used_page_count) as lobUsedPages, max(p.lob_reserved_page_count) as lobReservedPages,
        max(p.row_overflow_used_page_count) as rowOverflowUsedPages,
        max(p.row_overflow_reserved_page_count) as rowOverflowReservedPages,
        max(p.used_page_count) as totalUsedPages, max(p.reserved_page_count) as totalReservedPages
      from sys.dm_db_partition_stats p with(nolock)
      join sys.allocation_units a with(nolock)
      on  p.partition_id = a.container_id
      join sys.dm_os_buffer_descriptors b with(nolock)
      on  a.allocation_unit_id = b.allocation_unit_id
      join sys.indexes i with(nolock)
      on  p.object_id = i.object_id
      and  p.index_id = i.index_id
      where b.database_id = db_id()
      group by p.object_id, i.name
      order by count(*) desc, p.object_id, i.name;';

     exec (@sql);
     return;
    end

    -- If 1 bit is set, we get system level information only...
    if @opts & 1 = 1 begin
     -- Get largest buffer consumers for the system
     select @sql = N'
      select ' + case when @top > 0 then N'top ' + cast(@top as nvarchar(20)) else N'' end + N' count(*) as bufferCount,
        case when grouping(b.database_id) = 1 then ''--- TOTAL ---'' else
        case when b.database_id = 32767 then ''resourceDb'' else db_name(b.database_id) end end as dbName,
        sum(b.row_count) as loadedRows
      from sys.dm_os_buffer_descriptors b with(nolock)
      group by b.database_id with rollup
      order by case when grouping(b.database_id) = 1 then 0 else count(*) end desc;';

     exec (@sql);
     return;
    end

    -- If the 2 bit is set, we get database level information for multiple db's as appropriate
    if @opts & 2 = 2 begin
     -- Create a temp object for storage
     create table #osBufferDescriptorsDbData (bufferCount bigint, dbName nvarchar(250), objectName nvarchar(250), indexName nvarchar(250),
       partitionCount int, indexRowCount bigint, auTotalPages bigint, auUsedPages bigint, auDataPages bigint);

     -- Gather up the appropriate data from each database on the server (not system db except tempdb)
     select @sql = N'use [?];

      if ''?'' in (''master'',''model'',''msdb'') return;

      insert #osBufferDescriptorsDbData (bufferCount, dbName, objectName, indexName, partitionCount, indexRowCount, auTotalPages, auUsedPages, auDataPages)
      select ' + case when @top > 0 then N'top ' + cast(@top as nvarchar(20)) else N'' end + N' count(*) as bufferCount,
        db_name() as dbName,
        object_name(p.object_id) as objectName, isnull(i.name,''HEAP'') as indexName,
        max(p.partition_number) as partitionCount, max(p.row_count) as indexRowCount,
        sum(b.row_count) as loadedRowCount, max(p.in_row_used_page_count) as inRowUsedPages,
        max(p.in_row_data_page_count) as inRowDataPages, max(p.in_row_reserved_page_count) as inRowReservedPages,
        max(p.lob_used_page_count) as lobUsedPages, max(p.lob_reserved_page_count) as lobReservedPages,
        max(p.row_overflow_used_page_count) as rowOverflowUsedPages,
        max(p.row_overflow_reserved_page_count) as rowOverflowReservedPages,
        max(p.used_page_count) as totalUsedPages, max(p.reserved_page_count) as totalReservedPages
      from sys.dm_db_partition_stats p with(nolock)
      join sys.allocation_units a with(nolock)
      on  p.partition_id = a.container_id
      join sys.dm_os_buffer_descriptors b with(nolock)
      on  a.allocation_unit_id = b.allocation_unit_id
      join sys.indexes i with(nolock)
      on  p.object_id = i.object_id
      and  p.index_id = i.index_id
      where b.database_id = db_id()
      group by p.object_id, i.name
      group by p.object_id, i.name;';

     exec sp_MSforeachdb @sql;

     -- Return the results
     select @sql = N'
      select ' + case when @top > 0 then N'top ' + cast(@top as nvarchar(20)) else N'' end + N' *
      from #osBufferDescriptorsDbData with(nolock)
      order by bufferCount desc, dbName, objectName;';

     exec (@sql);

     -- Cleanup
     drop table #osBufferDescriptorsDbData;
    end

    go
     

     Feel free to tweak the code to match specific requirements - if you come up with an interesting morph, I'd be very interested to see what you have for additional enhancements.  Additionally, as always, be sure to understand the performance impact associated with some possible incarnations of executing these procedures...if you have a large 64-bit system with a large buffer pool, these could get somewhat intensive.

     Enjoy!

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.


© 2009 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Microsoft
Page view tracker