Proportional fill and how to check whether it works


On a recent business trip to Japan, a customer described a case where their SAP system was running on SQL Server. In their opinion they had done everything right in terms of using our proportional fill feature, as described in http://download.microsoft.com/download/d/9/4/d948f981-926e-40fa-a026-5bfcf076d9b9/SAP_SQL2005_Best%20Practices.doc. However, after the last larger archiving session, in which SD and FI objects of a whole year got archived, they saw that one file was emptied to a much larger degree than all the other files. This means they ended up with rather different free space portions in files that were all evenly sized. With proportional fill working against evenly sized files all the time, this obviously shouldn't happen. The problem in this specific case was that the customer couldn't tell whether the files of the SAP database had always been managed in the most optimal way. It could have happened that, years back when the data of the year that just got archived was inserted, one database file was larger or all the other database files had already filled up. There is no possibility to investigate this many years later. However, let's step through two short procedures to check whether the actual read and write activity is balanced between the data files and whether the data of a specific table is stored in even parts across all data files, as we desire it to be for the way SAP databases should be created on SQL Server.

 

How do we check whether proportional fill works fine at the moment? Under the assumption that all data files have the same amount of free space (even independent of the size of the data files), the following query batch gives an indication of whether proportional fill is working in even proportions over all data files:

--Query1: compare per-file read/write counts against the average over all data files
declare @avg_read bigint, @avg_write bigint
select @avg_read=avg(io.num_of_reads), @avg_write=avg(io.num_of_writes)
from sys.dm_io_virtual_file_stats(db_id(), NULL) io, sys.database_files df
where io.file_id=df.file_id and df.type_desc = 'ROWS'
select io.file_id, df.name, io.num_of_reads, @avg_read as 'Avg Reads',
ceiling(((convert(decimal(10,0),num_of_reads)/@avg_read)*100)-100) as 'Percentage off Read Avg',
io.num_of_writes, @avg_write as 'Avg Writes',
ceiling(((convert(decimal(10,2),num_of_writes)/@avg_write)*100)-100) as 'Percentage off Write Avg'
from sys.dm_io_virtual_file_stats(db_id(), NULL) io, sys.database_files df
where io.file_id=df.file_id and df.type_desc = 'ROWS'

 

The result is expected to look like:

file_id  name    num_of_reads  Avg Reads  Percentage off Read Avg  num_of_writes  Avg Writes  Percentage off Write Avg
1        Data1   42374382      42647992   0                        2233030        2214157     1
3        Data2   42777611      42647992   1                        2262091        2214157     3
4        Data3   42647377      42647992   0                        2235077        2214157     1
5        Data4   42821902      42647992   1                        2174742        2214157     -1
6        Data5   42453611      42647992   0                        2231477        2214157     1
...
30       Data29  42675414      42647992   1                        2178970        2214157     -1
31       Data30  42596320      42647992   0                        2222319        2214157     1
32       Data31  42673096      42647992   1                        2249254        2214157     2
33       Data32  42855787      42647992   1                        2249878        2214157     2

In this case, we look at 32 data files of a 4.5TB SAP ERP database. As one can see, the variation in reads and writes per data file is in the single-digit percentage range, with some files reading and writing a bit more or less than the average. However, this only describes the current state of the I/O balancing; it doesn't mean that the distribution was always that way. In order to check whether the data of a table is distributed in even parts over the data files, one can run this script:

 

--Query2: show how the pages of a table are distributed over the data files
if object_id('sp_tablepart') is not null
   drop procedure sp_tablepart
go
create procedure sp_tablepart @tname sysname as
begin
set nocount on
-- table variable matching the output columns of DBCC EXTENTINFO
declare @extentinfo9p table (file_id int, page_id int, pg_alloc int, ext_size int,
                obj_id int, index_id int, partition_number int, partition_id bigint,
                iam_chain_type varchar(20), pfs_bytes varbinary(8))
declare @exec_statement varchar(512), @tname2 sysname, @dbid int, @fname sysname, @page_count bigint
set @dbid=db_id()
set @tname2 = quotename(@tname)
-- collect the extent allocation information of the table
set @exec_statement = 'dbcc extentinfo(' + convert(varchar,@dbid) + ',' + @tname2 + ') with no_infomsgs'
insert into @extentinfo9p exec (@exec_statement)
select @page_count = SUM(pg_alloc) from @extentinfo9p
-- aggregate the allocated pages per data file and show them as a percentage of the total
select distinct (ei.file_id) as 'File ID', df.name as 'Name',
fg.name as 'Filegroup Name', SUM(ei.pg_alloc) as 'Pages allocated',
ROUND(((convert(decimal,SUM(ei.pg_alloc))/convert(decimal,@page_count))*100),1) as Percentage
from @extentinfo9p ei, sys.database_files df, sys.filegroups fg
where ei.file_id = df.file_id and df.data_space_id=fg.data_space_id
group by ei.file_id, df.name, fg.name order by ei.file_id
end

 

Executing the stored procedure with a call like

 

     sp_tablepart REPOLOAD 

 

The result would look like:

 

File ID  Name    Filegroup Name  Pages allocated  Percentage
1        Data1   PRIMARY         16928            3
3        Data2   PRIMARY         17327            3
4        Data3   PRIMARY         17154            3
5        Data4   PRIMARY         17430            3.1
6        Data5   PRIMARY         18418            3.2
7        Data6   PRIMARY         17703            3.1
8        Data7   PRIMARY         18808            3.3
...
30       Data29  PRIMARY         17825            3.1
31       Data30  PRIMARY         17751            3.1
32       Data31  PRIMARY         17852            3.1
33       Data32  PRIMARY         17545            3.1

 

In this case the data of the specific table checked is distributed very evenly over all the different files. One should run this test only against tables of significant size. The smaller the volume of the table, the more uneven the distribution might look, simply because a few MB of data volume don't occupy a lot of pages. Also be aware that the procedure introduced above as Query 2 might run for several minutes. The runtime depends on the table size: the larger the volume of the table, the longer the runtime.
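
To find tables of significant size to run the check against, a query along the following lines can help. This is a minimal sketch of my own and not part of the original procedure; it simply lists the largest user tables by reserved pages using sys.dm_db_partition_stats:

--List the 20 largest user tables by reserved space as candidates for sp_tablepart
select top 20 object_name(ps.object_id) as 'Table Name',
       sum(ps.reserved_page_count) as 'Reserved Pages',
       sum(ps.reserved_page_count) * 8 / 1024 as 'Reserved MB'
from sys.dm_db_partition_stats ps, sys.objects o
where o.object_id = ps.object_id and o.type = 'U'
group by ps.object_id
order by sum(ps.reserved_page_count) desc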

 

What to do if the I/O characteristics are uneven? Usually it helps to increase the free space of each file to an even level. The SQL Server proportional fill algorithm checks the free space in each file and then decides which proportional fill factor to use per file. However, the difference in free space needs to be a factor of 2 or more to really result in uneven allocation. That means if the free space in 11 files is 50GB and the free space in the 12th file is 80GB, SQL Server will still pursue an even allocation rate per file. SQL Server recalculates the rate frequently. If, in the example above, an additional 20GB were filled into every one of the files, 11 files would end up with 30GB of free space whereas the last file would have 60GB; SQL Server would then decide to allocate double the amount of space in the last file compared to each of the other 11 files. Therefore, evening out the free space will help to keep the allocation rate the same going forward. This takes care of new data allocation. The read rate between the files will eventually even out over weeks and months to a good degree.
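
To check how even the free space currently is across the data files, one can use a query like the one below. This is a sketch, assuming the standard 8KB page size for the MB conversion; FILEPROPERTY with 'SpaceUsed' returns the number of used pages per file:

--Free space per data file in MB; large differences lead to uneven proportional fill
select df.file_id, df.name, df.size/128 as 'Size MB',
       fileproperty(df.name, 'SpaceUsed')/128 as 'Used MB',
       (df.size - fileproperty(df.name, 'SpaceUsed'))/128 as 'Free MB'
from sys.database_files df
where df.type_desc = 'ROWS'
order by df.file_id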

 

The situation at the customer, however, was that after the archiving one file had significantly more free space than all the other files. The best way to compensate for that is to adjust the free space in the other files to a degree where SQL Server would allocate in even parts again. That is the easiest cure. Index reorganization of masses of tables, or even an export and import of the whole data, would be overkill for such a situation. Therefore just try to keep the free space such that even portions are allocated between the files, and the I/O load will balance out over time.
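
As an illustration only (the database name, file name, and target size below are hypothetical and need to be adapted to your own layout), growing the files that were left with less free space can be done with ALTER DATABASE:

--Hypothetical example: grow data file Data2 of database ERP so that its free space
--matches the file that got emptied by the archiving run
alter database ERP
modify file (name = 'Data2', size = 120000MB)
go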

Comments

  • For this setup, what is your datafile growth rate ? Going with the recommended 10% maybe performance unfriendly even when on SAN. Your thoughts ..

  • The script to find proportional data fill with io_virtual_stats has a small glitch and hence doesn't yield any results at all.

    Here is the correct script which should work all the time:

    DECLARE @avg_read  BIGINT,
            @avg_write BIGINT
    SELECT @avg_read = Avg(io.num_of_reads),
           @avg_write = Avg(io.num_of_writes)
    FROM   sys.Dm_io_virtual_file_stats(Db_id(), NULL) io,
           sys.database_files df
    WHERE  io.file_id = df.file_id
           AND df.type_desc = 'ROWS'
    SELECT io.file_id,
           df.name,
           io.num_of_reads,
           @avg_read  AS 'Avg Reads',
           Ceiling(( ( CONVERT(DECIMAL(10, 0), num_of_reads) / @avg_read ) * 100 ) - 100) AS 'Percentage off Read Avg',
           io.num_of_writes,
           @avg_write AS 'Avg Writes',
           Ceiling(( ( CONVERT(DECIMAL(10, 2), num_of_writes) / @avg_write ) * 100 ) - 100) AS 'Percentage off Write Avg'
    FROM   sys.Dm_io_virtual_file_stats(Db_id(), NULL) io,
           sys.database_files df
    WHERE  io.file_id = df.file_id
           AND df.type_desc = 'ROWS'
