Partitioned Tables, Parallelism & Performance considerations

Partitioned Tables, Parallelism & Performance considerations

Rate This
  • Comments 10

Q:  When querying SS2005 Partitioned Tables, what kind of parallelism should I expect to see and how will that affect performance?

 

A: First, a brief background on partitioning.  SQL Server 2005 table partitioning provides many improvements in terms of manageability and availability.   The manageability improvements allow metadata only switch-in and switch-out of a partition of data (supporting sliding window requirements).  Availability improvements include online index rebuilds, parallel operations, and piecemeal restores of filegroups (see blog SQL Server 2005 OnLine Piecemeal Restore at http://blogs.msdn.com/sqlcat/archive/2005/10/27/485580.aspx).

 

Now let’s look at how performance is impacted by partitioned tables.  Regardless of whether you are using partitioning, the selection of a parallel plan is determined by the number of CPUs, query cost, available memory and current workload. The remaining considerations outlined here are valid only if all of these allow using parallel plan.  

 

When a query uses a single partition e.g. all but a single partition is eliminated, SQL Server 2005 Intra-Partition parallelism can parallelize row retrieval using multiple threads up to the sp_configure’d value of Maximum Degree of Parallelism (MAXDOP).   Frequently, MAXDOP is set to the default of 0 which means MAXDOP is equal to the total number of CPUs.  When a query spans two or more partitions, only a single thread per partition will be used to retrieve rows. 

 

If your number of partitions is equal to MAXDOP or less, data skewing may cause some parallel threads to complete before others resulting in CXPACKET waits.  In cases where the number of partitions exceeds MAXDOP, SQL Server 2005 on-demand parallelism means that when a thread completes its work on a partition, it will automatically start processing the next partition in line.  Say you have 16 partitions and a MAXDOP of 8.  The first 8 threads work on partitions 1-8.  The first thread completed will start processing partition 9, the next, partition 10 and so on.

 

If you have big multi-proc box of 8 or more CPUs, the worst case query performance would be a single SELECT statement that spans two partitions (see Table 1).  One thread per partition would be used to retrieve the rows (See Table 2 yellow highlight for partitions 80 and 81 & number of threads in Executes column) although MAXDOP can still be applied after row retrieval in subsequent steps (See Table 2 green highlights).

 

Table 1: Retrieve 2 weeks of data

SELECT   

       SUM(Sales_Qty) as Sales_Qty,

       SUM(Sale_Amt)  as Sales_Amount

FROM   SalesDB.dbo.Tbl_Fact_ Sales – Partitioned by week

WHERE  date_id between '20050703' and '20050716'

 

Table 2: Set Statistics Profile:  MAXDOP = 12

Rows

Executes

StmtText

1

1

SELECT SUM([Sales_Qty]) [Sales_Qty],SUM([Sale_Amt]) [Sales_Amount] FROM [SalesDB].[dbo].[Tbl_Fact_Sales] WHERE [date_id]>=@1 AND [date_id]<=@2

0

0

  |--Compute Scalar(DEFINE:([Expr1002]=CASE WHEN [globalagg1008]=(0) THEN NULL ELSE [globalagg1010] END, [Expr1003]=CASE WHEN [globalagg1012]=(0) THEN NULL ELSE [globalagg1014] END))

1

1

       |--Stream Aggregate(DEFINE:([globalagg1008]=SUM([partialagg1007]), [globalagg1010]=SUM([partialagg1009]), [globalagg1012]=SUM([partialagg1011]), [globalagg1014]=SUM([partialagg1013])))

2

1

            |--Parallelism(Gather Streams)

2

12

                 |--Stream Aggregate(DEFINE:([partialagg1007]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1009]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1011]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt]), [partialagg1013]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt])))

20577235

12

                      |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1006]) PARTITION ID:([PtnIds1006]))

2

12

                           |--Parallelism(Distribute Streams, Demand Partitioning)

2

1

                           |    |--Constant Scan(VALUES:(((80)),((81))))

20577235

2

                           |--Index Seek(OBJECT:([SalesDB].[dbo].[Tbl_Fact_Sales].[IX_Tbl_Fact_Sales_SKDteItmStrIDSalQtySalAmtDiscMkd] AS [ss]), SEEK:([ss].[SK_Date_ID] >= (20050703) AND [ss].[SK_Date_ID] <= (20050716)) ORDERED FORWARD PARTITION ID:([PtnIds1006]))

 

For instance, suppose you have a TB-sized Sales table that is partitioned by month.  A common query pattern may be to compare this month to last month, or perhaps this month to a year ago.

 

Table 3:  MONTHLY Partitions

 

WHERE clause

Partitions

Retrieval Parallelism

Single SELECT statement

SELECT …. WHERE DateCol BETWEEN ’10/1/2005’ and ‘11/30/2005’

2

1 thread per partition *

SELECT

UNION SELECT

[UNION SELECT]

Select …. Where DateCol between ’10/1/2005’ and ‘10/31/2005

UNION

Select …. Where DateCol between ’11/1/2005’ and ‘11/30/2005

1 per select

MAXDOP per partition

 

Table 4:  WEEKLY Partitions:  Sales for November 1-15

 

WHERE clause

Partitions

Retrieval Parallelism

Single SELECT statement

SELECT SUM(Sales) from WKSales

WHERE DateCol BETWEEN ’11/1/2005’ and ‘11/15/2005’

3

1 thread per partition *

SELECT

UNION SELECT

[UNION SELECT]

SELECT SUM(Sales) from WKSales

WHERE DateCol BETWEEN ’11/1/2005’ and ‘11/5/2005’

UNION

SELECT SUM(Sales) from WKSales

WHERE DateCol BETWEEN ’11/6/2005’ and ‘11/12/2005’

UNION

SELECT SUM(Sales) from WKSales

WHERE DateCol BETWEEN ’11/13/2005’ and ‘11/15/2005’

1 per select

MAXDOP per partition

 

There are cases where big monolithic tables can provide better performance although manageability and availability can be challenging.  For example, if you have a query that spans 2 partitions of a 1TB partitioned table, SQL Server 2005 can only apply 1 thread per partition while retrieving rows from each partition.  In contrast, SQL Server 2005 can apply MAXDOP  to retrieve rows from the monolithic version of the 1TB table (see Table 5).

 

Table 5:  Monolithic BigSalesTable

 

WHERE clause

Partitions

Retrieval Parallelism

Single SELECT statement

SELECT …. WHERE DateCol BETWEEN ’10/1/2005’ and ‘11/30/2005’

N/A

MAXDOP

 

Best Practices: 

 

The above comments on partitions and parallel performance are most applicable to Data Warehouses, batch processing, and reporting.  Not all data warehouse environments will want to enable parallelism for all queries.  Specifically, parallelism is most effective if the system is running only a few queries at a time and you want as many resources possible made available to those queries to minimize execution time.  If the data warehouse is already a high concurrency environment, parallelism will not improve throughput or response time since the collection of single-threaded queries are likely already consuming the available resources.  Likewise, for peak performance, you do not want parallelism in a high concurrency OLTP workload.

 

When deciding upon a partitioning granularity (e.g. daily, weekly, monthly), consider your users’ common query patterns.  Remember that for best parallel performance of a big multi-proc box of 8 CPU or more, you want your queries to maximize parallelism e.g. at least MAXDOP partitions.  As illustrated in tables 3 & 4, performance boosts can be achieved if you have the luxury of rewriting SQL statement as a union of multiple single-partition queries. 

 

In conclusion, for best parallel query performance with partitioned tables in SQL Server 2005, aim for either a single partition per SELECT statement (to get Intra-Partition parallelism), or MANY partitions (at 1 thread per partition) per SELECT statement. 

 

 Tom Davidson

SQL Server Customer Advisory Team

Leave a Comment
  • Please add 3 and 1 and type the answer here:
  • Post
Page 1 of 1 (10 items)