Q: When querying SS2005 Partitioned Tables, what kind of parallelism should I expect to see and how will that affect performance?
A: First, a brief background on partitioning. SQL Server 2005 table partitioning provides many improvements in terms of manageability and availability. The manageability improvements allow metadata only switch-in and switch-out of a partition of data (supporting sliding window requirements). Availability improvements include online index rebuilds, parallel operations, and piecemeal restores of filegroups (see blog SQL Server 2005 OnLine Piecemeal Restore at http://blogs.msdn.com/sqlcat/archive/2005/10/27/485580.aspx).
Now let’s look at how performance is impacted by partitioned tables. Regardless of whether you are using partitioning, the selection of a parallel plan is determined by the number of CPUs, query cost, available memory and current workload. The remaining considerations outlined here are valid only if all of these allow using parallel plan.
When a query uses a single partition e.g. all but a single partition is eliminated, SQL Server 2005 Intra-Partition parallelism can parallelize row retrieval using multiple threads up to the sp_configure’d value of Maximum Degree of Parallelism (MAXDOP). Frequently, MAXDOP is set to the default of 0 which means MAXDOP is equal to the total number of CPUs. When a query spans two or more partitions, only a single thread per partition will be used to retrieve rows.
If your number of partitions is equal to MAXDOP or less, data skewing may cause some parallel threads to complete before others resulting in CXPACKET waits. In cases where the number of partitions exceeds MAXDOP, SQL Server 2005 on-demand parallelism means that when a thread completes its work on a partition, it will automatically start processing the next partition in line. Say you have 16 partitions and a MAXDOP of 8. The first 8 threads work on partitions 1-8. The first thread completed will start processing partition 9, the next, partition 10 and so on.
If you have big multi-proc box of 8 or more CPUs, the worst case query performance would be a single SELECT statement that spans two partitions (see Table 1). One thread per partition would be used to retrieve the rows (See Table 2 yellow highlight for partitions 80 and 81 & number of threads in Executes column) although MAXDOP can still be applied after row retrieval in subsequent steps (See Table 2 green highlights).
Table 1: Retrieve 2 weeks of data
SUM(Sales_Qty) as Sales_Qty,
SUM(Sale_Amt) as Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_ Sales – Partitioned by week
WHERE date_id between '20050703' and '20050716'
Table 2: Set Statistics Profile: MAXDOP = 12
SELECT SUM([Sales_Qty]) [Sales_Qty],SUM([Sale_Amt]) [Sales_Amount] FROM [SalesDB].[dbo].[Tbl_Fact_Sales] WHERE [date_id]>=@1 AND [date_id]<=@2
|--Compute Scalar(DEFINE:([Expr1002]=CASE WHEN [globalagg1008]=(0) THEN NULL ELSE [globalagg1010] END, [Expr1003]=CASE WHEN [globalagg1012]=(0) THEN NULL ELSE [globalagg1014] END))
|--Stream Aggregate(DEFINE:([globalagg1008]=SUM([partialagg1007]), [globalagg1010]=SUM([partialagg1009]), [globalagg1012]=SUM([partialagg1011]), [globalagg1014]=SUM([partialagg1013])))
|--Stream Aggregate(DEFINE:([partialagg1007]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1009]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1011]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt]), [partialagg1013]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt])))
|--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1006]) PARTITION ID:([PtnIds1006]))
|--Parallelism(Distribute Streams, Demand Partitioning)
| |--Constant Scan(VALUES:(((80)),((81))))
|--Index Seek(OBJECT:([SalesDB].[dbo].[Tbl_Fact_Sales].[IX_Tbl_Fact_Sales_SKDteItmStrIDSalQtySalAmtDiscMkd] AS [ss]), SEEK:([ss].[SK_Date_ID] >= (20050703) AND [ss].[SK_Date_ID] <= (20050716)) ORDERED FORWARD PARTITION ID:([PtnIds1006]))
For instance, suppose you have a TB-sized Sales table that is partitioned by month. A common query pattern may be to compare this month to last month, or perhaps this month to a year ago.
Table 3: MONTHLY Partitions
Single SELECT statement
SELECT …. WHERE DateCol BETWEEN ’10/1/2005’ and ‘11/30/2005’
1 thread per partition *
Select …. Where DateCol between ’10/1/2005’ and ‘10/31/2005
Select …. Where DateCol between ’11/1/2005’ and ‘11/30/2005
1 per select
MAXDOP per partition
Table 4: WEEKLY Partitions: Sales for November 1-15
SELECT SUM(Sales) from WKSales
WHERE DateCol BETWEEN ’11/1/2005’ and ‘11/15/2005’
WHERE DateCol BETWEEN ’11/1/2005’ and ‘11/5/2005’
WHERE DateCol BETWEEN ’11/6/2005’ and ‘11/12/2005’
WHERE DateCol BETWEEN ’11/13/2005’ and ‘11/15/2005’
There are cases where big monolithic tables can provide better performance although manageability and availability can be challenging. For example, if you have a query that spans 2 partitions of a 1TB partitioned table, SQL Server 2005 can only apply 1 thread per partition while retrieving rows from each partition. In contrast, SQL Server 2005 can apply MAXDOP to retrieve rows from the monolithic version of the 1TB table (see Table 5).
Table 5: Monolithic BigSalesTable
The above comments on partitions and parallel performance are most applicable to Data Warehouses, batch processing, and reporting. Not all data warehouse environments will want to enable parallelism for all queries. Specifically, parallelism is most effective if the system is running only a few queries at a time and you want as many resources possible made available to those queries to minimize execution time. If the data warehouse is already a high concurrency environment, parallelism will not improve throughput or response time since the collection of single-threaded queries are likely already consuming the available resources. Likewise, for peak performance, you do not want parallelism in a high concurrency OLTP workload.
When deciding upon a partitioning granularity (e.g. daily, weekly, monthly), consider your users’ common query patterns. Remember that for best parallel performance of a big multi-proc box of 8 CPU or more, you want your queries to maximize parallelism e.g. at least MAXDOP partitions. As illustrated in tables 3 & 4, performance boosts can be achieved if you have the luxury of rewriting SQL statement as a union of multiple single-partition queries.
In conclusion, for best parallel query performance with partitioned tables in SQL Server 2005, aim for either a single partition per SELECT statement (to get Intra-Partition parallelism), or MANY partitions (at 1 thread per partition) per SELECT statement.
SQL Server Customer Advisory Team
SQL Server parallelizes a nested loops join by distributing the outer rows (i.e., the rows from the first
Hi, Here in Brazil I´m working on MS Gold Certified Partner, and I act in two distinct teams - Data...
マイクロソフトの植田です。 今回はパーティション分割とパフォーマンスに関する話題をご紹介したいと思います。 http://blogs.msdn.com/sqlcat/archive/2005/11/30/498415.aspx
PingBack from http://www.hilpers.it/2537369-partizionamento-di-una-tabella
PingBack from http://www.keyongtech.com/2160525-parallel-i-o-for-partitioned
PingBack from http://patiochairsite.info/story.php?id=26502
PingBack from http://unemploymentofficeresource.info/story.php?id=15587
PingBack from http://patiosetsite.info/story.php?id=957
PingBack from http://adirondackchairshub.info/story.php?id=3744