Index Build strategy in SQL Server - Part 2: Offline, Parallel, No Partitioning
The type of parallel index build plan in SQL server depends on whether or not we have a histogram available with necessary statistics. Therefore, there are two broad categories of parallel index plans:
- Histogram available:
- No histogram
Histogram available (parallel sort and build):
X (Exchange)
| \ \
Builder… Build… Build… (write data to the in-build index)
| | |
Sort… Sort… Sort … (order by index key)
| / /
Scan (read data from source)
This type of Parallel Index build is getting chosen when we have statistics available (hence range partition information is available and can be used to identify data distribution).
How does scan happen in this case? We must have some statistics on the leading key column, so if we don’t have stats we will go ahead and generate sample statistics to determine whether and how to parallelize the index build operation. In some situations, however, we are not able to build sample stats, such as indexed view ("No stats plan"), and then different index build plan will be generated. Using the statistics and histogram we can identify data distribution (divide data in several buckets), so we can load balance the workload among workers in parallel plan, it also help us to make DOP (degree of parallelism) decision to achieve high utilization of system resource. Using the row count estimates from the histogram for each bucket in the distribution the workload is split into N ranges (N = DOP), one for each worker (this is an attempt to load balance the work among all workers).
Parallel Index Build with histogram available can give us the best performance.
For example:
Create index idx_t on t(c1, c2)
WITH (MAXDOP = 2)
-- limit # of processor to use for index build to 2
//Next time - Non stats plan (no histogram) index build plan
Posted by: Lyudmila Fokina