<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Craig Freedman's SQL Server Blog : Parallelism</title><link>http://blogs.msdn.com/craigfr/archive/tags/Parallelism/default.aspx</link><description>Tags: Parallelism</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Partial Aggregation</title><link>http://blogs.msdn.com/craigfr/archive/2008/01/18/partial-aggregation.aspx</link><pubDate>Fri, 18 Jan 2008 22:12:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:7152155</guid><dc:creator>craigfr</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/craigfr/comments/7152155.aspx</comments><wfw:commentRss>http://blogs.msdn.com/craigfr/commentrss.aspx?PostID=7152155</wfw:commentRss><description>&lt;P&gt;In some of my past posts, I've discussed how SQL Server implements &lt;A href="http://blogs.msdn.com/craigfr/archive/tags/Aggregation/default.aspx" mce_href="http://blogs.msdn.com/craigfr/archive/tags/Aggregation/default.aspx"&gt;aggregation&lt;/A&gt; including the &lt;A title="Stream Aggregate" href="http://blogs.msdn.com/craigfr/archive/2006/09/13/752728.aspx" mce_href="http://blogs.msdn.com/craigfr/archive/2006/09/13/752728.aspx"&gt;stream aggregate&lt;/A&gt; and &lt;A title="Hash Aggregate" href="http://blogs.msdn.com/craigfr/archive/2006/09/20/hash-aggregate.aspx" mce_href="http://blogs.msdn.com/craigfr/archive/2006/09/20/hash-aggregate.aspx"&gt;hash aggregate&lt;/A&gt; operators.&amp;nbsp; I also used hash aggregation as an initial example in &lt;A title="Introduction to Parallel Query Execution" href="http://blogs.msdn.com/craigfr/archive/2006/10/11/introduction-to-parallel-query-execution.aspx" 
mce_href="http://blogs.msdn.com/craigfr/archive/2006/10/11/introduction-to-parallel-query-execution.aspx"&gt;my introductory post on parallel query execution&lt;/A&gt;.&amp;nbsp;&amp;nbsp; In this post, I'll look at partial aggregation.&amp;nbsp; Partial aggregation is a technique that SQL Server uses to optimize parallel aggregation.&amp;nbsp; Before I begin, I just want to note that I also discuss partial aggregation in &lt;A title="Inside SQL Server 2005" href="http://www.insidesqlserver.com/index.html" mce_href="http://www.insidesqlserver.com/index.html"&gt;Inside Microsoft SQL Server 2005&lt;/A&gt;: &lt;A title="Inside Microsoft® SQL Server™ 2005: Query Tuning and Optimization" href="http://www.microsoft.com/MSPress/books/8565.aspx" mce_href="http://www.microsoft.com/MSPress/books/8565.aspx"&gt;Query Tuning and Optimization&lt;/A&gt;.&amp;nbsp; (See the bottom of page 187.)&lt;/P&gt;
&lt;P&gt;Let's begin with a simple scalar aggregation example.&amp;nbsp; Recall that a scalar aggregate is an aggregate without a GROUP BY clause.&amp;nbsp; A scalar aggregate always produces a single output row.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;CREATE TABLE T (A INT, B INT IDENTITY, C INT, D INT)&lt;BR&gt;CREATE CLUSTERED INDEX TA ON T(A)&lt;/P&gt;
&lt;P&gt;SELECT COUNT(*) FROM T&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Not surprisingly, this query yields a trivial stream aggregate plan:&lt;/P&gt;
&lt;P&gt;&amp;nbsp; |--Compute Scalar(DEFINE:([Expr1004]=CONVERT_IMPLICIT(int,[Expr1005],0)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Stream Aggregate(DEFINE:([Expr1005]=Count(*)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;|--Clustered Index Scan(OBJECT:([T].[TA]))&lt;/P&gt;
&lt;P&gt;Now, suppose that we want to parallelize this query.&amp;nbsp; Because this query outputs a single row, we cannot simply put an &lt;A title="The Parallelism Operator (aka Exchange)" href="http://blogs.msdn.com/craigfr/archive/2006/10/25/the-parallelism-operator-aka-exchange.aspx" mce_href="http://blogs.msdn.com/craigfr/archive/2006/10/25/the-parallelism-operator-aka-exchange.aspx"&gt;exchange (i.e., parallelism operator)&lt;/A&gt; at the root of the plan and divide the work of counting among multiple threads.&amp;nbsp; Such a strategy would yield one output row per thread, which is clearly not the correct result.&lt;/P&gt;
&lt;P&gt;Alternatively, we could put a gather streams exchange between the stream aggregate and clustered index scan operators.&amp;nbsp; This strategy would permit us to use a parallel scan while still counting in a single thread and outputting a single row.&amp;nbsp; However, we would end up moving every row from the scan through the exchange - a rather costly operation.&amp;nbsp; Thus, this option, while valid, would not yield nearly the performance we'd like to see.&lt;/P&gt;
&lt;P&gt;Fortunately, there is a third option. &amp;nbsp;We can use a parallel scan, divide the work of counting among multiple threads (as in the first option), use an exchange to gather the per-thread counts into a single thread, and finally sum the per-thread counts to generate the grand total.&amp;nbsp; This strategy is much more efficient as we need only move a single row per thread through the exchange.&amp;nbsp; To get the optimizer to generate this plan, we need to add lots of data to the table.&amp;nbsp; To save time, I'm going to use UPDATE STATISTICS to trick the optimizer into thinking that we've added rows to the table:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;UPDATE STATISTICS T WITH ROWCOUNT = 1000000, PAGECOUNT = 100000&lt;BR&gt;SELECT COUNT(*) FROM T OPTION (RECOMPILE)&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;We need the RECOMPILE query hint to force the optimizer to generate a new plan with the new statistics. &amp;nbsp;Here is the plan we get:&lt;/P&gt;
&lt;P&gt;&amp;nbsp; |--Compute Scalar(DEFINE:([Expr1004]=CONVERT_IMPLICIT(int,[globalagg1006],0)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Stream Aggregate(DEFINE:([globalagg1006]=SUM([partialagg1005])))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Gather Streams)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Stream Aggregate(DEFINE:([partialagg1005]=Count(*)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Clustered Index Scan(OBJECT:([T].[TA]))&lt;/P&gt;
&lt;P&gt;We call the bottommost aggregate operator a partial aggregate because it computes only part of the result.&amp;nbsp; We also sometimes refer to this operator as a local aggregate because it computes the portion of the result that is local to the thread where it executes.&amp;nbsp; We refer to the topmost aggregate as a global aggregate because it computes the full result.&lt;/P&gt;
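&lt;P&gt;To make the local/global split concrete, here is a minimal Python sketch of the plan above.&amp;nbsp; It is an illustration only, not SQL Server's implementation: the names partial_count and parallel_count are hypothetical, and dealing the rows out in slices is merely a stand-in for the parallel scan.&lt;/P&gt;
&lt;PRE&gt;
```python
from concurrent.futures import ThreadPoolExecutor


def partial_count(rows):
    # Local (partial) aggregate: count only the rows this worker scanned.
    return sum(1 for _ in rows)


def parallel_count(rows, dop=4):
    # Stand-in for the parallel scan: deal the rows out to dop workers.
    slices = [rows[i::dop] for i in range(dop)]
    with ThreadPoolExecutor(max_workers=dop) as pool:
        # Only one value per worker crosses the "exchange".
        partials = list(pool.map(partial_count, slices))
    # Global aggregate: SUM of the per-thread counts.
    return partials, sum(partials)


partials, total = parallel_count(list(range(1000)), dop=4)
print(partials, total)
```
&lt;/PRE&gt;
&lt;P&gt;However many rows the workers scan, only dop values reach the final SUM, which is why this shape is so much cheaper than gathering every row before counting.&lt;/P&gt;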
&lt;P&gt;SQL Server is able to use partial aggregation for most aggregate functions, including the standard built-ins: COUNT, SUM, AVG, MIN, and MAX.&amp;nbsp; While partial aggregation is necessary to parallelize scalar aggregates, it is also useful for aggregates with a GROUP BY clause.&amp;nbsp; Whether the optimizer chooses to use partial aggregation depends on the number of unique groups and the size of these groups.&amp;nbsp; If the optimizer anticipates that a query will generate a few large groups (such as in the scalar aggregation case), it will use partial aggregation.&amp;nbsp; However, if the optimizer expects that a query will generate many small groups, it may choose to use a single-level aggregation.&amp;nbsp; With small groups, a partial aggregate cannot reduce the number of rows significantly and merely adds overhead to the query.&amp;nbsp; Moreover, with many groups it is easy to parallelize the aggregation by hashing on the GROUP BY keys and distributing different groups to different threads.&lt;/P&gt;
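&lt;P&gt;The trade-off can be sketched by counting how many rows would have to cross the exchange in each case.&amp;nbsp; This is a rough Python model under simplifying assumptions, not the optimizer's cost formula; rows_through_exchange and grouped_count are hypothetical names, and slicing the input stands in for the parallel scan.&lt;/P&gt;
&lt;PRE&gt;
```python
from collections import Counter


def rows_through_exchange(keys, dop, partial):
    # Deal rows to dop scan threads, as a parallel scan would.
    slices = [keys[i::dop] for i in range(dop)]
    if partial:
        # Each thread pre-aggregates, so it sends one row per distinct key it saw.
        return sum(len(set(s)) for s in slices)
    # Single-level plan: every scanned row is repartitioned on the GROUP BY key.
    return sum(len(s) for s in slices)


def grouped_count(keys, dop):
    # Single-level parallel aggregate: hash each key to exactly one thread, so
    # every group lives on one thread and the per-thread results never overlap.
    per_thread = [Counter() for _ in range(dop)]
    for k in keys:
        per_thread[hash(k) % dop][k] += 1
    result = {}
    for counts in per_thread:
        result.update(counts)
    return result


few_large = [i % 10 for i in range(1000)]  # 10 groups of 100 rows each
many_small = list(range(1000))             # every row is its own group

print(rows_through_exchange(few_large, 4, partial=True),
      rows_through_exchange(few_large, 4, partial=False))
print(rows_through_exchange(many_small, 4, partial=True),
      rows_through_exchange(many_small, 4, partial=False))
```
&lt;/PRE&gt;
&lt;P&gt;With ten large groups, the partial aggregate cuts the exchange traffic from 1000 rows to 20; with all-distinct keys it saves nothing and only adds work.&lt;/P&gt;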
&lt;P&gt;Let's see how the optimizer makes this choice.&amp;nbsp; Column B in our example has the IDENTITY property.&amp;nbsp; Although we have no real data, this property is sufficient to trick the optimizer into concluding that this column is mostly unique.&amp;nbsp; (Without a unique index, the optimizer cannot be certain that the column is indeed unique and must assume that it is not.)&amp;nbsp; Suppose we aggregate on this column:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;SELECT COUNT(*) FROM T GROUP BY B&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp; |--Parallelism(Gather Streams)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Compute Scalar(DEFINE:([Expr1004]=CONVERT_IMPLICIT(int,[Expr1007],0)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Hash Match(Aggregate, HASH:([T].[B]) DEFINE:([Expr1007]=COUNT(*)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([T].[B]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Clustered Index Scan(OBJECT:([T].[TA]))&lt;/P&gt;
&lt;P&gt;Notice that this query yields a normal single-level, albeit parallel, aggregate operator.&amp;nbsp; Now suppose we aggregate on column C, which does not have the IDENTITY property:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;SELECT COUNT(*) FROM T GROUP BY C&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp; |--Compute Scalar(DEFINE:([Expr1004]=CONVERT_IMPLICIT(int,[globalagg1006],0)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Gather Streams)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Stream Aggregate(GROUP BY:([T].[C]) DEFINE:([globalagg1006]=SUM([partialagg1005])))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Sort(ORDER BY:([T].[C] ASC))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([T].[C]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Hash Match(Partial Aggregate, HASH:([T].[C]), RESIDUAL:([T].[C] = [T].[C]) DEFINE:([partialagg1005]=COUNT(*)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Clustered Index Scan(OBJECT:([T].[TA]))&lt;/P&gt;
&lt;P&gt;This time we get a partial aggregate.&amp;nbsp; Also, observe that the partial aggregate is a hash aggregate while the global aggregate is a stream aggregate.&amp;nbsp; The optimizer is free to choose either physical aggregation operator (stream or hash) for the partial and global aggregates in a plan with partial aggregation.&amp;nbsp; The decision of which operator to use is cost based.&lt;/P&gt;
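&lt;P&gt;Read bottom-up, the plan above is a small pipeline, which the following Python sketch mimics.&amp;nbsp; It is a simplified model, not the engine's code: all function names are hypothetical, each list stands in for one thread's stream of rows, and Python's built-in hash stands in for the exchange's partitioning function.&lt;/P&gt;
&lt;PRE&gt;
```python
from collections import Counter
from itertools import groupby


def partial_hash_aggregate(rows):
    # Hash Match (Partial Aggregate): per-thread COUNT(*) keyed on C.
    return list(Counter(rows).items())


def repartition(partials_per_thread, dop):
    # Repartition Streams: route each (key, count) row by a hash of its key so
    # that all partial rows for one key meet on the same consumer thread.
    out = [[] for _ in range(dop)]
    for partials in partials_per_thread:
        for key, count in partials:
            out[hash(key) % dop].append((key, count))
    return out


def stream_global_aggregate(rows):
    # Sort + Stream Aggregate: order by key, then SUM adjacent partial counts.
    ordered = sorted(rows)
    return [(key, sum(count for _, count in group))
            for key, group in groupby(ordered, key=lambda r: r[0])]


def count_group_by(values, dop=4):
    scanned = [values[i::dop] for i in range(dop)]  # parallel scan stand-in
    partials = [partial_hash_aggregate(s) for s in scanned]
    routed = repartition(partials, dop)
    gathered = []  # Gather Streams
    for rows in routed:
        gathered.extend(stream_global_aggregate(rows))
    return dict(gathered)


print(count_group_by([i % 7 for i in range(1000)]))
```
&lt;/PRE&gt;
&lt;P&gt;The sort in this sketch is there only because a stream aggregate needs its input ordered on the grouping key; a hash-based global aggregate would not need it.&lt;/P&gt;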
&lt;P&gt;Finally, it is worth noting that, while a stream aggregate behaves identically whether it is computing a partial or global aggregate, a partial hash aggregate does differ slightly from a normal hash aggregate.&amp;nbsp; First, a partial hash aggregate requests only a fixed minimal memory grant as it presumes that it will be computing a relatively small number of groups.&amp;nbsp; Second, a partial hash aggregate never spills rows to tempdb.&amp;nbsp; If a partial hash aggregate runs out of memory, it simply stops aggregating and begins returning non-aggregated rows.&amp;nbsp; This behavior is safe since the global aggregate will always compute the correct final results regardless of what the partial aggregate does.&amp;nbsp; The partial aggregate is merely a performance optimization and the goal is to prevent it from stealing resources from the global aggregate.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=7152155" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/craigfr/archive/tags/Aggregation/default.aspx">Aggregation</category><category domain="http://blogs.msdn.com/craigfr/archive/tags/Parallelism/default.aspx">Parallelism</category></item><item><title>Parallel Query Execution Presentation</title><link>http://blogs.msdn.com/craigfr/archive/2007/04/17/parallel-query-execution-presentation.aspx</link><pubDate>Wed, 18 Apr 2007 01:47:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:2167013</guid><dc:creator>craigfr</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/craigfr/comments/2167013.aspx</comments><wfw:commentRss>http://blogs.msdn.com/craigfr/commentrss.aspx?PostID=2167013</wfw:commentRss><description>&lt;P&gt;For those of you readers who've been wondering whatever happened to me, I've been rather busy.&amp;nbsp; Among other activities, I've been writing a chapter for &lt;A class="" title="Kalen Delaney" 
href="http://sqlblog.com/blogs/kalen_delaney" mce_href="http://sqlblog.com/blogs/kalen_delaney"&gt;Kalen Delaney's&lt;/A&gt; upcoming fourth book in the &lt;A class="" title="Inside SQL Server 2005" href="http://www.insidesqlserver.com/index.html" mce_href="http://www.insidesqlserver.com/index.html"&gt;Inside SQL Server 2005 series&lt;/A&gt;: &lt;A class="" title="Inside Microsoft® SQL Server™ 2005: Query Tuning and Optimization" href="http://www.microsoft.com/MSPress/books/8565.aspx" mce_href="http://www.microsoft.com/MSPress/books/8565.aspx"&gt;Query Tuning and Optimization&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;I am hoping to resurrect my blog and to get things started, I'm posting a presentation that I recently delivered to the &lt;A class="" title="Pacific Northwest SQL Server User's Group" href="http://www.pnwsql.org/" mce_href="http://www.pnwsql.org/"&gt;Pacific Northwest SQL Server User's Group&lt;/A&gt;.&amp;nbsp; Several attendees asked whether they could have copies of this presentation, so here it is.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=2167013" width="1" height="1"&gt;</description><enclosure url="http://blogs.msdn.com/craigfr/attachment/2167013.ashx" length="1080926" type="application/pdf" /><category domain="http://blogs.msdn.com/craigfr/archive/tags/Parallelism/default.aspx">Parallelism</category></item><item><title>Parallel Hash Join</title><link>http://blogs.msdn.com/craigfr/archive/2006/11/16/parallel-hash-join.aspx</link><pubDate>Fri, 17 Nov 2006 07:08:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1091692</guid><dc:creator>craigfr</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/craigfr/comments/1091692.aspx</comments><wfw:commentRss>http://blogs.msdn.com/craigfr/commentrss.aspx?PostID=1091692</wfw:commentRss><description>&lt;P&gt;SQL Server uses one of two different strategies to parallelize a &lt;A class="" href="http://blogs.msdn.com/craigfr/archive/2006/08/10/687630.aspx" mce_href="http://blogs.msdn.com/craigfr/archive/2006/08/10/687630.aspx"&gt;hash join&lt;/A&gt;.&amp;nbsp; The more common strategy uses hash partitioning.&amp;nbsp; In some cases, we use broadcast partitioning; this strategy is often called a “broadcast hash join.”&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Hash Partitioning&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The more common strategy for parallelizing a hash join involves distributing the build rows (i.e., the rows from the first input) and the probe rows (i.e., the rows from the second input) among the individual hash join threads using hash partitioning.&amp;nbsp; If a build row and a probe row share the same key value (i.e., they will join), they are guaranteed to hash to the same hash join thread.&amp;nbsp; After the data has been hash partitioned among the threads, the hash join instances all run completely independently on their respective data sets.&amp;nbsp; The absence of any inter-thread dependencies ensures that this strategy scales extremely well as we increase the degree of parallelism (i.e., the number of threads).&lt;/P&gt;
&lt;P&gt;As with all of my parallelism examples, I am using a large table to induce the optimizer to choose a parallel plan.&amp;nbsp; If you try these examples, it may take a few minutes to create these tables.&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;create&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: blue"&gt;table&lt;/SPAN&gt; T1 &lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;a &lt;SPAN style="COLOR: blue"&gt;int&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;,&lt;/SPAN&gt; b &lt;SPAN style="COLOR: blue"&gt;int&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;,&lt;/SPAN&gt; x &lt;SPAN style="COLOR: blue"&gt;char&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;200&lt;SPAN style="COLOR: gray"&gt;))&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;set&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: blue"&gt;nocount&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;on&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;declare&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; @i &lt;SPAN style="COLOR: blue"&gt;int&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;set&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; @i &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; 0&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;while&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; @i &lt;SPAN style="COLOR: gray"&gt;&amp;lt;&lt;/SPAN&gt; 1000000&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;begin&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;insert&lt;/SPAN&gt; T1 &lt;SPAN style="COLOR: blue"&gt;values&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;@i&lt;SPAN style="COLOR: gray"&gt;,&lt;/SPAN&gt; @i&lt;SPAN style="COLOR: gray"&gt;,&lt;/SPAN&gt; @i&lt;SPAN style="COLOR: gray"&gt;)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;set&lt;/SPAN&gt; @i &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; @i &lt;SPAN style="COLOR: gray"&gt;+&lt;/SPAN&gt; 1&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;end&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;into&lt;/SPAN&gt; T2 &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T1&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;into&lt;/SPAN&gt; T3 &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T1&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T1 &lt;SPAN style="COLOR: gray"&gt;join&lt;/SPAN&gt; T2 &lt;SPAN style="COLOR: blue"&gt;on&lt;/SPAN&gt; T1&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;b &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; T2&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;a&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp; |--Parallelism(Gather Streams)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Hash Match(Inner Join, HASH:([T1].[b])=([T2].[a]), RESIDUAL:([T2].[a]=[T1].[b]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([T1].[b]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Table Scan(OBJECT:([T1]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([T2].[a]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Table Scan(OBJECT:([T2]))&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Note that, unlike the &lt;A class="" href="http://blogs.msdn.com/craigfr/archive/2006/11/08/parallel-nested-loops-join.aspx" mce_href="http://blogs.msdn.com/craigfr/archive/2006/11/08/parallel-nested-loops-join.aspx"&gt;parallel nested loops join&lt;/A&gt;, we have exchanges (i.e., parallelism operators) between both table scans (both the build and the probe inputs)&amp;nbsp;and the hash join.&amp;nbsp; These exchanges hash partition the data among the hash join threads.&lt;/P&gt;
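&lt;P&gt;The following Python sketch models what the two repartition exchanges and the per-thread hash joins do.&amp;nbsp; It is an illustration under simplifying assumptions rather than SQL Server's implementation: the function names are hypothetical, rows are plain dictionaries, and Python's built-in hash stands in for the partitioning function.&lt;/P&gt;
&lt;PRE&gt;
```python
def hash_partition(rows, key, dop):
    # Repartition Streams (Hash Partitioning): equal keys land on the same thread.
    parts = [[] for _ in range(dop)]
    for row in rows:
        parts[hash(row[key]) % dop].append(row)
    return parts


def hash_join(build, probe, build_key, probe_key):
    # Ordinary serial hash join: build a table on the first input, probe with the second.
    table = {}
    for b in build:
        table.setdefault(b[build_key], []).append(b)
    return [(b, p) for p in probe for b in table.get(p[probe_key], [])]


def parallel_hash_join(t1, t2, dop=4):
    build_parts = hash_partition(t1, "b", dop)  # partition T1 on T1.b
    probe_parts = hash_partition(t2, "a", dop)  # partition T2 on T2.a
    joined = []  # Gather Streams
    for build, probe in zip(build_parts, probe_parts):
        # Each pair is joined independently; no thread needs another thread's rows.
        joined.extend(hash_join(build, probe, "b", "a"))
    return joined


t1 = [{"a": i, "b": i} for i in range(1000)]
t2 = [{"a": i, "b": i} for i in range(1000)]
print(len(parallel_hash_join(t1, t2, dop=4)))
```
&lt;/PRE&gt;
&lt;P&gt;Because both inputs are partitioned with the same function on the join key, each thread's slice can be joined in isolation, and the union of the per-thread results is the full join.&lt;/P&gt;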
&lt;P&gt;&lt;STRONG&gt;Broadcast Partitioning&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Consider what will happen if we try to parallelize a hash join using hash partitioning, but we have only a small number of rows on the build side of the hash join.&amp;nbsp; If we have fewer rows than hash join threads, some threads might receive no rows at all.&amp;nbsp; In this case, those threads would have no work to do during the probe phase of the join and would remain idle.&amp;nbsp; Even if we have more rows than threads, due to the presence of duplicate key values and/or skew in the hash function, some threads might get many more rows than others.&lt;/P&gt;
&lt;P&gt;To eliminate the risk of skew, when the optimizer estimates that the number of build rows is relatively small, it may choose to broadcast these rows to all of the hash join threads.&amp;nbsp; Since all build rows are broadcast to all hash join threads, in a broadcast hash join, it does not matter where we send the probe rows.&amp;nbsp; Each probe row can be sent to any thread and, if it can join with any build rows, it will.&lt;/P&gt;
&lt;P&gt;Here is an example:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T1 &lt;SPAN style="COLOR: gray"&gt;join&lt;/SPAN&gt; T2 &lt;SPAN style="COLOR: blue"&gt;on&lt;/SPAN&gt; T1&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;b &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; T2&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;a &lt;SPAN style="COLOR: blue"&gt;where&lt;/SPAN&gt; T1&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;a &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; 0&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp; |--Parallelism(Gather Streams)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Hash Match(Inner Join, HASH:([T1].[b])=([T2].[a]), RESIDUAL:([T2].[a]=[T1].[b]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Distribute Streams, Broadcast Partitioning)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Table Scan(OBJECT:([T1]), WHERE:([T1].[a]=(0)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Table Scan(OBJECT:([T2]))&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Note that the exchange above the scan of T1 is now a broadcast exchange while we have completely eliminated the exchange above the scan of T2.&amp;nbsp; We do not need an exchange above T2 because the &lt;A class="" href="http://blogs.msdn.com/craigfr/archive/2006/11/01/parallel-scan.aspx" mce_href="http://blogs.msdn.com/craigfr/archive/2006/11/01/parallel-scan.aspx"&gt;parallel scan&lt;/A&gt; automatically distributes the pages and rows of T2 among the hash join threads.&amp;nbsp; This is similar to how the parallel scan distributed rows among nested loops join threads for the parallel nested loops join.&amp;nbsp; Similar to the parallel nested loops join, if we have a serial zone on the probe input of a broadcast hash join (e.g., due to a top operator), we may need a round robin exchange to redistribute the rows.&lt;/P&gt;
&lt;P&gt;So, if broadcast hash joins are so great (they do reduce the risk of skew problems), why don’t we use them in all cases?&amp;nbsp; The answer is that broadcast hash joins use more memory than their hash partitioned counterparts.&amp;nbsp; Since we send every build row to every hash join thread, if we double the number of threads, we double the amount of memory that we need.&amp;nbsp; With a hash partitioned parallel hash join, we need the same amount of memory regardless of the degree of parallelism.&lt;/P&gt;
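&lt;P&gt;A small Python model shows both halves of this trade-off: probe rows can go to any thread, but the build side is stored once per thread.&amp;nbsp; This is a sketch only, not the engine's code; broadcast_hash_join is a hypothetical name, and round-robin assignment of probe rows stands in for the parallel scan of T2.&lt;/P&gt;
&lt;PRE&gt;
```python
def broadcast_hash_join(build, probe, dop):
    # Distribute Streams (Broadcast): every thread builds its own full hash table.
    tables = []
    for _ in range(dop):
        table = {}
        for row in build:
            table.setdefault(row["b"], []).append(row)
        tables.append(table)

    # Probe rows may be sent to any thread; each thread can match any build row.
    joined = []
    for i, p in enumerate(probe):
        for b in tables[i % dop].get(p["a"], []):
            joined.append((b, p))

    # Total build rows held in memory across all threads.
    stored = sum(len(rows) for t in tables for rows in t.values())
    return joined, stored


build = [{"a": 0, "b": 0}]               # T1 filtered down to a single row
probe = [{"a": i} for i in range(100)]   # T2

for dop in (2, 4, 8):
    joined, stored = broadcast_hash_join(build, probe, dop)
    print(dop, len(joined), stored)
```
&lt;/PRE&gt;
&lt;P&gt;The join result is the same at every degree of parallelism, while the number of stored build rows grows in step with the thread count.&lt;/P&gt;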
&lt;P&gt;&lt;STRONG&gt;Bitmap Filtering&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Finally, suppose we have a fairly selective filter on the build input to a hash join:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T1 &lt;SPAN style="COLOR: gray"&gt;join&lt;/SPAN&gt; T2 &lt;SPAN style="COLOR: blue"&gt;on&lt;/SPAN&gt; T1&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;b &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; T2&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;a &lt;SPAN style="COLOR: blue"&gt;where&lt;/SPAN&gt; T1&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;a &lt;SPAN style="COLOR: gray"&gt;&amp;lt;&lt;/SPAN&gt; 100000&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp; |--Parallelism(Gather Streams)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Hash Match(Inner Join, HASH:([T1].[b])=([T2].[a]), RESIDUAL:([T2].[a]=[T1].[b]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Bitmap(HASH:([T1].[b]), DEFINE:([Bitmap1008]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([T1].[b]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Table Scan(OBJECT:([T1]), WHERE:([T1].[a]&amp;lt;(100000)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([T2].[a]), WHERE:(PROBE([Bitmap1008])=TRUE))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Table Scan(OBJECT:([T2]))&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;What’s the bitmap operator?&amp;nbsp; The predicate “T1.a &amp;lt; 100000” eliminates 90% of the build rows from T1.&amp;nbsp; It also indirectly eliminates 90% of the rows from T2 because they no longer join with rows from T1.&amp;nbsp; The bitmap operator provides an efficient way to apply the T1 filter directly to T2 without passing the rows all the way through the exchange to the join.&amp;nbsp; As its name suggests, it builds a bitmap.&amp;nbsp; Just like a hash join, we hash each row of T1 on the join key T1.b and set the corresponding bit in the bitmap.&amp;nbsp; Once the scan of T1 and the hash join build is complete, we transfer the bitmap to the exchange above&amp;nbsp;the scan of&amp;nbsp;T2 where we use it as a filter.&amp;nbsp; This time we hash each row of T2 on the join key T2.a and test the corresponding bit in the bitmap.&amp;nbsp; If the bit is set, the row may join and we pass it along to the hash join.&amp;nbsp; If the bit is not set, the row cannot join and we discard it.&amp;nbsp; For more information on bitmaps see &lt;A class="" href="http://blogs.msdn.com/sqlqueryprocessing/archive/2006/10/27/query-execution-bitmaps.aspx" mce_href="http://blogs.msdn.com/sqlqueryprocessing/archive/2006/10/27/query-execution-bitmaps.aspx"&gt;this post&lt;/A&gt; from the SQL Server Query Processing Team blog.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1091692" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/craigfr/archive/tags/Joins/default.aspx">Joins</category><category domain="http://blogs.msdn.com/craigfr/archive/tags/Parallelism/default.aspx">Parallelism</category></item><item><title>Parallel Nested Loops Join</title><link>http://blogs.msdn.com/craigfr/archive/2006/11/08/parallel-nested-loops-join.aspx</link><pubDate>Thu, 09 Nov 2006 06:04:00 GMT</pubDate><guid 
isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1042559</guid><dc:creator>craigfr</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/craigfr/comments/1042559.aspx</comments><wfw:commentRss>http://blogs.msdn.com/craigfr/commentrss.aspx?PostID=1042559</wfw:commentRss><description>&lt;P&gt;SQL Server parallelizes a &lt;A class="" href="http://blogs.msdn.com/craigfr/archive/2006/07/26/679319.aspx" mce_href="http://blogs.msdn.com/craigfr/archive/2006/07/26/679319.aspx"&gt;nested loops join&lt;/A&gt; by distributing the outer rows (i.e., the rows from the first input) randomly among the nested loops threads.&amp;nbsp; For example, if we have two threads running a nested loops join, we send about half of the rows to each thread.&amp;nbsp; Each thread then runs the inner side (i.e., the second input) of the loop join for its set of rows as if it were running serially.&amp;nbsp; That is, for each outer row assigned to it, the thread executes its inner input using that row as the source of any correlated parameters.&amp;nbsp; In this way, the threads can run independently.&amp;nbsp; SQL Server does not add exchanges to or parallelize the inner side of a nested loops join.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;A simple example&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Let’s consider a simple example.&amp;nbsp; I am using a large table to induce the optimizer to choose a parallel plan.&amp;nbsp; If you try these examples, it may take a few minutes to create these tables.&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;create&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: blue"&gt;table&lt;/SPAN&gt; T1 &lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;a &lt;SPAN style="COLOR: blue"&gt;int&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;,&lt;/SPAN&gt; b &lt;SPAN style="COLOR: blue"&gt;int&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;,&lt;/SPAN&gt; x &lt;SPAN style="COLOR: blue"&gt;char&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;200&lt;SPAN style="COLOR: gray"&gt;))&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;set&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: blue"&gt;nocount&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;on&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;declare&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; @i &lt;SPAN style="COLOR: blue"&gt;int&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;set&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; @i &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; 0&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;while&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; @i &lt;SPAN style="COLOR: gray"&gt;&amp;lt;&lt;/SPAN&gt; 1000000&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;begin&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;insert&lt;/SPAN&gt; T1 &lt;SPAN style="COLOR: blue"&gt;values&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;@i&lt;SPAN style="COLOR: gray"&gt;,&lt;/SPAN&gt; @i&lt;SPAN style="COLOR: gray"&gt;,&lt;/SPAN&gt; @i&lt;SPAN style="COLOR: gray"&gt;)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;set&lt;/SPAN&gt; @i &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; @i &lt;SPAN style="COLOR: gray"&gt;+&lt;/SPAN&gt; 1&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;end&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;into&lt;/SPAN&gt; T2 &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T1&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;into&lt;/SPAN&gt; T3 &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T1&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;create&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: blue"&gt;unique&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;clustered&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;index&lt;/SPAN&gt; T2a &lt;SPAN style="COLOR: blue"&gt;on&lt;/SPAN&gt; T2&lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;a&lt;SPAN style="COLOR: gray"&gt;)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;create&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: blue"&gt;unique&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;clustered&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;index&lt;/SPAN&gt; T3a &lt;SPAN style="COLOR: blue"&gt;on&lt;/SPAN&gt; T3&lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;a&lt;SPAN style="COLOR: gray"&gt;)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face="Times New Roman" size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T1 &lt;SPAN style="COLOR: gray"&gt;join&lt;/SPAN&gt; T2 &lt;SPAN style="COLOR: blue"&gt;on&lt;/SPAN&gt; T1&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;b &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; T2&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;a &lt;SPAN style="COLOR: blue"&gt;where&lt;/SPAN&gt; T1&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;a &lt;SPAN style="COLOR: gray"&gt;&amp;lt;&lt;/SPAN&gt; 100&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;
&lt;TABLE class="" border=0&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;Rows&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;Executes&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=""&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;100&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;1&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp; |--Parallelism(Gather Streams)&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;100&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;2&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Nested Loops(Inner Join, OUTER REFERENCES:([T1].[b], [Expr1007]) OPTIMIZED)&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;100&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;2&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Table Scan(OBJECT:([T1]), WHERE:([T1].[a]&amp;lt;(100)))&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;100&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;100&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class=""&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Clustered Index Seek(OBJECT:([T2].[T2a]), SEEK:([T2].[a]=[T1].[b]) ORDERED FORWARD)&lt;/FONT&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/P&gt;
&lt;P&gt;First note that there is only one exchange (i.e., parallelism operator) in this plan.&amp;nbsp; Since this exchange is at the root of the query plan, all of the operators in this plan (the nested loops join, the table scan, and the clustered index seek) execute in each thread.&lt;/P&gt;
&lt;P&gt;I deliberately did not create an index on T1.&amp;nbsp; The lack of an index forces the table scan and the use of a residual predicate to evaluate “T1.a &amp;lt; 100”.&amp;nbsp; Because there are one million rows in T1, the table scan is expensive and the optimizer chooses a parallel scan of T1.&lt;/P&gt;
&lt;P&gt;The scan of T1 is &lt;EM&gt;not&lt;/EM&gt; immediately below an exchange (i.e., a parallelism operator).&amp;nbsp; In fact, it is on the outer side of the nested loops join and it is the nested loops join that is below the exchange.&amp;nbsp; Nevertheless, because the scan is on the outer side of the join and because the join is below a start (i.e., a gather or redistribute) exchange, we use a parallel scan for T1.&lt;/P&gt;
&lt;P&gt;If you recall my post from last week, a &lt;A class="" href="http://blogs.msdn.com/craigfr/archive/2006/11/01/parallel-scan.aspx" mce_href="http://blogs.msdn.com/craigfr/archive/2006/11/01/parallel-scan.aspx"&gt;parallel scan&lt;/A&gt; assigns pages to threads dynamically.&amp;nbsp; Thus, the scan distributes the T1 rows among the threads.&amp;nbsp; It does not matter which rows it distributes to which threads.&lt;/P&gt;
&lt;P&gt;Since I ran this query with degree of parallelism (DOP) two, we see that there are two executes each for the table scan and join (which are in the same thread).&amp;nbsp; Moreover, the scan and join both return a total of 100 rows though we cannot tell from this output how many rows each thread returned.&amp;nbsp; (We could determine this information using statistics XML output.&amp;nbsp; I'll return to this question below.)&lt;/P&gt;
&lt;P&gt;Next, the join executes its inner side (in this case the index seek on T2) for each of the 100 outer rows.&amp;nbsp; Here is where things get a little tricky.&amp;nbsp; Although each of the two threads contains an instance of the index seek and even though the index seek is below the join which is below the exchange, the index seek is on the &lt;EM&gt;inner&lt;/EM&gt; side of the join and, thus, the seek does &lt;EM&gt;not&lt;/EM&gt; use a parallel scan.&amp;nbsp; Instead, the two seek instances execute independently of one another on two different outer rows and two different correlated parameters.&amp;nbsp; Just as in a serial plan, we see 100 executes of the index seek: one for each row on the outer side of the join.&amp;nbsp; No matter how complex the inner side of a nested loops join is, we always execute it as a serial plan just as in this simple example.&lt;/P&gt;
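&lt;P&gt;To make the execute counts concrete, here is a minimal single-threaded sketch in Python.&amp;nbsp; It is a toy model, not SQL Server's implementation: the function name, the dictionary standing in for the clustered index T2a, and the modulo assignment of outer rows to workers are all assumptions made for illustration.&amp;nbsp; It only shows that, however the outer rows are split, the inner side runs once per outer row.&lt;/P&gt;
&lt;PRE&gt;
```python
from collections import Counter


def parallel_nested_loops(outer_rows, inner_index, dop):
    """Toy model: each worker joins its share of the outer rows serially."""
    executes = Counter()  # inner-side executions per worker
    output = []
    for position, (a, b) in enumerate(outer_rows):
        # Assumption: any assignment of rows to workers is acceptable;
        # modulo is used here only because it is simple.
        worker = position % dop
        executes[worker] += 1            # one inner "seek" per outer row
        match = inner_index.get(b)       # b is the correlated parameter
        if match is not None:
            output.append(((a, b), match))
    return output, executes


# Stand-in for the unique clustered index T2a (key a, row (a, b)).
t2_index = {a: (a, a) for a in range(1000)}
# Stand-in for the 100 rows of T1 that survive the filter on T1.a.
outer = [(a, a) for a in range(100)]

rows, executes = parallel_nested_loops(outer, t2_index, dop=2)
print(len(rows), sum(executes.values()), dict(executes))
```
&lt;/PRE&gt;
&lt;P&gt;The total number of inner executions equals the number of outer rows regardless of the value chosen for dop, which matches the 100 executes reported for the index seek.&lt;/P&gt;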
&lt;P&gt;&lt;STRONG&gt;A more complex example&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;In the above example, SQL Server relies on the parallel scan to distribute rows uniformly among the threads.&amp;nbsp; In some cases, this is not possible.&amp;nbsp; In these cases, SQL Server may add a round robin exchange to distribute the rows.&amp;nbsp; (A round robin exchange sends each subsequent packet of rows to the next consumer thread in a fixed sequence.)&amp;nbsp; Here is one such example:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;from&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;select&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;top&lt;/SPAN&gt; 100 &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T1 &lt;SPAN style="COLOR: blue"&gt;order&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;by&lt;/SPAN&gt; a&lt;SPAN style="COLOR: gray"&gt;)&lt;/SPAN&gt; T1top&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;join&lt;/SPAN&gt; T2 &lt;SPAN style="COLOR: blue"&gt;on&lt;/SPAN&gt; T1top&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;b &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; T2&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;a&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp; |--Parallelism(Gather Streams)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Nested Loops(Inner Join, OUTER REFERENCES:([T1].[b], [Expr1007]) WITH UNORDERED PREFETCH)&lt;BR&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Distribute Streams, RoundRobin Partitioning)&lt;BR&gt;&lt;/STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Top(TOP EXPRESSION:((100)))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Parallelism(Gather Streams, ORDER BY:([T1].[a] ASC))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Sort(TOP 100, ORDER BY:([T1].[a] ASC))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Table Scan(OBJECT:([T1]))&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Clustered Index Seek(OBJECT:([T2].[T2a]), SEEK:([T2].[a]=[T1].[b]) ORDERED FORWARD)&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;The main difference between this plan and the plan from the original example is that this plan includes a top 100.&amp;nbsp; The top 100 can only be correctly evaluated in a serial plan thread.&amp;nbsp; (It cannot be split among multiple threads or we might end up with too few or too many rows.)&amp;nbsp; Thus, we must have a stop (e.g., a distribute streams) exchange above the top and we cannot use the parallel scan to distribute the rows among the join threads.&amp;nbsp; Instead, we parallelize the join by having this exchange use round robin partitioning to distribute the rows among the join threads.&lt;/P&gt;
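&lt;P&gt;Round robin partitioning can be sketched in a few lines of Python.&amp;nbsp; This is an illustration under stated assumptions, not the exchange's real code: the function name and the packet size of 10 rows are hypothetical, and the list of integers merely stands in for the 100 rows produced by the serial top.&lt;/P&gt;
&lt;PRE&gt;
```python
from itertools import cycle


def round_robin_exchange(rows, consumers, packet_size):
    """Send each successive packet of rows to the next consumer in a fixed sequence."""
    queues = [[] for _ in range(consumers)]
    targets = cycle(range(consumers))
    for start in range(0, len(rows), packet_size):
        packet = rows[start:start + packet_size]
        queues[next(targets)].extend(packet)
    return queues


# Stand-in for the 100 rows that the serial Top operator emits.
top_rows = list(range(100))
# packet_size=10 is a hypothetical value chosen for readability.
queues = round_robin_exchange(top_rows, consumers=2, packet_size=10)
print([len(q) for q in queues])
```
&lt;/PRE&gt;
&lt;P&gt;The assignment depends only on the order in which packets arrive, not on how busy each consumer is.&lt;/P&gt;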
&lt;P&gt;&lt;STRONG&gt;Miscellaneous issues&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The parallel scan has one major advantage over the round robin exchange.&amp;nbsp; A parallel scan automatically and dynamically balances the workload among the threads while a round robin exchange does not.&amp;nbsp; As I demonstrated last week, if we have a query where one thread is slower than the others, the parallel scan may help compensate.&lt;/P&gt;
&lt;P&gt;Both the parallel scan and a round robin exchange may fail to keep all join threads busy if there are too many threads and too few pages and/or rows to be processed.&amp;nbsp; Some threads may get no rows to process and end up idle.&amp;nbsp; This problem can be more pronounced with a parallel scan since it doles out multiple pages at one time to each thread while the exchange distributes one packet (equivalent to one page) of rows at one time.&lt;/P&gt;
&lt;P&gt;We can see this problem in the original (simple) example above by checking the statistics XML output:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RelOp&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;NodeId&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;1&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;PhysicalOp&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;Nested Loops&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;LogicalOp&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;Inner Join&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ...&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeInformation&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeCountersPerThread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;Thread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;2&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;ActualRows&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;0&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ... /&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeCountersPerThread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;Thread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;1&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;ActualRows&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;100&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ... /&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeCountersPerThread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;Thread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;0&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;ActualRows&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;0&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ... /&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;/&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeInformation&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;...&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;lt;/&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RelOp&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;All of the join’s output rows were processed by thread 1.&amp;nbsp; Why?&amp;nbsp; The table scan has a residual predicate “T1.a &amp;lt; 100”.&amp;nbsp; This predicate is true for the first 100 rows in the table and false for the remaining rows.&amp;nbsp; The (three) pages containing the first 100 rows are all assigned to the first thread.&lt;/P&gt;
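&lt;P&gt;The arithmetic behind this skew can be sketched in Python.&amp;nbsp; The numbers are assumptions, not measured values: about 35 rows per page (consistent with 100 rows spanning three pages), a hypothetical batch of 8 pages per request, and batches handed to workers strictly in turn, which simplifies the real on-demand behavior.&lt;/P&gt;
&lt;PRE&gt;
```python
ROWS_PER_PAGE = 35   # assumption: 100 rows then span three pages
BATCH_PAGES = 8      # hypothetical number of pages handed out per request


def qualifying_rows_per_worker(total_rows, qualifying_rows, dop):
    """Count qualifying rows per worker when page batches are handed out in turn."""
    total_pages = -(-total_rows // ROWS_PER_PAGE)  # ceiling division
    counts = [0] * dop
    for batch_no, first_page in enumerate(range(0, total_pages, BATCH_PAGES)):
        worker = batch_no % dop
        first_row = first_page * ROWS_PER_PAGE
        last_row = min((first_page + BATCH_PAGES) * ROWS_PER_PAGE, total_rows)
        # Only rows numbered below qualifying_rows pass the residual predicate;
        # the range is empty for every batch that starts past them.
        counts[worker] += len(range(first_row, min(last_row, qualifying_rows)))
    return counts


counts = qualifying_rows_per_worker(total_rows=1000000, qualifying_rows=100, dop=2)
print(counts)
```
&lt;/PRE&gt;
&lt;P&gt;Under these assumptions, the first batch already covers every qualifying row, so one worker produces all 100 rows and the other produces none.&lt;/P&gt;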
&lt;P&gt;In this example, this is not a big problem since the inner side of the join is fairly cheap and contributes a small percentage of the overall cost of the query (with the table scan itself representing a much larger percentage of the cost).&amp;nbsp; However, this problem could be more significant if the inner side of the query were more expensive.&amp;nbsp; The problem is especially notable with partitioned tables.&amp;nbsp; I will write about partitioned tables in a future post.&amp;nbsp; In the meantime, &lt;A class="" href="http://blogs.msdn.com/sqlcat/archive/2005/11/30/498415.aspx" mce_href="http://blogs.msdn.com/sqlcat/archive/2005/11/30/498415.aspx"&gt;this post&lt;/A&gt; from the SQL Server Development Customer Advisory Team blog illustrates the problem.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1042559" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/craigfr/archive/tags/Joins/default.aspx">Joins</category><category domain="http://blogs.msdn.com/craigfr/archive/tags/Parallelism/default.aspx">Parallelism</category></item><item><title>Parallel Scan</title><link>http://blogs.msdn.com/craigfr/archive/2006/11/01/parallel-scan.aspx</link><pubDate>Thu, 02 Nov 2006 07:16:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:928168</guid><dc:creator>craigfr</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/craigfr/comments/928168.aspx</comments><wfw:commentRss>http://blogs.msdn.com/craigfr/commentrss.aspx?PostID=928168</wfw:commentRss><description>&lt;P&gt;In this post, I’m going to take a look at how SQL Server parallelizes scans.&amp;nbsp; The scan operator is one of the few operators that is parallel “aware.”&amp;nbsp; Most operators neither need to know nor care whether they are executing in parallel; the scan is an exception.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;How does parallel scan work?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The threads that compose a parallel scan work together to scan all of the rows in a table.&amp;nbsp; There is no a priori assignment of rows or pages to a particular thread.&amp;nbsp; Instead, the storage engine dynamically hands out pages to threads.&amp;nbsp; A parallel page supplier coordinates access to the pages of the table.&amp;nbsp; The parallel page supplier ensures that each page is assigned to exactly one thread and, thus, is processed exactly once.&lt;/P&gt;
&lt;P&gt;At the beginning of a parallel scan, each thread requests a set of pages from the parallel page supplier.&amp;nbsp; The threads then begin processing these pages and returning rows.&amp;nbsp; When a thread finishes with its assigned set of pages, it requests the next set of pages from the parallel page supplier.&lt;/P&gt;
&lt;P&gt;This algorithm has a couple of advantages:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;It is independent of the number of threads.&amp;nbsp; We can add and remove threads from a parallel scan and it automatically adjusts.&amp;nbsp; If we double the number of threads, each thread processes (approximately) half as many pages.&amp;nbsp; And, if the I/O system can keep up, the scan runs twice as fast.&lt;/LI&gt;
&lt;LI&gt;It is resilient to skew or load imbalances.&amp;nbsp; If one thread runs slower than the other threads, that thread simply requests fewer pages while the other faster threads pick up the extra work.&amp;nbsp; The total execution time degrades smoothly.&amp;nbsp; (Compare this scenario to what would happen if we statically assigned pages to threads: the slow thread would dominate the total execution time.)&lt;/LI&gt;&lt;/OL&gt;
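&lt;P&gt;The request-and-process loop described above can be sketched with Python threads.&amp;nbsp; This is a minimal model, not the storage engine's code: the class name PageSupplier, its next_batch method, and the batch size of 8 pages are all hypothetical.&amp;nbsp; The point is only that a shared, locked cursor hands out every page exactly once no matter how many threads ask.&lt;/P&gt;
&lt;PRE&gt;
```python
import threading


class PageSupplier:
    """Hypothetical stand-in for the parallel page supplier."""

    def __init__(self, page_count, batch_size):
        self._next = 0
        self._page_count = page_count
        self._batch_size = batch_size
        self._lock = threading.Lock()

    def next_batch(self):
        # The lock guarantees that no two threads receive the same page.
        with self._lock:
            start = self._next
            end = min(start + self._batch_size, self._page_count)
            self._next = end
        return range(start, end)  # empty once the table is exhausted


def scan_worker(supplier, seen):
    while True:
        batch = supplier.next_batch()
        if len(batch) == 0:
            return
        seen.extend(batch)  # "process" the pages in this batch


def parallel_scan(page_count, dop, batch_size=8):
    supplier = PageSupplier(page_count, batch_size)
    per_thread = [[] for _ in range(dop)]
    threads = [
        threading.Thread(target=scan_worker, args=(supplier, pages))
        for pages in per_thread
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return per_thread


per_thread = parallel_scan(page_count=1000, dop=4)
all_pages = sorted(p for pages in per_thread for p in pages)
print(all_pages == list(range(1000)))
```
&lt;/PRE&gt;
&lt;P&gt;How many pages each thread ends up with varies from run to run, which is exactly the dynamic balancing described in the second point above; the union of the pages is always the whole table.&lt;/P&gt;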
&lt;P&gt;&lt;STRONG&gt;Example&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Let’s begin with a simple example.&amp;nbsp; To get a parallel plan, we’ll need a fairly big table; if the table is too small, the optimizer will conclude that a serial plan is perfectly adequate.&amp;nbsp; The following script creates a table with 1,000,000 rows and (thanks to the fixed length char(200) column) about 27,000 pages.&amp;nbsp; Warning: If you decide to run this example, it could take a few minutes to populate this table.&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;create&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: blue"&gt;table&lt;/SPAN&gt; T &lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;a &lt;SPAN style="COLOR: blue"&gt;int&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;,&lt;/SPAN&gt; x &lt;SPAN style="COLOR: blue"&gt;char&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;200&lt;SPAN style="COLOR: gray"&gt;))&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;set&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: blue"&gt;nocount&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;on&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;declare&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; @i &lt;SPAN style="COLOR: blue"&gt;int&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;set&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; @i &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; 0&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;while&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; @i &lt;SPAN style="COLOR: gray"&gt;&amp;lt;&lt;/SPAN&gt; 1000000&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;begin&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;insert&lt;/SPAN&gt; T &lt;SPAN style="COLOR: blue"&gt;values&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;@i&lt;SPAN style="COLOR: gray"&gt;,&lt;/SPAN&gt; @i&lt;SPAN style="COLOR: gray"&gt;)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;set&lt;/SPAN&gt; @i &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; @i &lt;SPAN style="COLOR: gray"&gt;+&lt;/SPAN&gt; 1&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;end&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;Now, for the simplest possible query:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp; |--Table Scan(OBJECT:([T]))&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;We get a serial plan!&amp;nbsp; Why don’t we get a parallel plan?&amp;nbsp; Parallelism is really about speeding up queries by applying more CPUs to the problem.&amp;nbsp; The cost of this query is dominated by the cost of reading pages from disk (which is mitigated by read ahead rather than parallelism) and returning rows to the client.&amp;nbsp; The query uses relatively few CPU cycles and, in fact, would probably run slower if we parallelized it.&lt;/P&gt;
&lt;P&gt;If we add a fairly selective predicate to the query, we can get a parallel plan:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T &lt;SPAN style="COLOR: blue"&gt;where&lt;/SPAN&gt; a &lt;SPAN style="COLOR: gray"&gt;&amp;lt;&lt;/SPAN&gt; 1000&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp; |--Parallelism(Gather Streams)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Table Scan(OBJECT:([T]), WHERE:([T].[a]&amp;lt;CONVERT_IMPLICIT(int,[@1],0)))&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;By running this query in parallel, we can distribute the cost of evaluating this predicate across multiple CPUs.&amp;nbsp; (In this case, the predicate is so cheap that it probably does not make much difference whether or not we run in parallel.)&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Load Balancing&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;As I mentioned above, the parallel scan algorithm dynamically allocates pages to threads.&amp;nbsp; We can see this in action.&amp;nbsp; Consider this query which returns every row of the table:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T &lt;SPAN style="COLOR: blue"&gt;where&lt;/SPAN&gt; a &lt;SPAN style="COLOR: gray"&gt;%&lt;/SPAN&gt; 2 &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; 0 &lt;SPAN style="COLOR: gray"&gt;or&lt;/SPAN&gt; a &lt;SPAN style="COLOR: gray"&gt;%&lt;/SPAN&gt; 2 &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; 1&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;The peculiar predicate confuses the optimizer, which underestimates the cardinality and generates a parallel plan:&lt;/P&gt;
&lt;P&gt;&lt;FONT face=Tahoma size=1&gt;&amp;nbsp; |--Parallelism(Gather Streams)&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |--Table Scan(OBJECT:([T]), WHERE:([T].[a]%(2)=(0) OR [T].[a]%(2)=(1)))&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;On SQL Server 2005, using “SET STATISTICS XML ON” we can see exactly how many rows each thread processes.&amp;nbsp; Here is an excerpt of the XML output on a two processor system:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RelOp&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;NodeId&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;2&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;PhysicalOp&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;Table Scan&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;LogicalOp&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;Table Scan&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ...&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeInformation&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeCountersPerThread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;Thread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;2&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;ActualRows&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;530432&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ... /&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeCountersPerThread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;Thread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;1&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;ActualRows&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;469568&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ... /&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeCountersPerThread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;Thread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;0&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;ActualRows&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;0&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ... /&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;/&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeInformation&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;...&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;lt;/&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RelOp&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;We can see that both threads (threads 1 and 2) processed approximately half of the rows.&amp;nbsp; (Thread 0 is the coordinator or main thread.&amp;nbsp; It only executes the portion of the query plan above the topmost exchange.&amp;nbsp; Thus, we do not expect it to process any rows for any parallel operators.)&lt;/P&gt;
&lt;P&gt;Now let’s repeat the experiment, but let’s run an expensive serial query at the same time.&amp;nbsp; This cross join query will run for a really long time (it needs to process one trillion rows) and use plenty of CPU cycles:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: fuchsia"&gt;min&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;T1&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;a &lt;SPAN style="COLOR: gray"&gt;+&lt;/SPAN&gt; T2&lt;SPAN style="COLOR: gray"&gt;.&lt;/SPAN&gt;a&lt;SPAN style="COLOR: gray"&gt;)&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T T1 &lt;SPAN style="COLOR: gray"&gt;cross&lt;/SPAN&gt; &lt;SPAN style="COLOR: gray"&gt;join&lt;/SPAN&gt; T T2 &lt;SPAN style="COLOR: blue"&gt;option&lt;/SPAN&gt;&lt;SPAN style="COLOR: gray"&gt;(&lt;/SPAN&gt;maxdop 1&lt;SPAN style="COLOR: gray"&gt;)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;This serial query will consume cycles from only one of the two CPUs.&amp;nbsp; While it is running, let’s run the other query again:&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;select&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: gray"&gt;*&lt;/SPAN&gt; &lt;SPAN style="COLOR: blue"&gt;from&lt;/SPAN&gt; T &lt;SPAN style="COLOR: blue"&gt;where&lt;/SPAN&gt; a &lt;SPAN style="COLOR: gray"&gt;%&lt;/SPAN&gt; 2 &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; 0 &lt;SPAN style="COLOR: gray"&gt;or&lt;/SPAN&gt; a &lt;SPAN style="COLOR: gray"&gt;%&lt;/SPAN&gt; 2 &lt;SPAN style="COLOR: gray"&gt;=&lt;/SPAN&gt; 1&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RelOp&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;NodeId&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;2&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;PhysicalOp&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;Table Scan&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;LogicalOp&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;Table Scan&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ...&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeInformation&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeCountersPerThread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;Thread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;1&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;ActualRows&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;924224&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ... /&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeCountersPerThread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;Thread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;2&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;ActualRows&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;75776&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ... /&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeCountersPerThread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt; &lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: red; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;Thread&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;=&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;"&lt;SPAN style="COLOR: blue"&gt;0&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; &lt;/SPAN&gt;&lt;SPAN style="COLOR: red"&gt;ActualRows&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;=&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt;0&lt;/SPAN&gt;"&lt;SPAN style="COLOR: blue"&gt; ... /&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&amp;lt;/&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RunTimeInformation&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;...&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;lt;/&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: maroon; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;RelOp&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"&gt;&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;This time thread 1 processed more than 90% of the rows, while thread 2, which was competing for CPU cycles with the serial query above, processed far fewer rows.&amp;nbsp; The parallel scan automatically balanced the work across the two threads.&amp;nbsp; Since thread 1 had more free cycles (it wasn’t competing with the serial plan), it requested and scanned more pages.&lt;/P&gt;
&lt;P&gt;If you try this experiment, don’t forget to kill the serial query when you are done!&amp;nbsp; Otherwise, it will continue to run and waste cycles for a really long time.&lt;/P&gt;
&lt;P&gt;The same load balancing that we just observed applies equally whether a thread is slowed down because of an external factor (such as the serial query in this example) or because of an internal factor.&amp;nbsp; For example, if it costs more to process some rows than others, we will see the same behavior.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=928168" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/craigfr/archive/tags/Scans+and+Seeks/default.aspx">Scans and Seeks</category><category domain="http://blogs.msdn.com/craigfr/archive/tags/Parallelism/default.aspx">Parallelism</category></item><item><title>The Parallelism Operator (aka Exchange)</title><link>http://blogs.msdn.com/craigfr/archive/2006/10/25/the-parallelism-operator-aka-exchange.aspx</link><pubDate>Wed, 25 Oct 2006 22:40:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:874419</guid><dc:creator>craigfr</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/craigfr/comments/874419.aspx</comments><wfw:commentRss>http://blogs.msdn.com/craigfr/commentrss.aspx?PostID=874419</wfw:commentRss><description>&lt;P&gt;As I noted in my &lt;A class="" title="Introduction to Parallel Query Execution" href="http://blogs.msdn.com/craigfr/archive/2006/10/11/introduction-to-parallel-query-execution.aspx" mce_href="http://blogs.msdn.com/craigfr/archive/2006/10/11/introduction-to-parallel-query-execution.aspx"&gt;Introduction to Parallel Query Execution post&lt;/A&gt;, the parallelism (or exchange) iterator actually implements parallelism in query execution.&amp;nbsp; The optimizer places exchanges at the boundaries between threads; the exchange moves the rows between the threads.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;The iterator that’s really two iterators&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The exchange iterator is unique in that it is really two iterators: a producer and a consumer.&amp;nbsp; We place the producer at the root of a query sub-tree (often called a branch).&amp;nbsp; The producer reads input rows from its sub-tree, assembles the rows into packets, and routes these packets to the appropriate consumers.&amp;nbsp; We place the consumer at the leaf of the next query sub-tree.&amp;nbsp; The consumer receives packets from its producer(s), removes the rows from these packets, and returns the rows to its parent iterator.&amp;nbsp; For example, a repartition exchange running at degree of parallelism (DOP) two consists of two producers and two consumers:&lt;/P&gt;
&lt;P&gt;&lt;IMG src="http://blogs.msdn.com/photos/craigfr/images/874319/original.aspx" border=0&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note that while the data flow between most iterators is pull based (an iterator calls GetRow on its child when it is ready for another row), the data flow between an exchange producer and consumer is push based.&amp;nbsp; That is, the producer fills a packet with rows and “pushes” it to the consumer.&amp;nbsp; This model allows the producer and consumer threads to execute independently.&amp;nbsp; (We do have flow control to prevent a fast producer from flooding a slow consumer with excessive packets.)&lt;/P&gt;
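&lt;P&gt;The push model with flow control can be sketched in Python as follows. This is only an illustration, not how SQL Server implements exchanges: a bounded queue stands in for flow control, and PACKET_SIZE, MAX_IN_FLIGHT, and the function names are assumptions made up for this example.&lt;/P&gt;
&lt;PRE&gt;
```python
import queue
import threading

PACKET_SIZE = 4      # rows per packet (illustrative value)
MAX_IN_FLIGHT = 2    # bounded queue: stands in for exchange flow control
END = None           # sentinel that marks the end of the stream

def producer(rows, channel):
    """Fill packets with rows and push each full packet to the consumer."""
    packet = []
    for row in rows:
        packet.append(row)
        if len(packet) == PACKET_SIZE:
            channel.put(packet)   # blocks while the consumer is too far behind
            packet = []
    if packet:
        channel.put(packet)       # push the final, partially filled packet
    channel.put(END)

def consumer(channel):
    """Receive packets, unpack them, and return rows one at a time."""
    while True:
        packet = channel.get()
        if packet is END:
            return
        for row in packet:
            yield row

def run_exchange(rows):
    channel = queue.Queue(maxsize=MAX_IN_FLIGHT)
    worker = threading.Thread(target=producer, args=(rows, channel))
    worker.start()
    result = list(consumer(channel))
    worker.join()
    return result

if __name__ == "__main__":
    print(run_exchange(range(10)))
```
&lt;/PRE&gt;
&lt;P&gt;The producer runs on its own thread and never waits for the consumer to ask for a row; it only stalls when the bounded queue is full, which is the role flow control plays in the description above.&lt;/P&gt;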
&lt;P&gt;&lt;STRONG&gt;How many different types of exchanges are there?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Exchanges can be classified in three different ways.&lt;/P&gt;
&lt;P&gt;First, we can classify exchanges based on the number of producer and/or consumer threads:&lt;/P&gt;
&lt;P&gt;
&lt;TABLE class="" border=1&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class=""&gt;Type&lt;/TH&gt;
&lt;TH class=""&gt;# producer threads&lt;/TH&gt;
&lt;TH class=""&gt;# consumer threads&lt;/TH&gt;&lt;/TR&gt;&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class=""&gt;Gather Streams&lt;/TD&gt;
&lt;TD class=""&gt;DOP&lt;/TD&gt;
&lt;TD class=""&gt;1&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;Repartition Streams&lt;/TD&gt;
&lt;TD class=""&gt;DOP&lt;/TD&gt;
&lt;TD class=""&gt;DOP&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;Distribute Streams&lt;/TD&gt;
&lt;TD class=""&gt;1&lt;/TD&gt;
&lt;TD class=""&gt;DOP&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/P&gt;
&lt;P&gt;A gather streams exchange is often called a “start parallelism” exchange since the operators above it run serially while the operators below it run in parallel.&amp;nbsp; The root exchange in any parallel plan is always a gather exchange since the results of any query plan must ultimately be funneled back to the single connection thread to be returned to the client.&amp;nbsp; A distribute streams exchange is often called a “stop parallelism” exchange.&amp;nbsp; It is the opposite of a gather streams exchange: the operators above a distribute streams exchange run in parallel while the operators below it run serially.&lt;/P&gt;
&lt;P&gt;Second, we can&amp;nbsp;classify exchanges based on how they route rows from the producer to the consumer.&amp;nbsp; We refer to this property as the “partitioning type” of the exchange.&amp;nbsp; Partitioning type only makes sense for a repartition or a distribute streams exchange since there is only one way to route rows in a gather exchange: to the single consumer thread.&amp;nbsp; SQL Server supports the following partitioning types:&lt;/P&gt;
&lt;P&gt;
&lt;TABLE class="" border=1&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class=""&gt;Partitioning Type&lt;/TH&gt;
&lt;TH class=""&gt;Description&lt;/TH&gt;&lt;/TR&gt;&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class=""&gt;Broadcast&lt;/TD&gt;
&lt;TD class=""&gt;Send all rows to all consumer threads.&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;Round Robin&lt;/TD&gt;
&lt;TD class=""&gt;Send each packet of rows to the next consumer thread in sequence.&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;Hash&lt;/TD&gt;
&lt;TD class=""&gt;Determine where to send each row by evaluating a hash function on one or more columns in the row.&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;Range&lt;/TD&gt;
&lt;TD class=""&gt;Determine where to send each row by evaluating a range function on one column in the row.&amp;nbsp; (A range function splits the total possible set of values into a set of contiguous ranges.&amp;nbsp; This partition type is rare and is used only by certain parallel index build plans.)&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class=""&gt;Demand&lt;/TD&gt;
&lt;TD class=""&gt;Send the next row to the next consumer that asks.&amp;nbsp; This partition type is the only type of exchange that uses a pull rather than a push model for data flow and is used only&amp;nbsp;in query plans with partitioned tables.&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/P&gt;
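&lt;P&gt;The push-based partitioning types in the table above can be sketched as simple routing functions. This is an illustrative Python sketch under stated assumptions: each consumer is modeled as a list, Python's built-in hash stands in for whatever hash function the server actually uses, and the range boundaries are supplied by the caller. Demand partitioning is omitted because it is pull based.&lt;/P&gt;
&lt;PRE&gt;
```python
import bisect

def broadcast(rows, n):
    """Send all rows to all n consumers."""
    rows = list(rows)
    return [list(rows) for _ in range(n)]

def round_robin(packets, n):
    """Send each packet to the next consumer in sequence."""
    out = [[] for _ in range(n)]
    for i, packet in enumerate(packets):
        out[i % n].append(packet)
    return out

def hash_partition(rows, n, key):
    """Pick the consumer for each row by hashing the partitioning column(s)."""
    out = [[] for _ in range(n)]
    for row in rows:
        out[hash(key(row)) % n].append(row)
    return out

def range_partition(rows, boundaries, key):
    """Pick the consumer by locating the key in a sorted list of range boundaries."""
    out = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        out[bisect.bisect_right(boundaries, key(row))].append(row)
    return out

if __name__ == "__main__":
    print(hash_partition([1, 2, 3, 4], 2, key=lambda r: r))
    print(range_partition([5, 15, 25, 10], [10, 20], key=lambda r: r))
```
&lt;/PRE&gt;
&lt;P&gt;Note that hash and range partitioning send all rows with the same key value to the same consumer, while broadcast and round robin make no such promise.&lt;/P&gt;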
&lt;P&gt;Third, we can classify exchanges as&amp;nbsp;merging (or order preserving) and non-merging (or non-order preserving).&amp;nbsp; The consumer in a merging exchange ensures that rows from multiple producers are returned in a sorted order.&amp;nbsp; (The rows must already be in this sorted order at the producer; the merging exchange does not actually sort.)&amp;nbsp; A merging exchange only makes sense for a gather or a repartition streams exchange; with a distribute streams exchange, there is only one producer and, thus, only one stream of rows and nothing to merge at each consumer.&lt;/P&gt;
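&lt;P&gt;A merging consumer behaves much like a k-way merge over streams that are already sorted. The following Python sketch uses the standard library's heapq.merge to show the idea; the function name merging_consumer is made up for this example, and the sketch says nothing about how SQL Server actually implements the merge.&lt;/P&gt;
&lt;PRE&gt;
```python
import heapq

def merging_consumer(producer_streams, key=None):
    """Interleave already-sorted producer streams into one sorted stream.

    Like a merging exchange, this does not sort: if an input stream is not
    already in order, the output will not be in order either.
    """
    return list(heapq.merge(*producer_streams, key=key))

if __name__ == "__main__":
    print(merging_consumer([[1, 4, 7], [2, 3, 9]]))
```
&lt;/PRE&gt;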
&lt;P&gt;&lt;STRONG&gt;Showplan&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;SQL Server includes all of the above properties in showplan (graphical, text, and XML).&lt;/P&gt;
&lt;P&gt;Speaking of showplan, in graphical showplan you can also tell at a glance which operators are running in parallel (i.e., which operators are between a start exchange and a stop exchange) by looking for a little parallelism symbol on the operator icons:&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;IMG src="http://blogs.msdn.com/photos/craigfr/images/874321/original.aspx" border=0&gt;&lt;/P&gt;
&lt;P&gt;In my next post about parallelism, I’ll begin to explore some parallel query plans and demonstrate the different types of exchanges in action.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=874419" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/craigfr/archive/tags/Parallelism/default.aspx">Parallelism</category></item><item><title>Introduction to Parallel Query Execution</title><link>http://blogs.msdn.com/craigfr/archive/2006/10/11/introduction-to-parallel-query-execution.aspx</link><pubDate>Wed, 11 Oct 2006 21:11:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:817279</guid><dc:creator>craigfr</dc:creator><slash:comments>5</slash:comments><comments>http://blogs.msdn.com/craigfr/comments/817279.aspx</comments><wfw:commentRss>http://blogs.msdn.com/craigfr/commentrss.aspx?PostID=817279</wfw:commentRss><description>&lt;P&gt;SQL Server has the ability to execute queries using multiple CPUs simultaneously.&amp;nbsp; We refer to this capability as parallel query execution.&amp;nbsp; Parallel query execution can be used to reduce the response time of (i.e., speed up) a large query.&amp;nbsp; It can also be used to run a bigger query (one that processes more data) in about the same amount of time as a smaller query (i.e., scale up) by increasing the number of CPUs used in processing the query.&amp;nbsp; For most large queries, SQL Server generally scales linearly or nearly linearly.&amp;nbsp; For speed up, this means that if we double the number of CPUs, we see the response time drop in half.&amp;nbsp; For scale up, it means that if we double the number of CPUs and the size of the query, we see the same response time.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;When is parallelism useful?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;As I noted above, parallelism can be used to reduce the response time of a single query.&amp;nbsp; However, parallelism comes with a cost: it increases the overhead associated with executing a query.&amp;nbsp; While this overhead is relatively small, it does make parallelism inappropriate for small queries (especially for OLTP queries) where the overhead would dominate the total execution time.&amp;nbsp; Moreover, while SQL Server does generally scale linearly, if we compare the same query running serially (i.e., without parallelism and on a single CPU) and in parallel on two CPUs, we will typically find that the parallel execution time is more than half of the serial execution time.&amp;nbsp; Again, this effect is due to the parallelism overhead.&lt;/P&gt;
&lt;P&gt;Parallelism is primarily useful on servers running a relatively small number of concurrent queries.&amp;nbsp; On this type of server, parallelism can enable a small set of queries to keep many CPUs busy.&amp;nbsp; On servers running many concurrent queries (such as an OLTP system), we do not need parallelism to keep the CPUs busy; the mere fact that we have so many queries to execute can keep the CPUs busy.&amp;nbsp; Running these queries in parallel would just add overhead that would reduce the overall throughput of the system.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;How does SQL Server parallelize queries?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;SQL Server parallelizes queries by horizontally partitioning the input data into approximately equal-sized sets, assigning one set to each CPU, and then performing the same operation (e.g., aggregate, join, etc.) on each set.&amp;nbsp; For example, suppose that we want to use two CPUs to execute a hash aggregate that happens to be grouping on an integer column.&amp;nbsp; We create two threads (one for each CPU).&amp;nbsp; Each thread executes the same hash aggregate operator.&amp;nbsp; We might partition the input data by sending rows where the group by column is odd to one thread and rows where the group by column is even to the other thread.&amp;nbsp; As long as all rows that belong to one group are processed by one hash aggregate operator and one thread, we get the correct result.&lt;/P&gt;
&lt;P&gt;&lt;IMG src="http://blogs.msdn.com/photos/craigfr/images/817261/original.aspx" border=0&gt;&lt;/P&gt;
&lt;P&gt;This method of parallel query execution is simple and scales well.&amp;nbsp; In the above example, both hash aggregate threads execute independently.&amp;nbsp; The two threads do not need to communicate or coordinate their work in any way.&amp;nbsp; If we want to increase the degree of parallelism (DOP), we can simply add more threads and adjust our partitioning function.&amp;nbsp; (In practice, we use a hash function to distribute rows for a hash aggregate.&amp;nbsp; The hash function handles any data type, any number of group by columns, and any number of threads.)&lt;/P&gt;
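&lt;P&gt;To see why the odd/even split preserves correctness, here is a rough sketch that mimics the partitioning in plain T-SQL.&amp;nbsp; The table T (a int NOT NULL, b int) is hypothetical and exists only for illustration, and the rewrite merely imitates the logic of the split; it does not make SQL Server run anything in parallel and is not how the exchange is implemented.&amp;nbsp; Each branch of the UNION ALL stands in for one thread's share of the rows:&lt;/P&gt;
&lt;PRE&gt;-- Assumption: a hypothetical table T (a int NOT NULL, b int).

-- What the serial hash aggregate computes:
SELECT a, COUNT(*) AS cnt FROM T GROUP BY a;

-- The same groups and counts, computed as two independent aggregates
-- over disjoint inputs.  Every row of a given group has the same value
-- of a, so it lands in exactly one branch.
SELECT a, COUNT(*) AS cnt FROM T WHERE a % 2 = 0 GROUP BY a   -- "even" thread
UNION ALL
SELECT a, COUNT(*) AS cnt FROM T WHERE a % 2 != 0 GROUP BY a; -- "odd" thread&lt;/PRE&gt;
&lt;P&gt;Because no group spans both branches, the two result sets can simply be concatenated; no second aggregation step is needed to combine them.&lt;/P&gt;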
&lt;P&gt;The actual partitioning and movement of data between threads is handled by the parallelism (or exchange) iterator.&amp;nbsp; Although it is unique in many respects, the parallelism iterator implements the same interfaces as any other iterator.&amp;nbsp; Most of the other iterators do not need to be aware that they are executing in parallel.&amp;nbsp; We simply place appropriate parallelism iterators in the plan and it runs in parallel.&lt;/P&gt;
&lt;P&gt;Note that this method of parallelism is not the same as “pipeline” parallelism where multiple unrelated operators run concurrently in different threads.&amp;nbsp; Although SQL Server frequently places different operators in different threads, the primary reason for doing so is to allow repartitioning of the data as it flows from one operator to the next.&amp;nbsp; With pipeline parallelism, the degree of parallelism and the total number of threads would be limited to the number of operators.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;IMG src="http://blogs.msdn.com/photos/craigfr/images/817262/original.aspx" border=0 mce_src="http://blogs.msdn.com/photos/craigfr/images/817262/original.aspx"&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Who decides whether to parallelize a query?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The query optimizer decides whether we should execute a query in parallel.&amp;nbsp; This decision, like most others, is cost based.&amp;nbsp; A complex and expensive query that processes many rows is more likely to result in a parallel plan than a simple query that processes very few rows.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Who decides the degree of parallelism (DOP)?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The DOP is not part of the cached compiled plan and may change with each execution.&amp;nbsp; We decide the DOP at the start of execution by considering the number of CPUs on the server, the “max degree of parallelism” and “max worker threads” sp_configure settings (only visible if the “show advanced options” setting is on), and the query MAXDOP hint if any.&amp;nbsp; In short, we choose a DOP that maximizes parallelism while ensuring that we do not run out of worker threads.&amp;nbsp; If you choose MAXDOP 1, we remove all parallelism iterators and run the query as a serial plan using a single thread.&lt;/P&gt;
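&lt;P&gt;As a quick illustration of the settings mentioned above, the following sketch shows how you might inspect them and how the MAXDOP hint is written.&amp;nbsp; The table T is hypothetical; the sp_configure option names are the ones quoted above:&lt;/P&gt;
&lt;PRE&gt;-- Make the advanced options visible to sp_configure.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Passing only the option name reports the current values
-- without changing them.
EXEC sp_configure 'max degree of parallelism';
EXEC sp_configure 'max worker threads';

-- Request a serial plan for a single query.
-- Assumption: T (a int, b int) is a hypothetical table.
SELECT a, COUNT(*) AS cnt
FROM T
GROUP BY a
OPTION (MAXDOP 1);&lt;/PRE&gt;
&lt;P&gt;The hint applies only to the statement that carries it, so it is a convenient way to compare the serial and parallel behavior of one query without touching the server-wide setting.&lt;/P&gt;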
&lt;P&gt;Note that the number of threads used by a parallel query may exceed the DOP.&amp;nbsp; If you check sys.sysprocesses while running a parallel query, you may see more threads than the DOP.&amp;nbsp; As I noted above, if we need to repartition data between two operators, we place them in different threads.&amp;nbsp; The DOP determines the number of threads per operator, not the total number of threads per query plan.&amp;nbsp; In SQL Server 2000, if the DOP was less than the number of CPUs, the extra threads could use the extra CPUs, effectively defeating the MAXDOP settings.&amp;nbsp; In SQL Server 2005, when we run a query with a given DOP, we also limit the number of schedulers to the selected DOP.&amp;nbsp; That is, all threads used by the query are assigned to the same set of DOP schedulers and the query uses only DOP CPUs regardless of the total number of threads.&lt;/P&gt;</description><category domain="http://blogs.msdn.com/craigfr/archive/tags/Parallelism/default.aspx">Parallelism</category></item></channel></rss>