<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>PLINQ and Int32.MaxValue</title><link>http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx</link><description>In both .NET 4 and .NET 4.5, PLINQ supports enumerables with up to Int32.MaxValue elements.&amp;#160; Beyond that limit, PLINQ will throw an overflow exception.&amp;#160; LINQ to Objects itself has this limitation with certain query operators (such as the indexed</description><dc:language>en-US</dc:language><generator>Telligent Evolution Platform Developer Build (Build: 5.6.50428.7875)</generator><item><title>re: PLINQ and Int32.MaxValue</title><link>http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx#10375677</link><pubDate>Fri, 07 Dec 2012 22:16:56 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10375677</guid><dc:creator>Stephen Toub - MSFT</dc:creator><description>&lt;p&gt;@Caleb Bell: You could if you wanted to, though you still probably wouldn&amp;#39;t end up using YBE &amp;quot;on its own&amp;quot;, given that it does &amp;quot;yield return source.Current&amp;quot; without having called MoveNext() directly, which means it&amp;#39;s relying on the caller to have already done that.&lt;/p&gt;
</description></item><item><title>re: PLINQ and Int32.MaxValue</title><link>http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx#10375634</link><pubDate>Fri, 07 Dec 2012 18:42:31 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10375634</guid><dc:creator>Caleb Bell</dc:creator><description>&lt;p&gt;This might be a nitpick (or I might be overlooking something): Would it not be better to move the &amp;quot;batchSize - 1&amp;quot; down into the YieldBatchElements&amp;lt;T&amp;gt;() method? &amp;nbsp;That way, YBE could be used on its own with the correct number of elements yielded.&lt;/p&gt;
</description></item><item><title>re: PLINQ and Int32.MaxValue</title><link>http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx#10370243</link><pubDate>Tue, 20 Nov 2012 15:50:13 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10370243</guid><dc:creator>Stephen Toub - MSFT</dc:creator><description>&lt;p&gt;@raboof.com: Thank you for your comments. &amp;nbsp;I&amp;#39;m well aware that there can be advantages to the implementations in MoreLINQ, IX, etc., advantages I actually already alluded to in my previous reply. &amp;nbsp;There are also disadvantages, and for this particular situation, I&amp;#39;m suggesting that for the kinds of queries folks have run into with this PLINQ limit, the buffering hasn&amp;#39;t been necessary and the cost of the buffering has been prohibitive. &amp;nbsp;(I also don&amp;#39;t understand your comment about creating batches of batches, as I don&amp;#39;t see how that&amp;#39;s significantly different from just using a smaller batch size to begin with.) &amp;nbsp;&lt;/p&gt;
&lt;p&gt;Just to put things in perspective, on my machine, the first MoveNext call on an enumerator from such a large Batch call takes upwards of 30 seconds just to retrieve the first element, because all of the elements need to be enumerated and buffered before the first can be returned (and that&amp;#39;s if I didn&amp;#39;t OOM first from the large, contiguous buffer being allocated). &amp;nbsp;&lt;/p&gt;
&lt;p&gt;In any event, this comment stream has devolved into a discussion of implementing a Batch extension method, which wasn&amp;#39;t the point of this post; the point of the post was that there&amp;#39;s a limitation which a user can work around, and I provided one possible way to do so. &amp;nbsp;If folks would prefer to do so in another way, that&amp;#39;s perfectly fine.&lt;/p&gt;
</description></item><item><title>re: PLINQ and Int32.MaxValue</title><link>http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx#10370109</link><pubDate>Tue, 20 Nov 2012 09:07:30 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10370109</guid><dc:creator>raboof.com</dc:creator><description>&lt;p&gt;@Stephen: The reason Batch in MoreLINQ buffers is for correctness. The batch your implementation yields is not restart-able and has the side-effect of potentially producing wrong results or breaking an existing query downstream when it is batched. For example, &amp;nbsp;you would expect the following:&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;var query = &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;from b in Enumerable.Range(1, 20).Batch(3)&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;select b.Count() + &amp;quot;: &amp;quot; + string.Join(&amp;quot;, &amp;quot;, b);&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;Array.ForEach(query.ToArray(), Console.WriteLine);&lt;/p&gt;
&lt;p&gt;to print:&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;3: 1, 2, 3&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;3: 4, 5, 6&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;3: 7, 8, 9&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;3: 10, 11, 12&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;3: 13, 14, 15&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;3: 16, 17, 18&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;2: 19, 20&lt;/p&gt;
&lt;p&gt;Instead, it prints:&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;3: 3, 4, 5&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;3: 8, 9, 10&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;3: 13, 14, 15&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;3: 18, 19, 20&lt;/p&gt;
&lt;p&gt;To get it to work right, you&amp;#39;d have to buffer per batch like this:&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;var query = &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;from b in Enumerable.Range(1, 20).Batch(3)&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;select b.ToArray() into b // buffer&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;select b.Count() + &amp;quot;: &amp;quot; + string.Join(&amp;quot;, &amp;quot;, b);&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;Array.ForEach(query.ToArray(), Console.WriteLine);&lt;/p&gt;
&lt;p&gt;Then again, if you just call ToArray on the batches, like this:&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;var query = &lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;from b in Enumerable.Range(1, 20).Batch(3).ToArray()&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;select b.ToArray() into b&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;select b.Count() + &amp;quot;: &amp;quot; + string.Join(&amp;quot;, &amp;quot;, b);&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;Array.ForEach(query.ToArray(), Console.WriteLine);&lt;/p&gt;
&lt;p&gt;you get another, even more bizarre output:&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp;1: 20&lt;/p&gt;
&lt;p&gt;With MoreLINQ&amp;#39;s Batch, all of the above produce the same results. While buffering has its downsides at the scale you&amp;#39;re discussing, correctness becomes increasingly important as your LINQ queries get large and you need to be able to reason about them as you compose them, often without executing them.&lt;/p&gt;
&lt;p&gt;I understand your motivation for your implementation but it needs a big warning/disclaimer as the user needs to be more cautious when combining it with different operators.&lt;/p&gt;
&lt;p&gt;I think if I had to work around the same 2GB limitation, I&amp;#39;d start by creating batches of batches with MoreLINQ&amp;#39;s Batch. :) The cost is then reduced significantly, to buffering only the innermost batch (ideally sized just small enough to stay off the large object heap), and only during a single iteration.&lt;/p&gt;
</description></item><item><title>re: PLINQ and Int32.MaxValue</title><link>http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx#10369540</link><pubDate>Sat, 17 Nov 2012 15:52:20 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10369540</guid><dc:creator>Robert Schroeder</dc:creator><description>&lt;p&gt;You could also look at the recently open sourced IX Extensions -- Extensions to Enumerable from the RX team.&lt;/p&gt;
</description></item><item><title>re: PLINQ and Int32.MaxValue</title><link>http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx#10369534</link><pubDate>Sat, 17 Nov 2012 14:53:45 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10369534</guid><dc:creator>xor88</dc:creator><description>&lt;p&gt;I think that is a valid approach. One needs to be careful not to introduce a sequential bottleneck doing this. Amdahl&amp;#39;s Law applies.&lt;/p&gt;
</description></item><item><title>re: PLINQ and Int32.MaxValue</title><link>http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx#10369457</link><pubDate>Sat, 17 Nov 2012 02:23:58 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10369457</guid><dc:creator>Stephen Toub - MSFT</dc:creator><description>&lt;p&gt;@James: While that&amp;#39;s true and a good reference, folks should keep in mind that the implementations differ in a very important way: the MoreLinq solution buffers each batch into an array before yielding any elements. That has some advantages in some cases, but for this purpose I explicitly opted away from doing that, as otherwise here you&amp;#39;d be buffering Int32.MaxValue elements, and beyond the delays/pauses in processing that would cause, it would also be much more likely to hit out of memory conditions due to inability to find contiguous memory of 2GB or greater for the array allocations. &lt;/p&gt;
</description></item><item><title>re: PLINQ and Int32.MaxValue</title><link>http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx#10369453</link><pubDate>Sat, 17 Nov 2012 02:10:33 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10369453</guid><dc:creator>James Manning</dc:creator><description>&lt;p&gt;FWIW, Batch *does* exist in the _awesome_ MoreLinq project by Jon Skeet and Atif Aziz.&lt;/p&gt;
&lt;p&gt;&lt;a rel="nofollow" target="_new" href="http://code.google.com/p/morelinq/source/browse/MoreLinq/Batch.cs"&gt;code.google.com/.../Batch.cs&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;And it looks like Aziz is working on updating the NuGet build for it as well!&lt;/p&gt;
&lt;p&gt;&lt;a rel="nofollow" target="_new" href="http://nuget.org/packages/morelinq"&gt;nuget.org/.../morelinq&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;There&amp;#39;s a ton of other useful things in that project - the *By methods (DistinctBy, MinBy, MaxBy, etc) are incredibly nice to have. &amp;nbsp;Pipe lets me write simple preprocess logic &amp;#39;inline&amp;#39;, etc.&lt;/p&gt;
&lt;p&gt;Know it, learn it, love it. :)&lt;/p&gt;
</description></item></channel></rss>