<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Parallel Scalability Isn’t Child’s Play, Part 2: Amdahl’s Law vs. Gunther’s Law</title><link>http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx</link><description>Part 1 of this series of blog entries discussed results from simulating the performance of a massively parallel SIMD application on several alternative multi-core architectures. These results were reported by researchers at Sandia Labs and publicized</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Developer Division Performance Engineering blog : Parallel Scalability Isn???t Child???s Play</title><link>http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx#9575060</link><pubDate>Wed, 29 Apr 2009 08:31:23 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9575060</guid><dc:creator>Developer Division Performance Engineering blog : Parallel Scalability Isn???t Child???s Play</dc:creator><description>&lt;p&gt;PingBack from &lt;a rel="nofollow" target="_new" href="http://blogs.msdn.com/ddperf/archive/2009/03/16/parallel-scalability-isn-t-child-s-play.aspx"&gt;http://blogs.msdn.com/ddperf/archive/2009/03/16/parallel-scalability-isn-t-child-s-play.aspx&lt;/a&gt;&lt;/p&gt;
</description></item><item><title>re: Parallel Scalability Isn’t Child’s Play, Part 2: Amdahl’s Law vs. Gunther’s Law</title><link>http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx#9576239</link><pubDate>Wed, 29 Apr 2009 21:25:38 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9576239</guid><dc:creator>Sunny Egbo</dc:creator><description>&lt;p&gt;Parallelism exists everywhere: instruction level, memory level, loop level, task level etc. Hence, parallelism has been with us all along for quite some time now (hardware engineers have been quietly busy for the last several decades solving them for us). The major difference now is that software engineers are being asked to step up to the plate due to the fact that the brick walls created by memory speed and power have now forced CPU architects to go multi-core in a major way (2x Frequency bumps every eighteen months are no longer practical).&lt;/p&gt;
&lt;p&gt;In practice, the theoretical treatment of computing by Amdahl, Gunther or Gustafson can obfuscate the opportunities inherent in parallel computing. These treatments often part ways from reality by glossing over a number of important points in the use of parallelism for general purpose computing, including:&lt;/p&gt;
&lt;p&gt;1. Many user tasks are non-monolithic; they can be solved in a distributed fashion; background tasks (e.g. virus scans) in a way that improves user experiences. &lt;/p&gt;
&lt;p&gt;2. Some algorithms have inefficient sequential solutions but surprisingly efficient parallel solutions. [The fact that many problems that have very poor sequential algorithms are easier to parallelize should be comforting to fans of algorithms.] For example, many applications require matrix multiplication, which turns out to be easily parallelizable. Although the best sequential algorithm for matrix multiplication has a time complexity of O(n2.376), a straight-forward parallel solution has an asymptotic time complexity of O(log n) using n2.376 processor. In other words, we can readily find a parallel solution for matrix multiplication that improves its runtime as more and more processor cores become available.&lt;/p&gt;
&lt;p&gt;3. Some poor sequential algorithms can be (easily) parallelized to execute in less time than their sequential solutions. For example, bubble-sort takes (n2+n)/2 time units when programmed to run sequentially; but it takes (n2 +6n)/4 time units when programmed to run in parallel in two processors. Thus, the parallel solution to bubble sort runs better in a multi-core environment when the size of the list is greater than four elements. Parallel bubble-sort takes even less time when many more processors are available. Hence, at some point, the benefit of the simpler parallelization of some solutions can outweigh the benefit of an equivalent complex but efficient sequential algorithm&lt;/p&gt;
&lt;p&gt;In summary, it is where the rubber meets the road, in practice, that matters.&lt;/p&gt;</description></item><item><title>Parallel Scalability Isn’t Child’s Play, Part 3: The Problem with Fine-Grained Parallelism</title><link>http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx#9712661</link><pubDate>Tue, 09 Jun 2009 07:57:08 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9712661</guid><dc:creator>Developer Division Performance Engineering blog</dc:creator><description>&lt;p&gt;In the last blog entry in this series , I introduced the model for parallel program scalability proposed&lt;/p&gt;</description></item><item><title>Parallel Scalability Isn’t Child’s Play, Part 3: The Problem with Fine-Grained Parallelism</title><link>http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx#9718400</link><pubDate>Tue, 09 Jun 2009 23:32:48 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9718400</guid><dc:creator>Developer Division Performance Engineering blog</dc:creator><description>&lt;p&gt;In the last blog entry in this series , I introduced the model for parallel program scalability proposed&lt;/p&gt;
</description></item></channel></rss>