<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Developer Division Performance Engineering blog : Parallel programming</title><link>http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx</link><description>Tags: Parallel programming</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Parallel Scalability Isn’t Child’s Play, Part 3: The Problem with Fine-Grained Parallelism</title><link>http://blogs.msdn.com/ddperf/archive/2009/06/09/parallel-scalability-isn-t-child-s-play-part-3-the-problem-with-fine-grained-parallelism.aspx</link><pubDate>Tue, 09 Jun 2009 23:17:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9718379</guid><dc:creator>MarkBFriedman</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/ddperf/comments/9718379.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ddperf/commentrss.aspx?PostID=9718379</wfw:commentRss><description>&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT face=Calibri&gt;&lt;A title="Part 2 in this series" href="http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx"&gt;&lt;FONT size=3&gt;In the last blog entry in this series&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3&gt;, I introduced the model for parallel program scalability proposed by Neil Gunther, which I praised for being a realistic antidote to more optimistic, but better known, formulas. Gunther’s model adds a new parameter to the more familiar Amdahl’s law. 
The additional parameter&lt;I style="mso-bidi-font-style: normal"&gt; k&lt;/I&gt;, representing &lt;I style="mso-bidi-font-style: normal"&gt;coherence&lt;/I&gt;-related delays, enables Gunther’s formula to model behavior where the performance of a parallel program can actually degrade at higher and higher levels of parallelization.&amp;nbsp;&lt;/FONT&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Although I don’t know that the coherence delay factor that Gunther’s formula adds fully addresses the range and depth of the performance issues surrounding fine-grained parallelism, it is certainly one of the key factors Gunther’s law expresses that earlier formulations do not.&lt;/FONT&gt;&lt;/P&gt;&lt;/FONT&gt;
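&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;For reference, Gunther’s formula is usually written as shown below. Here N is the number of processors or threads, C(N) is the relative capacity (speedup) at N, and &lt;I style="mso-bidi-font-style: normal"&gt;k&lt;/I&gt; is the coherence delay parameter discussed above. The symbol sigma for the serial contention fraction is an assumption made for this sketch; the previous entry may use a different letter.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```latex
\[
C(N) = \frac{N}{1 + \sigma\,(N - 1) + k\,N\,(N - 1)}
\]
```
&lt;/PRE&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;When &lt;I style="mso-bidi-font-style: normal"&gt;k&lt;/I&gt; is zero, the expression reduces to Amdahl’s law. When &lt;I style="mso-bidi-font-style: normal"&gt;k&lt;/I&gt; is positive, the quadratic term eventually dominates the denominator, which is why the model can show throughput declining as N grows.&lt;/FONT&gt;&lt;/P&gt;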
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Developers experienced in building parallel programs recognize that Gunther’s formula echoes an inconvenient truth, namely, that the task of achieving performance gains using parallel programming techniques is often quite arduous. For example, in a recent blog entry entitled “&lt;/FONT&gt;&lt;A href="http://software.intel.com/en-us/articles/when-to-say-no-to-parallelism/"&gt;&lt;FONT size=3 face=Calibri&gt;When to Say No to Parallelism&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt;,” Sanjiv Shah, a colleague at Intel, expressed similar sentiments. One very good piece of advice Sanjiv gives is that you should not even be thinking about parallelism until you have an efficient single-threaded version of your program debugged and running. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Let’s continue, for a moment, in the same vein as “&lt;/FONT&gt;&lt;A href="http://software.intel.com/en-us/articles/when-to-say-no-to-parallelism/"&gt;&lt;FONT size=3 face=Calibri&gt;When to Say No to Parallelism&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt;.” Let’s look at the major sources of coherence-related delays in various kinds of parallel programs, how and why they occur, and what, if anything, can be done about them. Ultimately, I will try to tie this discussion into one about tools, especially some great new tools in Visual Studio Team System (see &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/hshafi/archive/2009/05/18/visual-studio-2010-beta-1-parallel-performance-tools.aspx"&gt;&lt;FONT size=3 face=Calibri&gt;Hazim Shafi’s blog&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt; for details) that help you understand contention in your multi-threaded apps. When you use these new tools to gather and analyze the thread contention data for your app, it helps when you understand some of the common patterns to look for. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;The first aspect of Gunther’s coherence delays that we will look at is the bare &lt;I style="mso-bidi-font-style: normal"&gt;minimum&lt;/I&gt; additional cost that a parallel program running multiple threads must pay, compared to running the same program single-threaded. To simplify matters, we will look at the best possible prospect for parallel programming, an algorithm that is both &lt;I style="mso-bidi-font-style: normal"&gt;embarrassingly parallel&lt;/I&gt; and easy to partition into roughly equal-sized subprogram chunks. There are two basic costs to consider: one that is paid up front for initialization of the parallel runtime environment, and one that is paid incrementally each time one of the parallel tasks executes. It is also worth noting that these are unavoidable costs. I will lump both costs into an &lt;I style="mso-bidi-font-style: normal"&gt;overhead&lt;/I&gt; category associated with Gunther’s coherence delay factor &lt;I style="mso-bidi-font-style: normal"&gt;k&lt;/I&gt;. The embarrassingly parallel programs we will consider&amp;nbsp;here will incur these minimum processor overhead penalties when they are transformed to execute in parallel.&lt;/FONT&gt;&lt;/P&gt;
&lt;H3 style="MARGIN: 10pt 0in 0pt"&gt;&lt;FONT color=#4f81bd size=3 face=Cambria&gt;Fine-grained parallelism.&lt;/FONT&gt;&lt;/H3&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;To frame this part of the discussion, let’s also consider the characterization of workloads, according to their amenability to parallelization, as either &lt;I style="mso-bidi-font-style: normal"&gt;fine-grained&lt;/I&gt; or &lt;I style="mso-bidi-font-style: normal"&gt;coarse-grained&lt;/I&gt;. This distinction implicitly recognizes the impact of coherence delay factors on scalability. With fine-grained parallelism, the overhead of setting up the parallel runtime and executing the tasks in parallel can easily exceed the benefits. By definition, the initialization overhead of spinning up multiple threads and dispatching them is not nearly so big an issue when the program lends itself to coarse-grained parallelism. Plus, when you are looking at a very long running process with many opportunities to exploit parallelism, it is important to understand that you should only have to incur the setup cost once. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Coarse-grained parallelism occurs when each parallel worker thread is assigned to computing a relatively long running function:&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal mce_keep="true"&gt;&lt;IMG style="WIDTH: 338px; HEIGHT: 419px" title="coarse-grained parallelism" alt="coarse-grained parallelism" src="http://5l3vgw.bay.livefilestore.com/y1pa53nIchv7Sk5zdNifva6i3kLD1Eu2MsrdAIOayN3r2OUL9XBpRy72kD_wGhKM1uYeb9TjLCGfyiz9E3hg5bqtg/Coarse-grained%20parallelism%20Fork-Join%20flowchart.jpg" width=338 height=419 mce_src="http://5l3vgw.bay.livefilestore.com/y1pa53nIchv7Sk5zdNifva6i3kLD1Eu2MsrdAIOayN3r2OUL9XBpRy72kD_wGhKM1uYeb9TjLCGfyiz9E3hg5bqtg/Coarse-grained%20parallelism%20Fork-Join%20flowchart.jpg"&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal mce_keep="true"&gt;&lt;STRONG&gt;Figure 5. Coarse-grained parallelism.&lt;/STRONG&gt;&lt;/P&gt;&lt;?xml:namespace prefix = o /&gt;&lt;o:wrapblock&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;while fine-grained parallelism looks more like this:&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;IMG style="WIDTH: 338px; HEIGHT: 300px" title="fine-grained parallelism" alt="fine-grained parallelism" src="http://5l3vgw.bay.livefilestore.com/y1pdnte4FSJ18igb7DEZZ2veO2esTg2BykZz6FavZ-37bbyLejUIQlFBc4mboJUbzN3jUp3uq-FV8MA2RGunfMe8Q/Fine-grained%20parallelism%20Fork-Join%20flowchart.jpg" width=338 height=300 mce_src="http://5l3vgw.bay.livefilestore.com/y1pdnte4FSJ18igb7DEZZ2veO2esTg2BykZz6FavZ-37bbyLejUIQlFBc4mboJUbzN3jUp3uq-FV8MA2RGunfMe8Q/Fine-grained%20parallelism%20Fork-Join%20flowchart.jpg"&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;STRONG&gt;Figure 6. Fine-grained parallelism.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;The difference, of course, lies in how long, relatively speaking, the worker thread processing the unit of work executes.&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;The rationale for the distinction between fine-grained and coarse-grained parallelism is its significance to performance. We identify as coarse-grained those parallel algorithms that execute in the worker thread long enough to recover the costs of setting up and running the parallel environment. The benefits of running such programs in parallel are much easier to realize. On the other hand, the one foolproof way to identify fine-grained parallelism is to find embarrassingly parallel programs with very short execution time spans for each parallel task. When executing fine-grained parallel programs, there is a very high risk of slowing down the performance of the program, instead of improving it. (If this sounds like a bit of circular reasoning, it most surely is.) &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Now let’s drill into these costs. In the .NET Framework, setting up a parallel execution environment is usually associated with the ThreadPool object. (If you are not very familiar with how to set up and use a ThreadPool in .NET, you might want to first read &lt;/FONT&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/0ka9477y.aspx"&gt;&lt;FONT color=#0000ff size=3 face=Calibri&gt;this&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt; bit of documentation, which shows some simple C# examples. If you really want to understand all the ins and outs of the .NET ThreadPool, you should consider picking up a copy of Joe Duffy’s very thorough and authoritative book, &lt;/FONT&gt;&lt;A href="http://www.amazon.com/Concurrent-Programming-Windows-Microsoft-Development/dp/032143482X/ref=sr_1_1?ie=UTF8&amp;amp;s=books&amp;amp;qid=1241140965&amp;amp;sr=1-1"&gt;&lt;FONT color=#0000ff size=3 face=Calibri&gt;Concurrent Programming in Windows&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt;.) &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal mce_keep="true"&gt;&lt;BR style="mso-ignore: vglayout" clear=all&gt;&lt;FONT size=3 face=Calibri&gt;The thing about using the ThreadPool object in .NET is that you don’t need to write a lot of code on your own to get it up and running. With very little coding effort, you can be running a parallel program. In the .NET Framework, there are newer programming constructs in the &lt;/FONT&gt;&lt;A href="http://msdn.microsoft.com/en-us/concurrency/default.aspx"&gt;&lt;FONT color=#0000ff size=3 face=Calibri&gt;Task Parallel Library&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt; that are designed to make it easier for developers to express parallelism and exploit multi-core and many-core computers. But underneath the covers, many of the new TPL constructs are still using the CLR ThreadPool. So whatever overheads are associated with this older, less elegant approach still apply to any of the newer parallel programming constructs.&lt;/FONT&gt;&lt;/P&gt;
&lt;H3 style="MARGIN: 10pt 0in 0pt"&gt;&lt;FONT color=#4f81bd size=3 face=Cambria&gt;The ThreadPool in .NET&lt;/FONT&gt;&lt;/H3&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;The basic pattern for using the ThreadPool is to call the QueueUserWorkItem() method on the ThreadPool class, passing a delegate that performs the actual processing of the request, along with a set of parameters that are wrapped into a single Object. The parameters delineate the unit of work that is being requested. Typically, you also pass to the delegate a &lt;A title="ManualResetEvent reference" href="http://msdn.microsoft.com/en-us/library/system.threading.manualresetevent.aspx" mce_href="http://msdn.microsoft.com/en-us/library/system.threading.manualresetevent.aspx"&gt;ManualResetEvent&lt;/A&gt;, which is a kind of &lt;EM&gt;synchronization primitive&lt;/EM&gt;. The delegate uses this event to signal the Main task that the worker thread has finished processing the Work Item request.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Structurally, you have to write:&lt;/FONT&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;DIV style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;a class that wraps the Work Item parameter list,&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;a C# delegate that runs in the worker thread to process the Work Item request,&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;a dispatcher routine that queues work items for the thread pool delegate to process, and&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;finally, an event handler to get control when the worker thread completes.&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;&lt;/OL&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;For example, to implement the Fork/Join pattern in .NET using the built-in ThreadPool Object, create (1) a wrapper for the parameter list:&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; COLOR: blue; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;public&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: blue"&gt;class&lt;/SPAN&gt; &lt;SPAN style="COLOR: #2b91af"&gt;WorkerThreadParms&lt;/SPAN&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;{&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; TEXT-INDENT: 0.5in; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; COLOR: blue; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;private&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: #2b91af"&gt;ManualResetEvent&lt;/SPAN&gt; _thisevent;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; TEXT-INDENT: 0.5in; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;…&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; COLOR: blue; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;public&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: #2b91af"&gt;ManualResetEvent&lt;/SPAN&gt; thisevent&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;{&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; TEXT-INDENT: 0.5in; MARGIN: 0in 0in 0pt 0.5in; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; COLOR: blue; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;get&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt; { &lt;SPAN style="COLOR: blue"&gt;return&lt;/SPAN&gt; _thisevent; }&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; TEXT-INDENT: 0.5in; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;}&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; TEXT-INDENT: 0.5in; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;…&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;public&lt;/SPAN&gt; WorkerThreadParms(&lt;SPAN style="COLOR: #2b91af"&gt;ManualResetEvent&lt;/SPAN&gt; signalwhendone, …,)&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;{&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;_thisevent = signalwhendone;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-tab-count: 2"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;…&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;}&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: auto auto 0pt; mso-add-space: auto; mso-layout-grid-align: none" class=MsoNormalCxSpMiddle&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;}&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;SPAN style="mso-no-proof: yes"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;(2) the ThreadPool delegate that unwraps the parameter list, performs some work, then signals the main thread when it is done:&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; COLOR: blue; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;public&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt; &lt;SPAN style="COLOR: blue"&gt;void&lt;/SPAN&gt; ThreadPoolDelegate(&lt;SPAN style="COLOR: #2b91af"&gt;Object&lt;/SPAN&gt; parm)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;{&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; TEXT-INDENT: 0.5in; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; COLOR: #2b91af; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;WorkerThreadParms&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt; p = (&lt;SPAN style="COLOR: #2b91af"&gt;WorkerThreadParms&lt;/SPAN&gt;) parm;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN style="COLOR: #2b91af"&gt;ManualResetEvent&lt;/SPAN&gt; signal = p.thisevent;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;… &lt;SPAN style="COLOR: #00b050"&gt;//Do some work here&lt;/SPAN&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;…&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="TEXT-INDENT: 0.5in; MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;signal.Set(); &lt;SPAN style="COLOR: #00b050"&gt;//Signal main task when done&lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;}&lt;/SPAN&gt;&lt;/P&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;(3) a (simple) dispatcher loop:&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; COLOR: #2b91af; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;ManualResetEvent&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;[] thisevent = &lt;SPAN style="COLOR: blue"&gt;new&lt;/SPAN&gt; &lt;SPAN style="COLOR: #2b91af"&gt;ManualResetEvent&lt;/SPAN&gt;[tasks];&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; COLOR: blue; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;for&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt; (&lt;SPAN style="COLOR: blue"&gt;int&lt;/SPAN&gt; j = 0; j &amp;lt; tasks; j++)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;{&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; TEXT-INDENT: 0.5in; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;thisevent[j] = &lt;SPAN style="COLOR: blue"&gt;new&lt;/SPAN&gt; &lt;SPAN style="COLOR: #2b91af"&gt;ManualResetEvent&lt;/SPAN&gt;(&lt;SPAN style="COLOR: blue"&gt;false&lt;/SPAN&gt;);&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN style="COLOR: #2b91af"&gt;WorkerThreadParms&lt;/SPAN&gt; p = &lt;SPAN style="COLOR: blue"&gt;new&lt;/SPAN&gt; &lt;SPAN style="COLOR: #2b91af"&gt;WorkerThreadParms&lt;/SPAN&gt;(thisevent[j],…);&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN style="COLOR: #2b91af"&gt;WorkerThread&lt;/SPAN&gt; worker = &lt;SPAN style="COLOR: blue"&gt;new&lt;/SPAN&gt; &lt;SPAN style="COLOR: #2b91af"&gt;WorkerThread&lt;/SPAN&gt;();&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN style="COLOR: #2b91af"&gt;ThreadPool&lt;/SPAN&gt;.QueueUserWorkItem (worker.ThreadPoolDelegate,p);&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0in 0pt; mso-layout-grid-align: none" class=MsoNormal&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;}&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;SPAN style="mso-no-proof: yes"&gt;&lt;o:p&gt;&lt;FONT size=3 face=Calibri&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;SPAN style="mso-no-proof: yes"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;followed by (4) a WaitForMultipleObjects in the Main thread:&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #2b91af; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;ManualResetEvent&lt;/SPAN&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; FONT-SIZE: 9pt; mso-no-proof: yes"&gt;.WaitAll(thisevent);&lt;/SPAN&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-SIZE: 9pt"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;FONT face=Calibri&gt;&lt;FONT size=3 face=Calibri&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;You can see there isn’t very much code for you to write to get up and running in parallel and start taking advantage of all those multi-core processor resources. You should take Sanjiv Shah’s advice and write and debug the delegate code you intend to parallelize by testing it in single-threaded mode first. Once the single-threaded program is debugged and optimized, you can easily restructure the program to run in parallel by encapsulating that processing in your worker thread delegate following this simple recipe.&lt;/P&gt;
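&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;The four pieces above can be assembled into one small, self-contained program. What follows is only a sketch under stated assumptions, not the code behind this post: the names WorkItemParms, ForkJoinSketch, and SumOfSquares are hypothetical, and the sum-of-squares workload is a stand-in for whatever your delegate really does. Note that WaitAll accepts at most 64 handles, so the sketch assumes the number of tasks is between 1 and 64.&lt;/P&gt;
&lt;PRE&gt;
```csharp
using System;
using System.Threading;

// (1) Wrapper for the work item parameters (hypothetical names throughout).
public class WorkItemParms
{
    public readonly ManualResetEvent Done; // signaled by the worker when finished
    public readonly int Start;             // first index of this chunk (inclusive)
    public readonly int End;               // end of this chunk (exclusive)
    public long Result;                    // written by the worker, read after WaitAll

    public WorkItemParms(ManualResetEvent done, int start, int end)
    {
        Done = done;
        Start = start;
        End = end;
    }
}

public static class ForkJoinSketch
{
    // (2) The delegate: unwrap the parameters, do the work, signal completion.
    private static void Worker(object state)
    {
        WorkItemParms p = (WorkItemParms)state;
        long sum = 0;
        for (int i = p.Start; i != p.End; i++)
        {
            sum += (long)i * i;
        }
        p.Result = sum;
        p.Done.Set();
    }

    // (3) The dispatcher loop, followed by (4) the wait in the calling thread.
    // Assumes n is not negative and tasks is between 1 and 64.
    public static long SumOfSquares(int n, int tasks)
    {
        ManualResetEvent[] events = new ManualResetEvent[tasks];
        WorkItemParms[] parms = new WorkItemParms[tasks];
        int chunk = n / tasks;
        for (int j = 0; j != tasks; j++)
        {
            int start = j * chunk;
            // The last task also picks up any remainder.
            int end = (j == tasks - 1) ? n : start + chunk;
            events[j] = new ManualResetEvent(false);
            parms[j] = new WorkItemParms(events[j], start, end);
            ThreadPool.QueueUserWorkItem(Worker, parms[j]);
        }

        WaitHandle.WaitAll(events); // block until every worker has signaled

        long total = 0;
        for (int j = 0; j != tasks; j++)
        {
            total += parms[j].Result;
            events[j].Close();
        }
        return total;
    }
}
```
&lt;/PRE&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;Calling ForkJoinSketch.SumOfSquares(n, tasks) with a small n is a handy way to see the problem this post is about: every work item still pays for an event object, a queue operation, and a signal, however little arithmetic it performs.&lt;/P&gt;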
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;Even though there isn’t very much code for you to write, there are overhead considerations that you need to be aware of. Let’s look at the simplest case where the program is embarrassingly parallel (as discussed in the previous blog entry). This allows us to ignore complications introduced by serialization and locking (which we will get to later). These overheads include &lt;/P&gt;
&lt;P mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P style="TEXT-INDENT: -0.25in; MARGIN: 0in 0in 0pt 0.5in; mso-list: l0 level1 lfo1" class=MsoListParagraphCxSpFirst&gt;&lt;SPAN style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT size=3 face=Calibri&gt;(1)&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT size=3 face=Calibri&gt;work done in the Common Language Runtime (CLR) on your behalf to spin up the worker threads in the ThreadPool initially, and &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="TEXT-INDENT: -0.25in; MARGIN: 0in 0in 10pt 0.5in; mso-list: l0 level1 lfo1" class=MsoListParagraphCxSpLast&gt;&lt;SPAN style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT size=3 face=Calibri&gt;(2)&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT size=3 face=Calibri&gt;the additional cost when running the parallel program associated with queuing work items, dispatching them to a thread pool thread to process, and signaling the main dispatcher thread when done. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;In the case of the initialization work, this is something that only needs to be done once. The other costs accrue each time you need to dispatch a worker thread. For parallelism to be an effective scaling strategy, it is necessary to amortize this overhead cost over the life of the parallel threads. The worker thread delegates need to execute long enough that there is a benefit to executing in parallel. And this is for a best case for parallelism where the underlying program is both embarrassingly parallel and easy to partition into roughly equivalent work requests. &lt;/FONT&gt;&lt;/P&gt;
&lt;H3 style="MARGIN: 10pt 0in 0pt"&gt;&lt;FONT color=#4f81bd size=3 face=Cambria&gt;A Parallel.For example.&lt;/FONT&gt;&lt;/H3&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Using the new Parallel.For construct in the Task Parallel Library, by the way, there is even less code you have to write. All you need to code the Parallel.For is write is the worker thread delegate, because the TPL library handles the remaining boilerplate tasks. However, the underlying overhead considerations are almost identical.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;The challenge of speeding up programs that have fine-grained parallelism grows in tandem with making it easier for developers to write concurrent programs. Take a look at the following example of parallelization in C# using the new Parallel.For construct taken verbatim from a Microsoft white paper entitled “&lt;A title="Taking Parallelism Mainstream" href="http://download.microsoft.com/download/D/5/9/D597F62A-0BEE-4CE7-965B-099D705CFAEE/Taking%20Parallelism%20Mainstream%20Microsoft%20February%202009.docx" mce_href="http://download.microsoft.com/download/D/5/9/D597F62A-0BEE-4CE7-965B-099D705CFAEE/Taking%20Parallelism%20Mainstream%20Microsoft%20February%202009.docx"&gt;Taking Parallelism Mainstream&lt;/A&gt;.” The white paper describes some of the new language facilities in the Task Parallel Library that make it easier for developers to write parallel programs. These new language facilities include Parallel For loops, Parallel LINQ, Parallel Invoke, Futures and Continuations, and messaging using asynchronous agents. To the extent that these parallel computing initiatives succeed, they will generate the need for better performance analysis tools to understand the performance of concurrent programs because not everyone who implements these new constructs is going to see impressive speed-up of his or her applications. Some developers will even see the retrograde performance predicted by Gunther’s scalability formula.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Here’s the C# program that illustrates one of the new parallel programming constructs:&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;IEnumerable&amp;lt;StockQuote&amp;gt; Query(IEnumerable&amp;lt;StockQuote&amp;gt; stocks) {&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt" class=MsoNormal&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;var results = new ConcurrentBag&amp;lt;StockQuote&amp;gt;();&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt" class=MsoNormal&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Parallel.ForEach (stocks, stock =&amp;gt; {&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt" class=MsoNormal&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;if (stock.MarketCap &amp;gt; 100000000000.0 &amp;amp;&amp;amp;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt" class=MsoNormal&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;stock.ChangePct &amp;lt; 0.025 &amp;amp;&amp;amp;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt" class=MsoNormal&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;stock.Volume&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&amp;gt; 1.05 * stock.VolumeMavg3M) {&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt" class=MsoNormal&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;results.Add(stock);&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt" class=MsoNormal&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;}&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt" class=MsoNormal&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;});&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt" class=MsoNormal&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;return results;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 0pt" class=MsoNormal&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; COLOR: #4f81bd; FONT-SIZE: 9pt; mso-themecolor: accent1"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;}&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;The example uses a &lt;I style="mso-bidi-font-style: normal"&gt;Parallel.ForEach&lt;/I&gt; enumeration loop, along with one of the new concurrent collection classes, the &lt;I style="mso-bidi-font-style: normal"&gt;ConcurrentBag&lt;/I&gt;, to evaluate stock prices based on some set of selection criteria. The problem, the white paper author notes, is one that is considered “embarrassingly parallel” because it is easily decomposed into independent sub-problems that can be executed concurrently. The new C# Task Parallel Library language features provide an elegant way to express this parallelism. Underneath the Parallel.For expression is a run-time library that understands how to partition the body of the parallel For loop into multiple work items, and dispatches them to separate worker threads that are then scheduled to execute concurrently. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;The .NET Task Parallel Library (TPL) provides the run-time machinery to turn this program into a parallel program. At run-time, it automatically parallelizes the lambda expression that the &lt;I style="mso-bidi-font-style: normal"&gt;Parallel.For&lt;/I&gt; construct references. In this example, the Task Parallel Library takes the If Statement in the lambda expression and queues it up to run in parallel using the concurrent runtime library in .NET. The concurrent runtime library creates a thread pool and then delegates the processing of the lambda expression to these worker threads. The concurrent runtime attempts to allocate and schedule an optimal number of worker threads to this parallel task.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;While TPL makes it easier to express parallelism in your programs and eliminates most of the grunt work in setting up the runtime environment associated with parallel threads, it cannot guarantee that running this program in parallel on a machine with four or eight processors will actually speed up its execution time. That is the essence of the challenge of fine-grained parallelism. There is overhead associated with queuing work items to the thread pool the parallel run-time manages. This is overhead that the serial version of the program does not encounter. Only when the amount of work done inside the lambda expression executes for a long enough time does the benefit of parallelizing the lambda expression exceeds this cost, which must be amortized over each concurrent execution of the inner body of the Parallel.For loop. Note that this particular set of overheads is unavoidable. When you are dealing with fine-grained parallelism, the overhead of setting up the parallel run-time environment alone often exceeds the potential benefit of executing in parallel, notwithstanding other possible sources of contention-related delays that could further slow down execution time.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;So, it is important to realize that a code sample like the one I have taken here from the parallel programming white paper was chosen to illustrate the expressive power of the new language constructs. The example was intended to show the &lt;EM&gt;pattern&lt;/EM&gt; that developers should adhere to -- I am certain it wasn't intended to illustrate something specific you would actual do. When you are taking advantage of these new parallel programming language extensions, you need to be aware of the fine-grained:coarse-grained distinction.&amp;nbsp;This is emphatically not an example of a program that you will necessarily speed up by running it in parallel. It will take considerably more anlysis to figure out if parallelism is the right solution here. Speeding up a serial program by running portions of it in parallel isn’t always easy – even when that program has sections that are “embarrassingly parallel.”&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;/P&gt;&lt;SPAN style="mso-spacerun: yes"&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;What we’d like to be able to do is to estimate the actual performance improvement we can expect of this parallel program, compared to its original serial version. “Embarrassingly parallel” is another way of saying that, once parallelized, the serial portion of this program is expected to be quite small. However, this is also an example of fine-grained parallelism because the amount of code associated with the lambda expression that is passed to worker thread delegate to execute is also quite small. An experienced developer at this point should be asking whether the overhead of creating these working threads and dispatching them might, in fact, be greater than the benefit of executing this task in parallel. This relatively fixed overhead is especially important when the amount of work performed by each delegate is quite small – there is only a limited opportunity to amortize this overhead to initiate and manage the multithreaded operation across the execution time of each of the worker threads. It is extremely important to understand this in the case of fine-grained parallelism.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Next, we will turn to coarse-grained parallelism, where the odds are much better that you may be able to speed up program execution substantially using concurrent programming techniques. In the next blog entry, I will start to tackle more promising examples. The analysis of the performance costs associated with parallel programming tasks will become more complex. I will try to illustrate this analysis using a concrete programming example that will simulate coarse-grained parallelism. As we look at how this simple programming example scales on a multi-core machine, it will bring us face-to-face with the pitfalls even experienced developers can expect to encounter when they attempt to parallelize their existing serial applications. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;We will also start to look into the analysis tools that we are available in the next version of the Visual Studio Profiler that greatly help with understanding the performance of your .NET parallel program. In the meantime, if you'd like to get a head start on these new tools in the Visual Studio Profiler , be sure to check out &lt;A href="http://blogs.msdn.com/hshafi/archive/2009/05/18/visual-studio-2010-beta-1-parallel-performance-tools.aspx"&gt;&lt;FONT size=3 face=Calibri&gt;Hazim Shafi’s blog&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt; for more details.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/o:wrapblock&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9718379" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ddperf/archive/tags/.NET/default.aspx">.NET</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Scalability/default.aspx">Scalability</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx">Parallel programming</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance/default.aspx">Performance</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Beta/default.aspx">Beta</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Visual+Studio+Profiler/default.aspx">Visual Studio Profiler</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Visual+Studio+Team+Developer/default.aspx">Visual Studio Team Developer</category></item><item><title>Are we taking advantage of Parallelism?</title><link>http://blogs.msdn.com/ddperf/archive/2009/05/02/are-we-taking-advantage-of-parallelism.aspx</link><pubDate>Sun, 03 May 2009 01:38:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9584046</guid><dc:creator>Sunny 
Egbo</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/ddperf/comments/9584046.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ddperf/commentrss.aspx?PostID=9584046</wfw:commentRss><description>&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;Recently, a colleague of mine, Mark Friedman, posted a blog entry titled “&lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx#9576239" mce_href="http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx#9576239"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT color=#0000ff size=3&gt;Parallel Scalability Isn’t Child’s Play&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;” in which he reviewed the merits of Amdahl’s Law vs. Gunther’s Law for determining the practical limits to parallelization. I would not argue with the premise of Mark’s blog entry that parallelism is not child’s play. However, I do have alternate views of the use of Amdahl’s Law and Gunther’s Law that I posted on his blog. &lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;I think my views and comments on Mark’s blog entry warrant an entry of their own to explain fully.&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;Speaking of child’s play: my 10-year old son recently made a two-part movie titled “&lt;I style="mso-bidi-font-style: normal"&gt;the Way&lt;/I&gt;” and “&lt;I style="mso-bidi-font-style: normal"&gt;the Way Back&lt;/I&gt;” complete with a full storyline, multiple sound tracks and narrations. He put these movies together with only the help of his eight-year old sister, using sample movie clips and stock photographs he found on his computer hard drive. He asked me for help getting his two masterpieces onto a DVD capable of playing on the average home DVD player. Also, he asked about the length of a typical movie playing in movie theaters around the U.S. (approximately 2 hours) and how much these movies cost at the movie theater (approximately $12 for adults and $8 for children, minus the popcorn). Based on my answers, he determined that he will charge 25 cents for people to watch his movies, because he wanted everyone to attend. I wanted to ask him how much he would charge someone who decided to watch only one of the clips. However, I didn’t because I did not want to lose a price haggling war with a 10-year old. Besides, it would be terrible if you cannot find your way back.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;In any case, his movies were quite impressive. The most technologically savvy thing I did as a 10-year old kid was to build a telephone line with tomato soup cans and a string. Movie making was out of reach for me; but now it is child’s play.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;Today, parallelism is not child’s play. However, I hold out hope that in the future the typical computer program would be written with parallelism in mind. Is parallelism ever going to be child’s play in the future the way movie making is today? &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;Parallelism exists everywhere: &lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;instruction level, memory level, loop level and task level parallelism, etc. Also, parallelism has been with us for quite some time now. For the past several decades, hardware engineers have quietly been busy solving problems in parallel to improve processor and system level performance. However, for the past four or more years, hardware designers have encountered the twin brick walls created by memory speed and power. These walls have forced CPU architects and hardware designers to go multi-core in a major way. The doubling of the CPU frequency every 18 months, that was true for many decades, are no longer practical and have come to an abrupt end. Although, hardware performance continues to improve as my colleagues and I pointed out in our blog “&lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://blogs.msdn.com/ddperf/archive/2008/06/18/lessons-from-the-test-lab-investigating-a-pleasant-surprise.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2008/06/18/lessons-from-the-test-lab-investigating-a-pleasant-surprise.aspx"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT color=#0000ff size=3&gt;Investigating a Pleasant Surprise&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;FONT size=3&gt;&lt;FONT face="Times New Roman"&gt;,&lt;/FONT&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;” the pace of CPU frequency increases has slowed considerably. Instead, hardware designers have been doubling the number of cores available on a single CPU socket every couple of years. &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;To get the same level of performance that was previously possible, software engineers would now need to step up to the plate—to write software in a parallel and scalable fashion. They would need tools and frameworks that allow them to think about their problems, identify opportunities for parallelism and to analyze their solutions correctly and efficiently. &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;I am a big fan of Amdahl Law as an analysis framework. However, I do not subscribe to the narrow view that Amdahl’s Law applies only to parallelism, as most people who write about it seem to imply. I prefer the broader treatment of the Law by Hennessy and Patterson in their famous book “&lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://www.amazon.com/Computer-Architecture-Fourth-Quantitative-Approach/dp/0123704901/ref=sr_1_1?ie=UTF8&amp;amp;s=books&amp;amp;qid=1241294120&amp;amp;sr=1-1" mce_href="http://www.amazon.com/Computer-Architecture-Fourth-Quantitative-Approach/dp/0123704901/ref=sr_1_1?ie=UTF8&amp;amp;s=books&amp;amp;qid=1241294120&amp;amp;sr=1-1"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT color=#0000ff size=3&gt;Computer Architecture: A Quantitative Approach&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;”—where Amdahl’s Law is used to estimate the opportunities between competing designs. Amdahl’s Law is very powerful for showing the areas that will likely yield the most fruitful performance gains. In my performance design, tuning and optimization work, I use Amdahl Law for prioritizing the areas of opportunities to focus my efforts to gain performance.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;Amdahl’s Law is not the limit to either absolute performance or parallelism as many authors seem to suggest. Gunther’s and Gustafson’s Laws are helpful for putting Amdahl’s Law in perspective. However, like Amdahl’s Law, these laws are not fundamental limits. The use of these three laws to estimate the level of parallelism that is possible is very flawed. Specifically, the use of these laws as fundamental limits can obfuscate the level of parallelism and performance inherent in typical computing problems. These laws gloss over a number of important points and practical aspects of obtaining parallelism in general purpose computing, including that:&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT size=3&gt;
&lt;P style="MARGIN-LEFT: 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin"&gt;&lt;SPAN style="mso-list: Ignore"&gt;1.&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;Many user tasks are non-monolithic and can be solved in a distributed fashion. Background tasks (e.g., virus scans) that often block single processor execution can now be done in a way that improves user experiences. The key is to identify unnecessary dependencies that would allow these tasks to proceed in parallel with other tasks in a multi-core computer.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN-LEFT: 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin"&gt;&lt;SPAN style="mso-list: Ignore"&gt;2.&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;Some algorithms that have inefficient sequential solutions surprisingly have efficient parallel solutions. This fact should be comforting to fans of algorithms. For example, many applications require matrix multiplication, which turns out to be easily parallelizable. Although the best sequential algorithm for matrix multiplication has a time complexity of O(n&lt;SUP&gt;2.376&lt;/SUP&gt;), a straight-forward parallel solution has an asymptotic time complexity of O(log n) using n&lt;SUP&gt;2.376&lt;/SUP&gt; processor.&amp;nbsp;In other words, we can readily find a parallel solution for matrix multiplication that improves its runtime as more and more processor cores become available. Of course, you might have difficulty conceiving of n&lt;SUP&gt;2.376&lt;/SUP&gt; processors in a system--as a colleague mentioned recently. However, this is just another way of saying that matrix multiplication will benefit with more and more processors.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN-LEFT: 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin"&gt;&lt;SPAN style="mso-list: Ignore"&gt;3.&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;Some poor sequential algorithms can be easily parallelized to execute in less time than their sequential solutions. Also, we know that&amp;nbsp;some algorithms that have the best asymptotic time complexities achieve&amp;nbsp;their speed&amp;nbsp;by introducing&amp;nbsp;data dependencies that make parallelization&amp;nbsp;difficult and that the best asymptotic time complexity does not necessarily translate to the best runtime in real life. Hence, at some point, the benefit of the simpler parallelization of some&amp;nbsp;poor sequential algorithms that have little data dependencies&amp;nbsp;can outweigh the benefit of&amp;nbsp;more efficient sequential counterparts that have data dependencies. Hence, when considering parallel solutions it is not always necessary to start with the sequential solution with the best time complexity [also, see comment about Fortune and Wylie below].&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN-LEFT: 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin"&gt;&lt;SPAN style="mso-list: Ignore"&gt;4.&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;The real world performance of applications is not determined exclusively by the asymptotic time complexity of algorithms. Because of the increasing gap between CPU and memory speed, memory accesses are increasingly dominating the performance of applications running on modern CPUs. Although, the gap can be mitigated with large caches, every cache miss takes hundreds of CPU cycles to complete. Even a modest overlap in these memory accesses (Memory Level Parallelism) can improve application performance in noticeable ways.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN-LEFT: 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-SIZE: 11pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA"&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;Over the years, there have been efforts to classify computationally intractable problems. Many decision problems (i.e., Yes/No) and their optimization counterparts have been categorized into NP-complete and NP-Hard sets respectively. The Travelling Salesman (TSP), Online Bin-Packing and 3-Dimensional Matching problems are three famous examples of NP-Complete problems. In a similar fashion, problems that are difficult to parallelize have been categorized into the P-Complete set or the set of problems that are known to be inherently sequential. As you can imagine sorting is not P-Complete. Likewise, Matrix Multiplication is not in the P-Complete set. Processor scheduling can be done in O(log n) time units using &lt;I style="mso-bidi-font-style: normal"&gt;n&lt;/I&gt; processors—so, it is not P-Complete either. In an ultimate twist of irony, many NP-Hard problems have heuristic solutions that can be executed in parallel to approximate the real solutions. Hence, the natural inclination to think that NP-Complete problems cannot be parallelized is not borne out in practice.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;As it turns out, the real limit to parallelism seems not to be defined by Amdahl’s Law, Gunther’s Law, Gustafson’s Law or NP-Completeness, but by the P-Complete set. It appears that whether a problem is parallelizable is related to the asymptotic space complexity of its sequential solution. According to Fortune and Wyllie’s Parallel Processing Thesis, any problem that can be solved with a poly-logarithmic space complexity can be parallelized efficiently. Because of the time-space trade-off of algorithms, this implies that the sequential algorithm that achieves this space complexity is not necessarily the algorithm with the best asymptotic time complexity. &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;In any case, because one can evaluate problems on multiple levels beyond algorithms (e.g., at the instruction, memory and data access, loop and task levels), the set of problems that can be parallelized appears to be quite large. The question is how to identify and take advantage of the parallelization opportunities that may be inherently available, and to do so in an efficient and scalable manner. How can we parallelize loops? How do we overlap high-latency activities such as accesses to physical memory or I/O to amortize the cost of those activities? How do we minimize synchronization? How do we partition tasks to eliminate bottlenecks from the critical paths? How do we dispatch work efficiently to improve system utilization, throughput and latency? Which areas of our application can benefit from which of these efforts? Answering questions like these is what makes scalable designs possible. &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;Today, the tools to identify parallelism and scalability opportunities are very limited. Programming languages that allow programmers to express parallelism in a natural way are largely lacking. The tools to analyze and troubleshoot parallel implementations are limited as well. Debugging parallel implementations is particularly hard. I suspect that with some industry focus and incremental progress, we can keep making parallelism more accessible to average programmers over the next few years, but we are still many years away from the goal.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size=3&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'"&gt;W&lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;hat are some of the fundamental limits preventing such tools from being built? As Mark said on his blog, &lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'"&gt;achieving improved scalability using parallel programming techniques is certainly very challenging. But can parallel programming be made less challenging with intuitive tools that expose parallel solutions in a natural way and allow programmers to exploit them? &lt;/SPAN&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;Can programming languages and tools improve to a point where a typical 10-year-old will be able to write a parallel program as easily as they can put together a multi-track movie today?&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;o:p&gt;&lt;FONT size=3&gt;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT size=3&gt;Sunny Egbo&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;/SPAN&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9584046" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance+Engineering/default.aspx">Performance Engineering</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/.NET/default.aspx">.NET</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Scalability/default.aspx">Scalability</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx">Parallel programming</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance/default.aspx">Performance</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Hardware/default.aspx">Hardware</category></item><item><title>Parallel Scalability Isn’t Child’s Play, Part 2: Amdahl’s Law vs. 
Gunther’s Law</title><link>http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx</link><pubDate>Wed, 29 Apr 2009 07:51:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9575026</guid><dc:creator>MarkBFriedman</dc:creator><slash:comments>4</slash:comments><comments>http://blogs.msdn.com/ddperf/comments/9575026.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ddperf/commentrss.aspx?PostID=9575026</wfw:commentRss><description>&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;&lt;A title="Link to Parallel Scalability Part 1" href="http://blogs.msdn.com/ddperf/archive/2009/03/16/parallel-scalability-isn-t-child-s-play.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2009/03/16/parallel-scalability-isn-t-child-s-play.aspx"&gt;Part 1 of this series of blog entries&lt;/A&gt; discussed results from simulating the performance of a massively parallel SIMD application on several alternative multi-core architectures. These results were reported by researchers at Sandia Labs and publicized in a press release. Neil Gunther, my colleague from the Computer Measurement Group (CMG), referred to the Sandia findings as evidence supporting his &lt;I style="mso-bidi-font-style: normal"&gt;universal scalability law&lt;/I&gt;. This blog entry investigates Gunther’s model of parallel programming scalability, which, unfortunately, is not as well known as it should be. Gunther’s insight is especially useful in the current computing landscape, which is actively embracing parallel computing using multi-core workstations &amp;amp; servers.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Gunther’s scalability formula for parallel processing is a useful antidote to any overly optimistic expectations developers might have about the gains to be had from applying parallel programming techniques. Where Amdahl’s law can be used to establish a theoretical &lt;I style="mso-bidi-font-style: normal"&gt;upper limit&lt;/I&gt; to the speed-up that parallel programming techniques can provide, Gunther’s law can also model the retrograde performance that we frequently observe when parallel computing is used. In other words, Gunther’s scalability formula encapsulates the behavior we frequently observe where adding more and more processors to a parallel processing workload can result in &lt;I style="mso-bidi-font-style: normal"&gt;degraded&lt;/I&gt; performance. It is a more realistic model for people who adopt parallel programming techniques to enhance the scalability of their applications on multi-core hardware. So, without in any way trying to diminish enthusiasm for the entire enterprise, it is crucial to understand that achieving improved scalability using parallel programming techniques can be very challenging.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;As I discussed &lt;/FONT&gt;&lt;A href="http://www.cmg.org/measureit/issues/mit44/m_44_18.html"&gt;&lt;FONT color=#0000ff size=3 face=Calibri&gt;in a review of Gunther’s last book&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt;, Gunther’s law adds another parameter to the well-known Amdahl’s Law. Gunther calls this parameter &lt;I style="mso-bidi-font-style: normal"&gt;coherence&lt;/I&gt;. Parallel programs have additional costs associated with maintaining the “coherence” of shared data structures, memory locations that are accessed and updated by threads executing in parallel. By incorporating these coherence-related delays, Gunther’s formula is able to model the retrograde performance that all too frequently is observed empirically. The blue line marked “Conventional” in the chart Sandia Labs published (&lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/ddperf/archive/2009/03/16/parallel-scalability-isn-t-child-s-play.aspx"&gt;&lt;FONT size=3 face=Calibri&gt;Figure 1&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt; in the earlier blog) is a scalability curve that Gunther correctly cites as consistent with his model.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Let’s drill into the mathematics here for a moment. What Gunther calls his Universal Scalability law is an extension to the familiar multiprocessor scalability formula first suggested by Gene Amdahl. In &lt;/FONT&gt;&lt;A href="http://en.wikipedia.org/wiki/Amdahl%27s_law"&gt;&lt;FONT color=#0000ff size=3 face=Calibri&gt;Amdahl's law&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt;, &lt;I style="mso-bidi-font-style: normal"&gt;p&lt;/I&gt; is the proportion of a program that can be parallelized, leaving &lt;I style="mso-bidi-font-style: normal"&gt;1 − p&lt;/I&gt; to represent the part of the program that cannot be parallelized and remains serial. Amdahl’s insight was that the &lt;I style="mso-bidi-font-style: normal"&gt;1 − p&lt;/I&gt; share of time spent in the serial execution portion of the program creates an upper bound on how much its performance can be improved when parallelized. &lt;/FONT&gt;&lt;/P&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;?xml:namespace prefix = v ns = "urn:schemas-microsoft-com:vml" /&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Consider a sequential program that we want to speed up using parallel programming techniques. An old-fashioned way to think about this is to identify some portion of the program, &lt;I style="mso-bidi-font-style: normal"&gt;p&lt;/I&gt;, that can be executed in parallel, and then implement a &lt;B style="mso-bidi-font-weight: normal"&gt;Fork()&lt;/B&gt; to spawn parallel tasks, followed by a &lt;B style="mso-bidi-font-weight: normal"&gt;Join()&lt;/B&gt; to unify the processing and carry on sequentially afterwards. Conceptually, something like this:&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;IMG style="WIDTH: 365px; HEIGHT: 335px" title="Fork Join" alt="Fork Join" src="http://5l3vgw.bay.livefilestore.com/y1prpgpboRXSzjQVnNYycq6GJuJ8R8HJIlojyRLhYinSz8MbLbSRl-3NN9tSD_qBRNoLp4SGLDZHIzUL0yvuqRj9GczCqOudgK_/Fork-Join%20flowchart.jpg" width=365 height=335 mce_src="http://5l3vgw.bay.livefilestore.com/y1prpgpboRXSzjQVnNYycq6GJuJ8R8HJIlojyRLhYinSz8MbLbSRl-3NN9tSD_qBRNoLp4SGLDZHIzUL0yvuqRj9GczCqOudgK_/Fork-Join%20flowchart.jpg"&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;STRONG&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Calibri','sans-serif'; COLOR: #002060; FONT-SIZE: 10pt; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-size: 11.0pt"&gt;Figure 3.&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;STRONG&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Calibri','sans-serif'; COLOR: #002060; FONT-SIZE: 10pt; FONT-WEIGHT: normal; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-bidi-font-weight: bold; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-size: 11.0pt"&gt; Parallel processing using a Fork/Join.&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;SPAN&gt;&lt;o:p&gt;
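&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;As a rough illustration of the Fork/Join pattern, here is a minimal Python sketch. The function name fork_join_sum, the choice of a thread pool, and the strided partitioning are all assumptions made for the example, not anything taken from a particular library or from the programs discussed in this series. Note that in CPython, threads doing pure computation do not actually run on multiple cores at once, so this shows the shape of the pattern rather than a real speed-up.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
from concurrent.futures import ThreadPoolExecutor


def fork_join_sum(values, workers=4):
    """Sum `values` by forking `workers` tasks and joining their results.

    Hypothetical helper written for this illustration only.
    """
    # Serial part: partition the input into one strided slice per worker.
    chunks = [values[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Fork: each chunk is summed by its own task.
        partials = list(pool.map(sum, chunks))
    # Join: every task has finished by now; combine the results serially.
    return sum(partials)


print(fork_join_sum(list(range(1, 101))))  # prints 5050
```
&lt;/PRE&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;The partitioning before the fork and the final combining step after the join are the serial portions of this little program; only the per-chunk sums are candidates for parallel execution.&lt;/FONT&gt;&lt;/P&gt;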
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Amdahl’s law simply observes that your ability to speed up this program using parallelism is a function of the proportion of the time, &lt;I style="mso-bidi-font-style: normal"&gt;p&lt;/I&gt;, spent in the parallel portion of the program, compared to &lt;I style="mso-bidi-font-style: normal"&gt;s&lt;/I&gt;, the time spent in the serial parts of the program. (Note that &lt;I style="mso-bidi-font-style: normal"&gt;p + s = 1&lt;/I&gt;, in this formulation.) Amdahl’s observation was meant as a direct challenge to hardware architects who were advocating building parallel computing hardware. It was also easy for those advocates of parallel computing approaches to dismiss Amdahl’s remarks since Dr. Amdahl was clearly invested in trying to build faster processors, no matter the cost.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Advocates of parallel computing, of course, are not blind to the hazards of the parallel processing approach. Scalability of the underlying hardware is one challenge. An even bigger challenge is writing multi-threaded programs. For starters, it is often far more difficult to conceptualize a parallel solution than a serial one. (We can speculate that this may simply be a function of the way our minds tend to work.) Parallel programs are also notoriously more difficult to debug. When you are debugging a multi-threaded program running in parallel on parallel hardware, events don’t always occur in the exact same sequence. This is known as &lt;I style="mso-bidi-font-style: normal"&gt;non-determinism&lt;/I&gt;, and it often leads to huge problems for developers because, for instance, it may be very difficult to reproduce the exact timing sequence that exposes an error in your logic.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Furthermore, once you manage to get your programs to run correctly in a parallel processing mode, the performance wins from doing so are not a given. In the course of celebrating the performance wins they do get, developers can sometimes lose sight of how difficult it was to achieve those gains. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Notwithstanding the difficulties that need to be overcome, compelling reasons to look at parallel computation remain, including trying to solve problems that simply won’t fit inside the largest computers we can build. Today, there is renewed interest in parallel programming because, with current semiconductor fabrication technology, it is difficult for hardware designers to make processors run at ever higher clock speeds without consuming excessive amounts of power and generating excessive amounts of heat that must be dissipated. Power and cooling considerations are driving parallel computing today for portables, desktops, and servers.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;STRONG&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Comparing Gunther’s formula to Amdahl’s law. &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Meanwhile, Amdahl’s original insight remains relevant today. From Amdahl’s law, we understand that, no matter what degree of parallelism is achieved, the execution time of a program’s serial portion places a practical upper bound on the speed-up its parallel counterpart can achieve. As an example, Figure 4 plots the scalability curve using Amdahl’s law where p = 0.9, that is, when just 10% of the program remains serial. Notice that Amdahl’s law predicts the performance of a parallel program will level off as more and more processors are added. As you can see, Amdahl’s law shows diminishing returns from increasing the level of parallelism, so the parallel approach becomes less and less cost-effective as more and more processors are added.&lt;/FONT&gt;&lt;/P&gt;
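&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;To make the leveling off concrete, here is a small Python sketch of the standard form of Amdahl’s law, speed-up = 1 / ((1 − p) + p / n), evaluated for p = 0.9. The function name amdahl_speedup and the processor counts in the loop are my own choices for illustration.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
def amdahl_speedup(p, n):
    """Speed-up on n processors when a fraction p of the run time can be parallelized."""
    serial = 1.0 - p      # the part that stays serial
    parallel = p / n      # the parallel part, spread over n processors
    return 1.0 / (serial + parallel)


# Illustrative processor counts; the values are not taken from the chart.
for n in (1, 2, 4, 8, 16, 64, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))
```
&lt;/PRE&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;With p = 0.9, the speed-up can never exceed 1 / (1 − p) = 10, no matter how many processors are added; 16 processors already deliver a speed-up of 6.4.&lt;/FONT&gt;&lt;/P&gt;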
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;IMG style="WIDTH: 334px; HEIGHT: 286px" title="Amdahl's Law vs. Gunther's Law" alt="Amdahl's Law vs. Gunther's Law" src="http://5l3vgw.bay.livefilestore.com/y1pN7Q-dLDfRpTM8sHTBUMnORaHYkqs1fnOq57tgpYfyBjjbKKjQ8XZRTTazyt9cNaxt2X31QRdackdQL_gF0tcM9PTxce5Hw8-/Amdahl%20vs%20Gunther%20laws.jpg" width=334 height=286 mce_src="http://5l3vgw.bay.livefilestore.com/y1pN7Q-dLDfRpTM8sHTBUMnORaHYkqs1fnOq57tgpYfyBjjbKKjQ8XZRTTazyt9cNaxt2X31QRdackdQL_gF0tcM9PTxce5Hw8-/Amdahl%20vs%20Gunther%20laws.jpg"&gt;&lt;IMG style="WIDTH: 0px; HEIGHT: 0px" title="Amdahl's Law &amp;amp; Gunther's Law" alt="Amdahl's Law &amp;amp; Gunther's Law" src="http://5l3vgw.bay.livefilestore.com/y1pN7Q-dLDfRpTM8sHTBUMnORaHYkqs1fnOq57tgpYfyBjjbKKjQ8XZRTTazyt9cNaxt2X31QRdackdQL_gF0tcM9PTxce5Hw8-/Amdahl%20vs%20Gunther%20laws.jpg" mce_src="http://5l3vgw.bay.livefilestore.com/y1pN7Q-dLDfRpTM8sHTBUMnORaHYkqs1fnOq57tgpYfyBjjbKKjQ8XZRTTazyt9cNaxt2X31QRdackdQL_gF0tcM9PTxce5Hw8-/Amdahl%20vs%20Gunther%20laws.jpg"&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="LINE-HEIGHT: 115%; COLOR: #002060; FONT-SIZE: 10pt; mso-bidi-font-size: 11.0pt"&gt;Figure 4.&lt;/SPAN&gt;&lt;/B&gt;&lt;SPAN style="LINE-HEIGHT: 115%; COLOR: #002060; FONT-SIZE: 10pt; mso-bidi-font-size: 11.0pt"&gt; &lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;A comparison of Amdahl’s Law to Gunther’s Universal Scalability Model&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="LINE-HEIGHT: 115%; COLOR: #002060; FONT-SIZE: 10pt; mso-bidi-font-size: 11.0pt"&gt;&lt;o:p&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;Given that Amdahl was mainly acting as an advocate for building faster serial CPUs, something that he wanted to do anyway, his is by no means the last word on the subject. Researchers in numerical computing like the ones in Sandia Labs were encouraged a few years later by a paper from one of their own. John Gustafson of Sandia Labs published a well-known paper in 1988 entitled “&lt;/FONT&gt;&lt;A href="http://www.scl.ameslab.gov/Publications/Gus/AmdahlsLaw/Amdahls.html"&gt;&lt;FONT color=#0000ff size=3&gt;Reevaluating Amdahl's Law&lt;/FONT&gt;&lt;/A&gt;&lt;FONT color=#000000 size=3&gt;” that adopted a much more optimistic stance to parallel programming. The essence of Gustafson’s argument is that when parallel processing resources become available, programmers will jigger their software to take advantage of them:&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0.3in 10pt" class=MsoNormal&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-SIZE: 9pt"&gt;&lt;FONT color=#000000&gt;One does not take a fixed-size problem and run it on various numbers of processors except when doing academic research; in practice, &lt;I&gt;the problem size scales with the number of processors&lt;/I&gt;. When given a more powerful processor, the problem generally expands to make use of the increased facilities. Users have control over such things as grid resolution, number of timesteps, difference operator complexity, and other parameters that are usually adjusted to allow the program to be run in some desired amount of time. Hence, it may be most realistic to assume that &lt;I&gt;run time, not problem size&lt;/I&gt;, is constant.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;Gustafson’s counter-argument does not refute Amdahl’s law so much as suggest there might be creative ways to work around it. It encouraged parallel programming researchers to keep plugging away, pursuing creative ways to sidestep Amdahl’s law. Microsoft’s Herb Sutter, &lt;/FONT&gt;&lt;A href="http://www.ddj.com/architect/205900309"&gt;&lt;FONT color=#0000ff size=3&gt;in his popular Dr. Dobb’s Journal column back in January 2008&lt;/FONT&gt;&lt;/A&gt;&lt;FONT color=#000000 size=3&gt;, cited Gustafson’s Law favorably to offer similar encouragement to software developers today who need to re-fashion their code to take advantage of parallel processing in the many-core, multi-core era. &lt;/FONT&gt;&lt;/P&gt;
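&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;For comparison, here is a short Python sketch of the scaled speed-up formula from Gustafson’s paper, N + (1 − N) × s, where s is the serial fraction of the run time observed on the parallel machine and N is the number of processors. The function name and the sample values are assumptions chosen for illustration.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
def gustafson_scaled_speedup(s, n):
    """Scaled speed-up when the problem grows with n and run time is held constant.

    s is the serial fraction measured on the parallel system.
    """
    return n + (1 - n) * s


# Illustrative values: a 10% serial fraction on 16 and on 1024 processors.
print(round(gustafson_scaled_speedup(0.1, 16), 1))
print(round(gustafson_scaled_speedup(0.1, 1024), 1))
```
&lt;/PRE&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;Because the problem size is allowed to grow with the machine, the scaled speed-up keeps growing almost linearly (about 921.7 on 1024 processors in this example), whereas the fixed-size speed-up under Amdahl’s law with the same 10% serial portion stays below 10.&lt;/FONT&gt;&lt;/P&gt;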
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;Gunther’s augmentation of Amdahl’s law is grounded empirically, providing a more realistic assessment of scalability using parallel programming technology. Gunther’s formula is similar, but adds another parameter to Amdahl’s law, κ,&lt;SPAN style="mso-fareast-font-family: 'Times New Roman'; mso-fareast-theme-font: minor-fareast"&gt; &lt;/SPAN&gt;that represents something called &lt;I style="mso-bidi-font-style: normal"&gt;coherency&lt;/I&gt; delay: &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA"&gt;&lt;v:shapetype id=_x0000_t75 coordsize="21600,21600" o:spt="75" o:preferrelative="t" path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f"&gt;&lt;FONT color=#000000&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;EM&gt;&lt;STRONG&gt;C(p) = p/(1+s(p-1) + kp(p-1))&lt;/STRONG&gt;&lt;/EM&gt;&lt;/FONT&gt;&lt;/v:shapetype&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA"&gt;&lt;v:shapetype coordsize="21600,21600" o:spt="75" o:preferrelative="t" path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f"&gt;&lt;FONT color=#000000&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;To show how the two formulas behave, Figure 4 above compares Amdahl’s law to Gunther’s law for a program with the same 10% serial portion. (In Gunther’s formula, &lt;EM&gt;p&lt;/EM&gt; is the number of processors rather than the parallel proportion used earlier, &lt;EM&gt;s&lt;/EM&gt; is the serial fraction, and &lt;EM&gt;k&lt;/EM&gt; is the coherency delay parameter κ.) I set the coherency delay factor in Gunther’s formula to 0.001. When a coherency delay is also modeled, notice that parallel scalability is no longer monotonically increasing as processors are added. When we allow for some amount of coherency delay, there comes a point when overall performance levels off and ultimately begins to &lt;EM&gt;decrease&lt;/EM&gt;. Gunther’s formula not only models the performance of a parallel program that encounters diminishing returns from increased levels of parallelism, it also highlights the performance degradation that can occur when the communication and coordination-related delays introduced by multiple threads needing to synchronize access to shared data structures become excessive.&lt;/P&gt;
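&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;The retrograde behavior is easy to reproduce numerically. The Python sketch below evaluates the formula C(p) = p/(1+s(p-1) + kp(p-1)) given above, with p as the processor count, s = 0.1 and k = 0.001. The function names are my own, and the peak location sqrt((1 − s)/k) is what I get by setting the derivative of C(p) to zero, so treat it as a derived convenience rather than something quoted from Gunther.&lt;/P&gt;
&lt;PRE&gt;
```python
import math


def usl_capacity(p, s, k):
    """Gunther's C(p) = p / (1 + s(p-1) + k p (p-1)); p is the processor count."""
    return p / (1.0 + s * (p - 1) + k * p * (p - 1))


def usl_peak(s, k):
    """Processor count where C(p) is largest, from dC/dp = 0: sqrt((1 - s) / k)."""
    return math.sqrt((1.0 - s) / k)


s, k = 0.1, 0.001  # serial fraction and coherency delay used in the text
print(round(usl_peak(s, k), 2))          # prints 30.0
print(round(usl_capacity(30, s, k), 2))  # prints 6.29
print(round(usl_capacity(100, s, k), 2)) # prints 4.81
```
&lt;/PRE&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;With these parameters, capacity peaks at about 30 processors and then declines; at 100 processors the modeled speed-up is lower than at 30. Setting k to zero reduces the formula to Amdahl’s law with a serial fraction of s.&lt;/P&gt;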
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;Gunther’s formula lumps all the delays associated with communication and coordination among threads that require access to shared data structures into one factor &lt;EM&gt;k&lt;/EM&gt; that he calls &lt;EM&gt;coherence&lt;/EM&gt;. Unfortunately, Gunther himself provides little help in telling us how to estimate &lt;EM&gt;k,&lt;/EM&gt; the crucial coherency delay factor, beforehand, or measure it after the fact. Presumably, &lt;EM&gt;k&lt;/EM&gt; includes delays associated with accessing critical sections of code that are protected by shared locks, as well as instruction execution delays in the hardware associated with maintaining the “coherence” of shared data kept in processor caches that are accessed and updated by concurrently running threads. There are also additional “overheads” associated with spinning up multiple worker threads, queuing up work items for them to process, controlling their execution, and coordinating their ultimate completion, all of which are new to the parallel processing environment and absent from the serial version of the same program.&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;As a practical developer trying to understand the behavior of my parallel application, personally, I would find Gunther’s formula much more useful if it helped me identify the sources of the coherency delays that my parallel programs encounter and that are impacting their scalability. It would also be useful if Gunther’s insight could guide me as I work to eliminate or reduce these obstacles to scalability. That is the main subject of&lt;A title="forward pointer" href="http://blogs.msdn.com/ddperf/archive/2009/06/09/parallel-scalability-isn-t-child-s-play-part-3-the-problem-with-fine-grained-parallelism.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2009/06/09/parallel-scalability-isn-t-child-s-play-part-3-the-problem-with-fine-grained-parallelism.aspx"&gt; the next blog entry in this series&lt;/A&gt;. &lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal mce_keep="true"&gt;&amp;nbsp;&lt;/P&gt;&lt;/FONT&gt;&lt;/v:shapetype&gt;&lt;/SPAN&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9575026" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ddperf/archive/tags/.NET/default.aspx">.NET</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Scalability/default.aspx">Scalability</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx">Parallel programming</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance/default.aspx">Performance</category></item><item><title>Parallel Scalability Isn’t Child’s Play</title><link>http://blogs.msdn.com/ddperf/archive/2009/03/16/parallel-scalability-isn-t-child-s-play.aspx</link><pubDate>Mon, 16 Mar 2009 20:39:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9481780</guid><dc:creator>MarkBFriedman</dc:creator><slash:comments>9</slash:comments><comments>http://blogs.msdn.com/ddperf/comments/9481780.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ddperf/commentrss.aspx?PostID=9481780</wfw:commentRss><description>&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;In &lt;A title="Neil Gunther's blog" href="http://perfdynamics.blogspot.com/2009/02/poor-scalability-on-multicore.html" mce_href="http://perfdynamics.blogspot.com/2009/02/poor-scalability-on-multicore.html"&gt;a recent blog entry&lt;/A&gt;, Dr. Neil Gunther, a colleague from the Computer Measurement Group (CMG), warned about unrealistic expectations being raised with regard to the performance of parallel programs on current multi-core hardware. 
Neil’s blog entry highlighted a dismal parallel programming experience publicized &lt;/FONT&gt;&lt;A title="Sandia Labs multi-core press release" href="http://www.sandia.gov/news/resources/releases/2009/multicore.html" mce_href="http://www.sandia.gov/news/resources/releases/2009/multicore.html"&gt;&lt;FONT color=#0000ff size=3 face=Calibri&gt;in a recent press release&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt; from the Sandia Labs in Albuquerque, New Mexico. Sandia Labs is a research facility operated by the U.S. Department of Energy. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;According to the press release, scientists at Sandia Labs &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="LINE-HEIGHT: normal; MARGIN: 0in 0.2in 10pt" class=MsoNormal&gt;&lt;SPAN style="FONT-SIZE: 9pt"&gt;&lt;FONT face=Calibri&gt;“simulated key algorithms for deriving knowledge from large data sets. The simulations show a significant increase in speed going from two to four multicores, but an insignificant increase from four to eight multicores. Exceeding eight multicores causes a decrease in speed. Sixteen multicores perform barely as well as two, and after that, a steep decline is registered as more cores are added.” They concluded that this retrograde speed-up was due to deficiencies in “memory bandwidth as well as contention between processors over the memory bus available to each processor.”&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Holy cow. The Lab’s scientists, who are heavily invested in parallel programming on supercomputers, simulated running programs on sixteen cores encapsulating “key algorithms for deriving knowledge from large data sets” that gave no better performance than running the same program on two cores. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Please note that these are simulated performance results, because 16-core machines of the type being simulated don’t currently exist. Indeed, I would not expect that 16-core machines of the type being simulated would ever exist. Which leads me to wonder what the point of this Sandia Labs exercise was.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Of course, for developers experienced in parallel programming, this result actually isn’t in itself all that surprising. Quite frequently, experienced developers find that running their multi-threaded application on massively parallel hardware does not scale well with the hardware capabilities. This was apparently the case for the applications the Sandia Labs folks simulated. So what? Should we just give up in our quest for parallel program scalability? &lt;/FONT&gt;&lt;/P&gt;
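&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;To see how a retrograde curve like the one Sandia Labs describes can arise, here is a minimal Python sketch that compares Amdahl’s law with Gunther’s model, which adds a coherency term that grows roughly with the square of the processor count. The parameter values (a serial fraction of 0.05 and a coherency coefficient of 0.02) are illustrative assumptions only; they are not fitted to the Sandia Labs data.&lt;/FONT&gt;&lt;/P&gt;

```python
import math

def amdahl_speedup(n, serial):
    """Amdahl's law: speedup on n processors given a serial fraction."""
    return n / (1 + serial * (n - 1))

def gunther_speedup(n, serial, coherency):
    """Gunther's model: Amdahl plus a coherency penalty that grows as n*(n-1)."""
    return n / (1 + serial * (n - 1) + coherency * n * (n - 1))

def peak_processors(serial, coherency):
    """Processor count at which Gunther's curve reaches its maximum."""
    return math.sqrt((1 - serial) / coherency)

SERIAL, COHERENCY = 0.05, 0.02  # assumed values, for illustration only

for n in (2, 4, 8, 16):
    print(n,
          round(amdahl_speedup(n, SERIAL), 2),
          round(gunther_speedup(n, SERIAL, COHERENCY), 2))
```

&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;With these assumed parameters, the Gunther curve peaks at roughly seven processors and then declines, while the Amdahl curve keeps rising, however slowly.&lt;/FONT&gt;&lt;/P&gt;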
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Before drilling into Dr. Gunther’s specific interest in this disclosure, it is worth looking into the Sandia Labs finding in a bit more detail. For instance, did anyone, besides me, wonder what applications were being simulated?&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;In theory, “deriving knowledge from large data sets” is a category of computing problem that readily lends itself to a solution using an &lt;/FONT&gt;&lt;A href="http://en.wikipedia.org/wiki/SIMD"&gt;&lt;FONT size=3 face=Calibri&gt;SIMD&lt;/FONT&gt;&lt;/A&gt;&lt;FONT size=3 face=Calibri&gt; (&lt;B style="mso-bidi-font-weight: normal"&gt;S&lt;/B&gt;ingle &lt;B style="mso-bidi-font-weight: normal"&gt;I&lt;/B&gt;nstruction, &lt;B style="mso-bidi-font-weight: normal"&gt;M&lt;/B&gt;ultiple &lt;B style="mso-bidi-font-weight: normal"&gt;D&lt;/B&gt;ata) approach. The canonical example of an SIMD approach to “deriving knowledge from large data sets” is a database search function conducted in parallel where the data set of interest is partitioned across &lt;I style="mso-bidi-font-style: normal"&gt;n&lt;/I&gt; processing units and their locally attached disks. For example, when the Thinking Machines CM-1 supercomputer publicly debuted in the mid-80s, the company demonstrated its capabilities using a parallel search of a database that was partitioned across all 64K nodes of the machine, which was based on the Connection Machine originally designed by MIT whiz kid Danny Hillis. Parallel search, when executed across a partitioned dataset, should scale linearly, or close enough for government work (pun intended). &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Whenever a problem lends itself to an SIMD approach (also known as “divide and conquer”), linear scalability of the SIMD algorithm does require first partitioning the data being accessed and then proceeding to process that data in parallel. I am sure the point of the Sandia Labs press release was not to disparage the SIMD approach to parallel processing; after all, that is a tried-and-true technique that they have used with great success over the years. &lt;/FONT&gt;&lt;/P&gt;
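&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;The partition-then-process pattern can be sketched in a few lines of Python. This is only an illustration of the structure: the function names are hypothetical, the data set is an in-memory list rather than a database spread across disks, and CPython threads will not deliver a real speed-up for CPU-bound predicates.&lt;/FONT&gt;&lt;/P&gt;

```python
from concurrent.futures import ThreadPoolExecutor

def partition(records, n):
    """Split records into n nearly equal contiguous slices."""
    size, extra = divmod(len(records), n)
    slices, start = [], 0
    for i in range(n):
        end = start + size + (1 if extra > i else 0)
        slices.append(records[start:end])
        start = end
    return slices

def search_slice(records, predicate):
    """Search one partition; no other partition is touched."""
    return [r for r in records if predicate(r)]

def parallel_search(records, predicate, n_workers=4):
    """Partition the data first, then search every partition in parallel."""
    slices = partition(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(search_slice, slices, [predicate] * n_workers))
    # Merging the per-partition hits is the only step that is not parallel.
    return [hit for part in partial for hit in part]

hits = parallel_search(list(range(100)), lambda r: r % 7 == 0)
print(hits)
```

&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;Because each worker reads only its own slice, the workers never contend for the same data, which is what makes near-linear scaling plausible for this kind of workload.&lt;/FONT&gt;&lt;/P&gt;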
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;On the contrary, it appears to be a critique of an approach to building parallel processing hardware&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;where you would increase the number of processing cores on the chip (just because you can with the most current semiconductor fabrication technology) without scaling the memory bandwidth proportionally. Since that is not what is happening hardware-wise, it strikes me that this implied criticism of the multi-core hardware&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;strategy Intel and AMD are pursuing is slaying a non-existent dragon. Both Intel and AMD recognize that memory bus bandwidth is a significant potential bottleneck in their multi-core products, and, as a result, both manufacturers are attempting to scale memory bandwidth proportional to the amount of processing power they deliver on a chip.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3 face=Calibri&gt;So then what is all the fuss about? The Sandia Labs “news” starts to look like something the blogosphere is latching onto on an otherwise slow day for tech news, raising an alarm &amp;amp; potentially misleading naïve readers about what the conventional wisdom in multiprocessor chip architecture would be if anyone were actually trying to build multi-core microprocessors that way.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;STRONG&gt;Building a better multicore processor.&lt;/STRONG&gt;&lt;/P&gt;&lt;FONT size=3 face=Calibri&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;The point of the Sandia Labs press release publicizing these simulation results appears to be to suggest what they consider better approaches to packaging multi-core processors on a single socket. They released the following chart that makes this point (reproduced here in Figure 1):&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal mce_keep="true"&gt;&lt;IMG style="WIDTH: 450px; HEIGHT: 353px" title="Sandia Labs multicore simulation results" alt="Sandia Labs multicore simulation results" src="http://5l3vgw.bay.livefilestore.com/y1pr4F4aoYifbSInSEBRcbQ9TBEARzKw87EyYk2bricI-CoyRgTN--dE7SeFYj-q7Ll9D3mJePubLw_-B_yrrSvOQ/Sandia%20Labs%20simulated%20multicore%20performance%20(smaller).jpg" width=450 height=353 mce_src="http://5l3vgw.bay.livefilestore.com/y1pr4F4aoYifbSInSEBRcbQ9TBEARzKw87EyYk2bricI-CoyRgTN--dE7SeFYj-q7Ll9D3mJePubLw_-B_yrrSvOQ/Sandia%20Labs%20simulated%20multicore%20performance%20(smaller).jpg"&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoCaption&gt;&lt;STRONG&gt;&lt;FONT size=2&gt;&lt;FONT color=#4f81bd&gt;&lt;SPAN style="mso-no-proof: yes"&gt;Figure 1&lt;/SPAN&gt;. Sandia Labs simulation showing performance of their application vs. the number of processors.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;FONT color=#4f81bd size=2&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;Exactly what the Sandia Labs folks are reporting here is a little sketchy. Presumably, the simulations are based on observing the behavior of some of their key programs where they were able to measure performance running on “conventional” multi-core processors, perhaps something like the quad-core machine I recently installed for my desktop that uses a memory bus with bandwidth in the range of 10 GB/sec. The press release seems to imply that the Sandia Labs baseline measurements were taken on current quad-core machines from Intel like mine, not the newer Nehalem processors where the memory architecture has been re-worked extensively. How useful or meaningful the results that Sandia Labs published are may turn on this crucial point.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;This Sandia Labs simulation then extrapolates out to 16 cores per socket (and beyond), simulating the manufacturer adding more cores to the die, apparently &lt;I style="mso-bidi-font-style: normal"&gt;leaving the memory architecture fundamentally unchanged&lt;/I&gt; as they moved to more cores. The Sandia Labs chart in Figure 1 is labeled to indicate that the memory bandwidth was held constant at 10 GB/sec. &lt;/FONT&gt;&lt;/P&gt;
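&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;The arithmetic behind holding the bus constant is easy to work through. The sketch below divides a fixed 10 GB/sec bus among a growing number of cores and applies a very simple saturation model. The per-core demand of 2.5 GB/sec is a hypothetical figure chosen for illustration; it does not come from the Sandia Labs report.&lt;/FONT&gt;&lt;/P&gt;

```python
def per_core_bandwidth(total_gb_per_sec, cores):
    """Share of a fixed memory bus available to each core."""
    return total_gb_per_sec / cores

def busy_cores(cores, total_gb_per_sec, demand_gb_per_sec):
    """Number of cores the bus can keep fully fed (simple saturation model)."""
    return min(cores, total_gb_per_sec / demand_gb_per_sec)

BUS = 10.0     # GB/sec, the figure the Sandia Labs chart holds constant
DEMAND = 2.5   # GB/sec per core, a hypothetical value for illustration

for cores in (2, 4, 8, 16):
    print(cores, per_core_bandwidth(BUS, cores), busy_cores(cores, BUS, DEMAND))
```

&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;Under these assumptions the bus is saturated at four cores, and every core added after that simply waits on memory. This model only flattens out; it does not reproduce the outright decline in the simulation, which the press release attributes to contention for the bus.&lt;/FONT&gt;&lt;/P&gt;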
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;This is more than a little suspicious. Hardware manufacturers like Intel and AMD understand clearly that the memory bus has to scale with the number of processors. The AMD &lt;/FONT&gt;&lt;A title="HyperTransport specifications" href="http://www.hypertransport.org/default.cfm?page=TechnologyLowLatency" mce_href="http://www.hypertransport.org/default.cfm?page=TechnologyLowLatency"&gt;&lt;FONT size=3&gt;HyperTransport&lt;/FONT&gt;&lt;/A&gt;&lt;FONT color=#000000 size=3&gt; bus architecture is quite explicit about this, and the latest spec, HyperTransport version 3.1, has an aggregate bandwidth in excess of 50 GB/sec.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;Meanwhile, the memory bus capacity on the latest &lt;/FONT&gt;&lt;A title="Nehalem architecture announcement" href="http://blogs.msdn.com/ddperf/archive/2008/04/01/thoughts-on-intel-s-recent-hardware-announcements.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2008/04/01/thoughts-on-intel-s-recent-hardware-announcements.aspx"&gt;&lt;FONT size=3&gt;Nehalem&lt;/FONT&gt;&lt;/A&gt;&lt;FONT color=#000000 size=3&gt;-class processors from Intel has been boosted significantly. Alternatively, it is when you cannot scale the memory bus with processor capacity that machines with &lt;/FONT&gt;&lt;A title="Blogging about NUMA 2008" href="http://blogs.msdn.com/ddperf/archive/2008/06/10/mainstream-numa-and-the-tcp-ip-stack-part-i.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2008/06/10/mainstream-numa-and-the-tcp-ip-stack-part-i.aspx"&gt;&lt;FONT size=3&gt;NUMA&lt;/FONT&gt;&lt;/A&gt;&lt;FONT color=#000000 size=3&gt; architectures become more attractive. The AMD processors use HyperTransport links in a ring topology that implicitly leads to NUMA characteristics. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;In Intel’s approach to NUMA scalability, some small number of processors share a common memory interface, forming a &lt;I style="mso-bidi-font-style: normal"&gt;node&lt;/I&gt;. Current Nehalem machines (also known as the Core i7 architecture) have four cores sharing an integrated memory controller, which replaces the front-side bus (FSB) used in earlier Intel designs. The physical layout of this chip is photographed in Figure 2, showing four cores connected to DDR3 DRAM using the integrated memory controller. I wasn’t able to find a speed rating for the Nehalem memory interface on Intel’s web site or elsewhere, other than ballpark estimates that put its speed in the range of 30-40 GB/sec. The QuickPath Interconnect (QPI) links that are used to connect processors and their memory controllers support 25 GB/sec transfers, which is probably a safe lower bound on the capacity of the memory interface. &lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;&lt;IMG style="WIDTH: 526px; HEIGHT: 363px" title="Core i7 4-way multiprocessor photo" alt="Core i7 4-way multiprocessor photo" src="http://5l3vgw.bay.livefilestore.com/y1pS_IQwDypWmRE8pD4mgMliuhbypb0uOI730CaN7MKi5QtXsiDyzMJ9eE4o2-kp03n19hsvrPV-MEMRRbv9L1d3Q/Nehalem%20multicore%20chip%20photo.jpg" width=526 height=363 mce_src="http://5l3vgw.bay.livefilestore.com/y1pS_IQwDypWmRE8pD4mgMliuhbypb0uOI730CaN7MKi5QtXsiDyzMJ9eE4o2-kp03n19hsvrPV-MEMRRbv9L1d3Q/Nehalem%20multicore%20chip%20photo.jpg"&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;FONT color=#4f81bd size=2&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoCaption&gt;&lt;STRONG&gt;Figure 2. Aerial photograph showing the layout of the 4-way Core i7 (Nehalem) microprocessor chip.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;The &lt;/FONT&gt;&lt;A href="https://cfwebprod.sandia.gov/cfdocs/CCIM/docs/pim-mpi.pdf"&gt;&lt;FONT color=#0000ff size=3&gt;PIM architecture&lt;/FONT&gt;&lt;/A&gt;&lt;FONT color=#000000 size=3&gt;, whose scalability curves are close to ideal for the Sandia Labs workloads, is, probably not coincidentally, a processor architecture championed at Sandia Labs. The idea behind PIM machines &lt;SPAN style="LINE-HEIGHT: 115%; FONT-FAMILY: 'Calibri','sans-serif'; FONT-SIZE: 11pt; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA"&gt;(&lt;U&gt;P&lt;/U&gt;rocessor &lt;U&gt;I&lt;/U&gt;n &lt;U&gt;M&lt;/U&gt;emory) &lt;/SPAN&gt;is that the processor (or processors) is embedded into the memory chip itself, which is a pretty interesting approach to solving the “memory wall” that limits performance in today’s dominant computer architectures. Instead of loading up the microprocessor socket with more and more cores, which is the professed hardware roadmap at Intel &amp;amp; AMD, integrating memory into the socket is an intriguing alternative. Such machines, if anyone were to build them, would obviously have NUMA performance characteristics.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT color=#000000 size=3&gt;The debate is a bit academic for my taste, however, until these PIM architecture machines are a reality. For PIM architecture machines to ever get traction, either the microprocessor manufacturers would have to start building DRAM chips or the DRAM manufacturers would have to start building microprocessors. The way the semiconductor fabrication business is stratified today, that does not appear to be very likely in the near future.&lt;/FONT&gt;&lt;/P&gt;&lt;FONT color=#000000&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3&gt;So, in the end, the point of the Sandia Labs press release appears to be to publicize the multiprocessor hardware direction espoused mainly by Sandia Labs’ own researchers. Frankly, there have been lots and lots of different architectural approaches to parallel processing over the years, and it doesn’t look like any one approach is optimal for all computing situations. You ought to be able to pick another parallel programming workload to simulate in Figure 1 and get an entirely different ranking of the approaches.&lt;/FONT&gt;&lt;/P&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoNormal&gt;&lt;FONT size=3&gt;Still, the Sandia Labs simulation data are interesting mainly for what they say about how difficult it is going to be for developers to write parallel programs that scale well on multi-core machines. No, achieving parallel scalability isn’t child’s play for hardware manufacturers. Nor is it for software developers attempting to take advantage of parallel processing hardware, which is the subject I will start to drill into next time.&lt;/FONT&gt;&lt;/P&gt;&lt;/FONT&gt;
&lt;P style="MARGIN: 0in 0in 10pt" class=MsoCaption&gt;&lt;A title="Continue to Part 2" href="http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2009/04/29/parallel-scalability-isn-t-child-s-play-part-2-amdahl-s-law-vs-gunther-s-law.aspx"&gt;Continue to Part 2....&lt;/A&gt;.&lt;/P&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9481780" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ddperf/archive/tags/Scalability/default.aspx">Scalability</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx">Parallel programming</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Hardware/default.aspx">Hardware</category></item><item><title>Mainstream NUMA and the TCP/IP stack: Final Thoughts</title><link>http://blogs.msdn.com/ddperf/archive/2008/09/18/mainstream-numa-and-the-tcp-ip-stack-final-thoughts.aspx</link><pubDate>Fri, 19 Sep 2008 00:18:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8957878</guid><dc:creator>MarkBFriedman</dc:creator><slash:comments>5</slash:comments><comments>http://blogs.msdn.com/ddperf/comments/8957878.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ddperf/commentrss.aspx?PostID=8957878</wfw:commentRss><description>&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;This is a continuation of Part IV of this article posted &lt;A class="" title=Link-back-to-Part4 href="http://blogs.msdn.com/ddperf/archive/2008/09/09/mainstream-numa-and-the-tcp-ip-stack-part-iv-paralleling-tcp-ip.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2008/09/09/mainstream-numa-and-the-tcp-ip-stack-part-iv-paralleling-tcp-ip.aspx"&gt;&lt;FONT color=#666666&gt;here&lt;/FONT&gt;&lt;/A&gt;.&amp;nbsp;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;Note that a final version of a white paper tying this series of five blog entries together (and a PowerPoint presentation on the subject) are attached.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;For many years, the effort to improve network performance on Windows and other platforms focused on reducing the host processing requirements associated with the need to service frequent interrupts from the NIC. In the many-core era where the clock speeds of processors are constrained by power considerations, this strategy is inadequate to the growing host processing requirements that accompany high-speed networking. It is necessary to augment technologies like interrupt moderation and TCP Offload Engine that improve the efficiency of network I/O with an approach that allows TCP/IP Receive packets to be processed in parallel across multiple CPUs. Together, MSI-X and RSS are technologies that enable host processing of TCP/IP packets to scale in the many-core world, albeit not without some compromises with the prevailing model of networking using isolated, layered components.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT face=Calibri&gt;Using MSI-X and RSS, for example, the Intel 82598 10 Gigabit Ethernet Controller mentioned earlier can be mapped to a maximum of 16 processor cores that could then be devoted to networking I/O interrupt handling. Capacity-wise, this is still not sufficient processing capacity to handle the theoretical maximum load equation 3 predicts for a 10 Gb Ethernet card, but it does represent a substantial scalability improvement.&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;With this understanding of what MSI-X and RSS accomplish, let’s return for a moment to our NUMA server machine shown in Figure 6 below.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;IMG title="NUMA server with multiple RSS queues" style="WIDTH: 364px; HEIGHT: 633px" height=633 alt="NUMA server with multiple RSS queues" src="http://5l3vgw.bay.livefilestore.com/y1pP1tl3lheOVmfXixoNk6WzdhcLnXhAbVSW28AD1IJ3YyHN1ZbYhAQygJHF1fesNHfPK3ehJ6yE4w/Simple%20Two%20Node%20NUMA%20Server%20with%20two%20RSS%20Queues%20(vertical%20orientation).jpg" width=364 mce_src="http://5l3vgw.bay.livefilestore.com/y1pP1tl3lheOVmfXixoNk6WzdhcLnXhAbVSW28AD1IJ3YyHN1ZbYhAQygJHF1fesNHfPK3ehJ6yE4w/Simple%20Two%20Node%20NUMA%20Server%20with%20two%20RSS%20Queues%20(vertical%20orientation).jpg"&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;With MSI-X and Receive-Side Scaling, CPU 0 on node A and CPU 1 on node B are both enabled for processing network interrupts. Since RSS schedules the NDIS DPC to run on the same processor as the ISR, even at moderate networking loads, CPU 0 and 1 for all practical purposes become dedicated to the processing of high priority networking interrupts. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;Numerous economies of scale accrue using this approach. The same RSS process that sends all Receive packets from a single TCP connection to a specific CPU for processing improves the efficiency of that processing. The instruction execution rate of the TCP/IP protocol stack is enhanced significantly through this scheduling mechanism that enforces localization. Ultimately, TCP/IP application data buffers need to be allocated from local node memory and processed by threads confined to that node. Recently used data and instructions that networking ISRs and DPCs issue tend to reside in the dedicated cache (or caches) associated with the processor devoted to network I/O. Or, at the very least, they migrate to the last level cache that is shared by all the processors on the same NUMA node.&lt;/FONT&gt;&lt;/P&gt;
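&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;The steering step that keeps a connection on one CPU can be sketched as follows: hash the connection’s address and port fields, then use the hash to index an indirection table of CPU numbers. This is a simplified model, not the NDIS implementation. Real RSS hardware uses a keyed Toeplitz hash, so CRC32 here is just a stand-in, and the function name and table contents are hypothetical.&lt;/FONT&gt;&lt;/P&gt;

```python
import zlib

def rss_cpu(src_ip, src_port, dst_ip, dst_port, indirection_table):
    """Map a TCP connection's 4-tuple to the CPU that will service it."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode("ascii")
    bucket = zlib.crc32(key) % len(indirection_table)  # CRC32 is a stand-in hash
    return indirection_table[bucket]

# Hypothetical table: CPU 0 (node A) and CPU 1 (node B) share the load.
TABLE = [0, 1, 0, 1, 0, 1, 0, 1]

first = rss_cpu("10.0.0.5", 49152, "10.0.0.1", 80, TABLE)
again = rss_cpu("10.0.0.5", 49152, "10.0.0.1", 80, TABLE)
print(first == again)  # the same connection always lands on the same CPU
```

&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;Because the mapping depends only on the connection’s fields, every Receive packet for that connection is handled by the same processor, which is what keeps its protocol state warm in that processor’s cache.&lt;/FONT&gt;&lt;/P&gt;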
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;Ultimately, of course, the TCP layer hands data from the network I/O to an application layer that is ready to receive and process it. The implications of RSS for the application threads that process TCP receive packets and build responses for TCP/IP to send back to network clients ought to be obvious, but I will spell them out anyway. For optimal performance, these application processing threads also need to be directed to run on the same NUMA node where the TCP Receive packet was processed. This localization of the application’s threads should, of course, be subject to other load balancing considerations to prevent the ideal node from becoming severely over-committed while other CPUs on other nodes are idling or under-utilized. The performance penalty for an application thread that must run on a different node than the one that processed the original TCP/IP Receive packet is considerable because it must access the data payload of the request remotely. Networked applications need to understand these performance and capacity considerations and schedule their threads accordingly to balance the work across NUMA nodes optimally.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;Consider the ASP.NET application threads that process incoming HTTP Requests and generate HTTP Response messages. If the HTTP Request packet is processed by CPU 0 on node A in a NUMA machine, the Request packet payload is allocated in node A local memory. The ASP.NET application thread running in User mode that processes that incoming HTTP Request will run much more efficiently if it is scheduled to run on one of the other processors on node A, where it can access the payload and build the Response message using local node memory. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;There is currently no mechanism in Windows for kernel mode drivers like ndis.sys and http.sys to communicate to the application layers above them and specify the NUMA node on which a packet was originally processed. Communicating that information to the application layer is another grievous violation of the principle of isolation in the network protocol stack, but it is a necessary step to improve the performance of networking applications in the many-core era where even moderately sized server machines have NUMA characteristics.&lt;/FONT&gt;&lt;/P&gt;
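&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;If the node that processed a packet were made available to the application, the scheduling policy described above might look something like the sketch below: prefer the node where the packet was received, but fall back to the least loaded node when the preferred one is over-committed. Everything here is hypothetical, including the function name, the queue-depth measure of load, and the threshold of 8; it models the policy only and calls no Windows API.&lt;/FONT&gt;&lt;/P&gt;

```python
def pick_node(packet_node, queue_depth, max_depth=8):
    """Choose the NUMA node on which to run the application thread.

    packet_node -- node where the Receive packet was processed (hypothetical input)
    queue_depth -- dict mapping each node to its count of waiting work items
    max_depth   -- assumed limit at which a node counts as over-committed
    """
    if max_depth > queue_depth[packet_node]:
        return packet_node                       # local memory access, the ideal case
    return min(queue_depth, key=queue_depth.get)  # remote access, but no idle CPUs

print(pick_node("A", {"A": 3, "B": 0}))
print(pick_node("A", {"A": 8, "B": 2}))
```

&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;The first call stays on node A because it still has headroom; the second moves the work to node B and accepts the remote memory penalty rather than queue behind an over-committed node.&lt;/FONT&gt;&lt;/P&gt;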
&lt;H3 style="MARGIN: 10pt 0in 0pt"&gt;&lt;BR&gt;&lt;FONT face=Cambria color=#4f81bd size=2&gt;Links.&lt;/FONT&gt;&lt;/H3&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Herb Sutter, “The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software.” Dr. Dobb’s Journal, March 1, 2005. &lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://www.ddj.com/architect/184405990" mce_href="http://www.ddj.com/architect/184405990"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri color=#0000ff&gt;http://www.ddj.com/architect/184405990&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;NTttcp performance testing tool: &lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://www.microsoft.com/whdc/device/network/TCP_tool.mspx" mce_href="http://www.microsoft.com/whdc/device/network/TCP_tool.mspx"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri color=#0000ff&gt;http://www.microsoft.com/whdc/device/network/TCP_tool.mspx&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Windows Performance Toolkit (WPT, aka xperf): &lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/cc305218.aspx" mce_href="http://msdn.microsoft.com/en-us/library/cc305218.aspx"&gt;&lt;FONT color=#0000ff&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bookmark: xperfLink"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;http://msdn.microsoft.com/en-us/library/cc305218.aspx&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: xperfLink"&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: xperfLink"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;David Kanter, “The Common System Interface: Intel's Future Interconnect,” &lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://www.realworldtech.com/includes/templates/articles.cfm?ArticleID=RWT082807020032&amp;amp;mode=print" mce_href="http://www.realworldtech.com/includes/templates/articles.cfm?ArticleID=RWT082807020032&amp;amp;mode=print"&gt;&lt;FONT color=#0000ff&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bookmark: TheCommonSystemInterface"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;http://www.realworldtech.com/includes/templates/articles.cfm?ArticleID=RWT082807020032&amp;amp;mode=print&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: TheCommonSystemInterface"&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: TheCommonSystemInterface"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Windows NUMA support: &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/aa363804.aspx" mce_href="http://msdn.microsoft.com/en-us/library/aa363804.aspx"&gt;&lt;FONT color=#0000ff&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bookmark: WindowsNUMAsupport"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;http://msdn.microsoft.com/en-us/library/aa363804.aspx&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: WindowsNUMAsupport"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: WindowsNUMAsupport"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;SPAN style="mso-bookmark: WindowsNUMAsupport"&gt;&lt;/SPAN&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Intel white paper: &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="http://www.intel.com/technology/ioacceleration/306484.pdf" mce_href="http://www.intel.com/technology/ioacceleration/306484.pdf"&gt;&lt;FONT color=#0000ff&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;Accelerating High-Speed Networking with Intel® I/O Acceleration Technology&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Mark B. Friedman, “&lt;/FONT&gt;&lt;A class="" title=SANCapacityPlanningLink name=SANCapacityPlanningLink&gt;&lt;/A&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="http://www.demandtech.com/Resources/Papers/Intro%20to%20SAN%20capacity%20planning.pdf" mce_href="http://www.demandtech.com/Resources/Papers/Intro%20to%20SAN%20capacity%20planning.pdf"&gt;&lt;FONT color=#0000ff&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;SPAN style="mso-bookmark: SANCapacityPlanningLink"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;An Introduction to SAN Capacity Planning&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;SPAN style="mso-bookmark: SANCapacityPlanningLink"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: SANCapacityPlanningLink"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;,” &lt;I style="mso-bidi-font-style: normal"&gt;Proceedings&lt;/I&gt;, Computer Measurement Group, Dec. 2001.&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Jeffrey Mogul’s “TCP offload is a dumb idea whose time has come,” &lt;I style="mso-bidi-font-style: normal"&gt;Proceedings&lt;/I&gt; of the 9th conference on Hot Topics in Operating Systems - Volume 9, 2003. &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="http://portal.acm.org/citation.cfm?id=1251059&amp;amp;dl=ACM&amp;amp;coll=portal&amp;amp;CFID=71988909&amp;amp;CFTOKEN=98964748" mce_href="http://portal.acm.org/citation.cfm?id=1251059&amp;amp;dl=ACM&amp;amp;coll=portal&amp;amp;CFID=71988909&amp;amp;CFTOKEN=98964748"&gt;&lt;FONT color=#0000ff&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;SPAN style="mso-bookmark: TCPOffloadDumbIdeaLink"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;http://portal.acm.org/citation.cfm?id=1251059&amp;amp;dl=ACM&amp;amp;coll=portal&amp;amp;CFID=71988909&amp;amp;CFTOKEN=98964748&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;SPAN style="mso-bookmark: TCPOffloadDumbIdeaLink"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;SPAN style="mso-bookmark: TCPOffloadDumbIdeaLink"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;SPAN style="mso-bookmark: TCPOffloadDumbIdeaLink"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: IOATwhitepaper"&gt;&lt;/SPAN&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Dell Computer Corporation, “&lt;/FONT&gt;&lt;A class="" title=DellTCPOffloadwhitepaper name=DellTCPOffloadwhitepaper&gt;&lt;/A&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="http://www.dell.com/downloads/global/vectors/ps3q06-20060132-Broad_com.pdf" mce_href="http://www.dell.com/downloads/global/vectors/ps3q06-20060132-Broad_com.pdf"&gt;&lt;FONT color=#0000ff&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: DellTCPOffloadwhitepaper"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;Boosting Data Transfer with TCP Offload Engine Technology&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: DellTCPOffloadwhitepaper"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: DellTCPOffloadwhitepaper"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;.”&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Microsoft Corporation, KB 951037, &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="http://support.microsoft.com/kb/951037" mce_href="http://support.microsoft.com/kb/951037"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;http://support.microsoft.com/kb/951037&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Microsoft Corporation, Windows Driver Development Kit (DDK) documentation, &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/cc264906.aspx" mce_href="http://msdn.microsoft.com/en-us/library/cc264906.aspx"&gt;&lt;FONT color=#0000ff&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;http://msdn.microsoft.com/en-us/library/cc264906.aspx&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Microsoft Corporation, KB 927168, &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="http://support.microsoft.com/kb/927168" mce_href="http://support.microsoft.com/kb/927168"&gt;&lt;FONT color=#0000ff&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;SPAN style="mso-bookmark: KB927168"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;http://support.microsoft.com/kb/927168&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;SPAN style="mso-bookmark: KB927168"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;SPAN style="mso-bookmark: KB927168"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;SPAN style="mso-bookmark: KB927168"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt"&gt;&lt;FONT face=Calibri&gt;Microsoft Corporation,&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;NDIS 6.0 Receive-Side Scaling documentation, &lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/ms795609.aspx" mce_href="http://msdn.microsoft.com/en-us/library/ms795609.aspx"&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;SPAN style="mso-bookmark: KB927168"&gt;&lt;SPAN style="mso-bookmark: NDIS6ReceiveSideScaling"&gt;&lt;SPAN style="mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin"&gt;&lt;FONT face=Calibri color=#0000ff&gt;http://msdn.microsoft.com/en-us/library/ms795609.aspx&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;SPAN style="mso-bookmark: KB927168"&gt;&lt;SPAN style="mso-bookmark: NDIS6ReceiveSideScaling"&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="mso-bookmark: NDIS6ReceiveSideScaling"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: KB927168"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: WindowsDDKLink"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: KB951037"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bookmark: ReskitPerfGuidebook"&gt;&lt;/SPAN&gt;&lt;SPAN style="mso-bidi-font-size: 
9.0pt"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8957878" width="1" height="1"&gt;</description><enclosure url="http://cid-12a53f90793d2c8b.skydrive.live.com/self.aspx/DDPEBlogImages/Presentations%20and%20Papers/Mainstream%20NUMA%20and%20the%20TCP%20|5CMG%20paper%208220%20draft|6.docx" length="24241" type="text/html; charset=utf-8" /><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance+Engineering/default.aspx">Performance Engineering</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/.NET/default.aspx">.NET</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Scalability/default.aspx">Scalability</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx">Parallel programming</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance/default.aspx">Performance</category></item><item><title>Mainstream NUMA &amp; the TCP/IP stack: Part 2: Programming ccNUMA machines</title><link>http://blogs.msdn.com/ddperf/archive/2008/07/27/mainstream-numa-and-the-tcp-ip-stack-part-i-programming-ccnuma-machines.aspx</link><pubDate>Sun, 27 Jul 2008 21:02:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8780016</guid><dc:creator>MarkBFriedman</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/ddperf/comments/8780016.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ddperf/commentrss.aspx?PostID=8780016</wfw:commentRss><description>&lt;P&gt;This is a continuation of Part I of this article posted &lt;A class="" title="Link-back to Part 1" href="http://blogs.msdn.com/ddperf/archive/2008/06/10/mainstream-numa-and-the-tcp-ip-stack-part-i.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2008/06/10/mainstream-numa-and-the-tcp-ip-stack-part-i.aspx"&gt;here&lt;/A&gt;.&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;In Part 1 of this article, we looked at the capacity issues that are driving architectural changes in the TCP/IP networking stack. While network interfaces are increasing in throughput capacity, processor speeds in the multi-core era are not keeping pace. Meanwhile, the TCP/IP protocol has grown in complexity so that host processing requirements are increasing, too. The only way for networked computers to scale in the multi-core era is to begin distributing networking I/O operations across multiple processors. Since bigger server machines rely on NUMA architectures for scalability, high speed networking is also evolving to exploit machines with NUMA architectures in an optimal fashion.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;Machines with NUMA (non-uniform memory access speeds) architectures are usually large scale multiprocessors that are assembled using building blocks, or &lt;I style="mso-bidi-font-style: normal"&gt;nodes&lt;/I&gt;, that each contain some number of CPUs, some amount of RAM, and various other peripheral connections. Nodes are often configured on separate boards, for example, or specific segments of a board. Multiple nodes are then interconnected with high speed links of some sort that permit all the memory that is configured to be available to executing programs. There are many schools of thought on what the best interconnection technology is. Some manufacturers favor tree structures, some favor directory schemes, some favor network-like routing. A key feature of the architecture is that the latency of a memory fetch depends on the physical location of the RAM being accessed. Accessing RAM attached to the local node is faster than a memory fetch to a remote location that is physically located on another node. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;Within &lt;A class="" title="Nehalem Hyperlink1" href="http://www.realworldtech.com/includes/templates/articles.cfm?ArticleID=RWT040208182719&amp;amp;mode=print" mce_href="http://www.realworldtech.com/includes/templates/articles.cfm?ArticleID=RWT040208182719&amp;amp;mode=print"&gt;one of the new Intel Nehalem many-core microprocessors&lt;/A&gt;, for example, all the processor cores and their logical processors can access local memory at a uniform speed. Figure 3 is a schematic diagram depicting a 4-way Nehalem multiprocessor chip that is connected to a bank of RAM. The configuration of processors and RAM shown in Figure 3 is a building block that is used in creating a larger scale machine by connecting two or more such nodes together. &lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;IMG title="Quad-core node" style="WIDTH: 362px; HEIGHT: 520px" height=520 alt="Quad-core node" src="http://5l3vgw.bay.livefilestore.com/y1pysaX_fyaHyL_hZhhGIyXP5RhSKILXbj8AXnupeLec_hHtxoKXb6Z48TZYahS02yXpSrpH6b9-mY/Quad-core%20Node%20Drawing.jpg" width=362 mce_src="http://5l3vgw.bay.livefilestore.com/y1pysaX_fyaHyL_hZhhGIyXP5RhSKILXbj8AXnupeLec_hHtxoKXb6Z48TZYahS02yXpSrpH6b9-mY/Quad-core%20Node%20Drawing.jpg"&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; mso-bidi-font-size: 11.0pt"&gt;Figure 3.&lt;/SPAN&gt;&lt;/B&gt;&lt;SPAN style="FONT-SIZE: 9pt; mso-bidi-font-size: 11.0pt"&gt; &lt;EM&gt;A schematic diagram depicting a NUMA node showing locally-attached RAM and a multi-core socket.&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/EM&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;A two-node NUMA server is illustrated in Figure 4, which shows a direct connection between the memory controller on node A and the memory controller on node B. This is the relatively simple case. A thread executing on node A can access any RAM location on either node, but an access to a local memory address is considerably faster. An access to a remote memory location is several times slower. (Definitive timings are not available as of this writing because early versions of the hardware are just starting to become available.)&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; mso-bidi-font-size: 11.0pt"&gt;&lt;IMG title="Two-node NUMA server based on Nehalem" style="WIDTH: 407px; HEIGHT: 1072px" height=1072 alt="Two-node NUMA server based on Nehalem" src="http://5l3vgw.bay.livefilestore.com/y1pNycwWsLj-tQabletYlwpg3Jnn7wvCJGYF-7IKnkz7PITD2CeK6cTdNqU3uDM8GRBK0iw64sEKZ8/Simple%20Two%20Node%20NUMA%20Server%20Drawing%20(vertical%20orientation).jpg" width=407 mce_src="http://5l3vgw.bay.livefilestore.com/y1pNycwWsLj-tQabletYlwpg3Jnn7wvCJGYF-7IKnkz7PITD2CeK6cTdNqU3uDM8GRBK0iw64sEKZ8/Simple%20Two%20Node%20NUMA%20Server%20Drawing%20(vertical%20orientation).jpg"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; mso-bidi-font-size: 11.0pt"&gt;Figure 4.&lt;/SPAN&gt;&lt;/B&gt;&lt;SPAN style="FONT-SIZE: 9pt; mso-bidi-font-size: 11.0pt"&gt; &lt;EM&gt;A two-node NUMA server showing a cross-node link that is used when a thread on one node needs to access a remote memory location.&lt;o:p&gt;&lt;/o:p&gt;&lt;/EM&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;As the number of nodes increases, it is no longer feasible for every node to be directly connected to every other node, nor can each bank of RAM that is installed be accessed in a single hop. The specific technology used to link nodes may introduce additional variation in the cost of accessing remote memory. From any one node, it could take longer to access memory on some nodes than others. For instance, some nodes may be accessed in a single hop across a direct link, while other accesses may require multiple hops. Some manufacturers favor routing through a shared directory service, for example. Your mileage may vary.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;Specifically, in the Intel architecture, manufacturers are supplying a cache coherent flavor of NUMA servers (ccNUMA). Cache coherence is implemented using a snooping protocol to ensure that threads executing on each NUMA node have access to the most current copy of the contents of the distributed memory. Details of the snooping protocol used in Intel ccNUMA machines are discussed &lt;/FONT&gt;&lt;A href="http://www.realworldtech.com/includes/templates/articles.cfm?ArticleID=RWT082807020032&amp;amp;mode=print"&gt;&lt;FONT face=Calibri color=#0000ff&gt;here&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri&gt;.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;AMD has taken a somewhat different tack in building its multi-core processors. For on-chip communication between processors, AMD uses a technology known as HyperTransport (HT), which is a dedicated, per-processor 2-way high speed link. Multiple processor cores are then linked on the chip in a ring topology as depicted in Figure 5. The ring topology has the effect of scaling the bus bandwidth that is used as an interconnect linearly with the number of processors. But the architecture leads to NUMA characteristics. A thread executing on CPU 0 can access a local memory location, a remote memory location that is local to CPU 1 at the cost of one hop across the HT link, or a remote memory location that is local to CPU 2 at the cost of two hops across HT links.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;IMG title="AMD multi-core socket" style="WIDTH: 435px; HEIGHT: 435px" height=435 alt="AMD multi-core socket" src="http://5l3vgw.bay.livefilestore.com/y1pt8apQ0QRaEO0kR9KRDE29WNelvL0WCkG3i6aQTMLuL52t-DmDG1bUcWKUlO_qNaHWOaCGRePA_w/AMD%20multicore%20socket.jpg" width=435 mce_src="http://5l3vgw.bay.livefilestore.com/y1pt8apQ0QRaEO0kR9KRDE29WNelvL0WCkG3i6aQTMLuL52t-DmDG1bUcWKUlO_qNaHWOaCGRePA_w/AMD%20multicore%20socket.jpg"&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; mso-bidi-font-size: 11.0pt"&gt;Figure 5.&lt;/SPAN&gt;&lt;/B&gt;&lt;SPAN style="FONT-SIZE: 9pt; mso-bidi-font-size: 11.0pt"&gt; &lt;EM&gt;The AMD approach to multi-core processors has NUMA characteristics. A program executing on CPU 0 that accesses RAM that is local to CPU 2 requires two hops across the HyperTransport links that connect the processors in a ring.&lt;o:p&gt;&lt;/o:p&gt;&lt;/EM&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
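&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;As a rough illustration of the hop counts described above, the following Python sketch computes the smallest number of HT hops between two CPUs on a ring. The function name ring_hops, the four-CPU ring size, and the assumption that each link can be traversed in either direction are illustrative choices made here, not details taken from AMD documentation.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
# Hypothetical sketch: hop counts between CPUs linked in a ring.
# Assumes a four-CPU ring whose links can be traversed in either direction.

def ring_hops(src, dst, num_cpus=4):
    """Return the smallest number of link hops from CPU src to CPU dst."""
    # Distance going one way around the ring.
    one_way = (dst - src) % num_cpus
    # The other direction covers the remainder of the ring; take the shorter.
    return min(one_way, num_cpus - one_way)


if __name__ == "__main__":
    for cpu in range(4):
        print("CPU 0 to CPU", cpu, "hops:", ring_hops(0, cpu))
```
&lt;/PRE&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;Under these assumptions, a zero-hop result corresponds to a local memory access, and the two-hop case from CPU 0 to CPU 2 matches the example in Figure 5.&lt;/FONT&gt;&lt;/P&gt;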
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;Historically, application development for NUMA machines meant understanding the performance costs associated with accessing remote memory on a specific hardware platform. Since manufacturers employ different proprietary interconnection schemes in their multi-tiered NUMA machines, application developers are challenged to find the right balance in exploiting a specific proprietary architecture that may then limit the ability to port the application to a different platform in the future. It may be possible to connect nodes in a NUMA machine in an asymmetric configuration, for example, where the performance cost function associated with accessing different memory locations is decidedly irregular.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;To scale well, a multi-threaded program running on a NUMA machine needs to be aware of the machine environment and understand which memory references are local to the node and which are remote. A thread that was running on one NUMA node that migrates to another node pays a heavy price every time it has to fetch results from remote memory locations. The difficulty programmers face when trying to develop a scalable, multi-threaded application for a NUMA architecture machine is understanding their memory usage pattern and how it maps to the NUMA topology. When NUMA considerations were confined to expensive, high-end supercomputers, the inherent complexities developers faced in programming them were considered relatively esoteric concerns. However, in the era of many-core processors, NUMA is poised to become a mainstream architecture. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;In theory, it is possible to craft an optimal solution when threads and the memory they access are &lt;I style="mso-bidi-font-style: normal"&gt;balanced&lt;/I&gt; across NUMA processing nodes. In order to achieve an optimal balancing of the machine's resources without overloading any of them, programs need to understand the CPU and memory resources that individual tasks executing in parallel require and understand how to best map those resources to the topology of the machine. Then they require a suitable scheduling mechanism to achieve the desired result. Achieving an optimal balance, as a practical matter, is not easy in the face of variability in the resources required by any of the execution threads, a complication that may then require dynamic adjustments to the scheduling policy in effect. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;The Windows OS is already NUMA-aware to a degree and, thus, supports a NUMA programming model. For example, once dispatched, threads have node affinity and tend to stay dispatched on an available processor within a node. Windows OS memory management is also NUMA-aware, maintaining per node allocation pools. The OS not only resists migrating threads to another node, it also tries to ensure that most memory allocations are satisfied locally using per node memory management data structures. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;Windows also provides a number of NUMA-oriented APIs that applications can use to keep their threads from migrating off-node and also enable them to direct memory allocations to a specific physical processing node. For more information on the NUMA support in Windows, see the MSDN Help topic “&lt;/FONT&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/aa363804.aspx"&gt;&lt;FONT face=Calibri color=#0000ff&gt;NUMA Support&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri&gt;.” &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;FONT face=Calibri&gt;To help application developers deal better with the complexities of NUMA architectures in the future, the Windows NUMA support needs to evolve. One potential approach would be for the OS to attempt to calculate a performance cost function at start-up that it would then expose to driver and application programs when they start up and run. Conceivably, the OS might also need to adjust this performance cost function in response to configuration changes that occur dynamically, such as any power management event that affects memory latency. These changes would then have to be communicated to NUMA-aware drivers and applications somehow so they could adapt to changing conditions.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=CMGpaperbody style="MARGIN: 0in 0in 3pt"&gt;&lt;A class="" title=Link-to-Part3 href="http://blogs.msdn.com/ddperf/archive/2008/08/06/mainstream-numa-and-the-tcp-ip-stack-part-iii-a-look-back-at-strategies-to-scale-high-speed-networking.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2008/08/06/mainstream-numa-and-the-tcp-ip-stack-part-iii-a-look-back-at-strategies-to-scale-high-speed-networking.aspx"&gt;Continue to Part III of this article.&lt;/A&gt; &lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8780016" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance+Engineering/default.aspx">Performance Engineering</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Scalability/default.aspx">Scalability</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx">Parallel programming</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance/default.aspx">Performance</category></item><item><title>Mainstream NUMA and the TCP/IP stack: Part I.</title><link>http://blogs.msdn.com/ddperf/archive/2008/06/10/mainstream-numa-and-the-tcp-ip-stack-part-i.aspx</link><pubDate>Tue, 10 Jun 2008 04:44:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8588240</guid><dc:creator>MarkBFriedman</dc:creator><slash:comments>8</slash:comments><comments>http://blogs.msdn.com/ddperf/comments/8588240.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ddperf/commentrss.aspx?PostID=8588240</wfw:commentRss><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;One of the intriguing aspects of the onset of the many-core processor era is the necessity of using parallel programming techniques to reap the performance benefits of this and future generations of processor chips. 
Instead of significantly faster processors, we are getting more of them packaged on a single chip. To build the cost-effective mid-range blade servers configured in huge server farms to drive today’s Internet-based applications, the hardware manufacturers are tying together these complex multiprocessor chips to create NUMA architecture machines. There is nothing the matter with NUMA – machines with &lt;U&gt;n&lt;/U&gt;on-&lt;U&gt;u&lt;/U&gt;niform &lt;U&gt;m&lt;/U&gt;emory &lt;U&gt;a&lt;/U&gt;ccess speeds – of course, other than the fact that they introduce complex, hardware-specific programming models if you want to build applications that can harness their performance and capacity effectively. What is decidedly new is the extent to which previously esoteric NUMA architecture machines are becoming mainstream building blocks for current and future application servers.&lt;SPAN style="COLOR: black"&gt; For the connected applications of the future, our ability to build programming models that help server application developers deal with complex NUMA architecture performance considerations is the singular challenge of the many-core era.&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;In this blog entry, I will discuss the way both these trends -- multi-core processors and mainstream NUMA architectures – come together to influence the way high speed internetworking works today on servers of various sorts that need to handle a high volume of TCP/IP traffic. These include IIS web servers, Terminal Servers, SQL Servers, Exchange servers, Office Communicator servers, and others. Profound changes were necessary in the TCP/IP networking stack in both &lt;/FONT&gt;&lt;SPAN style="FONT-SIZE: 8.5pt; COLOR: black; LINE-HEIGHT: 115%; FONT-FAMILY: 'Verdana','sans-serif'"&gt;Windows Server 2008 and the &lt;/SPAN&gt;&lt;A href="http://support.microsoft.com/?kbid=912222"&gt;&lt;FONT face=Calibri color=#0000ff size=3&gt;Microsoft Windows Server 2003 Scalable Networking Pack release&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt; to scale effectively on multi-processor machines. These changes are associated with a technology known as Receive-Side Scaling, or RSS. RSS&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;has serious performance implications on the architecture of highly scalable server applications that sit atop the TCP/IP stack in connected system environments. &lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Let’s start by considering what is happening to the TCP/IP software stack in Windows to support high speed networking, which is depicted in Figure 1. &lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;IMG title="NDIS TCP/IP protocol stack" style="WIDTH: 336px; HEIGHT: 425px" height=425 alt="NDIS TCP/IP protocol stack" src="http://5l3vgw.bay.livefilestore.com/y1pzbrt-s_HGnmLf_7hbLr4iCGVLnLvnA9PYQPxPNWiXL30FLiihEPz61UoQ7O_rAlXjRr6U_2Qcyg0bIeHXmOhIA/NDIS%20Network%20stack(compressed).jpg" width=336 mce_src="http://5l3vgw.bay.livefilestore.com/y1pzbrt-s_HGnmLf_7hbLr4iCGVLnLvnA9PYQPxPNWiXL30FLiihEPz61UoQ7O_rAlXjRr6U_2Qcyg0bIeHXmOhIA/NDIS%20Network%20stack(compressed).jpg"&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;o:p&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;The Internet Protocol (IP) and the Transmission Control Protocol (TCP) are the standardized software layers that sit atop the networking hardware. The Ethernet protocol is the pervasive Media Access Control (MAC) layer that segregates the transmission of digital bits into individual packets. Performance issues with Ethernet arise due to the relatively small size of each packet. The Maximum Transmission Unit (MTU) for standard-sized Ethernet packets is 1500 bytes. Any messages that are larger than the MTU require segmentation to fit in standard-sized Ethernet packets. (Segmentation on the Send side and reassembly on the Receive side are functions performed by the next higher level protocol in the stack, namely the IP layer.) Not all transmissions are maximum-sized packets. For example, the Acknowledgement (ACK) packets required and frequently issued in TCP consist of 50-byte packet headers only, with no additional data payload. On average, the size of packets circulating over the Internet is actually much less than the protocol-defined MTU.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;The performance problems arise because in a basic networking scheme, each packet received by the Network Interface Card (NIC) delivers a hardware interrupt to the host processor, requiring in turn some associated processing time on the host computer to service that interrupt. The TCP/IP protocol is reasonably complex, so the amount of host processing per interrupt is considerable.&lt;A class="" title=_ftnref1 style="mso-footnote-id: ftn1" href="http://blogs.msdn.com/tiny_mce/jscripts/tiny_mce/blank.htm#_ftn1" name=_ftnref1&gt;&lt;SPAN class=MsoFootnoteReference&gt;&lt;SPAN style="mso-special-character: footnote"&gt;&lt;SPAN class=MsoFootnoteReference&gt;&lt;SPAN style="FONT-SIZE: 11pt; COLOR: black; LINE-HEIGHT: 115%; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: PMingLiU; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: ZH-TW; mso-bidi-language: AR-SA"&gt;[1]&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt; As transmission bit rates have increased from 10 Mb to 100 Mb to 1 Gb to today’s 10 Gb NICs, the potential interrupt rate rises proportionally. The host CPU load associated with processing network interrupts is a long-standing issue in the world of high speed networking. The problem has taken on a new dimension in the many-core era because the network interface continues to get faster, but processor speeds are no longer keeping pace.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;A back-of-the-envelope calculation to figure out how many interrupts/sec a host computer with a 10 Gb Ethernet card potentially needs to handle should illustrate the scope of the problem. The Ethernet wire protocol specifies a redundant coding scheme that encodes successive eight bits of data with two bits of error correction data. This is known as 10/8 encoding. With 10/8 encoding, a 10 Megabit Ethernet card has a nominal data rate of 1 Megabyte/sec, a 100 Mb Ethernet NIC transmits data at 10 MB/sec, and so on. Similarly, the 10 Gb Ethernet card has the capacity to transmit application data at 1 GB/sec.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;To understand the rate at which interrupts need to be processed on a host computer system to sustain 1 GB/second throughput with Ethernet, simply divide by the average packet size. To keep the math easy, assume an &lt;I style="mso-bidi-font-style: normal"&gt;average&lt;/I&gt; packet size of 1 KB or less. (This is not an outlandish assumption. A large portion of the Receive packets processed at a typical web server are TCP ACKs; these are minimum 50-byte, headers-only packets. Meanwhile, HTTP GET requests containing a URL, a cookie value, and other optional parameters can usually fit in a single 1500-byte Ethernet packet; in practice, the cookie data that most web applications store is often less than 1 KB.) Assuming an average packet size of 1 KB, a 10 Gb Ethernet card that can transfer data at a 1 GB/sec rate has the capability of generating 1 million operations/sec on your networking server. Next, assume there is a 1:1 ratio of Send:Receive packets. If 50% of those are Receive operations, then the machine needs to support 500K interrupts/sec.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Now, if the number of instructions associated with network interrupt processing in the Interrupt Service Routine (ISR) associated with the device, the Deferred Procedure Call (DPC), and the next higher layers in the Network Device Interface Specification (NDIS) stack to support TCP/IP is, let’s say 10,000, then the processor load to service TCP/IP networking requests is:&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt; TEXT-ALIGN: center" align=center&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;SPAN lang=FR style="COLOR: black; mso-ansi-language: FR"&gt;500,000 interrupts/sec * 10,000 instructions = 5,000,000,000 &lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;instructions/second&lt;/SPAN&gt;&lt;/I&gt;&lt;SPAN lang=FR style="COLOR: black; mso-ansi-language: FR"&gt; &lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;(1)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;which easily exceeds the capacity of a single CPU in the many-core era. If these network interrupts are confined to a single processor, which is the way things worked in days of yore, host processor speed is a bottleneck that will constrain the performance of a high speed NIC.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
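&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;The arithmetic behind equation 1 can be sketched in a few lines of Python. This is only an illustration: the function name and constants are hypothetical, and the inputs are the same round-number assumptions made above (1 GB/sec of data, 1 KB average packets rounded to 1,000 bytes, a 1:1 Send:Receive ratio, and a guessed path length of 10,000 instructions per interrupt).&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;

```python
# Back-of-the-envelope estimate; every constant is an assumption taken from the text.
LINK_BYTES_PER_SEC = 1_000_000_000   # 10 Gb Ethernet treated as 1 GB/sec of data
AVG_PACKET_BYTES = 1_000             # "1 KB" average packet, rounded to keep the math easy
RECEIVE_FRACTION = 0.5               # 1:1 ratio of Send:Receive packets
INSTRUCTIONS_PER_INTERRUPT = 10_000  # guessed path length per interrupt


def interrupt_load(link_bytes_per_sec, avg_packet_bytes, receive_fraction, path_length):
    """Return (packets/sec, receive interrupts/sec, instructions/sec)."""
    packets_per_sec = link_bytes_per_sec / avg_packet_bytes
    # One interrupt per received packet, as in the basic scheme described above.
    interrupts_per_sec = packets_per_sec * receive_fraction
    return packets_per_sec, interrupts_per_sec, interrupts_per_sec * path_length


packets, interrupts, instructions = interrupt_load(
    LINK_BYTES_PER_SEC, AVG_PACKET_BYTES, RECEIVE_FRACTION, INSTRUCTIONS_PER_INTERRUPT
)
print(f"{packets:,.0f} packets/sec")
print(f"{interrupts:,.0f} interrupts/sec")
print(f"{instructions:,.0f} instructions/sec")
```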
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Of course, instead of wishing and hoping that TCP interrupt processing could be accomplished within 10K instructions in today’s complex networking environment, it might help to actually try and measure the CPU path length associated with this processing. To measure the impact of the current TCP/IP stack in Windows Vista, I installed the NTttcp Test tool available &lt;/SPAN&gt;&lt;A href="http://www.microsoft.com/whdc/device/network/TCP_tool.mspx"&gt;here&lt;/A&gt;&lt;SPAN style="COLOR: black"&gt; and set up a simple test using the 1 Gb Ethernet NIC installed on a dual-core 2.2 GHz machine running Windows Vista SP1 over a dedicated Gigabit Ethernet network segment. Since the goal of the test was not to maximize network throughput, I specified 512-byte sized packets and was careful to confine the TCP interrupt processing to CPU 0 using the following NTttcp parameters:&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt; TEXT-INDENT: 0.5in"&gt;&lt;SPAN lang=FR style="FONT-SIZE: 9pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Courier New'; mso-ansi-language: FR"&gt;ntttcpr -m 1,0,192.168.3.51 -a 16 -l 512 -mb -fr -t 120&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;I was also careful to shut down all other networking applications on my machine for the duration of the test.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Here’s the output from a 120-second NTttcp run, allowing for both a warm-up and cool-down period wrapped around the main test:&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;
&lt;TABLE class=MsoTableGrid style="BORDER-RIGHT: medium none; BORDER-TOP: medium none; BORDER-LEFT: medium none; BORDER-BOTTOM: medium none; BORDER-COLLAPSE: collapse; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-yfti-tbllook: 1184; mso-padding-alt: 0in 5.4pt 0in 5.4pt" cellSpacing=0 cellPadding=0 border=1 class="MsoTableGrid"&gt;
&lt;TBODY&gt;
&lt;TR style="mso-yfti-irow: 0; mso-yfti-firstrow: yes"&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: black 1pt solid; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: black 1pt solid; WIDTH: 108.9pt; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1" vAlign=top width=145&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Throughput(KB/s)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: black 1pt solid; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 1in; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-left-alt: solid black .5pt; mso-border-left-themecolor: text1" vAlign=top width=96&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;16,475.553&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="mso-yfti-irow: 1"&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: black 1pt solid; WIDTH: 108.9pt; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1" vAlign=top width=145&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Throughput(Mbit/s)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 1in; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-left-alt: solid black .5pt; mso-border-left-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1; mso-border-bottom-themecolor: text1; mso-border-right-themecolor: text1" vAlign=top width=96&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;131.804&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="mso-yfti-irow: 2"&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: black 1pt solid; WIDTH: 108.9pt; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1" vAlign=top width=145&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Average Frame Size&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 1in; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-left-alt: solid black .5pt; mso-border-left-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1; mso-border-bottom-themecolor: text1; mso-border-right-themecolor: text1" vAlign=top width=96&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;764.394&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="mso-yfti-irow: 3"&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: black 1pt solid; WIDTH: 108.9pt; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1" vAlign=top width=145&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Packets Sent&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 1in; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-left-alt: solid black .5pt; mso-border-left-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1; mso-border-bottom-themecolor: text1; mso-border-right-themecolor: text1" vAlign=top width=96&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;1,309,892&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="mso-yfti-irow: 4"&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: black 1pt solid; WIDTH: 108.9pt; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1" vAlign=top width=145&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Packets Received&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 1in; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-left-alt: solid black .5pt; mso-border-left-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1; mso-border-bottom-themecolor: text1; mso-border-right-themecolor: text1" vAlign=top width=96&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;2,586,923&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="mso-yfti-irow: 5"&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: black 1pt solid; WIDTH: 108.9pt; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1" vAlign=top width=145&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Packets Received/Interrupt&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 1in; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-left-alt: solid black .5pt; mso-border-left-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1; mso-border-bottom-themecolor: text1; mso-border-right-themecolor: text1" vAlign=top width=96&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;2&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="mso-yfti-irow: 6"&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: black 1pt solid; WIDTH: 108.9pt; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1" vAlign=top width=145&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Interrupts/sec&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 1in; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-left-alt: solid black .5pt; mso-border-left-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1; mso-border-bottom-themecolor: text1; mso-border-right-themecolor: text1" vAlign=top width=96&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;9,494.04&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="mso-yfti-irow: 7; mso-yfti-lastrow: yes"&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: black 1pt solid; WIDTH: 108.9pt; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1" vAlign=top width=145&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Cycles/Byte&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: black 1pt solid; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 1in; PADDING-TOP: 0in; BORDER-BOTTOM: black 1pt solid; BACKGROUND-COLOR: transparent; mso-border-alt: solid black .5pt; mso-border-themecolor: text1; mso-border-left-alt: solid black .5pt; mso-border-left-themecolor: text1; mso-border-top-alt: solid black .5pt; mso-border-top-themecolor: text1; mso-border-bottom-themecolor: text1; mso-border-right-themecolor: text1" vAlign=top width=96&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;129.3&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
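&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;As a quick sanity check, two rows of the table can be re-derived from the others. The sketch below is illustrative only. It assumes that NTttcp reports KB as 1,000 bytes and that the interrupt rate is an average over the full 120-second run; neither assumption is stated in the tool output shown here.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;

```python
# Values copied from the NTttcp output table.
THROUGHPUT_KB_PER_SEC = 16_475.553
PACKETS_RECEIVED = 2_586_923
INTERRUPTS_PER_SEC = 9_494.04
TEST_SECONDS = 120  # from the -t 120 parameter

# Assumption: 1 KB = 1,000 bytes and 1 Mbit = 1,000,000 bits.
throughput_mbit = THROUGHPUT_KB_PER_SEC * 8 / 1_000

# Assumption: the interrupt rate is averaged over the whole run.
total_interrupts = INTERRUPTS_PER_SEC * TEST_SECONDS
packets_per_interrupt = PACKETS_RECEIVED / total_interrupts

print(f"Throughput: {throughput_mbit:.3f} Mbit/s")
print(f"Packets received per interrupt: {packets_per_interrupt:.2f}")
```

&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Under those assumptions the derived throughput agrees with the reported Mbit/s figure, and the packets-per-interrupt ratio comes out a little above the reported value of 2.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;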
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;On the dual-core machine, CPU 0 was maxed out at 100% for the duration of the test – evidently, that was the capacity of the machine to Receive TCP/IP packets and process and return the necessary Acknowledgement packets to the Sender. I will drill into the CPU usage statistics in a moment. For now, let’s focus on the interrupt rate, which was about 9500 interrupts/sec or slightly more than 100 &lt;/SPAN&gt;&lt;SPAN style="COLOR: black; FONT-FAMILY: 'Times New Roman','serif'"&gt;μ&lt;/SPAN&gt;&lt;SPAN style="COLOR: black"&gt;secs of processing time for each interrupt processed. This being a 2.2 GHz machine, 100 &lt;/SPAN&gt;&lt;SPAN style="COLOR: black; FONT-FAMILY: 'Times New Roman','serif'"&gt;μ&lt;/SPAN&gt;&lt;SPAN style="COLOR: black"&gt;secs of processing time translates into 220,000 cycles of execution time per TCP/IP interrupt. Rounding that down to 200,000 cycles and substituting this more realistic estimate of the CPU path length into equation 1 yields&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt; TEXT-ALIGN: center" align=center&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;SPAN lang=FR style="COLOR: black; mso-ansi-language: FR"&gt;500,000 interrupts/sec * 200,000 cycles = 100,000,000,000&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;cycles/second&lt;/SPAN&gt;&lt;/I&gt;&lt;SPAN lang=FR style="COLOR: black; mso-ansi-language: FR"&gt; &lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;(2)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;a requirement for 100 GHz of host processing power to perform the TCP/IP processing for a 10 Gb Ethernet card running at its full rated capacity.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
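&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;The same estimate can be reproduced from the unrounded measurements. The Python sketch below is illustrative rather than definitive; the function name is hypothetical, and it assumes that CPU 0 spent essentially all of its 2.2 GHz servicing network traffic during the test. Because it skips the rounding to 100 microseconds and 200,000 cycles, it lands somewhat above the 100 GHz figure.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;

```python
CLOCK_HZ = 2.2e9                       # 2.2 GHz test machine
MEASURED_INTERRUPTS_PER_SEC = 9494.04  # "Interrupts/sec" row of the NTttcp output
TARGET_INTERRUPTS_PER_SEC = 500_000    # 10 Gb Ethernet estimate used in equation 1


def cycles_per_interrupt(clock_hz, interrupts_per_sec):
    """Cycles available per interrupt, assuming the CPU is 100% busy with networking."""
    return clock_hz / interrupts_per_sec


cycles = cycles_per_interrupt(CLOCK_HZ, MEASURED_INTERRUPTS_PER_SEC)
microseconds = 1e6 / MEASURED_INTERRUPTS_PER_SEC
required_hz = TARGET_INTERRUPTS_PER_SEC * cycles

print(f"{microseconds:.1f} microseconds per interrupt")
print(f"{cycles:,.0f} cycles per interrupt")
print(f"{required_hz / 1e9:.0f} GHz needed to sustain the target interrupt rate")
```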
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Next, I re-executed the test while running the xperf ETW utility that is packaged with the &lt;/SPAN&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/cc305187.aspx"&gt;Windows Performance Toolkit&lt;/A&gt;&lt;SPAN style="COLOR: black"&gt; to capture CPU consumption by the TCP/IP stack: &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt; TEXT-ALIGN: center" align=center&gt;&lt;SPAN style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Trebuchet MS','sans-serif'"&gt;xperf -on LATENCY -f tcpreceive1.etl -ClockType Cycle&lt;/SPAN&gt;&lt;SPAN style="COLOR: black"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;According to the xperf documentation, the LATENCY flags request trace data that includes all CPU context switches, interrupts (Interrupt Service Routines or ISRs) and Deferred Procedure Calls (DPCs). As explained in [1], Windows uses a two-step process to service device interrupts. Initially, the OS dispatches an ISR to service the specific device interrupt. During the ISR, further interrupts by the device are disabled. Ideally, the ISR performs the minimum amount of processing needed to re-enable the device for interrupts and then schedules a DPC to finish the job. DPCs are dispatched at a lower priority than ISRs, but above all other functions in the machine. DPCs execute with the device re-enabled for interrupts, so it is possible for the execution of a DPC to be delayed because it is preempted by the need to service a higher priority interrupt from the NIC (or another device). &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Gathering the xperf data while the NTttcp test was running lowered the network throughput only slightly – by less than 2%, additional measurement noise that can safely be ignored in this context. The kernel trace events requested are basically being gathered continuously by all the diagnostic infrastructure in Windows anyway. The xperf session merely gathers them from memory and writes them to disk. The disk was otherwise not being used for anything else during this test and there was an idle CPU available to handle the tracing chores. The overall performance impact of gathering the trace data was minimally disruptive in this situation. &lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;I then loaded the trace data from the .etl file and used the xperfview GUI application to analyze it. See Figure 2.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;DIV style="mso-element: footnote-list"&gt;&lt;IMG title="NDIS DPC protocol overhead" style="WIDTH: 340px; HEIGHT: 362px" height=362 alt="NDIS DPC protocol overhead" src="http://5l3vgw.bay.livefilestore.com/y1pzbrt-s_HGnlx4xwaK3LHB2rnJ0tBhZ-bryUqHTLegqnmj9md-cc-o5k14zMSVzYQPFg5PVbNidQ8hZ7rsO-78w/xperf%20viewscreen%20shot%20of%20Ndis%20DPC%20processing.jpg" width=340 mce_src="http://5l3vgw.bay.livefilestore.com/y1pzbrt-s_HGnlx4xwaK3LHB2rnJ0tBhZ-bryUqHTLegqnmj9md-cc-o5k14zMSVzYQPFg5PVbNidQ8hZ7rsO-78w/xperf%20viewscreen%20shot%20of%20Ndis%20DPC%20processing.jpg"&gt;&lt;BR clear=all&gt;&lt;/DIV&gt;
&lt;DIV style="mso-element: footnote-list"&gt;
&lt;HR align=left width="33%" SIZE=1&gt;
&lt;/DIV&gt;
&lt;DIV style="mso-element: footnote-list"&gt;&amp;nbsp; 
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: black; LINE-HEIGHT: 115%"&gt;Figure 2.&lt;/SPAN&gt;&lt;/B&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: black; LINE-HEIGHT: 115%"&gt; xperfview display showing % CPU utilization, % DPC time, and % Interrupt time, calculated from the kernel trace event data recorded during the NTttcp test execution.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Figure 2 shows three views of the activity on CPU 0 where all the networking processing was performed. The top view shows overall processor utilization at close to 100% during the TCP test, with an overlay of a second line graph indicating the portion specifically associated with DPC processing, accounting for somewhere in excess of 60% busy. The DPC data is broken out and displayed separately in the middle graph, and the Interrupt CPU time is shown at the bottom (a little less than 4%).&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;xperfview allows you to display a Summary Table that breaks out Interrupt and DPC processor utilization at the driver level, sorted by the amount of processor time spent per module. For the DPCs, we see the following.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;TABLE class=MsoNormalTable style="MARGIN: auto auto auto 4.8pt; WIDTH: 374.1pt; BORDER-COLLAPSE: collapse; mso-yfti-tbllook: 1184; mso-padding-alt: 0in 5.4pt 0in 5.4pt" cellSpacing=0 cellPadding=0 width=499 border=0 class="MsoNormalTable"&gt;
&lt;TBODY&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 0; mso-yfti-firstrow: yes"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 54.95pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=73&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;Module&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 53.65pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=72&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Function&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 45pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=60&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Count&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 76.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=102&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Max Duration [ms]&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Avg Duration [ms]&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=84&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Duration [ms]&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 1"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 54.95pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=73&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;ndis.sys&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 53.65pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=72&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 45pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=60&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;425125&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 76.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=102&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;1.116595&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.075376&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=84&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;32044.44967&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 2"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 54.95pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=73&gt;&lt;FONT face=Calibri&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 53.65pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=72&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;0x8ac79237&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 45pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=60&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;423800&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 76.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=102&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.797987&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.075552&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=84&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;32019.18583&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 3"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 54.95pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=73&gt;&lt;FONT face=Calibri&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 53.65pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=72&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;0x8ad38209&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 45pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=60&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;207&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 76.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=102&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;1.116595&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.11752&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=84&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;24.326752&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 4"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 54.95pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=73&gt;&lt;FONT face=Calibri&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 53.65pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=72&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;0x8ad3892f&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 45pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=60&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;1117&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 76.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=102&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.012439&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.000837&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=84&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.935867&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 5"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 54.95pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=73&gt;&lt;FONT face=Calibri&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 53.65pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=72&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;0x8ad399b3&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 45pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=60&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;1&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 76.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=102&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.001213&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.001213&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=84&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.001213&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 6"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 54.95pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=73&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;USBPORT.SYS&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 53.65pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=72&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 45pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=60&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;8312&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 76.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=102&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.064506&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.011802&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=84&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;98.100399&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 7"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 54.95pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=73&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;tcpip.sys&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 53.65pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=72&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 45pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=60&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;4154&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 76.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=102&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.551394&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.009585&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=84&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;39.817004&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 8"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 54.95pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=73&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;dxgkrnl.sys&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 53.65pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=72&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;0x8f34e09b&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 45pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=60&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;3039&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 76.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=102&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.528346&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.012848&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=84&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;39.047187&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 9; mso-yfti-lastrow: yes"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 54.95pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=73&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;iastor.sys&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 53.65pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=72&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 45pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=60&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;1221&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 76.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=102&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.033546&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.015061&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=84&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 8pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-bidi-font-size: 9.0pt; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;18.390545&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: black; LINE-HEIGHT: 115%"&gt;Table 1.&lt;/SPAN&gt;&lt;/B&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: black; LINE-HEIGHT: 115%"&gt; xperfview Summary Table display showing processor utilization by DPC.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Confirming my back-of-the-envelope calculation presented earlier, the xperf trace data indicates that the average duration of an ndis.sys DPC used to process a network interrupt was 75 &lt;/SPAN&gt;&lt;SPAN style="COLOR: black; FONT-FAMILY: 'Times New Roman','serif'"&gt;μ&lt;/SPAN&gt;&lt;SPAN style="COLOR: black"&gt;secs. DPC processing consumed approximately 32 seconds of the full trace, which lasted about 52 seconds, so it kept CPU 0 slightly more than 61% busy.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
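&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;The utilization figure above is simple arithmetic on the trace totals. The following minimal Python sketch reproduces it from the approximate values quoted in the text and also checks one row of Table 1, on the assumption that the total Duration column is the Count multiplied by the Avg Duration. The function names are illustrative and are not part of xperf.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;

```python
# Figures quoted in the text (approximate) and in the last row of Table 1.
dpc_total_s = 32.0   # total time spent in DPC processing, seconds
trace_len_s = 52.0   # length of the full trace, seconds


def busy_percent(busy_s, elapsed_s):
    """Percent of elapsed time a processor spent in the given activity."""
    return 100.0 * busy_s / elapsed_s


def total_duration_ms(count, avg_ms):
    """Assumed Summary Table relationship: total = count * average duration."""
    return count * avg_ms


dpc_busy = busy_percent(dpc_total_s, trace_len_s)

# iastor.sys row of Table 1: Count 1221, Avg Duration 0.015061 ms.
iastor_total = total_duration_ms(1221, 0.015061)

print(f"DPC busy on CPU 0: {dpc_busy:.1f}%")
print(f"iastor.sys DPC total: {iastor_total:.2f} ms (table shows 18.390545 ms)")
```

&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;The small difference between the recomputed total and the table value is consistent with the averages in the table being rounded.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;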
&lt;TABLE class=MsoNormalTable style="MARGIN: auto auto auto 4.8pt; WIDTH: 392.1pt; BORDER-COLLAPSE: collapse; mso-yfti-tbllook: 1184; mso-padding-alt: 0in 5.4pt 0in 5.4pt" cellSpacing=0 cellPadding=0 width=523 border=0 class="MsoNormalTable"&gt;
&lt;TBODY&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 0; mso-yfti-firstrow: yes"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63.6pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=85&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;Module&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 58.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=78&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Function&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 38.2pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=51&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Count&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 83.3pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=111&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Max Duration [ms]&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Avg Duration [ms]&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 67.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=90&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Duration [ms]&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 1"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63.6pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=85&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;ntkrnlpa.exe&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 58.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=78&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;0x828d6fa2&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 38.2pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=51&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;423803&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 83.3pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=111&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.023875&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.003173&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 67.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=90&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;1345.147547&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 2"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63.6pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=85&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;dxgkrnl.sys&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 58.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=78&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;0x8f3630ea&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 38.2pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=51&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;3040&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 83.3pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=111&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.096759&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.039523&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 67.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=90&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;120.151742&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 3"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63.6pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=85&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;USBPORT.SYS&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 58.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=78&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;0x8f4098c2&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 38.2pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=51&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;5529&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 83.3pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=111&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.025199&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.007241&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 67.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=90&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;40.038229&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 4"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63.6pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=85&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;pcmcia.sys&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 58.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=78&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;0x82f8deea&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 38.2pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=51&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;4968&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 83.3pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=111&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.02401&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.006754&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 67.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=90&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;33.558272&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 5; mso-yfti-lastrow: yes"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 63.6pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=85&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;iastor.sys&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 58.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=78&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;0x8aaa7f6c&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 38.2pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=51&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;4968&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 83.3pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=111&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.016345&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 81pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=108&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.005103&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 67.5pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=90&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;25.353482&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: black; LINE-HEIGHT: 115%"&gt;Table 2.&lt;/SPAN&gt;&lt;/B&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: black; LINE-HEIGHT: 115%"&gt; xperfview Summary Table display showing processor utilization by ISR.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;The Summary Table display reproduced in Table 2 serves to confirm a direct relationship between the ndis DPC processing and the kernel mode interrupts processed by the ntkrnlpa ISR. The average duration of an ntkrnlpa ISR execution was just 3 &lt;/SPAN&gt;&lt;SPAN style="COLOR: black; FONT-FAMILY: 'Times New Roman','serif'"&gt;μ&lt;/SPAN&gt;&lt;SPAN style="COLOR: black"&gt;secs. Together, the ISR+DPC time was just under 80 &lt;/SPAN&gt;&lt;SPAN style="COLOR: black; FONT-FAMILY: 'Times New Roman','serif'"&gt;μ&lt;/SPAN&gt;&lt;SPAN style="COLOR: black"&gt;secs. This leads to a slight downward revision of equation 2:&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt; TEXT-ALIGN: center" align=center&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;SPAN lang=FR style="COLOR: black; mso-ansi-language: FR"&gt;500,000 interrupts/sec * 175,000&amp;nbsp;clocks = 88,000,000,000&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;instructions/second&lt;/SPAN&gt;&lt;/I&gt;&lt;SPAN lang=FR style="COLOR: black; mso-ansi-language: FR"&gt; &lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;(3)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;which remains a formidable constraint, considering the speed of current processors.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
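&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;As a rough check on the arithmetic in equation 3, the short Python sketch below multiplies the interrupt rate by the per-interrupt cost. The function and constant names are illustrative only. The product is labeled in clocks per second here because the per-interrupt cost is given in clocks. The clock rate it prints is merely what 175,000 clocks in 80 microseconds would imply; it is derived for illustration and is not a measured value, and 80 is used as an upper bound because the ISR+DPC time was just under that figure.&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
# Figures taken from the text: interrupt rate and per-interrupt cost in clocks.
INTERRUPTS_PER_SEC = 500_000
CLOCKS_PER_INTERRUPT = 175_000
# ISR+DPC time per interrupt in microseconds. The text says "just under 80",
# so 80 is an upper-bound assumption, not a measured value.
ISR_DPC_MICROSECONDS = 80


def required_clocks_per_second(interrupts_per_sec, clocks_per_interrupt):
    """Processor clocks per second needed to service every interrupt."""
    return interrupts_per_sec * clocks_per_interrupt


def implied_clock_rate_ghz(clocks_per_interrupt, microseconds_per_interrupt):
    """Clock rate in GHz at which that many clocks fit in the given time."""
    # clocks per microsecond is MHz; dividing by a further 1,000 gives GHz
    return clocks_per_interrupt / (microseconds_per_interrupt * 1_000)


total = required_clocks_per_second(INTERRUPTS_PER_SEC, CLOCKS_PER_INTERRUPT)
print(f"{total:,} clocks/second")  # 87,500,000,000, i.e. roughly 88 billion

ghz = implied_clock_rate_ghz(CLOCKS_PER_INTERRUPT, ISR_DPC_MICROSECONDS)
print(f"{ghz:.4f} GHz implied by 175,000 clocks in 80 microseconds")
```
&lt;/PRE&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;The exact product is 87.5 billion, which equation 3 rounds to 88 billion.&lt;/SPAN&gt;&lt;/P&gt;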
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Finally, I drilled into the processor utilization by process, which showed that the NTttcp process, whose main processing thread was also affinitized to CPU 0 at the receiving end of the interrupt, was responsible for an additional 11% CPU busy. Allowing for OS scheduling and other overhead factors, these three workloads account for the 100% utilization of CPU 0.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;TABLE class=MsoNormalTable style="MARGIN: auto auto auto 4.8pt; WIDTH: 253pt; BORDER-COLLAPSE: collapse; mso-yfti-tbllook: 1184; mso-padding-alt: 0in 5.4pt 0in 5.4pt" cellSpacing=0 cellPadding=0 width=337 border=0 class="MsoNormalTable"&gt;
&lt;TBODY&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 0; mso-yfti-firstrow: yes"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;Process&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Cpu Usage (ms)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;% Cpu Usage&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/B&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 1"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;Idle (0)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;32730.11165&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;50.4&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 2"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;NTttcpr.exe (3280)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;7120.518872&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;10.97&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 3"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;services.exe (700)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;582.050622&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.9&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 4"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;InoRT.exe (2076)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;368.519663&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.57&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 5"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;dwm.exe (4428)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;300.59808&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.46&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 6"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;System (4)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;256.802505&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.4&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 7"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;svchost.exe (1100)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;166.479679&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.26&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 8"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;taskmgr.exe (7092)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;162.960941&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.25&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 9"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;msiexec.exe (6412)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;145.122854&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.22&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 10"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;sidebar.exe (5884)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;111.86257&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.17&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 11"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;WmiPrvSE.exe (7952)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;93.797334&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.14&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 12"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;csrss.exe (668)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;79.331035&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.12&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR style="HEIGHT: 15pt; mso-yfti-irow: 13; mso-yfti-lastrow: yes"&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 107pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=143&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal"&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;svchost.exe (1852)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 80pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=107&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;70.65799&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;
&lt;TD class="" style="BORDER-RIGHT: #f0f0f0; PADDING-RIGHT: 5.4pt; BORDER-TOP: #f0f0f0; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0in; BORDER-LEFT: #f0f0f0; WIDTH: 66pt; PADDING-TOP: 0in; BORDER-BOTTOM: #f0f0f0; HEIGHT: 15pt; BACKGROUND-COLOR: transparent" vAlign=bottom noWrap width=88&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; LINE-HEIGHT: normal; TEXT-ALIGN: right" align=right&gt;&lt;SPAN style="FONT-SIZE: 9pt; COLOR: black; mso-fareast-font-family: 'Times New Roman'; mso-bidi-font-family: 'Times New Roman'; mso-ascii-font-family: Calibri; mso-hansi-font-family: Calibri"&gt;&lt;FONT face=Calibri&gt;0.11&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: black; LINE-HEIGHT: 115%"&gt;Table 3.&lt;/SPAN&gt;&lt;/B&gt;&lt;SPAN style="FONT-SIZE: 10pt; COLOR: black; LINE-HEIGHT: 115%"&gt; xperfview Summary Table display showing processor utilization by process.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Note that the NTttcp program responsible for processing the packets it receives probably represents a network application that performs the minimum amount of application-specific processing per packet that you can expect. It ignores the data payload contents completely, and its other processing of the packet is pretty much confined to maintaining its throughput statistics. We should also note that it is a user mode application, which means that processing the receive packet does require a transition from kernel to user mode. It is possible to implement a kernel mode networking application in Windows – the http.sys kernel mode driver that IIS uses is one – that avoids these expensive processor execution state transitions, but such applications are the exception, not the rule. (And, when it comes to building HTTP Response messages dynamically using ASP.NET, even http.sys hands off the HTTP Request packet to an ASP.NET user mode thread for processing.)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;The point of this set of measurements and calculations is not to characterize the network traffic in and out of a typical web server, but to understand the motivation for recent architectural changes in the networking stack – both hardware and software – to allow network interrupts to be processed concurrently on multiple processors. Those architectural changes are the subject of Part II of this blog. &lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;&lt;/SPAN&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;SPAN style="COLOR: black"&gt;Part II of this article is posted &lt;A class="" title=Part-II href="http://blogs.msdn.com/ddperf/archive/2008/07/27/mainstream-numa-and-the-tcp-ip-stack-part-i-programming-ccnuma-machines.aspx" mce_href="http://blogs.msdn.com/ddperf/archive/2008/07/27/mainstream-numa-and-the-tcp-ip-stack-part-i-programming-ccnuma-machines.aspx"&gt;here&lt;/A&gt;.&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/DIV&gt;
&lt;DIV style="mso-element: footnote-list"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id=ftn1 style="mso-element: footnote"&gt;
&lt;P class=MsoFootnoteText style="MARGIN: 0in 0in 0pt"&gt;&lt;A class="" title=_ftn1 style="mso-footnote-id: ftn1" href="http://blogs.msdn.com/tiny_mce/jscripts/tiny_mce/blank.htm#_ftnref1" name=_ftn1&gt;&lt;SPAN class=MsoFootnoteReference&gt;&lt;SPAN style="mso-special-character: footnote"&gt;&lt;SPAN class=MsoFootnoteReference&gt;&lt;SPAN style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Calibri','sans-serif'; mso-ascii-theme-font: minor-latin; mso-fareast-font-family: PMingLiU; mso-fareast-theme-font: minor-fareast; mso-hansi-theme-font: minor-latin; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: ZH-TW; mso-bidi-language: AR-SA"&gt;&lt;FONT color=#0000ff&gt;[1]&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;FONT size=2&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Some have argued otherwise. See D. D. Clark, V. Jacobson, J. Romkey and H. Salwen, “An analysis of TCP processing overhead.” &lt;I style="mso-bidi-font-style: normal"&gt;IEEE Communications&lt;/I&gt; Magazine, 27(6):23-29, June 1989. This assessment was made prior to the overhaul of the TCP protocol proposed by Van Jacobson that was implemented to address serious scalability issues that Internet technology faced in the early years of its adoption. Taking account of both security and performance considerations, the TCP/IP protocol software stack as implemented today is considerably more complex. &lt;I style="mso-bidi-font-style: normal"&gt;Microsoft Windows Server 2003 TCP/IP Protocols and Services Technical Reference&lt;/I&gt; by Davies and Lee is a useful guide to the full set of TCP/IP services that are provided today, except it does not include the additional functions in the Microsoft Windows Server 2003 Scalable Networking Pack release discussed here. 
For a recent description of TCP/IP host processor overhead, see &lt;SPAN style="COLOR: black"&gt;Hyun-Wook Jin, Chuck Yoo, “Impact of protocol overheads on network throughput over high-speed interconnects: measurement, analysis, and improvement.” &lt;/SPAN&gt;&lt;/FONT&gt;&lt;A href="http://sunsite.informatik.rwth-aachen.de/dblp/db/journals/tjs/tjs41.html#JinY07"&gt;&lt;FONT color=#0000ff size=2&gt;The Journal of Supercomputing 41&lt;/FONT&gt;&lt;/A&gt;&lt;SPAN style="COLOR: black"&gt;&lt;FONT size=2&gt;(1): 17-40 (2007&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/DIV&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8588240" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance+Engineering/default.aspx">Performance Engineering</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/.NET/default.aspx">.NET</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Scalability/default.aspx">Scalability</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx">Parallel programming</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance/default.aspx">Performance</category></item><item><title>Thoughts on Intel's recent hardware announcements</title><link>http://blogs.msdn.com/ddperf/archive/2008/04/01/thoughts-on-intel-s-recent-hardware-announcements.aspx</link><pubDate>Tue, 01 Apr 2008 03:56:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8347001</guid><dc:creator>MarkBFriedman</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/ddperf/comments/8347001.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ddperf/commentrss.aspx?PostID=8347001</wfw:commentRss><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;A 
href="http://www.intel.com/pressroom/archive/releases/20080317fact.htm?iid=pr1_releasepri_20080317fact" mce_href="http://www.intel.com/pressroom/archive/releases/20080317fact.htm?iid=pr1_releasepri_20080317fact"&gt;&lt;FONT face=Calibri size=3&gt;Intel briefed customers recently&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; about the evolution of its processor architectures to support ManyCore processors. Highlights of the press briefing include announcing the quad-core Tukwila processor that supports the IA-64 Itanium architecture and a six-core x64-based processor called Dunnington that will be available later this year. The major focus of the announcement though was the new Nehalem architecture processors which are scheduled for production by the end of this year.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Intel is executing on an aggressive, two-year product cycle, driven primarily by semiconductor fabrication improvements in the first year of the cycle that double the amount of circuitry they can fit on a chip. These are followed up by architectural improvements designed to leverage the new chip density in the year following. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Last year, this involved bringing online a new 45-nanometer (nm) fabrication plant that utilizes a new technology, namely, &lt;/FONT&gt;&lt;A href="http://www.spectrum.ieee.org/oct07/5553" mce_href="http://www.spectrum.ieee.org/oct07/5553"&gt;&lt;FONT face=Calibri size=3&gt;hafnium-based high-k gate dielectrics and metal gates&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. This technology is designed to address a serious problem with the previous materials used to insulate the tiny circuitry etched into the substrate. Historically, the material used for this insulation has been silicon dioxide. As the dimensions of the circuit have shrunk with each succeeding generation of semiconductor fabrication technology, so has the insulation layer. At the 90 nm point, the silicon dioxide insulator was the width of just 5 atoms. Material scientists felt it could not be shrunk any further without ceasing to function as an effective insulator. At that point, it was also subject to leaking power significantly (and generating excess heat). It started to look like we had reached a physical limit on how much circuitry could be crammed onto a single chip.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Intel is championing the new materials and manufacturing processes as a breakthrough that will enable it to build increasingly dense chips. (If high-k gate dielectrics really are a breakthrough, the rest of the semiconductor industry can be expected to follow.) The Tukwila and Dunnington processors announced last week each contain close to 2 billion logic circuits. Intel expects that another two-year product cycle will hit like clockwork next year when it moves to a next generation 32-nm fab. This will once again double the number of circuits that can be packed on a chip to 4 billion. Enter the new processor architecture, code-named Nehalem, designed to exploit the new fabrication density.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The Nehalem architecture is a modular design that can be produced today on the 45 nm process, but can migrate next year to the 32 nm process. This year it will be built with 4 processor cores, next year 8. Consistent with the imperative to conserve power, it does not look like the Nehalem will increase the CPU clock speed. At least there was nothing about a faster clock in the announcement materials.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The emphasis on conserving power is leading to an increased interest in utilizing &lt;/FONT&gt;&lt;A href="http://www.cmg.org/measureit/issues/mit01/m_1_1.html" mce_href="http://www.cmg.org/measureit/issues/mit01/m_1_1.html"&gt;&lt;FONT face=Calibri size=3&gt;Simultaneous Multi-threading (SMT) technology&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; to boost processor performance. This is something Intel branded initially as Hyper-Threading (HT). It is nice to see Intel moving away from the original, marketing-oriented branding, which was confusing, to the generic and generally accepted terminology. The announcement hints at expanding the use of SMT in future Nehalem chips, which is something the &lt;/FONT&gt;&lt;A href="http://www.cs.washington.edu/research/smt/papers/tlp2ilp.final.pdf" mce_href="http://www.cs.washington.edu/research/smt/papers/tlp2ilp.final.pdf"&gt;&lt;FONT face=Calibri size=3&gt;original SMT research&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; found to be very promising. We could get 8 processor cores on the 32 nm fab process next year, each supporting up to 4 logical processors on a desktop machine. I don’t know if that will happen because it is not clear that desktop machines need anything close to 32 processors. That’s the challenge that software development has to step up to because right now, as I blogged about last time, the current generation of desktop software cannot effectively utilize all those processors.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;One of the performance issues that arose with HT in earlier Intel processors, especially with server workloads, was that the added logical processors had a tendency to saturate the Front-side bus (FSB) that connected the processors to the memory controller. (Here’s an anecdotal example: &lt;/FONT&gt;&lt;A href="http://www.cmg.org/measureit/issues/mit15/m_15_2.html" mce_href="http://www.cmg.org/measureit/issues/mit15/m_15_2.html"&gt;&lt;FONT face=Calibri size=3&gt;http://www.cmg.org/measureit/issues/mit15/m_15_2.html&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;.)&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;The bus accesses are needed so that the snooping protocol can maintain cache coherence in a multiprocessor. The new QuickPath Interconnect provides a scalable alternative to the FSB that has both better latency and higher bandwidth. This should help bring SMT into the mainstream for server workloads. Architecturally, QuickPath looks very similar to AMD’s HyperTransport (another “HT,” which may explain why Intel has reverted to using the generic SMT terminology).&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;In the QuickPath architecture, each microprocessor contains a built-in memory controller designed to access its own dedicated local memory. But on machines configured with more than one microprocessor, QuickPath leads to a NUMA architecture. Within a microprocessor, all the processor cores and their logical processors can access local memory at a uniform speed using the integrated memory controller. But access to remote memory on a different microprocessor is slower. A program thread running on one microprocessor can access remote memory attached to another microprocessor using the QuickPath Interconnect. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;These are the basic performance characteristics associated with NUMA (non-uniform memory access) machines. NUMA used to be associated mainly with esoteric high-end supercomputers, largely due to the difficulty developers had in programming for them. Even so, NUMA is poised to become a mainstream architecture, which is another serious challenge for our software frameworks.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;To run well, a multi-threaded program running on a NUMA machine needs to be aware of the machine environment and understand which memory references are local to the node and which are remote. A thread that was running on one NUMA node that migrates to another node pays a heavy price every time it has to fetch results from remote memory locations. The Windows OS is already NUMA-aware to a degree. Once dispatched, threads have node affinity, for example. And Windows OS memory management is also NUMA-aware. The OS resists migrating threads to another node and otherwise also tries to ensure that most memory accesses remain local using per node memory management data structures. There are a number of NUMA-oriented APIs that applications can use to keep their threads from migrating off-node and direct memory allocations to a specific physical processing node. &lt;/FONT&gt;&lt;/P&gt;
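&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;To get a rough feel for the price a thread pays when its memory references go off-node, here is a minimal Python sketch that averages local and remote access times. The function name and the latency figures are assumptions chosen purely for illustration; they are not measurements of any Intel or AMD part, and the model ignores the processor caches entirely.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
def effective_latency_ns(local_ns, remote_ns, remote_fraction):
    """Average memory access latency when some accesses go to a remote node."""
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


# Hypothetical latencies, chosen only for illustration (not measurements).
LOCAL_NS = 60.0
REMOTE_NS = 100.0

for remote_fraction in (0.0, 0.25, 0.5, 1.0):
    avg = effective_latency_ns(LOCAL_NS, REMOTE_NS, remote_fraction)
    print(f"{remote_fraction:4.0%} remote accesses: {avg:5.1f} ns average")
```
&lt;/PRE&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Under these assumed numbers, a thread that migrates away from its data and makes every access remotely sees average latency climb from the local figure to the remote one, which is why keeping threads and their memory on the same node matters.&lt;/FONT&gt;&lt;/P&gt;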
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;In the meantime, it is an open question how NUMA-aware the application level run-times and critical memory management functions like the GC in .NET need to be. Let’s assume for a moment that 8 processor cores is more than enough for almost any desktop. This means dealing with &lt;/FONT&gt;&lt;A href="http://msdn2.microsoft.com/en-us/library/aa363804.aspx" mce_href="http://msdn2.microsoft.com/en-us/library/aa363804.aspx"&gt;&lt;FONT face=Calibri size=3&gt;the complexities of NUMA&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; is confined, in the short term at least, to server applications. While that is occasion for a big sigh of relief, it is also a pointed reminder that facilities like the new &lt;/FONT&gt;&lt;A href="http://channel9.msdn.com/Showpost.aspx?postid=384229" mce_href="http://channel9.msdn.com/Showpost.aspx?postid=384229"&gt;&lt;FONT face=Calibri size=3&gt;Task Parallel Library&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; in the .NET Framework will need to become NUMA-aware.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Finally, it is worth mentioning that there are a bunch of other goodies in the recent Intel announcement to further fuel the drive to many-core processors. These include making the serializing instructions&amp;nbsp;like XCHG run faster, which should be a boost to multi-threaded programs of all stripes. Intel is also adding a shared L3 cache to each chip; each processor core continues to have its own dedicated L1 and L2 caches.&lt;/FONT&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8347001" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance+Engineering/default.aspx">Performance Engineering</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/.NET/default.aspx">.NET</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Scalability/default.aspx">Scalability</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx">Parallel programming</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/CSharpDevCenter/default.aspx">CSharpDevCenter</category></item><item><title>Where Do We Go From Here, Part 1.</title><link>http://blogs.msdn.com/ddperf/archive/2008/03/22/where-do-we-go-from-here-part-1.aspx</link><pubDate>Sat, 22 Mar 2008 03:53:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8330278</guid><dc:creator>MarkBFriedman</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/ddperf/comments/8330278.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ddperf/commentrss.aspx?PostID=8330278</wfw:commentRss><description>&lt;H1 style="MARGIN: 24pt 0in 0pt; TEXT-ALIGN: center" align=center&gt;&lt;FONT face=Calibri size=3&gt;The Performance of Desktop Applications in the ManyCore Era&lt;/FONT&gt;&lt;/H1&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The &lt;A href="http://www.intel.com/products/processor/core2quad/index.htm?iid=homepage+qc" mce_href="http://www.intel.com/products/processor/core2quad/index.htm?iid=homepage+qc"&gt;&lt;FONT face=Calibri color=#0000ff size=3&gt;Quad-cores&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; are coming! The &lt;/FONT&gt;&lt;A href="http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_15331_15332%5E15333,00.html" mce_href="http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_15331_15332%5E15333,00.html"&gt;&lt;FONT face=Calibri color=#0000ff size=3&gt;Quad-cores&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; are coming! &lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Beginning in early 2008, machines with the latest quad-core processors became available from the major manufacturers. Should you be excited about the prospect? Should you run right out and buy one?&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;These machines will have 4 independent processor cores on a single processor chip, each able to run independent OS or application program threads concurrently. The &lt;I style="mso-bidi-font-style: normal"&gt;processor core&lt;/I&gt; terminology is designed to distinguish this architecture from processors that support &lt;/FONT&gt;&lt;A href="http://www.cmg.org/measureit/issues/mit01/m_1_1.html" mce_href="http://www.cmg.org/measureit/issues/mit01/m_1_1.html"&gt;&lt;FONT face=Calibri size=3&gt;simultaneous multithreading (SMT) technology&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; like Intel’s HyperThreading (or HT) that simulates multiprocessing on a single processor core. Quad-core means four physical processors reside on the chip; eventually, with HT enabled, you can simulate 8 logical processors on a single chip (supporting 8-way concurrency).&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;These latest multi-core machines represent an extension of an ongoing trend in processor design that has become known as the ManyCore architecture. Instead of machines with significantly faster CPUs, with ManyCore we expect machines with &lt;I style="mso-bidi-font-style: normal"&gt;more&lt;/I&gt; physical processors on the chip. And these new parallel processing chips are directed at standard server, desktop, and portable computers. This year’s crop of quad-core machines is then expected to be replaced by a subsequent generation of ManyCore machines with 8, 16, 32, 64, or more processor cores over the next 5-10 years. These developments in computer hardware present a serious challenge for the software development community to harness this computing power effectively in the next generation of computer software. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;In this series of blog entries, I am going to look at that challenge from several perspectives, including the advances in software development technology that are necessary to make many-core processors yield their full potential. Multiprocessor scalability and the software necessary to exploit massively parallel processors is such a wide and deep topic in computer engineering that it is difficult to tackle head-on. In this first installment, I will focus on the factors influencing the hardware evolution. I will also discuss why many-core processors are happening now. And, finally, I will talk about what to expect in the way of scalability &amp;amp; performance from this hardware running both current application software and the software of the future that will be designed to exploit the ManyCore platform fully. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Along with the move towards multiple CPUs on a single multiprocessor chip, the discussion will highlight several related hardware developments, including power management, improved timing features, 64-bit computing trends, non-uniform memory access speeds, new instruction sets, and even a word or two about the impact of virtualization technology.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;First, let’s look at the performance and scalability implications of deploying multiprocessors to run current Windows desktop and server applications. This will provide a broader perspective on some of the key performance-oriented challenges associated with many-core processors.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;To illustrate these challenges,&amp;nbsp;I took&lt;/FONT&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;a screen shot from a performance monitoring session I ran showing utilization on my desktop computer for one of my somewhat typical days as a Knowledge Worker. During the course of this particular day, I attended a couple of meetings when my machine was virtually idle. (Ah, meetings, I just love them!) But while I was working at my desk, I was mostly using my computer. I answered a bunch of e-mails, surfed the web a few times, and even managed to find some time to do some programming and software testing. Processor utilization across a dual-core two processor machine at one minute intervals throughout the day averaged less than 10%. Even when the machine was busy, it wasn't very.&amp;nbsp;Processor utilization peaked at less than 40% busy.&lt;/FONT&gt;&lt;A class="" title=_ftnref1 style="mso-footnote-id: ftn1" href="http://blogs.msdn.com/tiny_mce/jscripts/tiny_mce/blank.htm#_ftn1" name=_ftnref1 mce_href="http://blogs.msdn.com/tiny_mce/jscripts/tiny_mce/blank.htm#_ftn1"&gt;&lt;SPAN class=MsoFootnoteReference&gt;&lt;SPAN style="mso-special-character: footnote"&gt;&lt;SPAN class=MsoFootnoteReference&gt;&lt;SPAN style="FONT-SIZE: 11pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Calibri','sans-serif'; mso-fareast-font-family: PMingLiU; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-ansi-language: EN-US; mso-fareast-language: ZH-TW; mso-bidi-language: AR-SA"&gt;&lt;FONT color=#0000ff&gt;[1]&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; Given this feeble load on my machine, explain to me again why I need more processing power. 
If the two processors I have currently are usually idle, wouldn’t having four of them mean just that much more idle horsepower?&lt;/FONT&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;IMG title="Knowledge worker's dual core desktop processor utilization" style="WIDTH: 600px; HEIGHT: 506px" height=506 alt="Knowledge worker's dual core desktop processor utilization" hspace=12 src="http://5l3vgw.bay.livefilestore.com/y1p5LRhwhhWeed6vGvopsN7KXBF8fa1nIxZ1gQbzPKlbfibHmTZF3KAvxPZysnMy8OMZN9ASD3SZXkTlf4ghJBPUZ7yX0uiJsKR/ParallelProgrammingFigure1-small.jpg" width=600 vspace=12 border=4 mce_src="http://5l3vgw.bay.livefilestore.com/y1p5LRhwhhWeed6vGvopsN7KXBF8fa1nIxZ1gQbzPKlbfibHmTZF3KAvxPZysnMy8OMZN9ASD3SZXkTlf4ghJBPUZ7yX0uiJsKR/ParallelProgrammingFigure1-small.jpg"&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;This is just a bit of the strong evidence that suggests that the need for many-core processors on the average person’s desktop machine is currently quite limited. Therein lies the first and foremost challenge to the industry’s goal of replacing your current desktop hardware with a new machine with 4, 8 or more processors. The overall performance gains that this hardware promises absolutely require application software that can fully exploit parallel processing hardware. That software is generally not available today for typical desktop applications which were written with conventional single CPU machines in mind. Not enough of today’s software was designed with parallel processing in mind. But that is beginning to change as the software industry rushes to catch up with the hardware. (Nothing new about this, by the way – software technology usually lags the hardware by 3-5 years or more.)&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;But the situation is far from hopeless. Developers of desktop applications have not had good reasons to program for multiprocessors until recently. Consider, for example, the Xbox 360 platform, which is a three-way 64-bit multiprocessor with a parallel vector graphics co-processor (GPU) that is capable of performing something like 20 billion vector operations per second. The rich, immersive user experience associated with a well-designed Xbox 360 application is a harbinger of the future of the Windows desktop as we start to build desktop applications that fully exploit multi-core architecture machines.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;But getting there will be a serious challenge. I say that for several main reasons:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpFirst style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT size=3&gt;·&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;Although the software industry has made significant progress in parallel programming, there is no general purpose method available that can reliably take a serial process and turn it into a multi-threaded parallel processing application. While some parallel programming patterns like “divide-and-conquer” can be used in many situations, most applications still require parallelization on a case by case basis.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpMiddle style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT size=3&gt;·&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;Multiple threads executing in parallel have non-deterministic execution patterns that introduce subtle errors that can be very difficult to debug with existing developer tools.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpLast style="MARGIN: 0in 0in 10pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="FONT-FAMILY: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT size=3&gt;·&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;While adding a little parallelism to a serial application often leads to major speed-ups, there are usually diminishing returns from adding more and more parallel processing threads. Determining an optimal number of parallel processing threads can be quite difficult in practice.&lt;/FONT&gt;&lt;/P&gt;
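&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The diminishing returns in that last point can be illustrated with Amdahl’s law, the familiar formula that bounds speed-up by the portion of a program that must run serially. What follows is a minimal Python sketch, not a model of any particular application: the function name and the 10% serial fraction are assumptions picked only to show the shape of the curve.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
def amdahl_speedup(serial_fraction, processors):
    """Speedup predicted by Amdahl's law for a fixed-size workload."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


# Hypothetical workload: 10% of the run time cannot be parallelized.
SERIAL_FRACTION = 0.10

for n in (1, 2, 4, 8, 16, 32, 64):
    speedup = amdahl_speedup(SERIAL_FRACTION, n)
    print(f"{n:2d} processors: speedup {speedup:5.2f}x, efficiency {speedup / n:4.0%}")
```
&lt;/PRE&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;With a 10% serial fraction, the predicted speed-up can never reach 10x no matter how many processors are added, and each doubling of the processor count buys less than the one before it.&lt;/FONT&gt;&lt;/P&gt;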
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;More about the multi-core challenge and related topics in software performance in&amp;nbsp;a future&amp;nbsp;blog post.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;
&lt;HR align=left width="33%" SIZE=1&gt;
&lt;/FONT&gt;
&lt;DIV style="mso-element: footnote-list"&gt;
&lt;DIV id=ftn1 style="mso-element: footnote"&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;A class="" title=_ftn1 style="mso-footnote-id: ftn1" href="http://blogs.msdn.com/tiny_mce/jscripts/tiny_mce/blank.htm#_ftnref1" name=_ftn1 mce_href="http://blogs.msdn.com/tiny_mce/jscripts/tiny_mce/blank.htm#_ftnref1"&gt;&lt;SPAN class=MsoFootnoteReference&gt;&lt;SPAN style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%"&gt;&lt;SPAN style="mso-special-character: footnote"&gt;&lt;SPAN class=MsoFootnoteReference&gt;&lt;SPAN style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Calibri','sans-serif'; mso-fareast-font-family: PMingLiU; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ascii-theme-font: minor-latin; mso-hansi-theme-font: minor-latin; mso-ansi-language: EN-US; mso-fareast-language: ZH-TW; mso-bidi-language: AR-SA"&gt;&lt;FONT color=#0000ff&gt;[1]&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;SPAN style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%"&gt;&lt;FONT face=Calibri&gt; During lunch and the other periods when I wasn’t even at my desk, utilization of the machine never dropped below 2%, for which I can thank all those background threads in both Windows Vista and Office 2007 that wake up and execute from time to time, regardless of whether I am using the machine or not.&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8330278" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance+Engineering/default.aspx">Performance Engineering</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/.NET/default.aspx">.NET</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Scalability/default.aspx">Scalability</category><category 
domain="http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx">Parallel programming</category></item><item><title>Who Am I and What Am I Doing Writing a Blog?</title><link>http://blogs.msdn.com/ddperf/archive/2008/03/21/who-am-i-and-what-am-i-doing-writing-a-blog.aspx</link><pubDate>Fri, 21 Mar 2008 04:58:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8328275</guid><dc:creator>MarkBFriedman</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/ddperf/comments/8328275.aspx</comments><wfw:commentRss>http://blogs.msdn.com/ddperf/commentrss.aspx?PostID=8328275</wfw:commentRss><description>&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;My name is Mark Friedman and I have been working here at Microsoft as an Architect in the Developer Division Performance Engineering team since October 2006. Although I am a newbie here, I am an industry veteran with an extensive background in software product development, particularly in developing monitoring tools for finding and resolving performance problems. These experiences include designing and developing several major products that brought me a modest measure of fame and fortune. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Although I consider myself primarily a software developer, over the years, I have also written extensively on performance-oriented topics, including two &lt;/FONT&gt;&lt;A href="http://www.amazon.com/Windows-2000-Performance-Guide-Friedman/dp/1565924665/ref=sr_1_1?ie=UTF8&amp;amp;s=books&amp;amp;qid=1203715882&amp;amp;sr=1-1" mce_href="http://www.amazon.com/Windows-2000-Performance-Guide-Friedman/dp/1565924665/ref=sr_1_1?ie=UTF8&amp;amp;s=books&amp;amp;qid=1203715882&amp;amp;sr=1-1"&gt;&lt;FONT face=Calibri color=#0000ff size=3&gt;books on the subject of Windows performance&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;, most recently, the &lt;I style="mso-bidi-font-style: normal"&gt;Performance Guide&lt;/I&gt; that is packaged with the &lt;/FONT&gt;&lt;A href="http://www.microsoft.com/mspress/books/8856.aspx" mce_href="http://www.microsoft.com/mspress/books/8856.aspx"&gt;&lt;FONT face=Calibri size=3&gt;Windows Server 2003 Resource Kit&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. You can find bunches of articles I have written over the years for the &lt;/FONT&gt;&lt;A href="http://www.cmg.org/index.html" mce_href="http://www.cmg.org/index.html"&gt;&lt;FONT face=Calibri color=#0000ff size=3&gt;Computer Measurement Group&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt; annual conference and its MeasureIT online publication or at &lt;/FONT&gt;&lt;A href="http://www.demandtech.com/knowledge_perspectives.html" mce_href="http://www.demandtech.com/knowledge_perspectives.html"&gt;&lt;FONT face=Calibri size=3&gt;the web site of my previous company&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;If you bother to track down some of the links above, you will notice that I have not been writing much for public consumption lately. This prolonged period of what might look like writer’s block comes from starting this job and orienting myself around what is, to me, a new technical challenge: developing a deep understanding of application performance on the Windows platform. The Developer Division produces a number of major applications, including Visual Studio, the Visual Studio Team System, the Team Foundation Server, the Expression suite of designer tools for web development, the C++, C#, and Visual Basic compilers, the Common Language Runtime (CLR), and the .NET Framework. This is a very diverse set of software products, to say the least. When you factor in the myriad ways customers can wield these developer tools and technologies to build their applications, the word “diverse” doesn’t seem adequate to the task. Something along the lines of a Universal Turing Machine seems more apt.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;Software Performance Engineering&lt;/B&gt; (SPE), which is my team’s charter, is the discipline associated with building responsive and highly scalable applications. The problem we face is how to apply the principles and best practices associated with SPE to the engineering practices Microsoft uses to build the complex software we sell to customers. Furthermore, we want to incorporate successful practices and procedures into the developer tools we sell to customers. This would help them develop Windows-based applications that meet stringent performance and scalability requirements.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;One particularly sobering aspect of this challenge is that there is no fully realized blueprint anywhere in the software industry for successfully integrating the performance engineering discipline into the application development life cycle. This has been a problem nagging advocates of the SPE approach from its very inception. Today, for example, there are a number of excellent books available on the subject suitable for general practitioners, including Connie Smith and Lloyd Williams’ &lt;/FONT&gt;&lt;A href="http://www.amazon.com/Performance-Solutions-Responsive-Addison-Wesley-Technology/dp/0201722291/ref=pd_sim_b_title_8" mce_href="http://www.amazon.com/Performance-Solutions-Responsive-Addison-Wesley-Technology/dp/0201722291/ref=pd_sim_b_title_8"&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;FONT face=Calibri size=3&gt;Performance Solutions: A Practical Guide to Creating Responsive, Scalable Software&lt;/FONT&gt;&lt;/I&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. Smith and Williams’ book is probably the most thorough practical guide available for the practitioner. I also happen to be particularly partial to Daniel Menasce &lt;I style="mso-bidi-font-style: normal"&gt;et al.&lt;/I&gt;, &lt;/FONT&gt;&lt;A href="http://www.amazon.com/Performance-Design-Computer-Capacity-Planning/dp/0130906735/ref=sr_1_5?ie=UTF8&amp;amp;s=books&amp;amp;qid=1205106664&amp;amp;sr=1-5" mce_href="http://www.amazon.com/Performance-Design-Computer-Capacity-Planning/dp/0130906735/ref=sr_1_5?ie=UTF8&amp;amp;s=books&amp;amp;qid=1205106664&amp;amp;sr=1-5"&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;FONT face=Calibri size=3&gt;Performance by Design: Computer Capacity Planning By Example&lt;/FONT&gt;&lt;/I&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;, but that approach is informed more by analytic modeling than by the software development lifecycle. Neither book, however, is specific to the Microsoft platform in general or the .NET Framework in particular. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The performance engineering guide that comes closest to addressing the specific needs of the .NET developer is &lt;/FONT&gt;&lt;A href="http://msdn2.microsoft.com/en-us/library/ms998530.aspx" mce_href="http://msdn2.microsoft.com/en-us/library/ms998530.aspx"&gt;&lt;I style="mso-bidi-font-style: normal"&gt;&lt;FONT face=Calibri size=3&gt;Improving .NET Application Performance and Scalability&lt;/FONT&gt;&lt;/I&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. This valuable book provides an enormous amount of practical guidance to the .NET developer. I don’t like to think about the Herculean effort it took to produce this extensive developer’s guide, mainly because it will probably be my responsibility to produce the next edition of it. It is almost encyclopedic in scope. Even so, in my view, the advice contained in the book could be improved in at least one key area, which would be the adoption of the empirical, measurement-oriented approach I advocated in my two Windows performance books. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;The number of public APIs in the .NET Framework is enormous, not to mention its extensibility by third party component developers and your in-house programmers. These object-oriented components can then be assembled into an executable program in any imaginable combination. When you factor in the need to consider all possible (or likely) permutations of .NET classes and their methods, my belief is that no static set of developer guidelines can ever be adequate to the task. This, by the way, is not meant to diminish my enthusiasm for an initiative my colleague Rico Mariani calls &lt;I style="mso-bidi-font-style: normal"&gt;performance signatures&lt;/I&gt;, which relies on static analysis and which he has discussed extensively on his &lt;/FONT&gt;&lt;A href="http://blogs.msdn.com/ricom/archive/2007/02/07/performance-signatures-cmg-2006-paper.aspx" mce_href="http://blogs.msdn.com/ricom/archive/2007/02/07/performance-signatures-cmg-2006-paper.aspx"&gt;&lt;FONT face=Calibri size=3&gt;blog&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. I am sure Rico is correct about the value of some form of static analysis. But, in addition to a static analysis, we also have to measure the application’s performance empirically to see if its execution is taking too long and then determine why. In my mind, the pattern developed in the relational database world around having both a static definition of the execution Plan chosen by the Optimizer and a dynamic Explain function is probably worth copying.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;If you are a .NET developer interested in building a highly scalable application, you are currently pretty much on your own in trying to figure out how to get started. Fortunately, basic performance considerations for .NET developers are not that different from those for any other computing platform. But the .NET Framework is so wide and deep that getting started is a daunting prospect. Picking up a copy of &lt;I style="mso-bidi-font-style: normal"&gt;Improving .NET Application Performance and Scalability&lt;/I&gt;, which runs to slightly more than 1000 pages, is also quite intimidating.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;This is also a problem I face in starting to write about this problem space. The subject is so broad and so deep that it is difficult to know where best to start. Under these circumstances, blogging about the area more informally than I otherwise would seems like a reasonable way to get started. I also work collaboratively with a remarkably talented group of individuals here in the Developer Division who share my interest and passion for application performance. They are accomplished authors in their own right, too. I will attempt to structure this blog as a team effort so you have a chance to read their contributions to our understanding of this area as well.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;So, if you don’t mind reading a work in progress, please stay tuned. If following me down a blind alley from time to time is a prospect that will dishearten you, I suggest you wait for the polished book version. But I warn you, that is probably at least several years in the future. What I intend to blog about over the next few months is our first tentative steps to formulate an empirical approach to engineering .NET application scalability, informed both by my background in computer performance and by my access to the internals of Windows, the CLR, and the .NET Framework. Someday, this might form the makings of a decent book on the subject, perhaps version 2 of &lt;I style="mso-bidi-font-style: normal"&gt;Improving .NET Application Performance and Scalability&lt;/I&gt;. But, fair warning, much of what we will be writing about is preliminary, as we work out the most effective set of techniques. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Initially, I expect you will see me writing on an unholy mix of three topics that I am currently thinking about the most:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpFirst style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT face=Calibri size=3&gt;1.&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;Parallel computing, including the multi-core hardware that we can expect and the challenge being issued to programmers to exploit it in the software we develop. I am not sure that I have anything terribly original to say about this topic, but I would at least like to acknowledge its scope and report on what tangible progress I see being made here in this important area.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpMiddle style="MARGIN: 0in 0in 0pt 0.5in"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;ATOMICELEMENT id=ms__id92&gt;&lt;FONT face=Calibri size=3&gt;&lt;HIGHLIGHTTEXT id=ms__id93&gt;&amp;nbsp;&lt;/HIGHLIGHTTEXT&gt;&lt;/FONT&gt;&lt;/ATOMICELEMENT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpMiddle style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT face=Calibri size=3&gt;2.&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;The performance of the Collection classes, which is really a proxy for a wider discussion about what software architects and developers need in terms of run-time instrumentation to deal effectively with performance and scalability considerations in the .NET Framework. There are currently some 30 million lines of code in the Framework, so it meets almost anyone’s definition of a huge surface area that developers must understand and interact with in order to be successful. Collections of one sort or another are a good choice to help frame this discussion because they are key elements of almost every substantial application that is built using .NET. Moreover, the choice of which Collection class to use in your application is one of the key decisions developers have to make. Some form of &lt;I style="mso-bidi-font-style: normal"&gt;performance intellisense&lt;/I&gt; (RicoM’s phrase) to help catch poor choices during application design and coding would undoubtedly be a cool feature in Visual Studio. But, realistically, there is no way to say whether the developer has selected the right Collection class until we understand both the size (or &lt;I style="mso-bidi-font-style: normal"&gt;cardinality&lt;/I&gt;) of the collection and the way it is accessed. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpMiddle style="MARGIN: 0in 0in 0pt 0.5in"&gt;&lt;o:p&gt;&lt;ATOMICELEMENT id=ms__id94&gt;&lt;FONT face=Calibri size=3&gt;&lt;HIGHLIGHTTEXT id=ms__id95&gt;&amp;nbsp;&lt;/HIGHLIGHTTEXT&gt;&lt;/FONT&gt;&lt;/ATOMICELEMENT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpMiddle style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT face=Calibri size=3&gt;3.&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;Finally, I plan to write a few blog entries on the subject of performance and scalability testing. Performance testing is another area where the Microsoft Patterns and Practices group has published &lt;/FONT&gt;&lt;A href="http://msdn2.microsoft.com/en-us/library/bb924375.aspx" mce_href="http://msdn2.microsoft.com/en-us/library/bb924375.aspx"&gt;&lt;FONT face=Calibri size=3&gt;a healthy amount of guidance&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. Hopefully, I can contribute something in this space that is specific to the .NET application platform.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpLast style="MARGIN: 0in 0in 10pt 0.5in"&gt;&lt;o:p&gt;&lt;ATOMICELEMENT id=ms__id96&gt;&lt;FONT face=Calibri size=3&gt;&lt;HIGHLIGHTTEXT id=ms__id97&gt;&amp;nbsp;&lt;/HIGHLIGHTTEXT&gt;&lt;/FONT&gt;&lt;/ATOMICELEMENT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoListParagraphCxSpLast style="MARGIN: 0in 0in 10pt 0.5in"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;IMG title="DDPE Blog logo" style="WIDTH: 400px; HEIGHT: 305px" height=305 alt="DDPE Blog logo" hspace=12 src="http://5l3vgw.bay.livefilestore.com/y1p5LRhwhhWeedsXaMm7DhefEcAHkF7Ls2ubyssiag_dskfe2NmQxFxMJPVijVczkAnQ66uBEeiKw5AI0Ur61gQfFNFR9u1SLbq/DDPEbloglogo.jpg" width=400 align=left vspace=12 border=4 mce_src="http://5l3vgw.bay.livefilestore.com/y1p5LRhwhhWeedsXaMm7DhefEcAHkF7Ls2ubyssiag_dskfe2NmQxFxMJPVijVczkAnQ66uBEeiKw5AI0Ur61gQfFNFR9u1SLbq/DDPEbloglogo.jpg"&gt;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt 0.25in"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;Photo credit&lt;/B&gt;: The copyright-protected artwork used in the logo is provided courtesy of James Neff. It is used with the permission of the artist. It is entitled “1947 Mt. Rainier, Washington.” The picture commemorates &lt;/FONT&gt;&lt;/FONT&gt;&lt;A href="http://brumac.8k.com/KARNOLD/KARNOLD.html" mce_href="http://brumac.8k.com/KARNOLD/KARNOLD.html"&gt;&lt;FONT face=Calibri size=3&gt;the detailed, widely reported sighting of UFOs by Kenneth Arnold&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;, a businessman and pilot, in 1947 near Mt. Rainier in Washington, which led to the first coining of the term “flying saucers” to describe the encounter. Reproductions of the artwork suitable for framing are available for sale at &lt;/FONT&gt;&lt;A href="http://www.rense.com/prints.htm" mce_href="http://www.rense.com/prints.htm"&gt;&lt;FONT face=Calibri size=3&gt;http://www.rense.com/prints.htm&lt;/FONT&gt;&lt;/A&gt;&lt;FONT face=Calibri size=3&gt;. By the way, Mr. Arnold never claimed he saw saucer-shaped UFOs, but that was the terminology that captured the imagination of the public. He claimed he saw wing-shaped vehicles flying at great speeds. 
What he did marvel at was the way these UFOs seemed to float and move like saucers being skipped across a lake or stream.&lt;/FONT&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8328275" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/ddperf/archive/tags/Performance+Engineering/default.aspx">Performance Engineering</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/.NET/default.aspx">.NET</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Scalability/default.aspx">Scalability</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/Parallel+programming/default.aspx">Parallel programming</category><category domain="http://blogs.msdn.com/ddperf/archive/tags/CSharpDevCenter/default.aspx">CSharpDevCenter</category></item></channel></rss>