<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>SQL Server Performance : ETL</title><link>http://blogs.msdn.com/sqlperf/archive/tags/ETL/default.aspx</link><description>Tags: ETL</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>An ETL World Record Revealed (Finally)</title><link>http://blogs.msdn.com/sqlperf/archive/2009/03/03/an-etl-world-record-revealed-finally.aspx</link><pubDate>Wed, 04 Mar 2009 02:38:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9457870</guid><dc:creator>Data &amp; SQL Storage Performance Team</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/sqlperf/comments/9457870.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlperf/commentrss.aspx?PostID=9457870</wfw:commentRss><wfw:comment>http://blogs.msdn.com/sqlperf/rsscomments.aspx?PostID=9457870</wfw:comment><description>&lt;P&gt;We suppose a more appropriate title would have been: Better Late&amp;nbsp;Than Never.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;David&amp;nbsp;should begin by apologizing. Last month was the &lt;EM&gt;first anniversary&lt;/EM&gt; of the ETL world record we set last year with SSIS, loading 1 TB of data in under 30 minutes. While Len did a nice blog post on the project at the time, we had promised to return to our favorite ETL practitioners (that's you) with more details, pulling back the curtain on Oz so to speak. &lt;/P&gt;
&lt;P&gt;It only took a mere year to get around to that, and a lot of water has gone under the bridge since then. Happily, the paper's now done and &lt;A class="" title="published to the web" href="http://msdn.microsoft.com/en-us/library/dd537533.aspx" target=_blank mce_href="http://msdn.microsoft.com/en-us/library/dd537533.aspx"&gt;published to the web&lt;/A&gt; for your reading pleasure. Whew!&lt;/P&gt;
&lt;P&gt;-David Powell, Len Wyatt &amp;amp; Tim Shea&lt;/P&gt;</description><category domain="http://blogs.msdn.com/sqlperf/archive/tags/Benchmarks/default.aspx">Benchmarks</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/SQL+Server+Performance/default.aspx">SQL Server Performance</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/ETL/default.aspx">ETL</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/SQL+Server+2008/default.aspx">SQL Server 2008</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/SSIS/default.aspx">SSIS</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/Integration+Services/default.aspx">Integration Services</category></item><item><title>ETL World Record!</title><link>http://blogs.msdn.com/sqlperf/archive/2008/02/27/etl-world-record.aspx</link><pubDate>Wed, 27 Feb 2008 22:36:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:7921723</guid><dc:creator>Data &amp; SQL Storage Performance Team</dc:creator><slash:comments>29</slash:comments><comments>http://blogs.msdn.com/sqlperf/comments/7921723.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlperf/commentrss.aspx?PostID=7921723</wfw:commentRss><wfw:comment>http://blogs.msdn.com/sqlperf/rsscomments.aspx?PostID=7921723</wfw:comment><description>&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;Today at the launch of SQL Server 2008, you may have seen the references to world-record performance doing a load of data using SSIS.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Microsoft and Unisys announced a record for loading data into a relational database using an Extract, Transform and Load (ETL) tool.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Over 1 TB of TPC-H 
data was loaded in under 30 minutes.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;I wanted to provide some background material in the form of a Q&amp;amp;A on the record, since it’s hard to give many details in the context of a launch event.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;We are also planning a paper that talks about all this, so think of this article as a place-holder until the full paper comes along.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;I hope you find this background information useful.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoListParagraph style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt 0.5in; TEXT-INDENT: -0.25in; TEXT-ALIGN: justify; mso-list: l0 level1 lfo1"&gt;&lt;SPAN style="mso-fareast-font-family: Calibri; mso-bidi-font-family: Calibri"&gt;&lt;SPAN style="mso-list: Ignore"&gt;&lt;FONT face=Calibri size=3&gt;-&lt;/FONT&gt;&lt;SPAN style="FONT: 7pt 'Times New Roman'"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;Len Wyatt&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;How fast was the data load?&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;More than one terabyte of data was parsed from flat files, transferred over the network and loaded into the destination database in less than 30 minutes, a world record beating all previously published results using an ETL tool.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;That is a rate in excess of 2 TB per hour (650+ MB/second). &lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;To be precise, 1.18TB of flat file data was loaded in 1794 seconds.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;This is equivalent to 1.00TB in 25 minutes 20 seconds or 2.36TB per hour.&lt;/FONT&gt;&lt;/P&gt;
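&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;The arithmetic behind these figures can be reproduced with a short sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), which the post does not state, and the function and variable names are illustrative only. Because the 1.18 TB figure is itself rounded, the hourly rate computed here may differ from the quoted 2.36 TB in the last digit.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
# Figures quoted in the post: 1.18 TB of flat files loaded in 1794 seconds.
# Assumption: 1 TB = 10**12 bytes and 1 MB = 10**6 bytes (decimal units).
DATA_TB = 1.18
ELAPSED_SECONDS = 1794


def load_rates(data_tb, elapsed_seconds):
    """Return (TB per hour, MB per second, seconds needed per TB)."""
    tb_per_hour = data_tb / elapsed_seconds * 3600
    mb_per_second = data_tb * 1_000_000 / elapsed_seconds
    seconds_per_tb = elapsed_seconds / data_tb
    return tb_per_hour, mb_per_second, seconds_per_tb


tb_per_hour, mb_per_second, seconds_per_tb = load_rates(DATA_TB, ELAPSED_SECONDS)
# Split the time needed for exactly 1 TB into whole minutes and seconds.
minutes, seconds = divmod(round(seconds_per_tb), 60)

print(f"{tb_per_hour:.2f} TB per hour")
print(f"{mb_per_second:.0f} MB per second")
print(f"1.00 TB in {minutes} minutes {seconds} seconds")
```
&lt;/PRE&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;Changing DATA_TB or ELAPSED_SECONDS gives the equivalent rates for a different run.&lt;/FONT&gt;&lt;/P&gt;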
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Why is this important?&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;Businesses have ever-increasing volumes of data stored in many heterogeneous systems.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;They want to know that an ETL tool they choose will be able to support any data volumes they might require.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Microsoft has been making a significant investment in SQL Server Integration Services (SSIS), and this record illustrates the capability of SQL Server Integration Services 2008, SQL Server 2008 and the Unisys ES7000 to handle a significant volume of data at a dramatic speed.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Why not just do a bulk load of the data?&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;It is rare in businesses today that data is always available on the destination system and does not need to be standardized or corrected for errors before loading. Those rare cases are the times when bulk loading data makes sense. Data integration can involve complex transformation rules, error checking and data standardization techniques. ETL tools like SSIS can perform functions such as moving data between systems, reformatting data, integrity checking, key lookups, tracking lineage, and more. SSIS has proven itself to be a versatile ETL tool, and now it has been shown to be the fastest one as well.&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/B&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;What data did you choose to load?&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;The DBGEN tool from the TPC-H benchmark was used to generate 1.18 TB of source data.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;The data were partitioned by DBGEN, allowing them to be loaded in parallel from multiple systems. &lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;DBGEN generates data on customers, parts, suppliers, orders and line items.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;It is broadly representative of a wholesale business.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;The data contain a variety of data types, including dates, money amounts, integers, strings and flags.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Please note that the ETL loading results are &lt;B style="mso-bidi-font-weight: normal"&gt;not&lt;/B&gt; TPC-H benchmark results and should not be compared to TPC-H benchmark results.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Was this a certified benchmark?&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;There is no commonly accepted benchmark for ETL tools.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Microsoft thinks there should be.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Industry standard benchmarks can lead to healthy competition, better products, and better publication of the techniques used to get high performance.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Microsoft would welcome the opportunity to join with others in the industry to define a common benchmark that reflects the real-world uses of ETL tools.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;The use of TPC-H data for this project was a convenience.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;This is not a TPC-H benchmark result.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;How does this compare to your competitors?&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;Multiple competitors have published results based on TPC-H data.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Informatica has the fastest time previously reported, loading 1 TB in over 45 minutes.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;SSIS has now beaten that time by more than 15 minutes.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;There are other claims of fast times that have been made, but on non-standard data sets and without enough information to allow any meaningful comparison.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;This is part of the reason Microsoft would support the creation of an industry standard ETL benchmark.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;What system configuration was used?&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;FONT face=Calibri size=3&gt;The database server ran on a Unisys ES7000/one Enterprise Server, with &lt;/FONT&gt;&lt;SPAN style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Arial','sans-serif'; mso-fareast-font-family: 'Times New Roman'"&gt;32 socket dual core &lt;SPAN style="COLOR: black"&gt;Intel® Xeon&lt;SUP&gt;TM&lt;/SUP&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SUP&gt;&lt;SPAN style="FONT-SIZE: 7.5pt; COLOR: black; LINE-HEIGHT: 115%; FONT-FAMILY: 'Arial','sans-serif'; mso-fareast-font-family: 'Times New Roman'"&gt; &lt;/SPAN&gt;&lt;/SUP&gt;&lt;SPAN style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Arial','sans-serif'; mso-fareast-font-family: 'Times New Roman'"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;3.4 GHz (7140M) &lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;processors, 256 GB RAM and 8 dual port 4Gbit HBAs.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;The SQL Server data was stored on an EMC Clariion CX3-80 SAN with 165 (146 GB/15 krpm) spindles. The database server ran a pre-release build of SQL Server 2008 Enterprise Edition (V10.0.1300.4, built just before the “February 2008 CTP”) on the Windows Server 2008 x64 Datacenter Edition operating system.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"&gt;&lt;IMG src="http://blogs.msdn.com/photos/sqlperf/images/7921664/371x425.aspx" mce_src="http://blogs.msdn.com/photos/sqlperf/images/7921664/371x425.aspx"&gt;&lt;A href="http://blogs.msdn.com/photos/sqlperf/picture7921664.aspx" mce_href="http://blogs.msdn.com/photos/sqlperf/picture7921664.aspx"&gt;&lt;/A&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;Four servers acted as data sources, modeling the fact that data comes from a variety of systems in a modern enterprise.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Each source server ran SSIS packages that sent data across the network to the database server. The source servers ran SSIS from SQL Server build V10.0.1300.4, on the Windows Server 2008 operating system. &lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;Source data came from flat files, as it was generated by DBGEN.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;For the source servers, 4 Unisys ES3220L servers with Windows Server 2008 x64 Enterprise Edition were used. Each server was equipped with 2 x 2.0GHz quad &lt;/FONT&gt;&lt;SPAN style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: 'Arial','sans-serif'; mso-fareast-font-family: 'Times New Roman'"&gt;core &lt;SPAN style="COLOR: black"&gt;Intel® &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;FONT face=Calibri size=3&gt;processors, 4GB RAM, a dual port 4Gbit Emulex HBA and an Intel PRO1000/PT network card. The source data was read from 2 x EMC Clariion CX600 SANs with 45 spindles each. &lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;The source servers were connected to the ES7000/one database server with private dual port 1Gb Ethernet connections.&lt;I style="mso-bidi-font-style: normal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/I&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Why use multiple source systems?&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;Modern large businesses are complex operations.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Large data sets are often the result of multiple data feeds.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;This made the test more realistic by mimicking a real world ETL scenario.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;What do the SSIS packages look like?&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;There was just one package, though the source systems ran multiple instances of it.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;It is quite simple:&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;There is one control flow for each “stream” of data generated by DBGEN.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;The control flow has one data flow for each table, each data flow reading data from a flat file source and writing to the SQL Server database via OLEDB.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Using this data set there is a one-to-one column mapping between the flat file data and the database tables.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt" mce_keep="true"&gt;&lt;IMG style="WIDTH: 700px; HEIGHT: 525px" height=525 src="http://blogs.msdn.com/photos/sqlperf/images/7921672/original.aspx" width=700 mce_src="http://blogs.msdn.com/photos/sqlperf/images/7921672/original.aspx"&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Did Windows Server 2008 figure in to this?&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;A lot of innovative engineering work in Windows Server 2008, including significant improvements in memory management, PCI and block storage I/O, and core networking, helped achieve this great performance. Because of these advances, Windows Server 2008 sustained about 960 megabytes per second over the Ethernet network, during processing of one large table.&lt;/FONT&gt;&lt;/P&gt;
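&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;As a rough sanity check on the 960 megabytes per second figure, the sketch below compares it with the raw bandwidth of the Ethernet links described in the system configuration. It assumes that all four source servers used both ports of their dual port 1Gb connections at the same time and that 1 MB = 10^6 bytes. Neither is stated in this post, protocol overhead is ignored, and the names are illustrative only.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
# Assumptions (not stated outright in the post): all four source servers
# drive both ports of their dual-port 1 Gb Ethernet link at once, and
# 1 MB = 10**6 bytes. Protocol overhead is ignored.
SOURCE_SERVERS = 4
PORTS_PER_SERVER = 2
GBIT_PER_PORT = 1
OBSERVED_MB_PER_SECOND = 960  # sustained rate quoted in the post


def raw_line_rate_mb_per_second(servers, ports, gbit_per_port):
    """Aggregate raw Ethernet bandwidth in MB per second (8 bits per byte)."""
    total_gbit = servers * ports * gbit_per_port
    return total_gbit * 1000 / 8


line_rate = raw_line_rate_mb_per_second(
    SOURCE_SERVERS, PORTS_PER_SERVER, GBIT_PER_PORT
)
utilization = OBSERVED_MB_PER_SECOND / line_rate

print(f"raw line rate: {line_rate:.0f} MB per second")
print(f"utilization at 960 MB per second: {utilization:.0%}")
```
&lt;/PRE&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;Under those assumptions the quoted rate sits close to the raw capacity of the links, which is consistent with the network being driven near its limit while the large table was processed.&lt;/FONT&gt;&lt;/P&gt;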
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;B style="mso-bidi-font-weight: normal"&gt;&lt;FONT size=3&gt;&lt;FONT face=Calibri&gt;Were secret internal tricks needed to make this work?&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;No secret internal tricks or special builds were needed.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Although this project used a pre-release version, it was a regular SQL Server 2008 Enterprise Edition build.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;No special code in the product was used.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Everything we did could be replicated by others.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;The main thing done in the relational database was to use “soft NUMA” and port mapping to get a good distribution of work within the system.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;This is a published technique; you can find articles about it on MSDN.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;We also set the -x flag on starting SQL Server.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;This reduces the time SQL Server spends collecting performance statistics at run-time.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;In SSIS we made sure the data types used in the SSIS data flows matched the types used in SQL Server, so the data did not need to be converted again after the initial conversion of strings read from flat files.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Fast Parse was set on the text file fields where it applied.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="TEXT-JUSTIFY: inter-ideograph; MARGIN: 0in 0in 10pt; TEXT-ALIGN: justify"&gt;&lt;FONT face=Calibri size=3&gt;The network connections on the server used the built-in Intel PRO/1000 GbE controllers. Released versions of network drivers were used, and Ethernet jumbo frames were configured to better support this bulk streaming scenario. Windows Server 2008’s new TCP/IP receive window autotuning was set to “restricted”.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;The IntPolicy tool was used to ensure the ES7000 server NICs’ interrupts &amp;amp; DPCs occurred on a CPU affinitized to the same NUMA node as the NIC.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;A complete list of settings and optimizations will be included in the paper when it is released.&lt;/FONT&gt;&lt;/P&gt;
</description><category domain="http://blogs.msdn.com/sqlperf/archive/tags/Benchmarks/default.aspx">Benchmarks</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/Announcements/default.aspx">Announcements</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/SQL+Server+Performance/default.aspx">SQL Server Performance</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/ETL/default.aspx">ETL</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/SQL+Server+2008/default.aspx">SQL Server 2008</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/SSIS/default.aspx">SSIS</category><category domain="http://blogs.msdn.com/sqlperf/archive/tags/Integration+Services/default.aspx">Integration Services</category></item></channel></rss>