<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>SSIS Team Blog : Lookup</title><link>http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx</link><description>Tags: Lookup</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>API Sample – Lookup Transform</title><link>http://blogs.msdn.com/mattm/archive/2009/01/02/api-sample-lookup-transform.aspx</link><pubDate>Fri, 02 Jan 2009 21:34:50 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9269538</guid><dc:creator>mmasson</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9269538.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9269538</wfw:commentRss><description>&lt;p&gt;This sample creates a data flow package with an OLEDB Source component feeding into a Lookup Transform. The Lookup transform is set to Full Cache mode, and uses [DimCustomer] as its reference table. &lt;/p&gt;  &lt;p&gt;Items of interest:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;CustomerKey and GeographyKey are used as the index (join) columns. This is configured by using the JoinToReferenceColumn property&lt;/li&gt;    &lt;li&gt;The FirstName column is being overwritten by the value retrieved by the lookup transform&lt;/li&gt;    &lt;li&gt;The LastName2 column is being added as a new output column&lt;/li&gt; &lt;/ul&gt;  &lt;pre class="code"&gt;&lt;span style="color: blue"&gt;static void &lt;/span&gt;Main(&lt;span style="color: blue"&gt;string&lt;/span&gt;[] args)
{
    &lt;span style="color: #2b91af"&gt;Package &lt;/span&gt;package = &lt;span style="color: blue"&gt;new &lt;/span&gt;&lt;span style="color: #2b91af"&gt;Package&lt;/span&gt;();

    &lt;span style="color: green"&gt;// Add Data Flow Task
    &lt;/span&gt;&lt;span style="color: #2b91af"&gt;Executable &lt;/span&gt;dataFlowTask = package.Executables.Add(&lt;span style="color: #a31515"&gt;&amp;quot;STOCK:PipelineTask&amp;quot;&lt;/span&gt;);

    &lt;span style="color: green"&gt;// Set the name (otherwise it will be a random GUID value)
    &lt;/span&gt;&lt;span style="color: #2b91af"&gt;TaskHost &lt;/span&gt;taskHost = dataFlowTask &lt;span style="color: blue"&gt;as &lt;/span&gt;&lt;span style="color: #2b91af"&gt;TaskHost&lt;/span&gt;;
    taskHost.Name = &lt;span style="color: #a31515"&gt;&amp;quot;Data Flow Task&amp;quot;&lt;/span&gt;;

    &lt;span style="color: green"&gt;// We need a reference to the InnerObject to add items to the data flow
    &lt;/span&gt;&lt;span style="color: #2b91af"&gt;MainPipe &lt;/span&gt;pipeline = taskHost.InnerObject &lt;span style="color: blue"&gt;as &lt;/span&gt;&lt;span style="color: #2b91af"&gt;MainPipe&lt;/span&gt;;

    &lt;span style="color: green"&gt;//
    // Add connection manager
    //

    &lt;/span&gt;&lt;span style="color: #2b91af"&gt;ConnectionManager &lt;/span&gt;connection = package.Connections.Add(&lt;span style="color: #a31515"&gt;&amp;quot;OLEDB&amp;quot;&lt;/span&gt;);
    connection.Name = &lt;span style="color: #a31515"&gt;&amp;quot;localhost&amp;quot;&lt;/span&gt;;
    connection.ConnectionString = &lt;span style="color: #a31515"&gt;&amp;quot;Data Source=localhost;Initial Catalog=AdventureWorksDW2008;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;&amp;quot;&lt;/span&gt;;

    &lt;span style="color: green"&gt;//
    // Add OLEDB Source
    //

    &lt;/span&gt;&lt;span style="color: #2b91af"&gt;IDTSComponentMetaData100 &lt;/span&gt;srcComponent = pipeline.ComponentMetaDataCollection.New();
    srcComponent.ComponentClassID = &lt;span style="color: #a31515"&gt;&amp;quot;DTSAdapter.OleDbSource&amp;quot;&lt;/span&gt;;
    srcComponent.ValidateExternalMetadata = &lt;span style="color: blue"&gt;true&lt;/span&gt;;
    &lt;span style="color: #2b91af"&gt;IDTSDesigntimeComponent100 &lt;/span&gt;srcDesignTimeComponent = srcComponent.Instantiate();
    srcDesignTimeComponent.ProvideComponentProperties();
    srcComponent.Name = &lt;span style="color: #a31515"&gt;&amp;quot;OleDb Source&amp;quot;&lt;/span&gt;;

    &lt;span style="color: green"&gt;// Configure it to read from the given table
    &lt;/span&gt;srcDesignTimeComponent.SetComponentProperty(&lt;span style="color: #a31515"&gt;&amp;quot;AccessMode&amp;quot;&lt;/span&gt;, 0);
    srcDesignTimeComponent.SetComponentProperty(&lt;span style="color: #a31515"&gt;&amp;quot;OpenRowset&amp;quot;&lt;/span&gt;, &lt;span style="color: #a31515"&gt;&amp;quot;[DimCustomer]&amp;quot;&lt;/span&gt;);

    &lt;span style="color: green"&gt;// Set the connection manager
    &lt;/span&gt;srcComponent.RuntimeConnectionCollection[0].ConnectionManager = &lt;span style="color: #2b91af"&gt;DtsConvert&lt;/span&gt;.GetExtendedInterface(connection);
    srcComponent.RuntimeConnectionCollection[0].ConnectionManagerID = connection.ID;

    &lt;span style="color: green"&gt;// Retrieve the column metadata
    &lt;/span&gt;srcDesignTimeComponent.AcquireConnections(&lt;span style="color: blue"&gt;null&lt;/span&gt;);
    srcDesignTimeComponent.ReinitializeMetaData();
    srcDesignTimeComponent.ReleaseConnections();

    &lt;span style="color: green"&gt;// Add transform
    &lt;/span&gt;&lt;span style="color: #2b91af"&gt;IDTSComponentMetaData100 &lt;/span&gt;lookupComponent = pipeline.ComponentMetaDataCollection.New();
    lookupComponent.ComponentClassID = &lt;span style="color: #a31515"&gt;&amp;quot;DTSTransform.Lookup&amp;quot;&lt;/span&gt;;
    lookupComponent.Name = &lt;span style="color: #a31515"&gt;&amp;quot;Lookup&amp;quot;&lt;/span&gt;;

    &lt;span style="color: #2b91af"&gt;CManagedComponentWrapper &lt;/span&gt;lookupWrapper = lookupComponent.Instantiate();
    lookupWrapper.ProvideComponentProperties();

    &lt;span style="color: green"&gt;// Connect the source and the transform
    &lt;/span&gt;&lt;span style="color: #2b91af"&gt;IDTSPath100 &lt;/span&gt;path = pipeline.PathCollection.New();
    path.AttachPathAndPropagateNotifications(srcComponent.OutputCollection[0], lookupComponent.InputCollection[0]);

    &lt;span style="color: green"&gt;//
    // Configure the transform
    //

    // Set the connection manager
    &lt;/span&gt;lookupComponent.RuntimeConnectionCollection[0].ConnectionManager = &lt;span style="color: #2b91af"&gt;DtsConvert&lt;/span&gt;.GetExtendedInterface(connection);
    lookupComponent.RuntimeConnectionCollection[0].ConnectionManagerID = connection.ID;

    &lt;span style="color: green"&gt;// Cache Type - Full = 0, Partial = 1, None = 2
    &lt;/span&gt;lookupWrapper.SetComponentProperty(&lt;span style="color: #a31515"&gt;&amp;quot;CacheType&amp;quot;&lt;/span&gt;, 0);
    lookupWrapper.SetComponentProperty(&lt;span style="color: #a31515"&gt;&amp;quot;SqlCommand&amp;quot;&lt;/span&gt;, &lt;span style="color: #a31515"&gt;&amp;quot;select * from [DimCustomer]&amp;quot;&lt;/span&gt;);

    &lt;span style="color: green"&gt;// initialize metadata
    &lt;/span&gt;lookupWrapper.AcquireConnections(&lt;span style="color: blue"&gt;null&lt;/span&gt;);
    lookupWrapper.ReinitializeMetaData();
    lookupWrapper.ReleaseConnections();

    &lt;span style="color: green"&gt;// Mark the columns we are joining on
    &lt;/span&gt;&lt;span style="color: #2b91af"&gt;IDTSInput100 &lt;/span&gt;lookupInput = lookupComponent.InputCollection[0];
    &lt;span style="color: #2b91af"&gt;IDTSInputColumnCollection100 &lt;/span&gt;lookupInputColumns = lookupInput.InputColumnCollection;
    &lt;span style="color: #2b91af"&gt;IDTSVirtualInput100 &lt;/span&gt;lookupVirtualInput = lookupInput.GetVirtualInput();
    &lt;span style="color: #2b91af"&gt;IDTSVirtualInputColumnCollection100 &lt;/span&gt;lookupVirtualInputColumns = lookupVirtualInput.VirtualInputColumnCollection;

    &lt;span style="color: green"&gt;// We are joining on CustomerKey and GeographyKey
    // Note: join columns should be marked as READONLY
    &lt;/span&gt;&lt;span style="color: blue"&gt;var &lt;/span&gt;joinColumns = &lt;span style="color: blue"&gt;new string&lt;/span&gt;[] { &lt;span style="color: #a31515"&gt;&amp;quot;CustomerKey&amp;quot;&lt;/span&gt;, &lt;span style="color: #a31515"&gt;&amp;quot;GeographyKey&amp;quot; &lt;/span&gt;};
    &lt;span style="color: blue"&gt;foreach &lt;/span&gt;(&lt;span style="color: blue"&gt;string &lt;/span&gt;columnName &lt;span style="color: blue"&gt;in &lt;/span&gt;joinColumns)
    {
        &lt;span style="color: #2b91af"&gt;IDTSVirtualInputColumn100 &lt;/span&gt;virtualColumn = lookupVirtualInputColumns[columnName];
        &lt;span style="color: #2b91af"&gt;IDTSInputColumn100 &lt;/span&gt;inputColumn = lookupWrapper.SetUsageType(lookupInput.ID, lookupVirtualInput, virtualColumn.LineageID, &lt;span style="color: #2b91af"&gt;DTSUsageType&lt;/span&gt;.UT_READONLY);
        lookupWrapper.SetInputColumnProperty(lookupInput.ID, inputColumn.ID, &lt;span style="color: #a31515"&gt;&amp;quot;JoinToReferenceColumn&amp;quot;&lt;/span&gt;, columnName);
    }
    
    &lt;span style="color: green"&gt;// Overwrite the existing FirstName column value with the one returned by the Lookup.
    // To do this, we need to flag the column as READWRITE, and set the CopyFromReferenceColumn property on the input
    &lt;/span&gt;&lt;span style="color: blue"&gt;var &lt;/span&gt;overwriteColumns = &lt;span style="color: blue"&gt;new string&lt;/span&gt;[] { &lt;span style="color: #a31515"&gt;&amp;quot;FirstName&amp;quot; &lt;/span&gt;};
    &lt;span style="color: blue"&gt;foreach &lt;/span&gt;(&lt;span style="color: blue"&gt;string &lt;/span&gt;columnName &lt;span style="color: blue"&gt;in &lt;/span&gt;overwriteColumns)
    {
        &lt;span style="color: #2b91af"&gt;IDTSVirtualInputColumn100 &lt;/span&gt;virtualColumn = lookupVirtualInputColumns[columnName];
        &lt;span style="color: #2b91af"&gt;IDTSInputColumn100 &lt;/span&gt;inputColumn = lookupWrapper.SetUsageType(lookupInput.ID, lookupVirtualInput, virtualColumn.LineageID, &lt;span style="color: #2b91af"&gt;DTSUsageType&lt;/span&gt;.UT_READWRITE);

        lookupWrapper.SetInputColumnProperty(lookupInput.ID, inputColumn.ID, &lt;span style="color: #a31515"&gt;&amp;quot;CopyFromReferenceColumn&amp;quot;&lt;/span&gt;, columnName);
    }

    &lt;span style="color: green"&gt;// First output is the Match output
    &lt;/span&gt;&lt;span style="color: #2b91af"&gt;IDTSOutput100 &lt;/span&gt;lookupMatchOutput = lookupComponent.OutputCollection[0];

    &lt;span style="color: green"&gt;// Add a new LastName2 column from the &amp;quot;LastName&amp;quot; column returned by the lookup
    &lt;/span&gt;&lt;span style="color: blue"&gt;var &lt;/span&gt;newColumns = &lt;span style="color: blue"&gt;new &lt;/span&gt;&lt;span style="color: #2b91af"&gt;Dictionary&lt;/span&gt;&amp;lt;&lt;span style="color: blue"&gt;string&lt;/span&gt;, &lt;span style="color: blue"&gt;string&lt;/span&gt;&amp;gt;();
    newColumns.Add(&lt;span style="color: #a31515"&gt;&amp;quot;LastName&amp;quot;&lt;/span&gt;, &lt;span style="color: #a31515"&gt;&amp;quot;LastName2&amp;quot;&lt;/span&gt;);

    &lt;span style="color: blue"&gt;foreach &lt;/span&gt;(&lt;span style="color: blue"&gt;string &lt;/span&gt;sourceColumn &lt;span style="color: blue"&gt;in &lt;/span&gt;newColumns.Keys)
    {
        &lt;span style="color: blue"&gt;string &lt;/span&gt;newColumnName = newColumns[sourceColumn];
        &lt;span style="color: blue"&gt;string &lt;/span&gt;description = &lt;span style="color: blue"&gt;string&lt;/span&gt;.Format(&lt;span style="color: #a31515"&gt;&amp;quot;Copy of {0}&amp;quot;&lt;/span&gt;, sourceColumn);

        &lt;span style="color: green"&gt;// insert the new column
        &lt;/span&gt;&lt;span style="color: #2b91af"&gt;IDTSOutputColumn100 &lt;/span&gt;outputColumn = lookupWrapper.InsertOutputColumnAt(lookupMatchOutput.ID, 0, newColumnName, description);
        lookupWrapper.SetOutputColumnProperty(lookupMatchOutput.ID, outputColumn.ID, &lt;span style="color: #a31515"&gt;&amp;quot;CopyFromReferenceColumn&amp;quot;&lt;/span&gt;, sourceColumn);
    }
}&lt;/pre&gt;
&lt;a href="http://11011.net/software/vspaste"&gt;&lt;/a&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9269538" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Samples/default.aspx">Samples</category><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category><category domain="http://blogs.msdn.com/mattm/archive/tags/API/default.aspx">API</category></item><item><title>Lookup Pattern: Range Lookups</title><link>http://blogs.msdn.com/mattm/archive/2008/11/25/lookup-pattern-range-lookups.aspx</link><pubDate>Tue, 25 Nov 2008 20:30:28 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9142432</guid><dc:creator>mmasson</dc:creator><slash:comments>11</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9142432.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9142432</wfw:commentRss><description>&lt;p&gt;Performing range lookups (i.e. to find a key for a given range) is a common ETL operation in data warehousing scenarios. It's especially for historical loads and late arriving fact situations, where you're using &lt;a href="http://en.wikipedia.org/wiki/Slowly_Changing_Dimension"&gt;type 2 dimensions&lt;/a&gt; and you need to locate the key which represents the dimension value for a given point in time.&lt;/p&gt;  &lt;p&gt;This blog post outlines three separate approaches for doing range lookups in SSIS:&lt;/p&gt;  &lt;ol&gt;   &lt;li&gt;Using the Lookup Transform&lt;/li&gt;    &lt;li&gt;Merge Join + Conditional Split&lt;/li&gt;    &lt;li&gt;Script Component&lt;/li&gt; &lt;/ol&gt;  &lt;p&gt;All of our scenarios will use the AdventureWorksDW2008 sample database (DimProduct table) as the dimension, and take its fact data from AdventureWorks2008 (SalesOrderHeader and SalesOrderDetail tables). 
The &amp;quot;ProductNumber&amp;quot; column from the SalesOrderDetail table maps to the natural key of the DimProduct dimension (ProductAlternateKey column). In all cases we want to look up the key (ProductKey) for the product which was valid (identified by StartDate and EndDate) for the given OrderDate.&lt;/p&gt;  &lt;p&gt;One last thing to note is that the Merge Join and Script Component solutions assume that a valid range exists for each incoming value. The Lookup Transform approach is the only one that will identify rows that have no matches (although the Script Component solution could be modified to do so as well).&lt;/p&gt;  &lt;h3&gt;Lookup Transform&lt;/h3&gt;  &lt;p&gt;The Lookup Transform was designed to handle 1:1 key matching, but it can also be used in the range lookup scenario by using a partial cache mode, and tweaking the query on the Advanced Settings page. However, the Lookup doesn't cache the range itself, and will end up going to the database very often - it will only detect a match in its cache if all of the parameters are the same (i.e. same product purchased on the same date). &lt;/p&gt;  &lt;p&gt;We can use the following query to have the lookup transform perform our range lookup:&lt;/p&gt;  &lt;pre class="code"&gt;&lt;span style="color: #a7b4bc"&gt;select &lt;/span&gt;&lt;span style="color: black"&gt;[ProductKey], [ProductAlternateKey], 
     [StartDate], [EndDate]
&lt;/span&gt;&lt;span style="color: #a7b4bc"&gt;from &lt;/span&gt;&lt;span style="color: black"&gt;[dbo].[DimProduct]
&lt;/span&gt;&lt;span style="color: #a7b4bc"&gt;where &lt;/span&gt;&lt;span style="color: black"&gt;[ProductAlternateKey] = ?
&lt;/span&gt;&lt;span style="color: #a7b4bc"&gt;and   &lt;/span&gt;&lt;span style="color: black"&gt;[StartDate] &amp;lt;= ?
&lt;/span&gt;&lt;span style="color: #a7b4bc"&gt;and &lt;/span&gt;&lt;span style="color: black"&gt;(
    [EndDate] is null &lt;/span&gt;&lt;span style="color: #a7b4bc"&gt;or 
    &lt;/span&gt;&lt;span style="color: black"&gt;[EndDate] &amp;gt; ?
)
&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;On the query parameters page, we map 0 -&amp;gt; ProductNumber, 1 and 2 -&amp;gt; OrderDate. &lt;/p&gt;

&lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/image_2.png"&gt;&lt;img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="275" alt="image" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/image_thumb.png" width="265" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This approach is effective and easy to set up, but it is slow when dealing with a large number of rows, as most lookups will go to the database. &lt;/p&gt;

&lt;h3&gt;Merge Join and Conditional Split&lt;/h3&gt;

&lt;p&gt;This approach doesn't use the Lookup Transform. Instead we use a Merge Join Transform to do an inner join on our dimension table. This will give us more rows coming out than we had coming in (you'll get a row for every repeated ProductAlternateKey). We use the conditional split to do the actual range check, and take only the rows that fall into the right range.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/image_4.png"&gt;&lt;img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="275" alt="image" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/image_thumb_1.png" width="357" border="0" /&gt;&lt;/a&gt;&amp;#160;&lt;/p&gt;

&lt;p&gt;For example, a row coming in from our source would contain an OrderDate and ProductNumber, like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/table1_2.png"&gt;&lt;img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="56" alt="table1" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/table1_thumb.png" width="240" border="0" /&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;From the DimProduct source, we take three additional columns - ProductKey (what we're after), StartDate and EndDate. The DimProduct dimension contains three entries for the &amp;quot;LJ-0192-L&amp;quot; product (as its information, like unit price, has changed over time). After going through the Merge Join, the single row becomes three rows. &lt;/p&gt;

&lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/table2_2.png"&gt;&lt;img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="100" alt="table2" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/table2_thumb.png" width="386" border="0" /&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;We use the Conditional Split to do the range lookup, and take the single row we want. Here is our expression (remember, in our case an EndDate value of NULL indicates that it's the most current row): &lt;/p&gt;

&lt;p&gt;&lt;font face="Courier New" size="2"&gt;StartDate &amp;lt;= OrderDate &amp;amp;&amp;amp; (OrderDate &amp;lt; EndDate || ISNULL(EndDate))&lt;/font&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/table3_2.png"&gt;&lt;img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="60" alt="table3" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/table3_thumb.png" width="240" border="0" /&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;This approach is a little more complicated, but performs a lot better than using the Lookup Transform.&lt;/p&gt;
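
&lt;p&gt;As a rough illustration of the join-then-filter logic (not the package itself), the sketch below mimics the two steps in plain C# with LINQ. The dates and ProductKey values are made up for the example, and an in-memory join stands in for the Merge Join, which in SSIS also needs both inputs sorted on the join key.&lt;/p&gt;

```csharp
using System;
using System.Linq;

static class MergeJoinSketch
{
    static void Main()
    {
        // One incoming fact row. The OrderDate is illustrative, not taken from the post.
        var orders = new[]
        {
            new { ProductNumber = "LJ-0192-L", OrderDate = new DateTime(2003, 1, 15) },
        };

        // Three hypothetical DimProduct versions of the same product (assumed values).
        var dimProduct = new[]
        {
            new { ProductKey = 1, ProductAlternateKey = "LJ-0192-L", StartDate = new DateTime(2001, 7, 1), EndDate = (DateTime?)new DateTime(2002, 7, 1) },
            new { ProductKey = 2, ProductAlternateKey = "LJ-0192-L", StartDate = new DateTime(2002, 7, 1), EndDate = (DateTime?)new DateTime(2003, 7, 1) },
            new { ProductKey = 3, ProductAlternateKey = "LJ-0192-L", StartDate = new DateTime(2003, 7, 1), EndDate = (DateTime?)null },
        };

        // "Merge Join" step: inner join on the natural key, so one order row fans out to three.
        var joined = orders.Join(
                dimProduct,
                order => order.ProductNumber,
                product => product.ProductAlternateKey,
                (order, product) => new { order.OrderDate, product.ProductKey, product.StartDate, product.EndDate })
            .ToList();

        // "Conditional Split" step: keep the version whose range contains the OrderDate.
        // A null EndDate marks the current row, as in the expression above.
        var kept = joined
            .Where(row => row.OrderDate >= row.StartDate)
            .Where(row => row.EndDate == null || row.EndDate > row.OrderDate)
            .ToList();

        Console.WriteLine("{0} joined, {1} kept", joined.Count, kept.Count);
    }
}
```

&lt;p&gt;With this sample data the program prints "3 joined, 1 kept": the join produces one row per DimProduct version, and the range check keeps only the version that was valid on the OrderDate.&lt;/p&gt;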

&lt;h3&gt;Script Component&lt;/h3&gt;

&lt;p&gt;The third approach uses a custom script component to perform the lookup. I wrote the script in two ways - one that simulates a &amp;quot;Full Cache&amp;quot; type lookup, and one that is similar to partial cache except it pulls back all values for a given natural key, instead of just the one for the given date range. The caching behavior is controlled by the PreCache boolean package variable.&lt;/p&gt;

&lt;p&gt;&lt;iframe style="border-right: #dde5e9 1px solid; padding-right: 0px; border-top: #dde5e9 1px solid; padding-left: 0px; padding-bottom: 0px; margin: 3px; border-left: #dde5e9 1px solid; width: 240px; padding-top: 0px; border-bottom: #dde5e9 1px solid; height: 66px; background-color: #ffffff" marginwidth="0" marginheight="0" src="http://cid-2aeb3aa8bb4bd9fd.skydrive.live.com/embedrowdetail.aspx/Public/RangeLookupScript.cs" frameborder="0" scrolling="no"&gt;&lt;/iframe&gt;&lt;/p&gt;

&lt;h3&gt;Conclusion&lt;/h3&gt;

&lt;p&gt;I ran the three packages using the following environment (my laptop):&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Dual core Intel 1.8 GHz&lt;/li&gt;

  &lt;li&gt;3 GB of RAM&lt;/li&gt;

  &lt;li&gt;AdventureWorks2008 and AdventureWorksDW2008&lt;/li&gt;

  &lt;ul&gt;
    &lt;li&gt;~120,000 order rows (SalesOrderDetail)&lt;/li&gt;

    &lt;li&gt;~600 reference rows (DimProduct)&lt;/li&gt;
  &lt;/ul&gt;
&lt;/ul&gt;

&lt;p&gt;Here are the results, in rows per second (larger being better):&lt;/p&gt;

&lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/image_12.png"&gt;&lt;img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="362" alt="image" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternRangeLookups_ADC3/image_thumb_5.png" width="550" border="0" /&gt;&lt;/a&gt;&amp;#160;&lt;/p&gt;

&lt;p&gt;At 120k+ rows per second, we can see that the custom script (or better yet, a custom transform) is the best alternative here. We can also see that even though the Lookup approach was by far the slowest (3639 rows / second), it is still a viable choice when you're processing a small number of rows. &lt;/p&gt;

&lt;p&gt;There are a couple of reasons that the Lookup Transform performs poorly here. First, because it's not able to pre-cache any of the reference data, it has to go to the database often. Second, it matches only on actual parameter values - it doesn't have a concept of ranges. Since it will only find a cache hit if all parameters are the same, it ends up hitting the database for almost every row (120k times). By comparison, the script component will only query once per unique ProductNumber (~600 times max). &lt;/p&gt;

&lt;p&gt;So there you have three different approaches for doing range lookups in SSIS. I'm hoping we'll be able to either enhance the Lookup component to support this functionality in the future, or perhaps provide a new transform to handle this case. &lt;/p&gt;

&lt;p&gt;In the meantime, please feel free to post / email any alternative approaches you might have.&lt;/p&gt;

&lt;p&gt;I've attached the packages used in this post in case you want to try out the different options for yourself.&lt;/p&gt;

&lt;p&gt;&amp;#160;&lt;iframe style="border-right: #dde5e9 1px solid; padding-right: 0px; border-top: #dde5e9 1px solid; padding-left: 0px; padding-bottom: 0px; margin: 3px; border-left: #dde5e9 1px solid; width: 240px; padding-top: 0px; border-bottom: #dde5e9 1px solid; height: 66px; background-color: #ffffff" marginwidth="0" marginheight="0" src="http://cid-2aeb3aa8bb4bd9fd.skydrive.live.com/embedrowdetail.aspx/Public/RangeLookup.zip" frameborder="0" scrolling="no"&gt;&lt;/iframe&gt;&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9142432" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>Lookup Pattern: Incremental persistent cache updates</title><link>http://blogs.msdn.com/mattm/archive/2008/11/23/lookup-pattern-incremental-persistent-cache-updates.aspx</link><pubDate>Sun, 23 Nov 2008 23:16:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9134966</guid><dc:creator>mmasson</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9134966.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9134966</wfw:commentRss><description>&lt;P&gt;The &lt;A href="http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-using-the-cache-connection-manager.aspx" mce_href="http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-using-the-cache-connection-manager.aspx"&gt;Cache Transform&lt;/A&gt; isn't currently able to update (i.e. append to) an existing persistent cache file. This pattern presents a way to incrementally build up your lookup cache if your data flow process is responsible for adding new rows to your reference table. For an alternative approach, please see the Other Resources section at the end of this post.&lt;/P&gt;
&lt;P&gt;The process is split up across multiple packages. Because a Cache Connection Manager reads its cache file once, and keeps the cache in memory until package execution is complete, you can't read and update a cache file in the same package. &lt;/P&gt;
&lt;H3&gt;Control Package&lt;/H3&gt;
&lt;P&gt;The control package uses a script task to check if the cache file already exists. If it doesn't, it executes the "Create Cache" package. Once the cache is there, the main data flow package which performs your general ETL logic is executed. The "Data Flow" package will write out any new rows for the reference cache to a temporary file, and update a variable in the control package to indicate how many rows were written out. The control package checks if any rows were written, and if so, runs the "Update Cache" package to update the persistent cache file.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_2.png" mce_href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_2.png"&gt;&lt;IMG style="BORDER-TOP-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-RIGHT-WIDTH: 0px" height=355 alt=image src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_thumb.png" width=640 border=0 mce_src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_thumb.png"&gt;&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Create Cache&lt;/H3&gt;
&lt;P&gt;The cache creation package reads the source reference data, and then writes it to two separate locations. First it uses the Cache Transform to create a persistent cache, and then it writes the same data out to a RAW file. This RAW file will be incrementally updated by the Data Flow package to store the latest data for our reference set.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_6.png" mce_href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_6.png"&gt;&lt;IMG style="BORDER-TOP-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-RIGHT-WIDTH: 0px" height=319 alt=image src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_thumb_2.png" width=640 border=0 mce_src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_thumb_2.png"&gt;&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Data Flow&lt;/H3&gt;
&lt;P&gt;In the main ETL package, we use a &lt;A href="http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-pattern-key-generation.aspx" mce_href="http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-pattern-key-generation.aspx"&gt;key generation pattern&lt;/A&gt; to generate and insert new values into our reference table. Instead of rejoining the main flow right away, we multicast it out to a Row Count, followed by a RAW File Destination. &lt;/P&gt;
&lt;P&gt;The Row Count writes to a local variable. After this data flow, we'll use a script task to copy this variable's value over to a variable in the parent package. You can see &lt;A href="http://blogs.conchango.com/jamiethomson/archive/2005/09/01/2096.aspx" mce_href="http://blogs.conchango.com/jamiethomson/archive/2005/09/01/2096.aspx"&gt;Jamie Thomson's post&lt;/A&gt; for more about how this works.&lt;/P&gt;
&lt;P&gt;The RAW File Destination is able to append to an existing RAW File. Here we use it to add the new rows to the "incremental update" file we created in the "Create Cache" package.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_8.png" mce_href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_8.png"&gt;&lt;IMG style="BORDER-TOP-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-RIGHT-WIDTH: 0px" height=370 alt=image src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_thumb_3.png" width=640 border=0 mce_src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_thumb_3.png"&gt;&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Update Cache&lt;/H3&gt;
&lt;P&gt;If the main ETL package wrote any new rows to the incremental update RAW file, the control package will run the Update Cache package. This package simply reads the incremental update file (which will contain the entire cache, since it's simply being appended to each time), and recreates the cache file.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_10.png" mce_href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_10.png"&gt;&lt;IMG style="BORDER-TOP-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-RIGHT-WIDTH: 0px" height=270 alt=image src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_thumb_4.png" width=640 border=0 mce_src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternIncrementalpersistentcacheu_950C/image_thumb_4.png"&gt;&lt;/A&gt; &lt;/P&gt;
&lt;P&gt;The sample packages used in this post can be downloaded from my SkyDrive share. &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;IFRAME style="BORDER-RIGHT: #dde5e9 1px solid; PADDING-RIGHT: 0px; BORDER-TOP: #dde5e9 1px solid; PADDING-LEFT: 0px; PADDING-BOTTOM: 0px; MARGIN: 3px; BORDER-LEFT: #dde5e9 1px solid; WIDTH: 240px; PADDING-TOP: 0px; BORDER-BOTTOM: #dde5e9 1px solid; HEIGHT: 66px; BACKGROUND-COLOR: #ffffff" marginWidth=0 marginHeight=0 src="http://cid-2aeb3aa8bb4bd9fd.skydrive.live.com/embedrowdetail.aspx/Public/IncrementalCacheUpdate.zip" frameBorder=0 scrolling=no mce_src="http://cid-2aeb3aa8bb4bd9fd.skydrive.live.com/embedrowdetail.aspx/Public/IncrementalCacheUpdate.zip"&gt;&lt;/IFRAME&gt;&lt;/P&gt;
&lt;H3&gt;Other Resources&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;John Welch has an &lt;A href="http://agilebi.com/cs/blogs/jwelch/archive/2008/05/06/ssis-2008-incrementally-updating-the-lookup-cache-file.aspx" mce_href="http://agilebi.com/cs/blogs/jwelch/archive/2008/05/06/ssis-2008-incrementally-updating-the-lookup-cache-file.aspx"&gt;alternative approach to incremental updates&lt;/A&gt; &lt;/LI&gt;&lt;/UL&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9134966" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>Lookup Pattern: Case Insensitive</title><link>http://blogs.msdn.com/mattm/archive/2008/11/23/lookup-pattern-case-insensitive.aspx</link><pubDate>Sun, 23 Nov 2008 20:53:24 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9134725</guid><dc:creator>mmasson</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9134725.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9134725</wfw:commentRss><description>&lt;p&gt;The Lookup Transform does case sensitive string comparisons. This means that you need to do a little bit of special handling to get it to work in a case insensitive way.&lt;/p&gt;  &lt;p&gt;In most cases (and especially if you're using a case sensitive collation on the database/table that holds your reference data), you'll want to add an upper case version of your index column(s) to your data flow (or replace your current column(s) if you don't need it later downstream). This can be done directly at the source, through a Character Map transform, or a Derived Column transform (both transforms should perform about the same). &lt;/p&gt;  &lt;p&gt;You'll have to use the UPPER() function on the columns in your lookup transform's SQL statement as well to get them to match. 
You'll want to use a &lt;a href="http://blogs.msdn.com/mattm/archive/2008/10/18/lookup-cache-modes.aspx"&gt;Full cache mode&lt;/a&gt; here, because using a function (like UPPER()) on your index columns will prevent the database from using any indexes it has created on the column. &lt;/p&gt;  &lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternCaseInsensitive_12D04/image_6.png"&gt;&lt;img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="375" alt="image" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternCaseInsensitive_12D04/image_thumb_2.png" width="662" border="0" /&gt;&lt;/a&gt; &lt;/p&gt;  &lt;p&gt;This is a case where you can also use the new &lt;a href="http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-using-the-cache-connection-manager.aspx"&gt;cache connection manager&lt;/a&gt; to pre-create your case insensitive cache.&lt;/p&gt;  &lt;p&gt;If your lookup columns are indexed, your database has a case insensitive collation, and you're not going to be hitting a large percentage of your reference table, you might want to take an alternate approach. Switch the cache mode to Partial and remove the UPPER() function from your SQL statement in the Lookup. This will allow the query engine to make use of your table indexes. 
&lt;/p&gt;  &lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternCaseInsensitive_12D04/image_2.png"&gt;&lt;img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="375" alt="case insensitive collation" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternCaseInsensitive_12D04/image_thumb.png" width="640" border="0" /&gt;&lt;/a&gt;&amp;#160;&lt;/p&gt;  &lt;h3&gt;Other Resources&lt;/h3&gt;  &lt;p&gt;See a related post and discussion about case insensitive lookups on &lt;a href="http://blogs.conchango.com/jamiethomson/archive/2008/02/12/SSIS_3A00_-Case_2D00_sensitivity-in-Lookup-component.aspx"&gt;Jamie Thomson's blog&lt;/a&gt;. &lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9134725" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>Lookup Pattern: Cascading</title><link>http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-pattern-cascading.aspx</link><pubDate>Sun, 23 Nov 2008 07:46:34 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9133432</guid><dc:creator>mmasson</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9133432.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9133432</wfw:commentRss><description>&lt;p&gt;The cascading lookup pattern uses two lookup transforms with different &lt;a href="http://blogs.msdn.com/mattm/archive/2008/10/18/lookup-cache-modes.aspx"&gt;cache modes&lt;/a&gt;. A common use of this pattern is when your data flow is inserting new rows into your reference table. &lt;/p&gt;  &lt;p&gt;The first lookup in the chain is set to Full cache mode. Since it creates its cache before the data flow begins, it will only have the keys that exist before the package was executed. 
&lt;/p&gt;  &lt;p&gt;A second lookup is hooked up to the No Match output of the first, using a Partial Cache mode. This one will pick up any rows that have been added since the data flow began. We hook up any logic needed to generate the key, or to insert the row into the database, to the No Match output of the second lookup.&lt;/p&gt;  &lt;p&gt;Note, you don&amp;#8217;t really need that first lookup &amp;#8211; you could accomplish the same thing with a single lookup in a partial cache mode. But if you&amp;#8217;re processing a good number of rows, and a large number of your keys already exist, the first lookup will improve your overall performance.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternCascading_11F2E/image_2.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="443" alt="image" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternCascading_11F2E/image_thumb.png" width="640" border="0" /&gt;&lt;/a&gt; &lt;/p&gt;  &lt;p&gt;Make sure that you do not enable the Miss Cache (&amp;quot;Enable cache for rows with no matching entries&amp;quot; on the &lt;a href="http://msdn.microsoft.com/en-us/library/ms189962.aspx"&gt;advanced options page&lt;/a&gt;). If you do, the partial cache lookup will remember the miss and won't go back to the database the next time that key value comes in. 
&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9133432" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>Lookup Pattern: Upsert</title><link>http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-pattern-upsert.aspx</link><pubDate>Sun, 23 Nov 2008 07:31:46 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9133401</guid><dc:creator>mmasson</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9133401.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9133401</wfw:commentRss><description>&lt;p&gt;This is a pretty basic pattern where we use a lookup to determine whether we need to update an existing row, or insert a new one. The lookup checks if a key or set of values exists. If the key isn't found, the row is sent to an OLEDB Destination for the insert. If it is found, it is sent to an OLEDB Command to do the update. &lt;/p&gt;  &lt;p&gt;Note, the OLEDB Command transform operates on a row by row basis - so a separate SQL statement will be executed for every row going in. As such, the OLEDB Command can be very slow if you're processing a large number of rows. An alternate approach is to stage the data, and either update your target table using the &lt;a href="http://msdn.microsoft.com/en-us/library/bb510625.aspx"&gt;MERGE statement&lt;/a&gt;, or an UPDATE ... 
FROM batch command.&lt;/p&gt;  &lt;p&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternUpsert_11BAE/image_2.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="275" alt="image" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternUpsert_11BAE/image_thumb.png" width="693" border="0" /&gt;&lt;/a&gt; &lt;/p&gt;  &lt;p&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;You can also check out the &lt;a href="http://www.codeplex.com/SQLSrvIntegrationSrv/Release/ProjectReleases.aspx?ReleaseId=19048"&gt;MERGE Destination&lt;/a&gt; or &lt;a href="http://www.codeplex.com/ssisctc/Wiki/View.aspx?title=Batch%20Destination&amp;amp;referringTitle=Home"&gt;Batch Destination&lt;/a&gt; available on Codeplex. John Welch (author of the Batch Destination) has a &lt;a href="http://agilebi.com/cs/blogs/jwelch/archive/2008/11/07/batch-destination-and-the-merge-destination.aspx"&gt;blog post&lt;/a&gt; which compares the two. &lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9133401" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>Lookup Pattern: Key Generation</title><link>http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-pattern-key-generation.aspx</link><pubDate>Sun, 23 Nov 2008 07:28:54 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9133397</guid><dc:creator>mmasson</dc:creator><slash:comments>4</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9133397.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9133397</wfw:commentRss><description>&lt;p&gt;This pattern is used when you have transformation logic which relies on a key which might not already exist. 
If the lookup fails to find the key, a new key is generated with a script task so it can be used later on downstream. Optionally, the key could be inserted immediately into the reference table following the script task (multicast to send to an OLEDB Destination).&lt;/p&gt;  &lt;p&gt;The way you generate the key will vary depending on the situation. If you don't need to worry about concurrency issues, you could use an Execute SQL Task in the control flow to retrieve the next or current maximum key value, and store it in a variable. You&amp;#8217;d then increment it each time you go through the key generation process. &lt;/p&gt;  &lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternKeyGeneration_11956/image_4.png"&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="480" alt="image" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatternKeyGeneration_11956/image_thumb_1.png" width="331" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9133397" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>Lookup Patterns</title><link>http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-patterns.aspx</link><pubDate>Sun, 23 Nov 2008 07:28:29 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9133394</guid><dc:creator>mmasson</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9133394.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9133394</wfw:commentRss><description>&lt;p&gt;From the Lookup presentation I put together for the MS BI conference in October, here is a series of posts which describe different patterns for using the Lookup transform. 
&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;&lt;a href="http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-pattern-key-generation.aspx"&gt;Key Generation&lt;/a&gt; &lt;/li&gt;    &lt;li&gt;&lt;a href="http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-pattern-upsert.aspx"&gt;Upsert&lt;/a&gt; &lt;/li&gt;    &lt;li&gt;&lt;a href="http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-pattern-cascading.aspx"&gt;Cascading&lt;/a&gt; &lt;/li&gt;    &lt;li&gt;&lt;a href="http://blogs.msdn.com/mattm/archive/2008/11/23/lookup-pattern-case-insensitive.aspx"&gt;Case Insensitive&lt;/a&gt; &lt;/li&gt;    &lt;li&gt;&lt;a href="http://blogs.msdn.com/mattm/archive/2008/11/23/lookup-pattern-incremental-persistent-cache-updates.aspx"&gt;Incremental Cache Update&lt;/a&gt; &lt;/li&gt;    &lt;li&gt;&lt;a href="http://blogs.msdn.com/mattm/archive/2008/11/25/lookup-pattern-range-lookups.aspx"&gt;Range Lookups&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;Each pattern will contain a diagram with Upstream and Downstream images.&lt;/p&gt;  &lt;p&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="50" alt="image" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupPatterns_11A2E/image_3.png" width="324" border="0" /&gt;&lt;/p&gt;  &lt;p&gt;These represent any collection of source / transform / destination components that could appear before/after the Lookup section described in the pattern.&amp;#160; &lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9133394" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>Lookup - Using the cache connection manager</title><link>http://blogs.msdn.com/mattm/archive/2008/11/22/lookup-using-the-cache-connection-manager.aspx</link><pubDate>Sun, 23 Nov 2008 06:51:00 GMT</pubDate><guid 
isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9133330</guid><dc:creator>mmasson</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9133330.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9133330</wfw:commentRss><description>&lt;P&gt;Using the &lt;A href="http://msdn.microsoft.com/en-us/library/bb895290.aspx" mce_href="http://msdn.microsoft.com/en-us/library/bb895290.aspx"&gt;Cache Connection Manager&lt;/A&gt; (CCM) is a new option for the Lookup transform in SSIS 2008. The CCM provides an alternative to doing lookups against a database table using an OLEDB connection. &lt;/P&gt;
&lt;P&gt;This post will suggest some best practices for using the Cache Connection Manager, and illustrate a couple of common scenarios that the CCM was meant to handle.&lt;/P&gt;
&lt;P&gt;For more information on the Cache Connection Manager and how to make use of it with the Lookup Transform, please see the books online entries linked at the end of this post.&lt;/P&gt;
&lt;H3&gt;Best practices&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Reuse the cache to reduce database load&lt;/LI&gt;
&lt;LI&gt;Share the cache between lookups to reduce memory usage&lt;/LI&gt;
&lt;LI&gt;Using the CCM is not always faster than OLEDB - the cost of disk access can outweigh the benefits of pre-creating the cache&lt;/LI&gt;
&lt;LI&gt;The cache is essentially clear text - do not store sensitive data inside of the cache&lt;/LI&gt;
&lt;LI&gt;In terms of &lt;A href="http://blogs.msdn.com/mattm/archive/2008/10/18/lookup-cache-modes.aspx" mce_href="http://blogs.msdn.com/mattm/archive/2008/10/18/lookup-cache-modes.aspx"&gt;Cache Modes&lt;/A&gt; and the best practices that surround them, using a cache connection manager is equivalent to using a Full Cache mode&lt;/LI&gt;&lt;/UL&gt;
&lt;H3&gt;Reducing database and memory usage&lt;/H3&gt;
&lt;P&gt;If your reference database is remote, or under heavy load, consider using the Cache Connection Manager instead of an OLEDB connection. &lt;/P&gt;
&lt;P&gt;Once a cache is used (or created) in an SSIS package, it will be kept in memory until the package has finished executing. The cache can be reused across multiple data flows, and shared between multiple lookups in the same data flow. It can also be persisted to disk, and reused across package executions.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_4.png" mce_href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_4.png"&gt;&lt;IMG style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px" height=275 alt=image src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_thumb_1.png" width=367 border=0 mce_src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_thumb_1.png"&gt;&lt;/A&gt; &lt;/P&gt;
&lt;H3&gt;Create the cache from any SSIS data source&lt;/H3&gt;
&lt;P&gt;To use the CCM, you need to create a lookup cache in a separate data flow using the &lt;A href="http://msdn.microsoft.com/en-us/library/bb895264.aspx" mce_href="http://msdn.microsoft.com/en-us/library/bb895264.aspx"&gt;Cache Transformation&lt;/A&gt;. Because the cache is created in a regular data flow, you can now use any data source that SSIS can connect to as a source for your lookup reference (flat file, Excel, SAP, etc).&lt;/P&gt;
&lt;P&gt;With SSIS 2005, a common approach when using non-OLEDB accessible lookup sources was to stage the data first. If this data is only being used for your lookups, consider creating a persisted cache instead. &lt;/P&gt;
&lt;P&gt;&lt;A href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_13.png" mce_href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_13.png"&gt;&lt;IMG style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px" height=275 alt=image src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_thumb_5.png" width=344 border=0 mce_src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_thumb_5.png"&gt;&lt;/A&gt; &lt;/P&gt;
&lt;H3&gt;Cache the most common values&lt;/H3&gt;
&lt;P&gt;Sometimes you might have a large reference table, but the majority of your incoming data only uses a small portion of it. For example, you have a very large customer list, and the top 5% of your customers generate 90% of your transactions. In a scenario like this, you could pre-cache the information of your most active customers. Your data flow could use a cascading lookup pattern where you have one lookup which uses the cache, with its No Match output falling through to a second lookup running in a partial cache mode to hit the database to handle the remaining 10% of rows.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_14.png" mce_href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_14.png"&gt;&lt;IMG style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px" height=275 alt=image src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_thumb_6.png" width=440 border=0 mce_src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupUsingthecacheconnectionmanager_F61C/image_thumb_6.png"&gt;&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
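&lt;P&gt;The fall-through logic of this pattern can be sketched outside of SSIS. The following is a minimal, hypothetical C# console program that uses plain in-memory dictionaries rather than the SSIS API. The class name, customer names and key values are made up for illustration, and the customerTable dictionary simply stands in for the reference table in the database.&lt;/P&gt;
&lt;PRE class="code"&gt;using System;
using System.Collections.Generic;

static class CascadingLookupSketch
{
    static void Main()
    {
        // Stand-in for the full reference table in the database (hypothetical data).
        var customerTable = new Dictionary&amp;lt;int, string&amp;gt;
        {
            { 1, "Contoso" }, { 2, "Fabrikam" }, { 3, "Northwind" }, { 4, "Tailspin" }
        };

        // First lookup: a pre-created cache that only holds the most active customers.
        var hotCache = new Dictionary&amp;lt;int, string&amp;gt;
        {
            { 1, "Contoso" }, { 2, "Fabrikam" }
        };

        // Second lookup: a partial cache that starts empty and fills up as rows fall through.
        var partialCache = new Dictionary&amp;lt;int, string&amp;gt;();

        int databaseQueries = 0;
        int[] incomingKeys = { 1, 2, 1, 3, 1, 3, 2, 9 };

        foreach (int key in incomingKeys)
        {
            string name;
            if (hotCache.TryGetValue(key, out name))
            {
                // Match output of the first lookup.
                Console.WriteLine(key + ": matched in pre-created cache: " + name);
            }
            else if (partialCache.TryGetValue(key, out name))
            {
                // No Match output of the first lookup, answered from the partial cache.
                Console.WriteLine(key + ": matched in partial cache: " + name);
            }
            else
            {
                // Partial cache miss: go to the database (simulated here).
                databaseQueries++;
                if (customerTable.TryGetValue(key, out name))
                {
                    partialCache[key] = name;
                    Console.WriteLine(key + ": matched in database: " + name);
                }
                else
                {
                    // No Match output of the second lookup. The miss is not remembered.
                    Console.WriteLine(key + ": no match");
                }
            }
        }

        Console.WriteLine("Database queries: " + databaseQueries);
    }
}
&lt;/PRE&gt;
&lt;P&gt;With the eight sample keys, only two rows reach the database: the first occurrence of key 3, and key 9, which has no match. Because the sketch does not remember misses, another row with key 9 would query the database again, which mirrors a partial cache with the Miss Cache disabled.&lt;/P&gt;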
&lt;H3&gt;Resources&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/ms141821.aspx" mce_href="http://msdn.microsoft.com/en-us/library/ms141821.aspx"&gt;Lookup Transformation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/bb895290.aspx" mce_href="http://msdn.microsoft.com/en-us/library/bb895290.aspx"&gt;Cache Connection Manager&lt;/A&gt; &lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://msdn.microsoft.com/en-us/library/bb895289.aspx" mce_href="http://msdn.microsoft.com/en-us/library/bb895289.aspx"&gt;How to: Implement a Lookup Transformation in Full Cache Mode Using the Cache Connection Manager Transformation&lt;/A&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9133330" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>Calculating the size of your Lookup cache</title><link>http://blogs.msdn.com/mattm/archive/2008/10/18/calculating-the-size-of-your-lookup-cache.aspx</link><pubDate>Sun, 19 Oct 2008 02:01:15 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9005616</guid><dc:creator>mmasson</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9005616.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9005616</wfw:commentRss><description>&lt;p&gt;Good news - a couple of Information log events were added to the Lookup Transform in SQL 2008 to help you better understand your lookup cache. When you're running in &lt;a href="http://blogs.msdn.com/mattm/archive/2008/10/18/lookup-cache-modes.aspx"&gt;Full Cache mode&lt;/a&gt;, the message will tell you the number of rows in the cache, its total size, and how long it took to create it. When running in partial cache mode you don't get the cache size, but you do get the number of database hits vs. number of cache hits, which can be helpful in determining whether you should be using a full cache lookup instead.&lt;/p&gt;  &lt;p&gt;Here are some equations you can use to estimate the amount of memory a cache will use (I say estimate because given memory boundaries, pages, etc, it will always vary). There are separate (but similar) equations for Full and Partial cache modes, as they handle things a little differently internally. 
&lt;/p&gt;  &lt;p&gt;For each row, in bytes: &lt;/p&gt;  &lt;h3&gt;Full cache&lt;/h3&gt;  &lt;p&gt;&lt;font color="#008000"&gt;&amp;lt;Row size&amp;gt;&lt;/font&gt; + 20 + (&lt;font color="#ff8000"&gt;4 * # of used columns&lt;/font&gt;)&lt;/p&gt;  &lt;h3&gt;Partial cache&lt;/h3&gt;  &lt;p&gt;&lt;font color="#008000"&gt;&amp;lt;Row size&amp;gt;&lt;/font&gt; + 36 + (&lt;font color="#ff8000"&gt;4 * # of columns in reference query&lt;/font&gt;)&lt;/p&gt;  &lt;p&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;Row size is the total data length of the index and values columns. &lt;/p&gt;  &lt;p&gt;The 20/36 number is a constant representing the size of the hash used for comparisons.&lt;/p&gt;  &lt;p&gt;The last part is 4 (technically, the size of an int) times the number of used columns for full cache, and total number of columns in the reference query for partial cache. &lt;/p&gt;  &lt;h3&gt;Example&lt;/h3&gt;  &lt;p&gt;Our lookup query is:&lt;/p&gt;  &lt;p&gt;&lt;font face="Courier New" size="2"&gt;select ProductKey, ProductName from [Products]&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;ProductKey is an int - 4 bytes&lt;/p&gt;  &lt;p&gt;ProductName is an nvarchar(25) - which comes out to 52 bytes (two bytes per nchar, + 2 for null)&lt;/p&gt;  &lt;p&gt;Plugging this into our Partial cache equation, we get:&lt;/p&gt;  &lt;p&gt;&lt;font color="#008000"&gt;56&lt;/font&gt; + 36 +&lt;font color="#ff8000"&gt; 8 &lt;/font&gt;&lt;font color="#000000"&gt;= 100 bytes per row&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font color="#000000"&gt;&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font color="#000000"&gt;If our reference table had 100,000 rows, we'd need ~10mb to hold the entire data set.&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font color="#000000"&gt;(100,000 * 100) / 1024 / 1024 = ~10mb&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font color="#000000"&gt;(divide by 1024 twice to go from bytes to kilobytes, kilobytes to megabytes)&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;In my example, I've edited my 
lookup query to contain only the columns I need - ProductKey, which is my index column (the column I'm matching on), and ProductName, which is my value column (the column I'm adding to my data flow). If I had done a select * (or picked a table/view name from the drop down in the UI, which results in a select *), I'd be increasing the number of columns in my reference query, and my partial cache row size would change. I'd end up with an additional 4 bytes for every column in my query, even though they aren't being used directly in the lookup.&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9005616" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>Lookup cache modes</title><link>http://blogs.msdn.com/mattm/archive/2008/10/18/lookup-cache-modes.aspx</link><pubDate>Sat, 18 Oct 2008 23:54:16 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9005417</guid><dc:creator>mmasson</dc:creator><slash:comments>7</slash:comments><comments>http://blogs.msdn.com/mattm/comments/9005417.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=9005417</wfw:commentRss><description>&lt;p&gt;&lt;em&gt;Over the past couple of months I've been putting together a presentation on the Lookup Transform. I presented most of it as a &lt;/em&gt;&lt;a href="http://blogs.msdn.com/mattm/archive/2008/10/01/presenting-at-the-microsoft-bi-conference.aspx"&gt;&lt;em&gt;Chalk Talk at the MS BI Conference&lt;/em&gt;&lt;/a&gt;&lt;em&gt; last week, and from the evaluation scores, it seems like it was pretty well received. I'll be splitting up some of its content into a series of blog posts over the next little while. 
If you're interested in seeing the whole talk, it will also be shown at the &lt;/em&gt;&lt;a href="http://blogs.msdn.com/mattm/archive/2008/10/14/sswug-virtual-conference-interview.aspx"&gt;&lt;em&gt;SSWUG Virtual Conference&lt;/em&gt;&lt;/a&gt;&lt;em&gt; in November.&lt;/em&gt;&lt;/p&gt;  &lt;p&gt;&lt;em&gt;----&lt;/em&gt;&lt;/p&gt;  &lt;p&gt;The most important setting of the Lookup Transform is the Cache Mode - it can greatly impact your data flow performance, and affects overall package design. Because of its importance, we made it the first thing you see in the new 2008 Lookup UI. I feel this is a great improvement over 2005, where the cache mode was abstracted away - see &lt;a href="http://blogs.msdn.com/michen/archive/2007/10/03/ssis-lookups-modes.aspx"&gt;Michael Entin's post&lt;/a&gt; for more details. &lt;/p&gt;  &lt;p&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="461" alt="2008 Lookup UI" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/Lookupcachemodes_74DE/lk_ui_modes_3.png" width="534" border="0" /&gt;&lt;/p&gt;  &lt;p&gt;This blog post describes the three cache modes, how they work, and best practices around using them. Note that these cache modes apply when you're using an OLE DB connection manager - using the new Cache connection manager is similar to using a Full Cache mode.&lt;/p&gt;  &lt;h2&gt;Full Cache&lt;/h2&gt;  &lt;p&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="53" alt="lk_fullcache" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/Lookupcachemodes_74DE/lk_fullcache_3.png" width="240" border="0" /&gt;&lt;/p&gt;  &lt;p&gt;The default cache mode for the lookup is Full cache. In this mode, the database is queried once during the pre-execute phase of the data flow. The entire reference set is pulled into memory. 
This approach uses the most memory, and adds additional startup time for your data flow, as all of the caching takes place before any rows are read from the data flow source(s). The trade off is that the lookup operations will be very fast during execution. One thing to note is that the lookup will not swap memory out to disk, so your data flow will fail if you run out of memory. &lt;/p&gt;  &lt;h3&gt;When to use this cache mode&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;When you're accessing a large portion of your reference set &lt;/li&gt;    &lt;li&gt;When you have a small reference table &lt;/li&gt;    &lt;li&gt;When your database is remote or under heavy load, and you want to reduce the number of queries sent to the server &lt;/li&gt; &lt;/ul&gt;  &lt;h3&gt;Keys to using this cache mode&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;Ensure that you have enough memory to fit your cache &lt;/li&gt;    &lt;li&gt;Ensure that you don't need to pick up any changes made to the reference table      &lt;ul&gt;       &lt;li&gt;Since the lookup query is executed before the data flow begins, any changes made to the reference table during the data flow execution will not be reflected in the cache &lt;/li&gt;     &lt;/ul&gt;   &lt;/li&gt; &lt;/ul&gt;  &lt;h2&gt;Partial Cache&lt;/h2&gt;  &lt;p&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="53" alt="lk_partialcache" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/Lookupcachemodes_74DE/lk_partialcache_3.png" width="240" border="0" /&gt;&lt;/p&gt;  &lt;p&gt;In this mode, the lookup cache starts off empty at the beginning of the data flow. When a new row comes in, the lookup transform checks its cache for the matching values. If no match is found, it queries the database. If the match is found at the database, the values are cached so they can be used the next time a matching row comes in. 
&lt;/p&gt;  &lt;p&gt;Since no caching is done during the pre-execute phase, the startup time using a partial cache mode is less than it would be for a full cache. However, your lookup operations would be slower, as you will most likely be hitting the database more often.&lt;/p&gt;  &lt;p&gt;When running in partial cache mode, you can configure the maximum size of the cache. This setting can be found on the Advanced Options page of the lookup UI. There are actually two separate values - one for 32bit execution, and one for 64bit. If the cache gets filled up, the lookup transform will start dropping the least seen rows from the cache to make room for the new ones. &lt;/p&gt;  &lt;p&gt;In 2008 there is a new Miss Cache feature that allows you to allocate a certain percentage of your cache to remembering rows that had no match in the database. This is useful in a lot of situations, as it prevents the transform from querying the database multiple times for values that don't exist. However, there are cases where you don't want to remember the misses - for example, if your data flow is adding new rows to your reference table. The Miss Cache is disabled by default. 
&lt;/p&gt;  &lt;h3&gt;When to use this cache mode&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;When you're processing a small number of rows and it's not worth the time to charge the full cache &lt;/li&gt;    &lt;li&gt;When you have a large reference table &lt;/li&gt;    &lt;li&gt;When your data flow is adding new rows to your reference table &lt;/li&gt;    &lt;li&gt;When you want to limit the size of your reference table by modifying the query with parameters from the data flow &lt;/li&gt; &lt;/ul&gt;  &lt;h3&gt;Keys to using this cache mode&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;Ensure that your &lt;a href="http://blogs.msdn.com/mattm/archive/2008/10/18/calculating-the-size-of-your-lookup-cache.aspx"&gt;cache size setting is large enough&lt;/a&gt; &lt;/li&gt;    &lt;li&gt;Use the Miss Cache appropriately &lt;/li&gt;    &lt;li&gt;If the cache size isn't large enough for your rows, sort on lookup index columns if possible &lt;/li&gt; &lt;/ul&gt;  &lt;h2&gt;No Cache&lt;/h2&gt;  &lt;p&gt;&lt;img style="border-top-width: 0px; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px" height="53" alt="lk_nocache" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/Lookupcachemodes_74DE/lk_nocache_3.png" width="240" border="0" /&gt; &lt;/p&gt;  &lt;p&gt;As the name implies, in this mode the lookup transform doesn't maintain a lookup cache (actually, not quite true - we keep the last match around, as the memory has already been allocated). 
In most situations, this means that you'll be hitting the database for every row.&lt;/p&gt;  &lt;h3&gt;When to use this cache mode&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;When you're processing a small number of rows &lt;/li&gt;    &lt;li&gt;When you have non-repeating lookup indexes &lt;/li&gt;    &lt;li&gt;When your reference table is changing (inserts, updates, deletes) &lt;/li&gt;    &lt;li&gt;When you have severe memory limitations &lt;/li&gt; &lt;/ul&gt;  &lt;h3&gt;Keys to using this cache mode&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;Ensure that the partial cache mode isn't the better choice &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;---&lt;/p&gt;  &lt;p&gt;To find out more on how to implement the look up transform, please see these books online entries:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;&lt;a href="http://msdn.microsoft.com/en-us/library/bb895314.aspx"&gt;How to: Implement a Lookup Transformation in Full Cache Mode Using the OLE DB Connection Manager&lt;/a&gt; &lt;/li&gt;    &lt;li&gt;&lt;a href="http://msdn.microsoft.com/en-us/library/bb895289.aspx"&gt;How to: Implement a Lookup Transformation in Full Cache Mode Using the Cache Connection Manager&lt;/a&gt; &lt;/li&gt;    &lt;li&gt;&lt;a href="http://msdn.microsoft.com/en-us/library/ms137820.aspx"&gt;How to: Implement a Lookup in No Cache or Partial Cache Mode&lt;/a&gt; &lt;/li&gt; &lt;/ul&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9005417" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>Presenting at the Microsoft BI Conference</title><link>http://blogs.msdn.com/mattm/archive/2008/10/01/presenting-at-the-microsoft-bi-conference.aspx</link><pubDate>Thu, 02 Oct 2008 00:27:14 GMT</pubDate><guid 
isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8972227</guid><dc:creator>mmasson</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/mattm/comments/8972227.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=8972227</wfw:commentRss><description>&lt;p&gt;On Monday (October 6th) I’ll be doing a chalk talk presentation at the &lt;a href="http://www.msbiconference.com/ "&gt;MS BI Conference&lt;/a&gt;. The topic is &lt;a href="http://www.msbiconference.com/pages/members/sessiondetails.aspx?sid=504"&gt;Advanced Scenarios with the Lookup Transform&lt;/a&gt;. Here is the abstract:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;Performing lookups is one of the most common operations in the ETL process, and doing them incorrectly can severely affect the performance of your data load. In this talk you’ll learn best practices and design patterns for using the lookup component in SQL Server 2008 Integration Services, and how to take advantage of the new lookup features. If you’ve ever wondered about the differences between full and partial caches, the advantages of cascading lookups, and how to do ranged lookups, you won’t want to miss this talk.&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;I’ll also be hanging out at the SSIS / SQL BI booths throughout the day. 
Come by and say hello!&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8972227" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category><category domain="http://blogs.msdn.com/mattm/archive/tags/Conferences/default.aspx">Conferences</category></item><item><title>Enum value for Lookup’s NoMatchBehavior property</title><link>http://blogs.msdn.com/mattm/archive/2008/08/22/enum-value-for-lookup-s-nomatchbehavior-property.aspx</link><pubDate>Sat, 23 Aug 2008 02:17:37 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8889108</guid><dc:creator>mmasson</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/mattm/comments/8889108.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=8889108</wfw:commentRss><description>&lt;p&gt;&lt;a href="http://msdn.microsoft.com/en-us/library/ms136014.aspx#lookup"&gt;The lookup transform has a new property in SQL 2008&lt;/a&gt; which controls how to handle rows with no matches – NoMatchBehavior. It has two values: “Treat rows with no matching entries as errors” and “Send rows with no matching entries to the no match output”. &lt;/p&gt;  &lt;p&gt;From the BOL entry:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;When the property is set to &lt;b&gt;Treat rows with no matching entries as errors&lt;/b&gt;, the rows without matching entries are treated as errors. You can specify what should happen when this type of error occurs by using the &lt;b&gt;Error Output&lt;/b&gt; page of the &lt;b&gt;Lookup Transformation Editor&lt;/b&gt; dialog box. &lt;/p&gt;    &lt;p&gt;When the property is set to &lt;b&gt;Send rows with no matching entries to the no match output&lt;/b&gt;, the rows are not treated as errors. &lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;In the Properties window, this shows up as a drop-down list. 
However, it looks like we forgot to document the actual enumeration values you would use if you are setting the property programmatically. &lt;/p&gt;  &lt;p&gt;The enum looks like this:&lt;/p&gt;  &lt;pre class="code"&gt;&lt;span style="color: blue"&gt;public enum &lt;/span&gt;&lt;span style="color: #2b91af"&gt;NoMatchPropertyEnum &lt;/span&gt;: &lt;span style="color: blue"&gt;int
&lt;/span&gt;{
    TreatAsError = 0,
    SendToNoMatchOutput = 1
}&lt;/pre&gt;
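
&lt;p&gt;As a rough sketch of how these values might be used from code, the snippet below sets the property through the component's design-time wrapper. The helper class and method names are made up for illustration, the enum is redeclared locally on the assumption that it isn't exposed publicly, and the cast to int assumes the property is stored as an integer. It also assumes a reference to Microsoft.SqlServer.DTSPipelineWrap and a lookup component that has already been added to the data flow and initialized.&lt;/p&gt;

&lt;pre class="code"&gt;using Microsoft.SqlServer.Dts.Pipeline.Wrapper;

// Local copy of the values listed above (assumption: the real enum is not public).
public enum NoMatchPropertyEnum : int
{
    TreatAsError = 0,
    SendToNoMatchOutput = 1
}

// Hypothetical helper, not part of the SSIS API.
public static class LookupNoMatchHelper
{
    // "lookup" is the metadata of an existing Lookup component in a data flow.
    public static void SendUnmatchedRowsToNoMatchOutput(IDTSComponentMetaData100 lookup)
    {
        // Get the design-time interface for the component.
        CManagedComponentWrapper instance = lookup.Instantiate();

        // Pass the numeric value of the enum member (assumed to be what the property expects).
        instance.SetComponentProperty("NoMatchBehavior", (int)NoMatchPropertyEnum.SendToNoMatchOutput);
    }
}&lt;/pre&gt;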
&lt;a href="http://11011.net/software/vspaste"&gt;&lt;/a&gt;

&lt;p&gt;This info should appear in the BOL docs the next time they are updated.&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8889108" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item><item><title>SSWUG Business Intelligence Virtual Conference</title><link>http://blogs.msdn.com/mattm/archive/2008/07/30/sswug-business-intelligence-virtual-conference.aspx</link><pubDate>Thu, 31 Jul 2008 00:00:30 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8792458</guid><dc:creator>mmasson</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/mattm/comments/8792458.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=8792458</wfw:commentRss><description>&lt;p&gt;I’ve been invited to speak at the &lt;a href="http://www.vconferenceonline.com/business-intelligence/"&gt;SSWUG BI Virtual Conference&lt;/a&gt; in September. Like &lt;a href="http://agilebi.com/cs/blogs/jwelch/archive/2008/07/20/presenting-at-the-sswug-virtual-bi-conference.aspx"&gt;John Welch&lt;/a&gt; mentioned in his blog, the &lt;a href="http://www.vconferenceonline.com/business-intelligence/speakers.asp"&gt;current speaker lineup&lt;/a&gt; is very impressive. I’m honored (and a little intimidated) to be on the presenters list! I’ll be flying out to Tucson in early September to record three SSIS related sessions:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;What’s new in SSIS in SQL Server 2008&lt;/li&gt;    &lt;li&gt;Beyond Scripting – Developing reusable extensions for SSIS&lt;/li&gt;    &lt;li&gt;Advanced lookup scenarios in Integration Services with SQL Server 2008&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;The conference will take place online September 24th – 26th. 
The &lt;a href="http://www.vconferenceonline.com/business-intelligence/sessions.asp"&gt;session list&lt;/a&gt; hasn’t been posted yet, but judging by the list of speakers, I’m sure it will be interesting.&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8792458" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Script+Task/default.aspx">Script Task</category><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category><category domain="http://blogs.msdn.com/mattm/archive/tags/Katmai/default.aspx">Katmai</category><category domain="http://blogs.msdn.com/mattm/archive/tags/Conferences/default.aspx">Conferences</category></item><item><title>Lookups with Visual FoxPro</title><link>http://blogs.msdn.com/mattm/archive/2008/03/03/lookups-with-visual-foxpro.aspx</link><pubDate>Tue, 04 Mar 2008 08:20:23 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8019674</guid><dc:creator>mmasson</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/mattm/comments/8019674.aspx</comments><wfw:commentRss>http://blogs.msdn.com/mattm/commentrss.aspx?PostID=8019674</wfw:commentRss><description>&lt;p&gt;Here's a quick tip from a couple other members of the SSIS team, Da Lin and David Noor. &lt;/p&gt;  &lt;p&gt;If you have a lookup component using partial cache mode using a Visual FoxPro oledb provider to connect to your reference table, be sure to change the OLE DB Services setting in the connection manager to something other than &amp;quot;default&amp;quot; or &amp;quot;enable all&amp;quot;. 
&lt;/p&gt;  &lt;p&gt;&lt;a href="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupswithVisualFoxPro_E67B/clip_image002_2.jpg"&gt;&lt;img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="244" alt="clip_image002" src="http://blogs.msdn.com/blogfiles/mattm/WindowsLiveWriter/LookupswithVisualFoxPro_E67B/clip_image002_thumb.jpg" width="239" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;If you don't change this setting, you'll get the following error at runtime:&lt;/p&gt;  &lt;p&gt;An OLE DB record is available.&amp;#160; Source: &amp;quot;Microsoft OLE DB Provider for Visual FoxPro&amp;quot;&amp;#160; Hresult: 0x80040E46&amp;#160; Description: &amp;quot;One or more accessor flags were invalid.&amp;quot;&lt;/p&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8019674" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/mattm/archive/tags/Lookup/default.aspx">Lookup</category></item></channel></rss>