I was working on a common scenario and found an interesting side effect. Applications typically have a listing page that get search criteria from a UI and passes to a stored procedure. In the procedure you often find statements like:
SELECT a,b,c,d FROM MyTableWHERE c = @SearchC AND d = @SearchD
Often times you need to join information to the main data that is the most current from a historical table. For example in SQL 2005.
SELECT t.*FROM(SELECT ID, Amt, AmtDepositDate, row_number() over (partition by ID order by ID, AmtDepositDate desc) as rn FROM MoneyTable) tWHERE rn=1
I found that when you join the above query to the main search query above a weird optimization occurs in SQL 2005. I suspect (unconfirmed with product group) that a left join can't optmize the query with the over so it ends up do a sub select on all records every row in the main search.
The solution (in my case where there is small sets of data of < 5000 records), I create a table variable and select into the lasted record from the history table. Then I join the table variable to the main table search.
In my scenario with 2000 main records and 1 million history records the left join with the latest select (no table variable) took 5+ minutes. Using the table variables took 2 seconds for everything.
Here is a snippet of the overall solution:
DECLARE @MyTable table ( ID int ,Amt float ,AmtDepositDate datetime )
INSERT INTO @MyTableSELECT t.ID ,t.Amt , t.AmtDepositDateFROM ( SELECT ID , Amt , AmtDepositDate , row_number() over (partition by ID order by ID, AmtDepositDate desc) as rn FROM MoneyTable ) tWHERE rn=1
SELECT MT2.ID , MT2.Name , MT2.CityFROM MyTable2 MT2LEFT JOIN @MyTable MT ON MT2.ID = MT.IDWHERE c = @SearchC AND d = @SearchD
Anyone experience similar or can offer up an reason for the optimization?