Another pattern that can cause too many round trips to the database is when each object is treated as a black box and is responsible for retrieving and saving its own data. This by itself is not the problem; it’s when the object hierarchy reaches three or more levels. The natural programming pattern becomes this: for each row in the parent object, retrieve all the related child rows in a separate query. Two levels in the hierarchy is not too bad unless the final record set is huge. But carrying this pattern to the third and fourth level, you can imagine that the number of round trips to the database can geometrically increase.
Consider the following object hierarchy: Project – Task – Sub Task – task history.
This is a hierarchy for building a house. The house is modeled in the project table. The tasks for each vendor are stored in the tasks table. Each task has several steps, and each step can have multiple history records. The first round trip to the database was done by the application to retrieve the list of houses. The user would pick a house to work on. Then the application will retrieve all tasks for the project. This is a normal pattern so far and we have not generated any bad performance. But then the code gets interesting because for each task there is another trip to the database to retrieve its sub task. Then for each subtask there is a separate round trip to the data base to get its history. This is done because the objects are a black box and have no knowledge of other objects even if they are related. The really bad aspect of this approach is that all these round trips were done with one user action!
Solution A: Create a view in the database that joins all the tables and returns one data set. Use that to populate your objects. I’ve used OUTER JOINS in the sample below so that all projects get returned, even if they don’t have any tasks, sub-tasks or history.
Create View vProject AS SELECT * FROM Project P OUTER JOIN Task T on P.Proj_ID = T.Proj_ID OUTER JOIN SubTask ST on T.Task_ID = ST.Task_ID OUTER JOIN TaskHistory TH on ST.SubTask_ID = TH.SubTask_ID
Solution B: For those of you who don’t like returning everything in one record set, another good solution is to have a business object that can encompass all the related entities and populate each individual object as needed, the code will execute a stored procedure that well return multiple result sets. For Example:
Execute dbo.pr_GetTasksForProject @parm1=100
SELECT * FROM Project WHERE Proj_ID = @parm1
SELECT T.* FROM Project P OUTER JOIN Task T on P.Proj_ID = T.Proj_ID AND P.Proj_ID = @parm1
SELECT ST.* FROM Project P OUTER JOIN Task T on P.Proj_ID = T.Proj_ID OUTER JOIN SubTask ST on T.Task_ID = ST.Task_ID OUTER JOIN TaskHistory TH on ST.SubTask_ID = TH.SubTask_ID AND P.Proj_ID = @parm1
SELECT TH.* FROM Project P OUTER JOIN Task T on P.Proj_ID = T.Proj_ID OUTER JOIN SubTask ST on T.Task_ID = ST.Task_ID OUTER JOIN TaskHistory TH on ST.SubTask_ID = TH.SubTask_ID AND P.Proj_ID = @parm1
This will return multiple record sets So you well have to remember to create the proper code to loop through all the ADO return sets (using IDataSet.NextResult()) as was shown in an earlier blog. You will notice that the last select statement is almost the same select as the view in solution A except that in this case it is only returning data from one table. One customer preferred this solution because it was easier to plug into their current object model – they simply loaded each class object from each result set.
Solution C: This project example is a hierarchy and may be perfect for XML. You could return XML from the relational tables by using the FOR XML clause in the select statement. But is this a perfect case for using the XML data type, or should you map your XML input to relational tables? Read these articles before you decide.
Discovery: What is an easy way to discover whether you have these patterns in your code? The first way is to ask your users which screens and actions are the slowest. They will be able to direct you to the slowest portions of the application. A more scientific way is to use SQL Profiler in a carefully controlled test environment. Here is a step-by-step method for discovering the number of round trips for each user action in your application.
The point of this last section is that collecting this metric is a task that you should add to your test routines. And beware of using simple test data; i.e. an invoice with one detail line item. Create at least one customer with a large number of instances per object, especially where you have a multi-level object hierarchy.