Why massive inserts using SubmitChanges lack in performance.

Why massive inserts using SubmitChanges lack in performance.

  • Comments 2

 

Bulk inserting using LINQ to SQL, why is it so slow?

Well, the short answer is that it is not designed to do bulk inserts.

 

But out of curiosity, why is it slow?

 

Well, let us do it by example (my preferred way). First create a table in SQL Server like so (this will basically represent a person):

 

create table OurLinqPerson (cid int primary key, fname nvarchar(10), lname nvarchar(10), age int)

--drop table OurLinqPerson

 

Then create a new C# console project in Visual Studio.

Right click the project and add a new item, “LINQ to SQL Classes”, call it Person.dbml.

Then use the Server Explorer to find the table (OurLinqPerson) and drag it to the designer surface and “Save”.

Then add the following code:

 

        static void Main(string[] args)

        {

            int noOfPersonsToInsert = 10000;

            long calc = 0;

            try

            {

                PersonDataContext pdc = new PersonDataContext();

                Stopwatch sw = new Stopwatch();

                sw.Start();

                for (int i = 0; i < noOfPersonsToInsert; i++)

                {

                    OurLinqPerson olp = new OurLinqPerson { cid = i, fname = "Peter", lname = "Peterson", age = 33 };

                    // Add the object to the datacontext.

                    pdc.OurLinqPersons.InsertOnSubmit(olp);

                    if (i % 500 == 0)

                    {

                        Console.WriteLine("Time (ms): Total / This batch {0, 3} / {1, 3}, Objects using InsertOnSubmit: {2}", sw.ElapsedMilliseconds.ToString(), (sw.ElapsedMilliseconds - calc).ToString(), i);

                        calc = sw.ElapsedMilliseconds;

                    }

                }

                Console.WriteLine("All InsertOnSubmits done. Total elapsed time (ms): {0}. \n\nNow calling SubmitChanges...", sw.ElapsedMilliseconds.ToString());

                calc = sw.ElapsedMilliseconds;

               

                // Now submit changes to the database.

                pdc.SubmitChanges();

               

                Console.WriteLine("SubmitChanges done, time for this (ms): {0}", (sw.ElapsedMilliseconds - calc).ToString());

                sw.Stop();

                Console.WriteLine("\n\nTotal elapsed time (ms): {0}", sw.ElapsedMilliseconds.ToString());

            }

            catch (Exception ex)

            {

                Console.WriteLine(ex);

            }

        }

 

The code hopefully is self-explanatory.  We want to insert 10000 rows into the OurLinqPerson table, we do this by creating an instance of the OurLinqPerson object.

(Basically, a row in the table in the database is an instance of a class representing a row, the columns are properties on that class).

We then add that instance to the table in the DataContext by using it as an argument to the InsertOnSubmit method.

We write out the time elapsed and the time it took for doing this for every 500 rows.

Finally we submit the changes to the database by calling SubmitChanges.

 

So, run the application and the output should be something like this:

 

Time (ms): Total / This batch  23 /  23, Objects using InsertOnSubmit: 0

Time (ms): Total / This batch  27 /   4, Objects using InsertOnSubmit: 500

Time (ms): Total / This batch  30 /   3, Objects using InsertOnSubmit: 1000

Time (ms): Total / This batch  33 /   3, Objects using InsertOnSubmit: 1500

Time (ms): Total / This batch  44 /  11, Objects using InsertOnSubmit: 2000

Time (ms): Total / This batch  47 /   3, Objects using InsertOnSubmit: 2500

Time (ms): Total / This batch  50 /   3, Objects using InsertOnSubmit: 3000

Time (ms): Total / This batch  53 /   3, Objects using InsertOnSubmit: 3500

Time (ms): Total / This batch  58 /   5, Objects using InsertOnSubmit: 4000

Time (ms): Total / This batch  61 /   3, Objects using InsertOnSubmit: 4500

Time (ms): Total / This batch  65 /   3, Objects using InsertOnSubmit: 5000

Time (ms): Total / This batch  73 /   8, Objects using InsertOnSubmit: 5500

Time (ms): Total / This batch  76 /   3, Objects using InsertOnSubmit: 6000

Time (ms): Total / This batch  80 /   4, Objects using InsertOnSubmit: 6500

Time (ms): Total / This batch  83 /   3, Objects using InsertOnSubmit: 7000

Time (ms): Total / This batch  87 /   4, Objects using InsertOnSubmit: 7500

Time (ms): Total / This batch  90 /   3, Objects using InsertOnSubmit: 8000

Time (ms): Total / This batch  94 /   4, Objects using InsertOnSubmit: 8500

Time (ms): Total / This batch  97 /   3, Objects using InsertOnSubmit: 9000

Time (ms): Total / This batch 100 /   3, Objects using InsertOnSubmit: 9500

 

All InsertOnSubmits done. Total elapsed time (ms): 104.

 

Now calling SubmitChanges...

SubmitChanges done, time for this (ms): 3396

 

Total elapsed time (ms): 3500

 

So it is pretty obvious that adding the rows/objects performs quickly, around 5 ms to insert 500 of them, however, the call to SubmitChanges takes 3396 ms out of a 3500 total.

So, back to the original questions, why is it slow?

 

Well, when calling SubmitChanges every object/row in the table (OurLinqPersons) has to be inspected in order to decide what action to take on that particular row.

Once the Object State has been determined for the instance it will create the appropriate SQL for that state and then call the database with this SQL.

The rows can be in several states, for example, ToBeInserted, Unchanged, Deleted etc.

In this case the row is in the ToBeInserted state and so the following SQL is created and ran for each row:

 

INSERT INTO [dbo].[OurLinqPerson]([cid], [fname], [lname], [age]) VALUES (@p0, @p1, @p2, @p3)

 

So in short, the reason it takes time is because every row/object has to be inspected in order to decide what action to take on it and then that action has to be executed.

It doesn’t matter if the row/object is unchanged; it still has to be inspected.

 

If you want to insert a large number of row, pure ADO.Net (for example SqlBulkCopy is a better option).

 

"Table<TEntity>.InsertOnSubmit Method"

http://msdn.microsoft.com/en-us/library/bb763516.aspx

"DataContext.SubmitChanges Method "

http://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.submitchanges.aspx

"Object States and Change-Tracking (LINQ to SQL)"

http://msdn.microsoft.com/en-us/library/bb386982.aspx

"Insert, Update, and Delete Operations (LINQ to SQL)"

http://msdn.microsoft.com/en-us/library/bb386931.aspx

Leave a Comment
  • Please add 7 and 4 and type the answer here:
  • Post
  • Why is the inspection for every row/object so expensive? Is a call to the SQL server made?  It would be nice to be able to turn this inspection off when necessary.

  • I short - just use transaction and performance will skyrocket.

Page 1 of 1 (2 items)