Wednesday, September 29, 2010

LINQ: Bulk Insert/Delete/Update


One of the major stumbling blocks that keeps LINQ 2 SQL / LINQ 2 Entity, from being considered seriously in a commercial development is performance. But not just any performance, because as you should already know running DB queries with LINQ is relatively fast (sometimes even faster, than hand coded SQL script). SQL SERVER 2005 (if not earlier) and up also ensure that the compiled queries are cached, which means the difference between LINQ script vs. stored procedure should be irrelevant.

Of course it is easier to do something stupid with LINQ 2 SQL such as referencing a late bound property that ends up returning thousands of rows. But even then careful coding and some help from the DataContext.AssociateWith method should limit the fallout, somewhat.

Additionally you should also check out LINQ pad for a tool to help you visualize the query execution plan.

With that out of the way, the only thing LINQ 2 SQL/Entity Framework cannot run batched insert, delete  or update, out of the box. Of course you can always write a SQL Stored Procedure, or call execute command or what have you, but then you will be losing all the benefits that come with LINQ: type checking, ease of maintenance, re-factorable code and so on.  What I mean is that when you have 10,000 data to delete from table A, LINQ will first query for the 10,000, then do a line by line delete based on primary key. (See: <--Terry Anney describes the problem far better than I ever could. She also provides part of the solution for batched Update and Delete.

However those alone are not enough. There is one case left: what if you need to do a batch insert from another table into another, after a little manipulation. Such scenario isn’t uncommon. For that, I really have to recommend, Magiq at: <—These fine folks are doing what Microsoft should have done from the start (before shoehorning the craptastic, LINQ to Entity down our throats).

One final thing, once you started playing with LINQ 2 SQL,  you probably noticed that it is fairly difficult to manage the DataContext object  scope/lifetime. The problem is compounded in a distributed application. For that, you should check out Rick Strahl blog over at: He did a great job at describing the problem and showing a few possible solutions for it.