Wednesday, September 29, 2010

LINQ: Bulk Insert/Delete/Update


One of the major stumbling blocks that keeps LINQ 2 SQL / LINQ 2 Entity, from being considered seriously in a commercial development is performance. But not just any performance, because as you should already know running DB queries with LINQ is relatively fast (sometimes even faster, than hand coded SQL script). SQL SERVER 2005 (if not earlier) and up also ensure that the compiled queries are cached, which means the difference between LINQ script vs. stored procedure should be irrelevant.

Of course it is easier to do something stupid with LINQ 2 SQL such as referencing a late bound property that ends up returning thousands of rows. But even then careful coding and some help from the DataContext.AssociateWith method should limit the fallout, somewhat.

Additionally you should also check out LINQ pad for a tool to help you visualize the query execution plan.

With that out of the way, the only thing LINQ 2 SQL/Entity Framework cannot run batched insert, delete  or update, out of the box. Of course you can always write a SQL Stored Procedure, or call execute command or what have you, but then you will be losing all the benefits that come with LINQ: type checking, ease of maintenance, re-factorable code and so on.  What I mean is that when you have 10,000 data to delete from table A, LINQ will first query for the 10,000, then do a line by line delete based on primary key. (See: <--Terry Anney describes the problem far better than I ever could. She also provides part of the solution for batched Update and Delete.

However those alone are not enough. There is one case left: what if you need to do a batch insert from another table into another, after a little manipulation. Such scenario isn’t uncommon. For that, I really have to recommend, Magiq at: <—These fine folks are doing what Microsoft should have done from the start (before shoehorning the craptastic, LINQ to Entity down our throats).

One final thing, once you started playing with LINQ 2 SQL,  you probably noticed that it is fairly difficult to manage the DataContext object  scope/lifetime. The problem is compounded in a distributed application. For that, you should check out Rick Strahl blog over at: He did a great job at describing the problem and showing a few possible solutions for it.

Saturday, September 11, 2010

Recursive Text File Replacement in Powershell

Text manipulation is something that Powershell is very natural at.

For example, the script below, finds all string occurrences that matches the given regex and replace them as necessary in any file and subfolder. I can think of a great deal of use for this, especially when changing localization texts (among other things), where the Visual Studio built in  string replace is simply too inclusive and underpowered.

--------------------------------------------Powershell Script-------------------------------------------------------------

gci -r -i <#The files to search for eg. @("*.aspx","*.cs","*.resx") #>    | %{
        [bool] $SomethingChanged = $false;
        $newContent =
                gc $_.FullName| %{
                            if($_ -match <#PUT YOUR REGEX HERE (in Quotes)!#>)                                                
<#PUT YOUR STRING REPLACE LOGIC IN THIS BLOCK OF CODE AND SET $SomethingChanged TO TRUE IF SUCCESSFUL#>                                               
                            else  #No match found, output the original string.                          
            #Saves the updated content.
            sc $_.Fullname $newContent -Encoding UTF8

---------------------------------------------Powershell Script-------------------------------------------------------------

The following table should help in figuring out how the script above works.

Syntax Definition
gci Get-ChildItem, it works like the “dir” in old DOS command.
-r Tells the “dir”/”gci” command to search recursively, ala. the old “/s” in normal cmd prompt.
-i This is the input filter flag. It should be followed by the filter string. eg. “*.aspx”
| Piping, it allows you to use the result from the previous computation in the next. It’s like the memory recall button in your calculator. (Note: It can be nested using parenthesis! )
$_ This variable/field stores the result piped from the previous statement. Ie. if the last computational result is “1”, then $_ == 1 is $true
% For each statement. It runs for each element, piped from the previous statement.
gc Get-Content. It’s like “type” in old DOS (which still works in powershell btw). It reads the content of a text file line by line.
sc Save-Content. Saves the content, it takes an array of string. Each element in the array, represents a line in the text file.