Tuesday, October 5, 2010

Linq to SQL is not a perfect match for reporting needs

Is clean (or performant) SQL achievable in Linq to Sql?

I wanted Linq to Sql produce this code:

SELECT C.CustomerID, COUNT(O.CustomerID) AS N
FROM Customers C
LEFT JOIN Orders O ON O.CustomerID = C.CustomerID
GROUP BY C.CustomerID

Here's my first attempt:

var q = 
from c in db.Customers
join o in db.Orders on c.CustomerID equals o.CustomerID into sr
from x in sr.DefaultIfEmpty()
group x by c.CustomerID into y
select new { y.Key, N = y.Count(t => t.CustomerID != null) };


First, my beef with Linq is it cannot allow the natural way to write a query. For comparison expressions, it is best to put on the left side the value/expression that is varying much, the less varying ones(e.g. constants) should be on the right side. To convey the point:

int customerId = 7;
bool found = false;
for(int i = 0; i < orders.Length; ++i)
{
    if(orders[i].CustomerId == customerId)
    {
        found = true;
        break;
    }
}

Linq doesn't allow this kind of statement:
var q = from c in db.Customers
        join o in db.Orders on o.CustomerID equals c.CustomerID into sr


Writing query on this form: from c in db.Customers join o in db.Orders on c.CustomerID equals o.CustomerID into sr is akin to doing this code:

int customerId = 7;
bool found = false;
for(int i = 0; i < orders.Length; ++i)
{
    if(customerId == orders[i].CustomerId)
    {
        found = true;
        break;
    }
}

Well that code just doesn't feel natural, isn't it?


Back to the gist of this topic, is performant code achievable in Linq to SQL?

The Linq to SQL code above produces:

SELECT [t2].[CustomerID] AS [Key], (
    SELECT COUNT(*)
    FROM [Customers] AS [t3]
    LEFT OUTER JOIN [Orders] AS [t4] ON [t3].[CustomerID] = [t4].[CustomerID]
    WHERE ([t4].[CustomerID] IS NOT NULL) AND ((([t2].[CustomerID] IS NULL) AND ([t3].[CustomerID] IS NULL)) OR (([t2].[CustomerID] IS NOT NULL) AND ([t3].[CustomerID] IS NOT NULL) AND ([t2].[CustomerID] = [t3].[CustomerID])))
    ) AS [N]
FROM (
    SELECT [t0].[CustomerID]
    FROM [Customers] AS [t0]
    LEFT OUTER JOIN [Orders] AS [t1] ON [t0].[CustomerID] = [t1].[CustomerID]
    GROUP BY [t0].[CustomerID]
    ) AS [t2]

Well, that is far cry from the simple thing we want to achieve; that is, Linq's Count(expression here) should not produce COUNT(*), it should produce COUNT(O.CustomerId), but since we cannot feed non-lambda operation on Linq's Count's parameter, we must do: select new { y.Key, N = y.Count(t => t.CustomerID != null) };. And that code produces inefficient query, look at the Sql output above.

Here's my second attempt...

var q = 
from c in db.Customers
join o in db.Orders on c.CustomerID equals o.CustomerID into sr
from x in sr.DefaultIfEmpty()
group x by c.CustomerID into y                                        
select new { y.Key, N = y.Sum(t => t.CustomerID != null ? 1 : 0 )};

...that code produces:

SELECT SUM(
    (CASE
        WHEN [t1].[CustomerID] IS NOT NULL THEN @p0
        ELSE @p1
     END)) AS [N], [t0].[CustomerID] AS [Key]
FROM [Customers] AS [t0]
LEFT OUTER JOIN [Orders] AS [t1] ON [t0].[CustomerID] = [t1].[CustomerID]
GROUP BY [t0].[CustomerID]

Though cleaner and performance-wise is passable, it still leave much to be desired, it doesn't inline constants; parameters might be the way of Linq to Sql for passing values or expressions to query(be it a variable or constant), however it still doesn't have a natural feel into it. As far as I know, there's no way to make Linq-to-Sql produce this SQL code: COUNT(O.CustomerId)


Linq to Sql is good as an ORM, storage persistence, fetching data, CRUD operations; but for reporting needs, I wouldn't bet my farm on it though. I would continue using plain old combo of SP,functions,views and ADO.NET for reporting needs.

No comments:

Post a Comment