Saturday, June 16, 2012

Linq is the new Stored Procedure

After reading Rob Conery's And seeing this Aaron Bertrand's comment on stackoverflow:

Then I suspect brainwashing may have been involved

I believe that .NET developers should practice Buddhism.

Most .NET developers don't know the middle path. If they chosen Stored Procedures, they will decidedly use stored procedures 100% exclusively, SP foreva! to the detriment on excluding other sound techniques. They forgot to be pragmatic, overlooking the fact that some solutions are meant to be solved in a certain way. If you have flexible query filters, instead of making a stored procedure with gazillion parameters, it's better you use techniques or DAL that is a perfect fit for that. Linq perfectly fits the bill and it is type-safe to boot, as it allows you to chain filters in any way you deem fit, and consequently, making the system performant.

var query = MemberRepository.All;

if (lastnameFilter != "")
    query = query.Where(x => x.Lastname == lastnameFilter);
if (firstnameFilter != "")
    query = query.Where(x => x.Firstname == firstnameFilter);
if (memberSinceFilter != 0)
    query = query.Where(x => x.YearMember >= memberSinceFilter);
if (pointsFilter != 0)
    query = query.Where(x => x.Points >= pointsFilter);

return query.ToList(); 

If you only have one filter, example on points only, that would generate this short query only:

select * 
from Member 
where Points >= 10000;

Whereas if you use stored procedure, this is how that will be tackled:

create procedure MemberSearch(@Lastname varchar(50), @Firstname varchar(50), @MemberSince int, @Points int)

select * 
form Member
where (@Lastname = '' OR Lastname = @Lastname)
      AND (@Firstname = '' OR Firsname = @Firstname)
   AND (@MemberSince = 0 OR YearMember >= @memberSince)
   AND (@Points = 0 OR Points >= @Points)


I would hazard a guess, most prefers the Linq approach now, as it produces a performant query. If you filter only on specific column(s), other columns will not be included in the query. Linq is my preferred approach too.

Sadly though, even how awesome Linq is, I'm feeling there's a new Stored Procedure in town. Linq is the new Stored Procedure! Don't be confused I'm praising Linq one second and bashing it another. Let me tell you the premise of this article.

Upon seeing this question:

And seeing this comments:

This is close to the output I am looking for. I want where the most recent action is an open, not the most recent open, but to get that, I would just have to move the Status = Open condition to the second select. The bigger issue, though, is that I am being "strongly encouraged" to use Entity to communicate with the DB instead of directly querying it. – user1459547

For what purpose? If it is harder to make EF generate the query you need, and you know how to do it in SQL... – Aaron Bertrand

I am entirely on your side, but I can't convince the project manager. – user1459547

Then I suspect brainwashing may have been involved... – Aaron Bertrand

It eludes me why the need to impose rules or be dogmatic that all data access should be in Linq, when a straightforward SQL is as readable and more performant than Linq-generated SQL.

I would rather see this operate in production:

;WITH MostRecentActions AS 
  SELECT RequestID, ActionTimestamp,
    /* other relevant RequestAction columns, */
    rn = ROW_NUMBER() OVER (PARTITION BY RequestID ORDER BY ActionTimestamp DESC)
  FROM dbo.RequestActions
  WHERE Status = 'Open'
SELECT RequestID, ActionTimestamp /* , other columns */ 
  FROM MostRecentActions
  WHERE rn = 1;      

Than this complex Linq-generated one:

[Extent1].[Id] AS [Id], 
[Extent1].[SomeProp1] AS [SomeProp1], 
[Extent1].[SomeProp2] AS [SomeProp2], -- ...etc.
FROM  [dbo].[Requests] AS [Extent1]
CROSS APPLY  (SELECT TOP (1) [Project1].[Status] AS [Status]
           [Extent2].[Status] AS [Status], 
           [Extent2].[ActionTimestamp] AS [ActionTimestamp]
           FROM [dbo].[RequestActions] AS [Extent2]
           WHERE [Extent1].[Id] = [Extent2].[RequestId]
    ) AS [Project1]
    ORDER BY [Project1].[ActionTimestamp] DESC ) AS [Limit1]
WHERE N'Open' = [Limit1].[Status]

I have an inkling that code is slow compared to the first one(the CTE approach). Ok, an astute dev might actually profile and try these two codes and may found that the latter code is faster, but that is besides the point. The point is why should we impose dogmatic rules, when you can choose an approach that you can implement quickly and works. Sure, some Linq problems are just an stackoverflow away, but don't forget that you can still drop to bare metal SQL if you want to have more confidence the solution works and as smoothly as you wanted them be. Don't be overzealous on a given platform or framework, don't say "Stored Procedure Forever!" Don't say "Stored Procedure is Dead! Long Live Linq!"

Everything should be in moderation, don't be overzealous on one choice only. Programmers hate -ism, but let me take this as an exception, programmers should sometimes practice Buddhism

In line with pragmatism and to avoid becoming overzealous on certain decisions, the only sound bite that should be tolerated is:

Sound bites considered harmful

Context here:

No comments:

Post a Comment