Monday, July 18, 2011

Make your query malleable with Linq

Perhaps you've already seen/made this query pattern in your coding career:

SELECT *
FROM SchoolOfThought
WHERE
 (
  @inputTeacher = ''
  OR 
  Teacher = @inputTeacher
 )
 AND
 (
  @inputSubject = ''
  OR
  Subject = @inputSubject
 )
 AND
 (
  @inputQuality IS NULL
  OR
  Quality = @inputQuality
 ) 


That query is synonymous with this approach:

string inputTeacher = Console.ReadLine();
string inputSubject = Console.ReadLine();
string inputQuality = Console.ReadLine();

string filter = "";

if (inputTeacher != "")
 filter = string.Format("Teacher = '{0}'", inputTeacher);
 
if (inputSubject != "")
 filter = filter + (filter.Length != 0 ? " AND " : "" ) + string.Format("Subject = '{0}'", inputSubject);
 
if (inputQuality != "")
 filter = filter + (filter.Length != 0 ? " AND " : "" ) + string.Format("Subject = {0}", int.Parse(inputQuality));
 
string query = "SELECT * FROM SchoolOfThought " + (filter.Length != 0 ? "WHERE " + filter : "");


If the user has input on subject only, say Math, the resulting query is shorter:

SELECT  *
FROM  SchoolOfThought
WHERE Subject = 'Math';


If the user didn't input anything, The resulting query is much shorter:

SELECT  *
FROM  SchoolOfThought;



Even though concatenation approach is performant and lighter on network traffic(but don't do micro-optimizations), you will eschew the concatenation approach in favor of the first code. The reasons are twofold; first, you can't guarantee that you can safeguard your query from SQL-injection; second, code-review-wise, you don't want your code be flagged as having a code smell, the second approach is longer and look every bit as brittle.




Now, everything changes when Linq came to the scene, we no longer have to write our query patterned after the first code. We could now write the code below with a warm and fuzzy feeling that there will be no SQL-injection that could accidentally creep in the code. Lambda-using-Linq:

string inputTeacher = Console.ReadLine();
string inputSubject = Console.ReadLine();
string inputQuality = Console.ReadLine();

var query = s.Query<SchoolOfThought>();

if (inputTeacher.Length > 0)
 query = query.Where(x => x.Teacher == inputTeacher);

if (inputSubject.Length > 0)
 query = query.Where(x => x.Subject == inputSubject);

if (inputQuality.Length > 0)
 query = query.Where(x => x.Quality == int.Parse(inputQuality));

foreach (var item in query)
{
 Console.WriteLine("{0} {1} {2}", item.Teacher, item.Subject, item.Quality);
}


If you only input one variable, say Math on Subject, the resulting query will be this:

exec sp_executesql N'select schoolofth0_.SchoolOfThoughtId as SchoolOf1_3_, schoolofth0_.Teacher as Teacher3_, schoolofth0_.Subject as Subject3_, schoolofth0_.Quality as Quality3_ 
from [SchoolOfThought] schoolofth0_ 
where schoolofth0_.Subject=@p0',N'@p0 nvarchar(4000)',@p0=N'Math'



Linq-chaining has the advantage of preventing Sql-injection while maintaining the advantage of strongly-typed(as opposed to stringly-typed programming, e.g. concatenation) programming, strongly-typed means less error, you have autocomplete at your fingertips when you are using an IDE, and you have a code that is refactoring-friendly anytime. And last but not the least, with Linq you'll have more confidence the program is correct even before it is run.


If the inputSubject and inputQuality are required inputs; compiled-expression-wise, these five code has no differences:

Approach 1
string inputSubject = "Math";
string inputQuality = "80";

var query = 
 from x in s.Query<SchoolOfThought>()
 where x.Subject == inputSubject && x.Quality == int.Parse(inputQuality)
 select x;

Approach 2. Don't worry, parenthesis isn't required, compiler is not confused where the Linq boundaries end, think of from having a built-in open parenthesis, and select having a built-in close parenthesis.
string inputSubject = "Math";
string inputQuality = "80";

var query =
 from x in   
               
      from x in s.Query<SchoolOfThought>()
      where x.Subject == inputSubject
      select x  

 where x.Quality == int.Parse(inputQuality)
 select x;

Approach 3
string inputSubject = "Math";
string inputQuality = "80";

var query =
 (from x in s.Query<SchoolOfThought>()
 where x.Subject == inputSubject
 select x)
 .Where(x => x.Quality == int.Parse(inputQuality));


Approach 4
string inputSubject = "Math";
string inputQuality = "80";

var query = s.Query<SchoolOfThought>()
 .Where(x => x.Subject == inputSubject && x.Quality == int.Parse(inputQuality));

Approach 5
string inputSubject = "Math";
string inputQuality = "80";

var query = s.Query<SchoolOfThought>()
 .Where(x => x.Subject == inputSubject)
 .Where(x => x.Quality == int.Parse(inputQuality));


All the five code approaches above is compiled to this code:
exec sp_executesql 
N'select schoolofth0_.SchoolOfThoughtId as SchoolOf1_3_, schoolofth0_.Teacher as Teacher3_, schoolofth0_.Subject as Subject3_, schoolofth0_.Quality as Quality3_ 
from [SchoolOfThought] schoolofth0_ 
where schoolofth0_.Subject=@p0 and schoolofth0_.Quality=@p1',N'@p0 nvarchar(4000),@p1 int',@p0=N'Math',@p1=80 


Lastly, if you want your Linq-chaining to look as query-like(instead of Lambda-ish approach) as possible:


string inputTeacher = Console.ReadLine();
string inputSubject = Console.ReadLine();
string inputQuality = Console.ReadLine();


var query = s.Query<SchoolOfThought>();

if (inputTeacher.Length > 0)
 query = 
  from x in query 
  where x.Teacher == inputTeacher 
  select x;

if (inputSubject.Length > 0)
 query = 
  from x in query 
  where x.Subject == inputSubject 
  select x;

if (inputQuality.Length > 0)
 query = 
  from x in query 
  where x.Quality == int.Parse(inputQuality)
  select x;

I deliberately wrap the Linq clauses on their own line; so as to sway you to use the Lambda-ish approach instead, which arguably is way much shorter some of the times :-) Some of the times only, not all the times ;-)

No comments:

Post a Comment