Tuesday, April 15, 2014

Optimization through materialization

Saw a slow query which for the life of me I can't figure out why joining on inline TVF is slow, considering inline TVF is generally faster than variable-based TVF (a.k.a. multi-statement TVF)


SELECT p.Fullname, o.Organizations
FROM Person p
JOIN dbo.fn_GetPersonOrganizations() o ON P.PersonId = B.PersonId

And was optimized by pre-materializing the rows of the inline TVF by dumping its result to a temporary table first then joining to temporary table instead. As much as we don't like to use a temporary table and make a more solid solution by fixing the problem of the inline TVF itself instead, we don't have the luxury of time as the ticket is a hotfix one; besides, we are having a hard time discerning if the problem really lies on the inline TVF itself, the inline TVF's code is very tight and looks performant(at the risk of sounding like a sales brochure). So temporary table we go:


SELECT p.Fullname, o.Organizations
FROM Person p
JOIN #Organizations o ON P.PersonId = B.PersonId



Table variable don't need to be cleaned up, so we could use table variable instead of temporary table; however, temporary table could be faster than table variable, but then temporary table need to be cleaned up, so we come up with this alternative approach instead:


SELECT p.Fullname, o.Organizations
FROM Person p
JOIN
(
    SELECT TOP 4000000000 *
    FROM dbo.fn_GetPersonOrganizations()
) o ON P.PersonId = B.PersonId


We found out that TOP blah achieves the same effect of materializing the rows immediately; same fast performance as temporary table. We were able to speed up the query, but we don't know yet why we need to materialize the inline TVF's results to gain performance on joins.



Happy Coding! ツ

No comments:

Post a Comment