Saturday, October 5, 2013

Variable-based Table-Valued Function vs Inline Table-Valued Function

Variable-Based Table-Valued Function, a.k.a. Multistatement Table-Valued Function are performance killers, for a good explanation, see the answer here: http://stackoverflow.com/questions/2554333/multi-statement-table-valued-function-vs-inline-table-valued-function

Here's a sample variable-based table-valued Function:

create function getSamplePersonOrgsViaVariableBasedTableValuedFunction()
returns @t table
(
    BusinessEntityID int primary key clustered,
    OrgUnits nvarchar(max)
)
as
begin
    insert into @t(BusinessEntityID, OrgUnits)
    select p.BusinessEntityID, eorg.orgUnits
    from Person.Person p
    cross apply
    (
        select 'blah'
    ) eorg(orgUnits);
     
    return;
     
end;
go


Here's a sample inline table-valued function:

create function getSamplePersonOrgsViaInlineTableValuedFunction() returns table
as
return
    select p.BusinessEntityID, eorg.orgUnits
    from Person.Person p
    cross apply
    (
        select 'blah'
    ) eorg(orgUnits)
go


For 50 invocations of this sample query on inline table-valued function, this took 1 second only

SELECT p.BusinessEntityID, p.FirstName, p.LastName, gsp.orgUnits
FROM Person.Person p
join getSamplePersonOrgsViaInlineTableValuedFunction() gsp on p.BusinessEntityID = gsp.BusinessEntityID
where p.BusinessEntityID < 9
go 50


Contrast that efficiency to variable-based table-valued function, variable-based table-valued function took 9 seconds:

SELECT p.BusinessEntityID, p.FirstName, p.LastName, gsp.orgUnits
FROM Person.Person p
join getSamplePersonOrgsViaVariableBasedTableValuedFunction() gsp on p.BusinessEntityID = gsp.BusinessEntityID
where p.BusinessEntityID < 9
go 50

Eager loading is the root cause of the performance problem for variable-based table-valued function, i.e. even we only need 8 rows on source table (p.BusinessEntityID < 9), when the source table is joined to variable-based table-valued function, the source table has to wait first for the result (19,972 rows) of the variable-based table-valued function before it is finally being joined to.


Inline table-valued function is very efficient and smart, it treats the function like a view, i.e., the query of the inline table-valued function is expanded to an actual tables when being joined to another table, behaves exactly like the joining of a table to a table-deriving query / view / CTE. Hence when the inline table-valued function is joined to a source table, when you fetch 8 rows only on source table (p.BusinessEntityID < 9), the query will also fetch 8 rows only on inline-table-valued function too.


On variable-based table-valued function, the result of the function is the one being expanded then put to another table bucket (variable table), hence causing performance problem, so when for example we have 8 rows then we join it to a variable-based table-valued function, we are joining the 8 rows to eagerly-loaded 19,972 rows.


To illustrate the efficiency of inline table-valued function, let's cause a divide by zero when the record encounters Person ID number 9.

alter function getSamplePersonOrgsViaInlineTableValuedFunction() returns table
as
return
    select p.BusinessEntityId, eorg.orgUnits
    from dbo.Person.Person p
    cross apply
    (
        select 'blah ' + convert(varchar,case when p.BusinessEntityId = 9 then 1 / 0 else 7 end)
    ) eorg(orgUnits)
go
 
alter function getSamplePersonOrgsViaVariableBasedTableValuedFunction()
returns @t table
(
    BusinessEntityId int primary key clustered,
    OrgUnits nvarchar(max)
)
as
begin
    insert into @t(BusinessEntityId, OrgUnits)
    select p.BusinessEntityId, eorg.orgUnits
    from dbo.Person.Person p
    cross apply
    (
        select 'blah ' + convert(varchar,case when p.BusinessEntityId = 9 then 1 / 0 else 7 end)
    ) eorg(orgUnits);
     
    return;
     
end;
go


This query will not cause divide-by-zero error on inline table-valued function, a proof that the function doesn't fetch organization units after BusinessEntityID number 8

SELECT p.BusinessEntityID, p.FirstName, p.LastName, gsp.orgUnits
FROM Person.Person p
join getSamplePersonOrgsViaInlineTableValuedFunction() gsp on p.BusinessEntityID = gsp.BusinessEntityID
where p.BusinessEntityID < 9


This query gets a divide-by-zero error though, a proof that the query fetch all the 19,972 rows of the variable-based table-valued function first before it is being joined to.

SELECT p.BusinessEntityID, p.FirstName, p.LastName, gsp.orgUnits
FROM Person.Person p
join getSamplePersonOrgsViaVariableBasedTableValuedFunction() gsp on p.BusinessEntityID = gsp.BusinessEntityID
where p.BusinessEntityID < 9




Happy Computing! ツ

No comments:

Post a Comment