[Bug] dbt adapter tests for utils unreliable #7670

sdebruyn · 2023-05-20T10:19:36Z

Is this a new bug in dbt-core?

I believe this is a new bug in dbt-core
I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

The current adapter tests for the utils use an incorrect way to verify the end results of the tests.

{% test assert_equal(model, actual, expected) %}
select * from {{ model }} where {{ actual }} != {{ expected }}

{% endtest %}

The above macro is currently used. However, NULL works in a strange way in SQL.

This is documented for SQL Server or Postgres in the same way as it is the ANSI standard that dictates this behaviour.

The result is that the test above will completely ignore results where either actual or expected is null and the other one is not.

I just noticed this in the dbt-sqlserver tests. Apparently the hashing test should have been failing from as soon it was implemented, but it never did.

Expected Behavior

There are multiple routes we can take here. I'd be glad to submit a PR once we agree on the way forward.

distinct from
This SQL syntax is not supported in all common databases (e.g. SQL Server just added support in the 2022 version), but it effectively compares the values as was probably intended in the mentioned macro.
coalesce
If we would coalesce the result values to an invalid value (e.g. coalesce(actual, 'invalid') for strings), then the tests could continue using the != operator to compare actual and expected. I think this would mean that we have to adapt every utils test as some work on strings, others on integers etc.
add null checks to where clause
IMHO the best way forward. Just add conditions to see if actual is null while expected is not and vice versa.

{% test assert_equal(model, actual, expected) %}
select * from {{ model }}
where {{ actual }} != {{ expected }}
or ({{ actual }} is null and {{ expected }} is not null)
or ({{ expected }} is null and {{ actual }} is not null)
{% endtest %}

Steps To Reproduce

This issue is visible in the latest version of the dbt-sqlserver tests, but it will probably have caused more false positives for other adapters as well.

The text was updated successfully, but these errors were encountered:

sdebruyn · 2023-05-20T10:21:17Z

@dbeatty10 @colin-rogers-dbt @McKnight-42 WDYT?

dbeatty10 · 2023-05-20T14:45:35Z

Thank you for raising this @sdebruyn!

Agreed that the three-valued logic is not intuitive for most of us and then it leads to situations like the this one with the assert_equal macro for these functional tests for adapters.

On a related note, #6997 has a feature request to add a cross-database macro for is [not] distinct from. Implementing it and using it would be a 4th route we could take.

Deciding on a route

I agree with you that Option 3 is the easiest short-term path forward for addressing the assert_equal macro. The downside is that this replacement assert_equal would not have its own testing to make sure it is implemented correctly. Since it's a simple and well-known implementation, I'd be comfortable having assert_equal as un-tested itself.

Would you be interested in raising a PR to implement Option 3, by any chance? If not, @alzaar was looking for a good_first_issue recently and might be interested in picking it up.

Implementation of Option 3

I'd suggest a slightly less verbose variant that should yield equivalent results:

{% test assert_equal(model, actual, expected) %}
select * from {{ model }}
{# -- actual is distinct from expected #}
where not (
  ({{ actual }} = {{ expected }})
  or ({{ actual }} is null and {{ expected }} is null)
)
{% endtest %}

or maybe even:

{% macro equals(actual, expected) %}
{# -- actual is not distinct from expected #}
(({{ actual }} = {{ expected }}) or ({{ actual }} is null and {{ expected }} is null))
{% endmacro %}

{% test assert_equal(model, actual, expected) %}
select * from {{ model }}
where not {{ equals(actual, expected) }}
{% endtest %}

sdebruyn · 2023-05-21T11:02:23Z

Thanks @dbeatty10, opened #7672. I don't have local postgres testing set up, so let's see in CI if some of them start failing now.

dbeatty10 · 2023-05-22T12:58:59Z

Awesome @sdebruyn! Looks like all the CI checks on that PR are passing, and a member of our engineering team will give it a review.

lllong33 · 2023-05-26T05:45:33Z

@sdebruyn @dbeatty10 It looks like not null is not true, can't write it like that. FYI

select version();  -- PostgreSQL 12.3 on x86_64-pc-linux-gnu

with t1 as (
	select 1 as actual, 2 as expected
	union all 
	select 1 as actual, 1 as expected
	union all 
	select 1 as actual, cast(null as int) as expected
	union all 
	select cast(null as int) as actual, cast(null as int) as expected 
)

select * from t1 
--where actual = expected or (actual is null and expected is null) -- normal: 1,1; null,null
--where actual != expected 
--	or (actual is null and expected is not null )
--	or (actual is not null and expected is null ) -- normal: 1,2; 1,null
where not (actual = expected or (actual is null and expected is null)) -- issue: actual=1, expected=null is not work

dbeatty10 · 2023-05-26T17:37:27Z

@lllong33 Is the key take-away that the implementation of {{ equals(actual, expected) }} should probably include a case statement as described here and shown below?

Minimal test cases

Here's some Postgres-specific SQL that is helpful to demonstrate all the permutations listed here using the case-when logic explained here:

with x as (

    select 1::int as i union all
    select 2::int as i union all
    select null::int as i

),

permutations as (

    select
        x1.i as x1_i,
        x2.i as x2_i 
    from x as x1, x as x2
    order by x1.i, x2.i

)

select
    -- values to compare
    x1_i,
    x2_i,

    -- logic for equals macro
    ((x1_i = x2_i) or (x1_i is null and x2_i is null)) as equals_macro,

    -- compliant syntax for null-safe equality
    x1_i is not distinct from x2_i as is_not_distinct_from,

    -- case/when logic for equals macro
    case when ((x1_i = x2_i) or (x1_i is null and x2_i is null))
        then 0
        else 1
    end = 0 as case_equals_macro,

    -- negation of logic for equals macro
    not ((x1_i = x2_i) or (x1_i is null and x2_i is null)) as not_equals_macro,

    -- compliant syntax for null-safe inequality
    x1_i is distinct from x2_i as is_distinct_from,

    -- negation of case/when logic for equals macro
    not (case when ((x1_i = x2_i) or (x1_i is null and x2_i is null))
     then 0
     else 1
    end = 0) as case_equals_macro

from permutations
;

Reflection

This discussion makes me think that what we really need are cross-database null-safe implementation(s) of is [not] distinct from as proposed in #6997. Doing those implementation(s) would necessarily include all the relevant testing to make sure the edge cases are covered.

Summary

At the very least, we need more test cases and then to update the implementation in #7672 to use case/when syntax.

While we are doing more test cases, we might as well just do #6997.

sdebruyn added bug Something isn't working triage labels May 20, 2023

dbeatty10 added awaiting_response and removed triage labels May 20, 2023

sdebruyn mentioned this issue May 21, 2023

Include null checks in utils test base #7672

Merged

6 tasks

github-actions bot added triage and removed awaiting_response labels May 21, 2023

dbeatty10 removed the triage label May 22, 2023

VersusFacit closed this as completed in #7672 May 24, 2023

VersusFacit self-assigned this May 24, 2023

github-actions bot added the triage label May 26, 2023

This was referenced Jun 4, 2023

[CT-2634] [Bug] not {{ equals(actual, expected) }} isn't working properly #7778

Closed

Fix not equals comparison to be null-safe for adapters/utils tests #7776

Merged

dbeatty10 removed the triage label Jun 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug] dbt adapter tests for utils unreliable #7670

[Bug] dbt adapter tests for utils unreliable #7670

sdebruyn commented May 20, 2023 •

edited

Loading

sdebruyn commented May 20, 2023

dbeatty10 commented May 20, 2023

sdebruyn commented May 21, 2023

dbeatty10 commented May 22, 2023

lllong33 commented May 26, 2023

dbeatty10 commented May 26, 2023 •

edited

Loading

[Bug] dbt adapter tests for utils unreliable #7670

[Bug] dbt adapter tests for utils unreliable #7670

Comments

sdebruyn commented May 20, 2023 • edited Loading

Is this a new bug in dbt-core?

Current Behavior

Expected Behavior

Steps To Reproduce

sdebruyn commented May 20, 2023

dbeatty10 commented May 20, 2023

Deciding on a route

Implementation of Option 3

sdebruyn commented May 21, 2023

dbeatty10 commented May 22, 2023

lllong33 commented May 26, 2023

dbeatty10 commented May 26, 2023 • edited Loading

Minimal test cases

Reflection

Summary

sdebruyn commented May 20, 2023 •

edited

Loading

dbeatty10 commented May 26, 2023 •

edited

Loading