Filter on join condition gets wrong result #334

Open
loneylee opened this issue Mar 7, 2023 · 0 comments
loneylee commented Mar 7, 2023

Describe what's wrong

Execute the following SQL:

create table upperCaseData (N int,L String);
insert into upperCaseData values (1, 'A'), (2, 'B'), (3,'C'), (4,'D'), (5,'E'), (6,'F');
create table lowerCaseData (n int,l String);
insert into lowerCaseData values (1, 'a'), (2, 'b'), (3,'c'), (4,'d');
SELECT *
FROM upperCaseData
LEFT JOIN lowerCaseData ON (N = n) AND (N > 1)

The wrong result is

┌─N─┬─L─┬─n────┬─l────┐
│ 2 │ B │ 2    │ b    │
│ 3 │ C │ 3    │ c    │
│ 4 │ D │ 4    │ d    │
│ 5 │ E │ 0    │      │
│ 6 │ F │ null │ null │
└───┴───┴──────┴──────┘

Expected behavior

┌─N─┬─L─┬─n────┬─l────┐
│ 1 │ A │ null │ null │
│ 2 │ B │ 2    │ b    │
│ 3 │ C │ 3    │ c    │
│ 4 │ D │ 4    │ d    │
│ 5 │ E │ 0    │      │
│ 6 │ F │ null │ null │
└───┴───┴──────┴──────┘

Error message and/or stacktrace
Spark plan

   CHNativeColumnarToRow
   +- *(6) (**)CHBroadcastHashJoinExecTransformer [N#83], [n#93], LeftOuter, BuildRight, (N#83 > 1), false
      :- (InputAdapter)CoalesceBatches
      :  +- (InputAdapter)ColumnarAQEShuffleRead local
      :     +- (InputAdapter)ShuffleQueryStage 0
      :        +- ColumnarExchangeAdaptor hashpartitioning(N#83, 1), 
      :           +- RowToCHNativeColumnar
      :              +- SerializeFromObject 
      :                 +- Scan[obj#82]
      +- (InputAdapter)BroadcastQueryStage 2
         +- ColumnarBroadcastExchange HashedRelationBroadcastMode
            +- CoalesceBatches
               +- ColumnarAQEShuffleRead local
                  +- ShuffleQueryStage 1
                     +- ColumnarExchangeAdaptor hashpartitioning(n#93, 1), 
                        +- *(4) (*wholestage*)FilterExecTransformer (n#93 > 1)
                           +- (InputAdapter)RowToCHNativeColumnar
                              +- (InputAdapter)SerializeFromObject 
                                 +- (InputAdapter)Scan[obj#92]

ClickHouse plan

Expression (Rename Output)
Header: N#83 Int32
        L#84 Nullable(String)
        n#93 Nullable(Int32)
        l#94 Nullable(String)
Actions: INPUT :: 0 -> N#83 Int32 : 0
         INPUT :: 1 -> L#84 Nullable(String) : 1
         INPUT :: 2 -> n#93 Nullable(Int32) : 2
         INPUT :: 3 -> l#94 Nullable(String) : 3
Positions: 0 1 2 3
  Expression (Project)
  Header: N#83 Int32
          L#84 Nullable(String)
          n#93 Nullable(Int32)
          l#94 Nullable(String)
  Actions: INPUT :: 0 -> N#83 Int32 : 0
           INPUT :: 1 -> L#84 Nullable(String) : 1
           INPUT :: 2 -> n#93 Nullable(Int32) : 2
           INPUT :: 3 -> l#94 Nullable(String) : 3
  Positions: 0 1 2 3
    Filter (Post Join Filter)
    Header: N#83 Int32
            L#84 Nullable(String)
            toInt64(N#83) Int64
            n#93 Nullable(Int32)
            l#94 Nullable(String)
            col_0#229 Nullable(Int64)
    Filter column: greater(N#83,1_1) (removed)
    Actions: INPUT : 0 -> N#83 Int32 : 0
             INPUT :: 1 -> L#84 Nullable(String) : 1
             INPUT :: 2 -> toInt64(N#83) Int64 : 2
             INPUT :: 3 -> n#93 Nullable(Int32) : 3
             INPUT :: 4 -> l#94 Nullable(String) : 4
             INPUT :: 5 -> col_0#229 Nullable(Int64) : 5
             COLUMN Const(Int32) -> 1_1 Int32 : 6
             FUNCTION greater(N#83 : 0, 1_1 :: 6) -> greater(N#83,1_1) UInt8 : 7
    Positions: 0 1 2 3 4 5 7
      Expression (Reorder Join Output)
      Header: N#83 Int32
              L#84 Nullable(String)
              toInt64(N#83) Int64
              n#93 Nullable(Int32)
              l#94 Nullable(String)
              col_0#229 Nullable(Int64)
      Actions: INPUT :: 0 -> N#83 Int32 : 0
               INPUT :: 1 -> L#84 Nullable(String) : 1
               INPUT :: 2 -> toInt64(N#83) Int64 : 2
               INPUT :: 3 -> n#93 Nullable(Int32) : 3
               INPUT :: 4 -> l#94 Nullable(String) : 4
               INPUT :: 5 -> col_0#229 Nullable(Int64) : 5
      Positions: 0 1 2 3 4 5
        FilledJoin (JOIN)
        Header: N#83 Int32
                L#84 Nullable(String)
                toInt64(N#83) Int64
                n#93 Nullable(Int32)
                l#94 Nullable(String)
                col_0#229 Nullable(Int64)
          Expression (Project)
          Header: N#83 Int32
                  L#84 Nullable(String)
                  toInt64(N#83) Int64
          Actions: INPUT : 0 -> N#83 Int32 : 0
                   INPUT :: 1 -> L#84 Nullable(String) : 1
                   FUNCTION toInt64(N#83 : 0) -> toInt64(N#83) Int64 : 2
          Positions: 0 1 2
            ReadFromPreparedSource (Read From Java Iter)
            Header: N#83 Int32
                    L#84 Nullable(String) 

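The ClickHouse plan generates a 'Post Join Filter' step, which causes the wrong result. The condition N > 1 from the ON clause is evaluated as a filter after the join, so the left row with N = 1 is dropped instead of being kept with null right-side columns.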
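
To illustrate why a filter placed after the join changes the result, here is a minimal pure-Python sketch of the two evaluation orders. It is only a model of left outer join semantics using the data from the repro, not Gluten or ClickHouse code; the helper `left_join` and the variable names are hypothetical. It does not attempt to reproduce the `0` and empty values shown for row 5.

```python
# Hypothetical pure-Python model of the two evaluation orders; not Gluten or ClickHouse code.
upper = [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"), (6, "F")]
lower = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]


def left_join(left, right, on):
    """Left outer join: every left row is kept; unmatched rows are padded with None."""
    out = []
    for lrow in left:
        matches = [rrow for rrow in right if on(lrow, rrow)]
        if matches:
            out.extend(lrow + rrow for rrow in matches)
        else:
            out.append(lrow + (None, None))
    return out


# ON (N = n) AND (N > 1): the extra condition only decides whether a pair matches.
in_join = left_join(upper, lower, lambda l, r: l[0] == r[0] and l[0] > 1)

# Join on the equality only, then filter the joined rows on N > 1 (the order the plan suggests).
post_filter = [
    row for row in left_join(upper, lower, lambda l, r: l[0] == r[0]) if row[0] > 1
]

print(in_join[0])        # (1, 'A', None, None)
print(len(in_join))      # 6
print(len(post_filter))  # 5
```

In the first form the row with N = 1 survives with null right-side columns, matching the expected behavior above. In the second form it is removed, which matches the missing row in the reported result.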
