Why do we recommend to disable "preferSortMergeJoin"? #6133
Replies: 2 comments 1 reply
-
@zhouyuan @z123 @PHILO-HE Do you have any information to share? For example, do you use SortMergeJoin or ShuffledHashJoin in production?
-
Hi @xumingming, I don't have much information about production environments, but in terms of functionality and performance in Gluten/Velox, hash join is better. We have also been improving the merge join code path in Velox recently, but it still needs more testing and validation from Gluten users. Thanks, -yuan
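For readers following along, the setting under discussion is Spark's `spark.sql.join.preferSortMergeJoin`, which defaults to `true`. A minimal sketch of turning it off in `spark-defaults.conf`, assuming no other join-related settings are changed:

```properties
# Spark SQL: do not prefer sort-merge join over shuffled hash join.
# Spark's default is true; the Gluten configuration doc recommends false.
spark.sql.join.preferSortMergeJoin  false
```

The same key can be passed per job with `--conf spark.sql.join.preferSortMergeJoin=false`. This thread does not include benchmark numbers, so the effect on a given workload is worth measuring rather than assuming.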
-
incubator-gluten/docs/Configuration.md
Line 21 in 800cadd
Just curious, why do we recommend disabling `preferSortMergeJoin`? Do we have any benchmark results? It would be great if you could share them 👍 The reason I ask is that Spark claims SortMergeJoin works better for large tables: