Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Coral-Trino] Migrate cast transformation from RelNode layer to SqlNode layer #491

Merged
merged 24 commits into from
Apr 11, 2024

Conversation

KevinGe00
Copy link
Contributor

What changes are proposed in this pull request, and why are they necessary?

The goal of this PR is to migrate the logic of visitCast from RelNode layer (in Calcite2TrinoUDFConverter.convertRel which takes a calcite rel and transforms it to trino compatible rel) to the SqlNode layer as part of Coral IR v2. Some important changes to make this happen:

  1. New CastOperatorTransformer class which is a sqlNode transformer that handles Calcite to Trino cast operator transformation.
  2. Since CastOperatorTransformer requires RelType derivation, which requires Calcite validation, which does not preserve laterals (known and chosen behavior), we had to override this behavior as some queries which had casts and lateral joins were having their laterals incorrectly dropped as an unwanted side effect of CastOperatorTransformer.
  3. Calcite2TrinoUDFConverter deleted since there are no more transformations in the class left

How was this patch tested?

gradle clean build
regression tests for trino, avro and spark all passing

Copy link
Contributor

@wmoustafa wmoustafa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is no direct relationship between this PR and LATERAL removal. Does it makes sense to add test cases for LATERAL in this PR or a separate PR?

Also, I recall we had Coral-side code that reintroduces the LATERA before linkedin/linkedin-calcite#98. Do you know why it does not show in the LHS version of this PR?

@@ -84,8 +83,7 @@ public RelToTrinoConverter(HiveMetastoreClient mscClient, Map<String, Boolean> c
* @return SQL string
*/
public String convert(RelNode relNode) {
RelNode rel = convertRel(relNode, configs);
SqlNode sqlNode = convertToSqlNode(rel);
SqlNode sqlNode = convertToSqlNode(relNode);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you ignore the configs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convertRel(relNode, configs) is a method from the now deleted Calcite2TrinoUDFConverter. I believe the logic of configs has been moved to CoralToTrinoSqlCallConverter

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aastha25 could you verify this change is safe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be safe to remove.
Removing the Rel transformation layer, Calcite2TrinoUDFConverter, in Coral IR -> Trino SQL is a huge milestone. Thanks @KevinGe00 for getting us there :)

gradle/dependencies.gradle Show resolved Hide resolved
@KevinGe00
Copy link
Contributor Author

Looks like there is no direct relationship between this PR and LATERAL removal. Does it makes sense to add test cases for LATERAL in this PR or a separate PR?

The goal here is to have no test regressions caused by dropped LATERALs since we know that would be semantically incorrect. There's no need to add any additional tests in this PR because if the current tests pass, we know that the LATERAL was added back correctly by linkedin/linkedin-calcite#98. Therefore we are already testing LATERAL in Coral.

Also, I recall we had Coral-side code that reintroduces the LATERA before linkedin/linkedin-calcite#98. Do you know why it does not show in the LHS version of this PR?

We introduce LATERAL in RelToTrinoConverter.convertToSqlNode which happens before CastOperatorTransformer, where LATERAL is then dropped again due to type derivation.

@wmoustafa
Copy link
Contributor

we know that the LATERAL was added back correctly by linkedin/linkedin-calcite#98

I was suggesting to add Coral-side tests for the same issue. Could be inspired by the failing regression integration tests.

Also, I recall we had Coral-side code that reintroduces the LATERAL before linkedin/linkedin-calcite#98

I was referring to code that @aastha25 introduced to manually reinstate LATERAL after validation. I think it was in TypeValidationUtil if I recall correctly. The reason I am asking is that now that we appropriately handle LATERAL in calcite, this reinstating code is no longer needed and should appear as deleted in the LHS of the PR diff.

@KevinGe00
Copy link
Contributor Author

KevinGe00 commented Apr 9, 2024

I was suggesting to add Coral-side tests for the same issue. Could be inspired by the failing regression integration tests.

We already have Coral-side unit tests that capture this issue. As long as the test input query as a CAST operator and a LATERAL, the missing LATERAL post-validation issue will be tested.

to manually reinstate LATERAL after validation. I think it was in TypeValidationUtil if I recall correctly

I believe you're talking about the post processor @aastha25 wrote a long time ago to manually add the LATERAL back. This code was never commited/merged in the first place since the patch was a WIP so it wouldn't need to be deleted (deleted in the LHS of the PR diff).

Copy link
Contributor

@wmoustafa wmoustafa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after minor Java Doc changes. Thanks for all the deep investigation down to the LATERAL issue in Calcite. Also Glad to see Calcite2TrinoUDFCconverter finally gone. Great work!

@KevinGe00
Copy link
Contributor Author

@wmoustafa Addressed the last 2 javadoc comments. Thanks for the thorough review :)

@@ -44,7 +45,7 @@ public DataTypeDerivedSqlCallConverter(HiveMetastoreClient mscClient, SqlNode to
operatorTransformerList = SqlCallTransformers.of(new FromUtcTimestampOperatorTransformer(typeDerivationUtil),
new GenericProjectTransformer(typeDerivationUtil), new NamedStructToCastTransformer(typeDerivationUtil),
new ConcatOperatorTransformer(typeDerivationUtil), new SubstrOperatorTransformer(typeDerivationUtil),
new UnionSqlCallTransformer(typeDerivationUtil));
new CastOperatorTransformer(typeDerivationUtil), new UnionSqlCallTransformer(typeDerivationUtil));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the ordering relevant here? Generally we add the new transformer at the end. But in this case, CastOperatorTransformer is appended before UnionSqlCallTransformer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Order shouldn't matter here since UnionSqlCallTransformer doesn't create any types of cast that CastOperatorTransformer would need to transform.

@aastha25
Copy link
Contributor

Could you also add in the PR description that this patch has been tested against the li-trino-hotfix branch?

rest LGTM.

Copy link
Collaborator

@ljfgem ljfgem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM.
Thank you for removing the last RelNode layer transformation in Coral-Trino, this is a great milestone!

Copy link
Collaborator

@ljfgem ljfgem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@KevinGe00 KevinGe00 merged commit e09015e into linkedin:master Apr 11, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants