Inquiry on Guaranteeing Rewrite Equivalence and Request for Rewritten Results #3

Open
kr11 opened this issue Oct 22, 2024 · 1 comment

kr11 commented Oct 22, 2024

Thank you for your valuable work!

About Equivalence
Could you explain how LLM-R2 ensures that the rewritten SQL is equivalent to the original SQL? We have observed that Calcite sometimes produces non-equivalent rewrites for MySQL. I am not familiar with how it behaves on PostgreSQL.

About rewritten results
LLM-R2 shows impressive improvements on the TPCH, IMDB, and DSB benchmarks, yet the rewritten SQL queries do not appear to be in the repository. Could you please upload the rewritten queries for the TPCH benchmark? Your assistance would be highly appreciated.

Collaborator

LZ12DH commented Nov 13, 2024

Hi,

Thank you for your feedback!

It is interesting to know that Calcite's rewrite tool may produce non-equivalent outcomes for MySQL. We did not run any experiments on MySQL, and we would appreciate it if you could share some such examples. So far, on PostgreSQL, we assume that Calcite always produces equivalent rewrites, and we have not observed any non-equivalent cases. I think this is also true for the Learned Rewrite paper [1].
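For anyone who wants to spot-check a rewrite empirically, one option is differential testing: run the original and the rewritten query on the same data and compare the returned rows as multisets. The sketch below is only an illustration and is not part of LLM-R2 or Calcite. It uses Python's built-in `sqlite3` with an in-memory database so that it is self-contained; the `results_match` helper, the `orders` table, and the example queries are all hypothetical. Matching results on one database instance do not prove equivalence, they can only fail to find a counterexample, and this comparison deliberately ignores row order.

```python
import sqlite3
from collections import Counter


def results_match(conn, original_sql, rewritten_sql):
    """Return True if both queries return the same multiset of rows.

    Row order is ignored, so this is not suitable for checking ORDER BY.
    """
    original_rows = Counter(conn.execute(original_sql).fetchall())
    rewritten_rows = Counter(conn.execute(rewritten_sql).fetchall())
    return original_rows == rewritten_rows


# Hypothetical toy data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
    INSERT INTO orders VALUES (1, 'a', 10.0), (2, 'b', 25.0), (3, 'a', 40.0);
""")

original = (
    "SELECT customer FROM orders "
    "WHERE id IN (SELECT id FROM orders WHERE total > 20)"
)
# A rewrite that removes the redundant subquery.
good_rewrite = "SELECT customer FROM orders WHERE total > 20"
# A rewrite that changes the predicate and therefore the result.
bad_rewrite = "SELECT customer FROM orders WHERE total >= 10"

print(results_match(conn, original, good_rewrite))
print(results_match(conn, original, bad_rewrite))
```

To check against PostgreSQL or MySQL instead, the same comparison should work with any DB-API connection whose cursor returns rows as tuples, after replacing the `conn.execute` shortcut (specific to `sqlite3`) with an explicit cursor.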

For rewritten results, you may check out the 'data/data_llmr2/pools' folder. Some of our rewrite examples on all three datasets can be found there.

[1] Zhou et al., A Learned Query Rewrite System using Monte Carlo Tree Search, VLDB 2022
