-
Notifications
You must be signed in to change notification settings - Fork 170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for polling. #18
base: master
Are you sure you want to change the base?
Add support for polling. #18
Conversation
Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to cla@prestosql.io. For more information, see https://github.com/prestosql/cla. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you confirm the CI failure?
Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to cla@prestosql.io. For more information, see https://github.com/prestosql/cla. |
@ebyhr There was a trivial oversight in keeping track of progress when requesting one row at a time. I've pushed the fix, and signed the CLA. |
@cla-bot check |
The cla-bot has been summoned, and re-checked this pull request! |
We're running into the same issue with long pulling queries. Any chance we could see this being merged in a near future? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a bunch of backward incompatible changes here
- removal of result()
- removal of rownumber()
- removal of public fetch() (it's now internal)
- fetchone used to be mean "fetch first", now its "fetch only"
Are these changes intentional?
If there is a good reason for making them, let's make introduce
them as separate commits, with explanation in the messages.
If the goal is to add only poll(), i think some of the changes can
be avoided.
@matthewwardrop Are you planning to work on this soon? |
@amitds1997 I'm not sure. My original PRs to the presto-python-client and then this one have "enjoyed" very slow review times, making it challenging to make much headway (by the time a review occurs, I've not thought much about the patch for many months). @findepi's comments were valid ones, but as far as I can tell I already answered these concerns in the original PR description. I'm happy to restage the PR as a series of commits if that is helpful, but note that it won't be entirely straightforward: these changes hang together, and each standalone commit won't make a tonne of sense without the context of the others. I'm also happy if someone else closer to the project wants to reimplement things in a different way, but after a review of this patch I still feel this is a pretty clean implementation. It does make several "unnecessary" changes, but only in the sense that you although you could achieve polling without the other changes, the extra changes prevent weird epicycles in the code and the risk of losing data depending on the order in which you call methods on the class instances. I originally wrote this patch so I could migrate to this more official client from |
This PR was originally put up against the original prestodb version of this library, which now seems to be defunct; and adds support for polling the status of presto queries, decoupling it from collection of results (fixing #63 on that repository). This is important especially for long-running queries.
Changes:
PrestoResult
and merged its functionality withPrestoQuery
. Keeping both led to some weird logic loops as each tried to keep the other up to date once polling was added.PrestoQuery.fetch
a private methodPrestoQuery._fetch
, since calling it directly leads to the iterator not being in sync.PrestoQuery
object cache the query results, otherwise there is some weird behaviour whereby if you stop iterating over the rows, you throw away all of the rows left over in the last retrieved chunk.fetchone
/fetchmany
relationship so thatfetchone
is a special case offetchmany
(because the changes above allow multiple iterators to be created over time, so long as there is only one at time, and in this way we don't create a new generator for every row).