-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: groupby.first/last loses timezone information followup #21603
Comments
Hey i would like to take up this issue. Also i am new to open source.. |
@mroeschke haven't looked in detail yet but do you know if there's a reason why we even have a compat function here? Under the impression the Cythonized version should work here, no? |
@WillAyd Just quickly looking at the code, like a general pandas/pandas/core/groupby/groupby.py Line 1436 in f1aa08c
@aggarwalvinayak the |
It may in it’s current form, but at least in theory I don’t think the first and last operations should have to behave differently based on the type being referenced. There are indexing operations for things like shift that can operate on any type arbitrarily, so seems logical that could apply to accessing the first and last elements as well. That’s a larger change I can look into and perhaps deserves a separate issue. @aggarealvinayak feel free to continue diagnosing this as prescribed above |
this needs to be fixed in cython i think |
Fair point @WillAyd, the indexing should be agnostic to the data types. Alternatively, I was thinking; why isn't
like how |
first/last i think could be in terms of nth (which came later) - nan handling is the same i think (that’s the defining issue and how they r different from head/tail) |
If you take the compat out of the equation pandas/pandas/core/groupby/groupby.py Line 2411 in f1aa08c
pandas/pandas/_libs/groupby_helper.pxi.in Line 300 in f1aa08c
|
Not sure if this is the right test, but it looks like
@aggarwalvinayak what you may want to do then is start from @WillAyd comments above and try to investigate if its possible to implement first/last in terms of the nth method. |
@aggarwalvinayak welcome to still work on this but just as a heads up I removed the "good first issue" tag as this is a little more complicated touching on Cython code |
@WillAyd I am not at all experienced with cython.. Will try to explore about that first.. Because this is my second issue that the |
xref #21573 (comment)
The example above passes data through the
first/last
compat method which strips timezone information. PR #15885 (now closed) should fix this issue (and offer a performance boost to Categorial data as mentioned in #19026)The text was updated successfully, but these errors were encountered: