BUG: nlargest raises TypeError "No matching signature found" on Float64Dtype Series, versions >1.3.0 #42816

Closed
hottwaj opened this issue Jul 30, 2021 · 4 comments · Fixed by #42838
Labels
  • Algos: Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff
  • Bug
  • NA - MaskedArrays: Related to pd.NA and nullable extension arrays
  • Regression: Functionality that used to work in a prior pandas version
Comments

@hottwaj
hottwaj commented Jul 30, 2021

  • [yes] I have checked that this issue has not already been reported.

  • [yes] I have confirmed this bug exists on the latest version of pandas.

Code Sample, a copy-pastable example

import numpy
import pandas

# this works:
pandas.Series(numpy.random.random(10)).nlargest(5)

# this works on pandas 1.2.5 but fails on 1.3.0 and 1.3.1. All Float64Dtype Series seem to have the same issue
pandas.Series(numpy.random.random(10)).astype('Float64').nlargest(5)

Software/hardware

python version: 3.8.10
Ubuntu 20.04
(Intel Tiger Lake CPU)

Stack trace

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_213893/2592114079.py in <module>
----> 1 pandas.Series(numpy.random.random(10)).astype('Float64').nlargest(5)

~/.pyenv/versions/3.8.10/envs/myvenv/lib/python3.8/site-packages/pandas/core/series.py in nlargest(self, n, keep)
   3764         dtype: int64
   3765         """
-> 3766         return algorithms.SelectNSeries(self, n=n, keep=keep).nlargest()
   3767 
   3768     def nsmallest(self, n: int = 5, keep: str = "first") -> Series:

~/.pyenv/versions/3.8.10/envs/myvenv/lib/python3.8/site-packages/pandas/core/algorithms.py in nlargest(self)
   1217 
   1218     def nlargest(self):
-> 1219         return self.compute("nlargest")
   1220 
   1221     def nsmallest(self):

~/.pyenv/versions/3.8.10/envs/myvenv/lib/python3.8/site-packages/pandas/core/algorithms.py in compute(self, method)
   1285         # arr passed into kth_smallest must be contiguous. We copy
   1286         # here because kth_smallest will modify its input
-> 1287         kth_val = algos.kth_smallest(arr.copy(order="C"), n - 1)
   1288         (ns,) = np.nonzero(arr <= kth_val)
   1289         inds = ns[arr[ns].argsort(kind="mergesort")]

~/.pyenv/versions/3.8.10/envs/myvenv/lib/python3.8/site-packages/pandas/_libs/algos.pyx in pandas._libs.algos.__pyx_fused_cpdef()

TypeError: No matching signature found
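
For readers following the trace: the three lines shown from compute() find the k-th smallest value, keep every position at or below it, and stable-sort those positions. Below is a rough pure-Python analogue of that selection step, not the pandas implementation. The function names are hypothetical, and the final truncation to n and the negation used for the "largest" case are assumptions about code the trace does not show.

```python
import heapq


def n_smallest_positions(values, n):
    """Positions of the n smallest values; ties keep first-occurrence order."""
    if n <= 0 or not values:
        return []
    n = min(n, len(values))
    # Step 1: the value at sorted rank n - 1 (the role kth_smallest plays in the trace).
    kth_val = heapq.nsmallest(n, values)[-1]
    # Step 2: keep every position whose value is <= that threshold.
    candidates = [i for i, v in enumerate(values) if v <= kth_val]
    # Step 3: stable sort by value (list.sort is stable, like mergesort), then cut to n.
    candidates.sort(key=lambda i: values[i])
    return candidates[:n]


def n_largest_positions(values, n):
    """Assumed approach: reuse the 'smallest' routine on negated values."""
    return n_smallest_positions([-v for v in values], n)


values = [3.0, 1.0, 2.0, 1.0, 5.0]
print(n_smallest_positions(values, 3))
print(n_largest_positions(values, 2))
```

The point of the sketch is that step 1 needs a plain numeric buffer; the TypeError above is raised at that step, when the fused-type Cython function finds no signature matching the array it receives.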

Thanks!

@hottwaj added the Bug and Needs Triage labels Jul 30, 2021
simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this issue Jul 30, 2021
@simonjayhawkins
Member

Thanks @hottwaj for the report.

this works on pandas 1.2.5 but fails on 1.3.0 and 1.3.1. All Float64DType Series seem to have the same issue

first bad commit: [b4375a4] REF: avoid unnecessary casting in algorithms (#41256)

cc @jbrockmendel

@simonjayhawkins added the Algos and NA - MaskedArrays labels and removed the Needs Triage label Jul 30, 2021
@simonjayhawkins added this to the 1.3.2 milestone Jul 30, 2021
@simonjayhawkins added the Regression label Jul 30, 2021
@jbrockmendel
Member

It's easy to make a (kludgy) shim that fixes cases without any NAs, but the general case needs some thought.

@mzeitlin11
Member

xref #42737

@jbrockmendel
Member

3 options here

  1. make nsmallest/nlargest an EA method for authors to override
  2. for our masked arrays, in cases where there are no NAs, we can kludge _ensure_data to work
  3. make libalgos.kth_smallest support object dtype
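
A user-level sketch in the spirit of option 2, for anyone hitting this on 1.3.0 or 1.3.1: drop the NAs, run nlargest on a plain float64 Series (the path the report shows working), then cast back. The helper name nlargest_nullable is hypothetical and nobody in the thread proposed it. It is a workaround under those assumptions, not the fix that was merged.

```python
import pandas as pd


def nlargest_nullable(s: pd.Series, n: int = 5) -> pd.Series:
    """Hypothetical helper: nlargest for a nullable-float Series via plain float64."""
    # Remove pd.NA entries so the remaining values convert cleanly to float64.
    valid = s.dropna()
    # A NumPy-backed float64 Series takes the code path that already works.
    top = valid.astype("float64").nlargest(n)
    # Cast back so the caller still gets the original nullable dtype.
    return top.astype(s.dtype)


s = pd.Series([1.5, None, 3.5, 2.5], dtype="Float64")
print(nlargest_nullable(s, 2))
```

NAs are dropped here on the assumption that they should not appear in the result, which matches how nlargest treats NaN on an ordinary float64 Series.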
