COMPAT: safe argsort in Index/Series #17010

Closed
@jreback

In [1]: Index([0, '1']).sort_values()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-c2949e5a9d7f> in <module>()
----> 1 Index([0, '1']).sort_values()

/Users/jreback/pandas/pandas/core/indexes/base.py in sort_values(self, return_indexer, ascending)
   2026         Return sorted copy of Index
   2027         """
-> 2028         _as = self.argsort()
   2029         if not ascending:
   2030             _as = _as[::-1]

/Users/jreback/pandas/pandas/core/indexes/base.py in argsort(self, *args, **kwargs)
   2089         if result is None:
   2090             result = np.array(self)
-> 2091         return result.argsort(*args, **kwargs)
   2092 
   2093     def __add__(self, other):

TypeError: '>' not supported between instances of 'str' and 'int'

In [7]: Series([0, '1']).sort_values()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/jreback/pandas/pandas/core/series.py in _try_kind_sort(arr)
   1762                 # if kind==mergesort, it can fail for object dtype
-> 1763                 return arr.argsort(kind=kind)
   1764             except TypeError:

TypeError: '>' not supported between instances of 'str' and 'int'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-7-b0c6ec3dcb93> in <module>()
----> 1 Series([0, '1']).sort_values()

/Users/jreback/pandas/pandas/core/series.py in sort_values(self, axis, ascending, inplace, kind, na_position)
   1775         idx = _default_index(len(self))
   1776 
-> 1777         argsorted = _try_kind_sort(arr[good])
   1778 
   1779         if is_list_like(ascending):

/Users/jreback/pandas/pandas/core/series.py in _try_kind_sort(arr)
   1765                 # stable sort not available for object dtype
   1766                 # uses the argsort default quicksort
-> 1767                 return arr.argsort(kind='quicksort')
   1768 
   1769         arr = self._values

TypeError: '>' not supported between instances of 'str' and 'int'

Both failures can be fixed by falling back to .get_indexer() when .argsort raises. .argsort is faster and handles duplicates for object dtypes, so it should stay the default path, with the fallback used only on failure.
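
A minimal sketch of what such a fallback could look like, not the actual pandas change. The helper name `safe_argsort` is hypothetical, and the "numbers before strings" ordering is an assumption made for illustration. It also assumes the non-string values are mutually comparable (for example, no NaN). The fast path is a plain `.argsort()`. On `TypeError`, the unique values are ordered once and `Index.get_indexer()` maps each element to its rank, which keeps duplicates working because the ranks are plain integers.

```python
import numpy as np
import pandas as pd


def safe_argsort(values):
    """Hypothetical helper: argsort that tolerates mixed int/str objects."""
    arr = np.asarray(values, dtype=object)
    try:
        # Fast path: plain argsort works whenever all elements are comparable.
        return arr.argsort()
    except TypeError:
        pass
    # Fallback: order the unique values, numbers first and then strings
    # (an assumed ordering), and map every element to its rank.
    uniques = pd.unique(arr)
    numbers = sorted(v for v in uniques if not isinstance(v, str))
    strings = sorted(v for v in uniques if isinstance(v, str))
    ordered = pd.Index(numbers + strings, dtype=object)
    ranks = ordered.get_indexer(arr)
    # Ranks are integers, so a stable sort is always available here.
    return ranks.argsort(kind="mergesort")


mixed = ["b", 2, "a", 1, 2]
order = safe_argsort(mixed)
print([mixed[i] for i in order])
```

With something along these lines, Index.sort_values() and Series.sort_values() could take the indexer from the fallback instead of propagating the TypeError.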

Labels: Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff), Bug, Closing Candidate (may be closeable, needs more eyeballs)
