ValueError in Categorical Constructor with empty data and boolean categories #22702

Closed
@TomAugspurger

This works:

In [15]: pd.Categorical([], categories=['a', 'b'])
Out[15]: [], Categories (2, object): [a, b]

This doesn't:

In [16]: pd.Categorical([], categories=[True, False])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-8e79cd310199> in <module>()
----> 1 pd.Categorical([], categories=[True, False])

~/sandbox/pandas/pandas/core/arrays/categorical.py in __init__(self, values, categories, ordered, dtype, fastpath)
    426
    427         else:
--> 428             codes = _get_codes_for_values(values, dtype.categories)
    429
    430         if null_mask.any():

~/sandbox/pandas/pandas/core/arrays/categorical.py in _get_codes_for_values(values, categories)
   2449     (_, _), cats = _get_data_algo(categories, _hashtables)
   2450     t = hash_klass(len(cats))
-> 2451     t.map_locations(cats)
   2452     return coerce_indexer_dtype(t.lookup(vals), cats)
   2453

~/sandbox/pandas/pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.map_locations()
   1330             raise KeyError(key)
   1331
-> 1332     def map_locations(self, ndarray[object] values):
   1333         cdef:
   1334             Py_ssize_t i, n = len(values)

ValueError: Buffer dtype mismatch, expected 'Python object' but got 'unsigned long'

At that point, values is array([], dtype=object). It should be an int dtype by this point.
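
To check whether a given pandas install is affected, the two calls above can be wrapped in a small script. This is only a sketch: `try_empty_categorical` is a hypothetical helper name (not part of pandas), and it assumes the failure surfaces as the ValueError shown in the traceback.

```python
import pandas as pd


def try_empty_categorical(categories):
    """Build an empty Categorical with the given categories.

    Hypothetical helper for this report, not a pandas API.
    Returns (True, <categories dtype as str>) on success,
    or (False, <error message>) if a ValueError is raised.
    """
    try:
        cat = pd.Categorical([], categories=categories)
    except ValueError as exc:
        return False, str(exc)
    return True, str(cat.categories.dtype)


if __name__ == "__main__":
    # The string categories are the working case from the report;
    # the boolean categories are the case that raised.
    for cats in (["a", "b"], [True, False]):
        ok, detail = try_empty_categorical(cats)
        print(cats, "ok" if ok else "ValueError", detail)
```

On an affected version the second line should report the buffer dtype mismatch; on a version where this has been fixed, both lines should report success.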
