str.get fails if Series contains dict #20671

Closed
@datapythonista

Description

Code Sample, a copy-pastable example if possible

>>> s = pandas.Series([{0: 'a', 1: 'b'}])
>>> s
0    {0: 'a', 1: 'b'}
dtype: object
>>> s.str.get(-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mgarcia/.anaconda3/lib/python3.6/site-packages/pandas/core/strings.py", line 1556, in get
    result = str_get(self._data, i)
  File "/home/mgarcia/.anaconda3/lib/python3.6/site-packages/pandas/core/strings.py", line 1264, in str_get
    return _na_map(f, arr)
  File "/home/mgarcia/.anaconda3/lib/python3.6/site-packages/pandas/core/strings.py", line 156, in _na_map
    return _map(f, arr, na_mask=True, na_value=na_result, dtype=dtype)
  File "/home/mgarcia/.anaconda3/lib/python3.6/site-packages/pandas/core/strings.py", line 171, in _map
    result = lib.map_infer_mask(arr, f, mask.view(np.uint8), convert)
  File "pandas/_libs/src/inference.pyx", line 1482, in pandas._libs.lib.map_infer_mask
  File "/home/mgarcia/.anaconda3/lib/python3.6/site-packages/pandas/core/strings.py", line 1263, in <lambda>
    f = lambda x: x[i] if len(x) > i >= -len(x) else np.nan
KeyError: -1

Problem description

str.get is designed for strings, but it is also useful with other structures such as lists, for which it works fine. When the values of the Series are dicts, str.get uses the provided index as a dictionary key and fails with a KeyError if that key is not present.

I think it's more consistent with the rest of pandas to simply return numpy.nan when this happens.
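
To illustrate the behaviour I have in mind: in the traceback above, the bounds check `len(x) > i >= -len(x)` passes for a two-item dict with `i = -1`, so `x[-1]` is evaluated as a key lookup and raises. Below is a minimal sketch of a lookup that returns NaN instead. `safe_get` is a hypothetical helper written for this example, not an existing pandas function, and it is applied with `Series.map` rather than through the `.str` accessor.

```python
import numpy as np
import pandas as pd


def safe_get(value, key):
    """Hypothetical helper: return value[key], or NaN when the lookup fails."""
    if isinstance(value, dict):
        # Dicts are looked up by key, so a missing key maps to NaN.
        return value.get(key, np.nan)
    try:
        # Sequences (str, list, tuple) are looked up by position,
        # using the same bounds check as the lambda in the traceback.
        if len(value) > key >= -len(value):
            return value[key]
    except TypeError:
        # Values without a length (e.g. NaN) or non-integer keys end up here.
        pass
    return np.nan


s = pd.Series([{0: 'a', 1: 'b'}, 'xyz', ['p', 'q']])
result = s.map(lambda v: safe_get(v, -1))
print(result)
```

With this sketch the dict element yields NaN because `-1` is not one of its keys, while the string and list elements still return their last item.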

Expected Output

>>> s = pandas.Series([{0: 'a', 1: 'b'}])
>>> s
0    {0: 'a', 1: 'b'}
dtype: object
>>> s.str.get(-1)
0    NaN

Output of pd.show_versions()

>>> pandas.show_versions()

INSTALLED VERSIONS

commit: fa231e8
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.13-100.fc23.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.utf8
LOCALE: en_GB.UTF-8

pandas: 0.23.0.dev0+740.gfa231e8.dirty
pytest: 3.1.3
pip: 9.0.1
setuptools: 38.5.1
Cython: 0.27.3
numpy: 1.14.0
scipy: 1.0.0
pyarrow: 0.8.0
xarray: 0.10.0
IPython: 6.2.1
sphinx: 1.5
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: 0.4.0
matplotlib: 2.1.2
openpyxl: 2.5.0
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: 0.8.0
psycopg2: None
jinja2: 2.10
s3fs: 0.1.3
fastparquet: 0.1.4
pandas_gbq: None
pandas_datareader: None

Labels: Bug, Strings
