BUG: Series.xs() inconsistent with DataFrame.xs() with MultiIndex #5684

Closed
@TomAugspurger

Edited to clarify the bug

Series.xs slice fails with string index labels and MultiIndex:

In [1]: idx = pd.MultiIndex.from_tuples([('a', 'one'), ('a', 'two'), ('b', 'one'), ('b', 'two')])

In [2]: df = pd.Series(np.random.randn(4), index=idx)

In [4]: df.index.set_names(['L1', 'L2'], inplace=True)

In [5]: df
Out[5]: 
L1  L2 
a   one   -0.136418
    two   -0.346941
b   one   -1.468534
    two    1.217693
dtype: float64

In [6]: df.xs('one', level='L2')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-6-52601adf5184> in <module>()
----> 1 df.xs('one', level='L2')

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/series.pyc in _xs(self, key, axis, level, copy)
    437 
    438     def _xs(self, key, axis=0, level=None, copy=True):
--> 439         return self.__getitem__(key)
    440 
    441     xs = _xs

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/series.pyc in __getitem__(self, key)
    482     def __getitem__(self, key):
    483         try:
--> 484             return self.index.get_value(self, key)
    485         except InvalidIndexError:
    486             pass

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/index.pyc in get_value(self, series, key)
   2294                     raise InvalidIndexError(key)
   2295                 else:
-> 2296                     raise e1
   2297             except Exception:  # pragma: no cover
   2298                 raise e1

KeyError: 'one'

The same slice works on a DataFrame.
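For comparison, here is a minimal sketch of the DataFrame case on the same index. It uses fixed values instead of `np.random.randn` so the result is predictable; the names `frame`, `frame_slice`, and the column `value` are illustrative and not part of the original session.

```python
import numpy as np
import pandas as pd

# Same two-level index as in the Series example, with the level names set up front.
idx = pd.MultiIndex.from_tuples(
    [('a', 'one'), ('a', 'two'), ('b', 'one'), ('b', 'two')],
    names=['L1', 'L2'],
)

# Fixed values (0.0 .. 3.0) instead of random ones, purely for illustration.
frame = pd.DataFrame({'value': np.arange(4.0)}, index=idx)

# Cross-section on the second level: keeps the rows where L2 == 'one'
# and drops L2 from the resulting index.
frame_slice = frame.xs('one', level='L2')
print(frame_slice)
```

The result is indexed by `L1` only, which is the shape one would expect `Series.xs` to mirror.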

Previous post below:

In [12]: idx = pd.MultiIndex.from_tuples([('a', 0), ('a', 1), ('b', 0), ('b', 1)])

In [13]: df = pd.Series(np.random.randn(4), index=idx)

In [14]: df
Out[14]: 
a  0    0.876121
   1    0.638050
b  0    0.965934
   1    1.061716
dtype: float64

In [15]: df.xs(0, level=1)   # returns scalar
Out[15]: 0.87612104445620753

In [16]: df.index.names = ['L1', 'L2']

In [27]: df.xs(0, level='L2')   # returns scalar
Out[27]: -0.98585685847339011

In [28]: df.xs(0, level='L1')   # No key error
Out[28]: -0.98585685847339011

Works for DataFrames:

In [30]: df.xs(0, level='L2')
Out[30]: 
           0
L1          
a  -0.985857
b   0.648114

[2 rows x 1 columns]

Series.xs also seems to fail on string index labels?

In [50]: idx = pd.MultiIndex.from_tuples([('a', 'one'), ('a', 'two'), ('b', 'one'), ('b', 'two')])
In [53]: df = pd.Series(np.random.randn(4), index=idx)
In [56]: df.xs('one', level='L2')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-56-52601adf5184> in <module>()
----> 1 df.xs('one', level='L2')

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/series.pyc in _xs(self, key, axis, level, copy)
    437 
    438     def _xs(self, key, axis=0, level=None, copy=True):
--> 439         return self.__getitem__(key)
    440 
    441     xs = _xs

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/series.pyc in __getitem__(self, key)
    482     def __getitem__(self, key):
    483         try:
--> 484             return self.index.get_value(self, key)
    485         except InvalidIndexError:
    486             pass

/Users/tom/Envs/pandas-dev/lib/python2.7/site-packages/pandas-0.13.0rc1_27_g4d5ca5c-py2.7-macosx-10.8-x86_64.egg/pandas/core/index.pyc in get_value(self, series, key)
   2294                     raise InvalidIndexError(key)
   2295                 else:
-> 2296                     raise e1
   2297             except Exception:  # pragma: no cover
   2298                 raise e1

KeyError: 'one'

So I guess this is about 3 errors on Series.xs (possibly related?):

  1. Returning a scalar instead of a Series when the label is an integer
  2. Not raising a KeyError when an integer label does not exist in the requested level
  3. Raising a KeyError on slices of the second level (level='L2') when the label is a string.

EDIT: Oh, and I know that .loc / .ix will work for these. I was just surprised by the results.
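To make the expected behaviour concrete, here is a rough sketch of what I would have expected `Series.xs(key, level=...)` to do, written with a boolean mask on the level values rather than with `xs` itself. The helper `xs_by_level` is hypothetical (it is not a pandas function), and the fixed values are only there to keep the output predictable.

```python
import pandas as pd


def xs_by_level(series, key, level):
    """Hypothetical helper: rows of `series` whose `level` equals `key`.

    Always returns a Series (never a scalar) and raises KeyError
    when `key` does not occur in the requested level.
    """
    mask = series.index.get_level_values(level) == key
    if not mask.any():
        raise KeyError(key)
    # Drop the level that was selected on, like DataFrame.xs does.
    return series[mask].droplevel(level)


# Integer labels in the second level.
int_idx = pd.MultiIndex.from_tuples(
    [('a', 0), ('a', 1), ('b', 0), ('b', 1)], names=['L1', 'L2']
)
s_int = pd.Series([10.0, 11.0, 12.0, 13.0], index=int_idx)

# String labels in the second level.
str_idx = pd.MultiIndex.from_tuples(
    [('a', 'one'), ('a', 'two'), ('b', 'one'), ('b', 'two')], names=['L1', 'L2']
)
s_str = pd.Series([20.0, 21.0, 22.0, 23.0], index=str_idx)

by_int = xs_by_level(s_int, 0, 'L2')      # point 1: a Series, not a scalar
by_str = xs_by_level(s_str, 'one', 'L2')  # point 3: string label on the second level

# Point 2: 0 is not a label in L1, so a KeyError is expected.
try:
    xs_by_level(s_int, 0, 'L1')
    missing_raised = False
except KeyError:
    missing_raised = True

print(by_int)
print(by_str)
print(missing_raised)
```

This is only meant to pin down the semantics described in the three points above; it says nothing about how the fix should be implemented inside pandas.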

Labels: Bug, Indexing (related to indexing on series/frames, not to indexes themselves), MultiIndex
