Description
See the example below: if n is large, get_loc returns a slice; otherwise it returns an integer. The threshold at which n counts as large varies from run to run (but is frequently 25 or 50).
http://stackoverflow.com/questions/22067205/when-does-pandas-xs-drop-dimensions-and-how-can-i-force-it-to-not-to
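import numpy as np
import pandas as pd

n = 23
# Random three-level index, plus one guaranteed (-1, -1, -1) row appended at the end
df = pd.DataFrame({'a': np.append(np.random.randint(0, 10, n), -1),
                   'b': np.append(np.random.randint(0, 10, n), -1),
                   'c': np.append(np.random.randint(0, 10, n), -1),
                   'value': np.random.randint(0, 100, n + 1)})
df.set_index(['a', 'b', 'c'], inplace=True)
df.sortlevel(inplace=True)  # sortlevel was later deprecated in favor of sort_index
# display(df.xs((-1, -1, -1)))
df.index.get_loc((-1, -1, -1))  # integer or slice, depending on n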
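To see how the return type varies with n, the snippet above can be wrapped in a small probe. This is only a sketch: the helper name get_loc_kind and the fixed seed are assumptions added for illustration, and it uses sort_index instead of sortlevel so that it also runs on recent pandas releases. The types it reports depend on the pandas version and on whether the random index happens to contain duplicates.

```python
import numpy as np
import pandas as pd


def get_loc_kind(n, seed=0):
    """Build the frame from the report and return the type name of get_loc's result."""
    rng = np.random.RandomState(seed)  # fixed seed (an assumption) for repeatable runs
    df = pd.DataFrame({'a': np.append(rng.randint(0, 10, n), -1),
                       'b': np.append(rng.randint(0, 10, n), -1),
                       'c': np.append(rng.randint(0, 10, n), -1),
                       'value': rng.randint(0, 100, n + 1)})
    df = df.set_index(['a', 'b', 'c']).sort_index()
    loc = df.index.get_loc((-1, -1, -1))
    return type(loc).__name__


# Probe a few sizes; the key (-1, -1, -1) is always present exactly once.
kinds = {n: get_loc_kind(n) for n in (5, 23, 50, 200)}
print(kinds)
```

Changing the seed or the list of sizes makes it easy to check whether the switch between integer and slice lines up with duplicates appearing in the index.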
The direct consequence is that xs now returns either a Series or a DataFrame (even when there is only one match) nondeterministically, depending on whether get_loc returns an integer or a slice.
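Until the return type is consistent, one possible workaround is to normalize whatever get_loc returns before selecting rows. The helper below, rows_for_key, is a hypothetical name and not part of pandas; it is a minimal sketch that assumes get_loc yields an integer, a slice, or a boolean mask, and it always hands back a DataFrame.

```python
import numpy as np
import pandas as pd


def rows_for_key(df, key):
    """Return the rows matching `key` as a DataFrame, whatever get_loc returns."""
    loc = df.index.get_loc(key)
    if isinstance(loc, (int, np.integer)):
        # A single position: widen it to a one-row slice so iloc keeps a DataFrame.
        loc = slice(loc, loc + 1)
    # Slices and boolean masks are accepted by iloc and already yield a DataFrame.
    return df.iloc[loc]


# Small illustrative frame: (0, 1, 2) is duplicated, (-1, -1, -1) occurs once.
df = pd.DataFrame({'a': [0, 0, 1, -1],
                   'b': [1, 1, 2, -1],
                   'c': [2, 2, 3, -1],
                   'value': [10, 20, 30, 40]})
df = df.set_index(['a', 'b', 'c']).sort_index()

single = rows_for_key(df, (-1, -1, -1))   # one matching row, still a DataFrame
several = rows_for_key(df, (0, 1, 2))     # two matching rows
```

Unlike xs, this keeps the index levels on the result, so callers that relied on xs dropping them would need to drop those levels themselves.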
What is more, if the key is not in the index, get_loc sometimes raises a KeyError and sometimes returns slice(0, 0, None). Try df.index.get_loc((-2,-1,-1)) several times and you will see. I suspect it depends on whether there are duplicate values in the MultiIndex.
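Code that needs to know whether a key exists can guard against both outcomes described above. The function key_present below is a hypothetical helper, not a pandas API; it is a sketch that treats a KeyError and an empty slice (or an all-False mask) the same way.

```python
import numpy as np
import pandas as pd


def key_present(index, key):
    """Return True if `key` matches at least one row of `index`."""
    try:
        loc = index.get_loc(key)
    except KeyError:
        return False
    if isinstance(loc, slice):
        # An empty slice such as slice(0, 0, None) matches no rows.
        return len(range(*loc.indices(len(index)))) > 0
    if isinstance(loc, np.ndarray):
        # Boolean mask: present only if at least one position is True.
        return bool(loc.any())
    return True  # a plain integer position


# Small illustrative index with one duplicated key.
idx = pd.MultiIndex.from_tuples(
    [(-1, -1, -1), (0, 1, 2), (0, 1, 2), (1, 2, 3)], names=['a', 'b', 'c'])

found = key_present(idx, (-1, -1, -1))
missing = key_present(idx, (-2, -1, -1))
```

This only hides the inconsistency from callers; it does not explain why the two outcomes occur.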