PERF: Index.__getitem__ performance issue

Once again, this came up while investigating #6328.

There's something very strange with how `Index` objects handle slices:

``` python
In [1]: import pandas.util.testing as tm

In [2]: idx = tm.makeStringIndex(1000000)

In [3]: timeit idx[:-1]
100000 loops, best of 3: 2 µs per loop

In [4]: timeit idx[slice(None,-1)]
100 loops, best of 3: 6.5 ms per loop
```

This happens because `Index` doesn't override the `__getslice__` it inherits from `ndarray`. As a result, `idx[:-1]` is executed via `ndarray.__getslice__` -> `Index.__array_finalize__`, whereas `idx[slice(None, -1)]` goes via `Index.__getitem__` -> `Index.__new__`.

`__getitem__` is over 3000x slower (6.5 ms vs. 2 µs in the timings above) because it tries to infer the data type of the sliced result and convert it to a different subclass. The problem is that interactive use such as `idx[:-1]`, where a milliseconds-vs-microseconds difference doesn't matter, is likely to miss this feature, because it is dispatched via `__getslice__`. Programmatic use such as `idx[slice(None, -1)]` does hit this slow path, and there I'd argue that the type conversion magic is not necessary at all.
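To make the argument concrete, here is a minimal sketch of a slice fast path. `ToyIndex`, `_infer_type` and `_from_trusted` are hypothetical names made up for illustration; this is not pandas' actual `Index` code. It assumes that a slice of an index can reuse the parent's inferred type, which is a simplification.

```python
class ToyIndex:
    """Hypothetical stand-in for an index type; not pandas' actual Index."""

    def __init__(self, values):
        # "Slow" constructor: scans every element to infer a common type,
        # loosely mimicking the inference done when building a new index.
        self._values = list(values)
        self.inferred_type = self._infer_type(self._values)

    @staticmethod
    def _infer_type(values):
        # O(n) scan over all elements.
        kinds = {type(v).__name__ for v in values}
        return kinds.pop() if len(kinds) == 1 else "mixed"

    @classmethod
    def _from_trusted(cls, values, inferred_type):
        # Fast constructor: trusts the caller and skips inference entirely.
        obj = cls.__new__(cls)
        obj._values = values
        obj.inferred_type = inferred_type
        return obj

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Assumption: a slice keeps the parent's element type, so the
            # already-inferred type is reused instead of rescanning.
            return self._from_trusted(self._values[key], self.inferred_type)
        if isinstance(key, int):
            return self._values[key]
        # Any other key (e.g. a list of positions) takes the slow path.
        return ToyIndex([self._values[i] for i in key])

    def __len__(self):
        return len(self._values)


idx = ToyIndex(["a", "b", "c"])
head = idx[slice(None, -1)]  # takes the fast path, no inference
```

With a check like this, `idx[slice(None, -1)]` costs about as much as slicing the underlying data, while keys that can genuinely change the element type still go through full inference.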

Is there a rationale behind this?

