Comparisons result in different dtypes for empty DataFrames #15077

Closed
@jcrist

The DataFrame comparison methods (lt, gt, etc.) return incorrect dtypes for empty DataFrames: the result keeps the original column dtypes instead of bool. Interestingly, using the operators (<, >, etc.) instead results in correct dtypes. The methods also return the correct dtype for an empty Series.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'x': [1, 2, 3], 'y': [1., 2., 3.]})

In [3]: empty = df.iloc[:0]

In [4]: df.lt(2).dtypes
Out[4]:
x    bool
y    bool
dtype: object

In [5]: empty.lt(2).dtypes   # Should be all bool, but isn't
Out[5]:
x      int64
y    float64
dtype: object

In [6]: (df < 2).dtypes
Out[6]:
x    bool
y    bool
dtype: object

In [7]: (empty < 2).dtypes   # Things do work if you use the operator though
Out[7]:
x    bool
y    bool
dtype: object

In [8]: df.x.lt(2).dtype
Out[8]: dtype('bool')

In [9]: empty.x.lt(2).dtype    # Correct dtype for empty series
Out[9]: dtype('bool')

In [10]: pd.__version__
Out[10]: '0.19.2'
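
The session above only exercises lt. A minimal sketch of a check that covers the other comparison methods, by comparing each method's result dtypes against the equivalent operator's, might look like the following. The helper name mismatched_comparisons and the COMPARISONS mapping are hypothetical and are not part of pandas; only the DataFrame methods and the standard operator module are real.

```python
import operator

import pandas as pd

# Hypothetical mapping: DataFrame method name -> equivalent operator function.
COMPARISONS = {
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
    "eq": operator.eq,
    "ne": operator.ne,
}


def mismatched_comparisons(frame, other):
    """Return the method names whose result dtypes differ from the operator's."""
    mismatched = []
    for name, op in COMPARISONS.items():
        via_method = getattr(frame, name)(other).dtypes.tolist()
        via_operator = op(frame, other).dtypes.tolist()
        if via_method != via_operator:
            mismatched.append(name)
    return mismatched


df = pd.DataFrame({'x': [1, 2, 3], 'y': [1., 2., 3.]})
empty = df.iloc[:0]

# Non-empty frame: methods and operators are expected to agree.
print(mismatched_comparisons(df, 2))
# Empty frame: on an affected version this lists the methods that disagree.
print(mismatched_comparisons(empty, 2))
```

On an affected version, writing the comparison with the operator (empty < 2) rather than the method avoids the problem, as shown in In [7].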
