Combine_first method converts booleans into floats #20699

Closed
@HansBouwmeester

Description

Works as expected:

>>> df1 = pd.DataFrame()
>>> df2 = pd.DataFrame({'isBool': [True]})
>>> df1.combine_first(df2)
   isBool
0    True

Does not work as expected:

>>> df1 = pd.DataFrame({'isInt': [1]})
>>> df2 = pd.DataFrame({'isBool': [True]})
>>> df1.combine_first(df2)
   isBool  isInt
0     1.0      1

Problem description

I would expect pandas to preserve the bool dtype, or at least to behave consistently whether or not df1 is empty.
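For anyone hitting the same behavior, one possible workaround is to cast the affected columns back after combining. The sketch below is only an illustration: `restore_dtypes` is a hypothetical helper, not part of pandas. It only touches columns that come from exactly one of the two inputs and that contain no missing values after the combine, so casting back should not lose information.

```python
import pandas as pd

df1 = pd.DataFrame({'isInt': [1]})
df2 = pd.DataFrame({'isBool': [True]})

combined = df1.combine_first(df2)
# On pandas 0.21.1 (the version in this report) isBool comes back as float;
# other versions may already preserve the bool dtype.
print(combined.dtypes)


def restore_dtypes(result, left, right):
    """Hypothetical helper: cast columns back to their original dtype.

    Only columns present in exactly one input frame and free of missing
    values in the result are cast.
    """
    out = result.copy()
    for col in out.columns:
        sources = [f[col].dtype for f in (left, right) if col in f.columns]
        if len(sources) == 1 and out[col].notna().all():
            out[col] = out[col].astype(sources[0])
    return out


restored = restore_dtypes(combined, df1, df2)
print(restored.dtypes)
```

Columns that still contain NaN after the combine are left alone, since a plain NumPy bool column cannot represent missing values.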

See also: https://stackoverflow.com/questions/39103144/pandas-dataframe-combine-first-method-converts-boolean-in-floats
And related: https://stackoverflow.com/questions/15349795/pandas-dataframe-combine-first-and-update-methods-have-strange-behavior
And: #3016

It seems to me that some of these issues were solved, but this case may have escaped(?)

Output of pd.show_versions():

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

pandas: 0.21.1
pytest: 3.5.0
pip: 9.0.3
setuptools: 38.5.2
Cython: None
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.7.1
patsy: None
dateutil: 2.7.0
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
