Description
Code Sample
import pandas as pd
data = pd.DataFrame({'INT': [-1, 0, 10, 9],
                     'FLOAT': [-0.148, 0.2347, 38.237, 12.2233]},
                    index=pd.date_range("20180101 00:00", periods=4))
print('Original data:')
print(data.head())
print('\n(A) This is probably not a bug but my misunderstanding:')
print('(So how would I apply "clip_upper" inplace on parts of the dataframe?)')
# .loc with list indexers returns a copy, so inplace=True changes that
# temporary copy and leaves `data` untouched.
data.loc[[True, True, True, False], ['INT']].clip_upper(8, inplace=True)
print(data.head())
# I used then:
# data.loc[[True, True, True, False], ['INT']] = data.loc[[True, True, True, False], ['INT']].clip_upper(8)
print('\n(B) It seems that clip_upper does not preserve the dtypes:')
print(data.clip_upper(8).head())
print('\n(C) Same for inplace:')
data.clip_upper(8, inplace=True)
print(data.head())
Output of this code:
Original data:
INT FLOAT
2018-01-01 -1 -0.1480
2018-01-02 0 0.2347
2018-01-03 10 38.2370
2018-01-04 9 12.2233
(A) This is probably not a bug but my misunderstanding:
(So how would I apply "clip_upper" inplace on parts of the dataframe?)
INT FLOAT
2018-01-01 -1 -0.1480
2018-01-02 0 0.2347
2018-01-03 10 38.2370
2018-01-04 9 12.2233
(B) It seems that clip_upper does not preserve the dtypes:
INT FLOAT
2018-01-01 -1.0 -0.1480
2018-01-02 0.0 0.2347
2018-01-03 8.0 8.0000
2018-01-04 8.0 8.0000
(C) Same for inplace:
INT FLOAT
2018-01-01 -1.0 -0.1480
2018-01-02 0.0 0.2347
2018-01-03 8.0 8.0000
2018-01-04 8.0 8.0000
Problem description
clip_upper on a DataFrame with int and float columns converts the int column to float.
When calling data.clip_upper(8) with an integer threshold, I would expect the int column to stay integer and the float column to stay float. However, everything is converted to float (see (B) and (C)).
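A possible workaround for (B) and (C), as a minimal sketch: clip the whole frame and then cast each column back to its original dtype. It assumes that `DataFrame.clip(upper=...)` behaves like `clip_upper` here and that the clipped values fit the original dtypes (true for an integer threshold). The name `clipped` is only illustrative.

```python
import pandas as pd

data = pd.DataFrame({'INT': [-1, 0, 10, 9],
                     'FLOAT': [-0.148, 0.2347, 38.237, 12.2233]},
                    index=pd.date_range("20180101 00:00", periods=4))

# Remember the per-column dtypes before clipping.
original_dtypes = data.dtypes.to_dict()

# Clip with an integer threshold, then restore the original dtypes
# in case the clip upcast the integer column to float.
clipped = data.clip(upper=8).astype(original_dtypes)

print(clipped.dtypes)
print(clipped)
```

The cast back is lossless only because the threshold is an integer. With a float threshold such as 8.5, casting the INT column back would truncate the clipped values.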
Moreover, clip_upper with inplace=True has no effect when applied to a .loc selection, but this might just be me misunderstanding the concept (see (A)).
The same applies to clip_lower.
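For (A), the likely explanation is that `.loc` with list indexers returns a copy, so `inplace=True` modifies that temporary object instead of `data`. A sketch of the assignment-based alternative follows. It uses `Series.clip(upper=...)` as an assumed equivalent of `clip_upper`, and `mask` is just an illustrative name for the boolean row selector.

```python
import pandas as pd

data = pd.DataFrame({'INT': [-1, 0, 10, 9],
                     'FLOAT': [-0.148, 0.2347, 38.237, 12.2233]},
                    index=pd.date_range("20180101 00:00", periods=4))

# Rows to clip: the first three, leaving the last row untouched.
mask = [True, True, True, False]

# Clip the selection and write the result back through .loc,
# which assigns into `data` itself rather than into a copy.
data.loc[mask, 'INT'] = data.loc[mask, 'INT'].clip(upper=8)

print(data)
```

Selecting the column by label (`'INT'` instead of `['INT']`) yields a Series, which keeps the assignment one-dimensional on both sides.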
Expected Output
For (A):
INT FLOAT
2018-01-01 -1 -0.1480
2018-01-02 0 0.2347
2018-01-03 8 38.2370
2018-01-04 9 12.2233
For (B) and (C):
INT FLOAT
2018-01-01 -1 -0.1480
2018-01-02 0 0.2347
2018-01-03 8 8.0000
2018-01-04 8 8.0000
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
pandas: 0.23.4
pytest: 4.0.1
pip: 18.1
setuptools: 40.6.2
Cython: 0.29
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: 1.8.2
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.1
openpyxl: 2.5.11
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.1.2
lxml: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.14
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None