Description
I've noticed that calling isin on a DataFrame with a datetime column, where the operand is an empty DataFrame, returns epoch datetime values (i.e. 1970-01-01) instead of False. This looks incorrect.
The following code demonstrates this:
import pandas as pd
data = {'date': ['2014-05-01 18:47:05.069722', '2014-05-01 18:47:05.119994', '2014-05-02 18:47:05.178768']}
data2 = {'date': ['2014-05-01 18:47:05.069722', '2014-05-01 18:47:05.119994']}
df = pd.DataFrame(data, columns = ['date'])
df['date'] = pd.to_datetime(df['date'])
df2 = pd.DataFrame(data2, columns = ['date'])
df2['date'] = pd.to_datetime(df2['date'])
df3 = pd.DataFrame([], columns = ['date'])
df4 = pd.DataFrame()
print(df.isin(df2))
print(df.isin(df3))
print(df.isin(df4))
This outputs:
date
0 True
1 True
2 False
date
0 1970-01-01
1 1970-01-01
2 1970-01-01
date
0 1970-01-01
1 1970-01-01
2 1970-01-01
I would expect a column of False values rather than 1970-01-01. With pandas 0.16.2 and numpy 1.9.2, df.isin(df3) produces the expected output:
date
0 False
1 False
2 False
However, df.isin(df4) still returns the 1970-01-01 values shown above, even with those versions.
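Until this is resolved, one possible workaround is to short-circuit the empty-operand case and build the all-False result by hand. The sketch below is only an illustration: isin_bool is a hypothetical helper name, not part of pandas, and it assumes that an empty operand should always yield False for every cell.

```python
import pandas as pd

# Same frames as in the report above.
df = pd.DataFrame({'date': pd.to_datetime(['2014-05-01 18:47:05.069722',
                                           '2014-05-01 18:47:05.119994',
                                           '2014-05-02 18:47:05.178768'])})
df2 = pd.DataFrame({'date': pd.to_datetime(['2014-05-01 18:47:05.069722',
                                            '2014-05-01 18:47:05.119994'])})
df3 = pd.DataFrame([], columns=['date'])
df4 = pd.DataFrame()


def isin_bool(frame, other):
    """Hypothetical helper (not a pandas API): frame.isin(other), but an
    empty DataFrame operand always gives an all-False boolean frame."""
    if other.empty:
        # Assumption: nothing can be a member of an empty frame.
        return pd.DataFrame(False, index=frame.index, columns=frame.columns)
    return frame.isin(other)


print(isin_bool(df, df2))
print(isin_bool(df, df3))
print(isin_bool(df, df4))
```

This only sidesteps the empty case; non-empty operands still go through the regular DataFrame.isin, which matches on both index and column labels.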
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 2.2
Cython: None
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: 5.2.2
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None