Closed
get_dummies seems to get caught out by NaNs:
In [11]: s1 = pd.Series(['a', 'a', np.nan, 'c', 'c', 'c'])
In [12]: s1
Out[12]:
0      a
1      a
2    NaN
3      c
4      c
5      c
dtype: object
In [13]: pd.get_dummies(s1)
Out[13]:
   a  c
0  1  0
1  1  0
2  0  1
3  0  1
4  0  1
5  0  1
Row 2, which holds the NaN, has been encoded as a rogue c. I think the expected output is:
In [14]: pd.get_dummies(s1[s1.notnull()])
Out[14]:
   a  c
0  1  0
1  1  0
3  0  1
4  0  1
5  0  1
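For anyone who wants to check their own install, here is a minimal sketch of the workaround and a sanity check. It assumes a reasonably recent pandas; the variable names (dummies, row_sums, nan_row_total) are mine and not part of the report. It uses dropna(), which should select the same rows as s1[s1.notnull()] above.

```python
import numpy as np
import pandas as pd

s1 = pd.Series(['a', 'a', np.nan, 'c', 'c', 'c'])

# Workaround: drop missing values before encoding, so the NaN row
# cannot be assigned to any category.
dummies = pd.get_dummies(s1.dropna())

# Every remaining row should have exactly one indicator set.
# astype(int) makes the check independent of the dtype get_dummies
# returns (this differs between pandas versions).
row_sums = dummies.astype(int).sum(axis=1)

# Check for the behaviour reported above: without dropping NaNs,
# the NaN row (label 2) should not be marked as belonging to any category.
# A result of 1 here would mean the NaN was encoded as a real category.
nan_row_total = int(pd.get_dummies(s1).loc[2].astype(int).sum())

print(dummies)
print("indicators set on the NaN row:", nan_row_total)
```

If the NaN row should stay in the result but be flagged explicitly, newer pandas versions also accept pd.get_dummies(s1, dummy_na=True), which adds a separate indicator column for missing values. Whether that option exists depends on the installed version.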