get_dummies with NaN #4446

Closed
@hayd

get_dummies seems to get caught out by NaNs:

In [11]: s1 = pd.Series(['a', 'a', np.nan, 'c', 'c', 'c'])

In [12]: s1
Out[12]: 
0      a
1      a
2    NaN
3      c
4      c
5      c
dtype: object

In [13]: pd.get_dummies(s1)
Out[13]: 
   a  c
0  1  0
1  1  0
2  0  1
3  0  1
4  0  1
5  0  1

A rogue c has been set for the NaN row (index 2). I think the expected output is:

In [14]: pd.get_dummies(s1[s1.notnull()])
Out[14]: 
   a  c
0  1  0
1  1  0
3  0  1
4  0  1
5  0  1
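
A minimal sketch of how the behaviour above could be checked, and of the workaround used in In [14]. The variable names (`leaked`, `expected`, `with_nan`) are only illustrative. The `dummy_na=True` variant is an assumption: it relies on a pandas version that provides that keyword, and it is not part of the original report.

```python
import numpy as np
import pandas as pd

s1 = pd.Series(['a', 'a', np.nan, 'c', 'c', 'c'])

# Encode the full series, then look only at the rows where the input is missing.
dummies = pd.get_dummies(s1)
nan_rows = dummies[s1.isnull()]

# True if any indicator is set on a NaN row (the "rogue c" described above).
leaked = bool(nan_rows.to_numpy().any())

# Workaround from the report: drop missing values before encoding.
# The NaN row (index 2) is then absent from the result.
expected = pd.get_dummies(s1[s1.notnull()])

# Assumed alternative: keep the row and give NaN its own indicator column.
with_nan = pd.get_dummies(s1, dummy_na=True)
```

On a pandas version where this issue is fixed, `leaked` should be False. The workaround drops the row entirely, whereas `dummy_na=True` keeps all six rows and adds a third column.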
