BUG: GroupBy apply loses NaN groups #43227

Closed
@misantroop

Description

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


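I expect GroupBy.apply with an identity function to return a structure identical to the input. Instead, rows whose group key is NaN are removed even with dropna=False when the index level name is passed through by. Passing the same level through the level keyword works as expected.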

    import numpy as np
    import pandas as pd

    mux = pd.MultiIndex.from_arrays([['a', 'a', np.nan, 'b', 'b'], ['t', 'u', np.nan, 'w', 'y']],
                                    names=['level1', 'level2'])
    df = pd.DataFrame({'col': [0, np.nan, np.nan, 3, 4]}, index=mux)

                   col
    level1 level2
    a      t       0.0
           u       NaN
    NaN    NaN     NaN
    b      w       3.0
           y       4.0

    # Grouping by the level name via `by`: the NaN row is lost (the bug).
    result_by = df.groupby(by='level2', dropna=False).apply(lambda x: x)

                   col
    level1 level2
    a      t       0.0
           u       NaN
    b      w       3.0
           y       4.0

    # Grouping by the same level via `level`: the NaN row is kept (expected).
    result_level = df.groupby(level='level2', dropna=False).apply(lambda x: x)

                   col
    level1 level2
    a      t       0.0
           u       NaN
    NaN    NaN     NaN
    b      w       3.0
           y       4.0
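
For anyone triaging this, here is a minimal sketch of a sanity check that avoids apply entirely. It is not part of the original report. It assumes pandas 1.1 or later (where dropna was introduced) and moves the index levels into ordinary columns with reset_index before grouping. The names flat, sizes_keep, sizes_drop, rows_kept and rows_dropped are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same frame as in the report above.
mux = pd.MultiIndex.from_arrays(
    [['a', 'a', np.nan, 'b', 'b'], ['t', 'u', np.nan, 'w', 'y']],
    names=['level1', 'level2'],
)
df = pd.DataFrame({'col': [0, np.nan, np.nan, 3, 4]}, index=mux)

# Assumption for this sketch: turn the index levels into ordinary columns
# and group on the column instead of on the index level.
flat = df.reset_index()

# Group sizes with and without the NaN key.
sizes_keep = flat.groupby('level2', dropna=False).size()
sizes_drop = flat.groupby('level2').size()  # default is dropna=True

# With dropna=False every input row should belong to some group.
rows_kept = int(sizes_keep.sum())
rows_dropped = len(df) - int(sizes_drop.sum())
print(rows_kept, rows_dropped)
```

Comparing the summed group sizes against len(df) is a quick way to see whether any rows fell out of the grouping. It says nothing about why by and level differ for apply in the report.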

Labels: Bug, Needs Triage