BUG: groupby with categorical drops empty groups when aggregating over a series

- [ ] Series groupby excluding NaN groups with Categorical (DataFrame DOES include)
- [ ] sorting via a returned Interval-like-Index (string based)

Hello,

When grouping a DataFrame over more than one column including a categorical, the empty groups are kept in the aggregation result. A test for this behaviour was introduced in #8138.

However, when the aggregation is performed on only one column of the DataFrame, the empty groups are dropped. This seems inconsistent to me; I guess it's an edge case that wasn't considered at the time.

``` python
import numpy as np
import pandas as pd

d = {'foo': [10, 8, 4, 1], 'bar': [10, 20, 30, 40],
     'baz': ['d', 'c', 'd', 'c']}
df = pd.DataFrame(d)
cat = pd.cut(df['foo'], np.linspace(0, 20, 5))
df['range'] = cat
groups = df.groupby(['range', 'baz'], as_index=True, sort=True)

# Expected result, fixed as part of #8138
fixed = groups.agg('mean')

# Inconsistent behaviour with series
inconsistent = groups['foo'].agg('mean')

# Expected result
expected = fixed['foo']
```

``` python
fixed
```

| range | baz | foo | bar |
| --- | --- | --- | --- |
| (0, 5] | c | 1 | 40 |
|  | d | 4 | 30 |
| (10, 15] | c | NaN | NaN |
|  | d | NaN | NaN |
| (15, 20] | c | NaN | NaN |
|  | d | NaN | NaN |
| (5, 10] | c | 8 | 20 |
|  | d | 10 | 10 |

``` python
inconsistent
```

| range | baz | foo |
| --- | --- | --- |
| (0, 5] | c | 1 |
|  | d | 4 |
| (5, 10] | c | 8 |
|  | d | 10 |

``` python
expected
```

| range | baz | foo |
| --- | --- | --- |
| (0, 5] | c | 1 |
|  | d | 4 |
| (10, 15] | c | NaN |
|  | d | NaN |
| (15, 20] | c | NaN |
|  | d | NaN |
| (5, 10] | c | 8 |
|  | d | 10 |
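
To spell out what I mean by the expected result: it is the series result reindexed onto the full product of the range categories and the `baz` values. Below is a minimal sketch of that idea, not a proposed fix. It assumes the interval labels are plain strings for simplicity, and the names `observed` and `full_index` are mine.

```python
import pandas as pd

# The four non-empty groups, as in the "inconsistent" result
# (interval labels written as plain strings for illustration).
observed = pd.Series(
    [1.0, 4.0, 8.0, 10.0],
    index=pd.MultiIndex.from_tuples(
        [("(0, 5]", "c"), ("(0, 5]", "d"), ("(5, 10]", "c"), ("(5, 10]", "d")],
        names=["range", "baz"],
    ),
    name="foo",
)

# Every category of the categorical crossed with every value of baz.
categories = ["(0, 5]", "(5, 10]", "(10, 15]", "(15, 20]"]
full_index = pd.MultiIndex.from_product(
    [categories, ["c", "d"]], names=["range", "baz"]
)

# Reindexing keeps the observed values and fills the empty groups with NaN.
filled = observed.reindex(full_index)
print(filled)
```

Here `filled` has one row per (range, baz) combination, with NaN for the combinations that have no rows in the original frame.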

Note the strange ordering of the categorical index. I would expect `sort=True` to sort by category order, not by lexical order.
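
To illustrate the ordering point, here is a small sketch with the interval labels written as plain strings (an assumption for illustration; the real index holds the values returned by `pd.cut`). A plain string sort reproduces the order seen in the tables above, while an ordered `Categorical` sorts by category position.

```python
import pandas as pd

# Category order as produced by the bins: ascending by interval.
categories = ["(0, 5]", "(5, 10]", "(10, 15]", "(15, 20]"]

# Plain string sort: "(1..." compares less than "(5...",
# so "(10, 15]" and "(15, 20]" land before "(5, 10]".
lexical = sorted(categories)

# Ordered categorical: sort_values follows the category order instead.
cat = pd.Categorical(categories[::-1], categories=categories, ordered=True)
by_category = list(cat.sort_values())

print(lexical)
print(by_category)
```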

Also note that using `as_index=False` fails due to #8869.


BUG: groupby with categorical drops empty groups when aggregating over a series #8870

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

		bar	foo
range	baz
(0, 5]	c	1	40
	d	4	30
(10, 15]	c	NaN	NaN
	d	NaN	NaN
(15, 20]	c	NaN	NaN
	d	NaN	NaN
(5, 10]	c	8	20
	d	10	10

Uh oh!

BUG: groupby with categorical drops empty groups when aggregating over a series #8870

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions