Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
This refers to the code currently on master at 84d9c5e (2021-04-14). The issues also exist on the latest released version of pandas, but they manifest differently there.
import pandas as pd

halflife = "23 days"
baseline_df = pd.DataFrame(
    {
        "A": ["a", "b", "a", "b", "a", "b"],
        "B": [0, 0, 1, 1, 2, 2],
        "C": pd.to_datetime(
            [
                "2020-01-01",
                "2020-01-01",
                "2020-01-10",
                "2020-01-02",
                "2020-01-23",
                "2020-01-03",
            ]
        )
    }
)

cython_result = baseline_df.groupby("A").ewm(halflife=halflife, times="C").mean()
print("cython")
print(cython_result)

print("numba")
numba_result = baseline_df.groupby("A").ewm(halflife=halflife, times="C").mean(engine="numba")
print(numba_result)

# Reference values: run ewm on each group separately with that group's own times.
expected_result_a = pd.DataFrame([0, 1, 2]).ewm(
    halflife=halflife, times=pd.to_datetime(["2020-01-01", "2020-01-10", "2020-01-23"])
).mean()
expected_result_b = pd.DataFrame([0, 1, 2]).ewm(
    halflife=halflife, times=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"])
).mean()
print("expected")
print(" group a")
print(expected_result_a)
print(" group b")
print(expected_result_b)
Output:
cython
            B
A
a 0  0.000000
  2  0.500000
  4  1.094088
b 1  0.000000
  3  0.500000
  5  1.094088
numba
            B
A
a 0  0.666667... 
Problem description
There are three problems with the current groupby ewm implementation when times is not None:
- numba implementation: ignores the times entirely
- cython implementation: does not use the correct times/deltas in aggregations.pyx when there are multiple groups
- if the groups are non-trivial, the time vector and the values go out of sync
I have a branch that fixes these issues and will link to it shortly.
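To make the expected numbers easier to check by hand, here is a minimal standard-library sketch of a time-weighted, adjusted EWM mean. It assumes each earlier observation gets weight 0.5 ** (age / halflife); the function name ewm_mean_with_times and the helper day are made up for this illustration and are not pandas API. It is not the pandas implementation, only a reference for what the per-group result should look like when each group uses its own times.

```python
from datetime import datetime


def ewm_mean_with_times(values, times, halflife_days):
    """Adjusted EWM mean; each past point is weighted by 0.5 ** (age / halflife)."""
    out = []
    for i, t_now in enumerate(times):
        num = 0.0
        den = 0.0
        # Only observations up to and including position i contribute.
        for x, t in zip(values[: i + 1], times[: i + 1]):
            age_days = (t_now - t).total_seconds() / 86400.0
            weight = 0.5 ** (age_days / halflife_days)
            num += weight * x
            den += weight
        out.append(num / den)
    return out


def day(text):
    # Hypothetical helper: parse a YYYY-MM-DD string.
    return datetime.strptime(text, "%Y-%m-%d")


values = [0.0, 1.0, 2.0]
times_a = [day("2020-01-01"), day("2020-01-10"), day("2020-01-23")]
times_b = [day("2020-01-01"), day("2020-01-02"), day("2020-01-03")]
# Timestamps spaced exactly one half-life (23 days) apart.
times_unit = [day("2020-01-01"), day("2020-01-24"), day("2020-02-16")]

group_a = ewm_mean_with_times(values, times_a, 23.0)
group_b = ewm_mean_with_times(values, times_b, 23.0)
unit_spaced = ewm_mean_with_times(values, times_unit, 23.0)

print("group a", [round(v, 6) for v in group_a])
print("group b", [round(v, 6) for v in group_b])
print("one half-life apart", [round(v, 6) for v in unit_spaced])
```

With each group's own timestamps this agrees with the "expected" values above. With timestamps spaced exactly one half-life apart it gives 0.666667 and 1.428571, the same numbers the numba engine prints, which is consistent with the times being ignored there.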
Expected Output
cython
            B
A
a 0  0.000000
  2  0.567395
  4  1.221209
b 1  0.000000
  3  0.507534
  5  1.020088
numba
            B
A
a 0  0.000000
  2  0.567395
  4  1.221209
b 1  0.000000
  3  0.507534
  5  1.020088
expected
 group a
          0
0  0.000000
1  0.567395
2  1.221209
 group b
          0
0  0.000000
1  0.507534
2  1.020088