Series.map should return default dictionary values rather than NaN #15999

Closed
@dhimmel

Description

collections.Counter and collections.defaultdict both have default values. However, pandas.Series.map does not respect these defaults and instead returns missing values.

The issue is illustrated below:

import pandas
from collections import Counter, defaultdict
input = pandas.Series(range(5))
counter = Counter()
counter[1] += 1
output = input.map(counter)
expected = input.map(lambda x: counter[x])
pandas.DataFrame({
    'input': input,
    'output': output,
    'expected': expected,
})

Here's the output:

   expected  input  output
0         0      0     NaN
1         1      1     1.0
2         0      2     NaN
3         0      3     NaN
4         0      4     NaN

The workaround is rather easy (lambda x: dictionary[x]), and the change itself shouldn't be too hard to implement. Are people on board with it? Is there a performance concern with looking up each key independently?
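For reference, here is a minimal plain-Python sketch of why the lambda workaround picks up the defaults. It only shows standard-library behaviour: item access (d[key]) goes through __missing__, while dict.get does not. How Series.map looks keys up internally is not stated in this issue, so nothing here should be read as a description of the pandas implementation; the variable names are illustrative.

```python
from collections import Counter, defaultdict

counter = Counter()
counter[1] += 1
keys = range(5)

# Item access calls __missing__ for absent keys, so the default (0) comes back.
via_getitem = [counter[k] for k in keys]

# dict.get never calls __missing__, so absent keys come back as None.
via_get = [counter.get(k) for k in keys]

print(via_getitem)  # [0, 1, 0, 0, 0]
print(via_get)      # [None, 1, None, None, None]

# Counter returns its default without storing the key.
print(len(counter))  # 1

# defaultdict also returns a default on item access, but it inserts the key too.
dd = defaultdict(int)
dd[1] += 1
dd_values = [dd[k] for k in keys]
print(dd_values)  # [0, 1, 0, 0, 0]
print(len(dd))    # 5
```

One thing to keep in mind with the per-key approach: for a defaultdict, each lookup of an absent key adds that key to the dictionary, so mapping a large Series this way can grow the mapping as a side effect. Counter does not have that side effect.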

Labels

Enhancement; Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)
