BUG: Series/Index arithmetic result names with NAs #44459

Merged: 4 commits, Nov 20, 2021

jbrockmendel
Member

@jbrockmendel jbrockmendel commented Nov 15, 2021

@jreback jreback added this to the 1.4 milestone Nov 15, 2021
@jreback jreback added the Bug, Numeric Operations (Arithmetic, Comparison, and Logical operations), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), and NA - MaskedArrays (Related to pd.NA and nullable extension arrays) labels Nov 15, 2021
else:
# TODO: what if they both have np.nan for their names?
try:
if a.name == b.name:
Contributor

are we somehow raising on NA comparing to a string?

Member Author

if either is pd.NA then this raises TypeError
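
The failure mode described here can be sketched without pandas. The block below is a minimal illustration, not the merged implementation: `FakeNA` and `match_name` are hypothetical names, and `FakeNA` only mimics two behaviours of `pd.NA` (an `==` comparison returns NA rather than a bool, and truth-testing NA raises `TypeError`). It makes no attempt to treat two NA names as matching.

```python
class FakeNA:
    """Hypothetical stand-in that mimics two behaviours of pd.NA."""

    def __eq__(self, other):
        # Comparison propagates the missing value instead of returning a bool.
        return self

    def __bool__(self):
        # Truth-testing a missing value is refused.
        raise TypeError("boolean value of NA is ambiguous")

    # Defining __eq__ would otherwise make the class unhashable.
    __hash__ = object.__hash__


NA = FakeNA()


def match_name(a_name, b_name):
    """Return the shared name, or None if the names differ or cannot be compared."""
    try:
        # `if` truth-tests the comparison result, which is where NA raises.
        if a_name == b_name:
            return a_name
    except TypeError:
        # Assumption for this sketch: an uncomparable pair is treated as "no match".
        return None
    return None
```

With this guard, `match_name("x", NA)` returns `None` instead of propagating the `TypeError` from the `if` statement.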

@jreback
Contributor

jreback commented Nov 16, 2021

looks fine, can you add a whatsnew note?

@jbrockmendel
Member Author

whatsnew added + greenish

@jreback jreback merged commit 7b8c0af into pandas-dev:master Nov 20, 2021
@jbrockmendel jbrockmendel deleted the bug-names branch November 20, 2021 16:12
@rhshadrach
Member

rhshadrach commented Nov 20, 2021

Certainly no objection to the modification here, but I do wonder if we should be changing the API to narrow what is allowed as a Series name or DataFrame label. For example, from #39757,

import numpy as np
import pandas as pd
series_a = pd.Series([1, 2, 3], name=np.float64(1.0))
series_b = pd.Series([1, 2, 3], name=(np.float64(1.0), np.float64(2.0)))
pd.concat([series_a, series_b], axis=1).columns.sort_values()

still fails. I am wondering if we should change the API to only allow basic primitive types (Boolean, int, str, float?, perhaps some others) and tuples of primitive types, rather than trying to chase these issues down.
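
To make the proposed restriction concrete, here is a minimal sketch of what such a check could look like. It is purely illustrative: `is_allowed_name` and `ALLOWED_SCALARS` are hypothetical names, nothing like this exists in pandas as far as this thread shows, and accepting `None` (the unnamed case) is an assumption of the sketch.

```python
# Hypothetical whitelist of "basic primitive types" from the proposal above.
ALLOWED_SCALARS = (bool, int, float, str)


def is_allowed_name(name):
    """Return True if `name` is None, an allowed scalar, or a tuple of allowed scalars."""
    if name is None:
        # Assumption: an unnamed Series/Index stays valid.
        return True
    if isinstance(name, ALLOWED_SCALARS):
        return True
    if isinstance(name, tuple):
        # Only flat tuples of allowed scalars pass; nested containers are rejected.
        return all(isinstance(part, ALLOWED_SCALARS) for part in name)
    return False
```

Since `np.float64` subclasses Python's `float`, an `isinstance`-based check of this shape would accept both names from the example above, so a restriction along these lines would not by itself rule out the float/tuple mix that makes the sort fail.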

@jreback
Contributor

jreback commented Nov 20, 2021

the problem is that the name can be any pandas primitive or tuple of those (as we allow this for index labels), which doesn't narrow it much

@rhshadrach
Member

rhshadrach commented Nov 20, 2021

@jreback - I think you're saying that even if we were to add some restrictions, np.float64 (and hence, tuples of such objects) should be allowed as a name. Is that right?

@jreback
Contributor

jreback commented Nov 20, 2021

yes, ultimately we allow these as names (well, they have to be hashable), or we could inhibit a number of downstream ops

Labels
Bug
NA - MaskedArrays (Related to pd.NA and nullable extension arrays)
Numeric Operations (Arithmetic, Comparison, and Logical operations)
Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
Development

Successfully merging this pull request may close these issues.

ValueError from operations between Series with specific names
Pandas Series: unexpected error while doing difference (_maybe_match_name)
3 participants