ENH: Allow update to use an on keyword. Allow one to many update. #6604

Closed
wants to merge 3 commits into from

jseabold
Contributor

Scratching an itch I had. Use case:

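import numpy as np
import pandas as pd

# Rows with name 'A' have a missing number that should be filled from df2.
df = pd.DataFrame([[np.nan, 'A'],
                   [np.nan, 'A'],
                   [np.nan, 'A'],
                   [1.5, 'B'],
                   [2.2, 'C'],
                   [3.1, 'C'],
                   [1.2, 'B']], columns=['number', 'name'])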

df2 = pd.DataFrame([[3.5, 'A']], columns=['number', 'name'])

df.update(df2, on='name')
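
For context, the `on` keyword above is what this PR proposes; it is not part of released pandas. A minimal sketch of what the existing `update` does with the same data, assuming default integer indexes: it aligns `df2` on the row index, so only row 0 is filled.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 'A'],
                   [np.nan, 'A'],
                   [np.nan, 'A'],
                   [1.5, 'B'],
                   [2.2, 'C'],
                   [3.1, 'C'],
                   [1.2, 'B']], columns=['number', 'name'])
df2 = pd.DataFrame([[3.5, 'A']], columns=['number', 'name'])

# update() matches rows by index label, not by the 'name' column.
# overwrite=False only fills positions that are NaN in df.
df.update(df2, overwrite=False)

# Only index 0 exists in df2, so rows 1 and 2 keep their NaN.
remaining_nans = int(df['number'].isna().sum())
```

This is the gap the one-to-many `on='name'` update is meant to close.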

The tests fail though because when you reset the index, the column order is not preserved and self = self[col_order] doesn't seem to do what I'd expect it to do. Maybe there's a better way to go about doing all of this?

Also, I dropped reindex_like because you were just iterating through columns of NaNs in other.

@jreback
Contributor

jreback commented Mar 11, 2014

can you put the output you are expecting up as well?

@jseabold
Contributor Author

NaNs "updated" to 3.5.
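
One way to get that result with existing pandas calls, as a sketch for the single-key case only. It assumes the key values in `df2` are unique; the `lookup` name is just for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 'A'],
                   [np.nan, 'A'],
                   [np.nan, 'A'],
                   [1.5, 'B'],
                   [2.2, 'C'],
                   [3.1, 'C'],
                   [1.2, 'B']], columns=['number', 'name'])
df2 = pd.DataFrame([[3.5, 'A']], columns=['number', 'name'])

# Build a name -> number lookup from df2 (requires unique names in df2).
lookup = df2.set_index('name')['number']

# Map each row's name to the replacement value, then fill only the NaNs.
df['number'] = df['number'].fillna(df['name'].map(lookup))
```

Rows whose name is missing from `df2` map to NaN, so their existing values are left alone.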

@jreback
Contributor

jreback commented Mar 11, 2014

In [10]: df.loc[df.name.isin(df2.name)] = df2

In [11]: df
Out[11]: 
   number name
0     3.5    A
1     3.5    A
2     3.5    A
3     1.5    B
4     2.2    C
5     3.1    C
6     1.2    B

@jseabold
Contributor Author

This is just a toy case. What if on is multiple columns? This also only updates where there are NaNs in df. There might be overlap in df and df2 and this sanity checks that.

@jreback
Contributor

jreback commented Mar 11, 2014

true.....update really needs to be rewritten actually (and pushed into internals).....but that's a bit more drastic....

shouldn't you merge these then set?

@jseabold
Contributor Author

Sounds like half a dozen of one to me... Merge, conditionally set _x for each column then delete _y and rename everything?
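
A rough sketch of that merge-then-set route, for comparison. `update_on` is a hypothetical helper name, not a pandas function and not the code in this PR; it assumes the keys in `right` are unique and that no column in `left` already ends in `_x` or `_y`.

```python
import numpy as np
import pandas as pd


def update_on(left, right, on):
    """Hypothetical helper: fill NaNs in `left` from `right`, matching on `on`."""
    keys = [on] if isinstance(on, str) else list(on)
    # Only non-key columns present in both frames can be updated.
    value_cols = [c for c in right.columns
                  if c not in keys and c in left.columns]
    # Left join keeps every row of `left` in its original order;
    # validate raises if `right` has duplicate keys.
    merged = left.merge(right[keys + value_cols], on=keys, how='left',
                        suffixes=('_x', '_y'), validate='many_to_one')
    for col in value_cols:
        # Keep the existing value (_x) unless it is NaN, then take _y.
        merged[col] = merged[col + '_x'].fillna(merged[col + '_y'])
        merged = merged.drop(columns=[col + '_x', col + '_y'])
    # Restore the original column order and index.
    merged = merged[list(left.columns)]
    merged.index = left.index
    return merged


df = pd.DataFrame([[np.nan, 'A'], [np.nan, 'A'], [np.nan, 'A'],
                   [1.5, 'B'], [2.2, 'C'], [3.1, 'C'], [1.2, 'B']],
                  columns=['number', 'name'])
df2 = pd.DataFrame([[3.5, 'A']], columns=['number', 'name'])
result = update_on(df, df2, on='name')

# The same helper with two key columns.
left = pd.DataFrame({'k1': ['a', 'a', 'b'], 'k2': [1, 2, 1],
                     'v': [np.nan, 5.0, np.nan]})
right = pd.DataFrame({'k1': ['a', 'b'], 'k2': [1, 1], 'v': [10.0, 20.0]})
result_multi = update_on(left, right, on=['k1', 'k2'])
```

Reordering with `merged[list(left.columns)]` at the end is one way around the column-order problem mentioned in the PR description.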

@jreback jreback added this to the 0.15.0 milestone Apr 22, 2014
@jreback
Contributor

jreback commented Jan 25, 2015

We're moving a bit away from adding things to .update, and this PR is stale.

@jreback jreback closed this Jan 25, 2015
Labels
Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode