Commit 449accc

Merge branch 'main' into bug-range
2 parents 64a591a + 4e3d691 commit 449accc

65 files changed, 650 additions and 496 deletions

asv_bench/benchmarks/frame_methods.py

Lines changed: 3 additions & 0 deletions
@@ -611,6 +611,9 @@ def time_frame_duplicated(self):
     def time_frame_duplicated_wide(self):
         self.df2.duplicated()
 
+    def time_frame_duplicated_subset(self):
+        self.df.duplicated(subset=["a"])
+
 
 class XS:

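The new `time_frame_duplicated_subset` benchmark times `DataFrame.duplicated` restricted to one column. A minimal sketch of what that call computes, assuming a small made-up frame (the benchmark's own `self.df` is not shown in this diff):

```python
import pandas as pd

# Illustrative data only; not the frame used by the benchmark.
df = pd.DataFrame({"a": [1, 1, 2, 3, 2], "b": [10, 11, 12, 13, 14]})

# Only column "a" is compared, so later repeats of a value in "a" are flagged.
dup_on_a = df.duplicated(subset=["a"])

# With every column considered, no row here is a full duplicate.
dup_all = df.duplicated()

print(dup_on_a.tolist())
print(dup_all.tolist())
```

Restricting `subset` to a single column is the case the v1.5.0 performance note in this commit refers to.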
doc/source/development/contributing_environment.rst

Lines changed: 1 addition & 1 deletion
@@ -222,7 +222,7 @@ Consult the docs for setting up pyenv `here <https://github.com/pyenv/pyenv>`__.
    pyenv virtualenv <version> <name-to-give-it>
 
    # For instance:
-   pyenv virtualenv 3.7.6 pandas-dev
+   pyenv virtualenv 3.9.10 pandas-dev
 
    # Activate the virtualenv
    pyenv activate pandas-dev

doc/source/getting_started/comparison/comparison_with_sql.rst

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ structure.
 
     url = (
         "https://raw.github.com/pandas-dev"
-        "/pandas/master/pandas/tests/io/data/csv/tips.csv"
+        "/pandas/main/pandas/tests/io/data/csv/tips.csv"
     )
     tips = pd.read_csv(url)
     tips

doc/source/whatsnew/v1.3.0.rst

Lines changed: 1 addition & 1 deletion
@@ -811,7 +811,7 @@ Other Deprecations
 - Deprecated allowing scalars to be passed to the :class:`Categorical` constructor (:issue:`38433`)
 - Deprecated constructing :class:`CategoricalIndex` without passing list-like data (:issue:`38944`)
 - Deprecated allowing subclass-specific keyword arguments in the :class:`Index` constructor, use the specific subclass directly instead (:issue:`14093`, :issue:`21311`, :issue:`22315`, :issue:`26974`)
-- Deprecated the :meth:`astype` method of datetimelike (``timedelta64[ns]``, ``datetime64[ns]``, ``Datetime64TZDtype``, ``PeriodDtype``) to convert to integer dtypes, use ``values.view(...)`` instead (:issue:`38544`)
+- Deprecated the :meth:`astype` method of datetimelike (``timedelta64[ns]``, ``datetime64[ns]``, ``Datetime64TZDtype``, ``PeriodDtype``) to convert to integer dtypes, use ``values.view(...)`` instead (:issue:`38544`). This deprecation was later reverted in pandas 1.4.0.
 - Deprecated :meth:`MultiIndex.is_lexsorted` and :meth:`MultiIndex.lexsort_depth`, use :meth:`MultiIndex.is_monotonic_increasing` instead (:issue:`32259`)
 - Deprecated keyword ``try_cast`` in :meth:`Series.where`, :meth:`Series.mask`, :meth:`DataFrame.where`, :meth:`DataFrame.mask`; cast results manually if desired (:issue:`38836`)
 - Deprecated comparison of :class:`Timestamp` objects with ``datetime.date`` objects. Instead of e.g. ``ts <= mydate`` use ``ts <= pd.Timestamp(mydate)`` or ``ts.date() <= mydate`` (:issue:`36131`)

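The last context entry in the hunk above suggests two replacements for comparing a `Timestamp` with a `datetime.date`. They are not interchangeable, which a small sketch can show (the values are chosen for illustration only):

```python
import datetime

import pandas as pd

ts = pd.Timestamp("2021-03-01 12:00")
mydate = datetime.date(2021, 3, 1)

# Option 1: promote the date to a Timestamp (midnight) and compare Timestamps.
as_timestamp = ts <= pd.Timestamp(mydate)

# Option 2: drop the time-of-day from the Timestamp and compare plain dates.
as_date = ts.date() <= mydate

print(as_timestamp, as_date)
```

Here noon on 1 March is later than midnight on 1 March, so the first comparison is false while the date-only comparison is true. Which form is appropriate depends on whether the time of day should count.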
doc/source/whatsnew/v1.4.0.rst

Lines changed: 175 additions & 145 deletions
Large diffs are not rendered by default.

doc/source/whatsnew/v1.5.0.rst

Lines changed: 7 additions & 3 deletions
@@ -31,9 +31,11 @@ enhancement2
 
 Other enhancements
 ^^^^^^^^^^^^^^^^^^
+- :meth:`MultiIndex.to_frame` now supports the argument ``allow_duplicates`` and raises on duplicate labels if it is missing or False (:issue:`45245`)
 - :class:`StringArray` now accepts array-likes containing nan-likes (``None``, ``np.nan``) for the ``values`` parameter in its constructor in addition to strings and :attr:`pandas.NA`. (:issue:`40839`)
 - Improved the rendering of ``categories`` in :class:`CategoricalIndex` (:issue:`45218`)
 - :meth:`to_numeric` now preserves float64 arrays when downcasting would generate values not representable in float32 (:issue:`43693`)
+- :meth:`.GroupBy.min` and :meth:`.GroupBy.max` now support `Numba <https://numba.pydata.org/>`_ execution with the ``engine`` keyword (:issue:`45428`)
 -
 
 .. ---------------------------------------------------------------------------
.. ---------------------------------------------------------------------------
@@ -145,6 +147,7 @@ Other Deprecations
 - Deprecated behavior of :meth:`SparseArray.astype`, :meth:`Series.astype`, and :meth:`DataFrame.astype` with :class:`SparseDtype` when passing a non-sparse ``dtype``. In a future version, this will cast to that non-sparse dtype instead of wrapping it in a :class:`SparseDtype` (:issue:`34457`)
 - Deprecated behavior of :meth:`DatetimeIndex.intersection` and :meth:`DatetimeIndex.symmetric_difference` (``union`` behavior was already deprecated in version 1.3.0) with mixed timezones; in a future version both will be cast to UTC instead of object dtype (:issue:`39328`, :issue:`45357`)
 - Deprecated :meth:`DataFrame.iteritems`, :meth:`Series.iteritems`, :meth:`HDFStore.iteritems` in favor of :meth:`DataFrame.items`, :meth:`Series.items`, :meth:`HDFStore.items` (:issue:`45321`)
+- Deprecated the ``__array_wrap__`` method of DataFrame and Series, rely on standard numpy ufuncs instead (:issue:`45451`)
 -
 

@@ -153,7 +156,7 @@ Other Deprecations
 
 Performance improvements
 ~~~~~~~~~~~~~~~~~~~~~~~~
--
+- Performance improvement in :meth:`DataFrame.duplicated` when subset consists of only one column (:issue:`45236`)
 -
 
 .. ---------------------------------------------------------------------------
@@ -203,7 +206,7 @@ Strings
 
 Interval
 ^^^^^^^^
--
+- Bug in :meth:`IntervalArray.__setitem__` when setting ``np.nan`` into an integer-backed array raising ``ValueError`` instead of ``TypeError`` (:issue:`45484`)
 -
 
 Indexing
@@ -213,6 +216,7 @@ Indexing
 - Bug in :meth:`Series.__setitem__` with a non-integer :class:`Index` when using an integer key to set a value that cannot be set inplace where a ``ValueError`` was raised instead of casting to a common dtype (:issue:`45070`)
 - Bug when setting a value too large for a :class:`Series` dtype failing to coerce to a common type (:issue:`26049`, :issue:`32878`)
 - Bug in :meth:`loc.__setitem__` treating ``range`` keys as positional instead of label-based (:issue:`45479`)
+- Bug in :meth:`Series.__setitem__` where setting :attr:`NA` into a numeric-dtype :class:`Series` would incorrectly upcast to object-dtype rather than treating the value as ``np.nan`` (:issue:`44199`)
 -
 
 Missing
@@ -228,7 +232,7 @@ MultiIndex
 I/O
 ^^^
 - Bug in :meth:`DataFrame.to_stata` where no error is raised if the :class:`DataFrame` contains ``-np.inf`` (:issue:`45350`)
--
+- Bug in :meth:`DataFrame.info` where a new line at the end of the output is omitted when called on an empty :class:`DataFrame` (:issue:`45494`)
 
 Period
 ^^^^^^

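The v1.5.0 deprecation entry above points from `iteritems` to `items`. A brief sketch of the replacement calls on `DataFrame` and `Series`, using invented data (`HDFStore.items` is not shown here):

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

# DataFrame.items yields (column label, Series) pairs.
column_sums = {label: int(col.sum()) for label, col in df.items()}

s = pd.Series([10, 20], index=["a", "b"])

# Series.items yields (index label, value) pairs.
pairs = list(s.items())

print(column_sums)
print(pairs)
```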
pandas/_version.py

Lines changed: 1 addition & 1 deletion
@@ -199,7 +199,7 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose):
         # refs/heads/ and refs/tags/ prefixes that would let us distinguish
         # between branches and tags. By ignoring refnames without digits, we
         # filter out many common branch names like "release" and
-        # "stabilization", as well as "HEAD" and "master".
+        # "stabilization", as well as "HEAD" and "main".
         tags = {r for r in refs if re.search(r"\d", r)}
         if verbose:
             print("discarding '%s', no digits" % ",".join(refs - tags))

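The comment edited above describes a heuristic: refnames without a digit are treated as branch names and discarded. A standalone sketch of that set comprehension, run on hypothetical refnames that are not taken from the repository:

```python
import re

# Hypothetical refnames, as they might look once any prefixes are gone.
refs = {"HEAD", "main", "release", "stabilization", "v1.4.0", "1.5.x"}

# Keep only refnames containing at least one digit, as in the hunk above.
tags = {r for r in refs if re.search(r"\d", r)}
discarded = refs - tags

print(sorted(tags))
print(sorted(discarded))
```

The heuristic would also keep a branch whose name happens to contain a digit, which is why the original comment says it filters out "many" common branch names, not all of them.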
pandas/core/array_algos/putmask.py

Lines changed: 2 additions & 0 deletions
@@ -126,6 +126,8 @@ def putmask_without_repeat(
     mask : np.ndarray[bool]
     new : Any
     """
+    new = setitem_datetimelike_compat(values, mask.sum(), new)
+
     if getattr(new, "ndim", 0) >= 1:
         new = new.astype(values.dtype, copy=False)

pandas/core/arrays/base.py

Lines changed: 17 additions & 4 deletions
@@ -427,7 +427,7 @@ def __contains__(self, item: object) -> bool | np.bool_:
             if not self._can_hold_na:
                 return False
             elif item is self.dtype.na_value or isinstance(item, self.dtype.type):
-                return self._hasnans
+                return self._hasna
             else:
                 return False
         else:
@@ -606,7 +606,7 @@ def isna(self) -> np.ndarray | ExtensionArraySupportsAnyAll:
         raise AbstractMethodError(self)
 
     @property
-    def _hasnans(self) -> bool:
+    def _hasna(self) -> bool:
         # GH#22680
         """
         Equivalent to `self.isna().any()`.
@@ -628,6 +628,16 @@ def _values_for_argsort(self) -> np.ndarray:
         See Also
         --------
         ExtensionArray.argsort : Return the indices that would sort this array.
+
+        Notes
+        -----
+        The caller is responsible for *not* modifying these values in-place, so
+        it is safe for implementors to give views on `self`.
+
+        Functions that use this (e.g. ExtensionArray.argsort) should ignore
+        entries with missing values in the original array (according to `self.isna()`).
+        This means that the corresponding entries in the returned array don't need to
+        be modified to sort correctly.
         """
         # Note: this is used in `ExtensionArray.argsort`.
         return np.array(self)
@@ -698,7 +708,7 @@ def argmin(self, skipna: bool = True) -> int:
         ExtensionArray.argmax
         """
         validate_bool_kwarg(skipna, "skipna")
-        if not skipna and self._hasnans:
+        if not skipna and self._hasna:
             raise NotImplementedError
         return nargminmax(self, "argmin")

@@ -722,7 +732,7 @@ def argmax(self, skipna: bool = True) -> int:
         ExtensionArray.argmin
         """
         validate_bool_kwarg(skipna, "skipna")
-        if not skipna and self._hasnans:
+        if not skipna and self._hasna:
             raise NotImplementedError
         return nargminmax(self, "argmax")

@@ -1534,6 +1544,9 @@ def _empty(cls, shape: Shape, dtype: ExtensionDtype):
         ExtensionDtype.empty
             ExtensionDtype.empty is the 'official' public version of this API.
         """
+        # Implementer note: while ExtensionDtype.empty is the public way to
+        # call this method, it is still required to implement this `_empty`
+        # method as well (it is called internally in pandas)
         obj = cls._from_sequence([], dtype=dtype)
 
         taker = np.broadcast_to(np.intp(-1), shape)

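The Notes added to `_values_for_argsort` say that callers should ignore entries that are missing according to `self.isna()`, so the values stored at those positions need not be adjusted. A rough pure-Python sketch of that contract follows; `argsort_ignoring_na` is a hypothetical helper written for illustration and is not the pandas implementation:

```python
def argsort_ignoring_na(values, mask):
    """Return indices that sort `values`, with masked (missing) positions last.

    Hypothetical helper for illustration only.
    """
    valid = [i for i, is_na in enumerate(mask) if not is_na]
    missing = [i for i, is_na in enumerate(mask) if is_na]
    # list.sort is stable, so equal values keep their original relative order.
    valid.sort(key=lambda i: values[i])
    return valid + missing


# The value stored at the masked position (index 2) is an arbitrary placeholder.
values = [True, False, True, False]
mask = [False, False, True, False]

order = argsort_ignoring_na(values, mask)
has_na = any(mask)  # analogous in spirit to `self.isna().any()`

print(order, has_na)
```

Because the masked position never takes part in the comparison, its placeholder value cannot affect the result. That appears to be the reasoning behind dropping the `BooleanArray` override in the next file, which overwrote masked entries with `-1` before sorting.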
pandas/core/arrays/boolean.py

Lines changed: 0 additions & 18 deletions
@@ -421,24 +421,6 @@ def astype(self, dtype: AstypeArg, copy: bool = True) -> ArrayLike:
         # coerce
         return self.to_numpy(dtype=dtype, na_value=na_value, copy=False)
 
-    def _values_for_argsort(self) -> np.ndarray:
-        """
-        Return values for sorting.
-
-        Returns
-        -------
-        ndarray
-            The transformed values should maintain the ordering between values
-            within the array.
-
-        See Also
-        --------
-        ExtensionArray.argsort : Return the indices that would sort this array.
-        """
-        data = self._data.copy()
-        data[self._mask] = -1
-        return data
-
     def _logical_method(self, other, op):
 
         assert op.__name__ in {"or_", "ror_", "and_", "rand_", "xor", "rxor"}
