CLN: make _Indexer.obj a weak ref #15746

Closed
@jreback

This will allow pandas objects to be collected on the first generation of the gc rather than waiting for it to break cycles. Practically, I am not sure this will make much of a user-visible difference.

dask/distributed#956 (comment)

The idea is to change this code here

from

class _NDFrameIndexer(object):
    _valid_types = None
    _exception = KeyError
    axis = None

    def __init__(self, obj, name):
        self.obj = obj
        self.ndim = obj.ndim
        self.name = name

to

import weakref


class _NDFrameIndexer(object):
    _valid_types = None
    _exception = KeyError
    axis = None

    def __init__(self, obj, name):
        # hold the indexed object weakly so the indexer does not keep it alive
        self.obj = weakref.ref(obj)
        self.ndim = obj.ndim
        self.name = name

and change each corresponding use of self.obj to self.obj(), since a weak reference must be called to get the underlying object.
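The effect of the change can be sketched without pandas. The snippet below is a minimal illustration, not the pandas implementation: `Frame`, `StrongIndexer`, `WeakIndexer` and `freed_without_gc` are hypothetical names, and it assumes the indexer is cached on the object it indexes (which is what would create the reference cycle) and that the interpreter is CPython, where reference counting frees acyclic objects immediately.

```python
import gc
import weakref


class Frame:
    """Hypothetical stand-in for a pandas object that caches its indexer."""

    def __init__(self):
        self.indexer = None


class StrongIndexer:
    def __init__(self, obj):
        # strong reference: frame -> indexer -> frame forms a cycle
        self.obj = obj


class WeakIndexer:
    def __init__(self, obj):
        # weak reference: the indexer no longer keeps the frame alive
        self.obj = weakref.ref(obj)


def freed_without_gc(indexer_cls):
    """Return True if the frame is freed by `del` alone, with no gc pass."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle collector from running during the check
    try:
        frame = Frame()
        frame.indexer = indexer_cls(frame)
        probe = weakref.ref(frame)  # observes whether the frame is still alive
        del frame
        freed = probe() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up any cycle left behind by the strong variant
    return freed


strong_freed = freed_without_gc(StrongIndexer)
weak_freed = freed_without_gc(WeakIndexer)
print("strong reference freed by del alone:", strong_freed)
print("weak reference freed by del alone:", weak_freed)
```

With the strong reference the frame survives `del` until the cycle collector runs; with the weak reference it is released as soon as the last ordinary reference goes away.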

It 'works' in that garbage collection happens immediately upon object deletion (IOW del df), but a few tests fail on caching / chaining. In particular, tests like https://github.com/pandas-dev/pandas/blob/master/pandas/tests/indexing/test_chaining_and_caching.py#L31 were, I think, relying upon the reference NOT being collected (so that they can check it).
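One way a weak reference can surprise code that expects the indexed object to stay alive is sketched below. This is only an illustration under assumptions: `Frame`, `WeakIndexer` and the `ReferenceError` guard are hypothetical, the behavior shown relies on CPython's reference counting, and the issue does not confirm that this is the exact failure the linked tests hit.

```python
import weakref


class WeakIndexer:
    """Hypothetical indexer that holds its target through a weak reference."""

    def __init__(self, obj):
        self._ref = weakref.ref(obj)

    @property
    def obj(self):
        target = self._ref()  # returns None once the target has been freed
        if target is None:
            raise ReferenceError("the indexed object has already been collected")
        return target

    def __getitem__(self, key):
        return self.obj.values[key]


class Frame:
    """Hypothetical stand-in for a pandas object."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def loc(self):
        return WeakIndexer(self)


# The frame is bound to a name, so the weak reference resolves normally.
frame = Frame([10, 20, 30])
kept_alive = frame.loc[1]

# The temporary frame has no other owner, so in CPython it is freed as soon
# as the attribute access finishes and the indexer is left dangling.
dangling = Frame([1, 2, 3]).loc
try:
    dangling[0]
    outcome = "resolved"
except ReferenceError:
    outcome = "dangling"

print(kept_alive, outcome)
```

Any internal code that stores an indexer and later dereferences it would need a guard of this kind, or would need to hold its own strong reference to the object for as long as it needs it.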

So this would require some internal reworking to remove or fix that reliance. I suspect we will still achieve the same user-facing effects (meaning detection of chaining, etc.).

Labels: Clean, Internals (related to non-user accessible pandas implementation)