Optimize pathlib path pickling #112855

Closed
@barneygale

pathlib.PurePath.__reduce__() currently accesses and returns the parts tuple. Pathlib ensures that the strings therein are interned.

There's a good reason to do this: it ensures that the pickled data is as small as possible, with maximum re-use of small string objects.
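For anyone who wants to see what their interpreter currently pickles, here is a small sketch that inspects the reduce value directly. It only uses public `pathlib` and `pickle` calls; the example path is arbitrary, and the printed arguments will differ between Python versions depending on which form `__reduce__()` returns.

```python
import pickle
from pathlib import PurePosixPath

path = PurePosixPath("a/b", "c/./d")

# __reduce__() returns (callable, args); pickle stores both and later
# calls callable(*args) to rebuild the object.
cls, args = path.__reduce__()[:2]
print(cls.__name__, args)

# Whatever form the arguments take, they must rebuild an equal path.
rebuilt = cls(*args)
restored = pickle.loads(pickle.dumps(path))
print(rebuilt == path, restored == path)
```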

However, it comes with some disadvantages:

  1. When normalising any path, we need to call sys.intern(str(part)) on each part.
  2. When pickling a path, we must join, parse and normalise, and then generate the parts tuple.

We could instead make __reduce__() return the raw paths fed to the constructor (the _raw_paths attribute). This would be faster but less space efficient. With the cost of storage and bandwidth falling at a faster rate than compute, I suspect this trade-off is worth making.
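To get a rough feel for the space side of this trade-off, the sketch below pickles two kinds of constructor arguments for the same set of paths: interned parts tuples and raw strings. It does not touch pathlib internals; `to_parts` is a hypothetical stand-in for pathlib's parsing, and the sample paths are made up. Because pickle memoizes objects by identity, interned parts that recur across paths are written once and referenced afterwards.

```python
import pickle
import sys

# Made-up sample: 100 paths sharing the same leading directories.
raw_paths = [f"/srv/data/project/file{i}.txt" for i in range(100)]


def to_parts(raw):
    """Hypothetical stand-in for pathlib parsing: split an absolute
    POSIX-style path into a tuple of interned parts."""
    return ("/",) + tuple(sys.intern(s) for s in raw.split("/") if s)


# Arguments as a parts-based __reduce__() might return them.
parts_args = [to_parts(raw) for raw in raw_paths]
# Arguments as a raw-paths-based __reduce__() might return them.
raw_args = [(raw,) for raw in raw_paths]

size_parts = len(pickle.dumps(parts_args))
size_raw = len(pickle.dumps(raw_args))
print(f"parts: {size_parts} bytes, raw: {size_raw} bytes")
```

The parts form only wins on size when many paths are pickled together and share components; for a single path pickled on its own there is nothing to re-use. The time cost of producing the parts can be compared separately with `timeit`.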
