Skip to content

Feedback on new files() and Traverseable API #90

Closed
@jaraco

Description

@jaraco

In GitLab by @gregory.szorc on Mar 15, 2020, 18:04

I'm attempting to implement the new files() and Traverseable APIs in PyOxidizer and am struggling to figure it out.

First, the new classes in abc.py don't have type annotations. I think I can work them out. But being explicit would be much appreciated. Having type annotations would also match the rest of the file.

Second, the overloading of Traverseable to be both a resource loader and an actual resource is a bit confusing. Elsewhere in the importlib world, there is a separation between the interface for loading things and the things returned by that interface. It is weird to me that a single Traverseable both represents the returned value from files() (a logical collection of resources or a mechanism to obtain resources) as well as a handle on an individual resource. I would like for these interfaces to be teased apart, if possible. More on this later.

Third - and this is related to the last point - I just don't understand how you are supposed to handle directories in all cases. Because Traversable is both a loader and a handle on an individual resource, what are we supposed to do for operations like calling Traverseable.open() on an object that represents a directory? Are we supposed to return None? Raise an OsError or IoError or whatever the OS emits when you try to open a directory? Do we assume the caller calls is_dir() / is_file() and isn't dumb enough to call open() on a directory? But what if they do?

Fourth, I don't understand why there exists an open, read_text, and read_bytes on Traverseable. read_text and read_bytes can be implemented in terms of open(). So why are they a) abstract b) part of the interface in the first place? If we're going to provide open(), I think it makes sense to expose helper functions for read_text and read_bytes that are implemented in terms of open().

Fifth, to be honest, I'm not a huge fan of open() because it puts a lot of burden on custom implementations. What subset of Path.open() needs to be implemented? Path.open() is supposed to accept mode, buffering, encoding, errors, and newline arguments. Since these are all *args, **kwrags-awayed in the interface and the docs only call out mode, what is supposed to be supported? Is it really reasonable for custom implementations (which may not be able to simply call out to builtins.open() because they aren't using a traditional filesystem) to have to implement all this functionality? I would highly encourage reducing the scope of open() to binary file reading. Remove buffering, encoding, errors, and newline support. If people really want to buffer or read resources as text, they can use io.BufferedReader and/or io.TextIOWrapper. And of course, helper APIs that wrap low-level binary-only file-like objects can exist to make read_text() a trivial one-liner. Or consider making the text reading APIs optional and having the default resource reading code go through a polyfill that uses io.* wrappers accordingly.

I want to say the new files() + Traverseable API is a net improvement and thank everybody responsible. But at this point in time, I feel like the pile of warts I described in #58 has more or less been replaced by a different pile of warts :/ I failed to see the new API being designed and feel like I missed an opportunity to influence its design towards something more reasonable for advanced use cases like PyOxidizer. For that I feel bad and apologize. If it isn't too late to make backwards-incompatible changes to the interface, I would strongly vote for reducing the scope of open() and removing read_text and read_bytes. If we do that and clarify the interface and semantics a bit more, I think this new API will be clearly better than what came before.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions