-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
gh-123803: Support arbitrary code page encodings on Windows #123804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-123803: Support arbitrary code page encodings on Windows #123804
Conversation
If the cpXXX encoding is not directly implemented in Python, fall back to use the Windows-specific API codecs.code_page_encode() and codecs.code_page_decode().
64d487b
to
c9a5861
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks fine, and I can see now that it's not so trivial to raise a different exception. Maybe we can improve the current "unknown encoding" exception string to say "encoding '{name}' is not registered" or something that suggests it might be possible to fix by adding the encoding? I expect many users would think that encodings are static.
When you're done making the requested changes, leave the comment: |
I have made the requested changes; please review again. |
Thanks for making the requested changes! @malemburg: please review the changes made to this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @serhiy-storchaka
This looks good now.
Thank you for your review @zooba and @malemburg. I do not particularly like the change in the error message. Can we do without it? |
I guess if people would rather wait and see if the difference in availability causes confusion, then we can do that. I'd prefer to be more clear in the docs, as I don't think developers or users currently expect codecs to be platform-specific like this. |
How about "encoding XYZ not available" ?! I'd also be fine with leaving the current error message in place. The "not registered" is not quite correct, since it assumes that there is a codec with that name available somewhere, it's just not registered. This is not always the case, though, e.g. if you mistype an encoding name. |
"Not registered" at least implies that there's a potential way to fix it, whereas "not available" suggests the user has to come to us to ask us to make it available. (I'd assume in this case they wouldn't read our docs either.) "Not available on this platform" would be okay, but feels less accurate overall, since as you say a mistyped codec isn't going to be available on any platform. We can't seem to customise the message for a known-but-not-present codec though, so it's a bit of a tough spot. |
This reverts commit 57ad6d2.
Thanks, @serhiy-storchaka |
…thonGH-123804) If the cpXXX encoding is not directly implemented in Python, fall back to use the Windows-specific API codecs.code_page_encode() and codecs.code_page_decode().
If the cpXXX encoding is not directly implemented in Python, fall back to use the Windows-specific API codecs.code_page_encode() and codecs.code_page_decode().