Skip to content

gh-74616: Raise ValueError in case of NULL in input prompt #1738

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Dec 7, 2023

Conversation

kushaldas
Copy link
Member

@kushaldas kushaldas commented May 22, 2017

If the input prompt to the builtin input function has any NULL
character, then raise ValueError.

promptstr = "";
}
if (PyBytes_GET_SIZE(po) != strlen(promptstr)) {
PyErr_SetString(PyExc_ValueError,
"input prompt has NULL character");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "input: prompt string cannot contain null characters"? "null" is more common in existing error messages, but "NUL" with one L is also used, since that's the ASCII name of character. NULL refers to the C pointer constant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I will modify it for the same.

@@ -680,6 +680,15 @@ are always available. They are listed here in alphabetical order.
If the :mod:`readline` module was loaded, then :func:`input` will use it
to provide elaborate line editing and history features.

.. versionchanged:: 3.7
If the input prompt string has a NULL character, then it will raise
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NULL means a null pointer in C. Use "a null character" for consistency with other documentation.

But actually I think this change doesn't need special mentioning in the documentation. This is just a bugfix. A null character is prohibited in many other APIs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, will remove this docs update.

@@ -1976,9 +1976,15 @@ builtin_input_impl(PyObject *module, PyObject *prompt)
promptstr = PyBytes_AS_STRING(po);
}
else {
po = NULL;
po = PyBytes_FromString("");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can raise an error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I will do a check for NULL case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I don't need this anymore for the change as mentioned the comment below.

If the input prompt to the builtin input function has any null
character, then raise ValueError.
@@ -127,6 +127,13 @@ def test_init(self):
input()
print("History length:", readline.get_current_history_length())
"""
auto_history_script_with_null = """\
import readline
readline.set_auto_history({})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is readline history related to this issue? Does the test work if remove calls of set_auto_history and get_current_history_length?

Copy link
Member

@vadmium vadmium May 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect the readline module is not needed at all. Perhaps the test would be better off in test_builtin.py. There are already some tests there in the PtyTests class that call input on a terminal, so you may be able to use them for inspiration.

Copy link
Member

@terryjreedy terryjreedy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change, if general, introduces a regression. The doc for input(prompt) says "If the prompt argument is present, it is written to standard output without a trailing newline." The handling of the prompt is up to stdout.write and thence physical displays devices. Python does not specify what happens with any particular character, including \r, \t, and astral chars. [This is an abbreviation of my 2017-05-27 posts on the issue]

@brettcannon
Copy link
Member

To try and help move older pull requests forward, we are going through and backfilling 'awaiting' labels on pull requests that are lacking the label. Based on the current reviews, the best we can tell in an automated fashion is that a core developer requested changes to be made to this pull request.

If/when the requested changes have been made, please leave a comment that says, I have made the requested changes; please review again. That will trigger a bot to flag this pull request as ready for a follow-up review.

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Feb 19, 2022
@erlend-aasland erlend-aasland changed the title bpo-30431: Raise ValueError in case of NULL in input prompt gh-74616: Raise ValueError in case of NULL in input prompt Jun 29, 2022
@github-actions github-actions bot removed the stale Stale PR or inactive for long period of time. label Jul 28, 2022
@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Aug 28, 2022
@iritkatriel
Copy link
Member

This change, if general, introduces a regression. The doc for input(prompt) says "If the prompt argument is present, it is written to standard output without a trailing newline." The handling of the prompt is up to stdout.write and thence physical displays devices. Python does not specify what happens with any particular character, including \r, \t, and astral chars. [This is an abbreviation of my 2017-05-27 posts on the issue]

@terryjreedy - are you saying that this PR should be rejected, or did I misunderstand?

@github-actions github-actions bot removed the stale Stale PR or inactive for long period of time. label Nov 28, 2022
@terryjreedy
Copy link
Member

Yes, to me the patch is wrong. The doc still says "If the prompt argument is present, it is written to standard output without a trailing newline. " The issue OP, ghost, merely noted that the prompt was truncated. I believe 'ghost' wanted it to not be truncated. To me, raising instead of writing something is worse. It is not normal for Python to check strings for null. Print and sys.stdout.write and sys.stderr.write do not. From my testing of builtin functions, only the filename in open and compile is. The attr functions and int and iter do not.

Eryk says the prompt goes to stderr on terminals. If this is considered correct, then the doc should be changed.

@eryksun
Copy link
Contributor

eryksun commented Nov 28, 2022

It is not normal for Python to check strings for null.

If a string is passed to C as a null-terminated string, ValueError should be raised if it contains embedded null characters. Allowing the string to get truncated arbitrarily is poorly defined behavior.

PyOS_Readline() in "Parser/myreadline.c" takes the prompt to write as a null-terminated string because that's what the GNU Readline API uses.

Print and sys.stdout.write and sys.stderr.write do not.

The underlying calls to POSIX write() and WinAPI WriteFile() and WriteConsoleW() require the number of data elements to write, so embedded null characters aren't a problem.


Eryk says the prompt goes to stderr on terminals. If this is considered correct, then the doc should be changed.

If sys.stdin and sys.stdout are terminal files and respectively the same OS files as the C stdin and stdout streams, then input() is implemented by PyOS_Readline().

The default implementation of PyOS_Readline() is PyOS_StdioReadline(), which writes the prompt to the C stderr stream1. However, if the readline module is imported, then PyOS_Readline() typically uses GNU Readline on Unix systems. GNU Readline writes the prompt to stdout, not to stderr. Usually a script does not import the readline module.

You can examine the different behavior on a Unix system by executing a test script with the single line input('prompt: '). If you run it via python test.py 2>stderr.txt, the prompt will be written to "stderr.txt", and you won't have GNU Readline editing keys (e.g. home and end). If you run it via python -i test.py 2>stderr.txt, the prompt will be written to the terminal, and you'll have GNU Readline editing keys.

Footnotes

  1. On Windows, if C stderr is a console file and legacy mode isn't enabled, we actually re-encode the prompt from UTF-8 to UTF-16 and write it directly to the console via WriteConsoleW().

@terryjreedy
Copy link
Member

It would seem then that calling sys.stdout.write('prompt') (or sys.stderr? when appropriate?) and then PyOS_Readline(), with no prompt, would solve the truncation issue.

As for the destination, I am just suggesting adding 'or error' to the current doc to match the behavior.

@eryksun
Copy link
Contributor

eryksun commented Nov 28, 2022

It would seem then that calling sys.stdout.write('prompt') (or sys.stderr? when appropriate?) and then PyOS_Readline(), with no prompt, would solve the truncation issue.

Unfortunately GNU Readline needs to be passed the prompt in order to set the margin for editing. Without it, for example, pressing the home key will move the cursor to the first column of the terminal instead of the first column after the prompt.

@serhiy-storchaka
Copy link
Member

In Python interface to OS or other libraries an error is raised if the underlying function does not support embedded NULs (for example if it takes argument as a null-terminated string char*). In many cases the check is indirect (when use PyArg_Parse with the "s" format unit) or is embedded in other C API functions like PyUnicode_FSDecoder or PyUnicode_AsWideCharString, in other cases it is explicit.

And since the readline API accepts argument as null-terminated string, we should ensure that it does not contain embedded NULs.

@terryjreedy
Copy link
Member

Eryk's cursor control (prompt avoidance) issue, which IDLE has now, convinced me to drop my objection to the code change.

@eryksun
Copy link
Contributor

eryksun commented Nov 28, 2022

@serhiy-storchaka, do you think PyRun_InteractiveOneObjectEx() in "Python/pythonrun.c" should raise a warning if sys.ps1 or sys.ps2 contain embedded null characters? I don't think raising an error would be appropriate for the REPL.

@serhiy-storchaka
Copy link
Member

Good question. Since PyRun_InteractiveOneObjectEx() already silences many errors (like absent or non-UTF8-encodable sys.ps1), it would not be much worse if it silently truncate or completely ignore sys.ps1 with a null character. In future it would be better to emit some warnings or error, but it is a different issue, and it is not particularly important in any case.

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Dec 30, 2022
@github-actions github-actions bot removed the stale Stale PR or inactive for long period of time. label Dec 7, 2023
@serhiy-storchaka serhiy-storchaka dismissed terryjreedy’s stale review December 7, 2023 09:54

The objection was dropped.

@serhiy-storchaka serhiy-storchaka enabled auto-merge (squash) December 7, 2023 09:57
@serhiy-storchaka serhiy-storchaka merged commit 4ba15de into python:main Dec 7, 2023
aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
…rompt (pythonGH-1738)

If the input prompt to the builtin input function on terminal has any null
character, then raise ValueError instead of silently truncating it.

Co-authored-by: Serhiy Storchaka <[email protected]>
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
…rompt (pythonGH-1738)

If the input prompt to the builtin input function on terminal has any null
character, then raise ValueError instead of silently truncating it.

Co-authored-by: Serhiy Storchaka <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants