Name suggestions for AttributeErrors (and possibly NameErrors) should not include names with (single) leading underscores

With sufficiently new versions of Python (I think 3.10+?) and Pandas (I think 2.0.x+?):

$ python
Python 3.11.2 (main, Apr  5 2023, 03:08:14) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.DataFrame().append()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zahlman/Desktop/dev/local/pandas_test/pdtest/lib/python3.11/site-packages/pandas/core/generic.py", line 6296, in __getattr__
    return object.__getattribute__(self, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'?

In Pandas 2.0, apparently append was removed from the public API after a prior period of deprecation. But now apparently (judging by one of the answers there, and comments on that answer) lots of people have simply switched their append calls to use the undocumented _append instead, because it seems to work - and they do this at least in part because the error message suggests it to them (despite the clear intent of the Pandas developers that people should not use it).

There is a well-established convention that leading underscores denote names that client code is not intended to use, even if using them would work. Therefore, the error reporting system, which was improved to include suggestions for typo fixes, should not offer such names as suggestions. At least, not ones decorated with a single leading underscore - I guess suggesting dunders has some value, although in those cases the typo is more likely the other way around (the class has something wrongly named, rather than the calling code).

It looks reasonable. Would you mind opening an issue?

I would add that name suggestions starting with _ should only be proposed if the original name starts with _.
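A minimal sketch of that rule, for illustration only. The helper name `suggest_name` is hypothetical, and `difflib.get_close_matches` is used here as a stand-in for the interpreter's own matching, which works differently:

```python
import difflib


def suggest_name(wrong_name, candidates):
    """Return the closest candidate name, or None if nothing is close enough.

    Hypothetical helper: underscore-prefixed candidates are considered only
    when the name that was looked up starts with an underscore itself.
    """
    if not wrong_name.startswith("_"):
        candidates = [c for c in candidates if not c.startswith("_")]
    matches = difflib.get_close_matches(wrong_name, candidates, n=1)
    return matches[0] if matches else None
```

With this filter, a lookup of `append` never yields `_append`, while a typo such as `_apend` can still be matched to `_append`.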

Is it only single _ that should be special?
What about dunders being suggested?

I don’t think suggesting a dunder when the original identifier wasn’t a dunder, or at least didn’t start with an underscore, is going to be helpful, nor is it likely based on edit distance.

I was thinking of the case of a typo’ed dunder, where the underscore rule should not prevent a dunder suggestion.
Agreed that it would be bad to suggest a dunder for a non-underscored typo.

That would fall under the suggestion to still propose underscore names when the original identifier starts with an underscore, no?

Enforcing it in any manner or form would transform it from a convention into a rule or a standard.

See mangling rules: 9. Classes — Python 3.12.2 documentation.

I’m not sure that I see this as enforcement.

I don’t see how this is any different than help ignoring single underscore methods.

As a compromise, maybe suggest with a caveat, if there’s a less verbose way of saying something like

AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'? (Warning: '_append' may not be intended for public use)

It is already somewhat enforced by star-imports ignoring _ names by default.
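For anyone who wants to check the star-import behaviour, here is a small self-contained sketch. The module name `underscore_demo` and its two attributes are made up for the demonstration, and the module deliberately defines no `__all__`:

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name) with one public
# and one underscore-prefixed attribute, and no __all__.
demo = types.ModuleType("underscore_demo")
demo.public_helper = lambda: "public"
demo._private_helper = lambda: "private"
sys.modules["underscore_demo"] = demo

# Run the star-import in a fresh namespace and see which names arrive.
namespace = {}
exec("from underscore_demo import *", namespace)
imported = sorted(name for name in namespace if name != "__builtins__")
print(imported)
```

Only `public_helper` is copied into the namespace; `_private_helper` stays reachable through the module object itself.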

There are precedents:

  • glob() only shows names starting with a dot if the pattern starts with a dot.
  • Tab completion only shows underscored names if the prefix is underscored, and only shows double underscored names if the prefix is double underscored.

They are not completely applicable to this case, because here there is neither a prefix nor a pattern, but the idea is the same: the output is limited, and the limiting rule depends on the input.
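The glob() precedent can be seen with a short sketch like the following. The file names are arbitrary and it runs in a temporary directory:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one ordinary file and one dot-file (arbitrary example names).
    for filename in ("visible.txt", ".hidden.txt"):
        open(os.path.join(tmp, filename), "w").close()

    # A pattern without a leading dot skips dot-files...
    plain = sorted(os.path.basename(p) for p in glob.glob(os.path.join(tmp, "*")))
    # ...while a pattern that starts with a dot asks for them explicitly.
    dotted = sorted(os.path.basename(p) for p in glob.glob(os.path.join(tmp, ".*")))

print(plain, dotted)
```

The hidden name is still there; it is simply not offered unless the input signals that hidden names are wanted.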

Yeah, I’ve been bitten by this in the past. That’s why I keep forgetting it (it is a negative memory). Also, it contrasts with the docs about private variables. IMHO, being part of the language makes it a rule.

But I would still want private variables to be suggested when working with private variables. I think that is the consensus being realized in this thread somehow.

Not at all; here you go.

I still have the suggestion when I raise an AttributeError myself inside a property getter, because of course now x really exists.

>>> class Foo:
...     def __init__(self, x):
...         self._x = x
... 
...     @property
...     def x(self):
...         # some code here that leads to x not being available
...         x_not_available = True
...         if x_not_available:
...             raise AttributeError("x is not available.")
...         return self._x
... 
>>> Foo(1).x
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    Foo(1).x
  File "<python-input-0>", line 10, in x
    raise AttributeError("x is not available.")
AttributeError: x is not available.. Did you mean: '_x'?

There is a simple workaround that involves using this function:

def raise_without_suggestion(error, message):
    raise error(message)

i.e. move the raise statement as far away as possible, so that Python does not know the name of the attribute. But since that loophole could be closed at some point, it would be nice to have something more “official”.

Since the name suggestion logic runs only if the name attribute of the AttributeError is not None, an easy way to disable name suggestions is to set name to None when instantiating the AttributeError:

raise AttributeError("x is not available.", name=None)