Motivation:
For starters I’d like to refer to the very well-written Motivation section of PEP 727 for why the status quo should change: today, third-party tools are used to render per-name/parameter documentation embedded in one large docstring written in tool-specific microsyntaxes.
While there have been numerous attempts to standardize docstrings for names and parameters over long discussions such as Revisiting attribute docstrings and PEP 727: Documentation Metadata in Typing, all of the proposals so far seem to have fallen short in some way because they try to repurpose existing Python syntax for a docstring.
These include some variations of:
class A:
    b = 42
    """Some documentation."""
Downside: As a string it’s easy to run into ambiguous usage such as:
def foo(
    param: str = "some default value"
    """Some documentation""",
):
    ...
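To make the ambiguity concrete: adjacent string literals are already concatenated by the compiler, so in today's Python the would-be docstring above silently becomes part of the default value. A small check (the `return param` body is only added here so the result can be printed):

```python
def foo(
    param: str = "some default value"
    """Some documentation""",
):
    return param


# Adjacent string literals are joined at compile time, so the "docstring"
# ends up inside the default value instead of documenting the parameter.
print(foo())  # some default valueSome documentation
```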
V = 'hello' #: this is the docstring for V
Downside: As a comment its value is difficult to access at runtime.
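As a rough illustration of that downside, here is a minimal sketch showing that the comment leaves no trace after execution, and that a tool has to re-tokenize the source text to recover it. The helper name `doc_comments` is made up for this example:

```python
import io
import tokenize

SOURCE = "V = 'hello' #: this is the docstring for V\n"

namespace = {}
exec(SOURCE, namespace)
# After execution only the value survives; the comment is not stored anywhere.
print(namespace["V"])  # hello


def doc_comments(source):
    """Collect the text of `#:` comments by re-tokenizing the source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [
        tok.string[2:].strip()
        for tok in tokens
        if tok.type == tokenize.COMMENT and tok.string.startswith("#:")
    ]


print(doc_comments(SOURCE))  # ['this is the docstring for V']
```

This only works when the source text is still available, which is not guaranteed at runtime.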
def foo(
    param: Annotated[str, Doc("Some documentation.")]
) -> None: ...
Downside: Too much boilerplate and can’t exist independently from a type hint.
def some_function(
    some_parameter: SomeType -- "Some documentation goes here",
    **kwargs: Any -- "Additional keyword arguments"
) -> SomeReturn -- "Some details about this return type":
    """ Documenting the function itself here """
Downside: Can’t exist independently from a type hint and would be ambiguous as a docstring for a name:
foo = 'hello' -- world # is it a docstring or a negation and a subtraction?
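The line above is already valid syntax today, which is where the ambiguity comes from. Inspecting it with the standard `ast` module shows that it parses as a subtraction whose right operand is a negation:

```python
import ast

tree = ast.parse("foo = 'hello' -- world")
value = tree.body[0].value  # right-hand side of the assignment

# `'hello' -- world` is read as `'hello' - (-world)`.
print(type(value).__name__, type(value.op).__name__)  # BinOp Sub
print(type(value.right).__name__, type(value.right.op).__name__)  # UnaryOp USub
```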
Rationale:
Since the past proposals show that repurposing an existing syntax for a per-name docstring ultimately demands too much compromise, it seems justified to introduce a new per-name docstring syntax which:
- is invalid with the current syntax, so it’s unambiguous, easily identifiable by a human and collapsible by an IDE.
- is short and concise, so as to leave room for the actual content of the docstring.
- is not nestable, so as to avoid ambiguity.
- is optional, so it is used only where it makes sense to individually document a name or a parameter.
- is viable for both variable names and parameters.
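The first point can be checked against a current interpreter. A minimal sketch (the helper `is_valid_today` is just for illustration) confirming that both placements of the proposed `||` marker, which is specified below, are rejected today and therefore cannot collide with existing code:

```python
def is_valid_today(source):
    """Return True if the running interpreter can compile `source`."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


same_line = "RED = 1 || a lovely color\n"
leading = "|| a lovely color\nRED = 1\n"

print(is_valid_today(same_line))    # False
print(is_valid_today(leading))      # False
print(is_valid_today("RED = 1\n"))  # True
```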
Specification:
The main proposal is to introduce a new token, ||, to mark the start of a docstring or a continuation of a docstring from the previous line, and what follows || is the content of the docstring.
In most ways it works just like a comment, except that its content is accessible as a string in the __name_docs__ dict attribute of a module or a class, or the __param_docs__ dict attribute of a function. By default its content is automatically dedented, just like docstrings currently are.
When || appears on the same line as a name, the docstring binds to that name (do imagine there’s syntax highlighting for the docstring):
class Color(Enum):
    RED = 1 || a lovely color
    BLUE = 2 || a moody color
This results in Color.__name_docs__ with a value of:
{'RED': 'a lovely color', 'BLUE': 'a moody color'}
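Since the syntax doesn't exist yet, the binding rule can only be emulated on source text. Below is a rough sketch, not a proposed implementation: the function name and the regular expression are made up for illustration, and it assumes simple `NAME = value || text` lines with no `||` inside string literals (a real tokenizer would have to handle that).

```python
import re

# Hypothetical pattern: a simple name, an assignment, then `||` and the text.
SAME_LINE = re.compile(r"^\s*(?P<name>[A-Za-z_]\w*)\s*=.*?\|\|(?P<doc>.*)$")


def extract_same_line_docs(source):
    """Map each assigned name to the text that follows `||` on its line."""
    docs = {}
    for line in source.splitlines():
        match = SAME_LINE.match(line)
        if match:
            docs[match["name"]] = match["doc"].strip()
    return docs


SOURCE = """\
class Color(Enum):
    RED = 1 || a lovely color
    BLUE = 2 || a moody color
"""

print(extract_same_line_docs(SOURCE))
# {'RED': 'a lovely color', 'BLUE': 'a moody color'}
```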
Note that I’m not sure whether we should allow a docstring to be on a different line from the name when the value spans multiple lines:
class Color(Enum):
    RED = RGB( || a lovely color
        255, 0, 0
    )
Should this be allowed?
class Color(Enum):
    RED = RGB(
        255, 0, 0
    ) || a lovely color
I’ll leave the decision to the discussion.
Then there’s a second form, where || appears at the beginning of a line (ignoring indentation), in which case the docstring binds to the name on the left-hand side of the next line:
class Color(Enum):
    || a lovely color
    RED = RGB(
        255, 0, 0
    )
In both cases above Color.__name_docs__ becomes {'RED': 'a lovely color'}.
A multi-line docstring can be written across multiple lines, with each line starting with || (keeping in mind that an IDE can be made to collapse the docstrings as needed):
class Color(Enum):
    || a lovely color that symbolizes:
    || - life
    || - health
    || - courage
    || - love
    RED = RGB(
        255, 0, 0
    )
and Color.__name_docs__ becomes {'RED': 'a lovely color that symbolizes:\n - life\n - health\n - courage\n - love'}. Note the lack of a trailing newline, for consistency with single-line docstrings.
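The leading form with continuation lines can be emulated on source text in the same spirit. This is only a sketch under stated assumptions: the function name and regular expression are hypothetical, the text of consecutive || lines is joined with newlines and dedented with `textwrap.dedent`, and the bullet lines are assumed to carry one extra space after || (which is what the result quoted above implies).

```python
import re
import textwrap

ASSIGNMENT = re.compile(r"^\s*([A-Za-z_]\w*)\s*=")


def extract_leading_docs(source):
    """Bind each run of `||` lines to the name assigned on the next line."""
    docs = {}
    pending = []  # text of the `||` lines seen since the last non-`||` line
    for line in source.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("||"):
            pending.append(stripped[2:])
            continue
        match = ASSIGNMENT.match(line)
        if match and pending:
            # Dedent the block as a whole, then trim surrounding whitespace
            # so there is no trailing newline.
            docs[match[1]] = textwrap.dedent("\n".join(pending)).strip()
        pending = []
    return docs


SOURCE = """\
class Color(Enum):
    || a lovely color that symbolizes:
    ||  - life
    ||  - health
    ||  - courage
    ||  - love
    RED = RGB(
        255, 0, 0
    )
"""

print(extract_leading_docs(SOURCE))
# {'RED': 'a lovely color that symbolizes:\n - life\n - health\n - courage\n - love'}
```

A run of || lines that is not immediately followed by an assignment is dropped in this sketch; how the real grammar should treat that case is left open here.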
The same rules apply to docstrings for the parameters of a function definition, which I won’t repeat here for brevity.
Mnemonic:
The choice of || as a docstring marker comes from its visual resemblance to a column separator for notes, and from its widely understood meaning of “or”, which in English is a conjunction that can introduce an explanation of a preceding word or phrase.