Better mapping of sys.monitoring branches back to source code #122762

Open
@jaltmayerpizzorno

Feature or enhancement

Proposal:

sys.monitoring allows us to record branches taken during execution, reporting them in terms of their source and destination offsets in bytecode. These offsets are meaningless to developers, however, unless they take the time to generate and study the bytecode disassembly. Code objects provide co_positions() to map bytecode locations to the source code that produced them, but the branches don't always map to locations that make sense to someone looking only at the source code.
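As a rough sketch of how such branch events can be collected, assuming Python 3.12 or later (where sys.monitoring is available): the helper name record_branches and the tool name "branch-demo" are illustrative and not part of this issue. The callback receives the code object plus the source and destination bytecode offsets.

```python
import sys


def record_branches(func, *args):
    """Call func(*args) and return the (source, destination) bytecode
    offsets of the branches reported for func's own code object.

    Hypothetical helper for illustration; requires sys.monitoring (3.12+).
    """
    mon = sys.monitoring
    tool = mon.COVERAGE_ID
    seen = []

    def on_branch(code, src_offset, dst_offset):
        # BRANCH callbacks get the code object and the two offsets.
        seen.append((src_offset, dst_offset))

    mon.use_tool_id(tool, "branch-demo")
    try:
        mon.register_callback(tool, mon.events.BRANCH, on_branch)
        # Enable the event only for this one code object.
        mon.set_local_events(tool, func.__code__, mon.events.BRANCH)
        func(*args)
    finally:
        # Undo everything so the tool id can be reused.
        mon.set_local_events(tool, func.__code__, 0)
        mon.register_callback(tool, mon.events.BRANCH, None)
        mon.free_tool_id(tool)
    return seen
```

The offsets returned this way are exactly the raw numbers discussed here; they still have to be mapped back to source positions to be useful in a report.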

For example, when executed,

def foo(x):
    if x: print(x)
foo(0)

shows a branch from offset 4 to offset 30. Both offsets map to the same source range, 2:7-2:8, which is the condition x of the if statement.

  1           0 RESUME                   0

  2           2 LOAD_FAST                0 (x)
              4 POP_JUMP_IF_FALSE       12 (to 30)
              6 LOAD_GLOBAL              1 (NULL + print)
             16 LOAD_CONST               1 ('x')
             18 CALL                     1
             26 POP_TOP
             28 RETURN_CONST             0 (None)
        >>   30 RETURN_CONST             0 (None)

ex1.py 2:7-2:8 (foo@4) -> 2:7-2:8 (foo@30)

How can/should a coverage tool attempting to report this branch recognize that it is a branch out of the function?
The branch does go to a RETURN_... opcode, but is that reliable?
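The heuristic in question could be sketched like this. It is only an illustration of the check being asked about, not a recommended solution: lands_on_return is a hypothetical name, and the opcode names it relies on (RETURN_VALUE, RETURN_CONST) are an implementation detail that can differ between Python versions, which is precisely the reliability concern.

```python
import dis


def lands_on_return(code, offset):
    """Return True if the instruction at `offset` is a RETURN_* opcode.

    Hypothetical helper; returns False if no instruction starts at `offset`.
    """
    for instr in dis.get_instructions(code):
        if instr.offset == offset:
            return instr.opname.startswith("RETURN_")
    return False
```

A compiler that loads the return value in a separate instruction before returning would make the branch land on that load instead, and this check would miss it.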

Branches from for loops are also reported in a somewhat obscure way. In the example

def foo():
    for i in range(3):
        print(i)

foo()

the branch taken into the block containing the print call actually shows as a branch to the i in line 2:

  1           0 RESUME                   0

  2           2 LOAD_GLOBAL              1 (NULL + range)
             12 LOAD_CONST               1 (3)
             14 CALL                     1
             22 GET_ITER
        >>   24 FOR_ITER                13 (to 54)
             28 STORE_FAST               0 (i)

  3          30 LOAD_GLOBAL              3 (NULL + print)
             40 LOAD_FAST                0 (i)
             42 CALL                     1
             50 POP_TOP
             52 JUMP_BACKWARD           15 (to 24)

  2     >>   54 END_FOR
             56 RETURN_CONST             0 (None)

ex2.py 2:4-3:16 (foo@24) -> 2:8-2:9 (foo@28)

The final branch, taken once the loop is done, also looks odd: it goes not to offset 54, as expected, but to offset 56:

ex2.py 2:4-3:16 (foo@24) -> 2:4-3:16 (foo@56)
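One way to see the mismatch is to list the jump targets encoded in the bytecode and compare them with the destinations that sys.monitoring reports. The sketch below assumes only the dis module; jump_targets is a hypothetical helper name.

```python
import dis


def jump_targets(code):
    """Map each jump instruction's offset to (opname, target offset)."""
    targets = {}
    for instr in dis.get_instructions(code):
        if instr.opcode in dis.hasjrel or instr.opcode in dis.hasjabs:
            # For jump instructions, argval is the resolved target offset.
            targets[instr.offset] = (instr.opname, instr.argval)
    return targets


def foo():
    for i in range(3):
        print(i)


for offset, (opname, target) in sorted(jump_targets(foo.__code__).items()):
    print(f"{offset:4} {opname} -> {target}")
```

In the disassembly above, FOR_ITER at offset 24 encodes a jump to offset 54, while the reported branch destination is 56.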

I've seen unexpected behavior with while loops as well... I can probably recreate it if desired.

@markshannon, @nedbat and I spoke about this issue previously; Mark asked me to open this issue. It should arguably be a bug report rather than a feature request... it can be your call, Mark.

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

I spoke briefly about these issues in my PyCon'24 talk: https://www.youtube.com/watch?v=X9aXWeLC_C0

Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), type-feature (a feature request or enhancement)
