Hello there.
I maintain a library called pyinstrument, a profiler for CPython. It observes a program’s execution and prints output that looks like this:
_ ._ __/__ _ _ _ _ _/_ Recorded: 14:42:13 Samples: 30
/_//_/// /_\ / //_// / //_'/ // Duration: 0.332 CPU time: 0.050
/ _/ v4.2.0
Program: examples/wikipedia_article_word_count.py
0.332 <module>  <string>:1
   [9 frames hidden]  <string>, runpy, <built-in>
      0.330 _run_code  runpy.py:63
      └─ 0.330 <module>  wikipedia_article_word_count.py:1
         ├─ 0.281 main  wikipedia_article_word_count.py:39
         │  └─ 0.278 download  wikipedia_article_word_count.py:15
         │     ├─ 0.274 urlopen  urllib/request.py:139
         │     │     [61 frames hidden]  urllib, http, socket, ssl, <built-in>...
         │     └─ 0.004 read  http/client.py:449
         │           [12 frames hidden]  http, socket, ssl, <built-in>
         └─ 0.049 <module>  urllib/request.py:1
               [39 frames hidden]  urllib, hashlib, <built-in>, http, ss...
I’m currently working on a feature that prints the class name of a function, if that function is a method (e.g. in the above output, it would print HTTPResponse.read, rather than just read). It does this by looking at the first argument of the frame and, if it’s called ‘self’ or ‘cls’, getting the type of that object.
That code can be seen here: pyinstrument/stat_profile.c at e4a6e0805a30f0de0077534ac87d1cfeee53beb4 · joerick/pyinstrument · GitHub. It uses code->co_varnames and frame->f_localsplus to read this information. It’s slightly complicated by the fact that the local might be a ‘cell’ variable, but that’s handled too.
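To make the heuristic concrete, here is a rough pure-Python sketch of the same idea. It is only an illustration, not the C code linked above: class_name_of_frame and the Greeter demo class are hypothetical names, and it reads frame.f_locals, which is exactly the slower path the C version avoids. It also follows the common convention that ‘cls’ already holds the class itself. Cell variables need no special handling here because f_locals resolves them.

```python
import sys


def class_name_of_frame(frame):
    """Guess the class name for a frame, or return None if it is not a method.

    Hypothetical helper: it inspects the first argument name, mirroring the
    'self'/'cls' heuristic described above.
    """
    code = frame.f_code
    # Argument names come first in co_varnames; no arguments means no 'self'.
    if code.co_argcount == 0 or not code.co_varnames:
        return None

    first_arg = code.co_varnames[0]
    if first_arg == "self":
        obj = frame.f_locals.get("self")
        return type(obj).__name__ if obj is not None else None
    if first_arg == "cls":
        obj = frame.f_locals.get("cls")
        # Assumption: 'cls' is the class object itself, as in a classmethod.
        return obj.__name__ if isinstance(obj, type) else None
    return None


class Greeter:
    def hello(self):
        return class_name_of_frame(sys._getframe())

    @classmethod
    def make(cls):
        return class_name_of_frame(sys._getframe())


def plain():
    return class_name_of_frame(sys._getframe())
```

Calling Greeter().hello() or Greeter.make() gives the class name, while plain() gives None because it has no ‘self’ or ‘cls’ argument.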
(It’s quite important that the technique used is fast. A profiler has to stay low-overhead, because otherwise it distorts the data it collects.)
I’ve upgraded this branch to CPython 3.11, and I’ve now got a problem. I can’t access these fields any more, as they’ve become private to the interpreter. The release notes say “f_localsplus: no public API (renamed to f_frame.localsplus)”. co_varnames isn’t mentioned in the release notes, but it’s gone from the headers.
The only way I can see to work around this is to use the new PyFrame_GetLocals function. But that would have the side effect of calling ‘fast-to-locals’ on every frame in the program, which is something I want to avoid: it seems like it could have a pretty major performance impact, either adding profiling overhead or just changing how the program performs.
So my question is - is there a better way to get the class name of a code/frame object? The variable named ‘self’/‘cls’ is a bit of a hack anyway. Perhaps there’s a static way that just uses the ‘code’ object? Or, is the ‘fast-to-locals’ thing not as much of an issue as I’m making it?
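To illustrate what a static, code-object-only approach might look like: from Python 3.11, code objects carry a co_qualname attribute holding the function’s qualified name. The sketch below is one possibility under that assumption, not a confirmed answer. static_class_name is a hypothetical helper, and it returns None on older versions where the attribute does not exist. Note that it reports the class in which the function was defined, which can differ from the runtime type of ‘self’ for inherited methods.

```python
def static_class_name(code):
    """Derive a class name from a code object's qualified name, if any.

    Hypothetical helper; relies on co_qualname, available from Python 3.11.
    """
    qualname = getattr(code, "co_qualname", None)
    if qualname is None:
        return None  # older Python: no qualified name on the code object

    parts = qualname.split(".")
    # "func" has no enclosing scope; "outer.<locals>.func" is a nested
    # function rather than a method, so neither yields a class name.
    if len(parts) < 2 or parts[-2] == "<locals>":
        return None
    return parts[-2]
```

For a method such as HTTPResponse.read, the qualified name is "HTTPResponse.read", so the helper would return "HTTPResponse" without touching the frame’s locals.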