Description
Some of the relevant fields in the interpreter state and the frame state in 3.12 are very challenging to fetch from out-of-process tools because they sit at offsets that depend on compilation or platform variables, which differ from platform to platform. On top of that, the tools have to copy a huge number of intermediate structures, which makes the whole thing very verbose.
As an example, this is the list of definitions that need to be copied so that out-of-process debuggers can fetch the interpreters, runtime state and thread list:
https://gist.github.com/godlygeek/271951b20bb4c3783c2dd7c80908b116
A simple reordering would shrink this to:
https://gist.github.com/godlygeek/341ce879a638c0fece9d0081d63e5ad9
The interpreter state is also quite bad. Here are the things that need to be copied:
https://gist.github.com/godlygeek/2468ff3d0f648a1aca7a8305bad7f825
Not only that, but the layout depends on the compile-time values of all of the following:
int PYSTACK_SIZEOF_VOID_P = sizeof(void*);
int ALIGNMENT = PYSTACK_SIZEOF_VOID_P > 4 ? 16 : 8;
int SMALL_REQUEST_THRESHOLD = 512;
int NB_SMALL_SIZE_CLASSES = (SMALL_REQUEST_THRESHOLD / ALIGNMENT);
int OBMALLOC_USED_POOLS_SIZE = (2 * ((NB_SMALL_SIZE_CLASSES + 7) / 8) * 8);
int USE_LARGE_ARENAS = PYSTACK_SIZEOF_VOID_P > 4;
int ARENA_BITS = USE_LARGE_ARENAS ? 20 : 18;
int ARENA_SIZE = 1 << ARENA_BITS;
int ARENA_SIZE_MASK = ARENA_SIZE - 1;
int POINTER_BITS = 8 * PYSTACK_SIZEOF_VOID_P;
int IGNORE_BITS = 0;
int USE_INTERIOR_NODES = PYSTACK_SIZEOF_VOID_P > 4;
int ADDRESS_BITS = (POINTER_BITS - IGNORE_BITS);
int INTERIOR_BITS = USE_INTERIOR_NODES ? ((ADDRESS_BITS - ARENA_BITS + 2) / 3) : 0;
int MAP_TOP_BITS = INTERIOR_BITS;
int MAP_TOP_LENGTH = (1 << MAP_TOP_BITS);
int MAP_TOP_MASK = (MAP_TOP_LENGTH - 1);
int MAP_MID_BITS = INTERIOR_BITS;
int MAP_MID_LENGTH = (1 << MAP_MID_BITS);
int MAP_BOT_BITS = (ADDRESS_BITS - ARENA_BITS - 2 * INTERIOR_BITS);
int MAP_BOT_LENGTH = (1 << MAP_BOT_BITS);
int WITH_PYMALLOC_RADIX_TREE = 1;
int USE_LARGE_POOLS = USE_LARGE_ARENAS ? WITH_PYMALLOC_RADIX_TREE : 0;
int POOL_BITS = USE_LARGE_POOLS ? 14 : 12;
int POOL_SIZE = (1 << POOL_BITS);
int MAX_POOLS_IN_ARENA = (ARENA_SIZE / POOL_SIZE);
If the user changes any of these (like WITH_PYMALLOC_RADIX_TREE) when compiling Python, then the tools won't be able to work correctly.
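To make the dependency concrete, here is a minimal sketch in C that re-derives a few of the values above from the two inputs that vary between builds. The `obmalloc_params` struct and the `derive_obmalloc_params` helper are hypothetical names made up for this illustration; the formulas are copied from the list above and are not taken from the CPython headers directly.

```c
#include <assert.h>

/* Hypothetical container for a few of the derived values listed above. */
typedef struct {
    int used_pools_len;      /* OBMALLOC_USED_POOLS_SIZE */
    int arena_bits;          /* ARENA_BITS */
    int interior_bits;       /* INTERIOR_BITS */
    int map_bot_bits;        /* MAP_BOT_BITS */
    int pool_size;           /* POOL_SIZE */
    int max_pools_in_arena;  /* MAX_POOLS_IN_ARENA */
} obmalloc_params;

/* Re-derive the values for a given pointer size and radix-tree setting,
 * following the same formulas as the list above (IGNORE_BITS is 0). */
static obmalloc_params
derive_obmalloc_params(int sizeof_void_p, int with_radix_tree)
{
    assert(sizeof_void_p == 4 || sizeof_void_p == 8);

    obmalloc_params p;
    int alignment = sizeof_void_p > 4 ? 16 : 8;
    int nb_small_size_classes = 512 / alignment;
    p.used_pools_len = 2 * ((nb_small_size_classes + 7) / 8) * 8;

    int use_large_arenas = sizeof_void_p > 4;
    p.arena_bits = use_large_arenas ? 20 : 18;

    int address_bits = 8 * sizeof_void_p;
    p.interior_bits = use_large_arenas
        ? (address_bits - p.arena_bits + 2) / 3
        : 0;
    p.map_bot_bits = address_bits - p.arena_bits - 2 * p.interior_bits;

    int use_large_pools = use_large_arenas ? with_radix_tree : 0;
    int pool_bits = use_large_pools ? 14 : 12;
    p.pool_size = 1 << pool_bits;
    p.max_pools_in_arena = (1 << p.arena_bits) / p.pool_size;
    return p;
}
```

With these formulas, `derive_obmalloc_params(8, 1)` gives a pool size of 16384 and 64 pools per arena, while `derive_obmalloc_params(8, 0)` gives 4096 and 256. Any array sized by these values changes length between the two builds, and so does the offset of every field placed after it.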
We can easily reorder these two structures because they are not in the hot path of anything (unlike frames and code objects).
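A rough illustration of why reordering helps, using two made-up structs rather than the real CPython definitions. `BUILD_DEPENDENT_LEN`, `state_before`, `state_after` and `threads_head` are all hypothetical stand-ins: the first represents any array whose length comes from the build configuration, and the last represents a field an out-of-process tool wants to read.

```c
#include <stddef.h>

/* Stand-in for an array length that depends on the build configuration. */
#ifndef BUILD_DEPENDENT_LEN
#define BUILD_DEPENDENT_LEN 64
#endif

/* Before: the field of interest sits after the build-dependent data,
 * so its offset moves whenever BUILD_DEPENDENT_LEN changes. */
struct state_before {
    void *build_dependent[BUILD_DEPENDENT_LEN];
    void *threads_head;
};

/* After: the field of interest comes first, so its offset is fixed
 * no matter how large the build-dependent data is. */
struct state_after {
    void *threads_head;
    void *build_dependent[BUILD_DEPENDENT_LEN];
};

static size_t offset_before(void)
{
    return offsetof(struct state_before, threads_head);
}

static size_t offset_after(void)
{
    return offsetof(struct state_after, threads_head);
}
```

In this sketch `offset_after()` is always 0, while `offset_before()` is `BUILD_DEPENDENT_LEN * sizeof(void *)`, so a tool reading the first layout has to reproduce the build-dependent definitions just to locate one pointer.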
Linked PRs
- gh-106140: Reorder some fields to facilitate out-of-process inspection #106143
- [3.12] gh-106140: Reorder some fields to facilitate out-of-process inspection (GH-106143) #106147
- gh-106140: Reorder some more fields to facilitate out-of-process inspection #106148
- [3.12] gh-106140: Reorder some more fields to facilitate out-of-process inspection (GH-106148) #106155