Skip to content

calculate_log2_keysize in dictobject.c incorrect #133703

Closed
@colesbury

Description

@colesbury

Bug report

Bug description:

Originally reported by @ThomasBr0 in #132762

The _Py_bit_length() and _BitScanReverse64() code paths are incorrect and don't always return the smallest log2 key size to fit the desired number of items. The bug is benign in the sense that the dictionaries are sometimes too big, but they're never too small.

For example calculate_log2_keysize(7) = 4 but calculate_log2_keysize(8) = 3.

/* Find the smallest dk_size >= minsize. */
static inline uint8_t
calculate_log2_keysize(Py_ssize_t minsize)
{
#if SIZEOF_LONG == SIZEOF_SIZE_T
minsize = (minsize | PyDict_MINSIZE) - 1;
return _Py_bit_length(minsize | (PyDict_MINSIZE-1));
#elif defined(_MSC_VER)
// On 64bit Windows, sizeof(long) == 4.
minsize = (minsize | PyDict_MINSIZE) - 1;
unsigned long msb;
_BitScanReverse64(&msb, (uint64_t)minsize);
return (uint8_t)(msb + 1);
#else
uint8_t log2_size;
for (log2_size = PyDict_LOG_MINSIZE;
(((Py_ssize_t)1) << log2_size) < minsize;
log2_size++)
;
return log2_size;
#endif
}

cc @methane @markshannon

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Linked PRs

Metadata

Metadata

Assignees

No one assigned

    Labels

    interpreter-core(Objects, Python, Grammar, and Parser dirs)type-bugAn unexpected behavior, bug, or error

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions