Scaling bottlenecks in the free-threaded build #118527

Closed
@colesbury

There are a few remaining scaling bottlenecks in the free-threaded build that we should fix.

I have been using the following benchmark to detect bottlenecks that were previously issues in older versions of the nogil forks:
https://gist.github.com/colesbury/429fe9f90036d43ad43576c3d357a12e

Note that for reliable results the above benchmark requires some setup:

  • Adjust NTHREADS if necessary on your system
  • Disable turbo boost or equivalent on your system
  • Avoid running on hyper-threading siblings (i.e., use taskset -c 0-<N> to choose separate physical cores)
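To pick CPU ids for `taskset` that do not share a physical core, one option on Linux is to read the sibling lists the kernel exposes under `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`. The following is only a sketch: the helper names `parse_cpu_list`, `one_cpu_per_core`, and `read_siblings` are made up for illustration and are not part of the benchmark.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0,4" or "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def one_cpu_per_core(siblings_by_cpu):
    """Return sorted logical CPU ids, keeping one id per sibling group.

    siblings_by_cpu maps a logical CPU id to the text of its
    thread_siblings_list file, e.g. {0: "0,4", 4: "0,4"}.
    """
    chosen = set()
    for text in siblings_by_cpu.values():
        group = parse_cpu_list(text)
        if group:
            # Keep the lowest-numbered hyper-thread of each physical core.
            chosen.add(min(group))
    return sorted(chosen)


def read_siblings(root="/sys/devices/system/cpu"):
    """Read sibling lists from sysfs (Linux only; empty dict elsewhere)."""
    pattern = os.path.join(root, "cpu[0-9]*", "topology", "thread_siblings_list")
    result = {}
    for path in glob.glob(pattern):
        cpu_dir = path.split(os.sep)[-3]  # e.g. "cpu12"
        with open(path) as f:
            result[int(cpu_dir[3:])] = f.read()
    return result


if __name__ == "__main__":
    cores = one_cpu_per_core(read_siblings())
    if cores:
        print("taskset -c " + ",".join(map(str, cores)))
    else:
        print("no topology information found")
```

The printed list can be trimmed to the number of threads the benchmark uses before passing it to `taskset -c`.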

Current bottlenecks

  • cmodule_function
  • load_string_const
  • load_tuple_const
  • create_closure
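For readers without the gist at hand, a scaling check of this kind can be sketched roughly as follows. This is not the linked benchmark: `load_tuple_const_like` and `measure_scaling` are hypothetical names, and the workload is only a guess at the general shape of a "load a tuple constant in a loop" test. A result close to the thread count suggests the function scales; a result near 1 (or below) suggests contention, or a build with the GIL enabled.

```python
import threading
import time


def load_tuple_const_like(iterations=100_000):
    # Hypothetical workload: repeatedly load the same tuple constant.
    for _ in range(iterations):
        x = (1, 2, 3)


def measure_scaling(func, nthreads):
    """Return the speedup of running func in nthreads threads.

    Each thread does the same amount of work as the single-threaded
    run, so perfect scaling gives a value close to nthreads.
    """
    start = time.perf_counter()
    func()
    single = time.perf_counter() - start

    # The barrier makes all workers start at (roughly) the same time.
    barrier = threading.Barrier(nthreads + 1)

    def worker():
        barrier.wait()
        func()

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    barrier.wait()
    start = time.perf_counter()
    for t in threads:
        t.join()
    multi = time.perf_counter() - start

    return nthreads * single / multi


if __name__ == "__main__":
    print(f"speedup: {measure_scaling(load_tuple_const_like, 4):.2f}x")
```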

Underlying issues

  • Reference count contention on non-string constants. We will want to immortalize most constants in PyCodeObject.
  • Reference count contention on func.__qualname__ or code.co_qualname (when creating closure)
  • Reference count contention on module-level PyCFunctionObjects
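The common thread in these issues is that many threads touch the same object, so they all update one reference count. A small sketch, assuming CPython, makes the sharing visible; the function names here are illustrative only.

```python
import math


def load_tuple():
    # The tuple is stored once in the code object's co_consts.
    return (1, 2, 3)


def make_closure():
    def inner():
        pass
    return inner


def shared_objects_report():
    """Collect a few observations about objects shared across calls."""
    consts = load_tuple.__code__.co_consts
    return {
        # Every call returns the very same tuple object from co_consts,
        # so every thread calling load_tuple() touches its refcount.
        "tuple_in_co_consts": any(c is load_tuple() for c in consts),
        "same_tuple_each_call": load_tuple() is load_tuple(),
        # Each new closure gets its qualified name from the code object.
        "closure_qualname": make_closure().__qualname__,
        # Module-level C functions are single objects shared by all callers.
        "cfunction_type": type(math.floor).__name__,
    }


if __name__ == "__main__":
    for key, value in shared_objects_report().items():
        print(f"{key}: {value}")
```

Immortalizing such objects, as proposed above for constants, would avoid the reference count updates altogether.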
