concurrent.futures→asyncio state transfer is a bottleneck #134173

Closed
@bdraco

Description

Bug report

Bug description:

The current _copy_future_state implementation requires multiple method calls and lock acquisitions to retrieve the source future's state:

  1. done() - acquires lock to check state
  2. cancelled() - acquires lock again
  3. exception() - acquires lock to get exception
  4. result() - acquires lock to get result

Each method call involves thread synchronization overhead, making this operation a bottleneck for high-frequency executor dispatches.
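The four calls listed above can be sketched as follows, next to a single-lock alternative. This is only an illustration, not the actual asyncio code or the eventual fix: `copy_state_multi_call` and `snapshot_single_lock` are hypothetical names, and the second function assumes the private CPython attributes `_condition`, `_state`, `_result`, and `_exception` of `concurrent.futures.Future`, which are implementation details that may change.

```python
import concurrent.futures


def copy_state_multi_call(source):
    """Read a finished future's state with one public call per step."""
    assert source.done()                 # lock acquisition 1
    if source.cancelled():               # lock acquisition 2
        return ("cancelled", None)
    exc = source.exception()             # lock acquisition 3
    if exc is not None:
        return ("exception", exc)
    return ("result", source.result())   # lock acquisition 4


def snapshot_single_lock(source):
    """Hypothetical helper: read the same state under one lock acquisition.

    Assumption: relies on private CPython attributes, so treat it as a
    sketch of the idea rather than a supported API.
    """
    with source._condition:              # the only lock acquisition
        state = source._state
        if state in ("CANCELLED", "CANCELLED_AND_NOTIFIED"):
            return ("cancelled", None)
        if state != "FINISHED":
            raise RuntimeError("future is not done")
        if source._exception is not None:
            return ("exception", source._exception)
        return ("result", source._result)


fut = concurrent.futures.Future()
fut.set_result(42)
print(copy_state_multi_call(fut))
print(snapshot_single_lock(fut))
```

Both functions return the same `(kind, value)` pair for a completed future; the difference is only in how many times the future's internal lock is taken.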

Our use case involves dispatching a large number of small executor jobs from asyncio to a thread pool. These jobs typically involve open or stat on files that are already cached by the OS, so the actual I/O returns almost instantly. However, we still have to offload them to avoid blocking the event loop, since there's no reliable way to determine in advance whether a read will hit the cache.
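A minimal sketch of that dispatch pattern, assuming the default thread pool executor and a hypothetical `stat_many` helper (the real application code is not shown in this report). Each `run_in_executor` call produces a `concurrent.futures.Future` whose state has to be copied into an asyncio future when the job finishes, so the copy cost is paid once per file.

```python
import asyncio
import os
import tempfile


async def stat_many(paths):
    """Offload one os.stat call per path to the default thread pool."""
    loop = asyncio.get_running_loop()
    # Each call schedules a tiny job; the work itself is nearly free when
    # the metadata is already cached by the OS.
    jobs = [loop.run_in_executor(None, os.stat, path) for path in paths]
    return await asyncio.gather(*jobs)


def demo():
    """Create a few small files and stat them through the executor."""
    with tempfile.TemporaryDirectory() as tmpdir:
        paths = []
        for i in range(5):
            path = os.path.join(tmpdir, f"file{i}.txt")
            with open(path, "w") as fh:
                fh.write("x" * i)
            paths.append(path)
        results = asyncio.run(stat_many(paths))
        return [st.st_size for st in results]


if __name__ == "__main__":
    print(demo())
```

With jobs this small, the per-job scheduling and state-transfer work can outweigh the `stat` call itself, which is the situation the report describes.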

As a result, most of the overhead comes from scheduling rather than from the I/O itself. Most of the time is spent copying future state, which involves locking. This PR reduces that overhead, which has a meaningful impact at scale.

CPython versions tested on:

3.13

Operating systems tested on:

No response
