select.kqueue uses invalid fd after fork #110395

Closed
@sorcio

Description

Bug report

Bug description:

select.kqueue is not aware of forks and will keep using its file descriptor number after a fork. Since kqueues are not inherited by child processes [1], the fd will be invalid in the child, or it may refer to a file that other code has opened since.
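
The fd-number reuse behind this can be illustrated without kqueue or fork at all. The sketch below is only an illustration: `stale_fd_demo` is a hypothetical helper, and a pipe end stands in for the kqueue fd. It relies on the POSIX rule that a newly opened file gets the lowest free descriptor number, and assumes no other thread opens files in between.

```python
import errno
import os

def stale_fd_demo():
    """Return (reused, errno) after a stale fd number is closed a second time."""
    r, w = os.pipe()
    stale = r                  # the number a forgotten owner still remembers
    os.close(r)                # the number is free again
    fd = os.open(os.devnull, os.O_WRONLY)  # lowest free number, normally == stale
    try:
        if fd != stale:
            # Something else took the number first; nothing to demonstrate.
            os.close(fd)
            return False, None
        os.close(stale)        # the forgotten owner "cleans up" someone else's fd
        try:
            os.write(fd, b"x") # the new owner's descriptor is already gone
        except OSError as exc:
            return True, exc.errno
        return True, None
    finally:
        os.close(w)

reused, code = stale_fd_demo()
print(reused, errno.errorcode.get(code))
```

In the bug described here, the forgotten owner is the select.kqueue object in the child, and the new owner is whatever code opened a file after the fork.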

Reproducer for a case where the select.kqueue destructor closes an fd that was opened with open():

import os
import select

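def repro():
    kq = select.kqueue()
    pid = os.fork()
    if pid == 0:
        # Child: the kqueue fd is already gone, so open() can reuse its number.
        f = open("/dev/null", "wb")
        print(f"{f.fileno()=} {kq.fileno()=}")
        # The destructor closes the stale fd number, which now belongs to f.
        del kq
        f.write(b"x")
        # Flushing the buffered byte on close fails with EBADF.
        f.close()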

repro()

Reproducer with asyncio:

import asyncio
import gc
import os

def asyncio_repro():
    loop = asyncio.new_event_loop()
    loop.run_until_complete(asyncio.sleep(0))
    if os.fork() == 0:
        del loop
        with open("/dev/null", "wb") as f:
            gc.collect()
            f.write(b"x")

asyncio_repro()

This will fail with OSError: [Errno 9] Bad file descriptor when operating on f, because its fd was coincidentally closed by the loop destructor. Dropping the reference after fork does not help; it actually makes the problem worse, because the loop becomes cyclic garbage and the kqueue can be closed at a later, less predictable time.
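
Why dropping the reference does not help can be shown with a stand-in object. This is a minimal sketch, not asyncio code: `FakeLoop` and `finalization_order` are hypothetical names, and the self-reference merely imitates an object that is part of a reference cycle.

```python
import gc

class FakeLoop:
    """Hypothetical stand-in for an fd-owning object inside a reference cycle."""

    def __init__(self, events):
        self.events = events
        self.me = self  # cycle: the refcount never reaches zero on its own

    def __del__(self):
        # A real object would close its file descriptor here.
        self.events.append("closed")

def finalization_order():
    events = []
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector out of the demonstration
    try:
        loop = FakeLoop(events)
        del loop                       # does NOT run __del__: the cycle keeps it alive
        events.append("after del")
        gc.collect()                   # the finalizer runs only now
        events.append("after collect")
    finally:
        if was_enabled:
            gc.enable()
    return events

print(finalization_order())
```

Any fd opened between the `del` and the next collection is at risk, and outside of a demonstration the collection happens whenever the garbage collector decides to run.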

In the asyncio example I need a bit of setup: the first loop object needs to be open at fork time [2]. The bug will be observable if, in the child, the loop's kqueue is closed/destructed after a different fd is opened.

I encountered this because I got test_asyncio.test_unix_events.TestFork failures while working on something unrelated (and it's been a pain to debug), not in production code. I guess it can still happen in real-world code though, because there is no proper way to dispose of a select.kqueue object in a forked process, and it's hard to debug a random EBADF that triggers only on Mac/BSD in otherwise-correct code.

I'm willing to work on a fix, if one is desired, but I'll need some guidance on the strategy. I thought select.kqueue objects could be invalidated after fork, but that would add some (small) tracking overhead, so I don't know if it's acceptable.
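
To make the tracking idea concrete, here is a rough user-level sketch of one way it could look. It is an assumption-laden illustration, not a proposed patch: `ForkAwareHandle` and `invalidate_all` are hypothetical names, and the wrapper exists because the sketch needs something weak-referenceable to track. It assumes the after-fork hook runs before any other code in the child opens a file, so closing the stale number at that point cannot hit someone else's fd.

```python
import os
import weakref

class ForkAwareHandle:
    """Hypothetical wrapper around a resource whose fd does not survive fork."""

    _live = weakref.WeakSet()  # the tracking overhead: one weak reference per handle

    def __init__(self, resource):
        self.resource = resource  # anything with a close() method
        self.valid = True
        ForkAwareHandle._live.add(self)

    def close(self):
        # Close at most once, so a later close cannot touch a reused fd number.
        if self.valid:
            self.valid = False
            self.resource.close()

    @classmethod
    def invalidate_all(cls):
        # Meant to run in the child right after fork.
        for handle in list(cls._live):
            try:
                handle.close()
            except OSError:
                pass  # the fd was already closed by the OS

if hasattr(os, "register_at_fork"):
    os.register_at_fork(after_in_child=ForkAwareHandle.invalidate_all)
```

On macOS/BSD this would be used as `kq = ForkAwareHandle(select.kqueue())`. A fix inside the select module could do the equivalent bookkeeping in C and simply mark the object as closed in the child instead of calling close().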

CPython versions tested on:

3.11, 3.12, CPython main branch

Operating systems tested on:

macOS, Other

Linked PRs

Footnotes

  1. The queue itself is not available in children, and the OS automatically closes any fd referring to a kqueue after fork.

  2. This can also happen if the event loop policy holds a reference to a loop that is later dropped (maybe to create a new loop in the child), or if a Runner context is still active at fork time and is exited in the child.

Labels: 3.12 (only security fixes), 3.13 (bugs and security fixes), stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)
