Bug report
Bug description:
select.kqueue is not aware of forks, and will try to use its file descriptor number after fork. Since kqueues are not inherited[1] in children, the fd will be invalid, or it can refer to a file opened by someone else.
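To illustrate why a stale fd number "can refer to a file opened by someone else", here is a minimal sketch of fd number reuse. It does not use kqueue at all; it only relies on the POSIX rule that open() returns the lowest available descriptor number, and it assumes a single-threaded process. The helper name fd_number_is_reused is made up for this example.

```python
import os


def fd_number_is_reused():
    """Return True if a freshly opened file gets the number of a just-closed fd."""
    # Open and immediately close a descriptor; remember only its number.
    stale_fd = os.open(os.devnull, os.O_RDONLY)
    os.close(stale_fd)

    # The next open() gets the lowest free number, which is the one just released
    # (assuming no other thread opens or closes files in between).
    new_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        return new_fd == stale_fd
    finally:
        os.close(new_fd)
```

Any object that still remembers stale_fd and later calls close() on it would close the unrelated new file instead.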
Reproducer for a case where select.kqueue will close the fd opened with open() in its destructor:
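import os
import select

def repro():
    kq = select.kqueue()
    pid = os.fork()
    if pid == 0:
        # Child: the kqueue fd is gone, so open() can reuse its number.
        f = open("/dev/null", "wb")
        print(f"{f.fileno()=} {kq.fileno()=}")
        # The destructor closes the fd number, which now belongs to f.
        del kq
        f.write(b"x")
        f.close()

repro()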
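Reproducer with asyncio:
import asyncio
import gc
import os

def asyncio_repro():
    # The loop (and its kqueue selector) must be open at fork time.
    loop = asyncio.new_event_loop()
    loop.run_until_complete(asyncio.sleep(0))
    if os.fork() == 0:
        # Child: dropping the reference leaves the loop as cyclic garbage.
        del loop
        with open("/dev/null", "wb") as f:
            # Collecting the loop here closes the fd number now used by f.
            gc.collect()
            f.write(b"x")

asyncio_repro()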
This will fail with OSError: [Errno 9] Bad file descriptor when operating on f, because its fd was coincidentally closed by the loop destructor. Dropping the reference after fork does not help; it actually makes the problem worse, because the loop becomes cyclic garbage and the kqueue can be closed at a later, less predictable time.
In the asyncio example I need a bit of setup: the first loop object needs to be open at fork time[2]. The bug will be observable if, in the child, the loop's kqueue is closed/destructed after a different fd is opened.
I encountered this because I got test_asyncio.test_unix_events.TestFork failures while working on something unrelated (and it's been a pain to debug), not in production code. I guess it can still happen in real-world code though, because there is no proper way to dispose of a select.kqueue object in a forked process, and it's hard to debug a random EBADF that triggers only on Mac/BSD in otherwise-correct code.
I'm willing to work on a fix, if one is desired, but I'll need some guidance on the strategy. I thought select.kqueue objects could be invalidated after fork, but that would add some (small) tracking overhead, so I don't know if it's acceptable.
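A rough sketch of what "invalidate after fork" with tracking could look like, written in pure Python rather than in the C module. Everything here is an assumption for illustration: ForkAwareFd is a hypothetical stand-in for select.kqueue that wraps an arbitrary fd, and this is not necessarily how the linked PRs implement it. The only real APIs used are os.register_at_fork and weakref.WeakSet.

```python
import os
import weakref


class ForkAwareFd:
    """Hypothetical stand-in for select.kqueue: owns an fd, forgets it after fork."""

    # Tracking overhead: one weak reference per live object.
    _live = weakref.WeakSet()

    def __init__(self, fd):
        self._fd = fd
        ForkAwareFd._live.add(self)

    def fileno(self):
        if self._fd < 0:
            raise ValueError("I/O operation on closed or invalidated object")
        return self._fd

    def close(self):
        # Swap first so close() is idempotent and never closes a forgotten fd.
        fd, self._fd = self._fd, -1
        if fd >= 0:
            os.close(fd)

    def __del__(self):
        self.close()

    @classmethod
    def _invalidate_all(cls):
        # Runs in the child only. For a real kqueue the OS has already closed
        # the fd, so the object just forgets the number instead of closing it.
        # (For this stand-in the fd is inherited, so forgetting it leaks it.)
        for obj in list(cls._live):
            obj._fd = -1


os.register_at_fork(after_in_child=ForkAwareFd._invalidate_all)
```

With this in place, an object created before os.fork() raises ValueError from fileno() in the child, and its destructor no longer touches whatever file happens to reuse the old fd number. The parent's object is unaffected.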
CPython versions tested on:
3.11, 3.12, CPython main branch
Operating systems tested on:
macOS, Other
Linked PRs
- gh-110395: invalidate open kqueues after fork #110517
- [3.12] gh-110395: invalidate open kqueues after fork (GH-110517) #111745
- gh-110395: test: assert after the child dies. #111816
Footnotes
1. The queue itself is not available in children, and the OS will automatically close any fd referring to a kqueue after fork.
2. This can also happen if the event loop policy holds a reference to a loop that is later dropped (maybe to create a new loop in the child), or if a Runner context is still active at fork time and is exited in the child.