-
Notifications
You must be signed in to change notification settings - Fork 148
selftests/bpf: merge most of test_btf into test_progs #59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Master branch: bf74a37 patch https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/ applied successfully |
…xercised regularly. Pretty-printing tests were left alone and renamed into test_btf_pprint because they are very slow and were not even executed by default with test_btf. All the test_btf tests that were moved are modeled as proper sub-tests in test_progs framework for ease of debugging and reporting. No functional or behavioral changes were intended, I tried to preserve original behavior as close to the original as possible. `test_progs -v` will activate "always_log" flag to emit BTF validation log. Signed-off-by: Andrii Nakryiko <[email protected]> --- v1->v2: - pretty-print BTF tests were renamed test_btf -> test_btf_pprint, which allowed GIT to detect that majority of test_btf code was moved into prog_tests/btf.c; so diff is much-much smaller; tools/testing/selftests/bpf/.gitignore | 2 +- .../bpf/{test_btf.c => prog_tests/btf.c} | 1069 +---------------- tools/testing/selftests/bpf/test_btf_pprint.c | 969 +++++++++++++++ 3 files changed, 1033 insertions(+), 1007 deletions(-) rename tools/testing/selftests/bpf/{test_btf.c => prog_tests/btf.c} (85%) create mode 100644 tools/testing/selftests/bpf/test_btf_pprint.c
Master branch: d317b0a patch https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/ applied successfully |
49583d8
to
6be61d6
Compare
In case of memory pressure the MPTCP xmit path keeps at most a single skb in the tx cache, eventually freeing additional ones. The associated counter for forward memory is not update accordingly, and that causes the following splat: WARNING: CPU: 0 PID: 12 at net/core/stream.c:208 sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208 Modules linked in: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.11.0-rc2 #59 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208 Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e 63 01 00 00 8b ab 00 01 00 00 e9 60 ff ff ff e8 2f 24 d3 fe 0f 0b eb 97 e8 26 24 d3 fe <0f> 0b eb a0 e8 1d 24 d3 fe 0f 0b e9 a5 fe ff ff 4c 89 e7 e8 0e d0 RSP: 0018:ffffc900000c7bc8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88810030ac40 RSI: ffffffff8262ca4a RDI: 0000000000000003 RBP: 0000000000000d00 R08: 0000000000000000 R09: ffffffff85095aa7 R10: ffffffff8262c9ea R11: 0000000000000001 R12: ffff888108908100 R13: ffffffff85095aa0 R14: ffffc900000c7c48 R15: 1ffff92000018f85 FS: 0000000000000000(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa7444baef8 CR3: 0000000035ee9005 CR4: 0000000000170ef0 Call Trace: __mptcp_destroy_sock+0x4a7/0x6c0 net/mptcp/protocol.c:2547 mptcp_worker+0x7dd/0x1610 net/mptcp/protocol.c:2272 process_one_work+0x896/0x1170 kernel/workqueue.c:2275 worker_thread+0x605/0x1350 kernel/workqueue.c:2421 kthread+0x344/0x410 kernel/kthread.c:292 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:296 At close time, as reported by syzkaller/Christoph. This change address the issue properly updating the fwd allocated memory counter in the error path. Reported-by: Christoph Paasch <[email protected]> Closes: multipath-tcp/mptcp_net-next#136 Fixes: 724cfd2 ("mptcp: allocate TX skbs in msk context") Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
The rtla osnoise tool is an interface for the osnoise tracer. The osnoise tracer dispatches a kernel thread per-cpu. These threads read the time in a loop while with preemption, softirqs and IRQs enabled, thus allowing all the sources of osnoise during its execution. The osnoise threads take note of the entry and exit point of any source of interferences, increasing a per-cpu interference counter. The osnoise tracer also saves an interference counter for each source of interference. The rtla osnoise top mode displays information about the periodic summary from the osnoise tracer. One example of rtla osnoise top output is: [root@alien ~]# rtla osnoise top -c 0-3 -d 1m -q -r 900000 -P F:1 Operating System Noise duration: 0 00:01:00 | time is in us CPU Period Runtime Noise % CPU Aval Max Noise Max Single HW NMI IRQ Softirq Thread 0 #58 52200000 1031 99.99802 91 60 0 0 52285 0 101 1 #59 53100000 5 99.99999 5 5 0 9 53122 0 18 2 #59 53100000 7 99.99998 7 7 0 8 53115 0 18 3 #59 53100000 8274 99.98441 277 23 0 9 53778 0 660 "rtla osnoise top --help" works and provide information about the available options. Link: https://lkml.kernel.org/r/0d796993abf587ae5a170bb8415c49368d4999e1.1639158831.git.bristot@kernel.org Cc: Tao Zhou <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Tom Zanussi <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Clark Williams <[email protected]> Cc: John Kacur <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: Daniel Bristot de Oliveira <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Daniel Bristot de Oliveira <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. fentry before bpf trampoline hooked: mov x9, x30 nop fentry after bpf trampoline hooked: mov x9, x30 bl <bpf_trampoline> Tested on qemu, result: #55 fentry_fexit:OK #56 fentry_test:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #101 modify_return:OK Signed-off-by: Xu Kuohai <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. fentry before bpf trampoline hooked: mov x9, x30 nop fentry after bpf trampoline hooked: mov x9, x30 bl <bpf_trampoline> Tested on qemu, result: #55 fentry_fexit:OK #56 fentry_test:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #101 modify_return:OK Signed-off-by: Xu Kuohai <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. fentry before bpf trampoline hooked: mov x9, x30 nop fentry after bpf trampoline hooked: mov x9, x30 bl <bpf_trampoline> Tested on qemu, result: #55 fentry_fexit:OK #56 fentry_test:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #101 modify_return:OK Signed-off-by: Xu Kuohai <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. fentry before bpf trampoline hooked: mov x9, x30 nop fentry after bpf trampoline hooked: mov x9, x30 bl <bpf_trampoline> Tested on qemu, result: #55 fentry_fexit:OK #56 fentry_test:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #101 modify_return:OK Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. fentry before bpf trampoline hooked: mov x9, x30 nop fentry after bpf trampoline hooked: mov x9, x30 bl <bpf_trampoline> Tested on qemu, result: #55 fentry_fexit:OK #56 fentry_test:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #101 modify_return:OK Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. fentry before bpf trampoline hooked: mov x9, x30 nop fentry after bpf trampoline hooked: mov x9, x30 bl <bpf_trampoline> Tested on qemu, result: #55 fentry_fexit:OK #56 fentry_test:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #101 modify_return:OK Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. fentry before bpf trampoline hooked: mov x9, x30 nop fentry after bpf trampoline hooked: mov x9, x30 bl <bpf_trampoline> Tested on qemu, result: #55 fentry_fexit:OK #56 fentry_test:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #101 modify_return:OK Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. fentry before bpf trampoline hooked: mov x9, x30 nop fentry after bpf trampoline hooked: mov x9, x30 bl <bpf_trampoline> Tested on qemu, result: #18 bpf_tcp_ca:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #101 modify_return:OK #233 xdp_bpf2bpf:OK Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result: #9 /1 bpf_cookie/kprobe:OK #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL #9 /4 bpf_cookie/uprobe:OK #9 /5 bpf_cookie/tracepoint:OK #9 /6 bpf_cookie/perf_event:OK #9 /7 bpf_cookie/trampoline:OK #9 /8 bpf_cookie/lsm:OK #9 bpf_cookie:FAIL #18 /1 bpf_tcp_ca/dctcp:OK #18 /2 bpf_tcp_ca/cubic:OK #18 /3 bpf_tcp_ca/invalid_license:OK #18 /4 bpf_tcp_ca/dctcp_fallback:OK #18 /5 bpf_tcp_ca/rel_setsockopt:OK #18 bpf_tcp_ca:OK #51 /1 dummy_st_ops/dummy_st_ops_attach:OK #51 /2 dummy_st_ops/dummy_init_ret_value:OK #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK #51 /4 dummy_st_ops/dummy_multiple_args:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 /1 fexit_bpf2bpf/target_no_callees:OK #57 /2 fexit_bpf2bpf/target_yes_callees:OK #57 /3 fexit_bpf2bpf/func_replace:OK #57 /4 fexit_bpf2bpf/func_replace_verify:OK #57 /5 fexit_bpf2bpf/func_sockmap_update:OK #57 /6 fexit_bpf2bpf/func_replace_return_code:OK #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK #57 /8 fexit_bpf2bpf/func_replace_multi:OK #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #104 modify_return:OK #237 xdp_bpf2bpf:OK bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64. Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result: #9 /1 bpf_cookie/kprobe:OK #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL #9 /4 bpf_cookie/uprobe:OK #9 /5 bpf_cookie/tracepoint:OK #9 /6 bpf_cookie/perf_event:OK #9 /7 bpf_cookie/trampoline:OK #9 /8 bpf_cookie/lsm:OK #9 bpf_cookie:FAIL #18 /1 bpf_tcp_ca/dctcp:OK #18 /2 bpf_tcp_ca/cubic:OK #18 /3 bpf_tcp_ca/invalid_license:OK #18 /4 bpf_tcp_ca/dctcp_fallback:OK #18 /5 bpf_tcp_ca/rel_setsockopt:OK #18 bpf_tcp_ca:OK #51 /1 dummy_st_ops/dummy_st_ops_attach:OK #51 /2 dummy_st_ops/dummy_init_ret_value:OK #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK #51 /4 dummy_st_ops/dummy_multiple_args:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 /1 fexit_bpf2bpf/target_no_callees:OK #57 /2 fexit_bpf2bpf/target_yes_callees:OK #57 /3 fexit_bpf2bpf/func_replace:OK #57 /4 fexit_bpf2bpf/func_replace_verify:OK #57 /5 fexit_bpf2bpf/func_sockmap_update:OK #57 /6 fexit_bpf2bpf/func_replace_return_code:OK #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK #57 /8 fexit_bpf2bpf/func_replace_multi:OK #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #104 modify_return:OK #237 xdp_bpf2bpf:OK bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64. Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result: #9 /1 bpf_cookie/kprobe:OK #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL #9 /4 bpf_cookie/uprobe:OK #9 /5 bpf_cookie/tracepoint:OK #9 /6 bpf_cookie/perf_event:OK #9 /7 bpf_cookie/trampoline:OK #9 /8 bpf_cookie/lsm:OK #9 bpf_cookie:FAIL #18 /1 bpf_tcp_ca/dctcp:OK #18 /2 bpf_tcp_ca/cubic:OK #18 /3 bpf_tcp_ca/invalid_license:OK #18 /4 bpf_tcp_ca/dctcp_fallback:OK #18 /5 bpf_tcp_ca/rel_setsockopt:OK #18 bpf_tcp_ca:OK #51 /1 dummy_st_ops/dummy_st_ops_attach:OK #51 /2 dummy_st_ops/dummy_init_ret_value:OK #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK #51 /4 dummy_st_ops/dummy_multiple_args:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 /1 fexit_bpf2bpf/target_no_callees:OK #57 /2 fexit_bpf2bpf/target_yes_callees:OK #57 /3 fexit_bpf2bpf/func_replace:OK #57 /4 fexit_bpf2bpf/func_replace_verify:OK #57 /5 fexit_bpf2bpf/func_sockmap_update:OK #57 /6 fexit_bpf2bpf/func_replace_return_code:OK #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK #57 /8 fexit_bpf2bpf/func_replace_multi:OK #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #104 modify_return:OK #237 xdp_bpf2bpf:OK bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64. Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result: #9 /1 bpf_cookie/kprobe:OK #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL #9 /4 bpf_cookie/uprobe:OK #9 /5 bpf_cookie/tracepoint:OK #9 /6 bpf_cookie/perf_event:OK #9 /7 bpf_cookie/trampoline:OK #9 /8 bpf_cookie/lsm:OK #9 bpf_cookie:FAIL #18 /1 bpf_tcp_ca/dctcp:OK #18 /2 bpf_tcp_ca/cubic:OK #18 /3 bpf_tcp_ca/invalid_license:OK #18 /4 bpf_tcp_ca/dctcp_fallback:OK #18 /5 bpf_tcp_ca/rel_setsockopt:OK #18 bpf_tcp_ca:OK #51 /1 dummy_st_ops/dummy_st_ops_attach:OK #51 /2 dummy_st_ops/dummy_init_ret_value:OK #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK #51 /4 dummy_st_ops/dummy_multiple_args:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 /1 fexit_bpf2bpf/target_no_callees:OK #57 /2 fexit_bpf2bpf/target_yes_callees:OK #57 /3 fexit_bpf2bpf/func_replace:OK #57 /4 fexit_bpf2bpf/func_replace_verify:OK #57 /5 fexit_bpf2bpf/func_sockmap_update:OK #57 /6 fexit_bpf2bpf/func_replace_return_code:OK #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK #57 /8 fexit_bpf2bpf/func_replace_multi:OK #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #104 modify_return:OK #237 xdp_bpf2bpf:OK bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64. Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result: #9 /1 bpf_cookie/kprobe:OK #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL #9 /4 bpf_cookie/uprobe:OK #9 /5 bpf_cookie/tracepoint:OK #9 /6 bpf_cookie/perf_event:OK #9 /7 bpf_cookie/trampoline:OK #9 /8 bpf_cookie/lsm:OK #9 bpf_cookie:FAIL #18 /1 bpf_tcp_ca/dctcp:OK #18 /2 bpf_tcp_ca/cubic:OK #18 /3 bpf_tcp_ca/invalid_license:OK #18 /4 bpf_tcp_ca/dctcp_fallback:OK #18 /5 bpf_tcp_ca/rel_setsockopt:OK #18 bpf_tcp_ca:OK #51 /1 dummy_st_ops/dummy_st_ops_attach:OK #51 /2 dummy_st_ops/dummy_init_ret_value:OK #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK #51 /4 dummy_st_ops/dummy_multiple_args:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 /1 fexit_bpf2bpf/target_no_callees:OK #57 /2 fexit_bpf2bpf/target_yes_callees:OK #57 /3 fexit_bpf2bpf/func_replace:OK #57 /4 fexit_bpf2bpf/func_replace_verify:OK #57 /5 fexit_bpf2bpf/func_sockmap_update:OK #57 /6 fexit_bpf2bpf/func_replace_return_code:OK #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK #57 /8 fexit_bpf2bpf/func_replace_multi:OK #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #104 modify_return:OK #237 xdp_bpf2bpf:OK bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64. Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result: #9 /1 bpf_cookie/kprobe:OK #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL #9 /4 bpf_cookie/uprobe:OK #9 /5 bpf_cookie/tracepoint:OK #9 /6 bpf_cookie/perf_event:OK #9 /7 bpf_cookie/trampoline:OK #9 /8 bpf_cookie/lsm:OK #9 bpf_cookie:FAIL #18 /1 bpf_tcp_ca/dctcp:OK #18 /2 bpf_tcp_ca/cubic:OK #18 /3 bpf_tcp_ca/invalid_license:OK #18 /4 bpf_tcp_ca/dctcp_fallback:OK #18 /5 bpf_tcp_ca/rel_setsockopt:OK #18 bpf_tcp_ca:OK #51 /1 dummy_st_ops/dummy_st_ops_attach:OK #51 /2 dummy_st_ops/dummy_init_ret_value:OK #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK #51 /4 dummy_st_ops/dummy_multiple_args:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 /1 fexit_bpf2bpf/target_no_callees:OK #57 /2 fexit_bpf2bpf/target_yes_callees:OK #57 /3 fexit_bpf2bpf/func_replace:OK #57 /4 fexit_bpf2bpf/func_replace_verify:OK #57 /5 fexit_bpf2bpf/func_sockmap_update:OK #57 /6 fexit_bpf2bpf/func_replace_return_code:OK #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK #57 /8 fexit_bpf2bpf/func_replace_multi:OK #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #104 modify_return:OK #237 xdp_bpf2bpf:OK bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64. Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result: #9 /1 bpf_cookie/kprobe:OK #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL #9 /4 bpf_cookie/uprobe:OK #9 /5 bpf_cookie/tracepoint:OK #9 /6 bpf_cookie/perf_event:OK #9 /7 bpf_cookie/trampoline:OK #9 /8 bpf_cookie/lsm:OK #9 bpf_cookie:FAIL #18 /1 bpf_tcp_ca/dctcp:OK #18 /2 bpf_tcp_ca/cubic:OK #18 /3 bpf_tcp_ca/invalid_license:OK #18 /4 bpf_tcp_ca/dctcp_fallback:OK #18 /5 bpf_tcp_ca/rel_setsockopt:OK #18 bpf_tcp_ca:OK #51 /1 dummy_st_ops/dummy_st_ops_attach:OK #51 /2 dummy_st_ops/dummy_init_ret_value:OK #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK #51 /4 dummy_st_ops/dummy_multiple_args:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 /1 fexit_bpf2bpf/target_no_callees:OK #57 /2 fexit_bpf2bpf/target_yes_callees:OK #57 /3 fexit_bpf2bpf/func_replace:OK #57 /4 fexit_bpf2bpf/func_replace_verify:OK #57 /5 fexit_bpf2bpf/func_sockmap_update:OK #57 /6 fexit_bpf2bpf/func_replace_return_code:OK #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK #57 /8 fexit_bpf2bpf/func_replace_multi:OK #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #104 modify_return:OK #237 xdp_bpf2bpf:OK bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64. Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result: #9 /1 bpf_cookie/kprobe:OK #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL #9 /4 bpf_cookie/uprobe:OK #9 /5 bpf_cookie/tracepoint:OK #9 /6 bpf_cookie/perf_event:OK #9 /7 bpf_cookie/trampoline:OK #9 /8 bpf_cookie/lsm:OK #9 bpf_cookie:FAIL #18 /1 bpf_tcp_ca/dctcp:OK #18 /2 bpf_tcp_ca/cubic:OK #18 /3 bpf_tcp_ca/invalid_license:OK #18 /4 bpf_tcp_ca/dctcp_fallback:OK #18 /5 bpf_tcp_ca/rel_setsockopt:OK #18 bpf_tcp_ca:OK #51 /1 dummy_st_ops/dummy_st_ops_attach:OK #51 /2 dummy_st_ops/dummy_init_ret_value:OK #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK #51 /4 dummy_st_ops/dummy_multiple_args:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 /1 fexit_bpf2bpf/target_no_callees:OK #57 /2 fexit_bpf2bpf/target_yes_callees:OK #57 /3 fexit_bpf2bpf/func_replace:OK #57 /4 fexit_bpf2bpf/func_replace_verify:OK #57 /5 fexit_bpf2bpf/func_sockmap_update:OK #57 /6 fexit_bpf2bpf/func_replace_return_code:OK #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK #57 /8 fexit_bpf2bpf/func_replace_multi:OK #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #104 modify_return:OK #237 xdp_bpf2bpf:OK bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64. Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result: #9 /1 bpf_cookie/kprobe:OK #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL #9 /4 bpf_cookie/uprobe:OK #9 /5 bpf_cookie/tracepoint:OK #9 /6 bpf_cookie/perf_event:OK #9 /7 bpf_cookie/trampoline:OK #9 /8 bpf_cookie/lsm:OK #9 bpf_cookie:FAIL #18 /1 bpf_tcp_ca/dctcp:OK #18 /2 bpf_tcp_ca/cubic:OK #18 /3 bpf_tcp_ca/invalid_license:OK #18 /4 bpf_tcp_ca/dctcp_fallback:OK #18 /5 bpf_tcp_ca/rel_setsockopt:OK #18 bpf_tcp_ca:OK #51 /1 dummy_st_ops/dummy_st_ops_attach:OK #51 /2 dummy_st_ops/dummy_init_ret_value:OK #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK #51 /4 dummy_st_ops/dummy_multiple_args:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 /1 fexit_bpf2bpf/target_no_callees:OK #57 /2 fexit_bpf2bpf/target_yes_callees:OK #57 /3 fexit_bpf2bpf/func_replace:OK #57 /4 fexit_bpf2bpf/func_replace_verify:OK #57 /5 fexit_bpf2bpf/func_sockmap_update:OK #57 /6 fexit_bpf2bpf/func_replace_return_code:OK #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK #57 /8 fexit_bpf2bpf/func_replace_multi:OK #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #104 modify_return:OK #237 xdp_bpf2bpf:OK bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64. Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
Add bpf trampoline support for arm64. Most of the logic is the same as x86. Tested on raspberry pi 4b and qemu with KASLR disabled (avoid long jump), result: #9 /1 bpf_cookie/kprobe:OK #9 /2 bpf_cookie/multi_kprobe_link_api:FAIL #9 /3 bpf_cookie/multi_kprobe_attach_api:FAIL #9 /4 bpf_cookie/uprobe:OK #9 /5 bpf_cookie/tracepoint:OK #9 /6 bpf_cookie/perf_event:OK #9 /7 bpf_cookie/trampoline:OK #9 /8 bpf_cookie/lsm:OK #9 bpf_cookie:FAIL #18 /1 bpf_tcp_ca/dctcp:OK #18 /2 bpf_tcp_ca/cubic:OK #18 /3 bpf_tcp_ca/invalid_license:OK #18 /4 bpf_tcp_ca/dctcp_fallback:OK #18 /5 bpf_tcp_ca/rel_setsockopt:OK #18 bpf_tcp_ca:OK #51 /1 dummy_st_ops/dummy_st_ops_attach:OK #51 /2 dummy_st_ops/dummy_init_ret_value:OK #51 /3 dummy_st_ops/dummy_init_ptr_arg:OK #51 /4 dummy_st_ops/dummy_multiple_args:OK #51 dummy_st_ops:OK #55 fentry_fexit:OK #56 fentry_test:OK #57 /1 fexit_bpf2bpf/target_no_callees:OK #57 /2 fexit_bpf2bpf/target_yes_callees:OK #57 /3 fexit_bpf2bpf/func_replace:OK #57 /4 fexit_bpf2bpf/func_replace_verify:OK #57 /5 fexit_bpf2bpf/func_sockmap_update:OK #57 /6 fexit_bpf2bpf/func_replace_return_code:OK #57 /7 fexit_bpf2bpf/func_map_prog_compatibility:OK #57 /8 fexit_bpf2bpf/func_replace_multi:OK #57 /9 fexit_bpf2bpf/fmod_ret_freplace:OK #57 fexit_bpf2bpf:OK #58 fexit_sleep:OK #59 fexit_stress:OK #60 fexit_test:OK #67 get_func_args_test:OK #68 get_func_ip_test:OK #104 modify_return:OK #237 xdp_bpf2bpf:OK bpf_cookie/multi_kprobe_link_api and bpf_cookie/multi_kprobe_attach_api failed due to lack of multi_kprobe on arm64. Signed-off-by: Xu Kuohai <[email protected]> Acked-by: Song Liu <[email protected]>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
This bug happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the virtio-net driver is still probing with rtnl_lock() hold, this will cause a recursive mutex in netdev_notify_peers(). Fix it by temporarily save the announce status while probing, and then in virtnet_open(), if it sees a delayed announce work is there, it starts to schedule the virtnet_config_changed_work(). Another possible solution is to directly check whether rtnl_is_locked() and call __netdev_notify_peers(), but in that way means we need to relies on netdev_queue to schedule the arp packets after ndo_open(), which we thought is not very intuitive. We've observed a softlockup with Ubuntu 24.04, and can be reproduced with QEMU sending the announce_self rapidly while booting. [ 494.167473] INFO: task swapper/0:1 blocked for more than 368 seconds. [ 494.167667] Not tainted 6.8.0-57-generic kernel-patches#59-Ubuntu [ 494.167810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 494.168015] task:swapper/0 state:D stack:0 pid:1 tgid:1 ppid:0 flags:0x00004000 [ 494.168260] Call Trace: [ 494.168329] <TASK> [ 494.168389] __schedule+0x27c/0x6b0 [ 494.168495] schedule+0x33/0x110 [ 494.168585] schedule_preempt_disabled+0x15/0x30 [ 494.168709] __mutex_lock.constprop.0+0x42f/0x740 [ 494.168835] __mutex_lock_slowpath+0x13/0x20 [ 494.168949] mutex_lock+0x3c/0x50 [ 494.169039] rtnl_lock+0x15/0x20 [ 494.169128] netdev_notify_peers+0x12/0x30 [ 494.169240] virtnet_config_changed_work+0x152/0x1a0 [ 494.169377] virtnet_probe+0xa48/0xe00 [ 494.169484] ? vp_get+0x4d/0x100 [ 494.169574] virtio_dev_probe+0x1e9/0x310 [ 494.169682] really_probe+0x1c7/0x410 [ 494.169783] __driver_probe_device+0x8c/0x180 [ 494.169901] driver_probe_device+0x24/0xd0 [ 494.170011] __driver_attach+0x10b/0x210 [ 494.170117] ? __pfx___driver_attach+0x10/0x10 [ 494.170237] bus_for_each_dev+0x8d/0xf0 [ 494.170341] driver_attach+0x1e/0x30 [ 494.170440] bus_add_driver+0x14e/0x290 [ 494.170548] driver_register+0x5e/0x130 [ 494.170651] ? __pfx_virtio_net_driver_init+0x10/0x10 [ 494.170788] register_virtio_driver+0x20/0x40 [ 494.170905] virtio_net_driver_init+0x97/0xb0 [ 494.171022] do_one_initcall+0x5e/0x340 [ 494.171128] do_initcalls+0x107/0x230 [ 494.171228] ? __pfx_kernel_init+0x10/0x10 [ 494.171340] kernel_init_freeable+0x134/0x210 [ 494.171462] kernel_init+0x1b/0x200 [ 494.171560] ret_from_fork+0x47/0x70 [ 494.171659] ? __pfx_kernel_init+0x10/0x10 [ 494.171769] ret_from_fork_asm+0x1b/0x30 [ 494.171875] </TASK> Fixes: df28de7 ("virtio-net: synchronize operstate with admin state on up/down") Signed-off-by: Zigit Zo <[email protected]> Signed-off-by: NipaLocal <nipa@local>
Pull request for series with
subject: selftests/bpf: merge most of test_btf into test_progs
version: 2
url: https://patchwork.ozlabs.org/project/netdev/list/?series=201720