xref: https://bugzilla.redhat.com/show_bug.cgi?id=1883457 xref: https://github.com/rust-lang/rust/issues/77382 In one particular Rust test case, we're seeing code miss a branch that should be taken, and in my limited knowledge of Z, it looks like a clobbered condition. The IR is too big for bugzilla, and I wasn't able to reduce it, so I uploaded it here: https://jistone.fedorapeople.org/bz1883457/libcoreinst-bc0e880cd40833af.libcoreinst.354hlw07-cgu.0.rcgu.ll.xz The function in question is _ZN11libcoreinst8download5tests22test_write_image_limit17hdd3aea2f720151ebE. In particular, this block: bb60.i: ; preds = %bb59.i call void @llvm.lifetime.end.p0i8(i64 48, i8* nonnull %444), !dbg !182775, !noalias !181924 call void @llvm.lifetime.end.p0i8(i64 16, i8* nonnull %370), !dbg !182775, !noalias !181924 %452 = bitcast %"std::option::Option<std::vec::Vec<u8>>"* %signature.i to {}**, !dbg !182787 store {}* null, {}** %452, align 8, !dbg !182787, !noalias !181924 call void @llvm.dbg.value(metadata %"std::io::Error"* %err4.i, metadata !6256, metadata !DIExpression()), !dbg !182788 call void @llvm.dbg.value(metadata %"std::io::Error"* %err4.i, metadata !6247, metadata !DIExpression()), !dbg !182790 %453 = load i8, i8* %441, align 8, !dbg !182792, !range !20068, !noalias !181924 %switch.i.i.i251 = icmp ult i8 %453, 2, !dbg !182792 br i1 %switch.i.i.i251, label %bb61.i, label %bb2.i.i.i252, !dbg !182792 So that's basically a store null, then branch less than 2. Here's "llc -O0", which looks ok: .LBB387_319: # %bb60.i #DEBUG_VALUE: out <- [DW_OP_plus_uconst 4732] [$r15d+0] #DEBUG_VALUE: sources:self <- [DW_OP_plus_uconst 9056, DW_OP_stack_value] $r15d #DEBUG_VALUE: precious <- [DW_OP_LLVM_fragment 64 64] 11 #DEBUG_VALUE: offset <- 4194304 .loc 64 134 17 # src/source.rs:134:17 lay %r1, 8192(%r15) mvghi 2488(%r1), 0 .Ltmp56831: #DEBUG_VALUE: drop_in_place<std::io::error::Error>: <- [DW_OP_plus_uconst 10800, DW_OP_stack_value] $r15d #DEBUG_VALUE: drop_in_place<std::io::error::Repr>: <- [DW_OP_plus_uconst 10800, DW_OP_stack_value] $r15d .loc 15 175 1 # /rustc/beb5ae474d2835962ebdf7416bd1c9ad864fe101/library/core/src/ptr/mod.rs:175:1 lg %r1, 4000(%r15) # 8-byte Folded Reload llc %r0, 0(%r1) chi %r0, 2 jl .LBB387_325 j .LBB387_320 Here's "llc -O1": # %bb.193: # %bb60.i #DEBUG_VALUE: sources:self <- [DW_OP_plus_uconst 616, DW_OP_stack_value] $r15d #DEBUG_VALUE: offset <- 4194304 #DEBUG_VALUE: precious <- [DW_OP_LLVM_fragment 64 64] 11 .loc 9 175 1 # /rustc/beb5ae474d2835962ebdf7416bd1c9ad864fe101/library/core/src/ptr/mod.rs:175:1 cli 712(%r15), 2 .Ltmp39998: .loc 64 134 17 # src/source.rs:134:17 llilh %r1, 16 agr %r1, %r15 mvghi 912(%r1), 0 .Ltmp39999: #DEBUG_VALUE: drop_in_place<std::io::error::Repr>: <- [DW_OP_plus_uconst 712, DW_OP_stack_value] $r15d #DEBUG_VALUE: drop_in_place<std::io::error::Error>: <- [DW_OP_plus_uconst 712, DW_OP_stack_value] $r15d .loc 9 175 1 # /rustc/beb5ae474d2835962ebdf7416bd1c9ad864fe101/library/core/src/ptr/mod.rs:175:1 jl .LBB387_201 It looks like CLI will set the CC we want for the jump, but AGR also sets CC, so then JL has the wrong condition.
Created attachment 24026 [details] Reproducer after llvm-extract
Created attachment 24027 [details] Reproducer after llvm-reduce
I was trying to reduce with bugpoint before, but I just learned how to use llvm-extract and llvm-reduce instead, with much more success. Those are attached. (It still left a lot of empty blocks, but I didn't try to clean that up.)
Stepping through IR dumps, here's where it added AGR that redefines $cc: # *** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***: [...] bb.2.bb59.i: ; predecessors: %bb.0 successors: %bb.4(0x40000000), %bb.3(0x40000000); %bb.4(50.00%), %bb.3(50.00%) CLI %stack.2.err4.i, 0, 2, implicit-def $cc :: (dereferenceable load 1 from %ir.4, align 8, !noalias !7, !range !11) MVGHI %stack.0._42.i, 0, 0 :: (store 8 into %ir.5) BRC 14, 4, %bb.4, implicit killed $cc J %bb.3 # *** IR Dump After Prologue/Epilogue Insertion & Frame Finalization ***: [...] bb.2.bb59.i: ; predecessors: %bb.0 successors: %bb.4(0x40000000), %bb.3(0x40000000); %bb.4(50.00%), %bb.3(50.00%) CLI $r15d, 176, 2, implicit-def $cc :: (dereferenceable load 1 from %ir.4, align 8, !noalias !7, !range !11) $r1d = LLILH 16 $r1d = AGR killed $r1d(tied-def 0), $r15d, implicit-def $cc MVGHI killed $r1d, 192, 0 :: (store 8 into %ir.5) BRC 14, 4, %bb.4, implicit killed $cc J %bb.3
This looks like a bug in SystemZRegisterInfo::eliminateFrameIndex. This routine should never introduce a CC clobber, but there is one code path that will emit an AGR: // Load the anchor address into a scratch register. unsigned LAOpcode = TII->getOpcodeForOffset(SystemZ::LA, HighOffset); if (LAOpcode) BuildMI(MBB, MI, DL, TII->get(LAOpcode),ScratchReg) .addReg(BasePtr).addImm(HighOffset).addReg(0); else { // Load the high offset into the scratch register and use it as // an index. TII->loadImmediate(MBB, MI, ScratchReg, HighOffset); BuildMI(MBB, MI, DL, TII->get(SystemZ::AGR),ScratchReg) .addReg(ScratchReg, RegState::Kill).addReg(BasePtr); } This really ought to use LA in the second branch as well. This only triggers when the offset is larger than 512K, which is probably rare and explains why we haven't really seen this before. Jonas, can you have a look? Should be straightforward to fix.
Created attachment 24033 [details] reduced testcase Managed to reduce the test case a bit further... Patch proposed at https://reviews.llvm.org/D89034.
d851495