LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 49322 - SystemZ stack temporary overflow
Summary: SystemZ stack temporary overflow
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: SystemZ (show other bugs)
Version: trunk
Hardware: PC Linux
: P normal
Assignee: Jonas Paulsson
URL:
Keywords:
Depends on:
Blocks: release-12.0.1 50688
  Show dependency tree
 
Reported: 2021-02-22 14:56 PST by Josh Stone
Modified: 2021-06-21 11:00 PDT (History)
6 users (show)

See Also:
Fixed By Commit(s): 52bbbf4 0193a7da8bda


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Josh Stone 2021-02-22 14:56:40 PST
This bug was reduced from one of the failures in Rust #80810:
https://github.com/rust-lang/rust/issues/80810

When a large integer argument on s390x is converted to indirect, but the type is not a multiple of 64 bits, the writes to the stack are all still in 64-bit chunks and may clobber neighboring values on the stack.


### arg-i96.ll

target datalayout = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64"
target triple = "s390x-unknown-linux-gnu"

declare hidden void @fn1(i96) unnamed_addr

define hidden i32 @fn2() unnamed_addr {
start:
  %0 = alloca i32, align 4
  store i32 -1, i32* %0, align 4
  call void @fn1(i96 0)
  %1 = load i32, i32* %0, align 4
  ret i32 %1
}


### llc -O0

	.text
	.file	"arg-i96.ll"
	.hidden	fn2                             # -- Begin function fn2
	.globl	fn2
	.p2align	4
	.type	fn2,@function
fn2:                                    # @fn2
	.cfi_startproc
# %bb.0:                                # %start
	stmg	%r14, %r15, 112(%r15)
	.cfi_offset %r14, -48
	.cfi_offset %r15, -40
	aghi	%r15, -176
	.cfi_def_cfa_offset 336
	mvhi	172(%r15), -1
	mvghi	168(%r15), 0
	mvghi	160(%r15), 0
	la	%r2, 160(%r15)
	brasl	%r14, fn1@PLT
	l	%r2, 172(%r15)
	lmg	%r14, %r15, 288(%r15)
	br	%r14
.Lfunc_end0:
	.size	fn2, .Lfunc_end0-fn2
	.cfi_endproc
                                        # -- End function
	.hidden	fn1
	.section	".note.GNU-stack","",@progbits


###

In this reproducer, the 32-bit store to %0 -- mvhi 172(%r15), -1 -- is immediately overwritten by the overflowing 64-bit store to the end of %1 -- mvghi 168(%r15), 0.

With --print-after-all, you can also see the 12-byte (96-bit) frame allocation with two 8-byte writes.

# *** IR Dump After Finalize ISel and expand pseudo-instructions ***:
# Machine code for function fn2: IsSSA, TracksLiveness
Frame Objects:
  fi#0: size=4, align=4, at location [SP]
  fi#1: size=12, align=8, at location [SP]

bb.0.start:
  MVHI %stack.0, 0, -1 :: (store 4 into %ir.0)
  ADJCALLSTACKDOWN 0, 0
  MVGHI %stack.1, 8, 0 :: (store 8 into %stack.1)
  MVGHI %stack.1, 0, 0 :: (store 8 into %stack.1)
  %0:gr64bit = LA %stack.1, 0, $noreg
  $r2d = COPY %0:gr64bit
  CallBRASL @fn1, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d $r11d $r12d $r13d $r14d $r15d $r6h $r7h $r8h and 22 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc
  ADJCALLSTACKUP 0, 0
  %1:gr32bit = L %stack.0, 0, $noreg :: (dereferenceable load 4 from %ir.0)
  $r2l = COPY %1:gr32bit
  Return implicit $r2l

# End machine code for function fn2.
Comment 1 Josh Stone 2021-02-22 15:20:11 PST
> In this reproducer, the 32-bit store to %0 -- mvhi 172(%r15), -1 -- is immediately overwritten by the overflowing 64-bit store to the end of %1 -- mvghi 168(%r15), 0.

Sorry, I misspoke -- the latter move is part of the spilled argument for the call, not %1.
Comment 2 Josh Stone 2021-02-24 09:53:10 PST
The 12-byte allocation comes from SystemZTargetLowering::LowerCall here:

    if (VA.getLocInfo() == CCValAssign::Indirect) {
      // Store the argument in a stack slot and pass its address.
      SDValue SpillSlot = DAG.CreateStackTemporary(Outs[I].ArgVT);

I looked around for other targets that do this for Indirect args. X86 does create stack temporaries for !isByVal args, but using VA.getValVT(), which I think means it's using a separate temporary for each piece of the split arg. However, RISCVTargetLowering::LowerCall is pretty much the same as SystemZ:

    // Promote the value if needed.
    // For now, only handle fully promoted and indirect arguments.
    if (VA.getLocInfo() == CCValAssign::Indirect) {
      // Store the argument in a stack slot and pass its address.
      SDValue SpillSlot = DAG.CreateStackTemporary(Outs[i].ArgVT);

RISCV appears fine with the i96 test, as it just passes that in two registers, but it does have a problem if I bump that to i160:
(-mtriple riscv64-unknown-linux-gnu)

        .text
        .attribute      4, 16
        .attribute      5, "rv64i2p0"
        .file   "<stdin>"
        .hidden fn2                             # -- Begin function fn2
        .globl  fn2
        .p2align        2
        .type   fn2,@function
fn2:                                    # @fn2
        .cfi_startproc
# %bb.0:                                # %start
        addi    sp, sp, -32
        .cfi_def_cfa_offset 32
        sd      ra, 24(sp)                      # 8-byte Folded Spill
        .cfi_offset ra, -8
        addi    a0, zero, 1
        slli    a0, a0, 32
        addi    a0, a0, -1
        sw      a0, 20(sp)
        mv      a0, zero
        sd      a0, 16(sp)
        sd      a0, 8(sp)
        sd      a0, 0(sp)
        mv      a0, sp
        call    fn1
        lw      a0, 20(sp)
        ld      ra, 24(sp)                      # 8-byte Folded Reload
        addi    sp, sp, 32
        ret
.Lfunc_end0:
        .size   fn2, .Lfunc_end0-fn2
        .cfi_endproc
                                        # -- End function
        .hidden fn1
        .section        ".note.GNU-stack","",@progbits

So the -1 "sw a0, 20(sp)" is clobbered by the 0 "sd a0, 16(sp)".
Comment 3 Ulrich Weigand 2021-02-25 10:43:41 PST
This does indeed look like a SystemZ back-end bug.

The problem is the way common code handles large integer types: they are split into "register-sized" pieces.  If the size of the type is not a multiple of the register size, then common code implicitly extends the type to the next-larger type that is.

In this case, this means the i96 input type is implicitly extended to i128, which is then split into two i64 pieces.

However, according to the SystemZ ABI integer types larger than a register are actually supposed to be passed via implicit reference, so the back-end attempts to undo the split into pieces done by common code.  The current back-end code for this works correctly only if no such extension has taken place.  This used to be OK since the only type this code path was really ever intended to be used for was the i128 case (which needs to be passed via implicit reference for compatibility with GCC).

So in the end this is now a question of defining and then implementing a proper ABI for these other large integer types.  I guess there's two "obvious" possibilities:

1) Pass by implicit reference to the actual type (i96 in this case)
2) Pass by implicit reference to the extended type (i128 in this case)

Variant 1) might be considered more natural and uses less stack space, but is more difficult and inefficient to implement (on a big-endian system we'd have to load/store parts of the first piece, and then load/store the remaining pieces at "odd" adjusted offsets).

Variant 2) seems straightforward to implement, the only change to existing code would be to allocate a larger stack slot. So I think this is what I'd prefer to do.

Jonas, can you have a try at implementing this?
Comment 4 Jonas Paulsson 2021-02-25 17:25:11 PST
Thanks for the reduced test case!

Suggested patch at https://reviews.llvm.org/D97514.
Comment 5 Jonas Paulsson 2021-03-02 11:06:37 PST
52bbbf4
Comment 6 Josh Stone 2021-03-09 13:56:01 PST
Thank you both! I have confirmed that this does fix the issues seen in Rust #80810. I also opened bug 49500 for the similar RISCV bug.
Comment 7 Tom Stellard 2021-06-15 17:30:10 PDT
Hi Jonas,

What is your opinion on backporting this?

https://reviews.llvm.org/rG52bbbf4d4459239e0f461bc302ada89e2c5d07fc
Comment 8 Jonas Paulsson 2021-06-17 17:22:30 PDT
(In reply to Tom Stellard from comment #7)
> Hi Jonas,
> 
> What is your opinion on backporting this?
> 
> https://reviews.llvm.org/rG52bbbf4d4459239e0f461bc302ada89e2c5d07fc

That seems like a good idea...
Comment 9 Tom Stellard 2021-06-21 11:00:59 PDT
Merged: 0193a7da8bda