cmd/link: Incorrect symbol linked in darwin/arm64 #58935

@sywhang

Hello,

I'm from the Go Platform team at Uber, and we've been running into what appears to be a linker bug in macOS/M1 while trying to upgrade to Go 1.20.

What version of Go are you using (go version)?

This reproduces on all Go 1.20 point releases, including 1.20.2.

$ go version
1.20.2

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

The issue only manifests on M1 Macs.

We are in a Bazel sandbox environment, using rules_go.

go env Output
$ go env
GO111MODULE="off"
GOARCH="arm64"
GOBIN="/Users/sungyoon/go/bin/"
GOCACHE="/Users/sungyoon/Library/Caches/go-build"
GOENV="/Users/sungyoon/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/sungyoon/go-code/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/sungyoon/go-code"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/private/var/tmp/_bazel_sungyoon/ac08da491286b6bc27a8f65f3e5696d3/external/go_sdk"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/private/var/tmp/_bazel_sungyoon/ac08da491286b6bc27a8f65f3e5696d3/external/go_sdk/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.20.2"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/zl/5_l54cc5231fssrrq3thj6z40000gn/T/go-build2387064609=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

We received reports of tests in our Go monorepo that fail only on M1 Macs after upgrading to Go 1.20.

The panic trace depends on the failing targets, but all of them panic in some form during init.

Invalid return address:

runtime: g 15: unexpected return pc for runtime.callers called from 0x0
stack: frame={sp:0x14001dd3e80, fp:0x14001dd3ef0} stack=[0x14001dd2000,0x14001dd4000)
0x0000014001dd3d80:  0x00000140004216f0  0x0000000104d6f903 <testing.tRunner+0x0000000000000033>
0x0000014001dd3d90:  0x00000001162e2560  0x00000001121c655b
0x0000014001dd3da0:  0x000000000000000f  0x000000011485543f
0x0000014001dd3db0:  0x000000000000001d  0x000000000000059a
0x0000014001dd3dc0:  0x0000000000000599  0x0000000104d6f8d0 <testing.tRunner+0x0000000000000000>
0x0000014001dd3dd0:  0x00000001162e2560  0x00000001183881a0
0x0000014001dd3de0:  0x0000000104d6f903 <testing.tRunner+0x0000000000000033>  0x00000001162e2560
0x0000014001dd3df0:  0x00000001121c655b  0x000000000000000f
0x0000014001dd3e00:  0x0000000000000000  0x000000010d318ef5
0x0000014001dd3e10:  0x000000000000002a  0x000000010d2d9395
0x0000014001dd3e20:  0x0000000000000026  0x000000010d44460a
0x0000014001dd3e30:  0x0000000000000036  0x000000010d22f3ca
0x0000014001dd3e40:  0x000000000000001b  0x000000010d3d07d4
0x0000014001dd3e50:  0x0000000000000032  0x000000010d2d936f
0x0000014001dd3e60:  0x0000000000000026  0x000000010d2906eb
0x0000014001dd3e70:  0x0000000000000021  0x0000000000000000
runtime.callers(0x0?, {0x0?, 0x0?, 0x0?})
    GOROOT/src/runtime/traceback.go:843 +0x78 fp=0x14001dd3ef0 sp=0x14001dd3e80 pc=0x104c09988
created by testing.(*T).Run
    GOROOT/src/testing/testing.go:1629 +0x36c

Panics from callee:
case 1:

goroutine 1 [running]:
reflect.Value.lenNonSlice({0x10c2a5b30?, 0x140091352c0?, 0x140091c3080?})
        GOROOT/src/reflect/value.go:1714 +0x160
github.com/go-playground/validator/v10.init.0()
        external/com_github_go_playground_validator_v10/postcode_regexes.go:170 +0x38

case 2:

panic: reflect: call of reflect.Value.Uint on kind31 Value

goroutine 1 [running]:
reflect.Value.Uint({0x0?, 0x0?, 0x111bd4680?})
        GOROOT/src/reflect/value.go:2666 +0xfc
github.com/shopspring/decimal.newFromFloat(0x3de5d8fd1fd19ccd, 0x3de5d8fd1fd19ccd, 0x1160395a0)
        external/com_github_shopspring_decimal/decimal.go:302 +0x154
github.com/shopspring/decimal.NewFromFloat(0x3de5d8fd1fd19ccd)
        external/com_github_shopspring_decimal/decimal.go:262 +0x7c
github.com/shopspring/decimal.init()
        external/com_github_shopspring_decimal/decimal.go:1735 +0x23c

Looking through the disassembly, we see calls to runtime.duffzero being linked to arbitrary, unrelated functions in the problematic targets. If the function that is actually called panics on its invalid arguments, we get panics like the ones above. In other cases the called function expects a different stack frame size from the one the caller set up, and the program panics with an invalid return pc.

Below is part of the disassembled init function of one of our monorepo dependencies (github.com/shopspring/decimal):

  0x5ba0e        a93eeffd        STP (R29, R27), -24(RSP)                
  0x5ba12        d10063fd        SUB $24, RSP, R29                    
  0x5ba16        94000000        CALL 0(PC)                        [0:4]R_CALLARM64:runtime.duffzero<1>+52    
  0x5ba1a        d10023fd        SUB $8, RSP, R29                    
  0x5ba1e        f901dfff        MOVD ZR, 952(RSP)                    
  0x5ba22        f94023e1        MOVD 64(RSP), R1                    
  0x5ba26        910223e0        ADD $136, RSP, R0                    
  0x5ba2a        94000000        CALL 0(PC)                        [0:4]R_CALLARM64:github.com/shopspring/decimal.(*decimal).Assign    
  0x5ba2e        f94247e2        MOVD 1160(RSP), R2                    

This is what we see in the intermediate archive file generated by the compiler, before it is linked.
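For readers following the relocation above, here is a minimal sketch of how a call relocation of the form `symbol+addend` (such as `R_CALLARM64:runtime.duffzero<1>+52`) conceptually turns into a BL instruction: the target is the symbol address plus the addend, and BL stores the distance from the call site as a signed 26-bit word offset. The helper names and the addresses in `main` are hypothetical, and this illustrates the instruction encoding only, not cmd/link's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// encodeBL returns the 32-bit ARM64 BL instruction that, placed at pc,
// branches to target. BL holds a signed 26-bit word offset, so the byte
// displacement must be 4-byte aligned and within +/-128 MiB.
func encodeBL(pc, target uint64) (uint32, error) {
	disp := int64(target) - int64(pc)
	if disp%4 != 0 {
		return 0, errors.New("displacement is not 4-byte aligned")
	}
	if disp < -(1<<27) || disp >= 1<<27 {
		return 0, errors.New("target out of BL range; a trampoline would be needed")
	}
	imm26 := uint32(disp>>2) & 0x03ffffff
	return 0x94000000 | imm26, nil
}

// resolveCall is a hypothetical helper modelling "symbol + addend":
// the call must land addend bytes past the start of the symbol.
func resolveCall(pc, symAddr uint64, addend int64) (uint32, error) {
	return encodeBL(pc, uint64(int64(symAddr)+addend))
}

func main() {
	// Hypothetical addresses, chosen only for illustration.
	const callSite = 0x100010000
	const duffzero = 0x100001000

	insn, err := resolveCall(callSite, duffzero, 52)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("BL encoding: %#08x\n", insn)
}
```

The unlinked instruction word `94000000` in the listing is this encoding with a zero offset; the linker is expected to fill in the offset once the final addresses are known.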

But in the final binary, the linker has somehow resolved the call to runtime.duffzero to reflect.Value.Float.island in the same init function:

  0x107c89d38        a93eeffd        STP (R29, R27), -24(RSP)                    
  0x107c89d3c        d10063fd        SUB $24, RSP, R29                        
  0x107c89d40        97fded21        CALL reflect.Value.Float.island(SB)                 // ???
  0x107c89d44        d10023fd        SUB $8, RSP, R29                        
  0x107c89d48        f901dfff        MOVD ZR, 952(RSP)                        
  0x107c89d4c        f94023e1        MOVD 64(RSP), R1                        
  0x107c89d50        910223e0        ADD $136, RSP, R0                        
  0x107c89d54        9400022f        CALL github.com/shopspring/decimal.(*decimal).Assign(SB)    
  0x107c89d58        f94247e2        MOVD 1160(RSP), R2                        

A similar issue occurs with runtime.duffcopy in another target:

Pre-linking:

  0xe6364        90000014        ADRP 0(PC), R20                [0:8]R_ADDRARM64:gopkg.in/yaml%2ev3..stmp_41<1>    
  0xe6368        91000294        ADD $0, R20, R20            
  0xe636c        1000009b        ADR 16(PC), R27                
  0xe6370        a93eeffd        STP (R29, R27), -24(RSP)        
  0xe6374        d10063fd        SUB $24, RSP, R29            
  0xe6378        94000000        CALL 0(PC)                [0:4]R_CALLARM64:runtime.duffcopy<1>+288    
  0xe637c        d10023fd        SUB $8, RSP, R29            
  0xe6380        910c63e0        ADD $792, RSP, R0            
  0xe6384        f900bfe0        MOVD R0, 376(RSP)            
  0xe6388        3980001b        MOVB (R0), R27                
  0xe638c        90000000        ADRP 0(PC), R0                [0:8]R_ADDRARM64:type:bool    

Post-linking:

  0x107cb2884        d004f174        ADRP 165863424(PC), R20                
  0x107cb2888        91304294        ADD $3088, R20, R20                
  0x107cb288c        1000009b        ADR 16(PC), R27                    
  0x107cb2890        a93eeffd        STP (R29, R27), -24(RSP)            
  0x107cb2894        d10063fd        SUB $24, RSP, R29                
  0x107cb2898        97fd49a8        CALL reflect.New.island(SB)              // ???
  0x107cb289c        d10023fd        SUB $8, RSP, R29                
  0x107cb28a0        910c63e0        ADD $792, RSP, R0                
  0x107cb28a4        f900bfe0        MOVD R0, 376(RSP)                
  0x107cb28a8        3980001b        MOVB (R0), R27                    
  0x107cb28ac        90043da0        ADRP 142295040(PC), R0   
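To check where the linked CALL instructions above actually branch, the BL words can be decoded by hand. The sketch below does that for the instruction words and addresses taken from the post-linking listings; `decodeBL` is a hypothetical helper written for this illustration, not part of any Go tool.

```go
package main

import "fmt"

// decodeBL reports the byte displacement encoded in an ARM64 BL
// instruction word. ok is false if the word is not a BL.
func decodeBL(insn uint32) (disp int64, ok bool) {
	if insn>>26 != 0b100101 { // top six bits identify BL
		return 0, false
	}
	imm26 := int32(insn<<6) >> 6 // sign-extend the low 26 bits
	return int64(imm26) * 4, true
}

func main() {
	// Addresses and instruction words copied from the listings above.
	calls := []struct {
		pc   uint64
		insn uint32
	}{
		{0x107c89d40, 0x97fded21}, // CALL reflect.Value.Float.island
		{0x107c89d54, 0x9400022f}, // CALL (*decimal).Assign
		{0x107cb2898, 0x97fd49a8}, // CALL reflect.New.island
	}
	for _, c := range calls {
		disp, ok := decodeBL(c.insn)
		if !ok {
			fmt.Printf("%#x: not a BL instruction\n", c.pc)
			continue
		}
		target := uint64(int64(c.pc) + disp)
		fmt.Printf("%#x: BL %+d bytes -> %#x\n", c.pc, disp, target)
	}
}
```

Both `.island` calls decode to backward branches of well under 1 MiB. The decoding only shows where each instruction lands; it does not explain why the linker picked that target.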

This issue does not occur in every binary that uses these dependencies, only in some of them. It is also worth noting that when we change the binary layout by disabling inlining or all optimizations with the -N/-l gcflags, the issue goes away for the affected targets, but it then starts happening in other targets that were passing with optimizations enabled.

This issue does not occur in any of our other environments (Linux amd64 or darwin amd64).

What did you expect to see?

The linker resolves each call to the correct symbol and produces working binaries.

What did you see instead?

Panics as described above.

Labels: FrozenDueToAge; NeedsFix (The path to resolution is known, but the work has not been done.); compiler/runtime (Issues related to the Go compiler and/or runtime.)