Closed
Description
On Julia 1.8.1 I noticed the following:
    using LinearAlgebra  # provides mul! and BLAS

    function manymul(N, C, A, B, alpha, beta)
        for i in 1:N
            mul!(C, A, B, alpha, beta)
            #BLAS.gemm!('N', 'N', alpha, A, B, beta, C) # eliminates allocations
            C, A = A, C  # swap the roles of C and A for the next iteration
        end
        C
    end

    D = 16
    A = randn(D, D)
    B = randn(D, D)
    C = zero(A)
    N = 100000
    @time manymul(N, C, A, B, 1.0, 0.5) # allocates N times (32 bytes each) with `mul!()`, 0 times with `gemm!()`
Cthulhu suggests this is due to runtime dispatch related to `MulAdd()`. This can impact performance of e.g. ODE solving involving `mul!()` for small matrix sizes. The example above takes around 10% longer with `mul!()` vs. `gemm!()`, according to BenchmarkTools (single-threaded BLAS).
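To make the allocation difference easier to check in isolation, here is a minimal sketch that counts bytes allocated by repeated calls after a warm-up, so compilation is excluded. The helper names `step_mul!`, `step_gemm!` and `alloc_bytes` are hypothetical and only for illustration, and the swap of `C` and `A` from the example above is left out. The numbers will likely depend on the Julia version, so no expected output is given.

```julia
using LinearAlgebra

# Hypothetical helpers (not part of LinearAlgebra): one in-place update
# C = alpha*A*B + beta*C via the generic and the direct BLAS entry point.
step_mul!(C, A, B, alpha, beta) = mul!(C, A, B, alpha, beta)
step_gemm!(C, A, B, alpha, beta) = BLAS.gemm!('N', 'N', alpha, A, B, beta, C)

# Bytes allocated by N calls of `step!`, measured after one warm-up call.
function alloc_bytes(step!, N, C, A, B, alpha, beta)
    step!(C, A, B, alpha, beta)  # warm-up so compilation is not counted
    return @allocated for _ in 1:N
        step!(C, A, B, alpha, beta)
    end
end

D = 16
A = randn(D, D)
B = randn(D, D)
C1 = zeros(D, D)
C2 = zeros(D, D)

bytes_mul = alloc_bytes(step_mul!, 1000, C1, A, B, 1.0, 0.5)
bytes_gemm = alloc_bytes(step_gemm!, 1000, C2, A, B, 1.0, 0.5)

println("mul!:  ", bytes_mul, " bytes")
println("gemm!: ", bytes_gemm, " bytes")
```

Both paths apply the same sequence of updates, so `C1 ≈ C2` should hold afterwards; only the allocation counts are expected to differ on affected versions.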
Is this known/intended?
My `versioninfo()`:
    Julia Version 1.8.1
    Commit afb6c60d69a (2022-09-06 15:09 UTC)
    Platform Info:
      OS: macOS (x86_64-apple-darwin21.4.0)
      CPU: 12 × Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
      WORD_SIZE: 64
      LIBM: libopenlibm
      LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
      Threads: 1 on 6 virtual cores
    Environment:
      JULIA_EDITOR = code
      JULIA_NUM_THREADS =
      JULIA_PKG_USE_CLI_GIT = true