I've recently been building up some cache arrays with OffsetArrays and realized that the performance bottleneck is getindex(::OffsetArray, I).
The benchmark results look interesting; I'm unsure why arr_sum runs faster on the OffsetArray 🤔 Any ideas?
using OffsetArrays, BenchmarkTools
X = rand(4, 4, 4, 4, 4, 4);
XO = OffsetArray(X, -1, -2, -3, 1, 2, 3);
function arr_sum(X)
val = zero(eltype(X))
R = CartesianIndices(X)
for i in R
@inbounds val += X[i]
end
val
end
@btime arr_sum($X) # 5.215 μs (0 allocations: 0 bytes)
@btime arr_sum($XO) # 3.730 μs (0 allocations: 0 bytes)
@btime getindex($X, 1, 1, 1, 1, 1, 1) # 1.983 ns (0 allocations: 0 bytes)
@btime getindex($XO, 3, 2, 1, 2, 3, 4) # 5.855 ns (0 allocations: 0 bytes)
getindex_inbounds(X, inds...) = @inbounds X[inds...]
@btime getindex_inbounds($X, 1, 1, 1, 1, 1, 1) # 1.430 ns (0 allocations: 0 bytes)
@btime getindex_inbounds($XO, 3, 2, 1, 2, 3, 4) # 2.323 ns (0 allocations: 0 bytes)
The default checkbounds implementation definitely takes too long here. I believe the additional time is spent on constructing the IdOffsetRange axes and on their generic, and therefore slower, getindex.
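One way to check this is to time the bounds check on its own; a minimal sketch assuming the same X and XO as above (timings will vary by machine):
# Bounds check in isolation, comparing the plain Array with the OffsetArray:
@btime checkbounds($X, 1, 1, 1, 1, 1, 1);
@btime checkbounds($XO, 3, 2, 1, 2, 3, 4);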
julia> @btime axes($X);
1.431 ns (0 allocations: 0 bytes)
julia> @btime axes($XO);
4.763 ns (0 allocations: 0 bytes)
These might be benchmark artifacts, though.
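To separate the two suspected costs, one could also micro-benchmark the axis construction and the range getindex directly; a rough sketch, assuming the IdOffsetRange(parent_range, offset) constructor:
r = OffsetArrays.IdOffsetRange(1:4, -1)            # offset axis with indices 0:3
@btime getindex($r, 2);                            # generic range getindex
@btime OffsetArrays.IdOffsetRange($(1:4), $(-1));  # per-call axis construction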