Skip to content

iterate fallback causes 5X performance drop #45

@goretkin

Description

@goretkin

Consider a "function" that has two versions. foo3 is just foo2, but it accepts a function in the first argument the same way that e.g. sum does.

function foo2(a)
    r = nothing
    for y in a
        if r === nothing
            r = y
        else
            r += y
        end
    end
    return r
end

function foo3(f, a)
    r = nothing
    for x in a
        y = f(x)
        if r === nothing
            r = y
        else
            r += y
        end
    end
    return r
end

I personally do not like the pattern, and MappedArrays.jl gives me a way of factoring out that function so that I can only write foo2, even if I, say, need to do the computation only on a specific field of each element.

using MappedArrays: mappedarray

const A = [(field=i,) for i = 1:1000]
const f = Base.Fix2(getfield, :field)

foo2(mappedarray(f, A)) == foo3(f, A) || error()

But the performance is not equivalent:

julia> using BenchmarkTools

julia> @benchmark foo2(mappedarray(f, A))

BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     325.878 ns (0.00% GC)
  median time:      409.595 ns (0.00% GC)
  mean time:        430.838 ns (0.00% GC)
  maximum time:     1.315 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     221

julia> @benchmark foo3(f, A)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     63.348 ns (0.00% GC)
  median time:      68.360 ns (0.00% GC)
  mean time:        71.576 ns (0.00% GC)
  maximum time:     266.343 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     972

The fallback on iterate(::AbstractArray, ...) appears to be responsible. Defining

import MappedArrays

function Base.iterate(A::MappedArrays.ReadonlyMappedArray, args...)
    r = iterate(A.data, args...)
    r === nothing && return r
    return (A.f(r[1]), r[2])
end

eliminates the gap!

Defining iterate on MappedArray is straightforward, too. I'm not sure what to do for the "Multi" types.

(originally from https://github.com/goretkin/FactorPerElementFunctionPerformance and https://discourse.julialang.org/t/factor-out-field-access-of-abstractarray-and-or-iterator-interface/41493)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions