The AbstractSciMLOperator
Interface
SciMLOperators.AbstractSciMLOperator
— Typeabstract type AbstractSciMLOperator{T}
Subtypes of AbstractSciMLOperator
represent linear, nonlinear, time-dependent operators acting on vectors, or matrix column-vectors. A lazy operator algebra is also defined for AbstractSciMLOperator
s.
Mathematical Notation
An AbstractSciMLOperator
$L$ is an operator which is used to represent the following type of equation:
\[w = L(u,p,t)[v]\]
where L[v]
is the operator application of $L$ on the vector $v$.
Interface
An AbstractSciMLOperator
can be called like a function in the following ways:
L(v, u, p, t)
- Out-of-place application wherev
is the action vector andu
is the update vectorL(w, v, u, p, t)
- In-place application wherew
is the destination,v
is the action vector, andu
is the update vectorL(w, v, u, p, t, α, β)
- In-place application with scaling:w = α*(L*v) + β*w
Operator state can be updated separately from application:
update_coefficients!(L, u, p, t)
for in-place operator updateL = update_coefficients(L, u, p, t)
for out-of-place operator update
SciMLOperators also overloads Base.*
, LinearAlgebra.mul!
, LinearAlgebra.ldiv!
for operator evaluation without updating operator state. An AbstractSciMLOperator
behaves like a matrix in these methods. Allocation-free methods, suffixed with a !
often need cache arrays. To precache an AbstractSciMLOperator
, call the function L = cache_operator(L, input_vector)
.
Overloaded Actions
The behavior of a SciMLOperator
is indistinguishable from an AbstractMatrix
. These operators can be passed to linear solver packages, and even to ordinary differential equation solvers. The list of overloads to the AbstractMatrix
interface includes, but is not limited to, the following:
Base: size, zero, one, +, -, *, /, \, ∘, inv, adjoint, transpose, convert
LinearAlgebra: mul!, ldiv!, lmul!, rmul!, factorize, issymmetric, ishermitian, isposdef
SparseArrays: sparse, issparse
Multidimensional arrays and batching
SciMLOperator can also be applied to AbstractMatrix
subtypes where operator-evaluation is done column-wise.
K = 10
u_mat = rand(N, K)
v_mat = F(u_mat, p, t) # == mul!(v_mat, F, u_mat)
size(v_mat) == (N, K) # true
L
can also be applied to AbstractArray
s that are not AbstractVecOrMat
s so long as their size in the first dimension is appropriate for matrix-multiplication. Internally, SciMLOperator
s reshapes an N
-dimensional array to an AbstractMatrix
, and applies the operator via matrix-multiplication.
Operator update
This package can also be used to write state-dependent, time-dependent, and parameter-dependent operators, whose state can be updated per a user-defined function. The updates can be done in-place, i.e. by mutating the object, or out-of-place, i.e. in a non-mutating, Zygote
-compatible way.
For example,
u = rand(N)
p = rand(N)
t = rand()
# out-of-place update
mat_update_func = (A, u, p, t) -> t * (p * u')
sca_update_func = (a, u, p, t) -> t * sum(p)
M = MatrixOperator(zero(N, N); update_func = mat_update_func)
α = ScalarOperator(zero(Float64); update_func = sca_update_func)
L = α * M
L = cache_operator(L, v)
# L is initialized with zero state
L * v == zeros(N) # true
# update operator state with `(u, p, t)`
L = update_coefficients(L, u, p, t)
# and multiply
L * v != zeros(N) # true
# updates state and evaluates L*v at (u, p, t)
L(v, u, p, t) != zeros(N) # true
The out-of-place evaluation function L(v, u, p, t)
calls update_coefficients
under the hood, which recursively calls the update_func
for each component SciMLOperator
. Therefore, the out-of-place evaluation function is equivalent to calling update_coefficients
followed by Base.*
. Notice that the out-of-place evaluation does not return the updated operator.
On the other hand, the in-place evaluation function, L(w, v, u, p, t)
, mutates L
, and is equivalent to calling update_coefficients!
followed by mul!
. The in-place update behavior works the same way, with a few <!>
s appended here and there. For example,
w = rand(N)
v = rand(N)
u = rand(N)
p = rand(N)
t = rand()
# in-place update
_A = rand(N, N)
_d = rand(N)
mat_update_func! = (A, u, p, t) -> (copy!(A, _A); lmul!(t, A); nothing)
diag_update_func! = (diag, u, p, t) -> copy!(diag, N)
M = MatrixOperator(zero(N, N); update_func! = mat_update_func!)
D = DiagonalOperator(zero(N); update_func! = diag_update_func!)
L = D * M
L = cache_operator(L, v)
# L is initialized with zero state
L * v == zeros(N) # true
# update L in-place
update_coefficients!(L, v, p, t)
# and multiply
mul!(w, v, u, p, t) != zero(N) # true
# updates L in-place, and evaluates w=L*v at (u, p, t)
L(w, v, u, p, t) != zero(N) # true
The update behavior makes this package flexible enough to be used in OrdinaryDiffEq
. As the parameter object p
is often reserved for sensitivity computation via automatic-differentiation, a user may prefer to pass in state information via other arguments. For that reason, we allow update functions with arbitrary keyword arguments.
mat_update_func = (A, u, p, t; scale = 0.0) -> scale * (p * u')
M = MatrixOperator(zero(N, N); update_func = mat_update_func,
accepted_kwargs = (:state,))
M(v, u, p, t) == zeros(N) # true
M(v, u, p, t; scale = 1.0) != zero(N)
Interface API Reference
SciMLOperators.update_coefficients
— Functionupdate_coefficients(L, u, p, t; kwargs...)
Update the state of L
based on u
, input vector, p
parameter object, t
, and keyword arguments. Internally, update_coefficients
calls the user-provided update_func
method for every component operator in L
with the positional arguments (u, p, t)
and keyword arguments corresponding to the symbols provided to the operator via kwarg accepted_kwargs
.
This method is out-of-place, i.e. fully non-mutating and Zygote
-compatible.
The user-provided update_func[!]
must not use u
in its computation. Positional argument (u, p, t)
to update_func[!]
are passed down by update_coefficients[!](L, u, p, t)
, where u
is the input-vector to the composite AbstractSciMLOperator
. For that reason, the values of u
, or even shape, may not correspond to the input expected by update_func[!]
. If an operator's state depends on its input vector, then it is, by definition, a nonlinear operator. We recommend sticking such nonlinearities in FunctionOperator.
This topic is further discussed in (this issue)[https://github.com/SciML/SciMLOperators.jl/issues/159].
Example
using SciMLOperators
mat_update_func = (A, u, p, t; scale = 1.0) -> p * p' * scale * t
M = MatrixOperator(zeros(4,4); update_func = mat_update_func,
accepted_kwargs = (:state,))
L = M + IdentityOperator(4)
u = rand(4)
p = rand(4)
t = 1.0
# Update the operator to `(u,p,t)` and apply it to `v`
L = update_coefficients(L, u, p, t; scale = 2.0)
result = L * v
# Or use the interface which separrates the update from the application
result = L(v, u, p, t; scale = 2.0)
SciMLOperators.update_coefficients!
— Functionupdate_coefficients!(L, u, p, t; kwargs...)
Update in-place the state of L
based on u
, input vector, p
parameter object, t
, and keyword arguments. Internally, update_coefficients!
calls the user-provided mutating update_func!
method for every component operator in L
with the positional arguments (u, p, t)
and keyword arguments corresponding to the symbols provided to the operator via kwarg accepted_kwargs
.
The user-provided update_func[!]
must not use u
in its computation. Positional argument (u, p, t)
to update_func[!]
are passed down by update_coefficients[!](L, u, p, t)
, where u
is the input-vector to the composite AbstractSciMLOperator
. For that reason, the values of u
, or even shape, may not correspond to the input expected by update_func[!]
. If an operator's state depends on its input vector, then it is, by definition, a nonlinear operator. We recommend sticking such nonlinearities in FunctionOperator.
This topic is further discussed in (this issue)[https://github.com/SciML/SciMLOperators.jl/issues/159].
Example
using SciMLOperators
_A = rand(4, 4)
mat_update_func! = (L, u, p, t; scale = 1.0) -> copy!(A, _A)
M = MatrixOperator(zeros(4,4); update_func! = mat_update_func!)
L = M + IdentityOperator(4)
u = rand(4)
p = rand(4)
t = 1.0
update_coefficients!(L, u, p, t)
L * v
SciMLOperators.cache_operator
— Functioncache_operator(L, u)
Allocate caches for L
for in-place evaluation with u
-like input vectors.
SciMLOperators.concretize
— Functionconcretize(L) -> AbstractMatrix
concretize(L) -> Number
Convert SciMLOperator
to a concrete type via eager fusion. This method is a no-op for types that are already concrete.
Traits
SciMLOperators.isconstant
— Functionisconstant(_)
Checks if an L
's state is constant or needs to be updated by calling update_coefficients
.
SciMLOperators.iscached
— Functioniscached(L)
Checks whether L
has preallocated caches for inplace evaluations.
Check if SciMLOperator
L
has preallocated cache-arrays for in-place computation.
SciMLOperators.issquare
— FunctionChecks if size(L, 1) == size(L, 2)
.
SciMLOperators.islinear
— Functionislinear(_)
Checks if L
is a linear operator.
SciMLOperators.isconvertible
— Functionisconvertible(L) -> Bool
Checks if L
can be cheaply converted to an AbstractMatrix
via eager fusion.
SciMLOperators.has_adjoint
— Functionhas_adjoint(L)
Check if adjoint(L)
is lazily defined.
SciMLOperators.has_expmv
— Functionhas_expmv(L)
Check if expmv(L, v, t)
, equivalent to exp(t * A) * v
, is defined for Number
t
, and AbstractArray
u
of appropriate size.
SciMLOperators.has_expmv!
— Functionhas_expmv!(L)
Check if expmv!(w, L, v, t)
, equivalent to mul!(w, exp(t * A), v)
, is defined for Number
t
, and AbstractArray
s w, v
of appropriate sizes.
SciMLOperators.has_exp
— Functionhas_exp(L)
Check if exp(L)
is defined lazily defined.
SciMLOperators.has_mul
— Functionhas_mul(L)
Check if L * v
is defined for AbstractArray
u
of appropriate size.
SciMLOperators.has_mul!
— Functionhas_mul!(L)
Check if mul!(w, L, v)
is defined for AbstractArray
s w, v
of appropriate sizes.
SciMLOperators.has_ldiv
— Functionhas_ldiv(L)
Check if L \ v
is defined for AbstractArray
v
of appropriate size.
SciMLOperators.has_ldiv!
— Functionhas_ldiv!(L)
Check if ldiv!(w, L, v)
is defined for AbstractArray
s w, v
of appropriate sizes.
Note About Affine Operators
Affine operators are operators that have the action Q*x = A*x + b
. These operators have no matrix representation, since if there was, it would be a linear operator instead of an affine operator. You can only represent an affine operator as a linear operator in a dimension of one larger via the operation: [A b] * [u;1]
, so it would require something modified to the input as well. As such, affine operators are a distinct generalization of linear operators.
While it seems like it might doom the idea of using matrix-free affine operators, it turns out that affine operators can be used in all cases where matrix-free linear solvers are used due to an easy generalization of the standard convergence proofs. If Q is the affine operator $Q(x) = Ax + b$, then solving $Qx = c$ is equivalent to solving $Ax + b = c$ or $Ax = c-b$. If you now do this same “plug-and-chug” handling of the affine operator into the GMRES/CG/etc. convergence proofs, move the affine part to the rhs residual, and show it converges to solving $Ax = c-b$, and thus GMRES/CG/etc. solves $Q(x) = c$ for an affine operator properly.
That same trick can be used mostly anywhere you would've had a linear operator to extend the proof to affine operators, so then $exp(A*t)*v$ operations via Krylov methods work for A being affine as well, and all sorts of things. Thus, affine operators have no matrix representation, but they are still compatible with essentially any Krylov method, which would otherwise be compatible with matrix-free representations, hence their support in the SciMLOperators interface.
Note about keyword arguments to update_coefficients!
In rare cases, an operator may be used in a context where additional state is expected to be provided to update_coefficients!
beyond u
, p
, and t
. In this case, the operator may accept this additional state through arbitrary keyword arguments to update_coefficients!
. When the caller provides these, they will be recursively propagated downwards through composed operators just like u
, p
, and t
, and provided to the operator. For the premade SciMLOperators, one can specify the keyword arguments used by an operator with an accepted_kwargs
argument (by default, none are passed).
In the below example, we create an operator that gleefully ignores u
, p
, and t
and uses its own special scaling.
using SciMLOperators
γ = ScalarOperator(0.0;
update_func = (a, u, p, t; my_special_scaling) -> my_special_scaling,
accepted_kwargs = (:my_special_scaling,))
# Update coefficients, then apply operator
update_coefficients!(γ, nothing, nothing, nothing; my_special_scaling = 7.0)
@show γ * [2.0]
# Use operator application form
@show γ([2.0], nothing, nothing, nothing; my_special_scaling = 5.0)
γ * [2.0] = [14.0]
γ([2.0], nothing, nothing, nothing; my_special_scaling = 5.0) = [10.0]