OptimizationFunction
SciMLBase.OptimizationFunction — Type

struct OptimizationFunction{iip, AD, F, G, FG, H, FGH, HV, C, CJ, CJV, CVJ, CH, HP, CJP, CHP, O, EX, CEX, SYS, LH, LHP, HCV, CJCV, CHCV, LHCV, ID} <: SciMLBase.AbstractOptimizationFunction{iip}
A representation of an objective function `f`, defined by:

\[\min_{u} f(u,p)\]

and all of its related functions, such as the gradient of `f`, its Hessian, and more. For all cases, `u` is the state, which here means the optimization variables, and `p` are the fixed parameters or data.
Constructor
OptimizationFunction{iip}(f, adtype::AbstractADType = NoAD();
                          grad = nothing, hess = nothing, hv = nothing,
                          cons = nothing, cons_j = nothing, cons_jvp = nothing,
                          cons_vjp = nothing, cons_h = nothing,
                          hess_prototype = nothing,
                          cons_jac_prototype = nothing,
                          cons_hess_prototype = nothing,
                          observed = __has_observed(f) ? f.observed : DEFAULT_OBSERVED_NO_TIME,
                          lag_h = nothing,
                          hess_colorvec = __has_colorvec(f) ? f.colorvec : nothing,
                          cons_jac_colorvec = __has_colorvec(f) ? f.colorvec : nothing,
                          cons_hess_colorvec = __has_colorvec(f) ? f.colorvec : nothing,
                          lag_hess_colorvec = nothing,
                          sys = __has_sys(f) ? f.sys : nothing)
Positional Arguments
- `f(u,p)`: the function to optimize. `u` are the optimization variables and `p` are the fixed parameters or data used in the objective. Even if no such parameters are used in the objective, `p` should still be an argument of the function. For minibatching, `p` can be used to pass in a minibatch; take a look at the tutorial here to see how to do it. The function should return a scalar, the loss value, as its output. A minimal construction sketch is shown after this list.
- `adtype`: see the Defining Optimization Functions via AD section below.
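For example, a minimal sketch of constructing an `OptimizationFunction` for a Rosenbrock-type objective (the objective and parameter values here are illustrative choices, not part of this API):

```julia
using SciMLBase, ADTypes

# Objective: takes the optimization variables `u` and the fixed parameters `p`,
# and returns a scalar loss value.
rosenbrock(u, p) = (p[1] - u[1])^2 + p[2] * (u[2] - u[1]^2)^2

# Request that the gradient, Hessian, etc. be generated via ForwardDiff.jl
# (see the AD section below); ForwardDiff.jl must be available when the
# derivative functions are actually instantiated by a solver package.
optf = OptimizationFunction(rosenbrock, ADTypes.AutoForwardDiff())
```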
Keyword Arguments
- `grad(G,u,p)` or `G=grad(u,p)`: the gradient of `f` with respect to `u`.
- `hess(H,u,p)` or `H=hess(u,p)`: the Hessian of `f` with respect to `u`.
- `hv(Hv,u,v,p)` or `Hv=hv(u,v,p)`: the Hessian-vector product $(d^2 f / du^2) v$.
- `cons(res,u,p)` or `res=cons(u,p)`: the constraints function, which should mutate the passed `res` array with the value of the `i`th constraint, evaluated at the current values of the variables inside the optimization routine. This takes just the function evaluations; the equality or inequality assertion is applied by the solver based on the constraint bounds passed as `lcons` and `ucons` to `OptimizationProblem`. For equality constraints, `lcons` and `ucons` should be passed equal values. (A sketch of hand-written `grad`, `hess`, and `cons` functions follows this list.)
- `cons_j(J,u,p)` or `J=cons_j(u,p)`: the Jacobian of the constraints.
- `cons_jvp(Jv,u,v,p)` or `Jv=cons_jvp(u,v,p)`: the Jacobian-vector product of the constraints.
- `cons_vjp(Jv,u,v,p)` or `Jv=cons_vjp(u,v,p)`: the vector-Jacobian (transposed Jacobian-vector) product of the constraints.
- `cons_h(H,u,p)` or `H=cons_h(u,p)`: the Hessian of the constraints, provided as an array of Hessians with `res[i]` being the Hessian with respect to the `i`th output of `cons`.
- `hess_prototype`: a prototype matrix matching the type of the Hessian. For example, if the Hessian is tridiagonal, then an appropriately sized structured matrix type can be used as the prototype, and optimization solvers will specialize on this structure where possible. Non-structured sparsity patterns should use a `SparseMatrixCSC` with a correct sparsity pattern for the Hessian. The default is `nothing`, which means a dense Hessian.
- `cons_jac_prototype`: a prototype matrix matching the type of the constraint Jacobian. The default is `nothing`, which means a dense constraint Jacobian.
- `cons_hess_prototype`: a prototype matrix matching the type of the constraint Hessian. This is defined as an array of matrices, where `hess[i]` is the Hessian w.r.t. the `i`th output. For example, if the Hessian is sparse, then `hess` is a `Vector{SparseMatrixCSC}`. The default is `nothing`, which means a dense constraint Hessian.
- `lag_h(res,u,sigma,mu,p)` or `res=lag_h(u,sigma,mu,p)`: the Hessian of the Lagrangian, where `sigma` is a multiplier of the cost function and `mu` are the Lagrange multipliers multiplying the constraints. This can be provided instead of `hess` and `cons_h` to solvers that directly use the Hessian of the Lagrangian.
- `hess_colorvec`: a color vector according to the SparseDiffTools.jl definition for the sparsity pattern of the `hess_prototype`. This specializes the Hessian construction when using finite differences and automatic differentiation, so that it is computed in an accelerated manner based on the sparsity pattern. Defaults to `nothing`, which means a color vector will be internally computed on demand when required. The cost of this operation is highly dependent on the sparsity pattern.
- `cons_jac_colorvec`: a color vector according to the SparseDiffTools.jl definition for the sparsity pattern of the `cons_jac_prototype`.
- `cons_hess_colorvec`: an array of color vectors according to the SparseDiffTools.jl definition for the sparsity pattern of the `cons_hess_prototype`.
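As a hedged sketch of supplying these functions by hand, the in-place forms of `grad`, `hess`, and `cons` for a Rosenbrock-type objective might look as follows. The constraint and its bounds are made-up illustrations, and solver support for manually supplied derivatives varies:

```julia
using SciMLBase

# Objective and hand-written derivatives, all in the in-place forms listed above.
f(u, p) = (p[1] - u[1])^2 + p[2] * (u[2] - u[1]^2)^2

function grad!(G, u, p)
    G[1] = -2 * (p[1] - u[1]) - 4 * p[2] * u[1] * (u[2] - u[1]^2)
    G[2] = 2 * p[2] * (u[2] - u[1]^2)
    return nothing
end

function hess!(H, u, p)
    H[1, 1] = 2 - 4 * p[2] * (u[2] - 3 * u[1]^2)
    H[1, 2] = -4 * p[2] * u[1]
    H[2, 1] = -4 * p[2] * u[1]
    H[2, 2] = 2 * p[2]
    return nothing
end

# A single constraint value u[1]^2 + u[2]^2, written in-place into `res`.
function cons!(res, u, p)
    res[1] = u[1]^2 + u[2]^2
    return nothing
end

optf = OptimizationFunction(f; grad = grad!, hess = hess!, cons = cons!)

# The constraint bounds go to OptimizationProblem, not OptimizationFunction;
# lcons = [-Inf], ucons = [1.0] encodes u[1]^2 + u[2]^2 <= 1.
prob = OptimizationProblem(optf, [0.0, 0.0], [1.0, 100.0];
                           lcons = [-Inf], ucons = [1.0])
```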
When the Symbolic Problem Building with ModelingToolkit interface is used, the following arguments are also relevant:

- `observed`: an algebraic combination of optimization variables that is of interest to the user and will be available in the solution. This can be a single expression or multiple expressions.
- `sys`: the field that stores the `OptimizationSystem`.
Defining Optimization Functions via AD
While using the keyword arguments gives the user control over defining all of the possible functions, the simplest way to handle the generation of an `OptimizationFunction` is by specifying an option from ADTypes.jl, which lets the user choose the automatic differentiation backend used to automatically fill in all of the extra functions. For example, `OptimizationFunction(f, AutoForwardDiff())` will use ForwardDiff.jl to define all of the necessary functions. Note that if any functions are defined directly, the auto-AD definition does not overwrite the user's choice.
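A short sketch of mixing an AD backend with a directly supplied function (the loss and hand-written gradient below are hypothetical examples, not part of the API):

```julia
using SciMLBase, ADTypes

loss(u, p) = sum(abs2, u .- p)

# All derivative functions (gradient, Hessian, ...) will be generated via
# ForwardDiff.jl when the problem is instantiated by a solver package.
optf_ad = OptimizationFunction(loss, ADTypes.AutoForwardDiff())

# A directly supplied function takes precedence over the AD-generated one:
mygrad!(G, u, p) = (G .= 2 .* (u .- p); nothing)
optf_mixed = OptimizationFunction(loss, ADTypes.AutoForwardDiff(); grad = mygrad!)
```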
Each of the AD-based constructors is documented separately via its own dispatch below in the Automatic Differentiation Construction Choice Recommendations section.
iip: In-Place vs Out-Of-Place
For more details on this argument, see the ODEFunction documentation.
specialize: Controlling Compilation and Specialization
For more details on this argument, see the ODEFunction documentation.
Fields
The fields of the OptimizationFunction type directly match the names of the inputs.