JAXopt with nonlinear optimization and neural networks #177

mhr · 2022-02-23T19:47:34Z

mhr
Feb 23, 2022

I would like to train a neural network architecture with a nonlinear optimization algorithm like IPOPT or SNOPT as a layer in the architecture, similar in spirit to the MPC layer in https://locuslab.github.io/mpc.pytorch/. It seems like JAXopt’s implicit differentiation for custom fixed point solvers (https://jaxopt.github.io/stable/_autosummary/jaxopt.implicit_diff.custom_fixed_point.html#jaxopt.implicit_diff.custom_fixed_point) is relevant, but I'm not too sure how to make use of it exactly. Or perhaps https://jaxopt.github.io/stable/_autosummary/jaxopt.implicit_diff.custom_root.html#jaxopt-implicit-diff-custom-root? Can you give some guidance as to where to start?

Thanks!

Example copied from https://github.com/mechmotum/cyipopt/blob/master/examples/hs071_scipy_jax.py

import jax
from jax import jit, grad, jacfwd, jacrev
import jax.numpy as jnp
from jaxopt import implicit_diff
from cyipopt import minimize_ipopt
import numpy as np

def objective(x):
    return x[0]*x[3]*jnp.sum(x[:3]) + x[2]

def eq_constraints(x):
    return jnp.sum(x**2) - 40

def ineq_constraints(x):
    return jnp.prod(x) - 25

# jit the functions
obj_jit = jit(objective)
con_eq_jit = jit(eq_constraints)
con_ineq_jit = jit(ineq_constraints)

# build the derivatives and jit them
obj_grad = jit(grad(obj_jit))  # objective gradient
obj_hess = jit(jacrev(jacfwd(obj_jit)))  # objective hessian
con_eq_jac = jit(jacfwd(con_eq_jit))  # jacobian
con_ineq_jac = jit(jacfwd(con_ineq_jit))  # jacobian
con_eq_hess = jacrev(jacfwd(con_eq_jit)) # hessian
con_eq_hessvp = jit(lambda x, v: con_eq_hess(x) * v[0]) # hessian vector-product
con_ineq_hess = jacrev(jacfwd(con_ineq_jit))  # hessian
con_ineq_hessvp = jit(lambda x, v: con_ineq_hess(x) * v[0])  # hessian vector-product

# constraints
cons = [
    {'type': 'eq', 'fun': con_eq_jit, 'jac': con_eq_jac, 'hess': con_eq_hessvp},
    {'type': 'ineq', 'fun': con_ineq_jit, 'jac': con_ineq_jac, 'hess': con_ineq_hessvp},
]

# variable bounds: 1 <= x[i] <= 5
bnds = [(1, 5) for _ in range(4)]

# my guess as to what the objective should be
def complete_objective(x):
    return objective(x) + eq_constraints(x) + ineq_constraints(x)

@implicit_diff.custom_root(jax.grad(complete_objective))
def ipopt_solver(x0):
    # solve
    display_opts = 0 # 5 for full info
    tol = 1e-7
    sol = minimize_ipopt(obj_jit, jac=obj_grad, hess=obj_hess, x0=x0, bounds=bnds,
                         constraints=cons, options={"print_level": display_opts, "tol": tol})
    sol = jnp.array(sol.x)
    return sol

# x0 (input to ipopt_solver(x0)) is the initial value given to the solver, it is the output from the neural net

Answered by Algue-Rythme

Feb 28, 2022

Hi,

It's almost that ! But actually you cannot simply add constraints and objective the way you did: objective(x) + eq_constraints(x) + ineq_constraints(x). It does not make sense.

For one to enable implicit differentiation of constrained optimization problems you need both primal and dual variables (see page 5 in https://arxiv.org/pdf/2105.15183.pdf ). There is a way to do that with a function specially made to handle KKT conditions: jaxopt._src.implicit_diff.make_kkt_optimality_fun. This method can be found here:

jaxopt/jaxopt/_src/implicit_diff.py

Line 330 in b77c9f5

def make_kkt_optimality_fun(obj_fun, eq_fun, ineq_fun=None):

You will find example of usage here

As yo…

View full answer

Algue-Rythme · 2022-02-28T15:37:38Z

Algue-Rythme
Feb 28, 2022
Collaborator

Hi,

It's almost that ! But actually you cannot simply add constraints and objective the way you did: objective(x) + eq_constraints(x) + ineq_constraints(x). It does not make sense.

For one to enable implicit differentiation of constrained optimization problems you need both primal and dual variables (see page 5 in https://arxiv.org/pdf/2105.15183.pdf ). There is a way to do that with a function specially made to handle KKT conditions: jaxopt._src.implicit_diff.make_kkt_optimality_fun. This method can be found here:

jaxopt/jaxopt/_src/implicit_diff.py

Line 330 in b77c9f5

def make_kkt_optimality_fun(obj_fun, eq_fun, ineq_fun=None):

You will find example of usage here

As you can see the optimality fun is build from objective, eq_constraints and ineq_constraints. The optimum of your problem consists of primal variable x but also the dual variables. I am not familiar with your lib but the doc says "The method returns the optimal solution and an info dictionary that contains the status of the algorithm, the value of the constraints multipliers at the solution" so you may want to use that.

In the notebook I sent you above the function solve_with_cvxpy is made differentiable with implicit diff, and wraps the non differentiable code of cvxpy. This function takes four parameters in input: the init_params tuple that contains primal/dual variables. It needs to be present for signature reasons, even if it is unused. It can be useful for warm start by probiding an intiial guess. The three others parameters obj_params, eq_params, ineq_params are the ones for which we use implicit differentiation.

The problem you gave in example do not exhibit such parameters; so there is nothing to differentiate currently (implicit diff is useless here).

Maybe consider replacing 40 by params_eq in eq_constraints, and in this case you will be able to compute the derivative of $x*$ (the optimum) wrt params_eq.

3 replies

mhr Mar 15, 2022
Author

Thanks so much!

mhr Mar 17, 2022
Author

Is it somehow possible to find the gradient of the optimization output WRT init_params too, in addition to obj_params, eq_params, ineq_params, etc.? I suspect not, but I figured it wouldn't hurt to ask.

mblondel Mar 17, 2022
Maintainer

You can only do implicit differentiation with respect to terms that appear in the optimality condition, which is not the case of init_params. To differentiate with respect to init_params, you can use backpropagation through unrolled iterates of your solver or you need to add init_params to your objective function through a proximity term like ||params - init_params||^2, so that init_params will appear in the optimality condition.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JAXopt with nonlinear optimization and neural networks #177

{{title}}

Replies: 1 comment 3 replies

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

Select a reply

JAXopt with nonlinear optimization and neural networks #177

mhr Feb 23, 2022

Replies: 1 comment · 3 replies

Algue-Rythme Feb 28, 2022 Collaborator

mhr Mar 15, 2022 Author

mhr Mar 17, 2022 Author

mblondel Mar 17, 2022 Maintainer

mhr
Feb 23, 2022

Replies: 1 comment 3 replies

Algue-Rythme
Feb 28, 2022
Collaborator

mhr Mar 15, 2022
Author

mhr Mar 17, 2022
Author

mblondel Mar 17, 2022
Maintainer