Design #1

Closed
MikeInnes opened this issue Apr 5, 2019 · 3 comments

Comments

@MikeInnes
Member

I have been discussing with @pkofod how to design optimisers that can be used across Flux, Optim.jl and perhaps other packages. It seems the basic outline of the design in FluxML/Flux.jl#637 is something that Optim can work with. We're currently looking at splitting this into three functions:

state = init(rule, x)
dx', state' = apply(rule, x, dx, state)
x' = update(x, dx')
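
To make the three-way split concrete, here is a minimal sketch of how a single rule might implement it. The `Momentum` type, its field names and the choice of a velocity buffer as state are assumptions for illustration only; only the names `init`, `apply` and `update` come from the outline above.

```julia
# Hypothetical rule type; the name and fields are illustrative assumptions.
struct Momentum
    eta::Float64   # learning rate
    rho::Float64   # momentum coefficient
end

# init: build the rule's state for a parameter x (here, a zeroed velocity buffer).
init(rule::Momentum, x) = zero(x)

# apply: turn the raw gradient dx into a step; return the step and the new state.
function apply(rule::Momentum, x, dx, state)
    v = rule.rho .* state .+ rule.eta .* dx
    return v, v
end

# update: combine the parameter with the processed step (out of place).
update(x, dx) = x .- dx

rule = Momentum(0.1, 0.9)
x = [1.0, 2.0]
dx = [0.5, -0.5]

state = init(rule, x)
delta, state = apply(rule, x, dx, state)
x = update(x, delta)
```

Nothing is mutated here: each call returns new values, so the caller decides where the state lives.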

Some design goals from my side:

  • It should be easy to e.g. specify that structs are optimised by optimising each field.
  • It should be easy to specify how custom structs like Colors are updated (e.g. clamp the values).
  • apply should support stateless optimisers (state = nothing) in a generic way.
  • We also need an in-place update!, but at this level we don't need to do any in-place/out-of-place detection.
  • Rules should be composable (e.g. weight decay and ADAM).
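
As a rough sketch of how the stateless and composability goals could fit together: rules without state fall back to `init` returning `nothing`, and a composite rule threads the gradient through its members while keeping one state per member. The names `WeightDecay`, `Descent` and `Composed` are hypothetical, and plain gradient descent stands in for ADAM to keep the example short.

```julia
# All type names below are hypothetical, chosen only for this sketch.
struct WeightDecay
    lambda::Float64
end

struct Descent
    eta::Float64
end

struct Composed{T<:Tuple}
    rules::T
end

# Fallback: a rule that defines no init of its own is treated as stateless.
init(rule, x) = nothing

# Stateless rules accept `nothing` and hand `nothing` back.
apply(r::WeightDecay, x, dx, ::Nothing) = (dx .+ r.lambda .* x, nothing)
apply(r::Descent, x, dx, ::Nothing) = (r.eta .* dx, nothing)

# A composite rule keeps one state per member rule, in order.
init(c::Composed, x) = map(r -> init(r, x), c.rules)

function apply(c::Composed, x, dx, states)
    newstates = ()
    for (r, s) in zip(c.rules, states)
        # Each rule transforms the gradient produced by the previous one.
        dx, s = apply(r, x, dx, s)
        newstates = (newstates..., s)
    end
    return dx, newstates
end

rule = Composed((WeightDecay(0.1), Descent(0.5)))
x = [1.0, -2.0]
dx = [0.2, 0.4]

state = init(rule, x)
delta, state = apply(rule, x, dx, state)
```

A stateful member would slot into the same tuple without changes to `Composed`, since the composite only forwards whatever state each rule returns.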

The current default for update(x, dx) is to calculate x .- dx; this is convenient for ML but could be changed if it's inconvenient for other things (we'll just do the negation as part of the rule).
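
A sketch of how `update` might be specialised by dispatch, assuming the `x .- dx` default described above. `RGBColor` is a hypothetical stand-in for a colour type, and representing its gradient as another `RGBColor` is also an assumption made for brevity.

```julia
# Default: plain subtraction.
update(x, dx) = x .- dx

# In-place variant for arrays; no in-place/out-of-place detection is attempted.
update!(x::AbstractArray, dx) = (x .-= dx; x)

# Hypothetical colour type standing in for something like an RGB value.
struct RGBColor
    r::Float64
    g::Float64
    b::Float64
end

# Custom update: subtract, then clamp each channel into [0, 1].
update(c::RGBColor, dc::RGBColor) = RGBColor(
    clamp(c.r - dc.r, 0, 1),
    clamp(c.g - dc.g, 0, 1),
    clamp(c.b - dc.b, 0, 1),
)

# Containers are updated field by field; NamedTuple stands in for a struct here.
update(x::NamedTuple, dx::NamedTuple) = map(update, x, dx)

params = (w = [1.0, 2.0], c = RGBColor(0.9, 0.1, 0.5))
grads = (w = [0.1, 0.1], c = RGBColor(-0.3, 0.3, 0.0))

newparams = update(params, grads)

buf = [1.0, 2.0]
update!(buf, [0.5, 0.5])
```

If a different sign convention were wanted, only the generic `update` method would change and the negation would move into the rule.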

@xukai92

xukai92 commented May 20, 2019

How different is this design from the current optimisers.jl in Flux? We also need to use optimisers in our VI projects. Maybe we can help with this package?

@MikeInnes
Member Author

It's designed to be essentially similar but with explicit state, rather than using IdDicts everywhere. I haven't really figured out how to make it convenient yet, though, so help is welcome. One advantage of this design is that it's flexible enough that e.g. state can actually be a whole history of states for L-BFGS and things like that, so it should be capable of expressing VI too, I would expect.
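
To illustrate state that is a history rather than a single buffer, here is a small sketch. It is not L-BFGS: the hypothetical `HistoryDescent` rule just keeps the last `m` gradients and steps along their mean, which is enough to show that the explicit state can have any shape the rule needs.

```julia
# Hypothetical rule; the name and fields are assumptions for this sketch.
struct HistoryDescent
    eta::Float64   # learning rate
    m::Int         # number of past gradients to keep
end

# The state is an (initially empty) vector of past gradients.
init(rule::HistoryDescent, x) = typeof(x)[]

function apply(rule::HistoryDescent, x, dx, history)
    # Build a new history instead of mutating the old one.
    newhistory = vcat(history, [dx])
    if length(newhistory) > rule.m
        newhistory = newhistory[end-rule.m+1:end]
    end
    avg = sum(newhistory) ./ length(newhistory)
    return rule.eta .* avg, newhistory
end

rule = HistoryDescent(1.0, 2)
x = [0.0]

h = init(rule, x)
d1, h = apply(rule, x, [1.0], h)
d2, h = apply(rule, x, [3.0], h)
d3, h = apply(rule, x, [5.0], h)
```

Because the history is passed in and returned explicitly, no lookup keyed on the parameter's identity is needed.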

@CarloLucibello
Member

This has been done, I guess.
