
[API] design for generic optimizer #93

Open
fkiraly opened this issue Sep 20, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@fkiraly
Contributor

fkiraly commented Sep 20, 2024

From our earlier discussion.

I would design a generic interface as follows:

  • there are two (interface) classes, the BaseOptimizer and the BaseExperiment (or BaseEvaluator etc). Both inherit from skbase BaseObject, so they provide a dataclass-like, sklearn-like composable interface.
    • in particular, __init__ args must always be explicit, never positional-only or **kwargs.
    • the skbase tag system can be used to collect all the tags, e.g., from GFO things like the type of optimizer (particle etc), whether it is computationally expensive, or which soft dependencies it requires.
  • the BaseExperiment has a score method with the same signature as your "model" currently; its __call__ also redirects to score, so it can be used with the current signature. That's the "basic" interface, but we could also add an interface for gradients, to also cover gradient-based optimizers!
    • a subclass of BaseExperiment could, for instance, evaluate an sklearn classifier by cv on a dataset, so it could be SklearnExperiment(my_randomforest, X, y, KFold(5)).
  • the BaseOptimizer has __init__, which passes parameters only, and add_search, which has almost the current signature - it takes a BaseExperiment descendant instance, and one more object which configures the search space. Search behaviour like n_iter would not be passed in add_search, but should be an __init__ arg.
    • to execute the search, I would suggest a fit method, as that would be compliant with multiple API naming choices, though I would not mind run or optimize etc. This method sets attributes on self, ending in _, so they are visible via get_fitted_params.
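The design above could be sketched roughly as follows. All names, signatures, and bodies here are illustrative placeholders, not the actual skbase or Hyperactive API, and the exhaustive loop in fit merely stands in for a real optimization backend:

```python
class BaseExperiment:
    """Interface class: subclasses implement score on a parameter dict."""

    def score(self, params):
        raise NotImplementedError

    def __call__(self, params):
        # __call__ redirects to score, so an experiment can be used
        # wherever the current "model" callable is expected
        return self.score(params)


class BaseOptimizer:
    """Search behaviour (e.g. n_iter) is configured in __init__;
    the experiment and search space are passed via add_search."""

    def __init__(self, n_iter=10):
        self.n_iter = n_iter

    def add_search(self, experiment, search_space):
        self.experiment = experiment
        self.search_space = search_space
        return self

    def fit(self):
        # toy exhaustive search standing in for a real backend;
        # fitted state goes into attributes ending in "_"
        best_score, best_params = float("-inf"), None
        for params in self.search_space[: self.n_iter]:
            s = self.experiment(params)
            if s > best_score:
                best_score, best_params = s, params
        self.best_params_ = best_params
        self.best_score_ = best_score
        return self


class QuadraticExperiment(BaseExperiment):
    """Toy experiment: maximise -(x - 2) ** 2."""

    def score(self, params):
        return -((params["x"] - 2) ** 2)


opt = BaseOptimizer(n_iter=5)
opt.add_search(QuadraticExperiment(), [{"x": v} for v in range(5)])
opt.fit()
print(opt.best_params_, opt.best_score_)  # {'x': 2} 0
```

Any concrete optimizer (grid, particle-based, gradient-based) would only need to override fit, while every experiment subclass remains reusable across backends.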

Thoughts?

@fkiraly fkiraly added the enhancement New feature or request label Sep 20, 2024
@fkiraly
Contributor Author

fkiraly commented Sep 20, 2024

PS: I'm happy to try writing this if you would like me to? Not right now due to being busy, but maybe early Oct.

@SimonBlanke
Owner

Hello @fkiraly,

I took some time to understand the changes you are proposing. I will show you how I interpreted them, so please correct me if I misunderstood something.

It appears that you want to change the API of Hyperactive so that it is possible to use different optimization backends. This also necessitates implementing an interface (Experiment) that is adapted to certain optimizers.
For example: an optimizer that uses gradients also requires an experimental setup that supports gradients.

I would be open to the possibility of optionally selecting other optimization backends for the experiment.

so it could be SklearnExperiment(my_randomforest, X, y, KFold(5))

I do not understand this example, because it would already be covered by the sklearn integration. A separate experiment-class for each package (sklearn, xgboost, pytorch) would heavily decrease the flexibility of the interface.

I would suggest a fit method, as that would be compliant with multiple API naming choices

Hyperactive does not fit an estimator at that point in the API. It runs the optimization setup. The fit method makes sense in the sklearn integration.

@fkiraly
Contributor Author

fkiraly commented Sep 27, 2024

A separate experiment-class for each package (sklearn, xgboost, pytorch) would heavily decrease the flexibility of the interface.

This would be used only for adaptation inside the sklearn adapter. The optimizer optimises the experiment.

You would need at least one experiment per package or unified API, no? But not one per unified API and optimizer.
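As a rough illustration of "one experiment per unified API": a single hypothetical adapter class covers all sklearn estimators, regardless of which optimizer backend consumes it. The class name comes from the discussion above; scoring via sklearn's cross_val_score is my assumption, not the actual implementation:

```python
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier


class SklearnExperiment:
    """One adapter for the whole sklearn API: any estimator, any
    dataset, any cv splitter - independent of the optimizer backend."""

    def __init__(self, estimator, X, y, cv):
        self.estimator, self.X, self.y, self.cv = estimator, X, y, cv

    def score(self, params):
        # evaluate one parameter configuration by cross-validation
        est = clone(self.estimator).set_params(**params)
        return cross_val_score(est, self.X, self.y, cv=self.cv).mean()

    __call__ = score  # usable as a plain objective callable


X, y = load_iris(return_X_y=True)
exp = SklearnExperiment(KNeighborsClassifier(), X, y, KFold(5))
acc = exp({"n_neighbors": 3})  # mean cv accuracy for this configuration
```

The same instance could then be handed to any optimizer via add_search, so the number of adapter classes grows with the number of unified APIs, not with the number of optimizers.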

Hyperactive does not fit an estimator at that point in the api.

I just mean: why not call it fit instead of optimize? It is just a naming question, since fit is used so often for data ingestion of any kind.

@fkiraly
Contributor Author

fkiraly commented Sep 28, 2024

I think there is a small degree of miscommunication - would you like me to write a design document, or a draft PR (for demo purposes only)?

@SimonBlanke
Owner

I think there is a small degree of miscommunication - would you like me to write a design document, or a draft PR (for demo purposes only)?

That would be great! :-)

@fkiraly
Contributor Author

fkiraly commented Oct 30, 2024

Partially implemented here - feedback appreciated!

#95

@SimonBlanke
Owner

Relevant comment: #85 (comment)
