
Dimensionality issues of the gradient with linear-multi-fidelity model when evaluating Expected improvement with context variables #469

Open
ToennisStef opened this issue Jan 12, 2025 · 0 comments

ToennisStef commented Jan 12, 2025

Hello,
I want to run Bayesian optimization on a 2-dimensional multi-fidelity test problem. I defined a linear multi-fidelity model, set up the acquisition function, and set a context variable for the fidelity parameter, fixing it to the highest fidelity for the optimization. But when trying to optimize the acquisition function, an error was raised immediately, originating from the SciPy L-BFGS algorithm. I noticed that the shape of the gradient of the acquisition function was (1,2,2) instead of (2) or (1,2). The dimensionality error comes from the expected improvement acquisition function definition, specifically from this line:

dimprovement_dx = dstandard_deviation_dx * pdf - cdf * dmean_dx

The shape of dmean_dx was (1,2,1), which results in dimprovement_dx having shape (1,2,2).
A quick fix for the evaluation was to add this line of code:
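For reference, the offending broadcast can be reproduced in plain NumPy. All values below are hypothetical stand-ins for the quantities inside the acquisition function; only the shapes matter. Subtracting a `(1, 2, 1)` array from a `(1, 2)` array broadcasts to `(1, 2, 2)`:

```python
import numpy as np

# Hypothetical stand-in values for the quantities inside the acquisition (d = 2)
dstandard_deviation_dx = np.array([[0.3, -0.1]])  # shape (1, 2), as expected
dmean_dx = np.array([[[0.5], [0.2]]])             # shape (1, 2, 1), the reported shape
pdf = np.array([[0.35]])                          # shape (1, 1)
cdf = np.array([[0.60]])                          # shape (1, 1)

# Same expression as in the acquisition function:
dimprovement_dx = dstandard_deviation_dx * pdf - cdf * dmean_dx
print(dimprovement_dx.shape)  # (1, 2, 2) -- L-BFGS-B expects a rank-1 gradient
```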

dimprovement_dx = np.diagonal(dimprovement_dx).T

which now works for my 2-dimensional test problem. It will hopefully also still work for 1-dimensional test problems, but I think it will not work for higher-dimensional problems, so it's probably advisable to just fix the dimensionality of dmean_dx instead.
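The concern about higher dimensions can be checked with hypothetical values: `np.diagonal` operates on the first two axes by default, so on a `(1, d, d)` array it does not in general pick out the matching per-coordinate terms; those sit on axes 1 and 2. Squeezing the trailing axis of `dmean_dx` before the subtraction sidesteps the three-dimensional intermediate entirely:

```python
import numpy as np

# Hypothetical d = 3 case with made-up values; only shapes and alignment matter.
dstd_dx = np.array([[1.0, 2.0, 3.0]])            # shape (1, 3)
dmean_dx = np.array([[[10.0], [11.0], [12.0]]])  # shape (1, 3, 1), the buggy extra axis
pdf, cdf = 0.4, 0.7                              # made-up scalars

bad = dstd_dx * pdf - cdf * dmean_dx             # broadcasts to (1, 3, 3)

# Correcting dmean_dx at the source keeps coordinates aligned:
good = dstd_dx * pdf - cdf * dmean_dx[:, :, 0]   # shape (1, 3)

# The per-coordinate terms are the diagonal over axes 1 and 2,
# not over np.diagonal's default axes 0 and 1:
print(np.allclose(np.diagonal(bad, axis1=1, axis2=2), good))  # True
```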

Code to replicate the error:

import numpy as np
import GPy

from emukit.core import ParameterSpace, ContinuousParameter, DiscreteParameter
from emukit.bayesian_optimization.acquisitions import ExpectedImprovement
from emukit.core.optimization import GradientAcquisitionOptimizer
import emukit.multi_fidelity
from emukit.model_wrappers.gpy_model_wrappers import GPyMultiOutputWrapper
from emukit.multi_fidelity.models import GPyLinearMultiFidelityModel


X_train = np.array([[ 0.13473929, -0.32606444,  0.        ],
       [-1.53102551,  1.4245339 ,  0.        ],
       [-0.48998839, -1.29864283,  0.        ],
       [ 1.84718222,  0.45174302,  0.        ],
       [ 1.32512917, -1.85203682,  0.        ],
       [-0.96369396,  0.89760785,  0.        ],
       [-1.04751664, -0.76983995,  0.        ],
       [ 0.66267413,  1.9795312 ,  0.        ],
       [ 0.80437702, -1.13627663,  0.        ],
       [-1.40778594,  0.11410064,  0.        ],
       [-0.5741273 , -0.24182373,  0.        ],
       [ 1.21662989,  1.00876794,  0.        ],
       [ 1.73703451, -0.66999665,  0.        ],
       [-0.0981639 ,  1.57938117,  0.        ],
       [-1.88964734, -1.70531898,  0.        ],
       [ 0.27418341,  0.54433427,  0.        ],
       [ 0.47281786, -1.60549965,  0.        ],
       [-1.86141352,  0.64387054,  0.        ],
       [-0.14885246, -0.51576477,  0.        ],
       [ 1.51398075,  1.73388086,  0.        ],
       [-1.43824131,  0.65202268,  1.        ],
       [ 0.12916456, -1.85884469,  1.        ],
       [ 1.68246282,  1.32535729,  1.        ],
       [-0.99837652, -0.18159998,  1.        ],
       [-0.21917494,  1.69523472,  1.        ],
       [ 1.41106617, -0.80000571,  1.        ],
       [ 0.90085879,  0.26700687,  1.        ],
       [-1.71775573, -1.22432354,  1.        ],
       [-1.8063743 ,  1.13146007,  1.        ],
       [ 0.62229842, -0.36158118,  1.        ],
       [ 1.06610087,  0.82823509,  1.        ],
       [-0.25701166, -1.66090378,  1.        ],
       [-0.52450227,  0.21373441,  1.        ],
       [ 1.84139629, -1.29492998,  1.        ],
       [ 0.34669023,  1.76207188,  1.        ],
       [-1.03858617, -0.74269003,  1.        ],
       [-1.24743496,  1.88256894,  1.        ],
       [ 0.43151265, -0.61242308,  1.        ],
       [ 1.88211933,  0.0793515 ,  1.        ],
       [-0.69120659, -1.41173808,  1.        ],
       [ 1.75531639,  0.01006163,  2.        ],
       [-0.16598199, -1.41704852,  2.        ],
       [-1.39763355,  1.91486136,  2.        ],
       [ 0.55827226, -0.50934682,  2.        ],
       [ 0.28320662,  1.19174326,  2.        ],
       [-1.63918466, -0.34873588,  2.        ],
       [-0.92222783,  0.8503596 ,  2.        ],
       [ 1.02817909, -1.6948546 ,  2.        ],
       [ 1.41036964,  1.53844309,  2.        ],
       [-0.50848947, -0.88097757,  2.        ],
       [-1.80314409,  0.38062028,  2.        ],
       [ 0.15129807, -1.03565238,  2.        ],
       [ 0.87819937,  0.72781515,  2.        ],
       [-1.04663165, -1.82035536,  2.        ],
       [-0.2667013 ,  1.32405309,  2.        ],
       [ 1.68516784, -0.22909473,  2.        ],
       [ 1.51735547,  1.44219314,  2.        ],
       [-0.43482649, -0.09917109,  2.        ],
       [-1.12851415,  0.59992489,  2.        ],
       [ 0.79601932, -1.94440427,  2.        ]])

Y_train = np.array([[1.24180338e+00],
       [9.84373290e+00],
       [2.44074890e+01],
       [8.03973669e+01],
       [1.22073038e+02],
       [4.11819471e-01],
       [3.71351316e+01],
       [2.31060430e+01],
       [3.05597094e+01],
       [3.80808397e+01],
       [3.57512288e+00],
       [2.20424659e+00],
       [1.25195017e+02],
       [2.49581510e+01],
       [3.08139883e+02],
       [2.26366363e+00],
       [3.26536686e+01],
       [8.85821224e+01],
       [3.01479215e+00],
       [3.07271610e+00],
       [1.01034474e+02],
       [1.81278479e+02],
       [1.25356617e+02],
       [7.10193407e+01],
       [1.38096216e+02],
       [4.00845553e+02],
       [2.26572223e+01],
       [8.73085407e+02],
       [2.27545534e+02],
       [3.47838964e+01],
       [1.32073600e+01],
       [1.53116297e+02],
       [2.52078040e+00],
       [1.11225838e+03],
       [1.39240704e+02],
       [1.67680976e+02],
       [5.57808268e+00],
       [3.78929038e+01],
       [6.13716074e+02],
       [1.81275800e+02],
       [9.43720052e+02],
       [2.10046008e+02],
       [5.89701164e+00],
       [6.76016433e+01],
       [1.24065302e+02],
       [9.28489813e+02],
       [3.69496190e+00],
       [7.57354963e+02],
       [2.04814007e+01],
       [1.32130481e+02],
       [8.31954239e+02],
       [1.12771727e+02],
       [2.03356232e-01],
       [8.54373681e+02],
       [1.58586264e+02],
       [9.42275205e+02],
       [7.42576705e+01],
       [1.03672545e+01],
       [4.99068690e+01],
       [6.64676317e+02]])

# Define kernels
kernels = [GPy.kern.Matern52(input_dim=2, ARD=True), GPy.kern.Matern52(input_dim=2, ARD=True), GPy.kern.Matern52(input_dim=2, ARD=True)]
lin_mf_kernel = emukit.multi_fidelity.kernels.LinearMultiFidelityKernel(kernels)

# Define Multi-fidelity model
gpy_lin_mf_model = GPyLinearMultiFidelityModel(X_train, Y_train, lin_mf_kernel, n_fidelities=3)

# Fix noise
gpy_lin_mf_model.mixed_noise.Gaussian_noise.fix(0)
gpy_lin_mf_model.mixed_noise.Gaussian_noise_1.fix(0)
gpy_lin_mf_model.mixed_noise.Gaussian_noise_2.fix(0)

# Wrap the model using the given 'GPyMultiOutputWrapper'
lin_mf_model = model = GPyMultiOutputWrapper(gpy_lin_mf_model, n_outputs=3, n_optimization_restarts=20)

## Fit the model
lin_mf_model.optimize()

# Define parameter space
parameter_space = ParameterSpace([ContinuousParameter('x1', 0, 1), ContinuousParameter('x2', 0, 1), DiscreteParameter('fid', [0, 1, 2])])

# Define acquisition function
ei_acquisition = ExpectedImprovement(model = lin_mf_model)

optimizer = GradientAcquisitionOptimizer(parameter_space, num_anchor=1)

new_x, _  = optimizer.optimize(ei_acquisition, context={'fid': 2})

Sorry for the messy training data; I thought it would be easiest to just copy my current training data.
I hope my fix is at least roughly correct for 2-dimensional test problems. I am happy for any input on this matter.
Happy new year and best regards,
Stefan Tönnis

As a quick edit, this is the error message:


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
ValueError: too many axes: 2 (effrank=2), expected rank=1

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[21], line 19
     15             return np.min(self.model.Y[(Problem.n_fidelities-1)*10*Problem.dim:], axis=0)
     17 log_cei_acquisition = LogAcquisition(CustomExpectedImprovement(model = lin_mf_model))
---> 19 new_x, _  = optimizer.optimize(Log_ei_acquisition, context={'fid': 2})
     20 new_xc, _ = optimizer.optimize(log_cei_acquisition, context={'fid': 2})
     22 acq_val = log_cei_acquisition.evaluate(X_plot_h)

File c:\Users\StefanT\.conda\envs\ML\Lib\site-packages\emukit\core\optimization\acquisition_optimizer.py:53, in AcquisitionOptimizerBase.optimize(self, acquisition, context)
     50     context = dict()
     51 context_manager = ContextManager(self.space, context)
---> 53 max_x, max_value = self._optimize(acquisition, context_manager)
     55 # Optimization might not match any encoding exactly
     56 # Rounding operation here finds the closest encoding
     57 rounded_max_x = self.space.round(max_x)

File c:\Users\StefanT\.conda\envs\ML\Lib\site-packages\emukit\core\optimization\gradient_acquisition_optimizer.py:72, in GradientAcquisitionOptimizer._optimize(self, acquisition, context_manager)
     70 optimized_points = []
     71 for a in anchor_points:
---> 72     optimized_point = apply_optimizer(
     73         optimizer, a, space=self.space, f=f, df=None, f_df=f_df, context_manager=context_manager
     74     )
     75     optimized_points.append(optimized_point)
     77 x_min, fx_min = min(optimized_points, key=lambda t: t[1])

File c:\Users\StefanT\.conda\envs\ML\Lib\site-packages\emukit\core\optimization\optimizer.py:134, in apply_optimizer(optimizer, x0, space, f, df, f_df, context_manager)
    130 else:
    131     f_df_no_context = problem.f_df_no_context
--> 134 optimized_x, _ = optimizer.optimize(problem.x0_no_context, f_no_context, df_no_context, f_df_no_context)
    136 # Add context and round according to the type of variables of the design space
    137 suggested_x_with_context = add_context(optimized_x)

File c:\Users\StefanT\.conda\envs\ML\Lib\site-packages\emukit\core\optimization\optimizer.py:73, in OptLbfgs.optimize(self, x0, f, df, f_df)
     69     res = scipy.optimize.fmin_l_bfgs_b(
     70         f, x0=x0, bounds=self.bounds, approx_grad=True, maxiter=self.max_iterations
     71     )
     72 else:
---> 73     res = scipy.optimize.fmin_l_bfgs_b(_f_df, x0=x0, bounds=self.bounds, maxiter=self.max_iterations)
     75 # We check here if the the optimizer moved. It it didn't we report x0 and f(x0) as scipy can return NaNs
     76 if res[2]["task"] == b"ABNORMAL_TERMINATION_IN_LNSRCH":

File c:\Users\StefanT\.conda\envs\ML\Lib\site-packages\scipy\optimize\_lbfgsb_py.py:199, in fmin_l_bfgs_b(func, x0, fprime, args, approx_grad, bounds, m, factr, pgtol, epsilon, iprint, maxfun, maxiter, disp, callback, maxls)
    187 callback = _wrap_callback(callback)
    188 opts = {'disp': disp,
    189         'iprint': iprint,
    190         'maxcor': m,
    (...)
    196         'callback': callback,
    197         'maxls': maxls}
--> 199 res = _minimize_lbfgsb(fun, x0, args=args, jac=jac, bounds=bounds,
    200                        **opts)
    201 d = {'grad': res['jac'],
    202      'task': res['message'],
    203      'funcalls': res['nfev'],
    204      'nit': res['nit'],
    205      'warnflag': res['status']}
    206 f = res['fun']

File c:\Users\StefanT\.conda\envs\ML\Lib\site-packages\scipy\optimize\_lbfgsb_py.py:360, in _minimize_lbfgsb(fun, x0, args, jac, bounds, disp, maxcor, ftol, gtol, eps, maxfun, maxiter, iprint, callback, maxls, finite_diff_rel_step, **unknown_options)
    358 g = g.astype(np.float64)
    359 # x, f, g, wa, iwa, task, csave, lsave, isave, dsave = \
--> 360 _lbfgsb.setulb(m, x, low_bnd, upper_bnd, nbd, f, g, factr,
    361                pgtol, wa, iwa, task, iprint, csave, lsave,
    362                isave, dsave, maxls)
    363 task_str = task.tobytes()
    364 if task_str.startswith(b'FG'):
    365     # The minimization routine wants f and g at the current x.
    366     # Note that interruptions due to maxfun are postponed
    367     # until the completion of the current minimization iteration.
    368     # Overwrite f and g:

ValueError: failed in converting 7th argument `g' of _lbfgsb.setulb to C/Fortran array

The function call is a bit different, and I used the log expected improvement, but the problem/error is the same.
