Forecasting Mauna Loa CO2 dataset with `SpectralMixtureKernel` #1825

patel-zeel · 2021-11-16T09:20:06Z

I was reading the paper "Gaussian Process Kernels for Pattern Discovery and Extrapolation", whose kernel is implemented as `SpectralMixtureKernel` in gpytorch. I am trying to fit the Mauna Loa dataset as shown in Figure 1a of the paper, but this is the fit I am getting so far:

[Image: my fit]
[Image: Figure 1a from the paper]

I believe I am missing some steps needed to reproduce Figure 1a exactly. Can someone help me with this?

Attaching the code I have tried for this:

import numpy as np
import pandas as pd
import torch
import gpytorch
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

# Config
n_restarts = 10
n_iter = 100

# Load dataset
# data = pd.read_csv(
#     "https://raw.githubusercontent.com/patel-zeel/Adhoc/master/data/co2.csv"
# )
data = pd.read_csv("co2.csv")
print("Data is loaded")

# Train test split
X = data["0"].iloc[:290].values.reshape(-1, 1)
X_test = data["0"].iloc[290:].values.reshape(-1, 1)
y = data["1"].iloc[:290].values
y_test = data["1"].iloc[290:].values

# Scaling the dataset
Xscaler = StandardScaler()
X = Xscaler.fit_transform(X)
X_test = Xscaler.transform(X_test)

yscaler = StandardScaler()
y = yscaler.fit_transform(y.reshape(-1, 1)).ravel()
y_test = yscaler.transform(y_test.reshape(-1, 1)).ravel()

# convert to torch tensors
X = torch.tensor(X).to(torch.float32)
X_test = torch.tensor(X_test).to(torch.float32)
y = torch.tensor(y).to(torch.float32)
y_test = torch.tensor(y_test).to(torch.float32)


# Spectral mixture model from gpytorch
class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.SpectralMixtureKernel(num_mixtures=10)

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)


# Defining the model and optimizer
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(X, y, likelihood)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

# Training the model
model.train()
likelihood.train()
best_loss = np.inf
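for restart in range(n_restarts):
    torch.manual_seed(restart)
    # Initializing the model
    for param in model.parameters():
        torch.nn.init.normal_(param, mean=0, std=0.1)
    # Fresh optimizer so Adam's moment estimates do not carry over between restarts
    optimizer = torch.optim.Adam(model.parameters(), lr=0.1)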

    for i in range(n_iter):
        optimizer.zero_grad()
        output = model(X)
        loss = -mll(output, y)
        # print(loss.item())
        loss.backward()
        optimizer.step()

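    if loss.item() < best_loss:
        best_loss = loss.item()
        # state_dict() tensors share memory with the parameters, so clone them;
        # otherwise later restarts silently overwrite the saved "best" state
        best_state = {k: v.clone() for k, v in model.state_dict().items()}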
    print("restart", restart, "loss", loss.item(), "best_loss", best_loss)

# Activate best model state
model.load_state_dict(best_state)

# test the model
model.eval()
likelihood.eval()
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    preds = likelihood(model(X_test)).mean.cpu().numpy()

# plot the data

plt.plot(X, y, label="train")
plt.plot(X_test, y_test, label="test")
plt.plot(X_test, preds, label="prediction")
plt.legend()
plt.show()
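
For reference, the one-dimensional spectral mixture kernel from the paper is a weighted sum of Gaussian-windowed cosines, k(tau) = sum_q w_q * exp(-2 * pi^2 * tau^2 * sigma_q^2) * cos(2 * pi * tau * mu_q). The sketch below evaluates that expression in plain Python. It is illustrative only and is not gpytorch's implementation; it assumes the scales are standard deviations in the frequency domain (the paper writes the same term with variances v_q = sigma_q^2), and the parameter values are made up.

```python
import math


def sm_kernel(tau, weights, means, scales):
    """1-D spectral mixture kernel evaluated at lag tau = x - x'."""
    return sum(
        w * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * s ** 2)
        * math.cos(2.0 * math.pi * tau * m)
        for w, m, s in zip(weights, means, scales)
    )


# Made-up parameters: a slowly decaying component at zero frequency and a
# component with one cycle per unit of x (a yearly cycle if x is in years).
weights = [1.0, 0.5]
means = [0.0, 1.0]      # frequencies, in cycles per unit of x
scales = [0.05, 0.01]   # assumed frequency-domain standard deviations

k_zero = sm_kernel(0.0, weights, means, scales)  # lag 0: equals sum(weights)
k_year = sm_kernel(1.0, weights, means, scales)  # one full period apart
k_half = sm_kernel(0.5, weights, means, scales)  # half a period apart
```

A component can only carry a seasonal pattern forward if its mean frequency sits near the dominant frequency of the data, which is one reason the initial values of the mixture means matter so much for this dataset.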

Answered by wjmaddox

wjmaddox · 2021-11-16T13:49:39Z

Collaborator

Ugh, this is pretty annoying because the optimization is so unstable; see Appendix D of this work. I was ultimately able to reproduce the results (though inconsistently, and I'm not sure why) with the following changes:

- switched everything to double precision
- used 4 mixtures (which is what Andrew originally used; see the code here)
- called the empirical spectrum initialization of the SM kernel, `self.covar_module.initialize_from_data_empspect(train_x, train_y)`, when constructing the model (also important)
- used a second-order optimizer with 10 random restarts (also used to some extent by Andrew)

from botorch import fit_gpytorch_model
fit_gpytorch_model(mll, max_retries=10);
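
To give a rough feel for why an empirical-spectrum initialization helps: the power spectrum of the training targets already shows where the dominant frequencies are, so mixture means can start near them instead of at arbitrary values. The sketch below is only a plain-Python illustration of that idea and is not what `initialize_from_data_empspect` does internally; the `periodogram` helper and the synthetic series are hypothetical.

```python
import cmath
import math


def periodogram(y):
    """Naive O(n^2) DFT power spectrum of a mean-centred sequence.

    Returns non-negative frequencies (cycles per sample) and their power.
    """
    n = len(y)
    mean = sum(y) / n
    centred = [v - mean for v in y]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centred)
        )
        freqs.append(k / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


# Synthetic "monthly" series: a mild linear trend plus a 12-sample cycle.
n = 120
series = [0.01 * t + math.sin(2.0 * math.pi * t / 12.0) for t in range(n)]

freqs, power = periodogram(series)
# Ignore the zero-frequency bin, which is empty after mean-centring.
peak_index = max(range(1, len(power)), key=power.__getitem__)
peak_freq = freqs[peak_index]  # cycles per sample of the strongest component
```

Here `peak_freq` lands on the seasonal frequency of one cycle per 12 samples, which is the kind of value a mixture mean would need to start close to.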

3 replies

patel-zeel Nov 17, 2021
Author

Thank you for the detailed clarification @wjmaddox. I applied the following changes but was still not able to reproduce the results:

- Converted everything to double
- Used 4 mixtures
- Used `self.covar_module.initialize_from_data_empspect(train_x, train_y)`
- Used `fit_gpytorch_model`

I am attaching the revised code here. Can you please help me with this or share your code if possible?

import numpy as np
import pandas as pd
import torch
import gpytorch
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
from botorch import fit_gpytorch_model

# Config
n_restarts = 1
n_iter = 100

# Load dataset
# data = pd.read_csv(
#     "https://raw.githubusercontent.com/patel-zeel/Adhoc/master/data/co2.csv"
# )
data = pd.read_csv("co2.csv")
print("Data is loaded")

# Train test split
X = data["0"].iloc[:290].values.reshape(-1, 1)
X_test = data["0"].iloc[290:].values.reshape(-1, 1)
y = data["1"].iloc[:290].values
y_test = data["1"].iloc[290:].values

# Scaling the dataset
Xscaler = StandardScaler()
X = Xscaler.fit_transform(X)
X_test = Xscaler.transform(X_test)

yscaler = StandardScaler()
y = yscaler.fit_transform(y.reshape(-1, 1)).ravel()
y_test = yscaler.transform(y_test.reshape(-1, 1)).ravel()

# Convert to torch doubles
X = torch.tensor(X).double()
X_test = torch.tensor(X_test).double()
y = torch.tensor(y).double()
y_test = torch.tensor(y_test).double()

# Spectral mixture model from gpytorch
class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.SpectralMixtureKernel(num_mixtures=4)

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)


# Defining the model and optimizer
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(X, y, likelihood)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
model = model.double()

# Initialize the model covariance hyperparameters
model.covar_module.initialize_from_data_empspect(X, y)

# Train the model
model.train()
likelihood.train()
fit_gpytorch_model(mll, max_retries=50)

# Test the model
model.eval()
likelihood.eval()
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    preds = likelihood(model(X_test)).mean.cpu().numpy()

# Plot the data
plt.figure(figsize=(6, 4))
plt.plot(X, y, label="train", color="blue")
plt.plot(X_test, y_test, label="test", color="lawngreen")
plt.plot(X_test, preds, label="prediction", color="black")
plt.legend()
plt.show()

wjmaddox Nov 17, 2021
Collaborator

Yeah, this is what I used (again, it was pretty inconsistent; I got the result in maybe one out of three or four tries):

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.SpectralMixtureKernel(num_mixtures=4)
        self.covar_module.initialize_from_data_empspect(train_x.view(-1), train_y.view(-1))
        # self.covar_module.initialize_from_data(train_x.view(-1), train_y.view(-1))

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

from gpytorch.constraints import GreaterThan

likelihood = gpytorch.likelihoods.GaussianLikelihood(
    noise_constraint=GreaterThan(
        1e-4,
        transform=None,
        initial_value=0.1,
    ),
)

# Defining the model and optimizer
model = ExactGPModel(X.double(), y.double(), likelihood.double()).double()
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

from botorch import fit_gpytorch_model
fit_gpytorch_model(mll, max_retries=10);

wjmaddox Nov 17, 2021
Collaborator

I suspect that paying more attention to initialization would help but I'm not entirely sure how / why.
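
Since only some runs reach the desired fit, one possible way to organize the experiment is to run several fully independent fits, each from a fresh model and a fresh initialization, and keep the run with the lowest final loss. The sketch below shows only that bookkeeping. `best_of_restarts` and `fit_once` are hypothetical names; in this thread `fit_once` would build a new `ExactGPModel`, fit it, and return the negative marginal log likelihood together with a copy of the model state. The toy objective merely stands in for the real fit.

```python
import random


def best_of_restarts(fit_once, n_restarts):
    """Call fit_once(seed) n_restarts times; return (loss, state, seed) of the best run."""
    best = None
    for seed in range(n_restarts):
        loss, state = fit_once(seed)
        if best is None or loss < best[0]:
            best = (loss, state, seed)
    return best


def toy_loss(x):
    """Multimodal stand-in loss: global minimum near x = -1, local one near x = 1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def toy_fit(seed, steps=500, lr=0.01):
    """Stand-in for a real fit: gradient descent from a seed-dependent start."""
    rng = random.Random(seed)
    x = rng.uniform(-2.0, 2.0)
    for _ in range(steps):
        grad = 4.0 * x * (x * x - 1.0) + 0.3
        x -= lr * grad
    return toy_loss(x), {"x": x}


best_loss, best_state, best_seed = best_of_restarts(toy_fit, n_restarts=8)
```

With a real model, the returned state should be a copy (for example `copy.deepcopy(model.state_dict())`), because the tensors in a `state_dict()` share memory with the live parameters.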
