Gaussian processes with explicit basis functions.
For details on the model, see Rasmussen and Williams, "Gaussian Processes for Machine Learning", Section 2.7 "Incorporating Explicit Basis Functions", in particular Equation 2.45 on page 29.
The basis coefficients are integrated out analytically. Optionally, the kernel amplitude sigma^2 can also be integrated out (see, e.g., "Bayesian emulation of complex multi-output and dynamic computer models", Conti and O'Hagan, 2009).
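Concretely, the assumed model (a sketch in the notation of GPML Section 2.7) is

    y(x) = h(x)^T beta + f(x),    f ~ GP(0, sigma^2 k(x, x'))

where h(x) collects the basis functions (e.g. constant and linear terms), beta is given a vague prior and integrated out analytically, and sigma^2 can optionally be integrated out as well.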
Run the following command from the top directory:
python setup.py install
import numpy as np

from basisgp.emulator import Emulator
from basisgp.basis import Linear, Constant
from basisgp.kernel import RBF

# toy data for illustration; replace with your own inputs X (n, d) and outputs Y (n,)
X = np.random.rand(20, 2)
Y = np.sin(X[:, 0]) + X[:, 1]
model = Emulator(kernel = RBF(dim = X.shape[1]), basis = Linear)
model.set_data(X, Y)
model.optimize(nugget = None, restarts = 10, mucm = True) # mucm = True -> integrate out sigma^2
# mucm = False -> explicitly optimize sigma
# nugget = 1e-4 -> fix nugget to 1e-4
# covariance is defined as `sigma^2 ( k(x,x) + nugget^2 I )`
print("noise stdev:", model.unscale(model.sigma * model.nugget, stdev = True))
# predict at test inputs
Xnew = np.random.rand(10, 2)
mean, stdev = model.posterior(Xnew, False) # pointwise stdev
mean, cov = model.posterior(Xnew, True) # full covariance matrix
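A constant-mean emulator can be built the same way with the Constant basis imported above (a sketch, assuming Constant follows the same interface as Linear):

model_const = Emulator(kernel = RBF(dim = X.shape[1]), basis = Constant)
model_const.set_data(X, Y)
model_const.optimize(nugget = None, restarts = 10, mucm = True)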
Notes:
- outputs are internally centred and scaled
- the noise stdev (in scaled units) is sigma * nugget
- the covariance is defined as
      sigma^2 ( k(x,x) + nugget^2 I )
  so that sigma can be integrated out
- it may be necessary to fix the nugget to a very small value for numerical stability, even when observations are noiseless
- 09/04/2021: a fairly recent issue in JAX (which may now be fixed) can cause problems; it is solvable by installing an earlier version of JAX. See here: google-research/long-range-arena#7
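For example, to fix a small nugget rather than optimise it (a sketch using the nugget argument shown above; 1e-6 is an arbitrary small choice):

model.optimize(nugget = 1e-6, restarts = 10, mucm = True) # nugget fixed for numerical stability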