Project author: wesselb

Project description:
Gaussian process modelling in Python
Language: Python
Repository: git://github.com/wesselb/stheno.git
Created: 2017-06-13T11:15:22Z
Project page: https://github.com/wesselb/stheno

License: MIT License



Stheno


Stheno is an implementation of Gaussian process modelling in Python. See
also Stheno.jl.

Check out our post about linear models with Stheno and JAX.


Nonlinear Regression in 20 Seconds

  >>> import numpy as np
  >>> from stheno import GP, EQ
  >>> x = np.linspace(0, 2, 10)           # Some points to condition on
  >>> y = x ** 2                          # Some observations
  >>> f = GP(EQ())                        # Construct Gaussian process.
  >>> f_post = f | (f(x), y)              # Compute the posterior.
  >>> pred = f_post(np.array([1, 2, 3]))  # Predict!
  >>> pred.mean
  <dense matrix: shape=3x1, dtype=float64
   mat=[[1.   ]
        [4.   ]
        [8.483]]>
  >>> pred.var
  <dense matrix: shape=3x3, dtype=float64
   mat=[[ 8.032e-13  7.772e-16 -4.577e-09]
        [ 7.772e-16  9.999e-13  2.773e-10]
        [-4.577e-09  2.773e-10  3.313e-03]]>

These custom matrix types are there to accelerate the underlying linear algebra.
To get vanilla NumPy/AutoGrad/TensorFlow/PyTorch/JAX arrays, use B.dense:

  >>> from lab import B
  >>> B.dense(pred.mean)
  array([[1.00000068],
         [3.99999999],
         [8.4825932 ]])
  >>> B.dense(pred.var)
  array([[ 8.03246358e-13,  7.77156117e-16, -4.57690943e-09],
         [ 7.77156117e-16,  9.99866856e-13,  2.77333267e-10],
         [-4.57690943e-09,  2.77333267e-10,  3.31283378e-03]])

Moar?! Then read on!

Installation

  pip install stheno
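
The automatic-differentiation backends listed in the manual below (AutoGrad, TensorFlow, PyTorch, and JAX) are assumed to be installed separately; for example, to install Stheno together with the PyTorch backend:

  pip install stheno torch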

Manual

Note: here is a nicely rendered and more
readable version of the docs.

AutoGrad, TensorFlow, PyTorch, or JAX? Your Choice!

  from stheno.autograd import GP, EQ

  from stheno.tensorflow import GP, EQ

  from stheno.torch import GP, EQ

  from stheno.jax import GP, EQ
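
Whichever backend you import from, the modelling API is identical; only the array types change. As a minimal sketch, assuming PyTorch is installed, the regression example from the top of this README carries over directly:

  import lab as B
  import torch

  from stheno.torch import GP, EQ

  # Increase regularisation because PyTorch defaults to 32-bit floats.
  B.epsilon = 1e-6

  x = torch.linspace(0, 2, 10)                  # Inputs are now PyTorch tensors.
  y = x ** 2
  f = GP(EQ())                                  # Same model code as before.
  f_post = f | (f(x, 0.1), y)                   # Condition, with a little noise.
  pred = f_post(torch.tensor([1.0, 2.0, 3.0]))  # Predict.
  print(B.dense(pred.mean))                     # A regular PyTorch tensor.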

Model Design

The basic building block is a Gaussian process f = GP(mean=0, kernel, measure=prior), which
takes in a mean, a kernel, and a measure.
The mean and kernel of a GP can be extracted with f.mean and f.kernel.
The measure should be thought of as a big joint distribution that assigns a mean and
a kernel to every variable f.
A measure can be created with prior = Measure().
A GP f can have different means and kernels under different measures.
For example, under some prior measure, f can have an EQ() kernel; but, under some
posterior measure, f has a kernel that is determined by the posterior distribution
of a GP.
We will see later how posterior measures can be constructed.
The measure with which a f = GP(kernel, measure=prior) is constructed can be
extracted with f.measure == prior.
If the keyword argument measure is not set, then automatically a new measure is
created, which afterwards can be extracted with f.measure.

Definition, where prior = Measure():

  f = GP(kernel)
  f = GP(mean, kernel)
  f = GP(kernel, measure=prior)
  f = GP(mean, kernel, measure=prior)

GPs that are associated to the same measure can be combined into new GPs, which is
the primary mechanism used to build cool models.

Here’s an example model:

  >>> prior = Measure()
  >>> f1 = GP(lambda x: x ** 2, EQ(), measure=prior)
  >>> f1
  GP(<lambda>, EQ())
  >>> f2 = GP(Linear(), measure=prior)
  >>> f2
  GP(0, Linear())
  >>> f_sum = f1 + f2
  >>> f_sum
  GP(<lambda>, EQ() + Linear())
  >>> f_sum + GP(EQ())  # Not valid: `GP(EQ())` belongs to a new measure!
  AssertionError: Processes GP(<lambda>, EQ() + Linear()) and GP(0, EQ()) are associated to different measures.

To avoid setting the keyword measure for every GP that you create, you can enter
a measure as a context:

  >>> with Measure() as prior:
  ...     f1 = GP(lambda x: x ** 2, EQ())
  ...     f2 = GP(Linear())
  ...     f_sum = f1 + f2
  >>> prior == f1.measure == f2.measure == f_sum.measure
  True
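
As noted above, the mean, kernel, and measure of a GP can be read off directly. A minimal sketch, using only the attributes described in this section:

  prior = Measure()
  f = GP(lambda x: x ** 2, EQ(), measure=prior)
  f.mean              # The mean of `f`.
  f.kernel            # The kernel of `f`: `EQ()`.
  f.measure == prior  # `True`: the measure that `f` was constructed with.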

Compositional Design

  • Add and subtract GPs and other objects.

    Example:

    >>> GP(EQ(), measure=prior) + GP(Exp(), measure=prior)
    GP(0, EQ() + Exp())
    >>> GP(EQ(), measure=prior) + GP(EQ(), measure=prior)
    GP(0, 2 * EQ())
    >>> GP(EQ()) + 1
    GP(1, EQ())
    >>> GP(EQ()) + 0
    GP(0, EQ())
    >>> GP(EQ()) + (lambda x: x ** 2)
    GP(<lambda>, EQ())
    >>> GP(2, EQ(), measure=prior) - GP(1, EQ(), measure=prior)
    GP(1, 2 * EQ())
  • Multiply GPs and other objects.

    Warning:
    The product of two GPs is not a Gaussian process.
    Stheno approximates the resulting process by moment matching.

    Example:

    >>> GP(1, EQ(), measure=prior) * GP(1, Exp(), measure=prior)
    GP(<lambda> + <lambda> + -1 * 1, <lambda> * Exp() + <lambda> * EQ() + EQ() * Exp())
    >>> 2 * GP(EQ())
    GP(2, 2 * EQ())
    >>> 0 * GP(EQ())
    GP(0, 0)
    >>> (lambda x: x) * GP(EQ())
    GP(0, <lambda> * EQ())
  • Shift GPs.

    Example:

    >>> GP(EQ()).shift(1)
    GP(0, EQ() shift 1)
  • Stretch GPs.

    Example:

    >>> GP(EQ()).stretch(2)
    GP(0, EQ() > 2)
  • Select particular input dimensions.

    Example:

    >>> GP(EQ()).select(1, 3)
    GP(0, EQ() : [1, 3])
  • Transform the input.

    Example:

    >>> GP(EQ()).transform(f)
    GP(0, EQ() transform f)
  • Numerically take the derivative of a GP.
    The argument specifies which dimension to take the derivative with respect
    to.

    Example:

    >>> GP(EQ()).diff(1)
    GP(0, d(1) EQ())
  • Construct a finite difference estimate of the derivative of a GP.
    See Measure.diff_approx for a description of the arguments.

    Example:

    >>> GP(EQ()).diff_approx(deriv=1, order=2)
    GP(50000000.0 * (0.5 * EQ() + 0.5 * ((-0.5 * (EQ() shift (0.0001414213562373095, 0))) shift (0, -0.0001414213562373095)) + 0.5 * ((-0.5 * (EQ() shift (0, 0.0001414213562373095))) shift (-0.0001414213562373095, 0))), 0)
  • Construct the Cartesian product of a collection of GPs.

    Example:

    >>> prior = Measure()
    >>> f1, f2 = GP(EQ(), measure=prior), GP(EQ(), measure=prior)
    >>> cross(f1, f2)
    GP(MultiOutputMean(0, 0), MultiOutputKernel(EQ(), EQ()))

Displaying GPs

GPs have a display method that accepts a formatter.

Example:

  >>> print(GP(2.12345 * EQ()).display(lambda x: f"{x:.2f}"))
  GP(2.12 * EQ(), 0)

Properties of GPs

Properties of kernels
can be queried on GPs directly.

Example:

  >>> GP(EQ()).stationary
  True

Naming GPs

It is possible to give a name to a GP.
Names must be strings.
A measure then behaves like a two-way dictionary between GPs and their names.

Example:

  >>> prior = Measure()
  >>> p = GP(EQ(), name="name", measure=prior)
  >>> p.name
  'name'
  >>> p.name = "alternative_name"
  >>> prior["alternative_name"]
  GP(0, EQ())
  >>> prior[p]
  'alternative_name'

Finite-Dimensional Distributions

Simply call a GP to construct a finite-dimensional distribution at some inputs.
You can give a second argument, which specifies the variance of additional additive
noise.
After constructing a finite-dimensional distribution, you can compute the mean,
the variance, sample, or compute a logpdf.

Definition, where f is a GP:

  f(x)         # No additional noise
  f(x, noise)  # Additional noise with variance `noise`

Things you can do with a finite-dimensional distribution:

  • Use f(x).mean to compute the mean.

  • Use f(x).var to compute the variance.

  • Use f(x).mean_var to simultaneously compute the mean and variance.
    This can be substantially more efficient than first calling f(x).mean and then
    f(x).var. A short sketch follows the example at the end of this list.

  • Use Normal.sample to sample.

    Definition:

    f(x).sample()                # Produce one sample.
    f(x).sample(n)               # Produce `n` samples.
    f(x).sample(noise=noise)     # Produce one sample with additional noise variance `noise`.
    f(x).sample(n, noise=noise)  # Produce `n` samples with additional noise variance `noise`.
  • Use f(x).logpdf(y) to compute the logpdf of some data y.

  • Use means, variances = f(x).marginals() to efficiently compute the marginal means
    and marginal variances.

    Example:

    >>> f(x).marginals()
    (array([0., 0., 0.]), array([1., 1., 1.]))
  • Use means, lowers, uppers = f(x).marginal_credible_bounds() to efficiently compute
    the means and the marginal lower and upper 95% central credible region bounds.

    Example:

    >>> f(x).marginal_credible_bounds()
    (array([0., 0., 0.]), array([-1.96, -1.96, -1.96]), array([1.96, 1.96, 1.96]))
  • Use Measure.logpdf to compute the joint logpdf of multiple observations.

    Definition, where prior = Measure():

    prior.logpdf(f(x), y)
    prior.logpdf((f1(x1), y1), (f2(x2), y2), ...)
  • Use Measure.sample to jointly sample multiple observations.

    Definition, where prior = Measure():

    sample = prior.sample(f(x))
    sample1, sample2, ... = prior.sample(f1(x1), f2(x2), ...)

Example:

  >>> prior = Measure()
  >>> f = GP(EQ(), measure=prior)
  >>> x = np.array([0., 1., 2.])
  >>> f(x)  # FDD without noise.
  <FDD:
   process=GP(0, EQ()),
   input=array([0., 1., 2.]),
   noise=<zero matrix: shape=3x3, dtype=float64>>
  >>> f(x, 0.1)  # FDD with noise.
  <FDD:
   process=GP(0, EQ()),
   input=array([0., 1., 2.]),
   noise=<diagonal matrix: shape=3x3, dtype=float64
          diag=[0.1 0.1 0.1]>>
  >>> f(x).mean
  array([[0.],
         [0.],
         [0.]])
  >>> f(x).var
  <dense matrix: shape=3x3, dtype=float64
   mat=[[1.    0.607 0.135]
        [0.607 1.    0.607]
        [0.135 0.607 1.   ]]>
  >>> y1 = f(x).sample()
  >>> y1
  array([[-0.45172746],
         [ 0.46581948],
         [ 0.78929767]])
  >>> f(x).logpdf(y1)
  -2.811609567720761
  >>> y2 = f(x).sample(2)
  >>> y2
  array([[-0.43771276, -2.36741858],
         [ 0.86080043, -1.22503079],
         [ 2.15779126, -0.75319405]])
  >>> f(x).logpdf(y2)
  array([-4.82949038, -5.40084225])
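
Continuing the example above, here is a minimal sketch (without outputs) of the remaining operations from the list: the simultaneous mean/variance computation and the joint Measure.logpdf and Measure.sample.

  # Simultaneously compute the mean and variance; cheaper than two separate calls.
  mean, var = f(x).mean_var

  # Jointly sample two finite-dimensional distributions under the prior measure...
  x2 = np.array([3., 4.])
  s1, s2 = prior.sample(f(x), f(x2, 0.1))

  # ...and compute their joint logpdf.
  joint_logpdf = prior.logpdf((f(x), s1), (f(x2, 0.1), s2))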

Prior and Posterior Measures

Conditioning a prior measure on observations gives a posterior measure.
To condition a measure on observations, use Measure.__or__.

Definition, where prior = Measure() and f* are GPs:

  post = prior | (f(x, [noise]), y)
  post = prior | ((f1(x1, [noise1]), y1), (f2(x2, [noise2]), y2), ...)

You can then obtain a posterior process with post(f) and a finite-dimensional
distribution under the posterior with post(f(x)).
Alternatively, the posterior of a process f can be obtained by conditioning f
directly.

Definition, where f* are GPs:

  f_post = f | (f(x, [noise]), y)
  f_post = f | ((f1(x1, [noise1]), y1), (f2(x2, [noise2]), y2), ...)

Let’s consider an example.
First, build a model and sample some values.

  >>> prior = Measure()
  >>> f = GP(EQ(), measure=prior)
  >>> x = np.array([0., 1., 2.])
  >>> y = f(x).sample()

Then compute the posterior measure.

  >>> post = prior | (f(x), y)
  >>> post(f)
  GP(PosteriorMean(), PosteriorKernel())
  >>> post(f).mean(x)
  <dense matrix: shape=3x1, dtype=float64
   mat=[[ 0.412]
        [-0.811]
        [-0.933]]>
  >>> post(f).kernel(x)
  <dense matrix: shape=3x3, dtype=float64
   mat=[[1.e-12 0.e+00 0.e+00]
        [0.e+00 1.e-12 0.e+00]
        [0.e+00 0.e+00 1.e-12]]>
  >>> post(f(x))
  <FDD:
   process=GP(PosteriorMean(), PosteriorKernel()),
   input=array([0., 1., 2.]),
   noise=<zero matrix: shape=3x3, dtype=float64>>
  >>> post(f(x)).mean
  <dense matrix: shape=3x1, dtype=float64
   mat=[[ 0.412]
        [-0.811]
        [-0.933]]>
  >>> post(f(x)).var
  <dense matrix: shape=3x3, dtype=float64
   mat=[[1.e-12 0.e+00 0.e+00]
        [0.e+00 1.e-12 0.e+00]
        [0.e+00 0.e+00 1.e-12]]>

We can also obtain the posterior by conditioning f directly:

  >>> f_post = f | (f(x), y)
  >>> f_post
  GP(PosteriorMean(), PosteriorKernel())
  >>> f_post.mean(x)
  <dense matrix: shape=3x1, dtype=float64
   mat=[[ 0.412]
        [-0.811]
        [-0.933]]>
  >>> f_post.kernel(x)
  <dense matrix: shape=3x3, dtype=float64
   mat=[[1.e-12 0.e+00 0.e+00]
        [0.e+00 1.e-12 0.e+00]
        [0.e+00 0.e+00 1.e-12]]>
  >>> f_post(x)
  <FDD:
   process=GP(PosteriorMean(), PosteriorKernel()),
   input=array([0., 1., 2.]),
   noise=<zero matrix: shape=3x3, dtype=float64>>
  >>> f_post(x).mean
  <dense matrix: shape=3x1, dtype=float64
   mat=[[ 0.412]
        [-0.811]
        [-0.933]]>
  >>> f_post(x).var
  <dense matrix: shape=3x3, dtype=float64
   mat=[[1.e-12 0.e+00 0.e+00]
        [0.e+00 1.e-12 0.e+00]
        [0.e+00 0.e+00 1.e-12]]>

We can further extend our model by building on the posterior.

  >>> g = GP(Linear(), measure=post)
  >>> f_sum = post(f) + g
  >>> f_sum
  GP(PosteriorMean(), PosteriorKernel() + Linear())

However, what we cannot do is mix the prior and the posterior.

  >>> f + g
  AssertionError: Processes GP(0, EQ()) and GP(0, Linear()) are associated to different measures.

Inducing Points

Stheno supports pseudo-point approximations of posterior distributions with
various approximation methods:

  1. The Variational Free Energy (VFE;
    Titsias, 2009)
    approximation.
    To use the VFE approximation, use PseudoObs.

  2. The Fully Independent Training Conditional (FITC;
    Snelson & Ghahramani, 2006)
    approximation.
    To use the FITC approximation, use PseudoObsFITC.

  3. The Deterministic Training Conditional (DTC;
    Csato & Opper, 2002;
    Seeger et al., 2003)
    approximation.
    To use the DTC approximation, use PseudoObsDTC.

The VFE approximation (PseudoObs) is the recommended approximation.
The following definitions and examples will use the VFE approximation with PseudoObs,
but every instance of PseudoObs can be swapped out for PseudoObsFITC or
PseudoObsDTC.

Definition:

  obs = PseudoObs(
      u(z),                # FDD of inducing points
      (f(x, [noise]), y),  # Observed data
  )
  obs = PseudoObs(u(z), f(x, [noise]), y)
  obs = PseudoObs(u(z), (f1(x1, [noise1]), y1), (f2(x2, [noise2]), y2), ...)
  obs = PseudoObs((u1(z1), u2(z2), ...), f(x, [noise]), y)
  obs = PseudoObs((u1(z1), u2(z2), ...), (f1(x1, [noise1]), y1), (f2(x2, [noise2]), y2), ...)

The approximate posterior measure can be constructed with prior | obs
where prior = Measure() is the measure of your model.
To quantify the quality of the approximation, you can compute the ELBO with
obs.elbo(prior).

Let’s consider an example.
First, build a model and sample some noisy observations.

  >>> prior = Measure()
  >>> f = GP(EQ(), measure=prior)
  >>> x_obs = np.linspace(0, 10, 2000)
  >>> y_obs = f(x_obs, 1).sample()

Ouch, computing the logpdf is quite slow:

  >>> %timeit f(x_obs, 1).logpdf(y_obs)
  219 ms ± 35.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Let’s try to use inducing points to speed this up.

  >>> x_ind = np.linspace(0, 10, 100)
  >>> u = f(x_ind)  # FDD of inducing points.
  >>> %timeit PseudoObs(u, f(x_obs, 1), y_obs).elbo(prior)
  9.8 ms ± 181 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Much better.
And the approximation is good:

  >>> PseudoObs(u, f(x_obs, 1), y_obs).elbo(prior) - f(x_obs, 1).logpdf(y_obs)
  -3.537934389896691e-10

We finally construct the approximate posterior measure:

  >>> post_approx = prior | PseudoObs(u, f(x_obs, 1), y_obs)
  >>> post_approx(f(x_obs)).mean
  <dense matrix: shape=2000x1, dtype=float64
   mat=[[0.469]
        [0.468]
        [0.467]
        ...
        [1.09 ]
        [1.09 ]
        [1.091]]>

Kernels and Means

See MLKernels.
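
The kernels and means used throughout this README (EQ, Exp, Linear, Delta, RQ, and so on) are provided by MLKernels, and they compose much like GPs do. A minimal sketch, using only operations that already appear elsewhere in this document:

  import numpy as np
  from stheno import EQ, Exp, Linear

  # Sums, products, scaling, and stretching of kernels all yield new kernels.
  k = 2 * EQ().stretch(2) + 0.5 * Exp() * Linear()
  x = np.linspace(0, 5, 10)
  k(x)  # Evaluates to a 10x10 kernel matrix, as in the examples below.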

Batched Computation

Stheno supports batched computation.
See MLKernels for a description of how
means and kernels work with batched computation.

Example:

  >>> f = GP(EQ())
  >>> x = np.random.randn(16, 100, 1)
  >>> y = f(x, 1).sample()
  >>> logpdf = f(x, 1).logpdf(y)
  >>> y.shape
  (16, 100, 1)
  >>> f(x, 1).logpdf(y).shape
  (16,)

Important Remarks

Stheno uses LAB to provide an implementation that is
backend agnostic.
Moreover, Stheno uses an extension of LAB to
accelerate linear algebra with structured linear algebra primitives.
You will encounter these primitives:

  >>> k = 2 * Delta()
  >>> x = np.linspace(0, 5, 10)
  >>> k(x)
  <diagonal matrix: shape=10x10, dtype=float64
   diag=[2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]>

If you’re using LAB to further process these matrices,
then there is absolutely no need to worry:
these structured matrix types know how to add, multiply, and do other linear algebra
operations.

  >>> import lab as B
  >>> B.matmul(k(x), k(x))
  <diagonal matrix: shape=10x10, dtype=float64
   diag=[4. 4. 4. 4. 4. 4. 4. 4. 4. 4.]>

If you’re not using LAB, you can convert these
structured primitives to regular NumPy/TensorFlow/PyTorch/JAX arrays by calling
B.dense (B is from LAB):

  >>> import lab as B
  >>> B.dense(k(x))
  array([[2., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
         [0., 2., 0., 0., 0., 0., 0., 0., 0., 0.],
         [0., 0., 2., 0., 0., 0., 0., 0., 0., 0.],
         [0., 0., 0., 2., 0., 0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 2., 0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0., 2., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0., 0., 2., 0., 0., 0.],
         [0., 0., 0., 0., 0., 0., 0., 2., 0., 0.],
         [0., 0., 0., 0., 0., 0., 0., 0., 2., 0.],
         [0., 0., 0., 0., 0., 0., 0., 0., 0., 2.]])

Furthermore, before computing a Cholesky decomposition, Stheno always adds a minuscule
diagonal to prevent the Cholesky decomposition from failing when the matrix is not
positive definite due to numerical noise.
You can change the magnitude of this diagonal by changing B.epsilon:

  >>> import lab as B
  >>> B.epsilon = 1e-12  # Default regularisation
  >>> B.epsilon = 1e-8   # Strong regularisation

Examples

The examples make use of Varz and some
utilities from WBML.

Simple Regression

[Prediction plot]

  import matplotlib.pyplot as plt
  from wbml.plot import tweak

  from stheno import B, GP, EQ

  # Define points to predict at.
  x = B.linspace(0, 10, 100)
  x_obs = B.linspace(0, 7, 20)

  # Construct a prior.
  f = GP(EQ().periodic(5.0))

  # Sample a true, underlying function and noisy observations.
  f_true, y_obs = f.measure.sample(f(x), f(x_obs, 0.5))

  # Now condition on the observations to make predictions.
  f_post = f | (f(x_obs, 0.5), y_obs)
  mean, lower, upper = f_post(x).marginal_credible_bounds()

  # Plot result.
  plt.plot(x, f_true, label="True", style="test")
  plt.scatter(x_obs, y_obs, label="Observations", style="train", s=20)
  plt.plot(x, mean, label="Prediction", style="pred")
  plt.fill_between(x, lower, upper, style="pred")
  tweak()
  plt.savefig("readme_example1_simple_regression.png")
  plt.show()

Hyperparameter Optimisation with Varz

[Prediction plot]

  import lab as B
  import matplotlib.pyplot as plt
  import torch
  from varz import Vars, minimise_l_bfgs_b, parametrised, Positive
  from wbml.plot import tweak

  from stheno.torch import EQ, GP

  # Increase regularisation because PyTorch defaults to 32-bit floats.
  B.epsilon = 1e-6

  # Define points to predict at.
  x = torch.linspace(0, 2, 100)
  x_obs = torch.linspace(0, 2, 50)

  # Sample a true, underlying function and observations with observation noise `0.05`.
  f_true = torch.sin(5 * x)
  y_obs = torch.sin(5 * x_obs) + 0.05**0.5 * torch.randn(50)


  def model(vs):
      """Construct a model with learnable parameters."""
      p = vs.struct  # Varz handles positivity (and other) constraints.
      kernel = p.variance.positive() * EQ().stretch(p.scale.positive())
      return GP(kernel), p.noise.positive()


  @parametrised
  def model_alternative(vs, scale: Positive, variance: Positive, noise: Positive):
      """Equivalent to :func:`model`, but with `@parametrised`."""
      kernel = variance * EQ().stretch(scale)
      return GP(kernel), noise


  vs = Vars(torch.float32)
  f, noise = model(vs)

  # Condition on observations and make predictions before optimisation.
  f_post = f | (f(x_obs, noise), y_obs)
  prior_before = f, noise
  pred_before = f_post(x, noise).marginal_credible_bounds()


  def objective(vs):
      f, noise = model(vs)
      evidence = f(x_obs, noise).logpdf(y_obs)
      return -evidence


  # Learn hyperparameters.
  minimise_l_bfgs_b(objective, vs)
  f, noise = model(vs)

  # Condition on observations and make predictions after optimisation.
  f_post = f | (f(x_obs, noise), y_obs)
  prior_after = f, noise
  pred_after = f_post(x, noise).marginal_credible_bounds()


  def plot_prediction(prior, pred):
      f, noise = prior
      mean, lower, upper = pred
      plt.scatter(x_obs, y_obs, label="Observations", style="train", s=20)
      plt.plot(x, f_true, label="True", style="test")
      plt.plot(x, mean, label="Prediction", style="pred")
      plt.fill_between(x, lower, upper, style="pred")
      plt.ylim(-2, 2)
      plt.text(
          0.02,
          0.02,
          f"var = {f.kernel.factor(0):.2f}, "
          f"scale = {f.kernel.factor(1).stretches[0]:.2f}, "
          f"noise = {noise:.2f}",
          transform=plt.gca().transAxes,
      )
      tweak()


  # Plot result.
  plt.figure(figsize=(10, 4))
  plt.subplot(1, 2, 1)
  plt.title("Before optimisation")
  plot_prediction(prior_before, pred_before)
  plt.subplot(1, 2, 2)
  plt.title("After optimisation")
  plot_prediction(prior_after, pred_after)
  plt.savefig("readme_example12_optimisation_varz.png")
  plt.show()

Hyperparameter Optimisation with PyTorch

[Prediction plot]

  import lab as B
  import matplotlib.pyplot as plt
  import torch
  from wbml.plot import tweak

  from stheno.torch import EQ, GP

  # Increase regularisation because PyTorch defaults to 32-bit floats.
  B.epsilon = 1e-6

  # Define points to predict at.
  x = torch.linspace(0, 2, 100)
  x_obs = torch.linspace(0, 2, 50)

  # Sample a true, underlying function and observations with observation noise `0.05`.
  f_true = torch.sin(5 * x)
  y_obs = torch.sin(5 * x_obs) + 0.05**0.5 * torch.randn(50)


  class Model(torch.nn.Module):
      """A GP model with learnable parameters."""

      def __init__(self, init_var=0.3, init_scale=1.0, init_noise=0.2):
          super().__init__()
          # Ensure that the parameters are positive and make them learnable.
          self.log_var = torch.nn.Parameter(torch.log(torch.tensor(init_var)))
          self.log_scale = torch.nn.Parameter(torch.log(torch.tensor(init_scale)))
          self.log_noise = torch.nn.Parameter(torch.log(torch.tensor(init_noise)))

      def construct(self):
          self.var = torch.exp(self.log_var)
          self.scale = torch.exp(self.log_scale)
          self.noise = torch.exp(self.log_noise)
          kernel = self.var * EQ().stretch(self.scale)
          return GP(kernel), self.noise


  model = Model()
  f, noise = model.construct()

  # Condition on observations and make predictions before optimisation.
  f_post = f | (f(x_obs, noise), y_obs)
  prior_before = f, noise
  pred_before = f_post(x, noise).marginal_credible_bounds()

  # Perform optimisation.
  opt = torch.optim.Adam(model.parameters(), lr=5e-2)
  for _ in range(1000):
      opt.zero_grad()
      f, noise = model.construct()
      loss = -f(x_obs, noise).logpdf(y_obs)
      loss.backward()
      opt.step()

  f, noise = model.construct()

  # Condition on observations and make predictions after optimisation.
  f_post = f | (f(x_obs, noise), y_obs)
  prior_after = f, noise
  pred_after = f_post(x, noise).marginal_credible_bounds()


  def plot_prediction(prior, pred):
      f, noise = prior
      mean, lower, upper = pred
      plt.scatter(x_obs, y_obs, label="Observations", style="train", s=20)
      plt.plot(x, f_true, label="True", style="test")
      plt.plot(x, mean, label="Prediction", style="pred")
      plt.fill_between(x, lower, upper, style="pred")
      plt.ylim(-2, 2)
      plt.text(
          0.02,
          0.02,
          f"var = {f.kernel.factor(0):.2f}, "
          f"scale = {f.kernel.factor(1).stretches[0]:.2f}, "
          f"noise = {noise:.2f}",
          transform=plt.gca().transAxes,
      )
      tweak()


  # Plot result.
  plt.figure(figsize=(10, 4))
  plt.subplot(1, 2, 1)
  plt.title("Before optimisation")
  plot_prediction(prior_before, pred_before)
  plt.subplot(1, 2, 2)
  plt.title("After optimisation")
  plot_prediction(prior_after, pred_after)
  plt.savefig("readme_example13_optimisation_torch.png")
  plt.show()

Decomposition of Prediction

[Prediction plot]

  import matplotlib.pyplot as plt
  from wbml.plot import tweak

  from stheno import Measure, GP, EQ, RQ, Linear, Delta, Exp, B

  B.epsilon = 1e-10

  # Define points to predict at.
  x = B.linspace(0, 10, 200)
  x_obs = B.linspace(0, 7, 50)

  with Measure() as prior:
      # Construct a latent function consisting of four different components.
      f_smooth = GP(EQ())
      f_wiggly = GP(RQ(1e-1).stretch(0.5))
      f_periodic = GP(EQ().periodic(1.0))
      f_linear = GP(Linear())
      f = f_smooth + f_wiggly + f_periodic + 0.2 * f_linear

      # Let the observation noise consist of a bit of exponential noise.
      e_indep = GP(Delta())
      e_exp = GP(Exp())
      e = e_indep + 0.3 * e_exp

      # Sum the latent function and observation noise to get a model for the observations.
      y = f + 0.5 * e

  # Sample a true, underlying function and observations.
  (
      f_true_smooth,
      f_true_wiggly,
      f_true_periodic,
      f_true_linear,
      f_true,
      y_obs,
  ) = prior.sample(f_smooth(x), f_wiggly(x), f_periodic(x), f_linear(x), f(x), y(x_obs))

  # Now condition on the observations and make predictions for the latent function and
  # its various components.
  post = prior | (y(x_obs), y_obs)
  pred_smooth = post(f_smooth(x))
  pred_wiggly = post(f_wiggly(x))
  pred_periodic = post(f_periodic(x))
  pred_linear = post(f_linear(x))
  pred_f = post(f(x))


  # Plot results.
  def plot_prediction(x, f, pred, x_obs=None, y_obs=None):
      plt.plot(x, f, label="True", style="test")
      if x_obs is not None:
          plt.scatter(x_obs, y_obs, label="Observations", style="train", s=20)
      mean, lower, upper = pred.marginal_credible_bounds()
      plt.plot(x, mean, label="Prediction", style="pred")
      plt.fill_between(x, lower, upper, style="pred")
      tweak()


  plt.figure(figsize=(10, 6))

  plt.subplot(3, 1, 1)
  plt.title("Prediction")
  plot_prediction(x, f_true, pred_f, x_obs, y_obs)

  plt.subplot(3, 2, 3)
  plt.title("Smooth Component")
  plot_prediction(x, f_true_smooth, pred_smooth)

  plt.subplot(3, 2, 4)
  plt.title("Wiggly Component")
  plot_prediction(x, f_true_wiggly, pred_wiggly)

  plt.subplot(3, 2, 5)
  plt.title("Periodic Component")
  plot_prediction(x, f_true_periodic, pred_periodic)

  plt.subplot(3, 2, 6)
  plt.title("Linear Component")
  plot_prediction(x, f_true_linear, pred_linear)

  plt.savefig("readme_example2_decomposition.png")
  plt.show()

Learn a Function, Incorporating Prior Knowledge About Its Form

[Prediction plot]

  import matplotlib.pyplot as plt
  import tensorflow as tf
  import wbml.out as out
  from varz.spec import parametrised, Positive
  from varz.tensorflow import Vars, minimise_l_bfgs_b
  from wbml.plot import tweak

  from stheno.tensorflow import B, Measure, GP, EQ, Delta

  # Define points to predict at.
  x = B.linspace(tf.float64, 0, 5, 100)
  x_obs = B.linspace(tf.float64, 0, 3, 20)


  @parametrised
  def model(
      vs,
      u_var: Positive = 0.5,
      u_scale: Positive = 0.5,
      noise: Positive = 0.5,
      alpha: Positive = 1.2,
  ):
      with Measure():
          # Random fluctuation:
          u = GP(u_var * EQ().stretch(u_scale))
          # Construct model.
          f = u + (lambda x: x**alpha)
      return f, noise


  # Sample a true, underlying function and observations.
  vs = Vars(tf.float64)
  f_true = x**1.8 + B.sin(2 * B.pi * x)
  f, y = model(vs)
  post = f.measure | (f(x), f_true)
  y_obs = post(f(x_obs)).sample()


  def objective(vs):
      f, noise = model(vs)
      evidence = f(x_obs, noise).logpdf(y_obs)
      return -evidence


  # Learn hyperparameters.
  minimise_l_bfgs_b(objective, vs, jit=True)
  f, noise = model(vs)

  # Print the learned parameters.
  out.kv("Prior", f.display(out.format))
  vs.print()

  # Condition on the observations to make predictions.
  f_post = f | (f(x_obs, noise), y_obs)
  mean, lower, upper = f_post(x).marginal_credible_bounds()

  # Plot result.
  plt.plot(x, B.squeeze(f_true), label="True", style="test")
  plt.scatter(x_obs, B.squeeze(y_obs), label="Observations", style="train", s=20)
  plt.plot(x, mean, label="Prediction", style="pred")
  plt.fill_between(x, lower, upper, style="pred")
  tweak()
  plt.savefig("readme_example3_parametric.png")
  plt.show()

Multi-Output Regression

[Prediction plot]

  import matplotlib.pyplot as plt
  from wbml.plot import tweak

  from stheno import B, Measure, GP, EQ, Delta


  class VGP:
      """A vector-valued GP."""

      def __init__(self, ps):
          self.ps = ps

      def __add__(self, other):
          return VGP([f + g for f, g in zip(self.ps, other.ps)])

      def lmatmul(self, A):
          m, n = A.shape
          ps = [0 for _ in range(m)]
          for i in range(m):
              for j in range(n):
                  ps[i] += A[i, j] * self.ps[j]
          return VGP(ps)


  # Define points to predict at.
  x = B.linspace(0, 10, 100)
  x_obs = B.linspace(0, 10, 10)

  # Model parameters:
  m = 2
  p = 4
  H = B.randn(p, m)

  with Measure() as prior:
      # Construct latent functions.
      us = VGP([GP(EQ()) for _ in range(m)])

      # Construct multi-output prior.
      fs = us.lmatmul(H)

      # Construct noise.
      e = VGP([GP(0.5 * Delta()) for _ in range(p)])

      # Construct observation model.
      ys = e + fs

  # Sample a true, underlying function and observations.
  samples = prior.sample(*(p(x) for p in fs.ps), *(p(x_obs) for p in ys.ps))
  fs_true, ys_obs = samples[:p], samples[p:]

  # Compute the posterior and make predictions.
  post = prior.condition(*((p(x_obs), y_obs) for p, y_obs in zip(ys.ps, ys_obs)))
  preds = [post(p(x)) for p in fs.ps]


  # Plot results.
  def plot_prediction(x, f, pred, x_obs=None, y_obs=None):
      plt.plot(x, f, label="True", style="test")
      if x_obs is not None:
          plt.scatter(x_obs, y_obs, label="Observations", style="train", s=20)
      mean, lower, upper = pred.marginal_credible_bounds()
      plt.plot(x, mean, label="Prediction", style="pred")
      plt.fill_between(x, lower, upper, style="pred")
      tweak()


  plt.figure(figsize=(10, 6))
  for i in range(4):
      plt.subplot(2, 2, i + 1)
      plt.title(f"Output {i + 1}")
      plot_prediction(x, fs_true[i], preds[i], x_obs, ys_obs[i])
  plt.savefig("readme_example4_multi-output.png")
  plt.show()

Approximate Integration

[Prediction plot]

  import matplotlib.pyplot as plt
  import numpy as np
  import tensorflow as tf
  import wbml.plot

  from stheno.tensorflow import B, Measure, GP, EQ, Delta

  # Define points to predict at.
  x = B.linspace(tf.float64, 0, 10, 200)
  x_obs = B.linspace(tf.float64, 0, 10, 10)

  with Measure() as prior:
      # Construct a model.
      f = 0.7 * GP(EQ()).stretch(1.5)
      e = 0.2 * GP(Delta())

      # Construct derivatives.
      df = f.diff()
      ddf = df.diff()
      dddf = ddf.diff() + e

  # Fix the integration constants.
  zero = B.cast(tf.float64, 0)
  one = B.cast(tf.float64, 1)
  prior = prior | ((f(zero), one), (df(zero), zero), (ddf(zero), -one))

  # Sample observations.
  y_obs = B.sin(x_obs) + 0.2 * B.randn(*x_obs.shape)

  # Condition on the observations to make predictions.
  post = prior | (dddf(x_obs), y_obs)

  # And make predictions.
  pred_iiif = post(f)(x)
  pred_iif = post(df)(x)
  pred_if = post(ddf)(x)
  pred_f = post(dddf)(x)


  # Plot result.
  def plot_prediction(x, f, pred, x_obs=None, y_obs=None):
      plt.plot(x, f, label="True", style="test")
      if x_obs is not None:
          plt.scatter(x_obs, y_obs, label="Observations", style="train", s=20)
      mean, lower, upper = pred.marginal_credible_bounds()
      plt.plot(x, mean, label="Prediction", style="pred")
      plt.fill_between(x, lower, upper, style="pred")
      wbml.plot.tweak()


  plt.figure(figsize=(10, 6))

  plt.subplot(2, 2, 1)
  plt.title("Function")
  plot_prediction(x, np.sin(x), pred_f, x_obs=x_obs, y_obs=y_obs)

  plt.subplot(2, 2, 2)
  plt.title("Integral of Function")
  plot_prediction(x, -np.cos(x), pred_if)

  plt.subplot(2, 2, 3)
  plt.title("Second Integral of Function")
  plot_prediction(x, -np.sin(x), pred_iif)

  plt.subplot(2, 2, 4)
  plt.title("Third Integral of Function")
  plot_prediction(x, np.cos(x), pred_iiif)

  plt.savefig("readme_example5_integration.png")
  plt.show()

Bayesian Linear Regression

[Prediction plot]

  import matplotlib.pyplot as plt
  import wbml.out as out
  from wbml.plot import tweak

  from stheno import B, Measure, GP

  B.epsilon = 1e-10  # Very slightly regularise.

  # Define points to predict at.
  x = B.linspace(0, 10, 200)
  x_obs = B.linspace(0, 10, 10)

  with Measure() as prior:
      # Construct a linear model.
      slope = GP(1)
      intercept = GP(5)
      f = slope * (lambda x: x) + intercept

  # Sample a slope, intercept, underlying function, and observations.
  true_slope, true_intercept, f_true, y_obs = prior.sample(
      slope(0), intercept(0), f(x), f(x_obs, 0.2)
  )

  # Condition on the observations to make predictions.
  post = prior | (f(x_obs, 0.2), y_obs)
  mean, lower, upper = post(f(x)).marginal_credible_bounds()

  out.kv("True slope", true_slope[0, 0])
  out.kv("Predicted slope", post(slope(0)).mean[0, 0])
  out.kv("True intercept", true_intercept[0, 0])
  out.kv("Predicted intercept", post(intercept(0)).mean[0, 0])

  # Plot result.
  plt.plot(x, f_true, label="True", style="test")
  plt.scatter(x_obs, y_obs, label="Observations", style="train", s=20)
  plt.plot(x, mean, label="Prediction", style="pred")
  plt.fill_between(x, lower, upper, style="pred")
  tweak()
  plt.savefig("readme_example6_blr.png")
  plt.show()

GPAR

[Prediction plot]

  import matplotlib.pyplot as plt
  import numpy as np
  import tensorflow as tf
  from varz.spec import parametrised, Positive
  from varz.tensorflow import Vars, minimise_l_bfgs_b
  from wbml.plot import tweak

  from stheno.tensorflow import B, GP, EQ

  # Define points to predict at.
  x = B.linspace(tf.float64, 0, 10, 200)
  x_obs1 = B.linspace(tf.float64, 0, 10, 30)
  inds2 = np.random.permutation(len(x_obs1))[:10]
  x_obs2 = B.take(x_obs1, inds2)

  # Construct functions to predict and observations.
  f1_true = B.sin(x)
  f2_true = B.sin(x) ** 2
  y1_obs = B.sin(x_obs1) + 0.1 * B.randn(*x_obs1.shape)
  y2_obs = B.sin(x_obs2) ** 2 + 0.1 * B.randn(*x_obs2.shape)


  @parametrised
  def model(
      vs,
      var1: Positive = 1,
      scale1: Positive = 1,
      noise1: Positive = 0.1,
      var2: Positive = 1,
      scale2: Positive = 1,
      noise2: Positive = 0.1,
  ):
      # Build layers:
      f1 = GP(var1 * EQ().stretch(scale1))
      f2 = GP(var2 * EQ().stretch(scale2))
      return (f1, noise1), (f2, noise2)


  def objective(vs):
      (f1, noise1), (f2, noise2) = model(vs)
      x1 = x_obs1
      x2 = B.stack(x_obs2, B.take(y1_obs, inds2), axis=1)
      evidence = f1(x1, noise1).logpdf(y1_obs) + f2(x2, noise2).logpdf(y2_obs)
      return -evidence


  # Learn hyperparameters.
  vs = Vars(tf.float64)
  minimise_l_bfgs_b(objective, vs)

  # Compute posteriors.
  (f1, noise1), (f2, noise2) = model(vs)
  x1 = x_obs1
  x2 = B.stack(x_obs2, B.take(y1_obs, inds2), axis=1)
  f1_post = f1 | (f1(x1, noise1), y1_obs)
  f2_post = f2 | (f2(x2, noise2), y2_obs)

  # Predict first output.
  mean1, lower1, upper1 = f1_post(x).marginal_credible_bounds()

  # Predict second output with Monte Carlo.
  samples = [
      f2_post(B.stack(x, f1_post(x).sample()[:, 0], axis=1)).sample()[:, 0]
      for _ in range(100)
  ]
  mean2 = np.mean(samples, axis=0)
  lower2 = np.percentile(samples, 2.5, axis=0)
  upper2 = np.percentile(samples, 100 - 2.5, axis=0)

  # Plot result.
  plt.figure()

  plt.subplot(2, 1, 1)
  plt.title("Output 1")
  plt.plot(x, f1_true, label="True", style="test")
  plt.scatter(x_obs1, y1_obs, label="Observations", style="train", s=20)
  plt.plot(x, mean1, label="Prediction", style="pred")
  plt.fill_between(x, lower1, upper1, style="pred")
  tweak()

  plt.subplot(2, 1, 2)
  plt.title("Output 2")
  plt.plot(x, f2_true, label="True", style="test")
  plt.scatter(x_obs2, y2_obs, label="Observations", style="train", s=20)
  plt.plot(x, mean2, label="Prediction", style="pred")
  plt.fill_between(x, lower2, upper2, style="pred")
  tweak()

  plt.savefig("readme_example7_gpar.png")
  plt.show()

A GP-RNN Model

[Prediction plot]

  import matplotlib.pyplot as plt
  import numpy as np
  import tensorflow as tf
  from varz.spec import parametrised, Positive
  from varz.tensorflow import Vars, minimise_adam
  from wbml.net import rnn as rnn_constructor
  from wbml.plot import tweak

  from stheno.tensorflow import B, Measure, GP, EQ

  # Increase regularisation because we are dealing with `tf.float32`s.
  B.epsilon = 1e-6

  # Construct points to predict at.
  x = B.linspace(tf.float32, 0, 1, 100)[:, None]
  inds_obs = B.range(0, int(0.75 * len(x)))  # Train on the first 75% only.
  x_obs = B.take(x, inds_obs)

  # Construct function and observations.
  # Draw random modulation functions.
  a_true = GP(1e-2 * EQ().stretch(0.1))(x).sample()
  b_true = GP(1e-2 * EQ().stretch(0.1))(x).sample()
  # Construct the true, underlying function.
  f_true = (1 + a_true) * B.sin(2 * np.pi * 7 * x) + b_true
  # Add noise.
  y_true = f_true + 0.1 * B.randn(*f_true.shape)

  # Normalise and split.
  f_true = (f_true - B.mean(y_true)) / B.std(y_true)
  y_true = (y_true - B.mean(y_true)) / B.std(y_true)
  y_obs = B.take(y_true, inds_obs)


  @parametrised
  def model(vs, a_scale: Positive = 0.1, b_scale: Positive = 0.1, noise: Positive = 0.01):
      # Construct an RNN.
      f_rnn = rnn_constructor(
          output_size=1, widths=(10,), nonlinearity=B.tanh, final_dense=True
      )

      # Set the weights for the RNN.
      num_weights = f_rnn.num_weights(input_size=1)
      weights = Vars(tf.float32, source=vs.get(shape=(num_weights,), name="rnn"))
      f_rnn.initialise(input_size=1, vs=weights)

      with Measure():
          # Construct GPs that modulate the RNN.
          a = GP(1e-2 * EQ().stretch(a_scale))
          b = GP(1e-2 * EQ().stretch(b_scale))

          # GP-RNN model:
          f_gp_rnn = (1 + a) * (lambda x: f_rnn(x)) + b

      return f_rnn, f_gp_rnn, noise, a, b


  def objective_rnn(vs):
      f_rnn, _, _, _, _ = model(vs)
      return B.mean((f_rnn(x_obs) - y_obs) ** 2)


  def objective_gp_rnn(vs):
      _, f_gp_rnn, noise, _, _ = model(vs)
      evidence = f_gp_rnn(x_obs, noise).logpdf(y_obs)
      return -evidence


  # Pretrain the RNN.
  vs = Vars(tf.float32)
  minimise_adam(objective_rnn, vs, rate=5e-3, iters=1000, trace=True, jit=True)

  # Jointly train the RNN and GPs.
  minimise_adam(objective_gp_rnn, vs, rate=1e-3, iters=1000, trace=True, jit=True)
  _, f_gp_rnn, noise, a, b = model(vs)

  # Condition.
  post = f_gp_rnn.measure | (f_gp_rnn(x_obs, noise), y_obs)

  # Predict and plot results.
  plt.figure(figsize=(10, 6))

  plt.subplot(2, 1, 1)
  plt.title("$(1 + a)\\cdot {}$RNN${} + b$")
  plt.plot(x, f_true, label="True", style="test")
  plt.scatter(x_obs, y_obs, label="Observations", style="train", s=20)
  mean, lower, upper = post(f_gp_rnn(x)).marginal_credible_bounds()
  plt.plot(x, mean, label="Prediction", style="pred")
  plt.fill_between(x, lower, upper, style="pred")
  tweak()

  plt.subplot(2, 2, 3)
  plt.title("$a$")
  mean, lower, upper = post(a(x)).marginal_credible_bounds()
  plt.plot(x, mean, label="Prediction", style="pred")
  plt.fill_between(x, lower, upper, style="pred")
  tweak()

  plt.subplot(2, 2, 4)
  plt.title("$b$")
  mean, lower, upper = post(b(x)).marginal_credible_bounds()
  plt.plot(x, mean, label="Prediction", style="pred")
  plt.fill_between(x, lower, upper, style="pred")
  tweak()

  plt.savefig("readme_example8_gp-rnn.png")
  plt.show()

Approximate Multiplication Between GPs

[Prediction plot]

  import matplotlib.pyplot as plt
  from wbml.plot import tweak

  from stheno import B, Measure, GP, EQ

  # Define points to predict at.
  x = B.linspace(0, 10, 100)

  with Measure() as prior:
      f1 = GP(3, EQ())
      f2 = GP(3, EQ())

      # Compute the approximate product.
      f_prod = f1 * f2

  # Sample two functions.
  s1, s2 = prior.sample(f1(x), f2(x))

  # Predict.
  f_prod_post = f_prod | ((f1(x), s1), (f2(x), s2))
  mean, lower, upper = f_prod_post(x).marginal_credible_bounds()

  # Plot result.
  plt.plot(x, s1, label="Sample 1", style="train")
  plt.plot(x, s2, label="Sample 2", style="train", ls="--")
  plt.plot(x, s1 * s2, label="True product", style="test")
  plt.plot(x, mean, label="Approximate posterior", style="pred")
  plt.fill_between(x, lower, upper, style="pred")
  tweak()
  plt.savefig("readme_example9_product.png")
  plt.show()

Sparse Regression

[Prediction plot]

  import matplotlib.pyplot as plt
  import wbml.out as out
  from wbml.plot import tweak

  from stheno import B, GP, EQ, PseudoObs

  # Define points to predict at.
  x = B.linspace(0, 10, 100)
  x_obs = B.linspace(0, 7, 50_000)
  x_ind = B.linspace(0, 10, 20)

  # Construct a prior.
  f = GP(EQ().periodic(2 * B.pi))

  # Sample a true, underlying function and observations.
  f_true = B.sin(x)
  y_obs = B.sin(x_obs) + B.sqrt(0.5) * B.randn(*x_obs.shape)

  # Compute a pseudo-point approximation of the posterior.
  obs = PseudoObs(f(x_ind), (f(x_obs, 0.5), y_obs))

  # Compute the ELBO.
  out.kv("ELBO", obs.elbo(f.measure))

  # Compute the approximate posterior.
  f_post = f | obs

  # Make predictions with the approximate posterior.
  mean, lower, upper = f_post(x).marginal_credible_bounds()

  # Plot result.
  plt.plot(x, f_true, label="True", style="test")
  plt.scatter(
      x_obs,
      y_obs,
      label="Observations",
      style="train",
      c="tab:green",
      alpha=0.35,
  )
  plt.scatter(
      x_ind,
      obs.mu(f.measure)[:, 0],
      label="Inducing Points",
      style="train",
      s=20,
  )
  plt.plot(x, mean, label="Prediction", style="pred")
  plt.fill_between(x, lower, upper, style="pred")
  tweak()
  plt.savefig("readme_example10_sparse.png")
  plt.show()

Smoothing with Nonparametric Basis Functions

[Prediction plot]

  import matplotlib.pyplot as plt
  from wbml.plot import tweak

  from stheno import B, Measure, GP, EQ

  # Define points to predict at.
  x = B.linspace(0, 10, 100)
  x_obs = B.linspace(0, 10, 20)

  with Measure() as prior:
      w = lambda x: B.exp(-(x**2) / 0.5)  # Basis function
      b = [(w * GP(EQ())).shift(xi) for xi in x_obs]  # Weighted basis functions
      f = sum(b)

  # Sample a true, underlying function and observations.
  f_true, y_obs = prior.sample(f(x), f(x_obs, 0.2))

  # Condition on the observations to make predictions.
  post = prior | (f(x_obs, 0.2), y_obs)

  # Plot result.
  for i, bi in enumerate(b):
      mean, lower, upper = post(bi(x)).marginal_credible_bounds()
      kw_args = {"label": "Basis functions"} if i == 0 else {}
      plt.plot(x, mean, style="pred2", **kw_args)
  plt.plot(x, f_true, label="True", style="test")
  plt.scatter(x_obs, y_obs, label="Observations", style="train", s=20)
  mean, lower, upper = post(f(x)).marginal_credible_bounds()
  plt.plot(x, mean, label="Prediction", style="pred")
  plt.fill_between(x, lower, upper, style="pred")
  tweak()
  plt.savefig("readme_example11_nonparametric_basis.png")
  plt.show()