Gaussian processes
******************

Gaussian processes play a fundamental role in the calibration tasks of ACBICI. We next provide a brief summary of their definition and properties and explain how they are employed in the calibration.

Definition
==========

According to the `Wikipedia `_, "A GP is a stochastic process, that is, a collection of random variables indexed by a scalar, often interpreted as time, such that every finite collection of these variables has a multivariate normal distribution. Equivalently, every finite linear combination of these variables is normally distributed."

A Gaussian process on $\mathbb{R}^n$ is completely defined by a mean function $m:\mathbb{R}^n\to\mathbb{R}$ and a covariance function $c:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}$. We write $f(x)\sim \mathcal{GP}(m(\cdot),c(\cdot,\cdot))$.

As per its definition, the fundamental property of a GP is that, when restricted to a finite number $N$ of points, the joint distribution of its values is multivariate normal. That is, if we define $\mathbf{x}=\{x_1,x_2,\ldots,x_N\}$, then

.. math::

   f(\mathbf{x}) \sim \mathcal{N}(m(\mathbf{x}), c(\mathbf{x},\mathbf{x}))

Covariance and kernel
=====================

When considering a finite number of random variables as before, the covariance matrix of the multivariate distribution is

.. math::

   \Sigma = c(\mathbf{x},\mathbf{x})

This means that, for every pair of points $x_i,x_j$ in $\mathbf{x}$, the $(i,j)$ component of the covariance matrix is calculated by evaluating the *covariance function* $c$ at these two points, $\Sigma_{ij} = c(x_i,x_j)$. For $\Sigma$ to be a valid covariance matrix, that is, symmetric and positive semidefinite, this function cannot be arbitrary. Most often the covariance function is defined in terms of an isotropic *kernel* $k:\mathbb{R}^+\to\mathbb{R}$ such that $c(\mathbf{x},\mathbf{y}) = k(\|\mathbf{x}-\mathbf{y}\|)$.

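The construction of the covariance matrix from an isotropic kernel can be illustrated with a minimal NumPy sketch. It is not ACBICI code: the function names, the choice of a squared exponential kernel, and the values of the signal variance and lengthscale are assumptions made only for this example.

```python
import numpy as np


def pairwise_distances(X, Y):
    """Euclidean distances between the rows of X (N, n) and Y (M, n)."""
    diff = X[:, None, :] - Y[None, :, :]
    return np.sqrt(np.sum(diff**2, axis=-1))


def covariance_matrix(X, kernel):
    """Sigma_ij = k(||x_i - x_j||) for an isotropic kernel k."""
    return kernel(pairwise_distances(X, X))


def example_kernel(r, lam=1.0, beta=0.5):
    """Squared exponential kernel; lam and beta are illustrative values."""
    return lam * np.exp(-r**2 / (2.0 * beta**2))


rng = np.random.default_rng(0)
X = rng.uniform(size=(5, 2))          # N = 5 dimensionless points in R^2
Sigma = covariance_matrix(X, example_kernel)

# A valid covariance matrix is symmetric and positive semidefinite.
is_symmetric = np.allclose(Sigma, Sigma.T)
min_eigenvalue = np.linalg.eigvalsh(Sigma).min()

# Draw one sample of (f(x_1), ..., f(x_N)) from a zero-mean GP.
jitter = 1e-10 * np.eye(len(X))       # small diagonal term for numerical stability
sample = rng.multivariate_normal(np.zeros(len(X)), Sigma + jitter)
```

The symmetry and eigenvalue checks make concrete why the covariance function cannot be arbitrary: a function that produced a negative eigenvalue here would not define a multivariate normal distribution.
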
Note that given a vector $\mathbf{x}\in\mathbb{R}^n$ its Euclidean norm $\|\mathbf{x}\| = (\sum_{i=1}^n x_i^2)^{1/2}$ is only well-defined when all the components of the vector are either dimensionless or have the same physical dimensions. If this is not the case, a symmetric positive definite metric $\mathbf{M}$ must be defined such that

.. math::

   \| \mathbf{x} \|^2 = \mathbf{x}^T\; \mathbf{M} \mathbf{x} .

In ACBICI, it is assumed that all variables are dimensionless, which simplifies the kernel definitions but also entails certain limitations, including:

- **Loss of physical interpretability:** By treating all variables as dimensionless, the connection to their original physical units is lost. This can obscure meaningful relationships that depend on units, making it harder to interpret results in a physical context.
- **Reduced flexibility:** Forcing all variables into a common, dimensionless framework may be inappropriate for heterogeneous data combining quantities with fundamentally different units (e.g., length, time, energy). This can limit the applicability of the model to complex, multi-physical problems.
- **Scaling sensitivity:** Without explicitly accounting for the physical dimensions, improper or inconsistent scaling of variables can bias distance-based methods such as kernels. The natural geometry of the data, which could be captured by a positive definite metric matrix :math:`\mathbf{M}`, is neglected.

Some common isotropic kernels available in ACBICI are:

- The squared exponential kernel or RBF kernel, which depends on a lengthscale parameter $\beta$ and a signal variance $\lambda$

  .. math::

     k_{sqexp}(r;\lambda,\beta) = \lambda \exp\left[-\frac{r^2}{2\beta^2}\right]

- The Matérn 3/2 kernel, which depends on a lengthscale parameter $\beta$ and a signal variance $\lambda$

  .. math::

     k_{m32}(r;\lambda,\beta) = \lambda \left(1+\frac{\sqrt3\,r}{\beta}\right) \exp\left[-\frac{\sqrt3\,r}{\beta}\right]

- The Matérn 5/2 kernel, which depends on a lengthscale parameter $\beta$ and a signal variance $\lambda$

  .. math::

     k_{m52}(r;\lambda,\beta) = \lambda \left(1+\frac{\sqrt5\,r}{\beta} + \frac{5\,r^2}{3\,\beta^2}\right) \exp\left[-\frac{\sqrt5\,r}{\beta}\right]

- The exponential kernel, which depends on a lengthscale parameter $\beta$ and a signal variance $\lambda$

  .. math::

     k_{exp}(r;\lambda,\beta) = \lambda\exp\left[-\frac{r}{\beta}\right]

- The rational quadratic kernel with $\alpha=1$, which depends on a lengthscale parameter $\beta$ and a signal variance $\lambda$

  .. math::

     k_{ratquad}(r;\lambda,\beta) = \lambda\left(1+\frac{r^2}{2\alpha\beta^2}\right)^{-\alpha}

For the multivariate GP, ACBICI uses a similarity kernel that evaluates to the Matérn 3/2 kernel when the two inputs belong to the same task and to 0 otherwise, as explained in :ref:`multioutput_calibration`.

All the kernels employ one or more *hyperparameters* whose values shape the covariance. Note that ACBICI uses two lengthscale parameters: one for the parameters, :math:`\beta_t`, and one for the input variables, :math:`\beta_x`.

The use of Gaussian processes in calibration
============================================

Gaussian processes have many applications in statistics and data science. In particular, they are powerful non-parametric regression models. In ACBICI, they are employed for two reasons:

- When the model that has to be calibrated is too expensive, it is replaced by a meta-model that, although it also needs to be calibrated, is much cheaper to evaluate. In ACBICI, this surrogate model is a Gaussian process.
- Often, the discrepancy error of the model needs to be inferred. Since there is no a priori information about the form of this discrepancy, ACBICI uses a Gaussian process to represent it.

Whenever a Gaussian process is used in ACBICI, its hyperparameters need to be inferred.
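
As an illustration of the surrogate idea, the following sketch conditions a zero-mean Gaussian process with the Matérn 3/2 kernel on a handful of evaluations of a stand-in model and then predicts at new inputs. It is a generic GP regression example under stated assumptions, not the ACBICI implementation: ``expensive_model``, ``gp_predict``, and the hyperparameter values are hypothetical, and the hyperparameters are fixed by hand here instead of being inferred.

```python
import numpy as np


def matern32(r, lam, beta):
    """Matérn 3/2 kernel as written in this section."""
    s = np.sqrt(3.0) * r / beta
    return lam * (1.0 + s) * np.exp(-s)


def distances(A, B):
    """Euclidean distances between the rows of A and the rows of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)


def gp_predict(X_train, y_train, X_new, lam, beta, noise=1e-8):
    """Posterior mean and variance of a zero-mean GP conditioned on data."""
    K = matern32(distances(X_train, X_train), lam, beta)
    K += noise * np.eye(len(X_train))             # jitter / observation noise
    K_s = matern32(distances(X_new, X_train), lam, beta)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = lam - np.sum(v**2, axis=0)              # prior variance is k(0) = lam
    return mean, var


def expensive_model(x):
    """Stand-in for a costly simulation (hypothetical)."""
    return np.sin(2.0 * np.pi * x[:, 0])


X_train = np.linspace(0.0, 1.0, 8)[:, None]       # a few model evaluations
y_train = expensive_model(X_train)
X_new = np.array([[0.25], [0.5]])

# Hyperparameters are fixed by hand for this sketch only.
mean, var = gp_predict(X_train, y_train, X_new, lam=1.0, beta=0.3)
```

Once the training evaluations are available, each prediction costs only a few small linear solves, which is what makes such a meta-model attractive when the original model is expensive.
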