Preparing a calibration for ACBICI

As explained in the Introduction, the user needs to prepare three things to run a calibration in ACBICI:

The model itself or a rich data set of model evaluations.
Experimental data stored in a data file containing pairs.
Prior information about the probability distribution of the parameters.

We describe these three topics below and refer to the examples in directory examples and their documentation for further details.

Defining the model

The first step in a calibration is implementing the model that will be calibrated. In ACBICI, the models that can be calibrated are all of the form \(g(x;\theta)\), where \(x\in R^d\) collects the input variables and \(\theta\in R^p\) are the model parameters, the ones that need to be calibrated. The model should be provided in a Python function and it will be incorporated into the ACBICI workflow as explained later. In some cases, this function may be a wrapper to the actual model. For example, it could be the case where the model is a finite element code and the Python function will be in charge of calling the code with the right input. Also, it could be useful to wrap a C function in Python that is called from the model function. In all these cases, from the point of view of ACBICI, it is completely irrelevant how the model evaluates \(g(x;\theta)\).

It might be case that the model is so expensive that the user might not want to spend hours or days evaluating it thousands of times. In these situations, we say that the model is expensive and ACBICI provides a way of avoiding these costly evaluations. Instead of the model, ACBICI can build a surrogate, if provided with enough samples covering the graph of the model.

To generate the model samples, the user should randomly generate pairs \((x_i,\theta_i)\), evaluate the model at these pairs, and store the results in a text file with rows \(x_i, \theta_i, g(x_i;\theta_i)\). The pairs \((x_i,\theta_i)\) must cover the input and parameter space well enough and techniques such as the Latin hypercube sampling could be used.

Experimental data

The calibration will be performed so that the model will reproduce these data as close as possible, taking into account prior knowledge. Data are obtained from experiments and thus some type of measurement error is inevitable, that will need to be modeled as well. In all cases, the user will need to prepare a text file with all the experimental data, one pair \((x_i,y_i)\) per row. The variable \(x\) is a vector in \(\mathbb{R}^n\) and the measurement needs to be a scalar. In the future, we will extend ACBICI to work with multidimensional measurements.

As we will see later, calibration is not insensitive to units and thus it is assumed that experiments data are dimensionless and models are formulated with dimensionless constants. This pre-processing of the available information (and the post-processing of the results to give them back units) is key for accurate calibrations and should not be neglected.

Prior information

One of the fundamental tenets of Bayesian statistics is that when trying to infer a probability distribution, our prior information about it can – and should – be incorporated in a consistent way. Bayes theorem takes care of the right weight for the prior information vs. the information provided by the data. In any case, the user must prepare a prior probability distribution that accounts for our incomplete knowledge regarding the values of the model parameters and hyperparameters.

A word of caution: the supports of prior probabilities are preserved by Bayesian calibration, meaning that the support of the posterior will be a subset of the support of the prior. If a prior of a parameter (or hyperparameter) is given that forbids values in a region of parametric space, the posterior probability of these region will still be zero, irrespective of the experimental data measured. Hence, one should make sure that the priors are consistent with the constraints on the parameters; in case of doubt, select wider supports and let the calibration procedure concentrate the probability density where the experimental data hints at.

When designing a calibration in ACBCI, the user will have to implement a Python function that must provide the log-prior probability of the parameters, that is, \(\log p:\mathbb{R}^p\to[-\infty,0]\). The special value \(-\infty\) will correspond to the evaluation of the prior at a parameter value that has zero probability.

Useful prior distributions

Prior distributions should reflect the knowledge that the user has about the parameters before they are calibrated. The more accurate this prior distributions are, the fewer MCMC runs will be required to reach accurate posterior distributions so it is in the user’s interest to be as accurate as possible with the prior definitions.

There is much flexibility to choose the prior distributions. However, it must be noted that the posterior distribution of each parameter will be compatible with the prior, meaning that values of the parameter outside the support of the prior will be assigned zero probability in the posterior. For this reason it is important that the prior do not possess a support that is too small.

Depending on the prior knowledge on each of the parameters, the user of ACBICI can select different types of priors. Some of these are:

Informative priors. These are probability distributions that provide specific information about a parameter and should only be used when one has a great degree of confidence on the range of the parameter values since they can have a large impact on the posterior. Some of the most common informative priors are:
- Uniform distribution: A parameter \(\theta\) has a uniform probability distribution in the interval \([a,b]\) of the form
  
  \[\]
  
  \[p(\theta) \sim 1_{[a,b]}\]
  
  Here, the function \(1_{[a,b]}\) is the indicator function of the interval \([a,b]\), which is 1 when \(\theta\in[a,b]\) and 0 elsewhere. This probability distribution should be employed when we know that the parameter must take a value in the interval \([a,b]\), but not outside.
- Normal or Gaussian distribution with mean \(\mu\) and covariance \(\sigma^2\) corresponds to random variables that are centered around \(\mu\) and are symmetric. Their probability density is:
  
  \[\]
  
  \[p(x; \mu,\sigma) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left(- \frac{(x-\mu)^2}{2\sigma^2} \right)\ .\]
- Log-normal: This prior should be used when the parameter is known to be positive.
  
  \[\]
  
  \[p(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)\]
Weakly informative priors. These priors are employed when we would like to provide some guidance to the calibration, but without being overly restrictive. They are designed to regularize estimates by preventing extreme values while still allowing the data to have a significant influence. These priors are often used to ensure that inferences remain within a reasonable range and help in avoiding over-fitting. Some of these weakly informative priors are:
- Cauchy prior: A random variable has a Cauchy distribution with location \(\theta_0\) and scale \(\gamma\) if when its probability distribution is:
  
  \[\]
  
  \[p(\theta;\theta_0,\gamma) = \frac{1}{\pi\gamma\,\left[1+\left(\frac{\theta-\theta_0}{\gamma}\right)^2\right]}\]
- Gamma prior: This is a family of positive random variables that can have a variety of shapes depending on its paramters. Given the shape \(\alpha\) and scale \(\theta\), the probability density function of a Gamma random variable is:
  
  \[\]
  
  \[p(\theta;\alpha,\theta) = \frac{1}{\Gamma(\alpha)\,\theta^\alpha} x^{\alpha-1}\,e^{-x/\theta}\]
Uninformative priors. These are used where there is none or little information abut the distribution of a parameter. In this case, the prior should attempt to introduce as little bias as possible.
Improper priors. They are non-normalizable distributions that can be used in Bayesian analysis under certain conditions.

Surrogate model

To calibrate a model using Bayesian inference, lots of model evaluations will be required, a number the increases also with the dimension of the unknown variables. In certain situations, the model that we would like to calibrate is so expensive (for example, a finite element simulation) that from the beginning the decide to replace it with a surrogate.

In ACBICI, surrogates are Gaussian processes. If the user decides to employ a meta-model, the hyperparameters of this process will have to be calibrated.

Discrepancy error of the model

The user must decide whether to explicitly account for the discrepancy between the proposed model and the experimental data or ignore it. In the first place, there is an explicit acknowledgement of the limitations of the model and its inability to predict accurately the experimental results. This does not necessarily mean that the model is bad (a very simple model that gives qualitatively good results might be extremely useful). In any case, the discrepancy between the model predictions and the experimental data can be modeled as a parameter-free Gaussian process. In this situation the hyperparameters of this discrepancy model will have to be calibrated alongside the true model parameters.

Collecting experimental data

The only data available to help us calibrate the model are experimental data. These values must be stored in a file that can be accessed by ACBICI. The file will have as many rows as experimental results and each row will have \(d+1\) real numbers. The first \(d\) data are assumed to correspond to the values of \(x\) in each experiment and the last column is the measurement.

All experimental data has some type of error. In ACBICI, it will be assumed that the experimental error is always Gaussian with zero mean. In certain situations, the measurements come along with a estimation of the error variance; in others, the experimental error is unknown but can be calibrated as well.

Calibration and MCMC is very sensitive to the relative magnitude of the parameters to be calibrated. To optimize resources, it is strongly recommended that the model and the experimental data are expressed in terms of non-dimensional input variables and parameters, and the output is also non-dimensional.

Prior information

A fundamental ingredient of all Bayesian inference is the explicit description of the prior probability distributions for all the unknown parameters. In ACBICI, a Python function must be provided that computes the \(\log\) probability of all the model parameters and hyperparameters, if any, that enter the calibration. The total number of parameters and hyperparamters depend, thus, on the decisions made before

The number of probability distributions that need to be prepared depend on the model and the surrogate. Here we collect all the potential values:

\(p\) probabilities, one for each parameter \(\theta_i\) of the model.

\(3\) probabilities if we will like to replace the model with a surrogate. The distributions will have to represent the parameters identified with \(\beta_x, \beta_t, \lambda\) that are employed in the kernel of the Gaussian process for the surrogate.

\(2\) probabilities if we will like to model the discrepancy. The distributions will have to represent the parameters identified with \(\beta_d, \lambda_d\) that are employed in the kernel of the Gaussian process for the discrepancy.

\(1\) probability if the experimental error is unknown.

For each of these priors, a probability density needs to be selected in the corresponding variable.

The following table summarizes all the variables that might need to be calibrated by ACBICI.

Parameters and hyperparameters
Origin	Symbol	Dimension	Comment
Model parameters	\(\theta\)	\(p\)	Mandatory
Surrogate	\(\beta_x, \lambda_x, \beta_t\)	\(3\)	Only if meta-model
Discrepancy	\(\beta_d, \lambda_d\)	\(2\)	Optional
Experimental variance	\(\sigma\)	\(1\)	If unknown experimental error

Selecting calibration type

At this early stage, the user must decide whether to explicitly account for the discrepancy of the proposed model and the experimental data or ignore it. In the first case, the discrepancy will be modeled with a Gaussian process and its hyperparameters will be calibrated.

To calibrate a model using Bayesian inference, lots of model evaluations will be required, a number the increases also with the dimension of the unknown variables. In certain situations, the model that we would like to calibrate is so expensive (for example, a finite element simulation) that from the beginning the decide to replace it with a surrogate.

This is the information that needs to be prepared for the calibration:

\(g\) : Model in a Python function

\(d\) : Dimension of input variables

\(t\) : Number of parameters in \(g\)

Discrepancy: weather to calibrate the discrepancy or ignore it.

Surrogate: weather to replace the model with a surrogate or not.