improver.calibration.beta_recalibration module#

Module containing class for recalibrating blended probabilities.

Beta recalibration#

Beta recalibration is intended for recalibrating the blended probabilistic output of several models, each of which has already been calibrated individually (for example, by reliability calibration or by the Rainforests calibration method). Ranjan & Gneiting (2008) show that a blend of probabilistic forecasts is in general not perfectly calibrated, even if each input is perfectly calibrated. The authors also show that recalibrating the blended output improves its reliability, its sharpness, and its score under proper scoring rules. IMPROVER implements the recalibration method studied in the article: a transformation given by the cumulative distribution function (CDF) of the beta distribution. This is a natural choice of transformation for probabilities because it is a monotone-increasing function that maps the interval [0, 1] onto itself. The implementation here allows the alpha and beta parameters of the beta distribution to vary by forecast period.
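As a minimal sketch of the transformation itself (not IMPROVER's own code), the following applies the beta CDF from scipy.stats to an array of probabilities. The function name beta_recalibrate and the sample values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import beta as beta_dist


def beta_recalibrate(probabilities, alpha, beta):
    """Transform probabilities in [0, 1] by the CDF of Beta(alpha, beta).

    Hypothetical helper for illustration only.
    """
    return beta_dist.cdf(np.asarray(probabilities, dtype=float), alpha, beta)


# Illustrative blended probabilities (made-up values).
blended = np.array([0.0, 0.1, 0.5, 0.9, 1.0])

# Beta(1, 1) is the uniform distribution, so its CDF leaves values unchanged.
unchanged = beta_recalibrate(blended, 1.0, 1.0)

# Beta(2, 2) has CDF 3p^2 - 2p^3, which moves values away from 0.5.
sharpened = beta_recalibrate(blended, 2.0, 2.0)
```

With alpha = beta = 1 the transformation is the identity. With alpha = beta = 2 the end points 0 and 1 and the midpoint 0.5 are fixed, while other probabilities are pushed towards 0 or 1.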

It is recommended that the alpha and beta parameters be chosen to optimise the desired metric (for example, the continuous ranked probability score, CRPS) over some training period. Specifically, this could be done as follows:

  1. Obtain blended forecast output, along with ground truth data from observations or analyses, for the training period.

  2. Implement the loss function. The loss function should take alpha and beta as arguments and return the loss over the training period when the probabilistic forecast is transformed by the CDF of the beta distribution. A suggested loss function is the CRPS calculated from the thresholded probability forecast. In this case, the loss function should transform the input blended probabilities by the beta CDF (available as scipy.stats.beta.cdf), then calculate the CRPS of the transformed probability forecast against the ground truth.

  3. Use scipy.optimize.minimize to find the parameters alpha and beta that minimise the loss.
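The steps above can be sketched as follows. This is an illustration under stated assumptions, not a recommended configuration: the training data are synthetic, the mean squared difference between probability and binary truth (the Brier score) stands in for the CRPS on the assumption that it is averaged over evenly spaced thresholds, and the names blended, truth and loss are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(0)

# Step 1 (synthetic stand-in for real training data): draw "true" event
# probabilities, simulate binary outcomes, and build a blended forecast that
# is deliberately too close to 0.5, so that Beta(2, 2) would recalibrate it.
true_prob = rng.uniform(0.0, 1.0, size=20_000)
truth = (rng.uniform(size=true_prob.size) < true_prob).astype(float)
blended = beta_dist.ppf(true_prob, 2.0, 2.0)


# Step 2: loss as a function of alpha and beta. The Brier score is used here
# as a stand-in for the threshold-based CRPS (assumption).
def loss(params):
    alpha, beta = params
    recalibrated = beta_dist.cdf(blended, alpha, beta)
    return np.mean((recalibrated - truth) ** 2)


# Step 3: minimise the loss, keeping both parameters strictly positive.
x0 = np.array([1.0, 1.0])  # alpha = beta = 1 means "no recalibration"
baseline = loss(x0)
result = minimize(
    loss, x0, method="L-BFGS-B", bounds=[(1e-3, None), (1e-3, None)]
)
alpha_opt, beta_opt = result.x
```

Starting from alpha = beta = 1 makes the uncalibrated blend the baseline, so result.fun can be compared directly with baseline to see how much the recalibration helps on the training data.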

Alternatively, one could jointly optimise the blending weights and the parameters of the beta calibration. This may yield better results, but is more complex.
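A hedged sketch of the joint approach, assuming a linear blend of two models with a single weight: the weight, alpha and beta are fitted together by one call to scipy.optimize.minimize. The synthetic data, the two-model setup, the Brier score as the loss, and all variable names are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta as beta_dist, norm

rng = np.random.default_rng(1)
n = 20_000

# Synthetic truth: the event probability depends on two independent signals.
a1, a2 = rng.standard_normal(n), rng.standard_normal(n)
truth = (rng.uniform(size=n) < norm.cdf(a1 + a2)).astype(float)

# Each "model" sees only one signal. These forecasts are individually
# calibrated, because E[Phi(a1 + a2) | a1] = Phi(a1 / sqrt(2)).
p1 = norm.cdf(a1 / np.sqrt(2.0))
p2 = norm.cdf(a2 / np.sqrt(2.0))


def joint_loss(params):
    """Brier score of the recalibrated linear blend (stand-in for CRPS)."""
    weight, alpha, beta = params
    blended = weight * p1 + (1.0 - weight) * p2
    return np.mean((beta_dist.cdf(blended, alpha, beta) - truth) ** 2)


# Start from an equal-weight blend with no recalibration.
x0 = np.array([0.5, 1.0, 1.0])
baseline = joint_loss(x0)
result = minimize(
    joint_loss,
    x0,
    method="L-BFGS-B",
    bounds=[(0.0, 1.0), (1e-3, None), (1e-3, None)],
)
weight_opt, alpha_opt, beta_opt = result.x
```

With more than two models the single weight would become a vector constrained to sum to one, which is part of what makes the joint problem more complex.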

class BetaRecalibrate(recalibration_dict)#

Bases: PostProcessingPlugin

Recalibrate probabilities using the cumulative distribution function of the beta distribution.

__init__(recalibration_dict)#
Parameters:

recalibration_dict (Dict[str, Any]) – Dictionary from which to calculate alpha and beta parameters for recalibrating blended output using the beta distribution. Dictionary format is as specified below. The alpha and beta values will be interpolated over the forecast period from the values specified in the dictionary.

Recalibration dictionary format:

{
    "forecast_period": [7, 12],
    "alpha": [1, 1.5],
    "beta": [1.3, 2],
    "units": "hours",
}

The “units” key is optional. If it is omitted, the units are assumed to be the same as those of the forecast_period coordinate of the input cube.
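To illustrate how such a dictionary could yield parameters at an intermediate forecast period, here is a minimal sketch. It assumes linear interpolation with numpy.interp (which holds the end values outside the given range) and a cube whose forecast_period is in seconds; neither assumption is confirmed by this page, and interpolate_parameters is a hypothetical helper.

```python
import numpy as np

recalibration_dict = {
    "forecast_period": [7, 12],
    "alpha": [1, 1.5],
    "beta": [1.3, 2],
    "units": "hours",
}

SECONDS_PER_HOUR = 3600.0


def interpolate_parameters(recalibration_dict, forecast_period_seconds):
    """Interpolate alpha and beta to the requested forecast periods.

    Assumes the dictionary is in hours and the requested periods are in
    seconds (illustrative assumption), and interpolates linearly.
    """
    fp_hours = np.asarray(forecast_period_seconds, dtype=float) / SECONDS_PER_HOUR
    alpha = np.interp(
        fp_hours, recalibration_dict["forecast_period"], recalibration_dict["alpha"]
    )
    beta = np.interp(
        fp_hours, recalibration_dict["forecast_period"], recalibration_dict["beta"]
    )
    return alpha, beta


# A 9-hour forecast period lies 40% of the way from 7 to 12 hours.
alpha, beta = interpolate_parameters(recalibration_dict, [9 * 3600])
```

Under these assumptions the 9-hour values are alpha = 1.2 and beta = 1.58.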

process(cube)#

Recalibrate cube using the beta distribution with the alpha and beta parameters specified in self.recalibration_dict.

Parameters:

cube – A cube containing a forecast_period coordinate.

Returns:

A cube having the same dimensions as the input, with data transformed by the beta distribution CDF.

Raises:
  • CoordinateNotFoundError – if cube does not contain probability data

  • CoordinateNotFoundError – if cube does not contain forecast_period coordinate

  • RuntimeError – if any interpolated values of alpha or beta are <= 0
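The behaviour described above can be mimicked on plain NumPy arrays, as in the sketch below. It is a stand-in, not the plugin's code: recalibrate_by_period is a hypothetical function, forecast_period is assumed to be the leading axis of the data, and alpha and beta are assumed to have been interpolated already, one value per forecast period.

```python
import numpy as np
from scipy.stats import beta as beta_dist


def recalibrate_by_period(probabilities, alpha, beta):
    """Apply a per-forecast-period beta CDF to an array of probabilities.

    The leading axis of ``probabilities`` is assumed to be forecast_period,
    with one alpha and one beta value per entry along that axis.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # Mirror the documented error for non-positive parameters.
    if np.any(alpha <= 0) or np.any(beta <= 0):
        raise RuntimeError("alpha and beta must both be greater than 0")
    # Reshape so the parameters broadcast along the leading axis only.
    shape = (-1,) + (1,) * (probabilities.ndim - 1)
    return beta_dist.cdf(probabilities, alpha.reshape(shape), beta.reshape(shape))


# Two forecast periods, three grid points each (made-up values).
probs = np.array([[0.2, 0.5, 0.8], [0.2, 0.5, 0.8]])
recalibrated = recalibrate_by_period(probs, alpha=[1.0, 2.0], beta=[1.0, 2.0])
```

The output has the same shape as the input. The first row is unchanged because Beta(1, 1) is uniform, and the second row is transformed by the Beta(2, 2) CDF.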