improver.calibration.load_and_apply_quantile_regression_random_forest module

Script to load and apply the trained Quantile Regression Random Forest (QRF) model.

class PrepareAndApplyQRF(feature_config, target_cf_name, unique_site_id_keys=['wmo_id'], cycletime=None, forecast_period=None)

Bases: PostProcessingPlugin

Prepare the input forecast for application of a trained Quantile Regression Random Forest (QRF) model and apply the QRF model.

__init__(feature_config, target_cf_name, unique_site_id_keys=['wmo_id'], cycletime=None, forecast_period=None)

Initialise the plugin.

Parameters:
  • feature_config (dict) – Feature configuration defining the features to be used for quantile regression. The configuration is a dictionary whose keys are the names of the input cube(s) supplied and whose values are lists of feature names. Each list can contain computed features, such as the mean or standard deviation (std), and static features, such as the altitude. Computed features are calculated from the cube named by the dictionary key. If the key is itself the feature, e.g. a distance to water cube, the value should be [“static”], which ensures the cube’s data is used as the feature. The config has the structure “DYNAMIC_VARIABLE_CF_NAME”: [“FEATURE1”, “FEATURE2”], e.g. { “air_temperature”: [“mean”, “std”, “altitude”], “visibility_at_screen_level”: [“mean”, “std”], “distance_to_water”: [“static”], }

  • target_cf_name (str) – The CF name of the diagnostic to be calibrated. This is used to separate the target forecast from the rest of the dynamic predictors, if present.

  • unique_site_id_keys (list) – The names of the coordinates that uniquely identify each site, e.g. [“wmo_id”] or [“latitude”, “longitude”].

  • cycletime (str) – The cycle time of the forecast to be calibrated in the format YYYYMMDDTHHMMZ. If not provided, the cycle time found in the first forecast cube will be used.

  • forecast_period (int) – The forecast period of the forecast to be calibrated in seconds. If not provided, the forecast period found in the first forecast cube will be used.
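As a minimal sketch of the configuration described above, the dictionary below pairs each input cube name with its list of features. The helper split_feature_config and the COMPUTED_FEATURES set are hypothetical and not part of IMPROVER; they only illustrate how computed features, “static” cubes and other static features could be told apart, assuming “mean” and “std” are the computed features.

```python
# Example configuration, following the structure described for feature_config.
feature_config = {
    "air_temperature": ["mean", "std", "altitude"],
    "visibility_at_screen_level": ["mean", "std"],
    "distance_to_water": ["static"],
}

# Assumption: only these two names denote computed features in this sketch.
COMPUTED_FEATURES = {"mean", "std"}


def split_feature_config(config):
    """Hypothetical helper: group the configured features by kind."""
    computed, static_cubes, other_static = [], [], []
    for cube_name, features in config.items():
        if not isinstance(features, list):
            raise TypeError(f"Features for {cube_name!r} must be a list")
        for feature in features:
            if feature == "static":
                # The cube's own data is the feature.
                static_cubes.append(cube_name)
            elif feature in COMPUTED_FEATURES:
                # Computed from the cube named by the key.
                computed.append((cube_name, feature))
            else:
                # Any other static feature, such as "altitude".
                other_static.append((cube_name, feature))
    return computed, static_cubes, other_static


computed, static_cubes, other_static = split_feature_config(feature_config)
```

Note that every value is a list, including the single-item [“static”] entry.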

_abc_impl = <_abc._abc_data object>

static _compute_quantile_list(forecast_cube, coord)

Compute the list of quantiles e.g. 0.25, 0.5, 0.75 that will be produced from a specified coordinate on the forecast cube.

Parameters:
  • forecast_cube (Cube) – Forecast to be calibrated.

  • coord (str) – Coordinate name. The length of the coordinate will be used to determine the number of quantiles to compute.

Return type:

list[float]

Returns:

List of quantiles (e.g. 0.25, 0.5, 0.75) computed from the forecast cube.
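The documentation does not state the spacing rule used to turn a coordinate length into quantiles. One convention consistent with the example above (three points giving 0.25, 0.5, 0.75) is evenly spaced quantiles that exclude 0 and 1. The function evenly_spaced_quantiles below is a hypothetical illustration of that convention, not the plugin’s code.

```python
def evenly_spaced_quantiles(n_points):
    """Return n_points evenly spaced quantiles strictly between 0 and 1.

    Assumption: quantile i of n is i / (n + 1); the plugin may use a
    different rule.
    """
    if n_points < 1:
        raise ValueError("At least one quantile is required")
    return [i / (n_points + 1) for i in range(1, n_points + 1)]


# A coordinate of length 3 yields the quartiles.
quartiles = evenly_spaced_quantiles(3)
```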

_cube_to_dataframe(cube_inputs)

Convert cube inputs to a pandas DataFrame.

Parameters:

cube_inputs (CubeList) – List of cubes containing the features and the forecast to be calibrated.

Return type:

DataFrame

Returns:

DataFrame containing the data from the cubes, with auxiliary coordinates included as columns.

_get_inputs(cube_inputs, qrf_model=None)

Split the forecast to be calibrated from the other features. Handle the case where the qrf_model is not provided, for example, if the input data required to train the QRF model isn’t yet available. In this case, the uncalibrated forecast is returned with a warning comment added.

Parameters:
  • cube_inputs (CubeList) – List of cubes containing the features and the forecast to be calibrated.

  • qrf_model (Optional[RandomForestQuantileRegressor]) – The trained QRF model to be applied to the forecast. If None, the input forecast will be returned unchanged with a warning comment added.

Return type:

tuple[CubeList, Cube]

Returns:

A tuple containing the CubeList of feature cubes and the forecast cube to be calibrated.

Raises:
  • ValueError – If the target forecast is not provided.

  • ValueError – If the number of cubes provided does not match the number of features expected.

  • ValueError – If the input cubes contain a mix of realization and percentile coordinates.
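The splitting and the three checks listed above can be sketched with plain Python stand-ins. StandInCube and split_inputs are hypothetical names, and the rule used here for the expected number of cubes (one per feature_config key, plus the target if it is not a key) is an assumption rather than the plugin’s documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandInCube:
    """Hypothetical stand-in for an iris cube: a name and an ensemble coordinate."""
    name: str
    ensemble_coord: Optional[str] = None  # "realization", "percentile" or None


def split_inputs(cubes, feature_config, target_cf_name):
    """Separate the target forecast from the remaining feature cubes."""
    targets = [cube for cube in cubes if cube.name == target_cf_name]
    if not targets:
        raise ValueError(f"No forecast named {target_cf_name!r} was provided")

    # Assumption: one cube is expected per config key, plus the target.
    expected_names = set(feature_config) | {target_cf_name}
    if len(cubes) != len(expected_names):
        raise ValueError("Number of cubes does not match the features expected")

    # Cubes without an ensemble coordinate (e.g. static fields) are ignored here.
    kinds = {cube.ensemble_coord for cube in cubes} - {None}
    if len(kinds) > 1:
        raise ValueError("Inputs mix realization and percentile coordinates")

    features = [cube for cube in cubes if cube.name != target_cf_name]
    return features, targets[0]


config = {"air_temperature": ["mean", "std"], "distance_to_water": ["static"]}
cubes = [
    StandInCube("air_temperature", "realization"),
    StandInCube("distance_to_water"),
]
features, forecast = split_inputs(cubes, config, "air_temperature")
```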

_update_forecast_reference_time_and_period(cube_inputs)

Update the forecast_reference_time and forecast_period coordinates on the input cubes to match those provided, if they are provided. The rebadging of the forecast_period introduces a slight discrepancy between the forecasts used for training and application of the QRF model. However, as any forecast period rebadging is expected to be small (e.g. a few hours), this is not expected to be a significant issue.

Parameters:

cube_inputs (CubeList) – List of cubes containing the features and the forecast to be calibrated.

Return type:

CubeList

Returns:

CubeList of the input cubes with updated forecast_reference_time and forecast_period coordinates, if they were provided.
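How the plugin rewrites the coordinates is not shown here. The sketch below only illustrates how the two constructor arguments can be interpreted: a cycletime string in the YYYYMMDDTHHMMZ format and a forecast_period in seconds together imply a validity time. The function names parse_cycletime and validity_time are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_cycletime(cycletime):
    """Parse a YYYYMMDDTHHMMZ string into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(cycletime, "%Y%m%dT%H%MZ")
    return parsed.replace(tzinfo=timezone.utc)


def validity_time(cycletime, forecast_period_seconds):
    """Validity time implied by a cycle time and a forecast period in seconds."""
    return parse_cycletime(cycletime) + timedelta(seconds=forecast_period_seconds)


# A 03:00Z cycle with a 6-hour (21600 s) forecast period.
reference_time = parse_cycletime("20240101T0300Z")
valid_at = validity_time("20240101T0300Z", 6 * 3600)
```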

process(cube_inputs, qrf_descriptors=None)

Load and apply the trained Quantile Regression Random Forest (QRF) model. The model is used to calibrate the forecast provided, and the calibrated forecast is written to a cube. If no model is provided, the input forecast is returned unchanged.

Parameters:
  • cube_inputs (CubeList) – List of cubes containing the features and the forecast to be calibrated.

  • qrf_descriptors (Optional[tuple[RandomForestQuantileRegressor, str, float]]) – The trained QRF model to be applied to the forecast and the transformation and pre-transform addition applied during training. The descriptors expected are a tuple of: (qrf_model, transformation, pre_transform_addition).

Returns:

The calibrated forecast cube.

Return type:

iris.cube.Cube
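The handling of qrf_descriptors can be sketched as follows. This is an illustration under stated assumptions, not the plugin’s implementation: the transformation names (“log”, “sqrt”), the use of an inverse transformation on the predictions, and the plain callable standing in for the QRF model are all hypothetical.

```python
import math

# Assumed transformation names and their inverses (illustrative only).
FORWARD = {"log": math.log, "sqrt": math.sqrt}
INVERSE = {"log": math.exp, "sqrt": lambda value: value * value}


def forward_transform(values, transformation, pre_transform_addition):
    """Add the offset, then apply the named transformation (if any)."""
    if transformation is None:
        return list(values)
    func = FORWARD[transformation]
    return [func(value + pre_transform_addition) for value in values]


def inverse_transform(values, transformation, pre_transform_addition):
    """Undo the transformation, then remove the offset."""
    if transformation is None:
        return list(values)
    func = INVERSE[transformation]
    return [func(value) - pre_transform_addition for value in values]


def apply_or_passthrough(forecast, qrf_descriptors=None):
    """Return the forecast unchanged when no descriptors are supplied."""
    if qrf_descriptors is None:
        return forecast
    model, transformation, pre_transform_addition = qrf_descriptors
    transformed = forward_transform(forecast, transformation, pre_transform_addition)
    predicted = model(transformed)  # stand-in callable, not a real QRF predict call
    return inverse_transform(predicted, transformation, pre_transform_addition)


raw = [0.0, 3.0]
unchanged = apply_or_passthrough(raw)
round_trip = apply_or_passthrough(raw, (lambda values: values, "log", 1.0))
```

With an identity stand-in model, the forward and inverse steps cancel, which is a quick check that the offset and transformation are applied in a consistent order.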

class RandomForestQuantileRegressor

Bases: object