improver.calibration package#
Submodules#
- improver.calibration.beta_recalibration module
- improver.calibration.dataframe_utilities module
- Ingestion of DataFrames into iris cubes
_dataframe_column_check(), _define_height_coord(), _define_time_coord(), _drop_duplicates(), _ensure_consistent_static_cols(), _fill_missing_entries(), _prepare_dataframes(), _preprocess_temporal_columns(), _training_dates_for_calibration(), _unique_check(), forecast_and_truth_dataframes_to_cubes(), forecast_dataframe_to_cube(), get_forecast_representation(), quantile_check(), truth_dataframe_to_cube()
- improver.calibration.dz_rescaling module
- improver.calibration.emos_calibration module
- Ensemble Model Output Statistics (EMOS)
ApplyEMOS, CalibratedForecastDistributionParameters, CalibratedForecastDistributionParameters.__init__(), CalibratedForecastDistributionParameters._abc_impl, CalibratedForecastDistributionParameters._calculate_location_parameter_from_mean(), CalibratedForecastDistributionParameters._calculate_location_parameter_from_realizations(), CalibratedForecastDistributionParameters._calculate_scale_parameter(), CalibratedForecastDistributionParameters._create_output_cubes(), CalibratedForecastDistributionParameters._diagnostic_match(), CalibratedForecastDistributionParameters._spatial_domain_match(), CalibratedForecastDistributionParameters.process()
ContinuousRankedProbabilityScoreMinimisers, ContinuousRankedProbabilityScoreMinimisers.BAD_VALUE, ContinuousRankedProbabilityScoreMinimisers.TOLERATED_PERCENTAGE_CHANGE, ContinuousRankedProbabilityScoreMinimisers.__init__(), ContinuousRankedProbabilityScoreMinimisers._abc_impl, ContinuousRankedProbabilityScoreMinimisers._calculate_percentage_change_in_last_iteration(), ContinuousRankedProbabilityScoreMinimisers._minimise_caller(), ContinuousRankedProbabilityScoreMinimisers._normal_crps_preparation(), ContinuousRankedProbabilityScoreMinimisers._prepare_forecasts(), ContinuousRankedProbabilityScoreMinimisers._process_points_independently(), ContinuousRankedProbabilityScoreMinimisers._process_points_together(), ContinuousRankedProbabilityScoreMinimisers.calculate_normal_crps(), ContinuousRankedProbabilityScoreMinimisers.calculate_truncated_normal_crps(), ContinuousRankedProbabilityScoreMinimisers.process()
EstimateCoefficientsForEnsembleCalibration, EstimateCoefficientsForEnsembleCalibration.__init__(), EstimateCoefficientsForEnsembleCalibration._abc_impl, EstimateCoefficientsForEnsembleCalibration._add_predictor_coords(), EstimateCoefficientsForEnsembleCalibration._create_cubelist(), EstimateCoefficientsForEnsembleCalibration._create_temporal_coordinates(), EstimateCoefficientsForEnsembleCalibration._get_spatial_associated_coordinates(), EstimateCoefficientsForEnsembleCalibration._set_attributes(), EstimateCoefficientsForEnsembleCalibration._validate_distribution(), EstimateCoefficientsForEnsembleCalibration.compute_initial_guess(), EstimateCoefficientsForEnsembleCalibration.create_coefficients_cubelist(), EstimateCoefficientsForEnsembleCalibration.guess_and_minimise(), EstimateCoefficientsForEnsembleCalibration.mask_cube(), EstimateCoefficientsForEnsembleCalibration.process()
convert_to_realizations(), generate_forecast_from_distribution(), get_attribute_from_coefficients(), get_forecast_type()
- improver.calibration.load_and_apply_quantile_regression_random_forest module
- improver.calibration.load_and_train_quantile_regression_random_forest module
- improver.calibration.quantile_regression_random_forest module
- improver.calibration.rainforest_calibration module
- RainForests calibration
ApplyRainForestsCalibration, ApplyRainForestsCalibration._abc_impl, ApplyRainForestsCalibration._check_num_features(), ApplyRainForestsCalibration._get_feature_splits(), ApplyRainForestsCalibration._get_num_features(), ApplyRainForestsCalibration._parse_model_config(), ApplyRainForestsCalibration.check_filenames(), ApplyRainForestsCalibration.process()
ApplyRainForestsCalibrationLightGBM, ApplyRainForestsCalibrationLightGBM.__init__(), ApplyRainForestsCalibrationLightGBM._abc_impl, ApplyRainForestsCalibrationLightGBM._align_feature_variables(), ApplyRainForestsCalibrationLightGBM._calculate_threshold_probabilities(), ApplyRainForestsCalibrationLightGBM._evaluate_probabilities(), ApplyRainForestsCalibrationLightGBM._get_ensemble_distributions(), ApplyRainForestsCalibrationLightGBM._get_num_features(), ApplyRainForestsCalibrationLightGBM._make_decreasing(), ApplyRainForestsCalibrationLightGBM._prepare_features_array(), ApplyRainForestsCalibrationLightGBM._prepare_threshold_probability_cube(), ApplyRainForestsCalibrationLightGBM.process()
ApplyRainForestsCalibrationTreelite, ModelFileNotFoundError, lightgbm_package_available(), treelite_packages_available()
- improver.calibration.reliability_calibration module
AggregateReliabilityCalibrationTables, ApplyReliabilityCalibration, ApplyReliabilityCalibration.__init__(), ApplyReliabilityCalibration._abc_impl, ApplyReliabilityCalibration._apply_calibration(), ApplyReliabilityCalibration._apply_point_by_point_calibration(), ApplyReliabilityCalibration._calculate_reliability_probabilities(), ApplyReliabilityCalibration._ensure_monotonicity_across_thresholds(), ApplyReliabilityCalibration._extract_matching_reliability_table(), ApplyReliabilityCalibration._interpolate(), ApplyReliabilityCalibration.process()
ConstructReliabilityCalibrationTables, ConstructReliabilityCalibrationTables.__init__(), ConstructReliabilityCalibrationTables._abc_impl, ConstructReliabilityCalibrationTables._add_reliability_tables(), ConstructReliabilityCalibrationTables._create_probability_bins_coord(), ConstructReliabilityCalibrationTables._create_reliability_table_coords(), ConstructReliabilityCalibrationTables._create_reliability_table_cube(), ConstructReliabilityCalibrationTables._define_metadata(), ConstructReliabilityCalibrationTables._define_probability_bins(), ConstructReliabilityCalibrationTables._populate_masked_reliability_bins(), ConstructReliabilityCalibrationTables._populate_reliability_bins(), ConstructReliabilityCalibrationTables.process()
ManipulateReliabilityTable, ManipulateReliabilityTable.__init__(), ManipulateReliabilityTable._abc_impl, ManipulateReliabilityTable._assume_constant_observation_frequency(), ManipulateReliabilityTable._combine_bin_pair(), ManipulateReliabilityTable._combine_undersampled_bins(), ManipulateReliabilityTable._create_new_bin_coord(), ManipulateReliabilityTable._enforce_min_count_and_montonicity(), ManipulateReliabilityTable._extract_reliability_table_components(), ManipulateReliabilityTable._sum_pairs(), ManipulateReliabilityTable._update_reliability_table(), ManipulateReliabilityTable.process()
- improver.calibration.samos_calibration module
- improver.calibration.simple_bias_correction module
- improver.calibration.utilities module
_ceiling_fp(), broadcast_data_to_time_coord(), check_data_sufficiency(), check_forecast_consistency(), check_predictor(), convert_cube_data_to_2d(), convert_parquet_to_cube(), create_unified_frt_coord(), filter_non_matching_cubes(), flatten_ignoring_masked_data(), forecast_coords_match(), get_frt_hours(), merge_land_and_sea(), prepare_cube_no_calibration()
Module contents#
init for calibration that contains functionality to split forecast, truth and coefficient inputs.
- add_feature_from_df_to_df(forecast_df, feature_df, feature_name, possible_merge_columns, float_decimals=4)[source]#
Add a feature to the forecast DataFrame from a second DataFrame based on the feature configuration. Columns within possible_merge_columns that are float are rounded to a specified number of decimal places before merging to avoid precision issues.
- Parameters:
forecast_df (DataFrame) – DataFrame containing the forecast data.
feature_df (DataFrame) – DataFrame containing the feature data.
feature_name (str) – Name of the feature to be added.
possible_merge_columns (list[str]) – List of column names that can be used to merge the feature DataFrame to the forecast DataFrame.
float_decimals (int) – Number of decimal places to round float columns to before merging. Default is 4, which corresponds to rounding to 0.0001.
- Returns:
DataFrame with additional feature added.
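The rounding-before-merge behaviour described for add_feature_from_df_to_df can be illustrated with a minimal pandas sketch. The helper name, the site columns (wmo_id, latitude, longitude) and the altitude feature are illustrative assumptions, not the improver implementation:

```python
import pandas as pd

def merge_feature_with_rounding(forecast_df, feature_df, feature_name,
                                possible_merge_columns, float_decimals=4):
    """Illustrative sketch: round float merge columns, then left-merge the feature."""
    merge_cols = [col for col in possible_merge_columns
                  if col in forecast_df.columns and col in feature_df.columns]
    left = forecast_df.copy()
    right = feature_df[merge_cols + [feature_name]].copy()
    for col in merge_cols:
        if pd.api.types.is_float_dtype(left[col]) and pd.api.types.is_float_dtype(right[col]):
            left[col] = left[col].round(float_decimals)
            right[col] = right[col].round(float_decimals)
    return left.merge(right, on=merge_cols, how="left")

# Hypothetical data: the latitude values differ only by floating-point noise.
forecast_df = pd.DataFrame(
    {"wmo_id": ["03772"], "latitude": [51.47810001], "longitude": [-0.4610], "forecast": [281.2]}
)
feature_df = pd.DataFrame(
    {"wmo_id": ["03772"], "latitude": [51.4781], "longitude": [-0.4610], "altitude": [25.0]}
)
merged = merge_feature_with_rounding(
    forecast_df, feature_df, "altitude", ["wmo_id", "latitude", "longitude"]
)
print(merged["altitude"].tolist())  # [25.0] despite the tiny latitude difference
```

Rounding both sides to float_decimals places before the merge is what keeps near-identical float coordinates from blocking the join.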
- add_static_feature_from_cube_to_df(forecast_df, feature_cube, feature_name, possible_merge_columns, float_decimals=4)[source]#
Add a static feature to the forecast DataFrame from a cube based on the feature configuration. Other features are expected to already be present in the forecast DataFrame. Columns within possible_merge_columns that are floats after converting from a Cube to a DataFrame are rounded to a specified number of decimal places before merging to avoid precision issues.
- Parameters:
forecast_df (DataFrame) – DataFrame containing the forecast data.
feature_cube (Cube) – Cube containing the static feature to be added.
feature_name (str) – Name of the feature to be added.
possible_merge_columns (list[str]) – List of column names that can be used to merge the feature DataFrame to the forecast DataFrame.
float_decimals (int) – Number of decimal places to round float columns to before merging. Default is 4, which corresponds to rounding to 0.0001.
- Return type:
DataFrame
- Returns:
DataFrame with the additional feature added from the input cube.
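Continuing the sketch above, a hedged usage example of add_static_feature_from_cube_to_df might look as follows; the file name, merge columns and feature name are placeholders, and the function is assumed to be importable from improver.calibration as documented on this page:

```python
import iris

from improver.calibration import add_static_feature_from_cube_to_df

# Hypothetical static feature: a per-site surface altitude cube covering the same
# sites as the forecast DataFrame built in the previous sketch.
altitude_cube = iris.load_cube("surface_altitude.nc")
forecast_df = add_static_feature_from_cube_to_df(
    forecast_df,
    altitude_cube,
    feature_name="surface_altitude",
    possible_merge_columns=["wmo_id", "latitude", "longitude", "altitude"],
    float_decimals=4,
)
```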
- get_common_wmo_ids(forecast_cube, truth_cube, additional_predictors=None)[source]#
Extracts the common WMO IDs from the forecast, truth and any additional predictor cubes.
- Parameters:
- Raises:
IOError – If no common WMO IDs are found in the input cubes.
- Return type:
- Returns:
The forecast, truth and additional predictor cubes with only the common WMO IDs retained.
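A minimal sketch of the idea behind get_common_wmo_ids, assuming the site identifiers live on a wmo_id coordinate; this is an illustration of the approach, not necessarily the improver implementation:

```python
import numpy as np
from iris import Constraint

def extract_common_wmo_ids(forecast_cube, truth_cube):
    """Illustrative sketch: keep only the sites whose wmo_id appears in both cubes."""
    common = np.intersect1d(
        forecast_cube.coord("wmo_id").points, truth_cube.coord("wmo_id").points
    )
    if common.size == 0:
        raise IOError("No common WMO IDs are found in the input cubes.")
    constraint = Constraint(coord_values={"wmo_id": lambda cell: cell.point in common})
    return forecast_cube.extract(constraint), truth_cube.extract(constraint)
```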
- get_training_period_cycles(cycletime, forecast_period, training_length)[source]#
Generate a list of forecast reference times for the training period.
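A simplified sketch of what get_training_period_cycles might produce, assuming one training cycle per day preceding the current cycle and the YYYYMMDDTHHMMZ cycletime format used by improver CLIs; the forecast_period argument is omitted here for brevity:

```python
from datetime import datetime, timedelta

def training_cycles(cycletime: str, training_length: int):
    """Illustrative sketch: one forecast reference time per day preceding the cycle."""
    current = datetime.strptime(cycletime, "%Y%m%dT%H%MZ")
    return [current - timedelta(days=day) for day in range(1, training_length + 1)]

print(training_cycles("20240101T0300Z", 3))
# [datetime.datetime(2023, 12, 31, 3, 0), datetime.datetime(2023, 12, 30, 3, 0),
#  datetime.datetime(2023, 12, 29, 3, 0)]
```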
- identify_parquet_type(parquet_paths)[source]#
Determine whether the provided Parquet paths contain forecast or truth data. This is done by checking the columns within the Parquet files for the presence of a forecast_period column, which is only present for forecast data.
- Parameters:
parquet_paths (List[Path]) – A list of paths to Parquet files.
- Returns:
The path to the Parquet file containing the historical forecasts.
The path to the Parquet file containing the truths.
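The column check described above could look like the following pyarrow sketch, assuming each path points at a single Parquet file; the helper is illustrative, not the improver implementation:

```python
import pyarrow.parquet as pq

def identify_forecast_and_truth_paths(parquet_paths):
    """Illustrative sketch: classify paths by checking for a forecast_period column."""
    forecast_path = truth_path = None
    for path in parquet_paths:
        if "forecast_period" in pq.read_schema(path).names:
            forecast_path = path
        else:
            truth_path = path
    return forecast_path, truth_path
```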
- split_cubes_for_samos(cubes, gam_features, truth_attribute=None, expect_emos_coeffs=False, expect_emos_fields=False)[source]#
Function to split the forecast, truth, GAM additional predictor and EMOS additional predictor cubes.
- Parameters:
cubes (CubeList) – A list of input cubes which will be split into relevant groups.
gam_features (List[str]) – A list of strings containing the names of the additional fields required for the SAMOS GAMs.
truth_attribute (Optional[str]) – An attribute and its value in the format of “attribute=value”, which must be present on truth cubes. If None, no truth cubes are expected or returned.
expect_emos_coeffs (bool) – If True, EMOS coefficient cubes are expected to be found in the input cubes. If False, an error will be raised if any such cubes are found.
expect_emos_fields (bool) – If True, additional EMOS fields are expected to be found in the input cubes. If False, an error will be raised if any such cubes are found.
- Raises:
IOError – If no forecast cube is found and/or no truth cube is found when a truth_attribute has been provided.
IOError – If EMOS coefficients cubes are found when they are not expected.
IOError – If additional fields cubes are found which do not match the features in gam_features.
IOError – If probability cubes are provided with more than one name.
- Returns:
A cube containing all the historic forecasts, or None if no such cubes were found.
A cube containing all the truth data, or None if no such cubes were found or no truth_attribute was provided.
A cubelist containing all the additional fields required for the GAMs, or None if no such cubes were found.
A cubelist containing all the EMOS coefficient cubes, or None if no such cubes were found.
A cubelist containing all the additional fields required for EMOS, or None if no such cubes were found.
A cube containing a probability template, or None if no such cube is found.
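A hedged usage sketch of split_cubes_for_samos, unpacking the six return values in the order listed above; the feature names and truth attribute are placeholders:

```python
from improver.calibration import split_cubes_for_samos

# Hypothetical usage: `cubes` is a CubeList holding forecasts, truths, GAM fields
# and (optionally) EMOS inputs.
(
    forecast,
    truth,
    gam_cubes,
    emos_coefficients,
    emos_fields,
    probability_template,
) = split_cubes_for_samos(
    cubes,
    gam_features=["surface_altitude", "distance_to_coast"],
    truth_attribute="mosg__model_configuration=uk_det",
    expect_emos_coeffs=False,
    expect_emos_fields=False,
)
```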
- split_forecasts_and_bias_files(cubes)[source]#
Split the input forecast from the forecast error files used for bias-correction.
- Parameters:
cubes (CubeList) – A list of input cubes which will be split into forecast and forecast errors.
- Return type:
- Returns:
A cube containing the current forecast.
If found, a cube or cubelist containing the bias correction files.
- Raises:
ValueError – If multiple forecast cubes provided, when only one is expected.
ValueError – If no forecast is found.
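A hedged usage sketch, assuming the two return values listed above (the current forecast, then the bias-correction cubes):

```python
from improver.calibration import split_forecasts_and_bias_files

# Hypothetical usage: `cubes` contains one current forecast plus any
# forecast-error (bias) cubes.
forecast, bias_files = split_forecasts_and_bias_files(cubes)
```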
- split_forecasts_and_coeffs(cubes, land_sea_mask_name=None)[source]#
Split the input forecast, coefficients, static additional predictors, land-sea mask and probability template, if provided. The coefficients cubes and land-sea mask are identified based on their name. The static additional predictors are identified as not having a time coordinate. The current forecast and probability template are then split.
- Parameters:
cubes (Union[List[CubeList[Cube]], List[List[Cube]]]) – A list either containing a CubeList or containing a list of input cubes which will be split into relevant groups. This includes the forecast, coefficients, static additional predictors, land-sea mask and probability template.
land_sea_mask_name (Optional[str]) – Name of the land-sea mask cube to help identification.
- Returns:
A cube containing the current forecast.
If found, a cubelist containing the coefficients, else None.
If found, a cubelist containing the static additional predictors, else None.
If found, a land-sea mask will be returned, else None.
If found, a probability template will be returned, else None.
- Raises:
ValueError – If multiple items provided, when only one is expected.
ValueError – If no forecast is found.
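The "no time coordinate" rule used above to identify static additional predictors can be sketched as follows; this illustrates the rule only and is not the improver implementation:

```python
from iris.cube import CubeList

def split_static_predictors(cubes):
    """Illustrative sketch: cubes without a time coordinate are treated as static."""
    static, time_varying = CubeList(), CubeList()
    for cube in cubes:
        (time_varying if cube.coords("time") else static).append(cube)
    return static, time_varying
```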
- split_forecasts_and_truth(cubes, truth_attribute)[source]#
A common utility for splitting the various input cubes required for calibration CLIs. These are generally the forecast cubes and historic truths; in some instances a land-sea mask is also required.
- Parameters:
cubes (List[Cube]) – A list of input cubes which will be split into relevant groups. These include the historical forecasts, in the format supported by the calibration CLIs, and the truth cubes.
truth_attribute (str) – An attribute and its value in the format of “attribute=value”, which must be present on truth cubes.
- Return type:
- Returns:
A cube containing all the historic forecasts.
A cube containing all the truth data.
If found within the input cubes list a land-sea mask will be returned, else None is returned.
- Raises:
ValueError – An unexpected number of distinct cube names were passed in.
IOError – More than one cube was identified as a land-sea mask.
IOError – Missing truth or historical forecast in input cubes.
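A minimal sketch of how the truth_attribute string could be used to separate truth cubes from forecasts; the parsing and matching shown here are assumptions for illustration rather than the improver implementation:

```python
from iris.cube import CubeList

def split_by_truth_attribute(cubes, truth_attribute):
    """Illustrative sketch: separate truths from forecasts via an attribute=value pair."""
    name, _, value = truth_attribute.partition("=")
    truths = CubeList(
        cube for cube in cubes if str(cube.attributes.get(name)) == value
    )
    forecasts = CubeList(
        cube for cube in cubes if all(cube is not truth for truth in truths)
    )
    return forecasts, truths
```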
- split_netcdf_parquet_pickle(files)[source]#
Split the input files into netcdf, parquet, and pickle files. Only a single pickle file is expected.
- Parameters:
files – A list of input file paths which will be split into pickle, parquet, and netcdf files.
- Returns:
A flattened cube list containing all the cubes contained within the provided paths to NetCDF files.
A list of paths to Parquet files.
A loaded pickle file.
- Raises:
ValueError – If multiple pickle files provided, as only one is ever expected.
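A simplified sketch of the file-type split, assuming classification by file suffix; the suffixes and loading calls are assumptions rather than the improver implementation:

```python
import pickle
from pathlib import Path

import iris
from iris.cube import CubeList

def split_by_suffix(files):
    """Illustrative sketch: group input paths by type and load netCDF/pickle inputs."""
    paths = [Path(f) for f in files]
    netcdf_paths = [p for p in paths if p.suffix == ".nc"]
    parquet_paths = [p for p in paths if p.suffix == ".parquet"]
    pickle_paths = [p for p in paths if p.suffix in (".pkl", ".pickle")]
    if len(pickle_paths) > 1:
        raise ValueError("Only one pickle file is expected.")
    cubes = iris.load([str(p) for p in netcdf_paths]) if netcdf_paths else CubeList()
    pickled = pickle.loads(pickle_paths[0].read_bytes()) if pickle_paths else None
    return cubes, parquet_paths, pickled
```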
- validity_time_check(forecast, validity_times)[source]#
Check that the validity time of the forecast matches one of the accepted validity times within the validity times list.
- Parameters:
forecast (Cube) – Cube containing the forecast to be calibrated.
validity_times (List[str]) – Times at which the forecast must be valid. Each time must be provided as a four-digit string (HHMM) where the first two digits represent the hour and the last two digits represent the minutes, e.g. 0300 or 0315. If the forecast provided is at a different validity time then no coefficients will be applied.
- Return type:
- Returns:
If the validity time within the cube matches a validity time within the validity time list, then True is returned. Otherwise, False is returned.
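A minimal sketch of the HHMM comparison described above, assuming the forecast has a scalar time coordinate; not necessarily the improver implementation:

```python
from iris.cube import Cube

def validity_time_matches(forecast: Cube, validity_times) -> bool:
    """Illustrative sketch: compare the cube's validity time (HHMM) to the allowed list."""
    validity_time = forecast.coord("time").cell(0).point
    return validity_time.strftime("%H%M") in validity_times

# Hypothetical usage: True only for forecasts valid at 03:00 or 03:15.
# validity_time_matches(forecast_cube, ["0300", "0315"])
```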