simeasren.pv_analysis.metrics

Functions

calculate_error_metrics(data_sim_meas, location_name, plot_palette=None, exclude_non_palette=True)

Calculate PV simulation error metrics relative to measured data.

Module Contents

simeasren.pv_analysis.metrics.calculate_error_metrics(data_sim_meas, location_name, plot_palette=None, exclude_non_palette=True)

Calculate PV simulation error metrics relative to measured data.

This function computes three statistical error metrics for each simulation tool relative to the measured PV data at a specific location: Mean Difference, Mean Absolute Error (MAE), and Root Mean Square Error (RMSE). Each metric is returned as a list of dictionaries, one per tool, ready for plotting or further analysis.

Parameters:
  • data_sim_meas (pandas.DataFrame) – DataFrame containing measured and simulated PV data for a given location. Columns should include one “PV-MEAS” column for measured data and one or more simulation tool columns (e.g., “Turin PV-SIM1”).

  • location_name (str) – Name of the location (e.g., “Turin”) used to filter relevant columns and label outputs.

  • plot_palette (dict, optional) – Dictionary mapping simulation tool names to colors or labels for plotting. If None, no filtering is applied based on the palette.

  • exclude_non_palette (bool, optional (default=True)) – If True, only simulation tools listed in plot_palette are included. If False, all simulation columns are processed regardless of the palette.
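The interaction between plot_palette and exclude_non_palette can be sketched as follows. This is only an illustration of the behavior described above, not the library's implementation: the helper name select_tool_columns is hypothetical, and it assumes that palette keys match the tool name taken from the second whitespace-separated part of the column name and that the measured column contains the text "PV-MEAS".

```python
def select_tool_columns(columns, plot_palette=None, exclude_non_palette=True):
    """Hypothetical helper: choose which simulation columns to process."""
    selected = []
    for col in columns:
        if "PV-MEAS" in col:
            continue  # measured data is the reference, not a simulation tool
        parts = col.split()
        if not parts:
            continue  # skip blank column names
        # Assumption: the tool name is the second part, e.g. "Turin PV-SIM1" -> "PV-SIM1"
        tool = parts[1] if len(parts) > 1 else parts[0]
        # Filter only when a palette is given and exclusion is requested
        if plot_palette is not None and exclude_non_palette and tool not in plot_palette:
            continue
        selected.append(col)
    return selected


columns = ["PV-MEAS", "Turin PV-SIM1", "Turin PV-SIM2"]
print(select_tool_columns(columns))                         # no palette: all tools
print(select_tool_columns(columns, {"PV-SIM1": "red"}))     # palette filter applied
print(select_tool_columns(columns, {"PV-SIM1": "red"}, exclude_non_palette=False))
```

With no palette, or with exclude_non_palette=False, every simulation column is kept; only the combination of a palette and exclude_non_palette=True narrows the selection.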

Returns:

A tuple (mean_diff_results, mae_results, rmse_results) where each element is a list of dictionaries with the following structure:

  • “Location” (str): name of the location

  • “Tool” (str): simulation tool name

  • “Mean Difference (%)” / “MAE (%)” / “RMSE (%)” (float): computed metric; each list carries the key for its own metric
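Because each result is a list of flat dictionaries, it converts directly into a pandas DataFrame. The sketch below uses made-up tool names and values purely to show the shape of the data; it does not call the library.

```python
import pandas as pd

# Illustrative values only, shaped like the documented return structure
mean_diff_results = [
    {"Location": "Turin", "Tool": "TOOL-A", "Mean Difference (%)": 1.5},
    {"Location": "Turin", "Tool": "TOOL-B", "Mean Difference (%)": -0.5},
]
mae_results = [
    {"Location": "Turin", "Tool": "TOOL-A", "MAE (%)": 4.0},
    {"Location": "Turin", "Tool": "TOOL-B", "MAE (%)": 3.0},
]

# One row per tool, one column per metric
summary = pd.DataFrame(mean_diff_results).merge(
    pd.DataFrame(mae_results), on=["Location", "Tool"]
)
print(summary)
```

Merging on "Location" and "Tool" keeps the metrics aligned per tool even if the lists are ordered differently.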

Return type:

tuple of lists

Raises:

None – If the ‘PV-MEAS’ column is missing, the function returns empty lists instead of raising. Simulation columns with insufficient data (empty after NaN removal) are skipped.

Notes

  • Metrics are calculated in percent (%) by multiplying the raw value by 100.

  • The function aligns indices of simulated and measured data to handle missing values.

  • Tool names are extracted from column names by splitting at whitespace and using the second part if available.
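The notes above can be sketched for a single simulation column as follows. This is a minimal illustration under stated assumptions, not the library's code: the helper name sketch_metrics is hypothetical, the difference is taken as simulated minus measured, and the input series are assumed to be already normalized (for example as capacity factors) so that multiplying by 100 yields percent.

```python
import numpy as np
import pandas as pd


def sketch_metrics(sim: pd.Series, meas: pd.Series):
    """Hypothetical helper: error metrics for one tool, in percent."""
    # Align on the index and keep only rows where both values are present
    paired = pd.concat([sim, meas], axis=1, keys=["sim", "meas"]).dropna()
    if paired.empty:
        return None  # insufficient data: the caller would skip this column
    diff = paired["sim"] - paired["meas"]  # assumed sign convention
    return {
        "Mean Difference (%)": float(diff.mean() * 100),
        "MAE (%)": float(diff.abs().mean() * 100),
        "RMSE (%)": float(np.sqrt((diff ** 2).mean()) * 100),
    }


sim = pd.Series([0.5, 0.25, np.nan], index=[0, 1, 2])
meas = pd.Series([0.25, 0.5, 0.75], index=[0, 1, 2])
print(sketch_metrics(sim, meas))
```

In this toy input the third row is dropped because the simulated value is missing, and the two remaining errors cancel in the mean difference while still contributing to MAE and RMSE.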

Examples

>>> from simeasren import calculate_error_metrics, prepare_pv_data_for_plots
>>> data_sim_meas, _, _ = prepare_pv_data_for_plots("Turin", "2019")
>>> mean_diff, mae, rmse = calculate_error_metrics(
...     data_sim_meas=data_sim_meas,
...     location_name="Turin"
... )
>>> mean_diff[0]
{'Location': 'Turin', 'Tool': 'PG2-SARAH2', 'Mean Difference (%)': np.float64(1.1827557468101797)}