Model stats
Training accuracy and diagnostic plots
Model error trend over time

Cross-validation RMSE for each forecast run (5-fold). The line is the mean across folds; the band is ±1 standard deviation. A downward trend suggests the model is improving as more data is collected.
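
As a rough illustration of how the line and band could be derived, the sketch below computes an RMSE per fold and then the mean and ±1 standard deviation across folds. The function names are hypothetical, and using the population standard deviation is an assumption; the dashboard may use the sample form instead.

```python
import math
import statistics


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def fold_band(fold_rmses):
    """Return (lower, mean, upper) for a band of +/- 1 standard deviation.

    Assumption: population standard deviation across the folds.
    """
    mean = statistics.fmean(fold_rmses)
    sd = statistics.pstdev(fold_rmses)
    return mean - sd, mean, mean + sd


# Hypothetical RMSE values for the 5 folds of one forecast run.
lower, mean, upper = fold_band([1.0, 2.0, 3.0, 4.0, 5.0])
```

One `(lower, mean, upper)` triple per forecast run would give one point on the line and one slice of the band.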

Forecast error heatmap — last 7 days

Actual Agile price is shown in yellow; thin grey lines are historical forecasts. The heatmap shows how forecast error changes by forecast date; warm colours indicate larger errors.

Diagnostic plots — last 30 days
1441 half-hour slots · 31 May 2026 – 30 Jun 2026
Actual vs Predicted Over Time

Actual Agile prices (amber) versus the spread of all forecasts made in the last 30 days. Red line is the median forecast; the band shows the P10–P90 range across all forecast runs. Narrower bands near recent dates reflect shorter lead times.
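
For one half-hour slot, the median line and P10–P90 band could be computed from every forecast made for that slot, roughly as sketched below. The function name is hypothetical and the interpolation method is an assumption; the dashboard may compute percentiles differently.

```python
import statistics


def forecast_band(forecasts):
    """Return (p10, median, p90) of all forecasts made for a single slot."""
    values = sorted(float(v) for v in forecasts)
    if not values:
        raise ValueError("at least one forecast is required")
    if len(values) == 1:
        # A single forecast run has no spread to report.
        return values[0], values[0], values[0]
    # n=10 yields the nine decile cut points; the first is P10, the last P90.
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return deciles[0], statistics.median(values), deciles[-1]


# Hypothetical forecasts (p/kWh) for one slot from eleven different runs.
p10, median, p90 = forecast_band(range(11))
```

Slots that were forecast only a few times, typically the most recent ones, have fewer values feeding the band.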

Prediction vs Actual Scatter

Each dot is one half-hour slot. Points on the diagonal indicate a perfect prediction. Colour shows forecast lead time — dots from short-lead forecasts should cluster closer to the diagonal.

Residuals Distribution

Distribution of forecast errors. A peak near zero with low spread indicates accurate, unbiased predictions. Positive values mean the actual price exceeded the forecast.
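
The sign convention above corresponds to residual = actual − forecast. A minimal sketch with hypothetical function names, summarising bias as the mean residual and spread as the standard deviation (population form assumed):

```python
import statistics


def residuals(actual, forecast):
    """Residual per slot; positive means the actual price exceeded the forecast."""
    if len(actual) != len(forecast):
        raise ValueError("inputs must be the same length")
    return [a - f for a, f in zip(actual, forecast)]


def summarise(res):
    """Return (bias, spread): the mean residual and its standard deviation."""
    return statistics.fmean(res), statistics.pstdev(res)


# Hypothetical prices: the first slot is under-forecast, the second over-forecast.
res = residuals([10.0, 20.0], [8.0, 25.0])
bias, spread = summarise(res)
```

A bias near zero with a small spread matches the "peak near zero with low spread" described above.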

Forecast Error by Horizon

Forecast error distribution grouped by lead time. Boxes show the interquartile range; the diamond spans ±1 standard deviation around the mean. Wider boxes at longer lead times indicate a larger spread of errors, and so less certainty, further ahead.
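
The grouping behind a plot like this can be sketched as follows. Bucketing by whole days of lead time is an assumption, as are the function name and the record layout; the actual horizon buckets are not stated here.

```python
import statistics
from collections import defaultdict


def error_by_horizon(records):
    """Summarise errors per lead-time bucket.

    records: iterable of (lead_days, error) pairs.
    Assumption: buckets are whole days of lead time (fractions truncated).
    """
    groups = defaultdict(list)
    for lead_days, error in records:
        groups[int(lead_days)].append(float(error))

    summary = {}
    for lead, errs in sorted(groups.items()):
        if len(errs) < 2:
            q1 = med = q3 = errs[0]
        else:
            q1, med, q3 = statistics.quantiles(errs, n=4, method="inclusive")
        summary[lead] = {
            "q1": q1,
            "median": med,
            "q3": q3,
            "mean": statistics.fmean(errs),
            "std": statistics.pstdev(errs),
        }
    return summary


# Hypothetical errors: five at a 1-day lead, one at a 2-day lead.
demo = error_by_horizon([(1, e) for e in (1, 2, 3, 4, 5)] + [(2, 7.0)])
```

The box for each bucket would span `q1` to `q3`, with `mean` and `std` available for a standard-deviation marker.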

Feature Importance

Average normalised feature importance across the three ensemble models (CatBoost, LightGBM, ExtraTrees). Higher bars indicate features that most strongly drive the predicted price. Updated each time the model retrains.
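
One plausible way to combine importances from models that report them on different scales is to normalise each model's values to sum to 1 and then average per feature. The sketch below assumes that scheme; the feature names and raw values are made up for illustration and the real normalisation may differ.

```python
def normalise(importances):
    """Scale one model's importances so they sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")
    return {name: value / total for name, value in importances.items()}


def average_importance(per_model):
    """Average normalised importances across models, feature by feature.

    A feature missing from a model counts as 0 for that model.
    """
    normalised = [normalise(m) for m in per_model.values()]
    features = sorted({name for m in normalised for name in m})
    return {
        name: sum(m.get(name, 0.0) for m in normalised) / len(normalised)
        for name in features
    }


# Hypothetical raw importances; each model uses its own scale.
avg = average_importance({
    "CatBoost": {"wind": 2.0, "demand": 2.0},
    "LightGBM": {"wind": 3.0, "demand": 1.0},
    "ExtraTrees": {"wind": 1.0, "demand": 3.0},
})
```

Normalising first keeps a model with large raw importance values from dominating the average.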

Feature set experiment
Last run: 25 Jun 2026 13:19 UTC

Walk-forward cross-validation (5 folds, 21-day train / 3-day test) across 10 candidate feature sets, scored on weighted MAE + RMSE with near-term forecasts upweighted (≤3 days at 3×, ≤7 days at 2×). The winning set — fr_weather — is highlighted in green and is used for all subsequent runs until the next experiment (scheduled every 14 days).
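
A minimal sketch of the fold layout and scoring described above. Several details are assumptions: that test windows are consecutive and non-overlapping, that each is preceded immediately by its 21-day training window, and that the score is a weighted MAE plus a weighted RMSE. All function names are hypothetical.

```python
import math
from datetime import date, timedelta


def walk_forward_folds(end, n_folds=5, train_days=21, test_days=3):
    """Return (train_start, test_start, test_end) per fold, oldest first.

    test_end is exclusive. Assumption: test windows sit back to back.
    """
    folds = []
    for k in range(n_folds):
        test_end = end - timedelta(days=k * test_days)
        test_start = test_end - timedelta(days=test_days)
        train_start = test_start - timedelta(days=train_days)
        folds.append((train_start, test_start, test_end))
    return folds[::-1]


def lead_weight(lead_days):
    """Upweight near-term forecasts: <=3 days at 3x, <=7 days at 2x, else 1x."""
    if lead_days <= 3:
        return 3.0
    if lead_days <= 7:
        return 2.0
    return 1.0


def weighted_score(actual, predicted, lead_days):
    """Weighted MAE + weighted RMSE (assumed combination); lower is better."""
    weights = [lead_weight(d) for d in lead_days]
    total = sum(weights)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(w * abs(e) for w, e in zip(weights, errors)) / total
    rmse = math.sqrt(sum(w * e * e for w, e in zip(weights, errors)) / total)
    return mae + rmse


folds = walk_forward_folds(date(2026, 6, 25))
score = weighted_score([10.0, 10.0], [9.0, 12.0], [1, 10])
```

Under this scheme, each candidate feature set would be scored on every fold and the set with the lowest aggregate score would be selected.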