Agile Predict

Model detail

Gradient boosting ensemble

Agile Predict uses an equal-weight average of three tree-ensemble models: two gradient-boosted models (CatBoost and LightGBM) and one randomised-forest model (Extra Trees). Each model makes different kinds of errors on unseen data; averaging them cancels a portion of those errors and produces a more stable forecast than any individual model.
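The equal-weight average can be illustrated with a minimal sketch. The function name `ensemble_average` and the example prices are hypothetical and not taken from the project; the sketch assumes each model has already produced one prediction per half-hour slot.

```python
from statistics import fmean
from typing import Sequence


def ensemble_average(member_predictions: Sequence[Sequence[float]]) -> list[float]:
    """Equal-weight average of per-slot predictions from several models."""
    lengths = {len(p) for p in member_predictions}
    if len(lengths) != 1:
        raise ValueError("all models must predict the same slots")
    # zip(*...) walks the slots; fmean gives every model the same weight.
    return [fmean(slot) for slot in zip(*member_predictions)]


# Made-up p/kWh predictions for three half-hour slots (illustration only).
catboost_pred = [18.0, 24.0, -1.0]
lightgbm_pred = [20.0, 21.0, 2.0]
extra_trees_pred = [19.0, 27.0, -4.0]

forecast = ensemble_average([catboost_pred, lightgbm_pred, extra_trees_pred])
print(forecast)
```

In the third slot the models disagree on sign; the average lands between them, which is the stabilising effect described above.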

CatBoost

Gradient-boosted trees with a symmetric (oblivious) tree structure and native handling of missing values. Tends to be robust without extensive hyperparameter tuning.

LightGBM

Leaf-wise gradient boosted trees. Fast to train and accurate on tabular regression, with native support for missing values. Complements CatBoost with a different tree-growth strategy.

Extra Trees

Extremely Randomised Trees: at each split, a random subset of about √n of the n features is considered, and the split threshold for each candidate feature is drawn at random rather than optimised. This diversifies the ensemble and helps discover unusual feature combinations that drive extreme prices.

Training

The model is retrained each time an update runs (typically 4× daily). Each training example pairs a set of input features at a specific half-hour slot with the actual Agile price realised at that time.

Window: the most recent 90 days of historical forecasts.
Selection: one forecast per day (the run closest to 16:15) is used for training, matching the day-ahead auction publication schedule.
Sample weights: training examples where the actual price was far from the period mean are up-weighted. The weight is max(1, |z|), where z is the distance of the actual price from the period mean in standard deviations, so examples within one standard deviation keep a weight of 1. This pushes the models to predict price spikes and negative prices more accurately, because errors on extreme slots are the most consequential for Agile tariff users.
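The selection and weighting rules above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the function names are hypothetical, run times are treated as naive datetimes in a single timezone, and the population standard deviation is used because the source does not say which estimator applies.

```python
from datetime import datetime, time
from statistics import fmean, pstdev


def select_daily_runs(run_times: list[datetime], target: time = time(16, 15)) -> list[datetime]:
    """Keep one forecast run per calendar day: the one closest to `target`."""
    best: dict = {}
    for run in run_times:
        day = run.date()
        gap = abs(run - datetime.combine(day, target))
        if day not in best or gap < best[day][0]:
            best[day] = (gap, run)
    return [best[day][1] for day in sorted(best)]


def tail_weights(actual_prices: list[float]) -> list[float]:
    """Weight each example by max(1, |z|), with z measured against the window mean."""
    mean = fmean(actual_prices)
    std = pstdev(actual_prices)  # assumption: population standard deviation
    if std == 0:
        return [1.0] * len(actual_prices)
    return [max(1.0, abs(p - mean) / std) for p in actual_prices]


# Made-up run times and prices (illustration only).
runs = [
    datetime(2024, 1, 1, 4, 15), datetime(2024, 1, 1, 10, 15),
    datetime(2024, 1, 1, 16, 15), datetime(2024, 1, 1, 22, 15),
    datetime(2024, 1, 2, 10, 20), datetime(2024, 1, 2, 16, 40),
    datetime(2024, 1, 2, 22, 15),
]
selected = select_daily_runs(runs)
weights = tail_weights([10.0, 10.0, 10.0, 10.0, 60.0])
print(selected, weights)
```

The resulting weights could then be passed to each model as per-example sample weights at fit time.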

Cross-validation RMSE (5-fold) is computed after training and shown on the Stats page trend chart.

Feature selection

Not all available inputs are equally useful. Every 14 days (or on demand) an automated experiment evaluates the candidate feature sets using walk-forward cross-validation:

Structure: 5 folds of 21-day train / 3-day test, working backwards from the most recent data.
Scoring: each fold computes weighted MAE and weighted RMSE, then averages them.
Horizon weights: slots ≤3 days ahead get 3×, ≤7 days get 2×, beyond get 1×.
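Tail weights: each slot's weight is additionally multiplied by max(1, |z|), where z is the z-score of the actual price, so slots more than one standard deviation from the mean count for more. The experiment therefore rewards feature sets that predict extreme prices, not just typical days.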
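The fold layout and scoring described above might look like the sketch below. Several details are assumptions, since the source does not spell them out: each fold steps back by one test window, windows are half-open date ranges, and "averages them" is read as the mean of the weighted MAE and weighted RMSE within a fold. All function names are hypothetical.

```python
from datetime import date, timedelta
from math import sqrt


def walk_forward_folds(latest: date, n_folds: int = 5, train_days: int = 21, test_days: int = 3):
    """Return (train_start, test_start, test_end) per fold, newest fold first.

    Train covers [train_start, test_start); test covers [test_start, test_end).
    """
    folds = []
    test_end = latest
    for _ in range(n_folds):
        test_start = test_end - timedelta(days=test_days)
        train_start = test_start - timedelta(days=train_days)
        folds.append((train_start, test_start, test_end))
        test_end = test_start  # assumption: step back by one test window
    return folds


def horizon_weight(days_ahead: float) -> float:
    """3x for slots up to 3 days ahead, 2x up to 7 days, 1x beyond."""
    if days_ahead <= 3:
        return 3.0
    if days_ahead <= 7:
        return 2.0
    return 1.0


def fold_score(actual, predicted, days_ahead, z_scores) -> float:
    """Mean of weighted MAE and weighted RMSE for one fold (one possible reading)."""
    weights = [horizon_weight(d) * max(1.0, abs(z)) for d, z in zip(days_ahead, z_scores)]
    errors = [a - p for a, p in zip(actual, predicted)]
    total = sum(weights)
    mae = sum(w * abs(e) for w, e in zip(weights, errors)) / total
    rmse = sqrt(sum(w * e * e for w, e in zip(weights, errors)) / total)
    return (mae + rmse) / 2


folds = walk_forward_folds(date(2024, 6, 30))
# Made-up values: a near-term slot with a 2 p/kWh miss and a far, extreme slot predicted exactly.
score = fold_score([10.0, 20.0], [12.0, 20.0], days_ahead=[1, 10], z_scores=[0.5, 2.0])
print(folds[0], folds[-1], score)
```

A candidate feature set would be scored on every fold and the fold scores combined; the set with the lowest combined score wins.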

The winning set is stored in the database and applied to all subsequent runs until the next experiment. Results are shown on the Stats page.

Fixed and experimental features

Fixed base — always included

`opmr_surplus`	NESO operating margin reserve surplus (MW). When it is negative, the grid is in deficit and prices tend to spike. Included unconditionally: the signal is physically causal but rare enough that the CV experiment cannot score it reliably.
`bm_wind`	Metered wind generation
`solar`	Embedded solar generation
`emb_wind`	Embedded wind generation
`demand`	National electricity demand
`peak`	Flag for 16:00–19:00 pricing window
`weekend`	Weekend flag
`bank_holiday`	England & Wales public bank holiday (GOV.UK calendar)
`days_ago`	Age of the forecast at prediction time

Experimental — selected by CV

The experiment evaluates which optional features to add on top of the fixed base. The current candidate sets and their scores are shown on the Stats page. Candidate optional features include:

`fr_wind`	French 10 m wind speed (Open-Meteo)
`fr_rad`	French solar radiation (Open-Meteo)
`fr_nuclear`	French nuclear generation (ENTSO-E)
`wind_10m`	UK 10 m wind speed (Open-Meteo)
`temp_2m`	UK 2 m temperature (Open-Meteo)
`rad`	UK solar radiation (Open-Meteo)
`nuclear`	UK nuclear availability (BMRS)
`gas_ttf`	TTF natural gas futures (Yahoo Finance)

Confidence bands (p10–p90)

The shaded band represents a p10–p90 interval: the actual price should fall inside it for roughly 80% of half-hour slots. It is constructed from two independent components that are then combined.

Empirical residuals

Held-out historical predictions are collected over the training window and their errors (actual − predicted) are binned by forecast horizon: 6 h, 12 h, 24 h, 36 h, 48 h, and beyond. The p10 and p90 of the residuals in each bin set a baseline interval that grows with look-ahead distance.
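A minimal sketch of the binning step, assuming the listed horizons are inclusive upper edges of each bin and that deciles are computed with the standard library's `statistics.quantiles` (the project may use a different percentile method). The names `residual_bands`, `BIN_EDGES_H`, and `BIN_LABELS` are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import quantiles

# Upper edges of the horizon bins in hours; anything past 48 h falls in the last bin.
BIN_EDGES_H = [6, 12, 24, 36, 48]
BIN_LABELS = ["<=6h", "<=12h", "<=24h", "<=36h", "<=48h", ">48h"]


def residual_bands(records):
    """records: iterable of (horizon_hours, actual, predicted).

    Returns {bin_label: (p10, p90)} of the residual actual - predicted.
    """
    by_bin = defaultdict(list)
    for horizon_h, actual, predicted in records:
        label = BIN_LABELS[bisect_left(BIN_EDGES_H, horizon_h)]
        by_bin[label].append(actual - predicted)
    bands = {}
    for label, residuals in by_bin.items():
        if len(residuals) < 2:
            continue  # too few points to estimate percentiles
        deciles = quantiles(residuals, n=10, method="inclusive")
        bands[label] = (deciles[0], deciles[-1])  # first and last decile = p10, p90
    return bands


# Made-up residuals: eleven 3-hour-ahead errors of 0..10 and a single 60-hour one.
demo = [(3, float(i), 0.0) for i in range(11)] + [(60, 5.0, 4.0)]
bands = residual_bands(demo)
print(bands)
```

Adding a bin's p10 and p90 residuals to the point forecast for a slot at that horizon gives the baseline interval in price units.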

Weather ensemble

The ICON seamless NWP ensemble provides 10 alternative wind, solar, and temperature scenarios. Each member is run through the price model; the p10 and p90 of the resulting price predictions widen the band on days where weather is uncertain.
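As an illustration, the per-slot spread across ensemble members could be computed as below. The sketch assumes the price model has already been run once per weather member, so its input is one list of predicted prices per member; `ensemble_band` is a hypothetical name and the percentile method is an assumption.

```python
from statistics import quantiles


def ensemble_band(member_prices):
    """member_prices: one list of per-slot price predictions per weather member.

    Returns a (p10, p90) pair per slot, taken across members.
    """
    band = []
    for slot_prices in zip(*member_prices):
        deciles = quantiles(slot_prices, n=10, method="inclusive")
        band.append((deciles[0], deciles[-1]))
    return band


# Made-up predictions from 10 members for two slots: the members agree on
# the first slot and disagree on the second.
members = [[20.0, 10.0 + m] for m in range(10)]
weather_band = ensemble_band(members)
print(weather_band)
```

The first slot collapses to a zero-width interval while the second spans several p/kWh, which is how weather uncertainty widens the band.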

The two intervals are merged by taking the minimum of the lower bounds and maximum of the upper bounds. The final band is smoothed with a 3-period rolling average and always encloses the point forecast.
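The merge, smoothing, and enclosure steps can be sketched as follows. This assumes both bands are already expressed in price units and aligned slot by slot, that the rolling average is centred with shorter windows at the edges, and that enclosure is enforced by clamping after smoothing; the source does not confirm these details, and the function names are hypothetical.

```python
def rolling_mean(values, window=3):
    """Centred rolling mean; edge slots average over the neighbours that exist."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def merge_bands(residual_band, weather_band, point_forecast):
    """Each band is a list of (lower, upper) per slot, aligned with point_forecast."""
    lower = [min(r[0], w[0]) for r, w in zip(residual_band, weather_band)]
    upper = [max(r[1], w[1]) for r, w in zip(residual_band, weather_band)]
    lower, upper = rolling_mean(lower), rolling_mean(upper)
    # Smoothing can pull a bound across the point forecast, so clamp afterwards.
    return [(min(lo, p), max(hi, p)) for lo, hi, p in zip(lower, upper, point_forecast)]


# Made-up bands and point forecast for four slots (illustration only).
residual = [(8.0, 12.0)] * 4
weather = [(9.0, 11.0), (7.0, 13.0), (9.0, 11.0), (9.0, 11.0)]
point = [10.0, 10.0, 14.0, 10.0]
final_band = merge_bands(residual, weather, point)
print(final_band)
```

In the third slot the smoothed upper bound falls below the point forecast of 14.0, so the clamp raises it to keep the forecast inside the band.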