cP2O: Context-Aware Water Level Forecasting

Hybrid context + dilated LSTM with attention for 4–6 hour WWTP forecasts
Md Nazmul Kabir Sikder¹, Feras A. Batarseh²
¹Virginia Tech, Bradley Department of ECE (CCI)   |   ²Virginia Tech, Biological Systems Engineering
📄 Paper (PDF) 💻 Code (GitHub)

Abstract

Wastewater utilities need accurate 4–6 hour forecasts to manage pumping, chemical dosing, and energy use during extreme weather. We propose cP2O, a context-driven forecasting architecture that fuses exogenous signals (weather, river flow, demographics, economic activity) with internal plant data. A two-stage pipeline performs dynamic context extraction followed by hierarchical dilated LSTM forecasting with attention and quantile loss, producing point forecasts and calibrated prediction intervals. On two full-scale utilities (DC Water tunnel levels, AlexRenew nitrate), cP2O cuts MAPE by 22% and 19%, respectively, relative to strong baselines, and its 90% prediction intervals cover 90.5% ± 3.2% of observations (5.9% fall below the interval, 3.6% above).

Research Questions

  • Does adding external context improve 4–6h WWTP forecasts vs. internal-only models?
  • Can one architecture adapt to multiple utilities (tunnel levels, nitrate) with high accuracy?
  • Does attention help dynamically weight inputs for better forecasts?
  • Does quantile loss reduce peak-event bias and yield reliable prediction intervals?

Contributions

Hybrid architecture

Dynamic context extractor + dilated LSTMs with attention to capture short- and long-range temporal structure without heavy preprocessing.

Context integration

Weather, river, demographic, and economic signals shape forecasts; quantile loss reduces bias during peaks.

Uncertainty-aware

Predicts point values and prediction intervals for operational risk decisions in SCADA workflows.

Real deployments

Validated on DC Water tunnel levels and AlexRenew nitrate forecasting with consistent accuracy gains.

Method Overview

The pipeline aligns context and utility data, applies dynamic smoothing, and feeds the concatenated features to a 3-layer dilated LSTM (dilations 1/2/4) with an internal attention gate. Training with quantile pinball loss yields median forecasts and 90% prediction intervals while mitigating peak bias. An ensemble variant (cP2Oe) averages the forecasts of its members for robustness.
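To make the loss concrete, here is a minimal sketch of the pinball loss for a quantile level q, L_q(y, ŷ) = max(q(y − ŷ), (q − 1)(y − ŷ)), averaged over samples and over the three levels listed in the setup (0.05/0.5/0.95). The function names and the equal weighting across quantiles are assumptions for illustration, not the authors' implementation.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        diff = y - y_hat
        # Under-prediction (diff > 0) is weighted by q,
        # over-prediction (diff < 0) by (1 - q).
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


def multi_quantile_loss(y_true, preds_by_quantile):
    """Average the pinball loss over several quantile heads (equal weights assumed)."""
    losses = [pinball_loss(y_true, preds, q) for q, preds in preds_by_quantile.items()]
    return sum(losses) / len(losses)


# At q = 0.95, missing a peak from below costs far more than overshooting it.
under = pinball_loss([10.0], [8.0], 0.95)   # forecast 2 units too low
over = pinball_loss([10.0], [12.0], 0.95)   # forecast 2 units too high
```

The asymmetry is what links the loss to peak bias: the 0.95 head is pushed upward whenever peaks are under-predicted, while the 0.5 head stays a median forecast.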

Figure: Two-stage context extraction and forecasting pipeline for cP2O. Stage 1 extracts context vectors; Stage 2 performs multi-horizon forecasting with attention and dilated LSTMs.

Figure: Dilated LSTM cell. The cell mixes recent and delayed states, and attention modulates the inputs by feature saliency.
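The recurrence can be sketched in plain Python. This is an illustration under assumptions, not the authors' cell: gates read the attended input together with h[t−1] (recent) and h[t−d] (delayed), the cell state is carried from c[t−d], and attention is a softmax over input features. Weights are random and untrained, and the names `DilatedLSTMLayer` and `run_stack` are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weights, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


class DilatedLSTMLayer:
    """LSTM-style layer whose gates see h[t-1] (recent) and h[t-d] (delayed)."""

    def __init__(self, n_in, n_hidden, dilation, rng):
        self.n_hidden = n_hidden
        self.dilation = dilation
        n_cat = n_in + 2 * n_hidden  # input + recent state + delayed state

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        # One weight matrix per gate: input, forget, output, candidate.
        self.gates = {name: init(n_hidden, n_cat) for name in ("i", "f", "o", "g")}
        self.attention = init(n_in, n_cat)  # one attention score per input feature

    def run(self, inputs):
        zeros = [0.0] * self.n_hidden
        hs, cs = [], []
        for t, x in enumerate(inputs):
            h_recent = hs[t - 1] if t >= 1 else zeros
            h_delayed = hs[t - self.dilation] if t >= self.dilation else zeros
            c_delayed = cs[t - self.dilation] if t >= self.dilation else zeros
            # Attention gate: softmax weights over input features, scaled so that
            # uniform weights would leave the input unchanged.
            att = softmax(matvec(self.attention, list(x) + h_recent + h_delayed))
            x_att = [len(x) * a * xi for a, xi in zip(att, x)]
            z = x_att + h_recent + h_delayed
            i = [sigmoid(v) for v in matvec(self.gates["i"], z)]
            f = [sigmoid(v) for v in matvec(self.gates["f"], z)]
            o = [sigmoid(v) for v in matvec(self.gates["o"], z)]
            g = [math.tanh(v) for v in matvec(self.gates["g"], z)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_delayed, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
            hs.append(h)
            cs.append(c)
        return hs


def run_stack(inputs, n_hidden=4, dilations=(1, 2, 4), seed=0):
    """Stack layers with dilations 1/2/4; each layer feeds the next."""
    rng = random.Random(seed)
    sequence = inputs
    for d in dilations:
        layer = DilatedLSTMLayer(len(sequence[0]), n_hidden, d, rng)
        sequence = layer.run(sequence)
    return sequence


# A 48-step window with two toy features.
series = [[math.sin(0.1 * t), math.cos(0.1 * t)] for t in range(48)]
hidden = run_stack(series)
```

Larger dilations let upper layers skip over intermediate steps, so a 3-layer stack reaches further back in time than three ordinary layers of the same size would.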

Datasets & Context

DC Water (Blue Plains): tunnel level forecasting under flash-flood and coastal surge events; rain gauges, pumps, flow sensors, NOAA storm context, river levels.

AlexRenew: nutrient forecasting (pH, ammonia, nitrate) with weather, influent flow, and operational context.

Context signals used in cP2O
Group                  |   Examples                          |   Role
Weather                |   Rainfall, temp, humidity, wind    |   Drives inflow surges and dilution
River/Ocean            |   Flow, stage/tide                  |   Backflow risk and boundary conditions
Demographic/Economic   |   Usage intensity proxies           |   Baseline demand patterns
WWTP sensors           |   Levels, pumps, flows, chemistry   |   Core autoregressive signal
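Context sources such as weather feeds and river gauges rarely share the plant's sampling clock, so they need to be aligned before concatenation. One common approach, sketched below as an assumption about how such alignment could work rather than the authors' procedure, is an as-of join: each plant timestamp takes the latest context value observed at or before it. The function name and sample timestamps are hypothetical.

```python
from bisect import bisect_right


def align_asof(plant_times, context_times, context_values):
    """For each plant timestamp, return the latest context value at or before it.

    context_times must be sorted ascending. Plant timestamps that precede the
    first context observation get None, so no future information leaks in.
    """
    aligned = []
    for t in plant_times:
        idx = bisect_right(context_times, t) - 1
        aligned.append(context_values[idx] if idx >= 0 else None)
    return aligned


# Plant samples every 15 minutes; rainfall readings arrive at minutes 10 and 40.
rain_on_plant_clock = align_asof([0, 15, 30, 45, 60], [10, 40], [1.2, 3.4])
```

Looking only backward in time matters for forecasting: a join that picked the nearest observation in either direction could hand the model data it would not have had in operation.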

Experiments

Setup. 80/20 train/val; 48-step input, 4–6h horizon; quantile pinball loss (0.05/0.5/0.95); Adam with decaying LR; batch 16→64; ensembles up to 20 members.
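The windowing implied by this setup can be sketched as follows. Two assumptions are made that the source does not state: the 80/20 split is chronological (no shuffling), and one step equals one hour, so the 4–6 h horizon maps to steps 4, 5, and 6 after the window. The helper names are hypothetical.

```python
def chronological_split(series, train_frac=0.8):
    """Split a time series into train/validation parts without shuffling."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]


def make_windows(series, n_input=48, horizons=(4, 5, 6)):
    """Build (input window, targets) pairs for multi-horizon forecasting.

    Each input holds n_input consecutive steps; each target list holds the
    values 4, 5, and 6 steps after the end of the window.
    """
    samples = []
    last_start = len(series) - n_input - max(horizons)
    for start in range(last_start + 1):
        end = start + n_input
        x = series[start:end]
        y = [series[end + h - 1] for h in horizons]
        samples.append((x, y))
    return samples


# Toy series of 100 hourly values: 0, 1, ..., 99.
train, val = chronological_split(list(range(100)))
windows = make_windows(train)
first_x, first_y = windows[0]
```

Splitting before windowing keeps every training target inside the training segment, which avoids leaking validation values into training samples.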

Architectural levers. Dynamic smoothing + context extractor → concatenated with plant data → 3-layer dilated LSTM (d=1/2/4) with internal attention → point + interval forecasts; quantile loss reduces peak bias.

Ablations. Removing context, attention, or dilation raises MAPE and RMSE and lowers the peak detection rate. Context and attention contribute the largest gains, while dilation mainly helps on fast spikes.
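The source reports a peak detection rate (PDR) but does not define it here. The sketch below shows one plausible definition, purely as an assumption: an observed peak is a local maximum at or above a threshold, and it counts as detected when the forecast at that step lies within a tolerance of the observation. The function name, threshold, tolerance, and sample values are all hypothetical.

```python
def peak_detection_rate(y_true, y_pred, threshold, tolerance):
    """Percentage of observed peaks that the forecast matches within a tolerance.

    Assumed definition, not taken from the source. A peak is a local maximum
    at or above `threshold`; on a plateau only the first point counts.
    Returns None when the series contains no peaks.
    """
    peaks = [
        t for t in range(1, len(y_true) - 1)
        if y_true[t] >= threshold
        and y_true[t] > y_true[t - 1]
        and y_true[t] >= y_true[t + 1]
    ]
    if not peaks:
        return None
    hits = sum(abs(y_pred[t] - y_true[t]) <= tolerance for t in peaks)
    return 100.0 * hits / len(peaks)


# Three observed peaks (5, 6, 7); the forecast misses the middle one.
observed = [1.0, 5.0, 1.0, 6.0, 1.0, 7.0, 1.0]
forecast = [1.0, 4.8, 1.0, 4.0, 1.0, 7.1, 1.0]
pdr = peak_detection_rate(observed, forecast, threshold=4.0, tolerance=0.5)
```

Running the same metric on each ablated variant (no context, no attention, no dilation) is how a drop in peak detection would show up alongside MAPE and RMSE.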

Results

  • MAPE 2.10%: DC Water (tunnel)
  • MAPE 1.90%: AlexRenew (NO₃)
  • PDR 93.5%: Peak detection (DC)
  • −22% / −19%: MAPE vs strong baselines
Headline performance
Task                     |   Horizon   |   Gain vs baselines   |   90% PI coverage                            |   Notes
DC Water tunnel levels   |   4–6 h     |   −22% MAPE           |   90.5% ± 3.2% (5.9% below / 3.6% above)     |   PDR 93.5%, fast-spike tracking
AlexRenew nitrate        |   4–6 h     |   −19% MAPE           |   ≈90% interval coverage                     |   Improved bias under peaks
Figure: Forecast vs actual with prediction intervals during peak events. cP2Oe tracks extreme peaks with calibrated intervals; context is critical for flood surges.
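For readers who want to reproduce the headline metrics on their own forecasts, here is a minimal sketch of MAPE and of prediction-interval coverage split into below/inside/above shares. The reported figures follow this bookkeeping: 5.9% below plus 3.6% above leaves 90.5% inside the 90% band. Function names and sample values are illustrative, and MAPE as written assumes no observed value is zero.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (observations must be nonzero)."""
    errors = [abs((y - p) / y) for y, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


def interval_coverage(y_true, lower, upper):
    """Share of observations below, inside, and above the prediction interval."""
    n = len(y_true)
    below = sum(y < lo for y, lo in zip(y_true, lower))
    above = sum(y > hi for y, hi in zip(y_true, upper))
    inside = n - below - above
    return {
        "below": 100.0 * below / n,
        "inside": 100.0 * inside / n,
        "above": 100.0 * above / n,
    }


# Four toy observations with point forecasts and interval bounds.
actual = [10.0, 20.0, 30.0, 40.0]
point = [11.0, 20.0, 27.0, 40.0]
error_pct = mape(actual, point)
coverage = interval_coverage(actual, [9.0, 21.0, 25.0, 35.0], [12.0, 25.0, 29.0, 45.0])
```

Reporting the below and above shares separately, as the results do, shows whether misses are skewed toward under-prediction, which is the costly direction during surges.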

Takeaways

  • Context-aware forecasting cuts MAPE by ~20% and improves peak detection vs. non-context baselines.
  • Quantile loss + ensembles provide actionable prediction intervals for SCADA alarms.
  • Designed for deployment: hourly rolling forecasts, 4–6 hour horizons, compatible with existing tunnel/plant ops.
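As an illustration of how ensemble intervals could feed an alarm rule, the sketch below averages each quantile across ensemble members and compares the result with a high-level threshold. The averaging follows the description of cP2Oe; the three-state alarm rule, the threshold values, and the function names are hypothetical and not part of the source.

```python
def ensemble_average(member_forecasts):
    """Average each quantile forecast across ensemble members."""
    n = len(member_forecasts)
    quantiles = member_forecasts[0].keys()
    return {q: sum(member[q] for member in member_forecasts) / n for q in quantiles}


def alarm_level(forecast, high_level):
    """Hypothetical rule: act on the median, warn on the upper bound."""
    if forecast[0.5] >= high_level:
        return "alarm"      # the median itself reaches the limit
    if forecast[0.95] >= high_level:
        return "warning"    # only the upper interval bound reaches the limit
    return "normal"


# Two ensemble members, each giving 5%, 50%, and 95% quantiles for one horizon.
members = [
    {0.05: 2.0, 0.5: 3.0, 0.95: 4.0},
    {0.05: 3.0, 0.5: 4.0, 0.95: 6.0},
]
combined = ensemble_average(members)
status = alarm_level(combined, high_level=4.5)
```

Using the upper bound for an early warning and the median for the alarm itself is one way to turn an interval into graded operator signals; actual thresholds would come from plant operating limits.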