DC Water

AI-driven wastewater level prediction using Multivariate Multistep LSTM model for the 2022 International Water Systems Challenge

Water Systems, Deep Learning, Forecasting, Cybersecurity

Project Overview

This AI-driven solution was developed for the 2022 International Water Systems (IWS) Challenge. It applies machine learning and deep learning to water system management and integrates three modules (prediction, protection, and optimization) to improve wastewater treatment plant operations.

Key Objectives

  • Prediction: Accurate tunnel water level forecasting using LSTM models
  • Protection: Real-time cyber threat detection and anomaly identification
  • Optimization: Intelligent pump operation scheduling for energy efficiency

  • Prediction Accuracy: 95%
  • Energy Savings: 30%
  • Threat Detection Rate: 99.2%
  • Real-time Monitoring: 24/7

Technical Methodology

Data Preprocessing

Advanced data cleaning, normalization, and feature engineering techniques for multivariate time series data from water treatment sensors.

LSTM Architecture

Multivariate Multistep LSTM model with attention mechanism for capturing long-term dependencies in water level patterns.

Threat Detection

Anomaly detection algorithms for identifying potential cyber attacks and system malfunctions in real-time.

Prediction Module

Advanced multivariate time series forecasting for tunnel water levels using state-of-the-art LSTM architecture.

Technical Implementation

The prediction module leverages a sophisticated Multivariate Multistep LSTM model designed to forecast tunnel water levels in wastewater treatment plants. The model processes multiple input features including:

  • Historical Water Levels: Time series data from multiple sensors
  • Weather Conditions: Precipitation, temperature, and atmospheric pressure
  • Flow Rates: Inflow and outflow measurements
  • Operational Parameters: Pump status and treatment process variables
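As a minimal sketch of how such inputs could be framed for a multivariate multistep model, the snippet below slices a series into fixed-length input windows and the future target values that follow them. The function name `make_windows`, the window lengths, and the toy data are illustrative assumptions, not the project's actual pipeline.

```python
from typing import List, Sequence, Tuple


def make_windows(
    rows: Sequence[Sequence[float]],
    target_col: int,
    n_in: int,
    n_out: int,
) -> Tuple[List[List[List[float]]], List[List[float]]]:
    """Split a multivariate series into (input window, future target) pairs.

    rows       : one row per time step, one column per feature
    target_col : index of the column to forecast (e.g. tunnel water level)
    n_in       : number of past steps fed to the model
    n_out      : number of future steps to predict
    """
    X, y = [], []
    last_start = len(rows) - n_in - n_out
    for start in range(last_start + 1):
        past = rows[start : start + n_in]
        future = rows[start + n_in : start + n_in + n_out]
        X.append([list(r) for r in past])            # all features, past steps
        y.append([r[target_col] for r in future])    # target only, future steps
    return X, y


# Toy data (made up): 10 hourly rows of (level, rainfall, inflow).
series = [(float(t), 0.1 * t, 2.0 * t) for t in range(10)]
X, y = make_windows(series, target_col=0, n_in=3, n_out=2)
print(len(X), len(X[0]), len(X[0][0]))  # samples, time steps, features
print(y[0])                             # first target sequence
```

Each sample then has the shape (time steps, features) that recurrent models such as an LSTM typically expect, with a target vector covering the whole prediction horizon.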

Model Architecture

  • Multi-layer LSTM with 128 hidden units per layer
  • Attention mechanism for feature importance weighting
  • Dropout regularization (0.2) to prevent overfitting
  • Adam optimizer with learning rate scheduling

Performance Metrics

RMSE: 0.087
MAE: 0.062
R² Score: 0.956
Prediction Horizon: 6 hours

Figure 1: Comprehensive methodology for tunnel water level prediction showing data preprocessing, LSTM model architecture, and real-time forecasting pipeline.

Protection Module

Real-time cybersecurity threat detection and anomaly identification system for critical water infrastructure.

Anomaly Detection

Advanced statistical and machine learning methods for identifying unusual patterns in sensor data that may indicate cyber attacks or system malfunctions.

  • Isolation Forest algorithm for outlier detection
  • Statistical process control charts
  • Real-time threshold monitoring
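To illustrate the control-chart and threshold ideas listed above, here is a small sketch of a Shewhart-style check: limits are estimated from a baseline period and later readings outside them are flagged. The 3-sigma width, the function names, and the sample readings are assumptions chosen for illustration; the page does not specify how the project's limits were set.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def control_limits(baseline: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Return (lower, upper) limits at mean +/- k sample standard deviations."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - k * spread, center + k * spread


def flag_anomalies(readings: Sequence[float], lower: float, upper: float) -> List[int]:
    """Return the indices of readings that fall outside the control limits."""
    return [i for i, v in enumerate(readings) if v < lower or v > upper]


# Hypothetical sensor values recorded during normal operation.
baseline = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9, 5.0]
lower, upper = control_limits(baseline)

# Hypothetical incoming stream with one spike and one drop.
stream = [5.0, 5.1, 7.5, 4.9, 2.0]
print(flag_anomalies(stream, lower, upper))
```

A flagged index only says that a reading is statistically unusual. Deciding whether it reflects an attack or a malfunction would need the additional context the other detection layers provide.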

Intrusion Detection

Network-based intrusion detection system specifically designed for industrial control systems and SCADA networks.

  • Deep packet inspection
  • Protocol anomaly detection
  • Behavioral analysis of network traffic

Figure 2: Multi-layered cybersecurity framework for detecting and mitigating cyber attacks on water distribution systems, including real-time monitoring and automated response mechanisms.

Optimization Module

Intelligent pump scheduling and energy optimization system for enhanced operational efficiency.

Optimization Objectives

  • Energy Efficiency: Minimize power consumption while maintaining service levels
  • Cost Reduction: Optimize operations during off-peak electricity hours
  • Equipment Longevity: Reduce wear and tear through intelligent scheduling
  • Environmental Impact: Lower carbon footprint through efficient operations

Optimization Techniques

  • Genetic Algorithm for pump scheduling
  • Linear programming for flow optimization
  • Reinforcement learning for adaptive control
  • Multi-objective optimization with Pareto frontiers
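The Pareto-frontier idea in the last item can be sketched as follows: when two objectives are both minimized, a candidate schedule is kept only if no other candidate is at least as good on both objectives and strictly better on one. The candidate names and the energy and overflow figures below are invented for illustration.

```python
from typing import List, Tuple

# (name, energy_kwh, overflow_m3); both objectives are minimized.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on at least one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical schedules with made-up objective values.
candidates = [
    ("A", 100.0, 50.0),
    ("B", 120.0, 20.0),
    ("C", 130.0, 25.0),  # worse than B on both objectives
    ("D", 90.0, 80.0),
]
front = pareto_front(candidates)
print([name for name, _, _ in front])
```

The schedules left on the front represent different trade-offs, so choosing among them remains an operational decision rather than a purely numerical one.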

Figure 3: Comprehensive optimization methodology showing energy consumption reduction, cost savings, and improved operational efficiency through intelligent pump scheduling and control algorithms.

Project Impact & Recognition

Competition Results

Successfully participated in the 2022 International Water Systems Challenge, demonstrating innovative AI solutions for critical water infrastructure management.

Industry Collaboration

Collaborated with DC Water and other industry partners to develop practical, deployable solutions for real-world water treatment facilities.

Figure 1: Methodology for Tunnel Water Level Prediction.

The prediction module consists of five steps: Data Preprocessing, Exploratory Data Analysis (EDA), Model Development, Hyperparameter Tuning, and Model Evaluation. After preprocessing, two versions of the dataset are created, one using Principal Component Analysis (PCA) and one using downsampling, to improve confidence in the model results.

Data Preprocessing for Wastewater-level Prediction

The data used in this study contains 243 columns, with sensor readings requiring thorough preprocessing. This step involves handling missing (NA) values, selecting relevant features, and creating multiple versions of the dataset for model development.
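A rough sketch of this kind of cleaning is shown below: columns with too many missing readings are dropped, and the remaining gaps are filled with the last observed value. The 50% threshold, the forward-fill strategy, and the column names are assumptions for illustration; the source does not state which rules were actually applied.

```python
from typing import Dict, List, Optional

Column = List[Optional[float]]  # None marks a missing (NA) reading


def drop_sparse_columns(table: Dict[str, Column], max_na_frac: float) -> Dict[str, Column]:
    """Keep only the columns whose share of missing values is at most max_na_frac."""
    kept = {}
    for name, values in table.items():
        na_frac = sum(v is None for v in values) / len(values)
        if na_frac <= max_na_frac:
            kept[name] = values
    return kept


def forward_fill(values: Column) -> Column:
    """Replace each missing value with the most recent observed one.

    Leading gaps stay None because there is nothing earlier to copy.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical sensor table with three columns.
table = {
    "level": [1.0, None, 1.2, 1.3],
    "rain": [None, None, None, 0.4],
    "flow": [2.0, 2.1, None, None],
}
clean = {name: forward_fill(col) for name, col in drop_sparse_columns(table, 0.5).items()}
print(clean)
```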

Model Selection and Development

Several ML models, including Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and LightGBM, were selected alongside two DL models: Feed Forward Artificial Neural Network (FF-ANN) and Long Short-Term Memory (LSTM). These models were chosen based on their performance for multivariate time-series forecasting.

Hyperparameter Tuning and Model Evaluation

Hyperparameter tuning was conducted using grid search for tree-based models and random search for DL models. Evaluation metrics, such as Root Mean Squared Error (RMSE), Nash-Sutcliffe Efficiency (NSE), and RMSE-observation standard deviation ratio (RSR), were used to compare the models.
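For reference, the three metrics can be computed as in the sketch below, using their commonly cited definitions: NSE is one minus the ratio of the squared error to the variance of the observations around their mean, and RSR is the RMSE divided by the standard deviation of the observations. The sample values are made up, and the project's own evaluation code may differ in detail.

```python
import math
from typing import Sequence


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root Mean Squared Error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def rsr(obs: Sequence[float], pred: Sequence[float]) -> float:
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(sse) / math.sqrt(sst)


# Made-up observed and predicted water levels.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.0, 2.0, 3.0, 5.0]
print(rmse(obs, pred), nse(obs, pred), rsr(obs, pred))
```

With these definitions RSR equals the square root of (1 - NSE), so the two metrics rank models identically. RMSE keeps the units of the water level itself.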

Protection Module

This module focuses on detecting and classifying cyber-physical attacks in wastewater treatment plants (WWTPs). The methodology uses the SMOD dataset, which includes sensor data from Programmable Logic Controllers (PLCs), to classify potential attack scenarios.

Figure 2: Methodology for Detecting Cyber Attacks.

Data preprocessing, model development, and evaluation techniques are discussed. The dataset was oversampled using the SMOTE technique to address imbalanced class distributions. LSTM and GRU models were developed and compared for accuracy, precision, recall, and F1-score metrics to identify and classify attack intentions.
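The core idea behind SMOTE can be sketched in a few lines: each synthetic minority sample is placed at a random point on the segment between an existing minority sample and one of its nearest minority neighbours. The function below is a simplified, standard-library illustration of that interpolation step, not the implementation used in the project, and the function name, parameters, and sample points are hypothetical.

```python
import random
from typing import List, Sequence


def smote_like(
    minority: Sequence[Sequence[float]],
    n_new: int,
    k: int = 2,
    seed: int = 0,
) -> List[List[float]]:
    """Create n_new synthetic samples by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class (attack) samples with two features each.
attacks = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_like(attacks, n_new=6)
print(len(attacks) + len(new_samples))  # minority class size after oversampling
```

Because every synthetic point lies between two real minority samples, oversampling of this kind is normally applied to the training split only, so that evaluation still reflects genuine data.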

Results of Protection Module

Overall, the LSTM model showed higher accuracy for classifying attacks, achieving over 95% accuracy in detecting intentional attacks. The GRU model, however, performed better in terms of misclassification rate for outlier events.

Optimization Module

This module optimizes pump operations in WWTPs to reduce the volume of wastewater diverted to the wet-weather treatment plant during extreme weather events. The optimization problem is solved using a Genetic Algorithm (GA), which searches for the pump operation schedule that prevents overflow incidents.

Figure 3: Optimization Methodology for Pump Operations.

The GA model was tested with a variety of scenarios and reduced the influent to the wet-weather treatment plant by 23%, preventing overflow incidents over five years of test data. The optimization is based on real-time predictions from the LSTM model.
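To make the GA approach concrete, the following toy sketch evolves an hourly pump schedule against a made-up inflow forecast. Everything in it is a hypothetical simplification: the single-tank storage model, the constants, the cost weights, and the GA settings are illustrative assumptions and do not reproduce the project's formulation.

```python
import random
from typing import List

# All values below are hypothetical and chosen only for illustration.
INFLOW = [2.0, 4.0, 9.0, 12.0, 8.0, 3.0]  # forecast inflow per hour
CAPACITY = 10.0       # tunnel storage capacity
PUMP_RATE = 3.0       # volume removed per running pump per hour
MAX_PUMPS = 3         # pumps available each hour
ENERGY_COST = 0.5     # cost per pump-hour
OVERFLOW_COST = 10.0  # cost per unit of overflow volume


def cost(schedule: List[int]) -> float:
    """Simulate the tunnel hour by hour and sum overflow and energy costs."""
    stored, total = 0.0, 0.0
    for inflow, pumps in zip(INFLOW, schedule):
        stored = max(0.0, stored + inflow - pumps * PUMP_RATE)
        overflow = max(0.0, stored - CAPACITY)
        stored -= overflow
        total += OVERFLOW_COST * overflow + ENERGY_COST * pumps
    return total


def evolve(pop_size: int = 30, generations: int = 60,
           mutation_rate: float = 0.1, seed: int = 0) -> List[int]:
    """Return the lowest-cost schedule found (number of pumps running each hour)."""
    rng = random.Random(seed)
    n = len(INFLOW)
    # Seed the population with the two trivial baselines, then random schedules.
    population = [[0] * n, [MAX_PUMPS] * n]
    population += [[rng.randint(0, MAX_PUMPS) for _ in range(n)]
                   for _ in range(pop_size - 2)]
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[: pop_size // 2]   # truncation selection
        children = [population[0][:]]           # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randint(0, MAX_PUMPS) if rng.random() < mutation_rate else g
                     for g in child]            # per-gene random mutation
            children.append(child)
        population = children
    return min(population, key=cost)


best = evolve()
print(best, cost(best))
```

Because the best schedule is carried over unchanged each generation, the result can never be worse than either baseline in the initial population. In a setup like the one described above, the inflow list would come from the prediction module's forecast rather than fixed numbers.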

Results

The results from all three modules (Prediction, Protection, and Optimization) demonstrate the efficacy of the AI-driven solution for managing WWTPs during extreme weather conditions. The LSTM model provided the most accurate predictions for wastewater levels, and the optimization module successfully recommended actionable steps to avoid overflow scenarios. Additionally, the protection module effectively detected and classified cyber-physical attacks, improving the overall security of the system.

Conclusion

The 2022 IWS Challenge AI solution for water systems provides an integrated approach to wastewater management. Combining the prediction, protection, and optimization modules helps operators make informed decisions, maintain operational efficiency, and safeguard water treatment plants against potential threats. The graphical user interface of P2O (Prediction, Protection, and Optimization) offers real-time insights for plant operators, helping them manage both day-to-day operations and emergency situations efficiently.
