A Comparative Analysis of Machine Learning Models for Simulating, Classifying, and Assessment River Inflow

Ali Najah Ahmed; Nguyen Van Thieu; Kai Lun Chong; Yuk Feng Huang; Ahmed El-Shafie

doi:10.1007/s11269-025-04146-1

Journal

Water Resources Management

ISSN

0920-4741

Date Issued

2025-04-08

Author(s)

Ali Najah Ahmed

Nguyen Van Thieu

Kai Lun Chong

Yuk Feng Huang

Lee Kong Chian Faculty of Engineering and Science

Ahmed El-Shafie

DOI

10.1007/s11269-025-04146-1

Abstract

Accurately classifying river inflow is crucial for understanding river dynamics and ecosystem health. This study evaluates the performance of seven machine learning models for streamflow classification: Support Vector Machines (SVM), Logistic Regression (LR), K-Nearest Neighbors (KNN), Decision Trees (DT), Random Forests (RF), Adaptive Boosting (AdaBoost), and Multi-Layer Perceptron (MLP). One of the key challenges in this task is the imbalance in class distributions, which can negatively impact model performance. To address this, we apply the Synthetic Minority Over-sampling Technique (SMOTE) to improve classification outcomes for minority classes. Furthermore, we investigate how four proposed feature selection (FS) methods, namely mutual information (MI-FS), linear kernel SVM (SVM-FS), random forest (RF-FS), and multi-criteria selection (MC-FS), affect model performance by identifying optimal lag values. Model hyperparameters are tuned with GridSearchCV, and each model is evaluated on seven performance metrics. Experimental results show that MLP and SVM consistently outperform the other models, making them the most suitable choices for streamflow classification. Among the FS techniques, MC-FS demonstrates superior performance by effectively reducing dimensionality while preserving predictive power. Our findings also indicate that SMOTE improves classification for minority classes but reduces accuracy for majority classes, highlighting a trade-off in handling imbalanced data. Additionally, we observe that the linear assumption in SVM-FS can negatively impact model performance when it fails to detect all relevant input features. These insights provide valuable guidance for future streamflow classification tasks.
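As a rough illustration of the oversampling idea mentioned in the abstract, the sketch below shows the core SMOTE step in plain Python: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the small `high_flow` array of lagged inflow values are hypothetical and chosen only for the example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical lagged-inflow features (Q[t-1], Q[t-2]) for a rare high-flow class.
high_flow = [[9.0, 8.5], [9.5, 9.1], [10.2, 9.8], [8.8, 8.9]]
new_points = smote(high_flow, n_new=6, k=3)
```

Because every synthetic point lies between two existing minority samples, the minority region becomes denser without leaving its observed range. That is consistent with the trade-off reported in the abstract, where minority-class gains can come at the cost of majority-class accuracy.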

Subjects

Inflow classification...

Feature selection

SMOTE

Random forest

Adaptive boosting

Machine learning

SUPPORT VECTOR MACHIN...

RANDOM FOREST

OZONE

