Streamflow classification by employing various machine learning models for peninsular Malaysia

Nouar AlDahoul; Mhd Adel Momo; K. L. Chong; Ali Najah Ahmed; Yuk Feng Huang; Mohsen Sherif; Ahmed El-Shafie

doi:10.1038/s41598-023-41735-9


Journal

Scientific Reports

ISSN

2045-2322

Date Issued

2023-09-04

Author(s)

Nouar AlDahoul

Mhd Adel Momo

K. L. Chong

Ali Najah Ahmed

Yuk Feng Huang

Lee Kong Chian Faculty of Engineering and Science

Mohsen Sherif

Ahmed El-Shafie


Abstract

Due to excessive streamflow (SF), Peninsular Malaysia has historically experienced floods and droughts. Forecasting streamflow is therefore crucial for mitigating municipal and environmental damage. The literature has extensively demonstrated streamflow prediction that estimates continuous values of streamflow level. However, predicting continuous values is unnecessary in several applications and is also a very challenging task because of uncertainty. Streamflow category prediction is better suited to addressing the uncertainty in numerical point forecasting, because its predictions are linked to a propensity to belong to pre-defined classes. Here, we formulate streamflow prediction as time series classification over discrete ranges of values, each representing a class, and classify streamflow into five or ten classes using machine learning approaches for various rivers in Malaysia. The findings reveal that several models, specifically LSTM, outperform others in predicting the following n time steps of streamflow, because LSTM learns the mapping to streamflow 2 or 3 days ahead better than support vector machine (SVM) and gradient boosting (GB). In the 2-days-ahead scenario, LSTM produces a higher F1 score in various rivers (by 5% in Johor, 2% in Kelantan, Melaka and Selangor, and 4% in Perlis). Furthermore, ensemble stacking of SVM and GB achieves high performance in terms of F1 score and quadratic weighted kappa, giving a 3% higher F1 score for the Perak river compared to SVM and gradient boosting.
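The classification framing described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how the discrete ranges are defined, so equal-width bins are assumed here, and every function name (`equal_width_edges`, `to_classes`, `make_windows`, `quadratic_weighted_kappa`) is hypothetical. The sketch bins a continuous series into classes, builds lagged inputs with a class label some days ahead, and computes quadratic weighted kappa, one of the two metrics the abstract names.

```python
from bisect import bisect_right


def equal_width_edges(values, n_classes):
    """Interior edges splitting [min, max] into n_classes equal-width ranges.

    Assumption: equal-width bins; the abstract does not specify the ranges.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes)]


def to_classes(values, edges):
    """Map each continuous value to a class index in 0..len(edges)."""
    return [bisect_right(edges, v) for v in values]


def make_windows(flow, classes, n_lags, horizon):
    """Pair n_lags past flow values with the class `horizon` steps ahead."""
    X, y = [], []
    last_start = len(flow) - n_lags - horizon
    for i in range(last_start + 1):
        X.append(flow[i:i + n_lags])
        y.append(classes[i + n_lags + horizon - 1])
    return X, y


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Agreement between ordinal labels, penalising distant misses more."""
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]
    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            num += weight * observed[i][j]
            den += weight * hist_true[i] * hist_pred[j] / n
    # den is zero only when both label sets sit in one identical class.
    return 1.0 if den == 0 else 1.0 - num / den


# Toy series (illustrative values, not data from the paper).
flow = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
edges = equal_width_edges(flow, 5)
classes = to_classes(flow, edges)
X, y = make_windows(flow, classes, n_lags=2, horizon=2)
```

Any classifier (SVM, gradient boosting, LSTM) could then be trained on `X` and `y`; the quadratic weighting reflects that confusing adjacent streamflow classes is a smaller error than confusing distant ones.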
