Enhancing classification in high-dimensional data with robust rMI-SVM feature selection
Journal
Bulletin of Electrical Engineering and Informatics
ISSN
2089-3191
Date Issued
2024-10-01
DOI
10.11591/eei.v13i5.7938
Abstract
High-dimensional datasets pose notable challenges for classification modelling, primarily because of their complexity and susceptibility to overfitting. Traditional feature selection methods often cannot guarantee that classification performance improves as more features are included, and they frequently fall back on using the entire feature set. To address these challenges, a robust feature selection algorithm known as ranked mutual information for support vector machines (rMI-SVM) has been introduced. This approach mitigates the risk of overfitting by selecting only features that contribute additional information to the classification model, so that performance improves as more features are selected. rMI-SVM can accommodate datasets with missing values, regardless of data linearity, because it requires no additional parameters and no preset number of features. The proposed method explicitly identifies the optimal number of features required for a classification model, which removes the need to use the full feature set. These findings are supported by receiver operating characteristic (ROC) curves, which show rMI-SVM outperforming existing baselines and delivering superior classification performance.
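The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind ranking features by mutual information with the class label. It is not the authors' rMI-SVM implementation. The function names, the toy data, and the choice to skip missing values (`None`) when counting are all assumptions made for illustration. The later step of adding ranked features to a support vector machine and checking its performance is not shown.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits for two discrete sequences.

    Pairs whose feature value is None (missing) are skipped. This is an
    assumption of this sketch, not the paper's missing-value handling.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # I(X;Y) = sum over (x, y) of p(x,y) * log2(p(x,y) / (p(x) * p(y)))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def rank_features(columns, labels):
    """Return feature names sorted by decreasing mutual information, plus the scores."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical toy data: three discrete features and a binary class label.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "copy_of_label": [0, 0, 0, 0, 1, 1, 1, 1],     # fully informative
    "noisy":         [0, 0, 0, 1, 1, 1, None, 0],  # partly informative, one missing value
    "constant":      [1, 1, 1, 1, 1, 1, 1, 1],     # carries no information
}

ranking, scores = rank_features(columns, labels)
print(ranking)
```

In a pipeline of the kind the abstract describes, a ranking like this would presumably feed a loop that adds features one at a time to a classifier and stops once performance no longer improves. That stopping rule is an interpretation of the abstract, not a detail confirmed by it.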
File(s)
Name
Journal Article.png
Size
17.27 KB
Format
PNG
Checksum
(MD5):85f5e85fa8f8c13d7350540217a227b6
