Exposing Optimal Feature Sets for Enhancing Machine Learning Performance

Bibliographic Details
Main Authors: Hiba Mohammed Al-Marwai, Ghaleb H. Al-Gaphari, Mohammed Mohammed Zayed
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11037676/
Description
Summary: Most high-dimensional gene expression datasets contain a large number of redundant genes, and this dimensionality poses challenges for machine learning algorithms. Feature selection has been shown to improve the performance of classification algorithms by addressing two primary objectives: reducing the number of features and improving classification accuracy. This paper introduces a novel hybrid multiobjective wrapper method for feature selection. The method combines the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS), a multi-attribute decision-making approach, with a filtering mechanism for extracting informative features. It also employs a multiobjective Crow Search Algorithm (CSA) that simultaneously minimizes the number of features and the classification error while obtaining a set of Pareto nondominated (ND) solutions. Opposition-based learning (OBL) is used to mitigate the risk of CSA converging to local optima. To evaluate the effectiveness of our approach, we conduct experiments on benchmark microarray datasets from the ADNI database. We compare it against six traditional single-objective methods and five existing multiobjective methods. The results show that the proposed approach outperforms the single-objective methods in classification accuracy. Compared with the other multiobjective algorithms, it also performs better in both classification accuracy and the number of selected features.
ISSN: 2169-3536
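
The summary mentions two generic building blocks, Pareto nondominated solutions and opposition-based learning, that can be illustrated in a few lines. The following is a minimal sketch, not the authors' implementation: the record does not describe how solutions are encoded, so the (feature count, error) tuples, the function names, and the [0, 1] search bounds are all assumptions made for illustration, and the candidate values are invented.

```python
from typing import List, Tuple

# Hypothetical encoding: (number of selected features, classification error).
# Both objectives are minimized.
Solution = Tuple[int, float]


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def nondominated(solutions: List[Solution]) -> List[Solution]:
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


def opposite(position: List[float], lower: float = 0.0, upper: float = 1.0) -> List[float]:
    """Standard opposition-based learning: reflect each coordinate within [lower, upper]."""
    return [lower + upper - x for x in position]


# Invented example values, for illustration only.
candidates = [(10, 0.20), (5, 0.25), (12, 0.20), (5, 0.30), (3, 0.40)]
front = nondominated(candidates)
opposed = opposite([0.25, 1.0])
```

In this toy data, (12, 0.20) is dominated by (10, 0.20) and (5, 0.30) by (5, 0.25), so neither is on the front. Evaluating the opposite of a candidate position alongside the position itself is one common way to widen the search and reduce the chance of stalling in a local optimum.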