Lung Cancer Classification with Novel Gene Biomarkers using Machine Learning

Main Article Content

Karthik Sekaran, Balajee Jeyakumar, Danny Omar Camacho Cárdenas, Nelson Ardila V., Ortiz Loayza, Roque Mauricio


This paper proposes a machine learning framework for identifying novel gene biomarkers of lung cancer from the gene expression profiles of normal and Small Cell Lung Cancer (SCLC) tumor tissues of patients. The dataset is obtained from the Gene Expression Omnibus (GEO) repository under accession number GSE50412. Differentially expressed genes (DEGs) are selected according to the significance score obtained from a t-test. The optimal gene feature subsets are then drawn from the top 100 DEGs using the Wolf Search Algorithm. Different machine learning models are trained on these features and validated through k-fold cross-validation. The framework attains 92.7% accuracy, 92.6% precision, and 92.7% recall with a multilayer perceptron neural network classifier. Benchmarked against state-of-the-art machine learning algorithms, the proposed pipeline outperforms the existing methods.
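As an illustration of the DEG selection step only, the following minimal sketch ranks genes by the absolute value of a pooled-variance two-sample t statistic and keeps the top 100. It is not the authors' code. The function names (`t_statistic`, `top_deg_indices`), the choice of the pooled-variance form of the t-test, and the synthetic data are all assumptions made for this example; the abstract does not say which t-test variant was used, and the sketch does not load GSE50412. Because every gene has the same sample sizes, ranking by |t| gives the same order as ranking by p-value. The Wolf Search Algorithm and the classifier stages are not covered here.

```python
import math
import random
import statistics


def t_statistic(normal, tumor):
    """Pooled-variance (Student) two-sample t statistic for one gene."""
    n1, n2 = len(normal), len(tumor)
    pooled_var = ((n1 - 1) * statistics.variance(normal)
                  + (n2 - 1) * statistics.variance(tumor)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(tumor) - statistics.mean(normal)) / std_err


def top_deg_indices(normal_samples, tumor_samples, top_k=100):
    """Return indices of the top_k genes ranked by |t|, largest first.

    Each argument is a list of samples; each sample is a list of
    expression values, one per gene.
    """
    n_genes = len(normal_samples[0])
    scores = []
    for g in range(n_genes):
        normal = [sample[g] for sample in normal_samples]
        tumor = [sample[g] for sample in tumor_samples]
        scores.append(abs(t_statistic(normal, tumor)))
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return ranked[:top_k]


# Synthetic stand-in data (assumption, NOT GSE50412): 500 genes, 20 samples
# per class, with the first 10 genes shifted upward in the tumor group.
rng = random.Random(0)
N_GENES, N_PER_CLASS, N_SHIFTED = 500, 20, 10
normal_samples = [[rng.gauss(0.0, 1.0) for _ in range(N_GENES)]
                  for _ in range(N_PER_CLASS)]
tumor_samples = [[rng.gauss(3.0 if g < N_SHIFTED else 0.0, 1.0)
                  for g in range(N_GENES)]
                 for _ in range(N_PER_CLASS)]

selected = top_deg_indices(normal_samples, tumor_samples, top_k=100)
```

On real expression matrices, a vectorized routine such as `scipy.stats.ttest_ind` would typically replace the per-gene loop; the pure-Python version is used here only to keep the example dependency-free.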

Article Details