School of Mathematical and Statistical Sciences Faculty Publications and Presentations
Document Type
Article
Publication Date
6-30-2024
Abstract
In cancer diagnosis, machine learning helps improve cancer detection by providing doctors with a second perspective and allowing for faster and more accurate decisions. Numerous studies have used both classic machine learning approaches and deep learning to address cancer classification. In this study, we examine the efficacy of five commonly used machine learning algorithms, spanning both traditional and deep learning models: Logistic Regression, Support Vector Machines (SVM), Random Forest (RF), Decision Tree, and Deep Neural Networks (DNN). We analyze their ability to correctly classify tumors as Benign or Malignant using the Wisconsin breast cancer dataset (WBCD). A Random Forest classifier was employed to reduce model complexity, narrowing the number of features down to 17 through cross-validation and achieving a validation score of 96.84%. Subsequently, a grid search was used to determine the maximum tree depth, which yielded a depth of five. The Synthetic Minority Oversampling Technique (SMOTE) was employed as a resampling tool to balance the Benign and Malignant categories, addressing the class imbalance problem commonly encountered in classification tasks. On the unbalanced data, Random Forest emerged as the best classification model with an accuracy of 98.20%, followed by Logistic Regression with an accuracy of 97.40%. After applying SMOTE, however, Random Forest and Logistic Regression tied as the best models, each with an accuracy of 94.70%. Both models performed strongly, with area under the curve (AUC) values of 0.997 for Random Forest and 0.994 for Logistic Regression.
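The abstract names SMOTE as the resampling step but does not say which implementation was used. As a rough illustration of the idea only, the following is a minimal pure-Python sketch: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. The function name `smote`, its parameters, and the toy feature values are assumptions made for this example; they are not taken from the paper or from the WBCD.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points built from minority-class feature vectors.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (itself excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight, uniform in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data only (assumed values): 6 "benign" rows and 3 "malignant" rows,
# each with two made-up features.
benign = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [1.3, 1.1], [0.8, 0.9], [1.0, 1.0]]
malignant = [[3.0, 3.2], [3.4, 2.9], [2.8, 3.1]]

# Oversample the minority class until both classes have the same size.
extra = smote(malignant, n_new=len(benign) - len(malignant), k=2)
balanced_malignant = malignant + extra
print(len(benign), len(balanced_malignant))
```

Because every synthetic row is an interpolation between two real minority rows, the new points stay inside the region already occupied by the minority class rather than being exact duplicates. In practice, resampling of this kind is normally applied to the training split only, so that the evaluation data keeps its original class distribution.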
Recommended Citation
Agbota, Lawrence, Edmund Agyemang, Priscilla Kissi-Appiah, Lateef Moshood, Akua Osei-Nkwantabisa, Vincent Agbenyeavu, Abraham Nsiah, and Augustina Adjei. “Enhancing Tumor Classification Through Machine Learning Algorithms for Breast Cancer Diagnosis.” Computer Engineering and Intelligent Systems 15, no. 1 (2024): 71–85. https://doi.org/10.7176/CEIS/15-1-08
Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.
First Page
71
Last Page
85
Publication Title
Computer Engineering and Intelligent Systems
DOI
https://doi.org/10.7176/CEIS/15-1-08
Included in
Artificial Intelligence and Robotics Commons, Mathematics Commons, Medicine and Health Sciences Commons
Comments
The journal is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) License.