High-Performance Heart Disease Prediction Through Soft Voting Ensemble: An Interpretable Machine Learning Approach
Received: 29 January 2026 | Revised: 23 March 2026 and 4 April 2026 | Accepted: 5 April 2026 | Online: 6 June 2026
Corresponding author: Arshad Ali
Abstract
Heart disease remains one of the world's leading causes of death, so accurate and interpretable prediction models are needed to support risk assessment and early diagnosis. Ensemble learning, which combines several base classifiers, has shown significant promise in improving the robustness and performance of such predictions. This work assesses the effectiveness and interpretability of a voting-based ensemble classifier for heart disease prediction. The dataset comprises patient health indicators, including age. Three heterogeneous base classifiers were employed: Support Vector Machine (SVM), Multilayer Perceptron (MLP), and Extra Trees. These classifiers were integrated into a soft voting ensemble, which aggregates the posterior class probabilities of the base classifiers to generate the final prediction. The soft voting ensemble achieved an accuracy of 98.03%, an F1-score of 98.01%, and an Area Under the Receiver Operating Characteristic Curve (ROC-AUC) of 98.69%, outperforming each individual base classifier. This result highlights the benefit of combining probabilistic predictions rather than relying on a single model. The model therefore offers both high predictive accuracy and interpretability, supporting its potential application in clinical decision-making. These findings suggest that ensemble learning approaches, and soft voting mechanisms in particular, can support healthcare providers in early detection and personalized treatment planning for heart disease.
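The soft voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `soft_vote`, the equal default weights, and the probability values are assumptions chosen for illustration only. Each base classifier supplies a class-probability vector per patient, the vectors are averaged, and the class with the highest mean probability is returned.

```python
from typing import List, Optional, Sequence


def soft_vote(
    probas: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Return the predicted class index per sample by (weighted) probability averaging.

    probas[m][i][c] is the probability that model m assigns to class c for sample i.
    """
    n_models = len(probas)
    if n_models == 0:
        raise ValueError("at least one model is required")
    if weights is None:
        weights = [1.0] * n_models  # assumption: equal weight per classifier
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")

    total = float(sum(weights))
    n_samples = len(probas[0])
    n_classes = len(probas[0][0])

    predictions = []
    for i in range(n_samples):
        # Weighted mean probability of each class across all models.
        avg = [
            sum(w * probas[m][i][c] for m, w in enumerate(weights)) / total
            for c in range(n_classes)
        ]
        # Pick the class with the highest averaged probability.
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


# Made-up probabilities for two patients; columns are [no disease, disease].
svm_p = [[0.55, 0.45], [0.80, 0.20]]
mlp_p = [[0.55, 0.45], [0.70, 0.30]]
et_p = [[0.10, 0.90], [0.60, 0.40]]

labels = soft_vote([svm_p, mlp_p, et_p])
print(labels)
```

For the first patient, two of the three classifiers lean slightly toward "no disease", so a majority (hard) vote would return class 0, whereas the averaged probabilities (0.40 versus 0.60) favor class 1. This is the sense in which soft voting uses each model's confidence rather than only its vote. In practice, scikit-learn's `VotingClassifier` with `voting="soft"` provides the same aggregation over fitted estimators.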
Keywords:
heart disease, classification, ensemble learning, soft voting, Extra Trees, Support Vector Machine (SVM), Multilayer Perceptron (MLP)
License
Copyright (c) 2026 Hoda El-Batrawy, Arshad Ali, Ghulam Mustafa, Gahangir Hossain, Wesam Ahmed

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
