Amalgamating Ensemble Machine Learning Soft Voting Classifier, SMOTE, and Pearson's Correlation Coefficient for Enhanced Malware Detection
Received: 1 February 2025 | Revised: 23 February 2025 and 1 March 2025 | Accepted: 6 March 2025 | Online: 4 June 2025
Corresponding author: Zaid Ameen Abduljabbar
Abstract
Obfuscated malware poses a significant threat to personal and IoT devices, and traditional detection methods often fall short in both capability and performance. This study proposes a malware detection approach that combines Machine Learning (ML) classifiers in a soft voting ensemble, with Pearson's correlation coefficient used for feature selection on the CIC-MalMem-2022 dataset. Class imbalance is addressed with the Synthetic Minority Oversampling Technique (SMOTE), and several ML classifiers are employed. The results demonstrate improved accuracy, precision, and recall in malware detection compared to single classifiers and traditional methods. The proposed model is evaluated using a confusion matrix and standard evaluation metrics, and achieves a 99.99% accuracy rate, 99.99% classification rate, 99.99% precision rate, 99.99% recall rate, and 99.99% F1 score, surpassing the results of previous studies. These results indicate that combining feature selection with ensemble learning can significantly improve the efficiency and security of high-performance malware prediction systems, paving the way for advanced threat mitigation strategies.
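The three building blocks named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names, the correlation threshold of 0.3, the neighbour count, and the toy data are all assumptions made for illustration, and the abstract does not state which classifiers or settings were actually used. Here feature selection keeps features whose Pearson correlation with the label is strong, which is one common use of the coefficient; the study may apply it differently.

```python
import math
import random

def pearson(x, y):
    """Pearson's r between two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(columns, labels, threshold=0.3):
    """Return indices of feature columns whose |r| with the label meets the threshold.

    The threshold is a hypothetical value chosen for this example.
    """
    return [i for i, col in enumerate(columns)
            if abs(pearson(col, labels)) >= threshold]

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points, SMOTE-style.

    Each new point lies on the segment between a random minority sample
    and one of its k nearest minority neighbours. Needs >= 2 samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def soft_vote(prob_rows):
    """Average per-class probabilities from several classifiers.

    Returns (index of the class with the highest mean probability, mean probabilities).
    """
    n_models = len(prob_rows)
    avg = [sum(col) / n_models for col in zip(*prob_rows)]
    return max(range(len(avg)), key=avg.__getitem__), avg

if __name__ == "__main__":
    # Toy data: three feature columns, four samples, binary labels.
    columns = [[1, 2, 3, 4], [5, 5, 5, 5], [4, 3, 2, 1]]
    labels = [0, 0, 1, 1]
    print(select_features(columns, labels))          # constant column 1 is dropped
    print(len(smote([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], n_new=5)))
    # One confident model can outweigh two lukewarm ones under soft voting.
    print(soft_vote([[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]])[0])
```

In the last call, two of the three hypothetical classifiers lean toward class 1, yet the averaged probabilities favour class 0. That is the practical difference between soft voting and a simple majority (hard) vote.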
Keywords:
machine learning, malware detection, SMOTE, IoT, feature selection
License
Copyright (c) 2025 Mustafa Jumaah, Ali A. Yassin, Zaid Ameen Abduljabbar, Muwafaq Jawad, Vincent Omollo Nyangaresi, Ali Hassan Ali

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.