PulmoNet: A Hybrid CNN-Vision Transformer Architecture for Enhanced Lung Nodule Classification in CT Imaging

Authors

  • Venkatesh M R, Presidency School of Computer Science and Engineering, Presidency University, Bengaluru, India
  • Hasan Hussain Shahul Hameed, Presidency School of Computer Science and Engineering, Presidency University, Bengaluru, India
Volume: 16 | Issue: 2 | Pages: 33278-33284 | April 2026 | https://doi.org/10.48084/etasr.16612

Abstract

Lung cancer remains the leading cause of cancer-related mortality, and early detection through CT screening is critical for patient survival. Current deep learning approaches face significant limitations: Convolutional Neural Networks (CNNs) extract local texture patterns but cannot capture global spatial relationships, while Vision Transformers (ViTs) model long-range dependencies but struggle with fine-grained feature extraction. Existing hybrid architectures use static fusion strategies that fail to adapt to diverse nodule characteristics. This paper presents PulmoNet, a novel hybrid framework that integrates a modified ResNet-50 with a Vision Transformer through an adaptive cross-attention fusion mechanism, which dynamically adjusts each branch's contribution based on individual nodule morphology. The framework processes CT volumes through two parallel pipelines: the CNN branch extracts multi-scale local patterns, and the transformer branch captures long-range spatial dependencies. Evaluated on the LUNA16 and LIDC-IDRI datasets using 5-fold cross-validation, PulmoNet achieves 94.7% accuracy and 0.982 AUC-ROC, outperforming state-of-the-art baselines by 3.5–5.4%. Cross-dataset evaluation demonstrates robust generalization across nodule sizes, types, and locations. PulmoNet also shows clinical viability, with 93.8% sensitivity at 95% specificity and a 143 ms inference time, indicating potential for real-time lung cancer screening programs.
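
The abstract does not specify how the adaptive cross-attention fusion is implemented, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a single CNN descriptor vector acts as the attention query over a small set of ViT token vectors, and that a scalar sigmoid gate decides, per sample, how much weight each branch receives. The function names (`cross_attend`, `adaptive_fuse`) and the gate parameters (`gate_w`, `gate_b`) are hypothetical, and the example uses plain Python lists in place of real feature tensors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attend(query, tokens):
    """Scaled dot-product attention of one query vector over token vectors.

    Returns the attention-weighted average of the tokens.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, t) / scale for t in tokens])
    dim = len(tokens[0])
    return [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]


def adaptive_fuse(cnn_feat, vit_tokens, gate_w, gate_b):
    """Blend the CNN descriptor with attended ViT context (hypothetical design).

    The gate g lies in (0, 1) and is computed from both branches, so the
    blend can differ from one input to the next instead of being fixed.
    """
    context = cross_attend(cnn_feat, vit_tokens)
    # List concatenation: the gate sees the CNN feature and the ViT context.
    gate_input = cnn_feat + context
    g = 1.0 / (1.0 + math.exp(-(dot(gate_w, gate_input) + gate_b)))
    fused = [g * c + (1.0 - g) * v for c, v in zip(cnn_feat, context)]
    return fused, g


if __name__ == "__main__":
    cnn_feat = [1.0, 0.0]                  # toy local-texture descriptor
    vit_tokens = [[0.0, 1.0], [0.0, 1.0]]  # toy global-context tokens
    fused, g = adaptive_fuse(cnn_feat, vit_tokens, [0.0] * 4, 0.0)
    print(fused, g)
```

With zero gate weights and bias the gate is exactly 0.5, so the two branches are averaged; a learned gate would instead shift the balance toward whichever branch is more informative for a given nodule. A static fusion strategy corresponds to fixing `g` to a constant for all inputs.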

Keywords:

lung nodule classification, hybrid deep learning, Vision Transformer (ViT), adaptive fusion, medical image analysis, computer-aided diagnosis

How to Cite

[1] V. M R and H. H. S. Hameed, "PulmoNet: A Hybrid CNN-Vision Transformer Architecture for Enhanced Lung Nodule Classification in CT Imaging," Eng. Technol. Appl. Sci. Res., vol. 16, no. 2, pp. 33278–33284, Apr. 2026.
