AugCXR Dataset: An Augmented Chest X-Ray Image Dataset for Robust Deep Learning Pneumonia Diagnosis

Authors

  • Waqar Ahmad, Department of Computer Science, University of Hertfordshire, United Kingdom
  • Deepak Panday, Department of Computer Science, University of Hertfordshire, United Kingdom
  • Muhammad Ibrahim, Institute of Biotechnology and Genetic Engineering, The University of Agriculture, Pakistan
  • Tahir Mehmood, School of Information Technology, UNITAR International University, Malaysia
  • Muhammad Yaqoob, Department of Computer Science, University of Hertfordshire, United Kingdom
Volume: 16 | Issue: 3 | Pages: 35077-35084 | June 2026 | https://doi.org/10.48084/etasr.16271

Abstract

Chest X-ray (CXR) imaging is commonly used to detect pneumonia, but reliance on expert radiologists can delay diagnosis and increase costs. The shortage of radiologists and the risk of diagnostic errors highlight the need for automated solutions. Deep learning models based on Convolutional Neural Networks (CNNs) have shown potential for computer-aided diagnosis of pneumonia from CXR images. Most studies use the publicly available dataset from the Guangzhou Women and Children’s Medical Center, which contains CXR images of children aged 1 to 5. However, this dataset is imbalanced: it contains more pneumonia images than normal ones, which may affect model performance and generalizability. This study proposes a geometric data augmentation method that uses five transformations (rotation, width shift, height shift, zoom, and brightness adjustment) to balance the dataset and improve model accuracy. The resulting Augmented Chest X-ray (AugCXR) dataset was validated using three widely adopted architectures: Improved Visual Geometry Group-13 (IVGG13), MobileNetV2, and EfficientNetV2L. The results show that the proposed augmentation method improves classification performance across all three pretrained deep learning models.
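
The balancing idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the parameter ranges (10 degrees of rotation, 10% shifts, 10% zoom, brightness gain between 0.8 and 1.2), the nearest-neighbour resampling, and the choice to oversample each smaller class until it matches the largest class are all assumptions made for illustration. Images are assumed to be 2-D grayscale arrays scaled to [0, 1].

```python
import numpy as np


def random_augment(image, rng, rotation_deg=10.0, shift_frac=0.1,
                   zoom_range=0.1, brightness_range=(0.8, 1.2)):
    """Apply one random rotation, shift, zoom, and brightness change.

    All default ranges are illustrative assumptions, not values from the paper.
    """
    h, w = image.shape
    angle = np.deg2rad(rng.uniform(-rotation_deg, rotation_deg))
    dy = rng.uniform(-shift_frac, shift_frac) * h      # height shift in pixels
    dx = rng.uniform(-shift_frac, shift_frac) * w      # width shift in pixels
    zoom = rng.uniform(1.0 - zoom_range, 1.0 + zoom_range)
    gain = rng.uniform(*brightness_range)              # brightness multiplier

    # Inverse mapping: for every output pixel, locate the source pixel.
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    y0 = ys - cy - dy
    x0 = xs - cx - dx
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    src_y = (cos_a * y0 + sin_a * x0) / zoom + cy
    src_x = (-sin_a * y0 + cos_a * x0) / zoom + cx

    # Nearest-neighbour sampling; out-of-range coordinates reuse edge pixels.
    src_y = np.clip(np.rint(src_y), 0, h - 1).astype(int)
    src_x = np.clip(np.rint(src_x), 0, w - 1).astype(int)
    warped = image[src_y, src_x]
    return np.clip(warped * gain, 0.0, 1.0)


def balance_with_augmentation(images, labels, seed=0):
    """Oversample smaller classes with augmented copies until all classes
    have as many images as the largest class (an assumed balancing rule)."""
    rng = np.random.default_rng(seed)
    images = np.asarray(images, dtype=np.float64)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()

    out_images, out_labels = [images], [labels]
    for cls, count in zip(classes, counts):
        n_needed = target - count
        if n_needed == 0:
            continue
        idx = np.flatnonzero(labels == cls)
        picks = rng.choice(idx, size=n_needed, replace=True)
        new = np.stack([random_augment(images[i], rng) for i in picks])
        out_images.append(new)
        out_labels.append(np.full(n_needed, cls, dtype=labels.dtype))
    return np.concatenate(out_images), np.concatenate(out_labels)
```

Calling `balance_with_augmentation(train_images, train_labels)` on a training split returns the original images followed by the augmented copies of the under-represented class. In practice this kind of balancing would normally be applied to the training split only, so that validation and test images remain unaltered.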

Keywords:

pneumonia, chest X-ray, data augmentation, deep learning

How to Cite

W. Ahmad, D. Panday, M. Ibrahim, T. Mehmood, and M. Yaqoob, “AugCXR Dataset: An Augmented Chest X-Ray Image Dataset for Robust Deep Learning Pneumonia Diagnosis,” Eng. Technol. Appl. Sci. Res., vol. 16, no. 3, pp. 35077–35084, Jun. 2026.
