A Hardware-Aware Analysis of PTQ and QAT Quantized CNNs for Object Detection on FPGA
Received: 11 January 2026 | Revised: 15 February 2026 | Accepted: 25 February 2026 | Online: 14 March 2026
Corresponding author: Noura Jariri
Abstract
Real-time object detection on embedded platforms is critical for safety-critical and industrial applications, but FPGA deployment remains challenging due to constraints on numerical precision, latency, and hardware resources. Although quantization is widely used to enable efficient FPGA inference, its impact on object-detection models that combine classification and bounding-box regression has not been systematically analyzed within an hls4ml-based workflow. This work compares Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT) for deploying a lightweight CNN-based detector on an FPGA. An FP32 model is quantized to INT16 and INT8 using QKeras and subsequently converted to fixed-point hardware representations with hls4ml. The results show that PTQ severely degrades detection performance, reducing classification accuracy to approximately 10% and mean IoU to below 0.30. In contrast, QAT preserves near-floating-point performance, achieving ≈94% accuracy and ≈0.89 IoU at the software level for both INT16 and INT8. However, default HLS fixed-point configurations introduce software-hardware discrepancies, particularly in classification. A regression-aware refinement that increases fractional precision in the bounding-box head restores hardware-level localization accuracy (IoU ≈0.89), while residual classification gaps remain due to fixed-point constraints. These findings demonstrate that reliable FPGA-based object detection requires both QAT and hardware-aware fixed-point design, and they provide practical guidelines for low-precision deployment using hls4ml.
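The regression-aware refinement described above amounts to reallocating fixed-point bits toward the fractional part in the bounding-box head. The following minimal Python sketch (illustrative only, not the paper's code) emulates hls4ml-style ap_fixed&lt;W,I&gt; rounding to show why normalized box coordinates benefit from more fractional bits; the ap_fixed&lt;16,6&gt; "default" and the refined ap_fixed&lt;16,2&gt; split are assumptions modeled on typical hls4ml configurations, and the coordinate value is arbitrary.

```python
# Illustrative sketch: emulating ap_fixed<W,I>-style rounding to show why a
# bounding-box regression head needs more fractional bits than a default
# fixed-point configuration. Not the paper's code; widths are assumptions.

def to_fixed(x, total_bits, int_bits):
    """Round x to the nearest representable signed fixed-point value."""
    frac_bits = total_bits - int_bits
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))      # most negative integer code
    hi = (1 << (total_bits - 1)) - 1   # most positive integer code
    code = max(lo, min(hi, round(x * scale)))
    return code / scale

# A normalized bounding-box coordinate, as a regression head might emit.
coord = 0.4371

# Default-style split: ap_fixed<16,6> leaves 10 fractional bits.
default_q = to_fixed(coord, 16, 6)
# Regression-aware split: ap_fixed<16,2> leaves 14 fractional bits.
refined_q = to_fixed(coord, 16, 2)

default_err = abs(coord - default_q)
refined_err = abs(coord - refined_q)
print(default_err, refined_err)
```

With 10 fractional bits the quantization step is 2^-10 ≈ 0.00098; with 14 bits it shrinks to 2^-14 ≈ 0.00006, which directly tightens the worst-case rounding error on normalized coordinates and motivates spending integer bits on fractional precision wherever the head's dynamic range allows.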
Keywords:
quantization, CNN, object detection, PTQ, QAT, hls4ml, FPGA
License
Copyright (c) 2026 Noura Jariri, Kaoutar Allabouche, Mohamed Benaly, Mohammed Chaman, Rania Majdoubi, Abdelkader Hadjoudja

This work is licensed under a Creative Commons Attribution 4.0 International License.
