From Chaos to Detection: Accident Benchmarking in Surveillance Videos with a Curated Dataset and 3D CNNs
Received: 18 December 2025 | Revised: 31 January 2026, 4 February 2026, and 7 February 2026 | Accepted: 8 February 2026 | Online: 6 June 2026
Corresponding author: Ranjit Singh Sarban Singh
Abstract
Automatic accident detection from surveillance videos is an important task for intelligent transportation and public safety, yet it remains underdeveloped compared to violence and anomaly detection. Existing methods often report high accuracy, but these results are frequently obtained on contaminated datasets containing duplicates, overlapping scenes, or annotation artifacts, which inflate performance and limit real-world applicability. To address this gap, this paper introduces a curated benchmark dataset of 513 videos from multiple sources, segmented into 4 s clips to yield 1,122 clips (561 per class). Using this dataset, we evaluate a spectrum of approaches, ranging from handcrafted Violent Flows (ViF) + Support Vector Machine (SVM) pipelines to Convolutional Neural Network–Recurrent Neural Network (CNN–RNN) hybrids and modern 3D convolutional networks. The experiments confirm that traditional methods collapse under stricter evaluation, whereas lightweight architectures such as X3D achieve the best balance between accuracy and efficiency. The reconfigured X3D-S variant achieves 84.48% accuracy, establishing a strong baseline for accident detection under realistic conditions and providing both a cleaner benchmark and a practical design for future deployment. The Surveillance Curated Accident Dataset (SCAD) and the full implementation code are publicly available and can be cited through this paper.
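As a rough illustration of the 4 s segmentation step mentioned in the abstract, the following Python sketch computes frame ranges for consecutive clips of a fixed duration. It is not the authors' pipeline: the function name `clip_frame_ranges`, the choice of non-overlapping windows, and the decision to drop a trailing remainder shorter than one clip are all assumptions made for this example, since the abstract does not specify them.

```python
from typing import List, Tuple


def clip_frame_ranges(
    n_frames: int, fps: float, clip_seconds: float = 4.0
) -> List[Tuple[int, int]]:
    """Return [start, end) frame index pairs for consecutive fixed-length clips.

    Assumptions (not taken from the paper): clips do not overlap, and a
    trailing remainder shorter than one full clip is discarded.
    """
    if n_frames < 0 or fps <= 0 or clip_seconds <= 0:
        raise ValueError("n_frames must be >= 0; fps and clip_seconds must be > 0")

    # Number of frames that make up one clip at this frame rate.
    frames_per_clip = int(round(fps * clip_seconds))
    if frames_per_clip == 0:
        raise ValueError("clip is shorter than one frame at this frame rate")

    # Only full clips are kept; integer division drops the remainder.
    n_clips = n_frames // frames_per_clip
    return [
        (i * frames_per_clip, (i + 1) * frames_per_clip) for i in range(n_clips)
    ]


if __name__ == "__main__":
    # A hypothetical 10 s video at 30 fps yields two full 4 s clips.
    print(clip_frame_ranges(300, 30.0))
```

Under these assumptions, a 10 s video at 30 fps produces two clips of 120 frames each, and the last 2 s are discarded. A different policy, such as overlapping windows or padding the final clip, would change the clip count per video.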
Keywords:
accident detection surveillance videos, anomaly detection, X3D, benchmark dataset, deep learning, computer vision, artificial intelligence
License
Copyright (c) 2026 Muhammad Umer Danka, Ranjit Singh Sarban Singh, Muhammad Ayoub Danka

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
