A Hardware Platform for Smart Video Monitoring Based on ESP32-CAM and Mojo FPGA (Spartan-6) with Event Activation Triggered by a PIR Sensor

Authors

  • Saltanat Adilzhanova (Almaty Technological University, Almaty, Kazakhstan; Al Farabi Kazakh National University, Almaty, Kazakhstan)
  • Gulshat Amirkhanova (Al Farabi Kazakh National University, Almaty, Kazakhstan)
  • Murat Kunelbayev (Al Farabi Kazakh National University, Almaty, Kazakhstan)
  • Aigerim Rakhysh (Al Farabi Kazakh National University, Almaty, Kazakhstan)
Volume: 16 | Issue: 3 | Pages: 36089-36096 | June 2026 | https://doi.org/10.48084/etasr.18359

Abstract

This article presents an event-based edge video surveillance architecture for Internet of Things (IoT) systems, in which a Passive Infrared (PIR) sensor initiates frame capture, the ESP32-CAM module (OV2640) performs control and network publishing, and the Mojo Field-Programmable Gate Array (FPGA) performs hardware-accelerated preprocessing and detection. The ESP32-CAM and the FPGA communicate over a Universal Asynchronous Receiver–Transmitter (UART) interface at 921,600 baud (optionally using INT/READY signals), whereas detection results are transmitted to the cloud via Wi-Fi using Message Queuing Telemetry Transport (MQTT) or Hypertext Transfer Protocol (HTTP). Two modes of operation are experimentally evaluated: ESP32-CAM only and ESP32-CAM + FPGA. End-to-end latency, measured from PIR triggering to receipt of the acknowledgment, decreases at every reported quantile: for N = 80 events, the P50 decreases from 620 to 480 ms, the P95 from 980 to 780 ms, and the P99 from 1,300 to 1,050 ms, confirming the performance gains achieved by offloading computation to dedicated hardware. The UART load is quantitatively characterized for each type of transmitted data. For compressed frames, the average packet size is approximately 1,200 B (P95: 1,600 B) at a rate of 25 packets/s, corresponding to approximately 30,000 B/s. For Regions of Interest (ROIs) or compact features, the average packet size is approximately 180 B (P95: 240 B) at 40 packets/s, corresponding to approximately 7,200 B/s. For detection results (bounding boxes, confidence values, and flags), the average packet size is approximately 64 B (P95: 96 B) at 40 packets/s, corresponding to approximately 2,560 B/s. These results demonstrate the advantage of transmitting feature-level data instead of full-frame data in bandwidth-constrained scenarios. Detection performance is evaluated using annotated views (60 windows per view). For human movement at a distance of 2 m (front view), accuracy/recall/F1-score values of 0.93/0.95/0.94 are achieved, whereas at 4 m (side view), the values are 0.87/0.85/0.86. The False Alarm Rate (FAR) is 0.15 and 0.25, respectively. For scenes without target movement, the FAR ranges from 0.03 to 0.17, depending on background conditions (idle scene, background movement, and lighting changes). The results demonstrate that the combination of the PIR sensor, ESP32-CAM, and FPGA provides an effective trade-off between latency, communication overhead, and detection performance, making it suitable as a minimal yet extensible platform for distributed security systems and industrial event-based monitoring.
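
The throughput, latency, and detection figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' evaluation code. All function and variable names are hypothetical, and the link capacity assumes 8N1 UART framing (10 bit times per byte), which the abstract does not state. The last helper computes the harmonic mean of the first two detection values; it reproduces the reported F1-scores if the first value is read as precision.

```python
# Cross-check of the figures quoted in the abstract (illustrative sketch).
# Assumption, not stated in the source: 8N1 framing, i.e. 10 bit times per byte.
BAUD = 921_600
BITS_PER_BYTE_ON_WIRE = 10  # 1 start + 8 data + 1 stop (assumed)

# (average packet size in bytes, packets per second), as reported in the abstract
PAYLOADS = {
    "compressed frames": (1200, 25),
    "ROIs / compact features": (180, 40),
    "detection results": (64, 40),
}

# (ESP32-CAM only, ESP32-CAM + FPGA) end-to-end latency in ms, N = 80 events
LATENCY_MS = {"P50": (620, 480), "P95": (980, 780), "P99": (1300, 1050)}

# (first reported value, recall) for the two annotated views
DETECTION = {"2 m, front view": (0.93, 0.95), "4 m, side view": (0.87, 0.85)}


def uart_capacity_bytes_per_s(baud=BAUD, bits_per_byte=BITS_PER_BYTE_ON_WIRE):
    """Raw byte rate of the UART link under the assumed framing."""
    return baud / bits_per_byte


def uart_load(avg_bytes, packets_per_s):
    """Return (bytes per second, fraction of the assumed link capacity)."""
    rate = avg_bytes * packets_per_s
    return rate, rate / uart_capacity_bytes_per_s()


def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for name, (size, pps) in PAYLOADS.items():
        rate, share = uart_load(size, pps)
        print(f"{name}: {rate} B/s, {share:.1%} of assumed capacity")
    for quantile, (before, after) in LATENCY_MS.items():
        print(f"{quantile}: {relative_reduction(before, after):.1%} lower with FPGA")
    for view, (p, r) in DETECTION.items():
        print(f"{view}: harmonic mean = {f1_score(p, r):.2f}")
```

Under the 8N1 assumption the link carries 92,160 B/s, so compressed frames would occupy roughly a third of it, ROI or feature packets about 8%, and detection results under 3%. These shares are computed from average packet sizes only; the P95 sizes and any protocol overhead would raise them.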

Keywords:

ESP32, ESP32-CAM, Mojo FPGA, video surveillance, hardware monitoring

How to Cite

[1]
S. Adilzhanova, G. Amirkhanova, M. Kunelbayev, and A. Rakhysh, “A Hardware Platform for Smart Video Monitoring Based on ESP32-CAM and Mojo FPGA (Spartan-6) with Event Activation Triggered by a PIR Sensor”, Eng. Technol. Appl. Sci. Res., vol. 16, no. 3, pp. 36089–36096, Jun. 2026.
