Enhanced Text Detection in Natural Scenes using Advanced Machine Learning Techniques
Received: 25 December 2024 | Revised: 11 January 2025, 21 January 2025, and 29 January 2025 | Accepted: 31 January 2025 | Online: 3 April 2025
Corresponding author: Praveen M. Dhulavvagol
Abstract
Text detection in natural scenes remains a fundamental challenge in computer vision, affecting applications from mobile navigation to document digitization. Traditional methods struggle with varying text orientations, complex backgrounds, and inconsistent lighting, while recent deep-learning approaches are often computationally expensive. This paper presents a hybrid framework that combines traditional computer vision with machine learning to achieve robust text detection. The framework integrates optimized preprocessing, feature extraction based on Histogram of Oriented Gradients (HOG) and Maximally Stable Extremal Regions (MSER), and a lightweight convolutional neural network (CNN) for improved accuracy and efficiency. Experimental evaluation on benchmark datasets yields 98% precision, 97.5% recall, and a 97.8% F1-score, while maintaining real-time processing at 45 frames per second (fps). The framework outperforms existing methods across diverse text scenarios. This research contributes to the advancement of text detection technology and offers practical applications in augmented reality, autonomous navigation, and document processing systems.
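As a reading aid, the short sketch below shows how the reported figures relate to one another, assuming the standard definition of the F1-score as the harmonic mean of precision and recall. The function names `f1_score` and `frame_budget_ms` are illustrative and do not come from the paper; the inputs are the rounded values quoted in the abstract, not the underlying experimental counts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Rounded figures taken from the abstract (assumed inputs, not raw results).
    print(f"F1 = {f1_score(0.98, 0.975):.4f}")
    print(f"Per-frame budget = {frame_budget_ms(45):.1f} ms")
```

With the rounded inputs, the harmonic mean comes out at roughly 0.9775, which is in line with the reported 97.8% once rounding of the precision and recall values is allowed for. A rate of 45 fps corresponds to an average budget of about 22 ms per frame for the whole pipeline.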
Keywords:
text detection, natural scenes, feature extraction, machine learning, CNN, SVM
License
Copyright (c) 2025 Shivanand <. Patil, V. S. Malemath, Suman Muddapur, Praveen M. Dhulavvagol

This work is licensed under a Creative Commons Attribution 4.0 International License.