Dual-Branch Convolutional Neural Network for Image Comparison in Presentation Style Coherence

Maria Vlahova-Takova; Milena Lazarova

doi:10.48084/etasr.9571

Authors

Maria Vlahova-Takova, Technical University of Sofia, Sofia, Bulgaria
Milena Lazarova, Technical University of Sofia, Sofia, Bulgaria

Volume: 15 | Issue: 2 | Pages: 21719-21727 | April 2025 | https://doi.org/10.48084/etasr.9571

Received: 10 November 2024 | Revised: 29 December 2024 | Accepted: 4 January 2025 | Online: 3 April 2025

Corresponding author: Maria Vlahova-Takova

Abstract

Image comparison is a core step in many computer vision pipelines. Maintaining style coherence across presentation slides is essential for professionalism and effective communication: inconsistent design elements, such as varying fonts, colors, borders, and logo placements, disrupt the visual flow and diminish the overall impact. This study introduces an approach that automates the validation of presentation slide coherence using a Dual-Branch Convolutional Neural Network (CNN). The model is trained to calculate a similarity score between slide images based on key design parameters, including font consistency, color schemes, border styles, and layout alignment. The architecture takes two slide images as input and performs binary classification. Unlike traditional Siamese networks, which rely on identical branches and a distance metric for feature comparison, the proposed dual-branch architecture concatenates feature embeddings from two specialized branches and processes them through fully connected layers for final classification, allowing more targeted feature extraction and coherence evaluation. The model was evaluated on a custom dataset of 6000 images synthesized according to specific style-coherence design guidelines, chosen to provide both consistency and variety while keeping the dataset balanced for comparison tasks. The experimental results show improvements over the baseline Siamese network across all reported metrics. Specifically, the proposed model achieved an accuracy of 0.85 compared to 0.81 for the baseline, a Jaccard similarity of 0.76 vs. 0.72, a Kappa coefficient of 0.69 vs. 0.62, and a ROC AUC of 0.87 vs. 0.81. Additionally, precision increased from 0.73 to 0.77 and the F1-score reached 0.87, reflecting a stronger balance between precision and recall.
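For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, precision, recall, F1-score, Jaccard similarity, Cohen's Kappa, and ROC AUC can be computed for a binary coherent/incoherent classifier. It is an illustration only, not the authors' evaluation code: the function names `binary_metrics` and `roc_auc` and the toy labels are assumptions introduced here, and the abstract does not state how the metrics were averaged.

```python
def binary_metrics(y_true, y_pred):
    """Compute common binary classification metrics from 0/1 labels.

    Hypothetical helper for illustration; label 1 is treated as the
    positive ("coherent") class, which is an assumption.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    jaccard = tp / (tp + fp + fn) if tp + fp + fn else 0.0

    # Cohen's Kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "jaccard": jaccard,
        "kappa": kappa,
    }


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented purely to exercise the functions.
example_true = [1, 1, 1, 1, 0, 0, 0, 0]
example_pred = [1, 1, 1, 0, 1, 0, 0, 0]
example_metrics = binary_metrics(example_true, example_pred)
example_auc = roc_auc([1, 1, 1, 0, 0], [0.9, 0.8, 0.4, 0.5, 0.3])
print(example_metrics, example_auc)
```

Under this positive-class definition, Jaccard similarity and F1-score are linked by J = F1 / (2 - F1), so the two move together. Kappa additionally discounts the agreement expected by chance.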
This work provides a significant contribution to automated design evaluation, offering a flexible and modular architecture that supports multi-view analysis and captures intricate visual patterns and discrepancies. By addressing key limitations of traditional approaches, the proposed model provides a robust tool to ensure style coherence in professional presentations, paving the way for more efficient and accurate design validation processes.
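The structural difference between a Siamese comparison and the dual-branch design described in the abstract can be sketched as follows. This is a toy illustration under strong assumptions, not the authors' model: small dense layers stand in for the convolutional branches, the inputs are short feature vectors rather than slide images, the weights are random and untrained, and all names (`DualBranchHead`, `make_dense`, `siamese_distance`) and layer sizes are hypothetical.

```python
import math
import random


def make_dense(n_in, n_out, rng):
    """Create a dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x, relu=False):
    """Apply a dense layer to vector x, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class DualBranchHead:
    """Two separately parameterised branches -> concatenation -> FC -> probability."""

    def __init__(self, n_features, n_embed=4, n_hidden=8, seed=0):
        rng = random.Random(seed)
        # Each branch has its own weights (assumption: dense stand-ins for CNN branches).
        self.branch_a = make_dense(n_features, n_embed, rng)
        self.branch_b = make_dense(n_features, n_embed, rng)
        self.hidden = make_dense(2 * n_embed, n_hidden, rng)
        self.out = make_dense(n_hidden, 1, rng)

    def predict(self, x_a, x_b):
        emb_a = dense(self.branch_a, x_a, relu=True)
        emb_b = dense(self.branch_b, x_b, relu=True)
        fused = emb_a + emb_b  # list concatenation of the two embeddings
        h = dense(self.hidden, fused, relu=True)
        return sigmoid(dense(self.out, h)[0])  # score in (0, 1) for the binary decision


def siamese_distance(shared_branch, x_a, x_b):
    """Siamese-style comparison: one shared branch, then Euclidean distance."""
    emb_a = dense(shared_branch, x_a, relu=True)
    emb_b = dense(shared_branch, x_b, relu=True)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)))


model = DualBranchHead(n_features=6)
probability = model.predict([0.1] * 6, [0.9] * 6)
distance = siamese_distance(make_dense(6, 4, random.Random(1)), [0.2] * 6, [0.2] * 6)
print(probability, distance)
```

In the Siamese variant the comparison is fixed by the chosen distance function, whereas the concatenation-plus-fully-connected head learns how the two embeddings should be combined. That learned combination is the property the abstract credits for the more targeted coherence evaluation.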

Keywords:

image similarity, presentation advisor, image processing, presentation coherence, neural networks
