Optimizing Similar Audience Search in Targeted Advertising: Effectiveness of Siamese Networks for Autoencoder-based User Embeddings

Il'murat Tokhtakhunov; Aizhan Altaibek; Marat Nurtas

doi:10.48084/etasr.10527

Authors

Il'murat Tokhtakhunov, International Information Technology University, Manasa, Almaty, Kazakhstan
Aizhan Altaibek, International Information Technology University, Manasa, Almaty, Kazakhstan | Institute of Ionosphere, Gardening Community IONOSPHERE 117, Almaty, Kazakhstan
Marat Nurtas, International Information Technology University, Manasa, Almaty, Kazakhstan | Institute of Ionosphere, Gardening Community IONOSPHERE 117, Almaty, Kazakhstan

Volume: 15 | Issue: 3 | Pages: 23367-23375 | June 2025 | https://doi.org/10.48084/etasr.10527

Received: 10 February 2025 | Revised: 12 March 2025 and 26 March 2025 | Accepted: 29 March 2025 | Online: 26 May 2025

Corresponding author: Marat Nurtas

Abstract

This study investigates the effectiveness of Siamese networks for comparing embedding vectors that describe user profiles. A model was developed to identify similar audiences in the context of targeted advertising. An analysis of the requirements for such a model revealed that traditional approaches to tabular data processing often struggle with the challenges of this task, particularly scalability and adaptability. The proposed approach identifies lookalike users without relying on explicit feature engineering. The method was evaluated on an anonymized proprietary dataset provided by a telecommunications operator, which included sociodemographic descriptions of subscribers, their tariff plans, and their mobile devices. The model achieved an F1 score of 0.75, a ROC-AUC of 0.79, and a lift score in the top 1 of 12.9, outperforming baseline methods in targeted user identification by 41.61% on average. These results indicate that the proposed method meets the key requirements of the task and scales effectively. The approach is also applicable to tabular data classification tasks in other domains. Future research will focus on developing multiple autoencoders tailored to different domains and integrating them to solve specific tasks.

Keywords:

user profiling, siamese network, embeddings, audience selection, autoencoder, targeted advertising, cosine similarity distance
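
The lookalike search and the lift metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the trained Siamese network is replaced here by plain cosine similarity over embeddings that are assumed to be already available (for example, from an autoencoder). The function names, the mean-similarity scoring rule, and the reading of lift as "positive rate in the top fraction of ranked users divided by the overall positive rate" are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)


def score_candidates(seed_embeddings, candidate_embeddings):
    """Score each candidate by its mean cosine similarity to the seed audience.

    Assumption: a hypothetical scoring rule standing in for the learned
    Siamese comparison described in the abstract.
    """
    return [
        sum(cosine_similarity(c, s) for s in seed_embeddings) / len(seed_embeddings)
        for c in candidate_embeddings
    ]


def lift_at_top(scores, labels, fraction):
    """Positive rate among the top `fraction` of scores over the overall positive rate.

    Assumption: `labels` are 0/1 with at least one positive, and `fraction`
    is in (0, 1]; at least one user is always kept in the top group.
    """
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


if __name__ == "__main__":
    # Toy data, invented for this example only.
    seeds = [[1.0, 0.0]]
    candidates = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0], [0.0, 2.0]]
    labels = [1, 0, 0, 0]
    scores = score_candidates(seeds, candidates)
    print(scores)
    print(lift_at_top(scores, labels, 0.25))
```

Under this definition, a lift above 1 means the top-ranked group contains a higher share of target users than the population as a whole.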
