Unenhanced Sparse Vector-based Embedding Method for Sentiment Analysis

Authors

  • G. R. Kishore, Department of Information Science and Engineering, JSS Science and Technology University, Mysuru, Karnataka, India
  • B. S. Harish, Department of Information Science and Engineering, JSS Science and Technology University, Mysuru, Karnataka, India
  • C. K. Roopa, Department of Information Science and Engineering, JSS Science and Technology University, Mysuru, Karnataka, India

Volume: 15 | Issue: 2 | Pages: 21225-21231 | April 2025 | https://doi.org/10.48084/etasr.10098

Abstract

Natural language processing is one of the most active areas of research, and sentiment analysis is among its best-known problems. Many methods have been proposed for text-based sentiment data, with social networks serving as both a major data source and a research target. A key step in designing a text-based model is the embedding method, which determines how the inputs are represented. This study presents a novel static text embedding method and compares its sentiment classification performance with that of several well-known text embedding methods. The proposed method performs on par with existing embedding methods, achieving a classification accuracy of 90.66%.

Keywords:

sentiment analysis, representation, word2vec, fast-text, word embedding, sparse vector, contextual embedding model

How to Cite

Kishore, G.R., Harish, B.S. and Roopa, C.K. 2025. Unenhanced Sparse Vector-based Embedding Method for Sentiment Analysis. Engineering, Technology & Applied Science Research. 15, 2 (Apr. 2025), 21225–21231. DOI:https://doi.org/10.48084/etasr.10098.
