Unenhanced Sparse Vector-based Embedding Method for Sentiment Analysis
Received: 31 December 2024 | Revised: 23 January 2025 | Accepted: 7 February 2025 | Online: 3 April 2025
Corresponding author: B. S. Harish
Abstract
Natural language processing is one of the most active areas of research, and sentiment analysis is among its best-known problems. Many methods have been proposed for handling text-based sentiment data, with social networks serving as both a main data source and a primary research target. A key step in designing a text-based model is the embedding method, which determines how the inputs are represented. This study presents a novel static text embedding method for representing text inputs and compares its sentiment classification performance with that of several well-known text embedding methods. The results are on par with existing embedding methods, with the proposed method achieving a classification accuracy of 90.66%.
Keywords:
sentiment analysis, representation, word2vec, fast-text, word embedding, sparse vector, contextual embedding model
License
Copyright (c) 2025 G. R. Kishore, B. S. Harish, C. K. Roopa

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.