Comparative Analysis of Machine Learning Algorithms for Sentiment Analysis on Instagram App Reviews: A Guideline for Enhancing User Experience

  • Nutthapat Kaewrattanapat, Suan Sunandha Rajabhat University
  • Mongkolchai Tiansoodeenon, Suan Sunandha Rajabhat University
  • Jarumon Nookhong, Suan Sunandha Rajabhat University
Keywords: Sentiment Analysis, Machine Learning, Naïve Bayes, k-Means Clustering, Instagram Reviews

Abstract

This study compares machine learning algorithms for classifying and clustering Instagram app reviews from the Google Play Store, with the aim of understanding user sentiment and improving app performance. The dataset, obtained from Kaggle, consists of 210,542 reviews, from which a random sample of 1,103 reviews was used for analysis. Four algorithms were applied: Naïve Bayes, k-Nearest Neighbors (k-NN), k-Means, and k-Medoids. To evaluate their performance, 12 models were tested using different vectorization techniques, including TF-IDF, Term Occurrences, Binary Term Occurrences, and n-grams. The results show that Naïve Bayes combined with TF-IDF and 2-grams delivered the highest accuracy for sentiment classification, at 99.46%. This high accuracy can be attributed to Naïve Bayes’ effectiveness in handling probabilistic distributions in text data. In the clustering analysis, k-Means with 2-grams and stemming produced the most distinct groupings, with a Davies-Bouldin index of -7.279, successfully separating positive and negative sentiments. This research provides a robust framework for conducting sentiment analysis on large-scale user reviews. By applying these machine learning techniques, developers and app managers can gain insight into user satisfaction and identify areas for improvement. Ultimately, these methods can help enhance user experience and drive app success through a deeper understanding of user feedback.
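
To make the best-performing classification setup more concrete, the following is a minimal sketch of a Naïve Bayes classifier over TF-IDF-weighted unigrams and 2-grams, written in plain Python. It is not the implementation used in the study: the tokenizer, the smoothed IDF formula, the Laplace smoothing constant alpha, the class name TfidfNaiveBayes, and the four toy reviews are all assumptions chosen for illustration, and the sketch makes no attempt to reproduce the reported accuracy.

```python
import math
from collections import Counter, defaultdict


def tokens_with_bigrams(text):
    """Lowercase, split on whitespace, and append 2-grams to the unigrams."""
    words = text.lower().split()
    bigrams = [f"{a}_{b}" for a, b in zip(words, words[1:])]
    return words + bigrams


def fit_idf(docs):
    """Smoothed inverse document frequency (an assumed, common variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}


def tfidf(doc, idf):
    """TF-IDF weights for one tokenized document; unseen terms are dropped."""
    tf = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


class TfidfNaiveBayes:
    """Multinomial Naive Bayes that uses TF-IDF weights as fractional counts."""

    def fit(self, texts, labels, alpha=1.0):
        docs = [tokens_with_bigrams(t) for t in texts]
        self.idf = fit_idf(docs)
        self.classes = sorted(set(labels))

        # Accumulate the TF-IDF mass of every term within each class.
        weights = {c: defaultdict(float) for c in self.classes}
        for doc, y in zip(docs, labels):
            for term, w in tfidf(doc, self.idf).items():
                weights[y][term] += w

        counts = Counter(labels)
        self.log_prior = {
            c: math.log(counts[c] / len(labels)) for c in self.classes
        }

        # Laplace-smoothed log-likelihood of each term given the class.
        vocab_size = len(self.idf)
        self.log_like = {}
        for c in self.classes:
            total = sum(weights[c].values()) + alpha * vocab_size
            self.log_like[c] = {
                t: math.log((weights[c].get(t, 0.0) + alpha) / total)
                for t in self.idf
            }
        return self

    def predict(self, text):
        vec = tfidf(tokens_with_bigrams(text), self.idf)
        scores = {
            c: self.log_prior[c]
            + sum(w * self.log_like[c][t] for t, w in vec.items())
            for c in self.classes
        }
        return max(scores, key=scores.get)


# Hypothetical toy reviews, not taken from the Kaggle dataset.
train_texts = [
    "love this app great filters",
    "great app love the stories",
    "app keeps crashing hate it",
    "hate the ads keeps crashing",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = TfidfNaiveBayes().fit(train_texts, train_labels)
print(model.predict("love the filters"))
print(model.predict("keeps crashing"))
```

Adding 2-grams lets the model pick up short phrases such as "keeps crashing" as single features, which is one plausible reason an n-gram representation can help on short reviews. A real pipeline would also need preprocessing (for example stop-word handling and stemming) and a held-out test split before any accuracy figure could be reported.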

Published
2024-12-05