Comparison of the Naive Bayes and K-Nearest Neighbors Algorithms for Sentiment Analysis of Gojek App Reviews
DOI:
https://doi.org/10.46880/jmika.Vol10No1.pp186-191
Keywords:
Sentiment Analysis, Naive Bayes, K-Nearest Neighbors, TF-IDF, PySastrawi, Gojek, Google Play Store
Abstract
Sentiment analysis of mobile app reviews helps understand public perceptions of digital service quality. This study compares two machine learning algorithms, Naive Bayes (NB) and K-Nearest Neighbors (KNN), for classifying sentiment in Gojek app reviews from the Google Play Store. The dataset includes 5,000 reviews (1-star as negative and 5-star as positive), processed through Indonesian text preprocessing steps: case folding, tokenization, stopword removal, stemming with PySastrawi, and TF-IDF feature extraction using unigrams and bigrams. After cleaning, 4,685 valid reviews remained, split into 80% training and 20% testing, producing 3,322 features. Results show that Naive Bayes (MultinomialNB, α = 1.0) outperforms KNN, achieving 89.43% accuracy, 90.09% precision, 89.43% recall, and 89.34% F1-score, with a 5-fold cross-validation score of 91.22%. Meanwhile, KNN (k = 7, cosine metric) achieves 86.77% accuracy, 86.78% precision, 86.77% recall, and 86.75% F1-score, with a cross-validation score of 87.83%. Overall, Naive Bayes proves more effective for high-dimensional Indonesian text classification using TF-IDF.
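The comparison described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names, the smoothed IDF variant, and the toy reviews are all assumptions made for illustration, and stopword removal and PySastrawi stemming are skipped. It builds L2-normalized TF-IDF vectors over unigrams and bigrams, trains a multinomial Naive Bayes model with additive smoothing (α = 1.0), and classifies with a cosine-similarity KNN vote. The toy run uses k = 3 because only six training reviews are available; the study reports k = 7.

```python
import math
from collections import Counter, defaultdict


def ngrams(text):
    """Case-fold, split on whitespace, and return unigrams plus bigrams."""
    tokens = text.lower().split()
    return tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the training docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(ngrams(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf(doc, idf):
    """L2-normalized sparse TF-IDF vector; terms unseen in training are dropped."""
    counts = Counter(term for term in ngrams(doc) if term in idf)
    vec = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {term: w / norm for term, w in vec.items()}


def train_nb(vectors, labels, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with additive smoothing alpha."""
    vocab = {term for vec in vectors for term in vec}
    totals = defaultdict(Counter)  # class -> summed weight per term
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            totals[label][term] += weight
    model = {}
    for label in set(labels):
        denom = sum(totals[label].values()) + alpha * len(vocab)
        log_prior = math.log(labels.count(label) / len(labels))
        log_like = {t: math.log((totals[label][t] + alpha) / denom) for t in vocab}
        model[label] = (log_prior, log_like)
    return model


def predict_nb(model, vec):
    """Return the class with the highest log posterior score."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(w * log_like[t] for t, w in vec.items())
    return max(model, key=score)


def predict_knn(train_vecs, train_labels, vec, k=7):
    """Majority vote among the k most cosine-similar training vectors."""
    # Vectors are unit length, so the dot product equals cosine similarity.
    sims = [(sum(w * tv.get(t, 0.0) for t, w in vec.items()), label)
            for tv, label in zip(train_vecs, train_labels)]
    top = sorted(sims, key=lambda pair: pair[0], reverse=True)[:k]
    return Counter(label for _, label in top).most_common(1)[0][0]


# Hypothetical toy reviews, invented for this sketch (not from the study's data).
train_docs = [
    "driver ramah dan cepat",
    "aplikasi bagus sangat membantu",
    "pelayanan cepat dan bagus",
    "aplikasi sering error",
    "driver lama dan batal terus",
    "pelayanan buruk sangat kecewa",
]
train_labels = ["positive"] * 3 + ["negative"] * 3

idf = fit_idf(train_docs)
train_vecs = [tfidf(doc, idf) for doc in train_docs]
nb_model = train_nb(train_vecs, train_labels, alpha=1.0)

query = tfidf("driver cepat dan ramah", idf)
print(predict_nb(nb_model, query), predict_knn(train_vecs, train_labels, query, k=3))
```

Naive Bayes scores each class once from aggregated per-class term weights, whereas KNN compares the query against every stored training vector; in a sparse feature space of a few thousand dimensions, many of those similarities are zero or near zero, which is one plausible reason the vote-based method trails in the reported results.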
License
Copyright (c) 2026 Rousyati Rousyati, Dany Pratmanto, Fandhilah Fandhilah

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.










