Analysis of the Effectiveness of the Cosine Similarity and Boyer-Moore Algorithms in a Digital Document Search System
DOI:
https://doi.org/10.46880/jmika.Vol10No1.pp269-277
Keywords:
Cosine Similarity, Boyer-Moore, Information Retrieval, TF-IDF, Precision, Recall
Abstract
This study analyzes the effectiveness of the cosine similarity and Boyer-Moore algorithms for digital document retrieval in SIPANDOK, a system developed for the Bengkalis Regency Health Office. Cosine similarity measures semantic document relevance from TF-IDF-weighted vectors, while Boyer-Moore performs direct pattern matching using string heuristics. The system was built with the Rapid Application Development (RAD) methodology and evaluated on 56 documents and 55 test queries using precision, recall, F1-score, accuracy, and execution time. The results indicate that Boyer-Moore achieves higher average recall (66.7%) and F1-score (33.3%), making it better at retrieving relevant documents, whereas cosine similarity runs faster (0.31 seconds on average, versus 0.91 seconds for Boyer-Moore). Each algorithm therefore has distinct advantages, and the better choice depends on whether a retrieval scenario prioritizes precision or recall.
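To make the two retrieval approaches and the evaluation metrics concrete, the following is a minimal sketch and not the SIPANDOK implementation. It assumes whitespace tokenization, a raw term frequency times log(N/df) weighting (one common TF-IDF variant), and a Boyer-Moore search that uses only the bad-character rule. All function names and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (per-document TF-IDF dicts, idf table) for tokenized docs.
    Assumed weighting: raw term count * log(N / document frequency)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def boyer_moore_find(text, pattern):
    """Index of the first match of pattern in text, or -1.
    Simplified Boyer-Moore: bad-character rule only, no good-suffix rule."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    last = {ch: i for i, ch in enumerate(pattern)}  # rightmost position of each char
    s = 0
    while s <= n - m:
        j = m - 1
        while j >= 0 and pattern[j] == text[s + j]:  # compare right to left
            j -= 1
        if j < 0:
            return s
        s += max(1, j - last.get(text[s + j], -1))   # shift past the mismatch
    return -1

def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)
    p = tp / len(retrieved) if retrieved else 0.0
    r = tp / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Toy corpus (hypothetical, not the study's 56 documents).
names = ["d1", "d2", "d3"]
texts = ["annual health report", "child immunization data", "monthly immunization report"]
query = "immunization report"

vectors, idf = tfidf_vectors([t.split() for t in texts])
q_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(query.split()).items()}
ranked = sorted(names, key=lambda d: cosine(q_vec, vectors[names.index(d)]), reverse=True)

# Exact-phrase retrieval with the simplified Boyer-Moore search.
matched = [d for d, t in zip(names, texts) if boyer_moore_find(t, "immunization") != -1]

print(ranked[0], matched, precision_recall_f1(matched, ["d3"]))
```

In this sketch the cosine ranking orders every document by score, while the pattern search returns only documents that contain the literal string, which is one reason the two methods can trade off precision and recall differently.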
License
Copyright (c) 2026 Wawan Ade Saputra, Lidya Wati

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.










