Covid-19 Hoax Detection Using KNN in Jaccard Space
Ema Utami(1), Ahmad Fikri Iskandar(2*), Wahyu Hidayat(3), Agung Budi Prasetyo(4), Anggit Dwi Hartanto(5)
(1) Magister Teknik Informatika, Universitas Amikom Yogyakarta, Yogyakarta
(2) Magister Teknik Informatika, Universitas Amikom Yogyakarta, Yogyakarta
(3) Magister Teknik Informatika, Universitas Amikom Yogyakarta, Yogyakarta
(4) Magister Teknik Informatika, Universitas Amikom Yogyakarta, Yogyakarta
(5) Magister Teknik Informatika, Universitas Amikom Yogyakarta, Yogyakarta
(*) Corresponding Author
Abstract
Social media has become a key communication channel for sparking thinking, dialogue, and action around social issues. A hoax is information in which content has been added to or subtracted from the actual news. The spread of unconfirmed Covid-19 news can cause public concern. The purpose of this research was to modify KNN to operate in Jaccard space for the classification of hoax news related to Covid-19. The data were taken from Jabar Saber Hoaks and Jala Hoaks. KNN in Jaccard space with Nazief & Adriani stemming achieved the highest accuracy among the models evaluated in this research. With Nazief & Adriani stemming and K = 5, the KNN model in Jaccard space reached an accuracy of 75.89%, while Naïve Bayes reached 65.18%.
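As a rough illustration of what classifying with KNN in Jaccard space can look like, the sketch below represents each text as a set of tokens, measures distance as one minus the Jaccard similarity of two sets, and takes a majority vote among the K nearest training texts. It is a minimal sketch, not the authors' implementation: the function names, the whitespace tokenizer, and the toy sentences are all assumptions made for illustration, and the Nazief & Adriani stemming step is omitted.

```python
from collections import Counter


def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B| for two token sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def tokenize(text):
    # Placeholder preprocessing (assumption): lowercase and split on whitespace.
    # The paper applies Nazief & Adriani stemming, which is not reproduced here.
    return set(text.lower().split())


def knn_predict(train, query, k=5):
    """Majority label among the k training texts closest to `query`.

    `train` is a list of (text, label) pairs.
    """
    q = tokenize(query)
    ranked = sorted(train, key=lambda pair: jaccard_distance(tokenize(pair[0]), q))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented toy examples, NOT taken from Jabar Saber Hoaks or Jala Hoaks.
toy_train = [
    ("minum air garam menyembuhkan covid", "hoax"),
    ("air garam hangat membunuh virus covid", "hoax"),
    ("bawang putih menyembuhkan covid", "hoax"),
    ("pemerintah umumkan jadwal vaksinasi covid", "valid"),
    ("jadwal vaksinasi covid diumumkan kementerian kesehatan", "valid"),
    ("kementerian kesehatan rilis data kasus covid", "valid"),
]

if __name__ == "__main__":
    print(knn_predict(toy_train, "air garam menyembuhkan virus covid", k=3))
```

Because the Jaccard distance depends only on which tokens two texts share, any preprocessing that merges word variants (such as stemming) directly changes the neighborhoods that the vote is taken over.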
DOI: https://doi.org/10.22146/ijccs.67392
Copyright (c) 2021 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.