Comparison of SVM and LIWC for Sentiment Analysis of SARA
AAIN Eka Karyawati(1*), Prasetyo Adi Utomo(2), I Gede Arta Wibawa(3)
(1) Informatics Study Program, FMIPA, Universitas Udayana, Bali
(2) Informatics Study Program, FMIPA, Universitas Udayana, Bali
(3) Informatics Study Program, FMIPA, Universitas Udayana, Bali
(*) Corresponding Author
Abstract
SARA (from the Indonesian suku, agama, ras, dan antargolongan) refers to sensitive issues rooted in sentiments about self-identity regarding ancestry, religion, nationality, or ethnicity. SARA issues provoke conflict between groups, which leads to hatred and division, and they spread widely through social media, especially Twitter. Addressing the problem requires an effective method for filtering negative SARA content. This study analyzes Indonesian-language tweets to determine whether each tweet contains positive SARA, contains negative SARA, or does not contain SARA (neutral). A machine learning method (support vector machine, SVM) and a lexicon-based method (Linguistic Inquiry and Word Count, LIWC) were compared on a dataset of 450 tweets to determine the best approach for each sentiment class (positive, negative, and neutral). For negative SARA, the best results were obtained by SVM with λ = 3 and γ = 0.1, where Precision = 0.9, Recall = 0.6, and F1-Score = 0.72. For positive SARA, the best results were obtained by LIWC, where Precision = 0.6, Recall = 0.8, and F1-Score = 0.69. For the neutral class, the best results were obtained by SVM with λ = 3 and γ = 0.1, where Precision = 0.52, Recall = 0.87, and F1-Score = 0.65.
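The F1-Scores reported above can be checked against the stated Precision and Recall values. The following is a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` and the dictionary `reported` are illustrative and do not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as stated in the abstract.
# The dictionary keys are illustrative labels, not names from the paper.
reported = {
    "negative SARA (SVM)": (0.90, 0.60, 0.72),
    "positive SARA (LIWC)": (0.60, 0.80, 0.69),
    "neutral (SVM)": (0.52, 0.87, 0.65),
}

for label, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    # Compare after rounding to two decimals, matching the abstract's precision.
    consistent = abs(f1 - f1_reported) < 0.005
    print(f"{label}: computed F1 = {f1:.4f}, reported = {f1_reported}, consistent = {consistent}")
```

Under this definition, all three reported F1-Scores agree with their Precision and Recall values to two decimal places.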
DOI: https://doi.org/10.22146/ijccs.69617
Copyright (c) 2022 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.