Hate Speech Detection in Indonesian Twitter using Contextual Embedding Approach
Guntur Budi Herwanto(1*), Annisa Maulida Ningtyas(2), I Gede Mujiyatna(3), Kurniawan Eka Nugraha(4), I Nyoman Prayana Trisna(5)
(1) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(2) Department of Health Information and Services, Universitas Gadjah Mada, Yogyakarta, Indonesia
(3) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(4) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(5) Department of Computer Science and Electronics, FMIPA UGM, Yogyakarta
(*) Corresponding Author
DOI: https://doi.org/10.22146/ijccs.64916
Copyright (c) 2021 IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.