Analisis Fitur Kalimat untuk Peringkas Teks Otomatis pada Bahasa Indonesia (Sentence Feature Analysis for Automatic Text Summarization in Indonesian)

https://doi.org/10.22146/ijccs.2019

Badrus Zaman(1*), Edi Winarko(2)

(1) 
(2) 
(*) Corresponding Author

Abstract


Automatic Text Summarization (ATS) is a technique for automatically producing a summary that conveys the most important information in the original document. Sentence weighting relies on features, including Log-TFISF (term frequency-inverse sentence frequency), sentence location, sentence overlap, title overlap, and sentence relative length. This research analyzes these five features to determine the weight of each feature that yields a coherent summary. The five features are implemented in an ATS system for the Indonesian language, developed using the relative importance of topics method. Experimental results show that the sentence location feature has the highest F-measure, 0.46, followed by sentence overlap, title overlap, sentence relative length, and Log-TFISF, with values of 0.42, 0.42, 0.35, and 0.32, respectively. The relative feature weights, from largest to smallest, are sentence location, sentence overlap, title overlap, sentence relative length, and Log-TFISF, with values of 0.25, 0.22, 0.22, 0.19, and 0.12. Applying these relative weights in the ATS system gives an accuracy of 70.62%, which is 2.86% more accurate than the 67.72% obtained without relative weights.
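As an illustration of how the reported relative weights might be applied, the following minimal sketch scores each sentence as a weighted sum of its five feature values and keeps the top-scoring sentences in document order. The abstract does not give the combination formula, so the linear sum is an assumption, as are the function names, the feature keys, the requirement that feature values lie in [0, 1], and the `ratio` parameter. The sketch does not implement the relative importance of topics method itself.

```python
# Relative feature weights as reported in the abstract.
WEIGHTS = {
    "sentence_location": 0.25,
    "sentence_overlap": 0.22,
    "title_overlap": 0.22,
    "sentence_relative_length": 0.19,
    "log_tfisf": 0.12,
}


def score_sentence(features, weights=WEIGHTS):
    """Weighted sum of feature values.

    Assumption: each feature value is already normalized to [0, 1];
    a missing feature counts as 0.0.
    """
    return sum(weights[name] * features.get(name, 0.0) for name in weights)


def summarize(sentences, feature_rows, ratio=0.3):
    """Keep the highest-scoring sentences, returned in original order.

    `ratio` is a hypothetical compression ratio, not a value from the paper.
    """
    k = max(1, round(len(sentences) * ratio))
    scores = [score_sentence(row) for row in feature_rows]
    # Rank sentence indices by score, best first.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Restore document order for the selected sentences.
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sentences = ["S1", "S2", "S3", "S4"]
    feature_rows = [
        {"sentence_location": 1.0},
        {},
        {name: 1.0 for name in WEIGHTS},
        {"log_tfisf": 1.0},
    ]
    print(summarize(sentences, feature_rows, ratio=0.5))
```

Because the reported weights sum to 1.00, a sentence whose five features are all 1.0 receives a score of 1.0 under this assumed scheme, which keeps scores comparable across documents.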


Keywords: Automatic Text Summarization (ATS), Log-TFISF, sentence location, sentence overlap, title overlap, sentence relative length, Bahasa Indonesia









Copyright (c) 2011 IJCCS - Indonesian Journal of Computing and Cybernetics Systems

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Copyright of:
IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
ISSN 1978-1520 (print); ISSN 2460-7258 (online)
A scientific journal of research results in Computing
and Cybernetics Systems.
A publication of IndoCEISS.
Gedung S1 Ruang 416 FMIPA UGM, Sekip Utara, Yogyakarta 55281
Fax: +62274 555133
Email: ijccs.mipa@ugm.ac.id | http://jurnal.ugm.ac.id/ijccs


