Improving Regional Language Translator Accuracy with Parallel Corpus Optimization

  • Herry Sujaini, Universitas Tanjungpura
Keywords: statistical machine translation, corpus optimization, parallel corpus, Indonesian-Malay


The quality of Statistical Machine Translation (SMT) is influenced by several factors. The most fundamental is the quantity of the corpus used as base material for building the translation and language models. Corpus quantity is a major factor in ensuring translation quality, but corpus quality cannot be ignored either. Manually checking the source and target sentences of a parallel corpus is very difficult and requires a lot of resources. This paper reports experimental results for a strategy that improves the quality of Indonesian-Malay and Indonesian-Javanese corpora without examining and correcting the sentences in the corpus. The filter is a minimum score for each sentence, as measured by the Bilingual Evaluation Understudy (BLEU) method. The experimental results show that parallel corpus optimization improves translation accuracy by 6.97% for Indonesian-Malay and by 5.55% for Indonesian-Javanese.
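The filtering idea can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that each target sentence in the corpus is scored against a hypothesis translation produced by some baseline system, and that pairs scoring below a chosen threshold are dropped. The function names `sentence_bleu` and `filter_corpus`, the add-one smoothing for higher-order n-grams, and the threshold value are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Sentence-level BLEU in [0, 1] for whitespace-tokenized strings.

    Assumption: add-one smoothing is applied to n-gram precisions for n > 1
    so that short sentences are not forced to zero. The abstract does not
    say which smoothing, if any, was used.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref or not hyp:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clipped matches: a hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if n == 1:
            if matches == 0:
                return 0.0  # no shared words at all
            precision = matches / total
        else:
            precision = (matches + 1) / (total + 1)
        log_precision += math.log(precision) / max_n
    # Brevity penalty for hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(log_precision)


def filter_corpus(pairs, hypotheses, threshold):
    """Keep (source, target) pairs whose BLEU score meets the threshold.

    pairs:      list of (source_sentence, target_sentence) tuples
    hypotheses: baseline translations of the source sentences, same order
    threshold:  minimum sentence-level BLEU (hypothetical tuning parameter)
    """
    return [
        pair
        for pair, hyp in zip(pairs, hypotheses)
        if sentence_bleu(pair[1], hyp) >= threshold
    ]
```

In this sketch, the filtered list would then be used to retrain the translation and language models. The threshold would have to be tuned experimentally.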


