Performance Analysis of LSTM and GRU as Generative Models for Remo Dance
Abstract
Dance animations can be created manually or with a motion capture system. An intelligent system that is able to generate a variety of dance movements would help with this task. A recurrent neural network such as Long Short-Term Memory (LSTM) or Gated Recurrent Unit (GRU) can be trained as a generative model. Such a model memorizes the training data set and replays what it has learned as output of arbitrary length, which makes it feasible for generating dance animation. Remo is a dance that comprises several repeating basic moves. A generative model trained on Remo moves should simplify the animation creation process for this dance. Because the generative model for this kind of problem involves a probabilistic function in the form of a Mixture Density Network (MDN), the random effects of that function also affect model performance. This paper uses LSTM and GRU as generative models for Remo dance moves and tests their performance. SGD, Adagrad, and Adam are used as optimization algorithms, and dropout is used as the regularizer, to find out how these choices affect the training process. The experimental results show that LSTM outperforms GRU in terms of the number of successful training runs. The trained models are able to generate dance animation of unlimited length. The quality of the animations is assessed by visual inspection and by the dynamic time warping (DTW) method. The DTW method shows that, on average, GRU results have 116% greater variance than LSTM results.
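As an illustration of the DTW-based assessment mentioned above, the following is a minimal sketch of the classic dynamic-programming DTW distance between two motion sequences. The abstract does not state which DTW variant, local distance, or pose features were used, so the Euclidean frame distance, the function name `dtw_distance`, and the toy two-dimensional frames are all assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW cost between two sequences of frames.

    Each frame is a tuple of floats (e.g. joint coordinates); all frames
    must have the same dimension. Euclidean distance is an assumed choice.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Toy data (hypothetical): the generated clip holds one pose for an extra frame.
reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
generated = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(dtw_distance(reference, generated))  # 0.0, timing differs but poses match
```

Because DTW tolerates differences in timing, a generated sequence that performs the same poses at a different pace scores close to the reference, while a sequence with different poses scores higher. The spread of such scores across generated samples is one way a variance comparison between models could be computed.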
© Jurnal Nasional Teknik Elektro dan Teknologi Informasi, under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License.