Implementation of Q-Learning and Backpropagation in an Agent That Plays the Flappy Bird Game

Ardiansyah; Ednawati Rainarli

Ardiansyah, Universitas Komputer Indonesia
Ednawati Rainarli, Universitas Komputer Indonesia

Keywords: Flappy Bird, Q-Learning, Value-Function Approximation, Artificial Neural Network, Backpropagation

Abstract

This paper shows how to combine Q-learning and backpropagation for an agent learning to play the Flappy Bird game. The two are combined to predict the value function of each action, a technique known as value-function approximation. Value-function approximation is used to reduce both the learning time and the number of weights stored in memory; previous studies that used regular reinforcement learning alone required longer learning times and stored more weights. The artificial neural network (ANN) architecture used in this study is one ANN per action. The results show that combining Q-learning with backpropagation reduces the agent's learning time by up to 92% and the weights stored in memory by up to 94%, compared with regular Q-learning alone. Despite these reductions, Q-learning combined with backpropagation plays Flappy Bird as well as regular Q-learning does.
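
The idea of one ANN per action, trained by backpropagation toward a Q-learning target, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the two-feature state (for example, horizontal and vertical distance to the next pipe gap), the hidden-layer size, the learning rate, the discount factor, and the class names `ActionNet` and `QAgent` are all assumptions made for the example. Each network outputs a single estimate of Q(s, a) for its own action, and only the network of the action actually taken is updated toward r + γ · max Q(s', a').

```python
import math
import random


class ActionNet:
    """Tiny one-hidden-layer network estimating Q(s, a) for ONE action.

    Hypothetical helper for illustration; sizes and learning rate are assumed.
    """

    def __init__(self, n_inputs, n_hidden, lr=0.1, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        # Sigmoid activations of the hidden layer.
        return [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        # Linear output unit: the estimated Q-value.
        h = self._hidden(x)
        return sum(w * hi for w, hi in zip(self.w2, h)) + self.b2

    def train(self, x, target):
        """One backpropagation step on the loss 0.5 * (q - target)^2."""
        h = self._hidden(x)
        q = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        err = q - target  # derivative of the loss with respect to q
        for j, hj in enumerate(h):
            # Gradient at hidden unit j, computed before w2[j] is changed.
            grad_h = err * self.w2[j] * hj * (1.0 - hj)
            self.w2[j] -= self.lr * err * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= self.lr * grad_h * xi
            self.b1[j] -= self.lr * grad_h
        self.b2 -= self.lr * err


class QAgent:
    """Q-learning agent holding one ActionNet per action (assumed layout)."""

    def __init__(self, n_inputs, n_actions, n_hidden=4, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.gamma = gamma
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.nets = [ActionNet(n_inputs, n_hidden, seed=seed + a)
                     for a in range(n_actions)]

    def q_values(self, state):
        return [net.predict(state) for net in self.nets]

    def act(self, state):
        # Epsilon-greedy action selection.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.nets))
        q = self.q_values(state)
        return q.index(max(q))

    def update(self, state, action, reward, next_state, done):
        # Q-learning target: r, plus the discounted best next value if not terminal.
        target = reward
        if not done:
            target += self.gamma * max(self.q_values(next_state))
        self.nets[action].train(state, target)
        return target
```

In a game loop, the agent would call `act` on the current state, apply the chosen action (flap or do nothing), and then call `update` with the observed reward and next state. Because the Q-values are computed from network weights rather than looked up in a table with one entry per state-action pair, memory use depends on the network size instead of the number of distinct states, which is the kind of saving the abstract reports.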
