ASYNCHRONOUS METHOD FOR ACTOR-CRITIC REINFORCEMENT LEARNING
Keywords: reinforcement learning, actor-critic algorithm, parallel learning, asynchronous gradient descent

Abstract
This work considers a deep reinforcement learning method that implements an asynchronous approach based on gradient descent. An actor-critic algorithm is constructed on the basis of the proposed method. Compared with existing parallel methods, the parallel asynchronous algorithm has a stronger stabilizing effect on the learning process. In addition, the proposed approach allows the learning process to be parallelized by exploiting the multi-core capabilities of modern computers.
References
Mnih V., Kavukcuoglu K., Silver D., et al. Human-level control through deep reinforcement learning // Nature. – 2015. – Vol. 518, No. 7540. – P. 529–533.
Van Hasselt H., Guez A., Silver D. Deep Reinforcement Learning with Double Q-learning, 2015. – preprint arXiv:1509.06461.
Nair A., Srinivasan P., Blackwell S., et al. Massively Parallel Methods for Deep Reinforcement Learning, 2015. – arXiv:1507.04296v2.
Grounds M., Kudenko D. Parallel reinforcement learning with linear function approximation // Proceedings of the 5th, 6th and 7th European Conference on Adaptive and Learning Agents and Multi-agent Systems: Adaptation and Multi-agent Learning. – Springer-Verlag, 2008. – P. 60–74.
Mnih V., Badia A. P., Mirza M., et al. Asynchronous Methods for Deep Reinforcement Learning, 2016. – arXiv:1602.01783 [cs.LG].
Sutton R., Barto A. Reinforcement Learning: An Introduction. – MIT Press, 1998. – 548 p.

ICSFTI2019 Plenary Section