The Deep Deterministic Policy Gradient (DDPG) approach introduced in \cite{lillicrap2015continuous} has shown great success in tackling Reinforcement Learning (RL) problems with continuous state and action spaces. \cite{hausknecht2015deep} extended the DDPG algorithm to handle multiple parameterized continuous actions, demonstrated in the Half-Field-Offense (HFO) RoboCup environment. In this paper we propose an alternative architecture for the HFO RL problem that reduces network size and training time while preserving state-of-the-art performance. We also present an implementation of a skills method that shows great promise for multi-agent problems in the RoboCup environment, though it is not limited to RoboCup. To the best of our knowledge, the skills method has not previously been applied to a continuous state space with multiple parameterized continuous actions.
