Title: | Evaluation of Using Deterministic Heuristics to Accelerate Reinforcement Learning. |
Author: | Walton, G. M. |
Keywords: | Artificial neural networks, Machine learning, Training, Artificial intelligence, Video games, Deep learning, Reinforcement learning, Iterative distillation, DeepMind, Atari learning environment, Convolutional neural networks, Back propagation, Heuristic |
Abstract: | Neural networks frequently face long training times, depending on the corpus of data available to them. Reinforcement learning in particular can take a long time to attain satisfactory performance. Recent efforts to incorporate deterministic logical rules and physical laws into a neural network have met with promising results. Starting from an existing baseline neural network that is designed to learn Pong strictly from a pixel representation of the game board, this thesis adds a ball trajectory-based heuristic to the learning process and evaluates its performance. The evaluation initially shows game score improvements, but demonstrates a sharp score degradation after about 25,000 games. Another evaluation shows that the heuristic incurs a training time increase of approximately 35%. More work remains to assess the long-term viability of this approach. |
Report type: | Technical report |