Title: Human Aided Reinforcement Learning in Complex Environments.
Author: Burn, C. B.
Keywords: Algorithms, Curriculum, Learning, Mathematical models, Sequences, Training, Instructors, Multiagent systems, Artificial intelligence, Computer programs, Teaching, Machine learning, Trident scholar project report, Reinforcement learning, Agent teaching, Advice taking, Time warp, Curriculum planning
Abstract: Reinforcement learning algorithms enable computer programs (agents) to learn to solve tasks through a trial-and-error process. As an agent takes actions in an environment, it receives positive and negative signals that shape its future behavior. To assist the learning process, and to help the agent learn the task faster and more accurately, a human expert can be added to the system to guide the agent in solving the task. This project seeks to expand on current systems that combine a human expert with a reinforcement learning agent. Current systems use human input to modify the signal the agent receives from the environment, which works particularly well for reactive tasks. In more complex tasks, these systems do not work as intended: manipulating the environment's signal structure leads to undesired and unexpected agent behavior following human training. Our systems attempt to incorporate humans in ways that do not modify the environment, but rather modify the decisions the agent makes at critical times in training. One of our solutions (Time Warp) allows the human expert to rewind the agent's training by several seconds and provide an alternate sequence of actions for the agent to take. Another solution (Curriculum Development) allows the human expert to set up critical training points for the agent to learn. The agent then learns how to solve these necessary subskills prior to training in the entire world. Our systems seek to solve the planning requirement by employing a human expert during critical times of learning, as the expert sees fit. Our approaches to the planning requirement will allow the human expert-agent model to be expanded to more complex environments than previously developed human-aided systems could handle.
Report type: Technical report