Hyeong Soo Chang
Computer Science and Engineering
Sogang University
Seoul, Korea
Reinforcement Learning with Supervision by Combining Multiple Learnings and Expert Advices
In this talk, we present a formal, coherent learning framework in which reinforcement learning is combined with multiple parallel learners and expert advice to accelerate the convergence of learning. Our approach is simply to use a nonstationary "potential-based reinforcement function" to shape the reinforcement signal given to the learning "base-agent." The base-agent employs SARSA(0) or adaptive asynchronous value iteration (VI). The supervised inputs to the base-agent, coming from "subagents" carrying out other parallel, independent reinforcement learning and, if available, from experts, are "merged" into the value of the potential-based reinforcement function, and this value enters the update equation of SARSA(0) for the Q-function estimate or of adaptive asynchronous VI for the optimal value-function estimate. The resulting SARSA(0) and adaptive asynchronous VI each converge to an optimal policy.
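The sketch below is a minimal illustration (not the speaker's code) of the idea for the SARSA(0) case: the base-agent's TD target is augmented with a nonstationary potential-based shaping term gamma * Phi(s') - Phi(s), where the potential Phi is obtained by merging value estimates from subagents and, if available, expert advice. The environment interface, the merge rule (a simple max here), and all names such as merge_potential, subagent_values, and expert_values are assumptions for illustration only.

```python
import random
from collections import defaultdict

def merge_potential(state, subagent_values, expert_values=None):
    """Merge supervised inputs into one potential value for `state`.
    A simple max is used here; the actual merge rule in the talk may differ."""
    candidates = [v.get(state, 0.0) for v in subagent_values]
    if expert_values is not None:
        candidates.append(expert_values.get(state, 0.0))
    return max(candidates) if candidates else 0.0

def shaped_sarsa0(env, subagent_values, expert_values=None,
                  episodes=1000, alpha=0.1, gamma=0.95, epsilon=0.1):
    """SARSA(0) whose TD target includes a potential-based shaping term."""
    Q = defaultdict(float)  # base-agent's Q-function estimate, keyed by (state, action)

    def policy(s):
        # epsilon-greedy behavior policy over the base-agent's Q estimates
        if random.random() < epsilon:
            return random.choice(env.actions(s))
        return max(env.actions(s), key=lambda a: Q[(s, a)])

    for _ in range(episodes):
        s = env.reset()
        a = policy(s)
        done = False
        while not done:
            s_next, r, done = env.step(s, a)
            a_next = policy(s_next)
            # Nonstationary potential: re-merged at every step because the
            # subagents (and possibly the experts) keep updating their estimates.
            phi_s = merge_potential(s, subagent_values, expert_values)
            phi_next = merge_potential(s_next, subagent_values, expert_values)
            shaping = gamma * phi_next - phi_s
            td_target = r + shaping + gamma * Q[(s_next, a_next)]
            Q[(s, a)] += alpha * (td_target - Q[(s, a)])
            s, a = s_next, a_next
    return Q
```

The same shaping value could be added to the Bellman backup of an adaptive asynchronous VI loop in place of the SARSA(0) update; only the base-agent's update equation changes, not the merge step.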