CDS Lecture Series


Hyeong Soo Chang
Computer Science and Engineering
Sogang University
Seoul, Korea

Reinforcement Learning with Supervision by Combining Multiple Learnings and Expert Advices
In this talk, we present a formal, coherent learning framework in which reinforcement learning is combined with multiple learning processes and expert advice in order to accelerate the convergence of learning. Our approach is simply to use a nonstationary “potential-based reinforcement function” to shape the reinforcement signal given to the learning “base-agent”. The base-agent employs SARSA(0) or adaptive asynchronous value iteration (VI). The supervised inputs to the base-agent, which come from “subagents” running other parallel, independent reinforcement learning processes and, if available, from experts, are “merged” into the value of the potential-based reinforcement function. This value is then put into the update equation of SARSA(0) for the Q-function estimate, or of adaptive asynchronous VI for the optimal value function estimate. The resulting SARSA(0) and adaptive asynchronous VI algorithms each converge to an optimal policy.
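
As a rough illustration of the idea in the abstract, the sketch below shows one SARSA(0) update in which a potential-based shaping term is added to the reward. It is not taken from the talk. The shaping form gamma * phi_next - phi_s is the standard potential-based one, and the merge rule (a plain average over advisors), the function names merge_potential and shaped_sarsa_update, and all numbers are assumptions made purely for illustration. The potential is treated as nonstationary by evaluating it on the advice available at each of the two time steps.

```python
from typing import Dict, List, Tuple

State = int
Action = int
QTable = Dict[Tuple[State, Action], float]


def merge_potential(advice: List[Dict[State, float]], state: State) -> float:
    """Hypothetical merge rule: average the value each advisor gives `state`.

    Each advisor (a subagent's value estimate or an expert) is a dict from
    state to value. The talk's actual merging scheme is not specified here.
    """
    if not advice:
        return 0.0
    return sum(v.get(state, 0.0) for v in advice) / len(advice)


def shaped_sarsa_update(
    q: QTable,
    s: State,
    a: Action,
    reward: float,
    s_next: State,
    a_next: Action,
    phi_s: float,
    phi_next: float,
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> float:
    """One SARSA(0) step with a potential-based shaping term added to the reward."""
    shaping = gamma * phi_next - phi_s  # potential-based shaping (assumed form)
    td_target = reward + shaping + gamma * q.get((s_next, a_next), 0.0)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (td_target - old)
    return q[(s, a)]


# Usage with made-up advice: the advisors change between steps t and t+1,
# so the potential is nonstationary.
q: QTable = {}
advice_t = [{0: 0.0}, {0: 0.0}]
advice_t1 = [{0: 0.0, 1: 1.0}, {0: 0.0, 1: 3.0}]
phi_s = merge_potential(advice_t, 0)      # potential of state 0 at time t
phi_next = merge_potential(advice_t1, 1)  # potential of state 1 at time t+1
new_q = shaped_sarsa_update(q, 0, 0, 1.0, 1, 0, phi_s, phi_next, alpha=0.5, gamma=0.5)
print(new_q)  # 1.0
```

Passing the two potential values in explicitly keeps the update rule independent of how the advice is merged, so a different merging scheme can be swapped in without touching the SARSA(0) step.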
