Vivek S. Borkar
School of Technology and Computer Science
Tata Institute of Fundamental Research
Bombay, India
This talk will outline the basic philosophy behind reinforcement learning algorithms for Markov decision processes and sketch the techniques used in their convergence analysis. Against this backdrop, some recent extensions will be discussed.