CCSP Seminar: K. A. Sankararaman, "Adversarial Bandits with Knapsacks"

Thursday, March 14, 2019
5:00 p.m.-6:30 p.m.
AVW 2168
Ajaykrishnan Nageswaran
301 405 3661
ajayk@umd.edu
http://www.ece.umd.edu/seminars/ccsp/

Communication, Control and Signal Processing Seminar

Adversarial Bandits with Knapsacks

Karthik Abinav Sankararaman
University of Maryland

Abstract
In this talk, we will discuss the multi-armed bandits problem with resource constraints, in the adversarial setting. In this problem, we have an interactive and repeated game between the algorithm and an adversary. Given $T$ time-steps, $d$ resources, $m$ actions and budgets $B_1, B_2,\ldots B_d$, the algorithm chooses one of the $m$ actions at each time-step. An adversary then reveals a reward and consumption for each of the $d$ resources, corresponding to this action. The time-step at which the algorithm runs out of the $d$ resources (i.e., the total consumption for resource $j > B_j$), the game stops and the total reward is the sum of rewards obtained until the stopping time. The goal is to maximize the competitive ratio; the ratio of the total reward of the algorithm to the expected reward of a fixed distribution that knows the rewards and consumption for all actions and time-steps beforehand. We give an algorithm for this problem whose competitive ratio is tight (matches the lower-bound). The algorithm achieves the optimal bound, both in expectation and with high-probability. Moreover, the algorithmic framework is modular and can be used in a black-box manner to obtain algorithms to many related problems such as the stochastic, contextual, semi-bandit settings as well as to the more general online constrained convex optimization setting.

The talk is based on this paper.

Audience: Graduate Undergraduate Faculty Post-Docs

Browse All Events

April 2024

SU	MO	TU	WE	TH	FR	SA
31	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	1	2	3	4

Submit an Event