SEMINAR

Adversarially robust stochastic multi-armed bandits

Thursday, Jan 16 2020 - 6:23 pm (GMT + 7)
Speaker
Julian Zimmert
Working
University of Copenhagen
Timeline
Thu, Jan 16 2020 - 10:00 am (GMT + 7)
About Speaker

Julian Zimmert received his Master's degree in Mathematics from the Humboldt University of Berlin and is now a final-year PhD student at the University of Copenhagen, working under the supervision of Yevgeny Seldin. His main area of research is robust algorithms for ranges of environments, in particular algorithms for multi-armed bandits in adversarial and stochastic settings. Recently, he did an internship at DeepMind in the Foundations group of Csaba Szepesvari, working on a connection between mirror descent and the information-theoretic analysis of Thompson sampling.

Abstract

Multi-armed bandits are a popular framework for optimal experimental design, with applications in digital advertising and website optimisation. Traditionally, the bandit literature distinguishes between two forms of environments. The stochastic setting assumes that the data is generated by an i.i.d. process, which allows specialised algorithms to learn quickly. At the other extreme, the adversarial setting assumes only boundedness. This makes learning extremely robust, but comes at the cost of significantly slower convergence to the optimal solution. Real-world applications typically lie somewhere in between. While it might be reasonable to assume the data is close to i.i.d., the distribution might be influenced by hidden confounders or undergo unforeseen changes. In practice, this means that stochastic bandit algorithms might fail even to approach a good solution. This poses a serious dilemma for practitioners. Should one prioritise fast or robust learning? But why not both? This talk presents a recent breakthrough in practical all-purpose algorithms.
