Machine Learning in the Bandit Setting: Algorithms, Evaluation, and Case Studies (CS Seminar Lecture Series)

Li, Lihong

Files

February 10 Lecture - Broadband Low.mp4 (908.58 MB)

Downloads: 1141

February 10 Lecture - Broadband Low.webm (125.97 MB)

Downloads: 1672

February 10 Lecture - Broadband Low.mp4-en.vtt (80 KB)

Downloads: 93

Date

2012-02-10

Authors

Li, Lihong

Abstract

Much of machine-learning research is about discovering patterns: building intelligent agents that learn to predict the future accurately from historical data. While this paradigm has been extremely successful in numerous applications, complex real-world problems such as content recommendation on the Internet often require agents to learn to act optimally through autonomous interaction with the world they live in, a problem known as reinforcement learning. Using a news recommendation module on Yahoo!'s front page as a running example, the majority of the talk focuses on the special case of contextual bandits, which have gained substantial interest recently due to their broad applications. We will highlight a fundamental challenge known as the exploration/exploitation tradeoff, present a few newly developed algorithms with strong theoretical guarantees, and demonstrate their empirical effectiveness for personalizing content recommendation at Yahoo!. At the end of the talk, we will also briefly summarize our earlier work on provably data-efficient algorithms for more general reinforcement-learning problems modeled as Markov decision processes.

Bio: Lihong Li is a Research Scientist in the Machine Learning group at Yahoo! Research. He obtained a PhD degree in Computer Science from Rutgers University, advised by Michael Littman. Before that, he obtained an MSc degree from the University of Alberta, advised by Vadim Bulitko and Russell Greiner, and a BE from Tsinghua University. In the summers of 2006-2008, he enjoyed interning at Google, Yahoo! Research, and AT&T Shannon Labs, respectively. His main research interests are in machine learning with interaction, including reinforcement learning, multi-armed bandits, online learning, active learning, and their numerous applications on the Internet. He is the winner of an ICML'08 Best Student Paper Award, a WSDM'11 Best Paper Award, and an AISTATS'11 Notable Paper Award.

The Computer Science Seminar Lecture Series is a collection of weekly lectures on topics at the forefront of contemporary computer science research, given by speakers knowledgeable in their fields of study. The speakers come from a variety of technical and geographic backgrounds, and many of them travel from other universities across the globe to share their knowledge here. The lectures were recorded with an HD video camera, edited with Apple Final Cut Pro X, and exported so that the resulting .mp4 video files are economical to store and stream given the university's limited bandwidth and disk space.

Keywords

Computer Science Lecture Series

Persistent link

http://hdl.handle.net/10919/19039

Collections

Computer Science Seminar Series
