Lecturer: Haipeng Luo
When: See detailed schedule below
Where: Teaching Building 3, 201
Overview: This course focuses on the foundations of the theory of online learning/online convex optimization/sequential decision making, which has been playing a crucial role in machine learning and many real-life applications. The main theme of the course is to study algorithms whose goal is to minimize "regret" (defined formally below) when facing a possibly adversarial environment with possibly limited (a.k.a. "bandit") feedback, and to understand their theoretical guarantees. Some connections to game theory, boosting, and other learning problems will also be covered. This is a mini version of a graduate course taught at USC.

Learning Objectives: At a high level, through this course you will gain a concrete idea of what online learning is about, what the classic algorithms are, and how they are usually analyzed. Specifically, you will learn about algorithms such as exponential weights, follow-the-regularized-leader, UCB, EXP3, SCRiBLe, and others, as well as general techniques for proving regret upper and lower bounds. The broader goal is to cultivate the ability to think about machine learning in a more rigorous and principled way and to design provable and practical machine learning algorithms.

Prerequisites: Familiarity with probability, convex analysis, calculus, and analysis of algorithms. Some basic understanding of machine learning would be very helpful.

Readings: There is no official textbook for this course, but the following books/surveys are very helpful in general:
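For concreteness, "regret" here is the standard notion: the gap between the learner's cumulative loss and that of the best fixed action in hindsight. In conventional notation (the symbols below, e.g. \ell_t for the round-t loss and \mathcal{X} for the action set, are standard choices and not taken from this page):

```latex
% Regret after T rounds of a learner playing x_1, ..., x_T against
% loss functions \ell_1, ..., \ell_T (possibly chosen adversarially):
R_T = \sum_{t=1}^{T} \ell_t(x_t) - \min_{x \in \mathcal{X}} \sum_{t=1}^{T} \ell_t(x)
```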
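As a taste of the algorithms named above, here is a minimal sketch of exponential weights (Hedge) in the experts setting, assuming losses in [0, 1]; the function name and interface are illustrative choices of this sketch, not material from the course.

```python
import numpy as np

def exponential_weights(losses, eta):
    """Exponential weights (Hedge) over a (T, K) matrix of expert losses in [0, 1].

    Illustrative sketch: returns the distributions played and the learner's
    total expected loss. eta is the learning rate (> 0).
    """
    T, K = losses.shape
    weights = np.ones(K)                     # start from uniform weights
    total_loss = 0.0
    plays = []
    for t in range(T):
        p = weights / weights.sum()          # current distribution over experts
        plays.append(p)
        total_loss += float(p @ losses[t])   # expected loss suffered this round
        weights *= np.exp(-eta * losses[t])  # multiplicative weight update
    return np.array(plays), total_loss

# Example: 2 experts, 100 rounds of random losses (hypothetical data)
rng = np.random.default_rng(0)
L = rng.random((100, 2))
dists, loss = exponential_weights(L, eta=np.sqrt(np.log(2) / 100))
```

With a learning rate eta on the order of sqrt(ln K / T), this algorithm guarantees regret O(sqrt(T ln K)) against any sequence of bounded losses, which is the kind of guarantee the course teaches how to prove.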
Schedule: (The first half of the course focuses on problems with full-information feedback, while the second half focuses on bandit feedback.)