|
Introduction to Online Optimization/Learning (Spring 2026)
Project Overview
The final project is to present and summarize one online learning paper, in a team of three (except for one team that has four members since we have 49 students in total). Each of you has been assigned two/three teammates, a paper, and a date for presentation, based on alphabetical order of last names.
The project includes two parts:
(20%) an in-class joint presentation on the assigned paper, on the assigned date.
(20%) a joint final report that summarizes the assigned paper, due on 05/08 (submit through Gradescope as a team; no late day is allowed).
Assigned Papers and Schedule
| Date |
Papers |
Presenters |
| 04/13 |
Optimistic Online-to-Batch Conversions for Accelerated Convergence and Universality |
Mihir Agarwal, Arnav Agrawal, Rishabh Agrawal |
| Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion |
Rohan Badal, Cagan Bakirci, Alec Bernardi |
| Computing Optimal Regularizers for Online Linear Optimization |
Xinjie Chen, Xiwen Chen, Shupeng Cheng |
| On the Convergence of Adam and Beyond |
Jacob Fein-Ashley, Matthew Finlayson, Atul Ganju |
| Dynamic Regret Reduces to Kernelized Static Regret |
Parth Gosar, Zhuoer Gu, Devansh Gupta |
| The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits |
Utkarsh Gupta, Soumita Hait, Lydia Ignatova |
| 04/20 |
On the O(1/T) Convergence of Alternating Gradient Descent–Ascent in Bilinear Games |
Zarif Ikram, Hao Jiang, Shuaiyu Jiang |
| Cautious Optimism: A Meta-Algorithm for Near-Constant Regret in General Games (10 extra minutes) |
Bryan Kim, Jieming Kong, Ryan Lee, Athan Li |
| Versatile Dueling Bandits: Best-of-both World Analyses for Online Learning from Relative Preferences |
Jiate Li, Yuecheng Li, Zeman Li |
| Nash learning from human feedback |
Clifford Lin, Yuzhou Liu, Travis McVoy |
| A minimaximalist approach to reinforcement learning from human feedback |
Tets Nagashima, Kien Nguyen, Steven Nguyen |
| 04/27 |
Online Newton Method for Bandit Convex Optimisation |
Siddharth Raipal, Alfredo Reina Corona, Dexin Ren |
| Is Q-learning Provably Efficient? |
Spandan Senapati, Ania Serbina, Yashaswi Sharma |
| Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability |
Varun Singhal, Vasilis Varsamis, Li-Heng Wang |
| Offline Oracle-Efficient Learning for Contextual MDPs via Layerwise Exploration-Exploitation Tradeoff |
Xu Wang, Peter Wu, Zimin Yang |
| Optimal Dynamic Regret by Transformers for Non-Stationary Reinforcement Learning |
Zitao Yu, Xiaole Zhang, Kexin Zheng |
Guidance on Reading the Papers
Note that these papers are closely related to and extend the materials covered in the lectures. So when reading them, try to find the connections with what you have learned from the lectures.
Some papers could be notation-heavy and get rather technical at some parts. You are by no mean required to understand every single detail of the paper. Instead, try to focus on the key ideas/messages. It is fine that you only skim some parts of the paper, as long as you spend time in carefully reading and understanding some other main parts.
You should discuss with your teammates for better understanding of the paper!
If you need further help understanding the papers, feel free to ask on Piazza or come to our office hours.
Instructions on the Presentation
The goal of this part is to give you an opportunity to practice the skill of giving talks, which is an extremely important part of your study (but is often overlooked). Follow the instructions below to make sure you and your teammates give a smooth talk.
Each presentation is about 25 minutes long plus 5 minutes of Q&A and transition (except that the 4-student team has 10 extra minutes). Each member should present for roughly 10 minutes, in whatever manner you think works best. It is recommended that you roughly break down the talk into following parts:
(~5 minutes) introduction: what is the topic you are presenting? how is this related to the lectures? what are the main results of the paper in a few words? why are they significant? what will you cover in the rest of the talk?
(~8 minutes) setting up: describe the problem and introduce necessary notations.
(~10 minutes) main results: describe the main algorithms and/or theorems, with emphasis on explaining the key ideas.
(~2 minutes) conclusion: conclude you talk and highlight the key take-home message.
In addition, keep the following in mind when preparing the talks:
You can either give a board talk (just like our lectures) or use slides for the presentation. For a board talk, think carefully about how you are going to use the board: how much space do I need? what should I put in each part of the board? how do I avoid erasing stuff that I need later? If you are using slides instead, please let me know before the lecture so we can make sure there is no problem in setting up the projector.
Anticipate questions from the audience. Just like the lectures, everyone is encouraged to ask questions during the talk. Make sure you understand the question before answering, and try to answer it concisely so you can finish the talk in time. It is perfectly fine if you do not know the answer for tough questions — simply acknowledge it in this case and move on. But make sure you address simple clarification questions, especially at the beginning of the talk when setting up the problems.
If you have any other questions, feel free to ask on Piazza!
Instructions on Writing the Report
The final report has to be written in Latex and must be at least 6-page long, excluding references. You can use the NeurIPS format that we have been using for the homework, or some other similar styles. The report summarizes and distills the main results of your assigned paper, and should include the following components:
A short abstract, summarizing the entire survey.
Introduction: Introduce the main topic of you assigned paper. Try to put it into the context of online learning literature and explain how it relates to topics covered in our lectures. Briefly mention the high-level results and explain the significance of the results (such as improvement over prior work).
Problem setup: For this and the next two parts (“Main results” and “Proofs”), you can use the lecture notes as examples. For problem setup, you should describe the problems in detail using necessary notation. Once again, you are not required to cover everything in the paper, so only describe in detail what you plan to cover in this short report.
Main results: Describe the main algorithms/theorems. For algorithms, describe what they are doing at each step and what the key ideas are. For theorems, after the formal statement, explain in words what the statement really means and what the implications are.
Proofs: Try to distill some proofs from the paper and reproduce them in the report. Pick the ones that you think are the most important or inspiring to you. If the original proofs are long and complicated, try to break it down into several parts (in the form of lemmas for example), and only present the proofs for some of these parts. Again, use the lecture notes as examples.
A short conclusion, highlighting the main message again. In addition, provide a short contribution statement that explains which parts are written by whom (it is okay that some parts are jointly written).
References.
The following two are optional and considered as bonus tasks.
Open questions: Identify interesting and concrete open questions in the same direction that are not mentioned in the papers already. Mention briefly what you think the potential approaches are to tackle these open questions and/or why you think these are hard problems that require new techniques beyond what the paper already presents.
More papers in the same direction: Read more related papers on the same topic and summarize what they are about and how they are related/compared to your assigned paper. To find these other papers, you can use the reference list of your assigned paper, use Google Scholar to see which papers cite your assigned paper, or do a general online search on the related topic.
If you have any other questions, feel free to ask on Piazza!
|