Theoretical Machine Learning (Fall 2024)

Project Overview

The final project is to present and summarize one machine learning theory paper, in a team of two. Each of you has been assigned a teammate, a paper, and a date for presentation, based on alphabetical order of last names (mostly, with some individual requests on dates taken into account). The project includes two parts:

(30%) an in-class joint presentation on the assigned paper, on the assigned date.
(30%) a joint final report that summarizes the assigned paper, due on 12/17 (submit through Brightspace; no late day is allowed).

Assigned Papers and Schedule

Date	Papers	Presenters
11/15	Learnability, Stability and Uniform Convergence	Xu Wang and Yanze Wang
	Multiclass Learnability and the ERM Principle	Matin Amini and Dimitrios Andreadis
	Surprises in High-Dimensional Ridgeless Least Squares Interpolation	Merve Atasever and Amin Banayeeanzade
	From Tempered to Benign Overfitting in ReLU Neural Networks	Jie Cai and Junyang Cai
	Adversarially Robust Learning: A Generic Minimax Optimal Learner and Characterization	Weike Fang and Eddie Lee
	A Theory of Interpretable Approximations	Soumita Hait and Saba Hashemi
11/22	Multiclass Online Learning and Uniform Convergence	Sayan Ghosh and Qingpei Li
	Private and Online Learnability are Equivalent	Yahan Li and Jiaqi Lu
	On the convergence of Adam and Beyond	Josh Shangguan and Gaozhan Wang
	A Trichotomy for Transductive Online Learning	Mevan Pathirannahelage and Dylan Rowe
	Online Isotonic Regression	Mahdi Salmani and Sajjad Shahabi
	Online Learning with Simple Predictors and a Combinatorial Characterization of Minimax in 0/1 Games	Yue Wu and Ao Xu
12/06	Bayesian Design Principles for Frequentist Sequential Learning	Guangxu Yang and Yuxin Yang
	Bandit Learnability can be Undecidable	Akhil Agnihotri and Rishabh Agrawal
	Linear Partial Monitoring for Sequential Decision Making Algorithms, Regret Bounds and Applications	Ryan Marr and Dhruv Parikh
	Bandit-Feedback Online Multiclass Classification: Variants and Tradeoffs	Ludwika Twardecka and Huijuan Wang
	Optimal Rates for Bandit Nonstochastic Control	Dongze Ye and Yihang Zhang

Guidance on Reading the Papers

Note that these papers are closely related to and extend the materials covered in the lectures. So when reading them, try to find the connections with what you have learned from the lectures.
Some papers could be notation-heavy and get rather technical at some parts. You are by no mean required to understand every single detail of the paper. Instead, try to focus on the key ideas/messages. It is fine that you only skim some parts of the paper, as long as you spend time in carefully reading and understanding some other main parts.
It is highly recommended that you discuss with your teammate for better understanding of the paper!
If you need further help understanding the paper, feel free to ask on Piazza or come to our office hours.

Instructions on the Presentation

The goal of this part is to give you an opportunity to practice the skill of giving talks, which is an extremely important part of your PhD training (but is often overlooked). Follow the instructions below to make sure you and your teammate give a smooth talk.

Each presentation is about 25 minutes long plus 5 minutes of Q&A and transition. The two of you should roughly present for half of the time, in whatever manner you think works best (e.g., one goes first, followed by the other). It is recommended that you roughly break down the talk into following parts:

(~5 minutes) introduction: what is the topic you are presenting? how is this related to the lectures? what are the main results of the assigned paper in a few words? why are they significant? what will you cover in the rest of the talk?
(~8 minutes) setting up: describe the problem and introduce necessary notations.
(~10 minutes) main results: describe the main algorithms and/or theorems, with emphasis on explaining the key ideas.
(~2 minutes) conclusion: conclude the talk and highlight key take-home messages.

In addition, keep the following in mind when preparing the talks:

You can either give a board talk (just like our lectures) or use slides for the presentation. For a board talk, think carefully about how you are going to use the board: how much space do I need? what should I put in each part of the board? how do I avoid erasing stuff that I need later? If you are using slides instead, please let me know before the lecture so we can make sure there is no problem in setting up the projector.

Anticipate questions from the audience. Just like the lectures, everyone is encouraged to ask questions during the talk. Make sure that you understand the question before answering, and try to answer it concisely so you can finish the talk in time. It is perfectly fine if you do not know the answer for tough questions — simply acknowledge it in this case and move on. But make sure you address simple clarification questions, especially at the beginning of the talk when setting up the problems.

By no mean you need to cover everything in your assigned paper! In fact, most likely you will only have time to cover one or two main results. Keep this in mind and make the key message of your talk clear.

Even if the paper is notation-heavy, it does not mean your talk has to be so. Try to use a minimum amount of notation to convey what you want to say. It is common to make simplification in talks even if that means sacrificing generality or even rigorousness.

If you have any other questions, feel free to ask us on Piazza!

Instructions on Writing the Report

The final joint report has to be written in Latex and must be at least 6 pages long, excluding references. You can use the NeurIPS format that we have been using for the homework, or some other similar styles. The report summarizes and distills the main results of your assigned paper, and should include the following components (for an example, see Lecture notes 4 again where I summarized Bartlett-Foster-Telgarsky’17 in the exact same way):

A short abstract, summarizing the entire report.
Introduction: Introduce the main topic of you assigned paper, and explain how it relates to topics covered in our lectures. Briefly mention the high-level results and explain the significance of the results (such as improvement over prior work).
Problem setup: Describe the problems in detail using necessary notation. Once again, you are not required to cover everything in the paper, so only describe in detail what you plan to cover in this short report.
Main results: Describe the main algorithms/theorems. For algorithms, describe what they are doing at each step and what the key ideas are. For theorems, after the formal statement, explain in words what the statement really means and what the implications are.
Proofs: Try to distill some proofs from the paper and reproduce them in the report. Pick the ones that you think are the most important or inspiring to you. If the original proofs are long and complicated, try to break it down into several parts (in the form of lemmas for example), and only present the proofs for some of these parts. Again, use the lecture notes as examples.
A short conclusion, highlighting the main message again. In addition, provide a short contribution statement that explains which parts are written by whom (it is okay that some parts are jointly written by both of you).
References.

The following two are optional and considered as bonus tasks.

Open questions: Identify interesting and concrete open questions in the same direction that are not mentioned in the papers already. Mention briefly what you think the potential approaches are to tackle these open questions and/or why you think these are hard problems that require new techniques beyond what the paper already presents.
More papers in the same direction: Read more related papers on the same topic and summarize what they are about and how they are related/compared to your assigned paper. To find these other papers, you can use the reference list of your assigned paper, use Google Scholar to see which papers cite your assigned paper, or do a general online search on the related topic.

It is sufficient that either one of the team members makes a final submission of the report through Brightspace. If you have any other questions, feel free to ask us on Piazza!