Publications
Preprints
[arXiv]
Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games.
Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, and Weiqiang Zheng.
Conference Papers
2024:
[NeurIPS 2024]
Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms.
Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, and Weiqiang Zheng.
[NeurIPS 2024]
Tractable Local Equilibria in Non-Concave Games.
Yang Cai, Constantinos Daskalakis, Haipeng Luo, Chen-Yu Wei, and Weiqiang Zheng.
[NeurIPS 2024]
Provably Efficient Interactive-Grounded Learning with Personalized Reward.
Mengxiao Zhang, Yuheng Zhang, Haipeng Luo, and Paul Mineiro.
[ICML 2024]
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback.
Asaf Cassel, Haipeng Luo, Aviv Rosenberg, and Dmitry Sotnikov.
[ICML 2024]
Efficient Contextual Bandits with Uninformed Feedback Graphs.
Mengxiao Zhang, Yuheng Zhang, Haipeng Luo, and Paul Mineiro.
[AISTATS 2024 Oral]
Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games.
Yang Cai, Haipeng Luo, Chen-Yu Wei, and Weiqiang Zheng.
2023:
[NeurIPS 2023 Spotlight]
Regret Matching+: (In)Stability and Fast Convergence in Games.
Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, and Haipeng Luo.
[NeurIPS 2023]
No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions.
Tiancheng Jin, Junyan Liu, Chloé Rouyer, William Chang, Chen-Yu Wei, and Haipeng Luo.
[NeurIPS 2023]
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games.
Yang Cai, Haipeng Luo, Chen-Yu Wei, and Weiqiang Zheng.
[NeurIPS 2023]
Practical Contextual Bandits with Feedback Graphs.
Mengxiao Zhang, Yuheng Zhang, Olga Vrousgou, Haipeng Luo, and Paul Mineiro.
[ICML 2023]
Refined Regret for Adversarial MDPs with Linear Function Approximation.
Yan Dai, Haipeng Luo, Chen-Yu Wei, and Julian Zimmert.
[ALT 2023]
Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs.
Haipeng Luo, Hanghang Tong, Mengxiao Zhang, and Yuheng Zhang.
[AISTATS 2023]
No-Regret Learning in Two-Echelon Supply Chain with Unknown Demand Distribution.
Mengxiao Zhang, Shi Chen, Haipeng Luo, and Yingfei Wang.
[UAI 2023]
Posterior Sampling-based Online Learning for the Stochastic Shortest Path Model.
Mehdi Jafarnia-Jahromi, Liyu Chen, Rahul Jain, and Haipeng Luo.
2022:
[NeurIPS 2022 Oral]
Uncoupled Learning Dynamics with O(log T) Swap Regret in Multiplayer Games.
Ioannis Anagnostides, Gabriele Farina, Christian Kroer, Chung-Wei Lee, Haipeng Luo, and Tuomas Sandholm.
[NeurIPS 2022]
Near-Optimal No-Regret Learning for General Convex Games.
Gabriele Farina, Ioannis Anagnostides, Haipeng Luo, Chung-Wei Lee, Christian Kroer, and Tuomas Sandholm.
[NeurIPS 2022]
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback.
Tiancheng Jin, Tal Lancewicki, Haipeng Luo, Yishay Mansour, and Aviv Rosenberg.
[NeurIPS 2022 OPT Workshop]
Clairvoyant Regret Minimization: Equivalence with Nemirovski’s Conceptual Prox Method and Extension to General Convex Games.
Gabriele Farina, Christian Kroer, Chung-Wei Lee, and Haipeng Luo.
[COLT 2022]
Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits.
Haipeng Luo, Mengxiao Zhang, Peng Zhao, and Zhi-Hua Zhou.
[ICML 2022]
No-Regret Learning in Time-Varying Zero-Sum Games.
Mengxiao Zhang, Peng Zhao, Haipeng Luo, and Zhi-Hua Zhou.
[ICML 2022]
Kernelized Multiplicative Weights for 0/1-Polyhedral Games: Bridging the Gap Between Learning in Extensive-Form and Normal-Form Games.
Gabriele Farina, Chung-Wei Lee, Haipeng Luo, and Christian Kroer.
2021:
[NeurIPS 2021]
Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path.
Liyu Chen, Mehdi Jafarnia-Jahromi, Rahul Jain, and Haipeng Luo.
[COLT 2021]
Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games.
Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, and Haipeng Luo.
[ICML 2021]
Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously.
Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, and Xiaojin Zhang.
[ICLR 2021]
Linear Last-iterate Convergence in Constrained Saddle-point Optimization.
Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, and Haipeng Luo.
[AISTATS 2021]
Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation.
Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, and Rahul Jain.
[AISTATS 2021]
Active Online Learning with Hidden Shifting Domains.
Yining Chen, Haipeng Luo, Tengyu Ma, and Chicheng Zhang.
[ALT 2021]
Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds.
Ehsan Emamjomeh-Zadeh, Chen-Yu Wei, Haipeng Luo, and David Kempe.
2020:
[NeurIPS 2020 Oral]
Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs.
Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, and Mengxiao Zhang.
[ICML 2020]
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition.
Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, and Tiancheng Yu.
[ICML 2020]
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes.
Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Hiteshi Sharma, and Rahul Jain.
[UAI 2020]
Fair Contextual Multi-Armed Bandits: Theory and Experiments.
Yifang Chen, Alex Cuellar, Haipeng Luo, Jignesh Modi, Heramb Nemlekar, and Stefanos Nikolaidis.
2019:
[NeurIPS 2019]
Equipping Experts/Bandits with Long-term Memory.
Kai Zheng, Haipeng Luo, Ilias Diakonikolas, and Liwei Wang.
[NeurIPS 2019]
Hypothesis Set Stability and Generalization.
Dylan J. Foster, Spencer Greenberg, Satyen Kale, Haipeng Luo, Mehryar Mohri, and Karthik Sridharan.
[COLT 2019]
Improved Path-length Regret Bounds for Bandits.
Sébastien Bubeck, Yuanzhi Li, Haipeng Luo, and Chen-Yu Wei.
[COLT 2019]
A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free.
Yifang Chen, Chung-Wei Lee, Haipeng Luo, and Chen-Yu Wei.
[COLT 2019, joint extended abstract]
Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information.
Peter Auer, Yifang Chen, Pratik Gajane, Chung-Wei Lee, Haipeng Luo, Ronald Ortner, and Chen-Yu Wei.
2018:
[COLT 2018 Best Student Paper Award]
Logistic Regression: The Importance of Being Improper.
Dylan J. Foster, Satyen Kale, Haipeng Luo, Mehryar Mohri, and Karthik Sridharan.
[COLT 2018]
Efficient Contextual Bandits in Non-stationary Worlds.
Haipeng Luo, Chen-Yu Wei, Alekh Agarwal, and John Langford.
[ICML 2018]
Practical Contextual Bandits with Regression Oracles.
Dylan J. Foster, Alekh Agarwal, Miroslav Dudík, Haipeng Luo, and Robert E. Schapire.
2017 and Before:
[FOCS 2017, JACM]
Oracle-Efficient Online Learning and Auction Design.
Miroslav Dudík, Nika Haghtalab, Haipeng Luo, Robert E. Schapire, Vasilis Syrgkanis, and Jennifer Wortman Vaughan.
[COLT 2017]
Corralling a Band of Bandit Algorithms.
Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, and Robert E. Schapire.
[NeurIPS 2016]
Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits.
Vasilis Syrgkanis, Haipeng Luo, Akshay Krishnamurthy, and Robert E. Schapire.
[NeurIPS 2016]
Efficient Second Order Online Learning via Sketching.
Haipeng Luo, Alekh Agarwal, Nicolò Cesa-Bianchi, and John Langford.
[NeurIPS 2015 Best Paper Award]
Fast Convergence of Regularized Learning in Games.
Vasilis Syrgkanis, Alekh Agarwal, Haipeng Luo, and Robert E. Schapire.
[NeurIPS 2015]
Online Gradient Boosting.
Alina Beygelzimer, Elad Hazan, Satyen Kale, and Haipeng Luo.
Open Problems
[COLT 2017]
Open Problem: First-Order Regret Bounds for Contextual Bandits. [A solution by Allen-Zhu, Bubeck, and Li]
Alekh Agarwal, Akshay Krishnamurthy, John Langford, Haipeng Luo, and Robert E. Schapire.