Publications

Preprints

  • [arXiv] Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications.
    Liyu Chen, Haipeng Luo, and Chen-Yu Wei.

  • [arXiv] Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition.
    Liyu Chen, Haipeng Luo, and Chen-Yu Wei.

  • [arXiv] Finding the Stochastic Shortest Path with Low Regret: The Adversarial Cost and Unknown Transition Case.
    Liyu Chen and Haipeng Luo.

  • [arXiv] Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach.
    Chen-Yu Wei and Haipeng Luo.

  • [arXiv] Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously.
    Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, and Xiaojin Zhang.

  • [arXiv] Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games.
    Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, and Haipeng Luo.

Conference Papers

  • [ICLR 2021] Linear Last-iterate Convergence in Constrained Saddle-point Optimization.
    Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, and Haipeng Luo.

  • [AISTATS 2021] Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation.
    Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, and Rahul Jain.

  • [AISTATS 2021] Active Online Learning with Hidden Shifting Domains.
    Yining Chen, Haipeng Luo, Tengyu Ma, and Chicheng Zhang.

  • [ALT 2021] Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds.
    Ehsan Emamjomeh-Zadeh, Chen-Yu Wei, Haipeng Luo, and David Kempe.

  • [NeurIPS 2020 Oral] Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs.
    Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, and Mengxiao Zhang.

  • [NeurIPS 2020 Spotlight] Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition.
    Tiancheng Jin and Haipeng Luo.

  • [NeurIPS 2020] Comparator-Adaptive Convex Bandits.
    Dirk van der Hoeven, Ashok Cutkosky, and Haipeng Luo.

  • [COLT 2020] Taking a Hint: How to Leverage Loss Predictors in Contextual Bandits?
    Chen-Yu Wei, Haipeng Luo, and Alekh Agarwal.

  • [COLT 2020] A Closer Look at Small-loss Bounds for Bandits with Graph Feedback.
    Chung-Wei Lee, Haipeng Luo, and Mengxiao Zhang.

  • [ICML 2020] Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition.
    Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, and Tiancheng Yu.

  • [ICML 2020] Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes.
    Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Hiteshi Sharma, and Rahul Jain.

  • [UAI 2020] Fair Contextual Multi-Armed Bandits: Theory and Experiments.
    Yifang Chen, Alex Cuellar, Haipeng Luo, Jignesh Modi, Heramb Nemlekar, and Stefanos Nikolaidis.

  • [NeurIPS 2019 Spotlight] Model Selection for Contextual Bandits.
    Dylan J. Foster, Akshay Krishnamurthy, and Haipeng Luo.

  • [NeurIPS 2019] Equipping Experts/Bandits with Long-term Memory.
    Kai Zheng, Haipeng Luo, Ilias Diakonikolas, and Liwei Wang.

  • [NeurIPS 2019] Hypothesis Set Stability and Generalization.
    Dylan J. Foster, Spencer Greenberg, Satyen Kale, Haipeng Luo, Mehryar Mohri, and Karthik Sridharan.

  • [COLT 2019] Improved Path-length Regret Bounds for Bandits.
    Sébastien Bubeck, Yuanzhi Li, Haipeng Luo, and Chen-Yu Wei.

  • [COLT 2019] A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free.
    Yifang Chen, Chung-Wei Lee, Haipeng Luo, and Chen-Yu Wei.

  • [COLT 2019 joint extended abstract] Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information.
    Peter Auer, Yifang Chen, Pratik Gajane, Chung-Wei Lee, Haipeng Luo, Ronald Ortner, and Chen-Yu Wei.

  • [ICML 2019 long talk] Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously.
    Julian Zimmert, Haipeng Luo, and Chen-Yu Wei.

  • [NeurIPS 2018 spotlight] Efficient Online Portfolio with Logarithmic Regret.
    Haipeng Luo, Chen-Yu Wei, and Kai Zheng.

  • [COLT 2018 Best Student Paper Award] Logistic Regression: The Importance of Being Improper.
    Dylan J. Foster, Satyen Kale, Haipeng Luo, Mehryar Mohri, and Karthik Sridharan.

  • [COLT 2018] More Adaptive Algorithms for Adversarial Bandits.
    Chen-Yu Wei and Haipeng Luo.

  • [COLT 2018] Efficient Contextual Bandits in Non-stationary Worlds.
    Haipeng Luo, Chen-Yu Wei, Alekh Agarwal, and John Langford.

  • [ICML 2018] Practical Contextual Bandits with Regression Oracles.
    Dylan J. Foster, Alekh Agarwal, Miroslav Dudik, Haipeng Luo, and Robert E. Schapire.

  • [FOCS 2017, JACM] Oracle-Efficient Online Learning and Auction Design.
    Miroslav Dudík, Nika Haghtalab, Haipeng Luo, Robert E. Schapire, Vasilis Syrgkanis, and Jennifer Wortman Vaughan.

  • [COLT 2017] Corralling a Band of Bandit Algorithms.
    Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, and Robert E. Schapire.

  • [NeurIPS 2016] Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits.
    Vasilis Syrgkanis, Haipeng Luo, Akshay Krishnamurthy, and Robert E. Schapire.

  • [NeurIPS 2016] Efficient Second Order Online Learning via Sketching.
    Haipeng Luo, Alekh Agarwal, Nicolò Cesa-Bianchi, and John Langford.

  • [ICML 2016] Variance-Reduced and Projection-Free Stochastic Optimization.
    Elad Hazan and Haipeng Luo.

  • [NeurIPS 2015 Best Paper Award] Fast Convergence of Regularized Learning in Games.
    Vasilis Syrgkanis, Alekh Agarwal, Haipeng Luo, and Robert E. Schapire.

  • [NeurIPS 2015] Online Gradient Boosting.
    Alina Beygelzimer, Elad Hazan, Satyen Kale, and Haipeng Luo.

  • [COLT 2015] Achieving All with No Parameters: AdaNormalHedge.
    Haipeng Luo and Robert E. Schapire.

  • [NeurIPS 2014] A Drifting-Games Analysis for Online Learning and Applications to Boosting.
    Haipeng Luo and Robert E. Schapire.

  • [NeurIPS 2014 OPT workshop] Accelerated Parallel Optimization Methods for Large Scale Machine Learning.
    Haipeng Luo, Patrick Haffner, and Jean-Francois Paiement.

  • [ICML 2014] Towards Minimax Online Learning with Unknown Time Horizon.
    Haipeng Luo and Robert E. Schapire.

Open Problems

  • [COLT 2020] Open Problem: Model Selection for Contextual Bandits.
    Dylan J. Foster, Akshay Krishnamurthy, and Haipeng Luo.

  • [COLT 2017] Open Problem: First-Order Regret Bounds for Contextual Bandits. [a solution by Allen-Zhu, Bubeck, and Li]
    Alekh Agarwal, Akshay Krishnamurthy, John Langford, Haipeng Luo, and Robert E. Schapire.

PhD Thesis

Misc.