Background

Dynamic programming (DP) based algorithms, which apply various forms of the Bellman operator, dominate the literature on model-free reinforcement learning (RL). While DP is powerful, the value function estimate can oscillate or even diverge when function approximation is introduced with off-policy data, except in special cases. This problem, known in the literature as the deadly triad, has been recognized for decades and remains a critical open problem in RL.

More recently, the community has witnessed a fast-growing trend of framing RL problems as well-posed optimization problems, in which a proper objective function is proposed whose minimization yields the optimal value function. Such an optimization-based approach provides a promising perspective that brings mature mathematical tools to bear on integrating linear/nonlinear function approximation with off-policy data, while avoiding DP’s inherent instability. Moreover, the optimization perspective extends naturally to constraints, sparsity regularization, distributed multi-agent scenarios, and other new settings.

Beyond allowing powerful optimization techniques to be applied to a variety of RL problems, the special recursive structure and restricted exploration sampling in RL naturally raise the question of whether tailored algorithms, guided by established optimization techniques, can be developed to improve sample efficiency, convergence rates, and asymptotic performance.

The goal of this workshop is to catalyze collaboration between the reinforcement learning and optimization communities, pushing the boundaries from both sides. It will provide a forum for establishing a mutually accessible introduction to current research on this integration, and allow exploration of recent advances in optimization for potential application in reinforcement learning. It will also be a window to identify and discuss existing challenges and forward-looking problems in reinforcement learning that are of interest to the optimization community.

Invited Speakers

Shipra Agrawal (Columbia University)
Sham Kakade (University of Washington)
Benjamin Van Roy (DeepMind & Stanford University)
Mengdi Wang (Princeton University)
Huizhen Yu (University of Alberta)

Invited Panelists

Richard Sutton (DeepMind & University of Alberta)
Doina Precup (McGill University)

Dates

Submission deadline: ~~September 10th~~, ~~September 17th, 2019~~ (~~11:59 pm AOE~~)
Notifications: ~~October 1st, 2019~~
Camera ready: ~~November 15th, 2019~~ (~~11:59 pm AOE~~)
Workshop: December 14th, 2019

Awards

We will provide student travel awards to a few of the authors of accepted papers.

To apply for the travel awards, please send the following information to optrl2019@gmail.com by the application deadline ~~Oct. 28th, 2019~~:

An email title containing [OptRL 2019 Student Travel Awards Application].
Your paper ID and title.
A brief bio, no more than one paragraph.
Proof of student status, such as a photocopy of your student ID or a link to your university web page.

Committees

Organizers

Bo Dai (Google Brain)
Niao He (University of Illinois at Urbana-Champaign)
Nicolas Le Roux (Google Brain)
Lihong Li (Google Brain)
Dale Schuurmans (Google Brain & University of Alberta)
Martha White (University of Alberta)

Program committee

Alekh Agarwal
Zafarali Ahmed
Kavosh Asadi
Marlos C. Machado
Jianshu Chen
Yinlam Chow
Adithya Devraj
Thinh Doan
Simon Du
Yihao Feng
Roy Fox
Matthieu Geist
Saeed Ghadimi
Shixiang Gu
Botao Hao
Nan Jiang
Ajin Joseph
Donghwan Lee
Alex Lewandowski
Vincent Liu
Rupam Mahmood
Jincheng Mei
Ofir Nachum
Gergely Neu
Mohammad Norouzi
Andrew Patterson
Yash Satsangi
Matthew Schlegel
Karan Singh
Ziyang Tang
Valentin Thomas
Sergio Valcarcel Macua
Junfeng Wen
Zheng Wen
Adam White
Tengyang Xie
Zhuoran Yang
Shangtong Zhang
Tuo Zhao

For questions, please contact us: optrl2019@gmail.com

NeurIPS 2019 Optimization Foundations of Reinforcement Learning Workshop

Workshop at NeurIPS 2019, Dec 14th, 2019
West Ballroom A, Vancouver Convention Center, Vancouver, Canada