Publications
Improved Discretization Analysis for Underdamped Langevin Monte Carlo
Underdamped Langevin Monte Carlo (ULMC) is an algorithm for sampling from unnormalized densities by leveraging the momentum of a particle moving in a potential well. Our analysis achieves state-of-the-art dimension dependence and is flexible enough to handle weakly smooth potentials. It also enjoys better condition-number dependence than prior work on ULMC.
Matthew (Shunshi) Zhang, Sinho Chewi, Mufan Bill Li, Krishnakumar Balasubramanian, Murat A. Erdogdu
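The underdamped dynamics in this abstract can be sketched with a plain Euler–Maruyama discretization. This is a minimal illustrative scheme, not the discretization analyzed in the paper; the friction `gamma`, step size `h`, and the standard Gaussian test target are all assumed for the example.

```python
import numpy as np

def ulmc(grad_U, dim, gamma=2.0, h=0.05, n_steps=2000, seed=0):
    """Euler-Maruyama sketch of underdamped Langevin dynamics targeting
    pi(x) ∝ exp(-U(x)): the position follows the velocity, while the
    velocity feels friction, the potential gradient, and injected noise."""
    rng = np.random.default_rng(seed)
    x, v = np.zeros(dim), np.zeros(dim)
    xs = np.empty((n_steps, dim))
    for k in range(n_steps):
        x = x + h * v
        v = (v - h * gamma * v - h * grad_U(x)
             + np.sqrt(2.0 * gamma * h) * rng.standard_normal(dim))
        xs[k] = x
    return xs

# Standard Gaussian target: U(x) = |x|^2 / 2, so grad_U(x) = x.
samples = ulmc(lambda x: x, dim=2)
```

For this toy target the chain's marginals should settle near the standard Gaussian, up to O(h) discretization bias.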
Tight Regret and Complexity Bounds for Thompson Sampling via Langevin Monte Carlo
We propose and analyze Markov chain Monte Carlo methods as an implementation of Thompson sampling. Our algorithm attains optimal regret bounds while also providing explicit bounds on the number of data evaluations for LMC and MALA. Finally, we validate our findings through numerical simulations and show that our approach outperforms vanilla Thompson sampling in high dimensions.
Tom Huix, Matthew (Shunshi) Zhang, Alain Durmus
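As a toy illustration of the idea, here is Thompson sampling on a Gaussian bandit where each posterior draw is replaced by a short LMC run on the log-posterior. This is a hypothetical sketch, not the paper's algorithm: the conjugate posterior here could be sampled exactly, and the step-size constant `c` and run length are illustrative assumptions.

```python
import numpy as np

def lmc_posterior_sample(n, s, lmc_steps=20, c=0.5, rng=None):
    """Approximate draw from N(s/(n+1), 1/(n+1)) -- the posterior of an arm
    mean under a N(0, 1) prior and unit-variance rewards -- via LMC on the
    negative log-posterior U(mu) = (n+1) mu^2 / 2 - s mu."""
    rng = np.random.default_rng(rng)
    h = c / (n + 1.0)  # scale the step by the posterior precision for stability
    mu = 0.0
    for _ in range(lmc_steps):
        grad = (n + 1.0) * mu - s  # U'(mu)
        mu = mu - h * grad + np.sqrt(2.0 * h) * rng.standard_normal()
    return mu

def lmc_thompson(true_means, T=300, seed=0):
    """Each round: draw one approximate posterior sample per arm,
    play the argmax, observe a noisy reward, update sufficient stats."""
    rng = np.random.default_rng(seed)
    K = len(true_means)
    counts, sums, total = np.zeros(K), np.zeros(K), 0.0
    for _ in range(T):
        draws = [lmc_posterior_sample(counts[i], sums[i], rng=rng) for i in range(K)]
        a = int(np.argmax(draws))
        r = true_means[a] + rng.standard_normal()
        counts[a] += 1
        sums[a] += r
        total += r
    return counts, total

counts, total = lmc_thompson([0.0, 1.0])
```

With a reward gap of 1, the approximate sampler should quickly concentrate plays on the better arm.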
Towards a Theory of Non-Log-Concave Sampling: First-Order Stationarity Guarantees for Langevin Monte Carlo
We provide an upper bound for the relative Fisher information after some number of iterations of LMC. This is the first known sampling analogue of complexity bounds for finding approximate first-order stationary points in non-convex optimization.
Krishnakumar Balasubramanian, Sinho Chewi, Murat A. Erdogdu, Adil Salim, Matthew (Shunshi) Zhang
Analysis of Langevin Monte Carlo from Poincaré to Log-Sobolev
We provide the first convergence guarantees for the Langevin Monte Carlo algorithm under a functional inequality interpolating between log-Sobolev and Poincaré, improving on both the rates and the assumptions of prior work.
Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li, Ruoqi Shen, Matthew (Shunshi) Zhang
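The Langevin Monte Carlo iteration itself is simple; a minimal sketch (the step size and the Gaussian test target are illustrative choices, not the paper's setting):

```python
import numpy as np

def lmc(grad_U, x0, h=0.1, n_steps=5000, seed=0):
    """Langevin Monte Carlo for pi(x) ∝ exp(-U(x)):
    x_{k+1} = x_k - h * grad_U(x_k) + sqrt(2h) * xi_k,  xi_k ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    xs = np.empty((n_steps,) + x.shape)
    for k in range(n_steps):
        x = x - h * grad_U(x) + np.sqrt(2.0 * h) * rng.standard_normal(x.shape)
        xs[k] = x
    return xs

# Standard Gaussian target, U(x) = |x|^2 / 2.
chain = lmc(lambda x: x, x0=np.zeros(3))
```

For this target the stationary distribution of the discretized chain is Gaussian with variance 1/(1 - h/2), so the samples are close to standard normal for small h.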
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
We analyze the convergence of policy gradient and natural policy gradient when the policy class satisfies only weak smoothness guarantees and L2 integrability, under a variety of learning-rate decay schedules. This is considerably more general than existing work, which requires Lipschitz gradients and absolute boundedness. We also analyze the optimality of stationary points, and supplement our analysis with empirical verification.
Matthew (Shunshi) Zhang, Murat A. Erdogdu, Animesh Garg
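As a minimal illustration of policy gradient under a decaying learning rate, here is REINFORCE with a one-step Gaussian policy on a toy quadratic reward. The reward, the 1/√k schedule, the clipping range, and all constants are illustrative assumptions, not the paper's setting.

```python
import numpy as np

def reinforce_quadratic(theta0=0.0, n_iters=20000, c=0.1, seed=0):
    """Maximize E_{a ~ N(theta, 1)}[r(a)] with r(a) = -(a - 2)^2 using the
    score-function (REINFORCE) estimator
        g_k = (a_k - theta_k) * r(a_k)   ( = grad_theta log pi(a_k) * r(a_k) )
    and the decaying step size eta_k = c / sqrt(k + 1)."""
    rng = np.random.default_rng(seed)
    theta = theta0
    for k in range(n_iters):
        a = theta + rng.standard_normal()      # sample an action from N(theta, 1)
        r = -(a - 2.0) ** 2                    # reward, maximized at a = 2
        g = (a - theta) * r                    # score-function gradient estimate
        theta = theta + (c / np.sqrt(k + 1.0)) * g
        theta = float(np.clip(theta, -10.0, 10.0))  # projection keeps iterates bounded
    return theta

theta = reinforce_quadratic()
```

Since E[g] = -2(theta - 2), the expected update is gradient ascent toward theta = 2; the projection step is a common stabilization against the estimator's heavy-tailed noise.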
Convergence of Langevin Monte Carlo in Chi-Squared and Rényi Divergence
We analyze the Langevin Monte Carlo algorithm in the strong divergence measures of Rényi and chi-squared divergence, and obtain improved rates under the log-Sobolev inequality. In contrast to prior work, our assumptions permit some non-convex potentials, such as those that are strongly convex outside a compact set.
Murat A. Erdogdu, Rasa Hosseinzadeh, Matthew (Shunshi) Zhang
One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
We propose a method for generating sparse RNN parameterizations with minimal performance degradation. It relies on preserving the temporal Jacobian spectrum, which measures information loss across long time horizons, and allows for more efficient parameter utilization in sequential learning environments.
Matthew (Shunshi) Zhang, Bradley Stadie
Benchmarking Model-Based Reinforcement Learning
We evaluate the performance of standard model-based reinforcement learning algorithms on a robust set of benchmarks. Based on the collected insights, we summarize the relative merits of each approach and propose algorithmic improvements.
Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Matthew (Shunshi) Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba