This talk will be held in a hybrid format, both in person at the AIP Open Space of RIKEN AIP (Nihonbashi office) and online via Zoom. AIP Open Space: *only available to AIP researchers.
DATE & TIME
May 29, 2024: 13:30 - 14:30 (JST)
TITLE
Optimal Sets and Solution Paths of ReLU Networks
SPEAKER
Aaron Mishkin (Stanford University)
ABSTRACT
We develop an analytical framework to characterize the set of optimal ReLU neural networks by reformulating the non-convex training problem as a convex program. We show that the global optima of the convex parameterization are given by a polyhedral set and then extend this characterization to the optimal set of the non-convex training objective. Since all stationary points of the ReLU training problem can be represented as optima of sub-sampled convex programs, our work provides a general expression for all critical points of the non-convex objective. We then leverage our results to provide an optimal pruning algorithm for computing minimal networks, establish conditions for the regularization path of ReLU networks to be continuous, and develop sensitivity results for minimal ReLU networks.
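A central ingredient of the convex reformulation described above is the finite set of ReLU activation patterns that hidden units can induce on a fixed training set: each pattern corresponds to one block of the convex program. The sketch below (not part of the talk materials; a hypothetical illustration with made-up data, using random sampling rather than exact enumeration) shows how such patterns can be collected for a small dataset.

```python
import numpy as np

def relu_patterns(X, n_draws=1000, seed=0):
    """Sample distinct ReLU activation patterns sign(Xg >= 0) over
    random weight vectors g. Each pattern is a 0/1 tuple of length n
    (one entry per training point); the convex reformulation of
    two-layer ReLU training assigns one variable block per pattern."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    patterns = set()
    for _ in range(n_draws):
        g = rng.standard_normal(d)  # random hidden-layer weight vector
        patterns.add(tuple((X @ g >= 0).astype(int)))
    return patterns

# Toy data: three points in the plane.
X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
found = relu_patterns(X)
```

For data of rank r, the number of distinct patterns grows only polynomially in the number of samples, which is what makes the convex program finite-dimensional; this sketch merely samples patterns rather than enumerating the underlying hyperplane arrangement exactly.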
BIOGRAPHY
Aaron Mishkin is a fourth-year PhD student in the Department of Computer Science at Stanford University, where he is supervised by Mert Pilanci. His work at Stanford focuses on the theory and applications of convex optimization, including iterative algorithms and convex reformulations of neural networks. Before Stanford, Aaron completed a BSc and MSc in computer science at the University of British Columbia, where he worked with Mark Schmidt on the convergence of stochastic gradient descent under interpolation. As part of his undergraduate degree, he was fortunate to intern with Emtiyaz Khan on Bayesian neural networks at RIKEN AIP. Outside of research, Aaron enjoys climbing, hiking, and cooking.