1st Talk (45 min talk + 15 min Q&A):
Speaker:
Denny Wu (University of Toronto)
http://www.cs.toronto.edu/~dennywu/
Title:
An Asymptotic View on the Generalization of Overparameterized Least Squares Regression: Explicit and Implicit Regularization
Abstract:
We study the generalization properties of the generalized ridge regression estimator in the overparameterized regime. We derive the exact prediction risk in the proportional asymptotic limit, which allows us to rigorously characterize both the surprising empirical observation that the optimal ridge regularization strength can be negative and the benefit of weighted regularization.
We then connect the ridgeless limit of this regression estimator to the implicit bias of preconditioned gradient descent (e.g., natural gradient descent). We compare the generalization performance of first- and second-order optimizers, and identify different factors that affect this comparison. We empirically validate our theoretical findings in various neural network experiments.
This presentation is based on the following works:
https://arxiv.org/abs/2006.05800
https://arxiv.org/abs/2006.10732
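For intuition about the connection described in the second paragraph of the abstract, here is a minimal numerical sketch (illustrative only, not the speaker's code). It assumes a linear model with an arbitrary positive-definite weighting matrix P, and illustrates that the ridgeless limit of generalized ridge regression coincides with the solution reached by gradient descent preconditioned by the inverse of P from zero initialization; the data, P, and the step size below are made up for the demo:

# Minimal numpy sketch: in the overparameterized regime n < d, the ridgeless
# (lambda -> 0+) limit of the generalized ridge estimator
#     beta(lambda) = argmin_beta ||y - X beta||^2 + lambda * beta' P beta
# coincides with the interpolator selected by gradient descent preconditioned
# by P^{-1} and initialized at zero, i.e. the minimum P-weighted-norm
# interpolating solution.
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 200                               # n < d: overparameterized
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
P = np.diag(rng.uniform(0.5, 2.0, size=d))   # weighting / preconditioning matrix
P_inv = np.linalg.inv(P)

# Generalized ridge estimator with a vanishingly small penalty (ridgeless limit)
lam = 1e-6
beta_ridge = np.linalg.solve(X.T @ X + lam * P, X.T @ y)

# Minimum P-weighted-norm interpolator in closed form
beta_min_norm = P_inv @ X.T @ np.linalg.solve(X @ P_inv @ X.T, y)

# Gradient descent on the unregularized least-squares loss,
# preconditioned by P^{-1}, started at zero
beta_gd = np.zeros(d)
lr = 1e-3
for _ in range(50_000):
    grad = X.T @ (X @ beta_gd - y)
    beta_gd -= lr * P_inv @ grad

print(np.max(np.abs(beta_ridge - beta_min_norm)))   # small
print(np.max(np.abs(beta_gd - beta_min_norm)))      # small: same implicit bias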
==============================================
2nd Talk (45 min talk + 15 min Q&A):
Speaker:
Benjamin Poignard (Osaka University)
https://sites.google.com/site/poignardbenjamin
Title:
High Dimensional Vector Autoregression via Sparse Precision Matrix
Abstract:
We consider the problem of estimating sparse vector autoregression (VAR) processes. To do so, we rely on sparse precision matrix estimation within a general Bregman divergence setting, using the SCAD, MCP and Lasso regularisers, so that the corresponding Cholesky decomposition of the precision matrix provides the VAR coefficients. Under suitable regularity conditions, we derive error bounds for the regularised precision matrix in each Bregman divergence case. Moreover, we establish the support recovery property, including when the regulariser is non-convex. These theoretical results are supported by empirical studies.
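As a schematic illustration of the precision-matrix/Cholesky route to VAR coefficients mentioned in the abstract, here is a small sketch under a Gaussian VAR(1) assumption (illustrative only, not the speaker's estimator, which replaces the oracle precision matrix below with a sparse regularised Bregman-divergence estimate):

# Schematic numpy/scipy sketch: for a Gaussian VAR(1)  y_t = A y_{t-1} + e_t,
# the precision matrix Theta of the stacked vector z_t = (y_{t-1}, y_t) admits
# the modified Cholesky form
#     Theta = T' D^{-1} T,  T = [[I, 0], [-A, I]],  D = blockdiag(Gamma0, Sigma_e),
# so the unit-lower-triangular factor carries the VAR coefficient matrix A.
# The talk replaces the oracle Theta with a sparse penalised estimate
# (SCAD / MCP / Lasso) obtained from a Bregman divergence loss.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov, block_diag

d = 5
A = 0.5 * np.eye(d)                      # sparse, stable VAR coefficient matrix
A[0, 1] = 0.3
Sigma_e = np.eye(d)                      # innovation covariance

# Stationary covariance: Gamma0 = A Gamma0 A' + Sigma_e
Gamma0 = solve_discrete_lyapunov(A, Sigma_e)

# Covariance and precision of the stacked vector z_t = (y_{t-1}, y_t)
Gamma = np.block([[Gamma0,     Gamma0 @ A.T],
                  [A @ Gamma0, Gamma0      ]])
Theta = np.linalg.inv(Gamma)

# 1) VAR coefficients from the precision blocks: A = -Theta_22^{-1} Theta_21
A_rec = -np.linalg.solve(Theta[d:, d:], Theta[d:, :d])
print(np.max(np.abs(A_rec - A)))         # ~0

# 2) Modified Cholesky check: T' D^{-1} T reproduces Theta
T = np.block([[np.eye(d), np.zeros((d, d))],
              [-A,        np.eye(d)       ]])
D = block_diag(Gamma0, Sigma_e)
print(np.max(np.abs(T.T @ np.linalg.inv(D) @ T - Theta)))   # ~0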