High-dimensional Statistical Modeling Team Seminar (Talk by Mr. Ziyin Liu, University of Tokyo)

Tue, 21 Sep 2021 16:00 - 17:00 JST
Online Link visible to participants


Free admission
- Time zone: JST
- Seats are available on a first-come, first-served basis.
- When the seats are fully booked, we may stop accepting applications.
- Simultaneous interpretation will not be available.


Title: Stochastic Gradient Descent with Multiplicative Noise

Stochastic gradient descent (SGD) is the main optimization algorithm behind the success of deep learning. Recently, it has been shown that the stochastic noise in SGD is multiplicative, i.e., the strength of the noise depends crucially on the model parameters. In this talk, we show that the dynamics of SGD can be very surprising and unintuitive when the noise is multiplicative. For example, we show that (1) SGD may converge to a local maximum; (2) SGD may escape a saddle point arbitrarily slowly; (3) SGD may prefer sharp minima over flat ones; and (4) AMSGrad may converge to a local maximum. If time allows, we also present some recent results that shed light on how SGD works under multiplicative noise. This presentation is mainly based on the following three works of the speaker.
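The multiplicative nature of SGD noise can be illustrated with a toy example. The sketch below (a hypothetical illustration, not the speaker's actual setup) uses one-parameter linear regression: for the loss 0.5 * (w*x - y)^2, the per-sample gradient is (w*x - y) * x, so with zero targets the mini-batch gradient noise scales with |w| rather than being constant.

```python
import numpy as np

# Toy illustration of multiplicative gradient noise (assumed example):
# for loss 0.5 * (w*x - y)^2, the per-sample gradient is (w*x - y) * x,
# so when y = 0 the mini-batch gradient noise is proportional to |w|.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = np.zeros(1000)  # true minimizer is w = 0

def minibatch_grad(w, batch=10):
    # Stochastic gradient estimate from a random mini-batch.
    idx = rng.integers(0, len(x), size=batch)
    return np.mean((w * x[idx] - y[idx]) * x[idx])

def grad_std(w, trials=2000):
    # Empirical standard deviation of the stochastic gradient at parameter w.
    return np.std([minibatch_grad(w) for _ in range(trials)])

print(grad_std(0.1), grad_std(1.0))  # noise strength grows with |w|
```

Because the noise vanishes exactly at w = 0 and grows with |w|, standard intuitions built on constant-variance (additive) noise models can fail, which is the setting the talk explores.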

Liu Ziyin.

About this community



Public events of RIKEN Center for Advanced Intelligence Project (AIP)
