High-dimensional Statistical Modeling Team Seminar (Talk by Mr. Ziyin Liu, University of Tokyo)

Tue, 21 Sep 2021 16:00 - 17:00 JST
Online (link visible to participants)
Registration is closed

Free admission
- Time Zone: JST
- Seats are available on a first-come, first-served basis.
- When the seats are fully booked, we may stop accepting applications.
- Simultaneous interpretation will not be available.

Description

Title: Stochastic Gradient Descent with Multiplicative Noise

Abstract:
Stochastic gradient descent (SGD) is the main optimization algorithm behind the success of deep learning. Recently, it has been shown that the stochastic noise in SGD is multiplicative, i.e., the strength of the noise depends crucially on the model parameters. In this talk, we show that the dynamics of SGD can be surprising and unintuitive when the noise is multiplicative. For example, we show that (1) SGD may converge to a local maximum; (2) SGD may escape a saddle point arbitrarily slowly; (3) SGD may prefer sharp minima over flat ones; and (4) AMSGrad may converge to a local maximum. If time allows, we also present some recent results that shed light on how SGD works under multiplicative noise. This presentation is mainly based on the following three works by the speaker:
[1] https://arxiv.org/abs/2107.11774
[2] https://arxiv.org/abs/2105.09557
[3] https://arxiv.org/abs/2012.03636
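
As a rough illustration of what "multiplicative" means here, the following minimal sketch (not taken from the talk or the papers above; the loss, step size, and noise scale are illustrative assumptions) contrasts additive and multiplicative gradient noise on the one-dimensional loss L(x) = x^2/2:

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_additive(x, lr=0.1, sigma=0.5, steps=1000):
    """SGD on L(x) = x^2/2 with additive noise of constant strength."""
    for _ in range(steps):
        grad = x + sigma * rng.standard_normal()  # noise independent of x
        x -= lr * grad
    return x

def sgd_multiplicative(x, lr=0.1, sigma=0.5, steps=1000):
    """SGD on L(x) = x^2/2 with multiplicative noise: its strength
    scales with the parameter-dependent gradient, so it vanishes
    at the minimum x = 0."""
    for _ in range(steps):
        grad = x * (1.0 + sigma * rng.standard_normal())  # noise ~ |x|
        x -= lr * grad
    return x

print("additive:      ", sgd_additive(5.0))        # fluctuates around 0
print("multiplicative:", sgd_multiplicative(5.0))  # settles near 0
```

With additive noise the iterates keep fluctuating around the minimum, whereas the multiplicative noise vanishes at x = 0, which hints at why the parameter dependence of the noise can reshape where SGD settles.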

Bio:
Liu Ziyin. http://cat.phys.s.u-tokyo.ac.jp/~zliu/

About this community

RIKEN AIP Public

Public events of RIKEN Center for Advanced Intelligence Project (AIP)
