RIKEN AIP Public

Tensor Learning Team Seminar (Talk by Dr. Vidyadhar Upadhya, RIKEN AIP).

Start: 2022-01-12T10:00:00+09:00
End: 2022-01-12T11:00:00+09:00

Wed, 12 Jan 2022 10:00 - 11:00 JST

Online Link visible to participants

Free admission

- Time zone: JST
- Seats are available on a first-come, first-served basis.
- Once the seats are fully booked, we may stop accepting applications.

Description

Speaker: Dr. Vidyadhar Upadhya, RIKEN AIP

Title: Learning Gaussian-Binary Restricted Boltzmann Machines using Difference of Convex Functions Optimization

Abstract:
Probabilistic generative models learn useful representations from unlabeled data, which can be used for subsequent problem-specific tasks such as classification, regression, or information retrieval. One such energy-based probabilistic generative model is the restricted Boltzmann machine (RBM), which forms the building block of several deep generative models. However, RBMs are difficult to learn because computing the gradient of the RBM's log-likelihood function involves the intractable partition function (the normalizing constant of the RBM's distribution). Developing efficient algorithms for learning RBMs is therefore an important research direction.
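For readers unfamiliar with the difficulty mentioned above, the standard form of the log-likelihood gradient of an energy-based model is sketched below. The notation is an assumption introduced here for illustration and does not appear in the announcement: v denotes the visible units, h the hidden units, E the energy function, \theta the parameters, and Z(\theta) the partition function.

```latex
\begin{align}
p(v, h; \theta) &= \frac{e^{-E(v, h; \theta)}}{Z(\theta)},
\qquad
Z(\theta) = \sum_{h} \int e^{-E(v, h; \theta)} \, dv, \\
\frac{\partial \log p(v; \theta)}{\partial \theta}
&= -\,\mathbb{E}_{p(h \mid v; \theta)}\!\left[\frac{\partial E(v, h; \theta)}{\partial \theta}\right]
 + \mathbb{E}_{p(v', h; \theta)}\!\left[\frac{\partial E(v', h; \theta)}{\partial \theta}\right].
\end{align}
```

The first expectation is tractable in an RBM because the hidden units are conditionally independent given v. The second is taken under the model distribution, which depends on Z(\theta), so in practice it is usually estimated by sampling.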

In this talk, I explore maximum likelihood learning of Gaussian-binary RBMs (GB-RBMs). First, I will demonstrate how to exploit the property that the RBM's log-likelihood function can be expressed as a difference of convex functions w.r.t. the weights and hidden biases, under the assumption that the conditional distribution of the visible units has a fixed variance. Then, I will show how to devise a stochastic variant of the standard difference of convex functions (DC) programming algorithm, termed stochastic-DCP (S-DCP), to learn RBMs. We shall see that in this algorithm, the convex optimization problem at each iteration is solved approximately through a few iterations of stochastic gradient descent. The contrastive divergence (CD) algorithm, the current standard algorithm for learning RBMs, can be derived as a special case of the S-DCP algorithm. We shall furthermore see how to modify the S-DCP algorithm so that it also learns the variance parameter of the visible units instead of assuming it to be fixed.
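The idea described above can be illustrated with a minimal NumPy sketch. It is not the speaker's implementation, and it rests on several assumptions made here for illustration: unit visible variance, the energy E(v, h) = ½‖v − b‖² − cᵀh − vᵀWh, a visible bias b that is held fixed, and hypothetical names (`sdcp_step`, `gibbs_sample`, `hidden_probs`, `d`, `k`, `lr`). Under these assumptions the negative log-likelihood can be written as f − g, with f = log Z and g the data average of log Σ_h exp(−E). One outer iteration freezes the gradient of g at the current parameters and takes `d` stochastic gradient steps on the resulting convex surrogate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(v, W, c):
    # p(h_j = 1 | v) for a GB-RBM with unit visible variance (assumption)
    return sigmoid(v @ W + c)

def gibbs_sample(v0, W, b, c, k, rng):
    # k steps of block Gibbs sampling, starting the chain at v0
    v = v0
    for _ in range(k):
        p_h = hidden_probs(v, W, c)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        # p(v | h) is Gaussian with mean b + W h and unit variance
        v = b + h @ W.T + rng.standard_normal(v.shape)
    return v

def sdcp_step(batch, W, b, c, d=3, k=1, lr=0.01, rng=None):
    """One outer iteration of an S-DCP-style update on a mini-batch.

    Illustrative sketch: only W and c are updated; b stays fixed.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = batch.shape[0]
    # Gradient of g (data-dependent term), frozen at the current parameters
    ph_data = hidden_probs(batch, W, c)
    g_W = batch.T @ ph_data / n
    g_c = ph_data.mean(axis=0)
    W_new, c_new = W.copy(), c.copy()
    for _ in range(d):
        # Stochastic estimate of the gradient of f = log Z at the inner iterate
        v_model = gibbs_sample(batch, W_new, b, c_new, k, rng)
        ph_model = hidden_probs(v_model, W_new, c_new)
        f_W = v_model.T @ ph_model / n
        f_c = ph_model.mean(axis=0)
        # Gradient step on the convex surrogate f(theta) - <grad g, theta>
        W_new -= lr * (f_W - g_W)
        c_new -= lr * (f_c - g_c)
    return W_new, c_new
```

With `d=1` the inner loop performs a single step, and the update in this sketch coincides with a CD-k update that uses hidden probabilities in the model statistics, which mirrors the special-case relationship mentioned in the abstract.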

In this presentation, I shall demonstrate, through extensive empirical studies on a number of benchmark datasets, that the S-DCP algorithm converges faster and achieves a higher log-likelihood than the baseline algorithms.

About this community

RIKEN AIP Public

Public events of RIKEN Center for Advanced Intelligence Project (AIP)
