Doorkeeper

Tensor Learning Team Seminar (Talk by Dr. Vidyadhar Upadhya, RIKEN AIP).

2022-01-12 (Wed) 10:00 - 11:00 JST
Online. The link will be shown to participants only.

Registration has closed.


Free admission
- Time Zone: JST
- Seats are available on a first-come, first-served basis.
- When the seats are fully booked, we may stop accepting applications.

Details

Speaker: Dr. Vidyadhar Upadhya, RIKEN AIP

Title: Learning Gaussian-Binary Restricted Boltzmann Machines using Difference of Convex Functions Optimization

Abstract:
Probabilistic generative models learn useful representations from unlabeled data, which can then be used for problem-specific tasks such as classification, regression, or information retrieval. One such energy-based probabilistic generative model is the Restricted Boltzmann Machine (RBM), which forms the building block for several deep generative models. However, RBMs are difficult to learn because the gradient of the RBM's log-likelihood function involves the intractable partition function (the normalizing constant in the RBM's distribution). Developing efficient algorithms to learn RBMs is therefore an important research direction.
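To make the intractability concrete, here is a minimal sketch — with invented toy parameters, and for a binary-binary RBM rather than the Gaussian-binary model discussed in the talk — that computes the exact log-partition function by brute-force enumeration. The hidden units can be summed out analytically, but the outer sum over visible states still costs O(2^n), which is what rules this out for realistic model sizes:

```python
import itertools
import numpy as np

# Hypothetical toy parameters: a binary RBM with 4 visible and 3 hidden units.
rng = np.random.default_rng(0)
n_vis, n_hid = 4, 3
W = rng.normal(scale=0.1, size=(n_vis, n_hid))  # weight matrix
b = rng.normal(scale=0.1, size=n_vis)           # visible biases
c = rng.normal(scale=0.1, size=n_hid)           # hidden biases

def log_partition(W, b, c):
    """Exact log Z by enumerating all 2^n_vis visible states.

    For each visible state v, the hidden units are marginalized analytically:
      sum_h exp(v@W@h + b@v + c@h) = exp(b@v) * prod_j (1 + exp(c_j + (v@W)_j))
    The remaining sum over visible states is still exponential in n_vis.
    """
    n_vis = W.shape[0]
    total = -np.inf
    for v in itertools.product([0, 1], repeat=n_vis):
        v = np.asarray(v, dtype=float)
        log_unnorm = b @ v + np.sum(np.log1p(np.exp(c + v @ W)))
        total = np.logaddexp(total, log_unnorm)  # stable log-sum-exp
    return total

print(log_partition(W, b, c))
```

Doubling the number of visible units squares the number of terms in the outer sum, so even a few dozen units already put exact evaluation out of reach.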

In this talk, I explore maximum likelihood learning of Gaussian-binary RBMs (GB-RBMs). First, I will demonstrate how to exploit the fact that the RBM's log-likelihood function can be expressed as a difference of convex functions with respect to the weights and hidden biases, under the assumption that the conditional distribution of the visible units has a fixed variance. Then, I will show how to devise a stochastic variant of the standard difference of convex functions (DC) programming algorithm, termed stochastic-DCP (S-DCP), to learn RBMs. In this algorithm, the convex optimization problem at each iteration is solved approximately through a few iterations of stochastic gradient descent. The contrastive divergence (CD) algorithm, the current standard for learning RBMs, can be derived as a special case of the S-DCP algorithm. We shall furthermore see how to modify the S-DCP algorithm to also learn the variance parameter of the visible units instead of assuming it to be fixed.
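As a rough illustration of the iteration structure described above — not the actual GB-RBM objective; the one-dimensional functions and all parameters below are invented toy stand-ins — here is a stochastic DC scheme on f(x) = g(x) - h(x) with g and h both convex: linearize the subtracted convex part h at the current iterate, then approximately solve the resulting convex surrogate with a few noisy gradient steps, mimicking SGD on minibatches:

```python
import numpy as np

# Toy DC decomposition: g(x) = 2x^2 and h(x) = x^2 + 4x - 4 are both convex,
# and f(x) = g(x) - h(x) = (x - 2)^2 is minimized at x = 2.
grad_g = lambda x: 4.0 * x
grad_h = lambda x: 2.0 * x + 4.0

rng = np.random.default_rng(1)

def s_dcp(x0, outer_iters=200, inner_iters=5, lr=0.05, noise=0.1):
    """Stochastic DC iteration: at each outer step, linearize h at the
    current iterate x_t, then take a few noisy gradient steps on the
    convex surrogate  g(x) - grad_h(x_t) * x."""
    x = x0
    for _ in range(outer_iters):
        slope = grad_h(x)             # linearization of the subtracted part
        for _ in range(inner_iters):  # inner "SGD" on the convex surrogate
            g_noisy = grad_g(x) + rng.normal(scale=noise)  # noisy gradient
            x -= lr * (g_noisy - slope)
    return x

print(s_dcp(10.0))  # converges near the minimizer x = 2
```

With a single inner step the scheme collapses to one noisy gradient step per linearization, loosely mirroring how CD arises as a special case of S-DCP with few inner iterations in the talk's setting.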

In this presentation, I shall demonstrate through extensive empirical studies on a number of benchmark datasets that the S-DCP algorithm converges faster and achieves a higher log-likelihood than the baseline algorithms.

About the community

RIKEN AIP Public


Public events of RIKEN Center for Advanced Intelligence Project (AIP)
