Registration is closed
Speaker: Dr. Vidyadhar Upadhya, RIKEN AIP
Title: Learning Gaussian-Binary Restricted Boltzmann Machines using Difference of Convex Functions Optimization
Abstract:
Probabilistic generative models learn useful representations from unlabeled data, which can then be used for subsequent problem-specific tasks such as classification, regression, or information retrieval. One such energy-based probabilistic generative model is the restricted Boltzmann machine (RBM), which forms the building block of several deep generative models. However, RBMs are difficult to learn because computing the gradient of the RBM's log-likelihood function involves the intractable partition function (the normalizing constant in the RBM's distribution function). Therefore, developing efficient algorithms to learn RBMs is an important research direction.
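For reference, the role of the partition function can be written out as follows. This is a generic sketch in standard energy-based notation, not taken from the talk: the symbols E, Z, and θ are assumed here, and for Gaussian visible units the sum over v becomes an integral.

```latex
% Generic RBM with energy E(v, h; \theta); notation assumed for illustration.
p(v, h; \theta) = \frac{e^{-E(v, h; \theta)}}{Z(\theta)},
\qquad
Z(\theta) = \sum_{v} \sum_{h} e^{-E(v, h; \theta)}

% Log-likelihood gradient for one observation v:
\frac{\partial \log p(v; \theta)}{\partial \theta}
  = -\,\mathbb{E}_{p(h \mid v; \theta)}\!\left[\frac{\partial E(v, h; \theta)}{\partial \theta}\right]
    + \mathbb{E}_{p(v, h; \theta)}\!\left[\frac{\partial E(v, h; \theta)}{\partial \theta}\right]
```

The first expectation is cheap because the hidden units are conditionally independent given v. The second is taken over the full model distribution, which is where Z(θ) enters and why it is usually approximated by sampling.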
In this talk, I explore the maximum likelihood learning of Gaussian-binary RBMs (GB-RBMs). First, I will demonstrate how to exploit the property that the RBM's log-likelihood function can be expressed as a difference of convex functions w.r.t. the weights and hidden biases, under the assumption that the conditional distribution of the visible units has a fixed variance. Then, I will show how to devise a stochastic variant of the standard difference of convex functions (DC) programming algorithm, termed stochastic DCP (S-DCP), to learn RBMs. We shall see that in this algorithm, the convex optimization problem at each iteration is solved approximately through a few iterations of stochastic gradient descent. The contrastive divergence (CD) algorithm, the current standard algorithm for learning RBMs, can be derived as a special case of the S-DCP algorithm. We shall furthermore see how to modify the S-DCP algorithm to also learn the variance parameter of the visible units instead of assuming it to be fixed.
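The structure of such an update can be illustrated with a minimal NumPy sketch. It is not the speaker's implementation: it assumes unit visible variance, zero visible biases, and a Gibbs chain restarted from the mini-batch at every inner step, and the names `gibbs_visible`, `sdcp_step`, `d`, `k`, and `lr` are hypothetical. The data-dependent term is computed once at the current iterate and held fixed, while the model-dependent term is re-estimated inside a short inner loop of gradient steps.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_visible(v, W, c, k, rng):
    """Run k Gibbs steps from v (assumes unit variance, zero visible bias)."""
    for _ in range(k):
        p_h = sigmoid(v @ W + c)                        # p(h = 1 | v)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        v = h @ W.T + rng.standard_normal(v.shape)      # v | h ~ N(W h, I)
    return v

def sdcp_step(batch, W, c, d=3, k=1, lr=0.01, rng=None):
    """One outer update on a mini-batch; returns new (W, c), inputs untouched."""
    rng = np.random.default_rng() if rng is None else rng
    n = batch.shape[0]
    # Data-dependent statistics, frozen at the current iterate (W, c).
    p_h_data = sigmoid(batch @ W + c)
    pos_W = batch.T @ p_h_data / n
    pos_c = p_h_data.mean(axis=0)
    W_in, c_in = W.copy(), c.copy()
    # Approximately solve the convex subproblem with d stochastic gradient steps.
    for _ in range(d):
        v_model = gibbs_visible(batch, W_in, c_in, k, rng)
        p_h_model = sigmoid(v_model @ W_in + c_in)
        neg_W = v_model.T @ p_h_model / n
        neg_c = p_h_model.mean(axis=0)
        W_in += lr * (pos_W - neg_W)
        c_in += lr * (pos_c - neg_c)
    return W_in, c_in
```

In this sketch, setting `d=1` leaves a single step whose form matches a CD-k update, which is one way to read the special-case relationship mentioned above. `W` is expected to be a float array of shape (visible, hidden) and `c` a float array of shape (hidden,).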
In this presentation, I shall demonstrate, through extensive empirical studies on a number of benchmark datasets, that the S-DCP algorithm converges faster and achieves a higher log-likelihood than the baseline algorithms.
Public events of RIKEN Center for Advanced Intelligence Project (AIP)