RIKEN Center for Advanced Intelligence Project, Open Space
Nihonbashi 1-chome Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo
1.
Speaker: Kohei Miyaguchi (IBM Research - Tokyo) Time: 13:00-14:30
Title:
PAC-Bayesian Transportation Bound
Abstract:
The PAC-Bayesian analysis is empirically known to produce tight non-asymptotic risk bounds for practical machine learning algorithms. However, in its naive form, it can only deal with stochastic predictors, while such predictors are rarely used in practice and deterministic predictors often perform well. On the other hand, the risk of deterministic predictors has been studied in statistics around the notion of Dudley's entropy integral, whereas little attention has been paid to non-asymptotic tightness.
Our study fills this gap by developing a new generalization error bound that unifies these two independently developed tools for risk analysis, namely the PAC-Bayesian bound and Dudley's entropy integral, from the viewpoint of transportation. The new bound, called the PAC-Bayesian transportation bound, evaluates the additional risk incurred by transporting one predictor to another over continuous loss surfaces, thereby allowing us to de-randomize any stochastic predictor with meaningful risk guarantees. We also discuss implications and possible applications of the proposed bound.
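For context, the two classical ingredients that the talk unifies can be sketched as follows (standard textbook forms in our own notation, not the speaker's new bound). A McAllester-style PAC-Bayesian bound states that, with probability at least 1 - \delta over an i.i.d. sample of size n, simultaneously for all posteriors \rho over predictors,

    \mathbb{E}_{h \sim \rho}[R(h)] \le \mathbb{E}_{h \sim \rho}[\hat{R}_n(h)] + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln(2\sqrt{n}/\delta)}{2n}},

where \pi is a data-independent prior, R the population risk, and \hat{R}_n the empirical risk. Dudley's entropy integral bounds the supremum of a sub-Gaussian process (X_f)_{f \in \mathcal{F}} with respect to a metric d: for a universal constant C,

    \mathbb{E}\Big[\sup_{f \in \mathcal{F}} X_f\Big] \le C \int_0^\infty \sqrt{\log N(\varepsilon, \mathcal{F}, d)}\, d\varepsilon,

where N(\varepsilon, \mathcal{F}, d) is the \varepsilon-covering number of \mathcal{F}.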
2.
Speaker: Yusuke Hayashi (Japan Digital Design Inc.) Time: 15:00-16:30
Title:
Neural Demon: Maxwell's Demon as a Meta-Learner
Abstract:
In recent years, deep learning has achieved remarkable success in supervised and reinforcement learning problems, such as image classification, speech recognition, and game playing. However, these models are specialized to the single task they are trained on. Meta-learning, or few-shot learning, offers a potential solution to this problem: by learning to learn across data from multiple previous tasks, few-shot meta-learning algorithms can discover the structure among tasks to enable fast learning of new tasks.
In this presentation, we consider a hierarchical Bayesian model with a global latent variable (meta-parameter) θ and task-specific latent variables φ = {φ^(t)}. First, we show that when the distribution of latent variables in the decoder, p(θ, φ), is equal to the marginal distribution of the encoder, q(θ, φ), the thermodynamic costs of the meta-learning process provide an upper bound on the amount of information that the model is able to learn from its teacher. This allows us to introduce a second law of information thermodynamics for meta-learning. Next, we propose applying this model to multi-task image classification.
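As a concrete anchor for the model described above, here is a minimal sketch of the standard hierarchical variational bound for a global latent \theta and task-specific latents \phi = \{\phi^{(t)}\} (the thermodynamic refinement is the speaker's contribution and is not reproduced here). Writing the decoder prior as p(\theta, \phi) = p(\theta) \prod_t p(\phi^{(t)} \mid \theta) and the encoder as q(\theta, \phi \mid \mathcal{D}) for data \mathcal{D} = \{x^{(t)}\},

    \log p(\mathcal{D}) \ge \mathbb{E}_{q(\theta, \phi \mid \mathcal{D})}\Big[\sum_t \log p\big(x^{(t)} \mid \phi^{(t)}, \theta\big)\Big] - \mathrm{KL}\big(q(\theta, \phi \mid \mathcal{D}) \,\|\, p(\theta, \phi)\big).

The matching condition mentioned in the abstract requires the encoder's marginal q(\theta, \phi) to coincide with the decoder's p(\theta, \phi).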