This talk will be held in a hybrid format, both in person at the AIP Open Space of RIKEN AIP (Nihonbashi office) and online via Zoom. Note: the AIP Open Space is available only to AIP researchers.
DATE & TIME
February 14, 2024: 11:00 am - 12:30 pm (JST)
TITLE
Scalable approaches for Bayesian pseudocoresets
SPEAKER
Prof. Juho Lee, KAIST
ABSTRACT
A coreset is a small subset of a given large dataset such that the Bayesian posterior conditioned on it closely approximates the full-data posterior. It has recently been demonstrated that learning a synthetic pseudocoreset to minimize particular objectives, rather than selecting a coreset as a subset, scales better for high-dimensional models. Concurrently, dataset distillation methods seek to distill concise summaries from large datasets, so that models trained on the summary perform comparably to those trained on the complete dataset. In this talk, we reveal a fundamental connection, establishing that many existing dataset distillation algorithms are approximate versions of Bayesian pseudocoreset algorithms. Leveraging this insight, we introduce techniques to enhance the scalability of pseudocoreset learning algorithms, specifically addressing challenges in learning with large-scale Bayesian neural networks. Our contributions include novel divergence measures for efficient Bayesian pseudocoreset learning and a function-space posterior matching scheme for robust and scalable pseudocoreset learning. This exploration not only bridges seemingly unrelated methodologies but also offers practical advances in pseudocoreset learning for large-scale Bayesian models.
BIOGRAPHY
Juho Lee is an associate professor in the Kim Jaechul Graduate School of AI at KAIST. He received his Ph.D. from the computer science & engineering department at POSTECH and worked as a postdoc in the computational statistics & machine learning group at the University of Oxford. Before joining KAIST, he worked as a research scientist at AITRICS, an AI-based healthcare startup. He focuses on integrating Bayesian statistics with deep learning and is actively involved in machine learning research, particularly in generative models, meta-learning, and the development of safe and reliable AI.
All participants are required to agree to the AIP Seminar Series Code of Conduct.
Please see the URL below.
https://aip.riken.jp/event-list/termsofparticipation/?lang=en
RIKEN AIP will expect adherence to this code throughout the event. We expect cooperation from all participants to help ensure a safe environment for everybody.