In this event, we will have three talks by Ph.D. students (from the US and Japan) on transfer learning between substantially different distributions.
Each talk will be 30 minutes, including questions.
(The talk order may change.)
------------------[Talk 1]------------------
Speakers: Dimitris Tsipras (MIT) http://people.csail.mit.edu/tsipras/ , Shibani Santurkar (MIT) http://people.csail.mit.edu/shibani/
Title: BREEDS: Benchmarks for Subpopulation Shift
Abstract: How do machine learning models perform when faced with unseen data subpopulations?
In this work, we present a general methodology for assessing model robustness to subpopulation shift. Our approach leverages the class structure underlying existing datasets to control the data subpopulations that comprise the training and test distributions. This enables us to synthesize realistic distribution shifts, whose sources can be precisely controlled and characterized, within existing large-scale datasets. We apply this methodology to the ImageNet dataset, creating a suite of subpopulation shift benchmarks that we then use to measure the sensitivity of standard model architectures as well as the effectiveness of off-the-shelf train-time robustness interventions.
Joint work with Shibani Santurkar and Aleksander Madry.
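To give a concrete flavor of the benchmark-construction idea, here is a minimal Python sketch (not the authors' BREEDS code; the toy two-superclass hierarchy and helper name are invented for illustration). It splits the subclasses of each superclass into disjoint training and test subpopulations:

import random

# Toy stand-in for the class hierarchy underlying a dataset like ImageNet.
hierarchy = {
    "dog": ["beagle", "husky", "poodle", "terrier"],
    "cat": ["siamese", "persian", "tabby", "sphynx"],
}

def make_subpopulation_split(hierarchy, n_source=2, seed=0):
    """Assign disjoint subclasses of each superclass to the source (training)
    and target (test) distributions, so test subpopulations are unseen."""
    rng = random.Random(seed)
    source, target = {}, {}
    for superclass, subclasses in hierarchy.items():
        subs = list(subclasses)
        rng.shuffle(subs)
        source[superclass] = subs[:n_source]  # subpopulations used for training
        target[superclass] = subs[n_source:]  # held-out subpopulations for testing
    return source, target

source, target = make_subpopulation_split(hierarchy)
print(source)
print(target)

A model trained to predict superclass labels on images drawn from the source subclasses is then evaluated on images from the disjoint target subclasses, which measures its robustness to subpopulation shift.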
------------------[Talk 2]------------------
Speaker: Ananya Kumar (Stanford University) https://ananyakumar.wordpress.com/
Title: Understanding Self-Training for Gradual Domain Adaptation
Abstract: How can we adapt to test distributions that are very different from training examples in a principled way?
Traditional domain adaptation is only guaranteed to work when the distribution shift is small; empirical methods combine several heuristics for larger shifts but can be dataset-specific. In many real-world applications, like self-driving cars, brain-machine interfaces, and sensor networks, the domain shift does not happen all at once but gradually. We consider gradual domain adaptation, where the goal is to adapt an initial classifier trained on a source domain given only unlabeled data that shifts gradually in distribution towards a target domain. We prove the first non-vacuous upper bound on the error of self-training with gradual shifts, in settings where directly adapting to the target domain can result in unbounded error. The theoretical analysis leads to algorithmic insights, highlighting that regularization and label sharpening are essential even when we have infinite data. These insights lead to higher accuracies on a rotating MNIST dataset, a forest Cover Type dataset, and a Portraits dataset.
Joint work with Percy Liang and Tengyu Ma.
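As a rough illustration of the gradual self-training loop described above, here is a toy Python sketch with scikit-learn (not the authors' code; the rotating-Gaussians data and all names are invented). The classifier is refit on its own hard pseudo-labels one intermediate domain at a time:

import numpy as np
from sklearn.linear_model import LogisticRegression

def gradual_self_train(Xs, ys, unlabeled_domains, C=0.1):
    """Fit on labeled source data, then repeatedly pseudo-label the next
    (slightly shifted) unlabeled domain with hard labels and refit.
    C is the inverse regularization strength; the talk's analysis highlights
    that regularization and hard ("sharpened") labels matter."""
    clf = LogisticRegression(C=C).fit(Xs, ys)
    for X in unlabeled_domains:        # domains ordered along the gradual shift
        pseudo = clf.predict(X)        # hard pseudo-labels = label sharpening
        clf = LogisticRegression(C=C).fit(X, pseudo)
    return clf

# Toy example: two Gaussian classes whose means rotate a little per step.
rng = np.random.default_rng(0)
def domain(angle, n=500):
    theta = np.deg2rad(angle)
    mean = np.array([np.cos(theta), np.sin(theta)])
    X0 = rng.normal(loc=-mean, scale=0.3, size=(n, 2))
    X1 = rng.normal(loc=+mean, scale=0.3, size=(n, 2))
    return np.vstack([X0, X1]), np.array([0] * n + [1] * n)

Xs, ys = domain(0)
shifted = [domain(a)[0] for a in range(10, 100, 10)]   # gradually rotating
clf = gradual_self_train(Xs, ys, shifted)
Xt, yt = domain(90)                                    # far-away target domain
print("target accuracy:", clf.score(Xt, yt))

Adapting directly from the 0-degree source to the 90-degree target tends to fail, while stepping through the intermediate domains lets self-training track the shift.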
------------------[Talk 3]------------------
Speaker: Takeshi Teshima (UTokyo) https://takeshi-teshima.github.io
Title: Few-shot Domain Adaptation by Causal Mechanism Transfer
Abstract: How can we transfer knowledge across different data distributions when they share a common data generating process?
We study few-shot supervised domain adaptation (DA) for regression problems, where only a few labeled target domain examples and many labeled source domain examples are available. Many current DA methods base their transfer assumptions on either parametrized distribution shift or apparent distribution similarities, e.g., identical conditionals or small distributional discrepancies. However, these assumptions may preclude the possibility of adaptation from intricately shifted and apparently very different distributions. To overcome this problem, we propose mechanism transfer, a meta-distributional scenario in which a data generating mechanism is invariant across domains. This transfer assumption can accommodate nonparametric shifts resulting in apparently different distributions while providing a solid statistical basis for DA. We take the structural equations in causal modeling as an example and propose a novel DA method, which is shown to be useful both theoretically and experimentally. Our method can be seen as the first attempt to fully leverage structural causal models for DA.
Joint work with Issei Sato and Masashi Sugiyama.
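A highly simplified Python sketch of the underlying idea follows. Assumptions: the invertible mechanism is given here, whereas the actual method estimates it from abundant source-domain data (e.g., via nonlinear ICA); all names below are hypothetical. Few labeled target points are mapped to independent components, the components are reshuffled independently, and the results are mapped back to synthesize additional labeled target data:

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) * 0.5   # hypothetical mixing matrix (invertible)

def mechanism(s):
    """Shared data generating mechanism: independent components -> (x, y)."""
    return np.tanh(s @ A)

def mechanism_inv(v):
    """Inverse of the mechanism: (x, y) -> independent components."""
    return np.arctanh(v) @ np.linalg.inv(A)

def augment_target(data, rng):
    """Invert the shared mechanism, shuffle each independent component
    across the few target points, and push the recombined components back
    through the mechanism to synthesize new labeled target data."""
    s = mechanism_inv(data)
    cols = [rng.permutation(s[:, j]) for j in range(s.shape[1])]
    return mechanism(np.stack(cols, axis=1))

# Few labeled target points; rows are [x1, x2, y] in this toy setup.
target_data = mechanism(rng.uniform(-1, 1, size=(10, 3)))
augmented = np.vstack([augment_target(target_data, rng) for _ in range(20)])
print(target_data.shape, augmented.shape)  # (10, 3) (200, 3)

Because the components are statistically independent, recombining them yields valid samples from the same joint distribution, so a regressor for y given (x1, x2) can be fit on the augmented sample rather than on only ten labeled target points.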
Event timeline
The event will be held at:
Oct 21, 05:00 PM - 07:00 PM (PDT)
= Oct 21, 08:00 PM - 10:00 PM (EDT)
= Oct 22, 09:00 AM - 11:00 AM (JST).
Public events of RIKEN Center for Advanced Intelligence Project (AIP)