RIKEN AIP (Meeting room 3)
Nihonbashi 1-chome Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
Speaker: Dr. Stephan Zheng (http://www.stephanzheng.com/)
Title: Exploiting Structure for Efficient and Robust Deep Learning
Abstract: Deep learning has seen great success in training neural networks for complex prediction problems, such as large-scale image recognition, time-series forecasting and learning single-agent behavioral models. However, neural networks have a number of weaknesses: 1) they are not sample-efficient and 2) they are often not robust against (adversarial) input perturbations. Hence, it is challenging to apply deep learning to problems with exponential complexity, such as multi-agent games, complex long-term spatiotemporal dynamics or noisy high-resolution data. To address these issues, I will present methods that exploit structure to improve the sample efficiency, expressive power and robustness of neural networks in both supervised and reinforcement learning paradigms.
First, I will demonstrate two structured learning methods: 1) hierarchical neural networks that model long-term goals and can learn human-level multi-agent behavioral models capable of fooling domain experts (e.g., basketball team policies that fool professional sports analysts), and 2) structured exploration with hierarchical policies for faster multi-agent reinforcement learning.
Second, I will showcase two methods to improve the robustness of neural networks: 1) stability training, for large-scale robustness against weak adversarial perturbations, and 2) neural fingerprinting, a method to detect strong adversarial examples with >99% AUC-ROC detection scores.
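The core idea behind stability training is to add a penalty that keeps the network's output on a noise-perturbed copy of an input close to its output on the clean input. A minimal sketch of that loss, using a toy linear model and illustrative names (not the speaker's actual implementation or hyperparameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, W):
    # Stand-in for a neural network: a single linear layer.
    return x @ W

def stability_loss(x, y, W, alpha=0.1, sigma=0.05):
    """Task loss plus a stability penalty that ties the model's output on
    a perturbed input to its output on the clean input. `alpha` weights
    the penalty; `sigma` scales the Gaussian perturbation (both are
    illustrative choices)."""
    x_perturbed = x + sigma * rng.standard_normal(x.shape)
    task = np.mean((f(x, W) - y) ** 2)
    stability = np.mean((f(x, W) - f(x_perturbed, W)) ** 2)
    return task + alpha * stability

x = rng.standard_normal((4, 3))
W = rng.standard_normal((3, 2))
y = x @ W  # targets from the true model, so the task loss is zero here
loss = stability_loss(x, y, W)  # non-negative scalar
```

In practice the perturbation and both forward passes would run inside the training loop of a deep network; the point of the sketch is only the shape of the combined objective.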
Bio: Stephan Zheng obtained his PhD in 2018 in the Machine Learning group at Caltech, advised by Professor Yisong Yue. His research focuses on 1) deep structured learning in multi-agent environments and 2) improving the robustness of deep learning (e.g., detecting adversarial examples). He has also worked on multi-resolution learning methods for spatiotemporal tensor models, long-term forecasting models and applications of deep learning to large-scale particle physics at the LHC.
Previously, Stephan received an MSc (Theoretical Physics) and BSc (Physics, Mathematics) from Utrecht University, read Part III Mathematics at the University of Cambridge and was a visiting student at Harvard University. He received the 2011 Lorenz Prize in Theoretical Physics from the Dutch Academy of Arts and Sciences, and was twice a research intern with Google Research and Google Brain.
Public events of RIKEN Center for Advanced Intelligence Project (AIP)