Nihonbashi 1-chome Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
Speaker: Tal Linzen, Assistant Professor, Johns Hopkins
Title: How well do neural NLP systems generalize?
Abstract:
Neural networks have rapidly become central to NLP systems. While such systems perform well on typical test set examples, their generalization abilities are often poorly understood. In this talk, I will demonstrate how experimental paradigms from psycholinguistics can help us characterize the gaps between the abilities of neural systems and those of humans, by focusing on interpretable axes of generalization from the training set rather than on average test set performance. I will show that recurrent neural network (RNN) language models are able to process syntactic dependencies in typical sentences with considerable success, but when evaluated on more complex syntactically controlled materials, their error rate increases sharply. Likewise, neural systems trained to perform natural language inference generalize much more poorly than their test set performance would suggest. Finally, I will discuss a novel method for measuring compositionality in neural network representations; using this method, we show that the sentence representations acquired by neural natural language inference systems are not fully compositional, in line with their limited generalization abilities.
Bio:
Tal Linzen is an Assistant Professor of Cognitive Science and Computer Science at Johns Hopkins University. Before moving to Johns Hopkins in 2017, he was a postdoctoral researcher at the École Normale Supérieure in Paris, where he worked with Emmanuel Dupoux and Benjamin Spector; before that he obtained his PhD from the Department of Linguistics at New York University in 2015, under the supervision of Alec Marantz. At JHU, Dr. Linzen directs the Computation and Psycholinguistics Lab; the lab develops computational models of human language comprehension and acquisition, as well as methods for interpreting, evaluating and extending neural network models for natural language processing.
Public events of RIKEN Center for Advanced Intelligence Project (AIP)