High-dimensional Statistical Modeling Team Seminar (Fangyu Liu, University of Cambridge)

Tue, 22 Mar 2022 16:00 - 17:00 JST
Online (link visible to participants)

Free admission
- Time Zone: JST
- Seats are available on a first-come, first-served basis.
- When the seats are fully booked, we may stop accepting applications.
- Simultaneous interpretation will not be available.

Description

Title: Learning Text Representations from Pre-trained Language Models via Contrastive Learning and Self-Distillation

Abstract:
Pretrained Language Models (PLMs) have revolutionised NLP in recent years. However, previous work has indicated that off-the-shelf PLMs are not effective as universal text encoders without further task-specific fine-tuning on NLI, sentence similarity, or paraphrasing tasks using annotated task data. In this talk, I will introduce two of our recent works on converting pre-trained language models into universal text encoders through unsupervised fine-tuning. First, I will talk about Mirror-BERT (EMNLP 2021), an extremely simple, fast, and effective contrastive learning technique that fine-tunes BERT/RoBERTa into strong lexical and sentence encoders in 20-30 seconds. Second, I will introduce Trans-Encoder (ICLR 2022), which extends Mirror-BERT to achieve even better sentence-pair modelling performance through self-distillation under a bi- and cross-encoder iterative learning paradigm. Both approaches have set the unsupervised state-of-the-art on sentence similarity benchmarks such as STS.

Bio:
Fangyu Liu is a second-year PhD student in NLP at the Language Technology Lab, University of Cambridge, supervised by Professor Nigel Collier. His research centres around multi-modal NLP, self-supervised representation learning, and model interpretability. He is a Trust Scholar funded by the Grace & Thomas C.H. Chan Cambridge Scholarship. Besides Cambridge, he has also spent time at Microsoft Research, Amazon, EPFL, and the University of Waterloo. He won the Best Long Paper Award at EMNLP 2021.

About this community

RIKEN AIP Public

Public events of the RIKEN Center for Advanced Intelligence Project (AIP)
