RIKEN AIP Public

Deep Learning Theory Team Seminar (Talk by Dr. Stefano Massaroli, Mila–Quebec AI Institute / Université de Montréal)

Name: Deep Learning Theory Team Seminar (Talk by Dr. Stefano Massaroli, Mila–Quebec AI Institute / Université de Montréal)
Start: 2023-04-06T16:00:00+09:00
End: 2023-04-06T17:00:00+09:00

Thu, 06 Apr 2023 16:00 - 17:00 JST


Registration is closed


Free admission

Description

This is an online seminar. Registration is required.
【Deep Learning Theory Team】
【Date】2023/April/6 (Thu) 16:00-17:00 (JST)
【Speaker】Stefano Massaroli, Mila–Quebec AI Institute / Université de Montréal

Title: Toward Large Convolutional Sequence Models

Abstract
In the realm of deep learning, large Transformers have proven effective due to their ability to learn at scale. However, the attention operator, a core building block of Transformers, has a cost that grows quadratically with sequence length, which makes it challenging to access large contexts. In this talk, we will explore how long convolutions may provide a subquadratic drop-in replacement for attention. We will start by discussing classic signal processing arguments and follow our research journey, which began with continuous-depth learning and neural differential equations, and culminated in the development of “Hyena”, our latest sequence architecture. Hyena leverages implicitly parametrized convolutions interleaved with data-controlled gating and matches the performance of large Transformers on long-range reasoning and natural language modeling tasks. We will provide insights into the inner workings of Hyena through the lens of system theory, shedding light on how it enables efficient learning at scale. Join us as we explore the power of large convolutional sequence models and our journey to develop the Hyena architecture.
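
For readers unfamiliar with why a long convolution can be subquadratic, the following is a minimal illustrative sketch, not the Hyena implementation. It compares a direct causal convolution, which costs O(L^2) for a filter as long as the sequence, with the same convolution computed through a fast Fourier transform in O(L log L), and then applies an elementwise gate. All function names, the random filter, and the gate values are assumptions made for illustration; in the architecture described in the abstract the filter and gate would be produced by learned components.

```python
import cmath
import random


def fft(x, inverse=False):
    """Recursive radix-2 FFT. len(x) must be a power of two.
    The inverse transform is left unnormalized (caller divides by n)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def causal_conv_direct(u, h):
    """y[t] = sum_{s <= t} h[s] * u[t - s], computed directly in O(L^2)."""
    L = len(u)
    return [sum(h[s] * u[t - s] for s in range(t + 1)) for t in range(L)]


def causal_conv_fft(u, h):
    """Same causal convolution in O(L log L). Assumes len(h) == len(u)."""
    L = len(u)
    n = 1
    while n < 2 * L:  # zero-pad so circular convolution equals linear convolution
        n *= 2
    U = fft(list(u) + [0.0] * (n - L))
    H = fft(list(h) + [0.0] * (n - L))
    y = fft([a * b for a, b in zip(U, H)], inverse=True)
    return [v.real / n for v in y[:L]]


def gated_long_conv(u, h, gate):
    """Elementwise gate applied to the long-convolution output (illustrative only)."""
    return [g * y for g, y in zip(gate, causal_conv_fft(u, h))]


# Hypothetical inputs: a random signal, a decaying random filter, a random gate.
random.seed(0)
L = 64
u = [random.gauss(0.0, 1.0) for _ in range(L)]
h = [random.gauss(0.0, 1.0) * 0.95 ** t for t in range(L)]
gate = [random.gauss(0.0, 1.0) for _ in range(L)]

y_direct = causal_conv_direct(u, h)
y_fft = causal_conv_fft(u, h)
max_err = max(abs(a - b) for a, b in zip(y_direct, y_fft))
y_gated = gated_long_conv(u, h, gate)
```

Here max_err measures how closely the two routes agree; they compute the same quantity, and only the cost differs. The zero-padding to at least 2L is what keeps the FFT route causal rather than circular.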


About this community


Public events of RIKEN Center for Advanced Intelligence Project (AIP)
