RIKEN AIP Public

Deep Learning Theory Team Seminar (Talk by Dr. Stefano Massaroli, Mila–Quebec AI Institute / Université de Montréal)

Name: Deep Learning Theory Team Seminar (Talk by Dr. Stefano Massaroli, Mila–Quebec AI Institute / Université de Montréal)
Start: 2023-04-06T16:00:00+09:00
End: 2023-04-06T17:00:00+09:00

2023-04-06 (Thu) 16:00 - 17:00 JST

The online link is shown only to registered participants.

Registration has closed.

Free admission

Details

This is an online seminar. Registration is required.
【Deep Learning Theory Team】
【Date】April 6, 2023 (Thu) 16:00-17:00 (JST)
【Speaker】Stefano Massaroli, Mila–Quebec AI Institute / Université de Montréal

Title: Toward Large Convolutional Sequence Models

Abstract
In deep learning, large Transformers have proven effective due to their ability to learn at scale. However, the attention operator, a core building block of Transformers, has a cost that is quadratic in sequence length, which makes it challenging to access large contexts. In this talk, we will explore how long convolutions may provide a subquadratic drop-in replacement for attention. We will start by discussing classic signal processing arguments and follow our research journey, which began with continuous-depth learning and neural differential equations and culminated in the development of “Hyena”, our latest sequence architecture. Hyena leverages implicitly parametrized convolutions interleaved with data-controlled gating and matches the performance of large Transformers on long-range reasoning and natural language modeling tasks. We will provide insights into the inner workings of Hyena through the lens of system theory, shedding light on how it enables efficient learning at scale. Join us as we explore the power of large convolutional sequence models and our journey to develop the Hyena architecture.

About the community

RIKEN AIP Public

Public events of RIKEN Center for Advanced Intelligence Project (AIP)
