RIKEN AIP Public

Sequential Decision Making Team Seminar (Talk by Canzhe Zhao, Shanghai Jiao Tong University).

Fri, 24 Oct 2025 13:00 - 14:00 JST

Online (link visible to participants)

Registration is closed

Free admission

Description

This is an online seminar. Registration is required.

【Sequential Decision Making Team】
【Date】October 24, 2025 (Fri) 13:00-14:00 (JST)
【Speaker】Canzhe Zhao, Shanghai Jiao Tong University, Department of Computer Science and Engineering

Title: Scalable Online Learning in Adversarial Environments: from Single-Agent to Multi-Agent

Abstract: Practical applications of sequential decision-making in complex and dynamic environments face critical challenges, including the curse of dimensionality and adversarial loss functions. In this talk, I will present a unified research program on scalable online learning in adversarial environments, addressing these core challenges from both single-agent reinforcement learning (RL) and multi-agent game-theoretic perspectives. The first part of the talk focuses on adversarial bandits and RL with function approximation. I will introduce our advances on learning in adversarial linear mixture MDPs and low-rank MDPs. In addition, I will present our best-of-both-worlds algorithms for linear bandits, which achieve (nearly) optimal regret in both stochastic and adversarial environments, even under heavy-tailed noise distributions. The second part of the talk extends to partially observable Markov games (POMGs). I will present the first algorithm achieving last-iterate convergence in POMGs under bandit feedback, alongside pioneering algorithms for learning POMGs with linear function approximation. These algorithms enable scalable and efficient learning in high-dimensional game environments.
Collectively, these advancements demonstrate how principled algorithmic designs can overcome fundamental limitations in online learning, leading to scalable and robust decision-making in complex and dynamic environments. The contributions presented in this talk have been published in premier machine learning venues, including ICML, ICLR, NeurIPS, UAI, and AAAI.

About this community

RIKEN AIP Public

Public events of RIKEN Center for Advanced Intelligence Project (AIP)
