Doorkeeper

Imperfect Information Learning Team Seminar (Talk by Guillaume Braun, Inria Lille - Nord Europe ).

Fri, 04 Nov 2022 16:30 - 17:30 JST
Online Link visible to participants
Register

Registration is closed

Get invited to future events

Free admission

Description

Speaker: Guillaume Braun, Inria Lille - Nord Europe

Title: Efficient methods for provably clustering and matching graphs

Abstract: Graph-structured datasets arise naturally in many fields including biology with protein-to-protein interaction networks, ecology with predator-prey networks, and economy with financial market networks. In order to extract relevant information from these networks, one often uses clustering methods to gather nodes having similar connectivity patterns into communities. Numerous clustering algorithms have been proposed in the past decade and have been analyzed under the Stochastic Block Model (SBM), a popular random graph model. However, in practice, one often has access to side information, and it is typically unclear how this information can be incorporated into existing methods and to what extent it can help to improve the clustering performance.

We will first address this question under the Contextual SBM (CSBM) -- a simple extension of the SBM with independent Gaussian covariates associated with each node -- and propose an iterative refinement method that is fast and achieves the information-theoretic threshold for exact recovery. Our method is inspired by the Generalized Power Method (GPM) and principles of EM-type algorithms. Next, we extend the method to high-dimensional bipartite graphs, hence showing the flexibility of the approach.

In the second part, we will consider graphs with partial side information for edges, that can be represented by multilayer graphs with missing nodes, and derive methods with good clustering accuracy.

Finally, we consider the situation where the side information is in the form of a graph but without knowledge of the nodes' correspondence. In order to take advantage of this side information, it is natural to first estimate the nodes' correspondence. To this end, we propose a method based on the GPM and analyze it under the Correlated Wigner Model.

This is based on joint work with H.Tyagi, C. Biernacki, and E. Araya.

About this community

RIKEN AIP Public

RIKEN AIP Public

Public events of RIKEN Center for Advanced Intelligence Project (AIP)

Join community