Detecting communities, and labeling nodes, is a ubiquitous problem in the study of networks. Recently, we developed scalable Belief Propagation algorithms that update probability distributions of node labels until they reach a fixed point. In addition to being of practical use, these algorithms can be studied analytically, revealing phase transitions in the ability of any algorithm to solve this problem. Specifically, there is a detectability transition in the stochastic block model, below which no algorithm can label nodes better than chance. This transition was subsequently established rigorously by Mossel, Neeman, and Sly, and Massoulie.
I'll explain this transition, and give an accessible introduction to Belief Propagation and the analogy with free energy and the cavity method of statistical physics. We'll see that the consensus of many good solutions is a better labeling than the "best" solution --- something that is true for many real-world optimization problems. While many algorithms overfit, and find "communities" even in random graphs where none exist, our method lets us focus on statistically-significant communities. In physical terms, we focus on the free energy rather than the ground state energy.
I'll then turn to spectral methods. It's popular to classify nodes according to the first few eigenvectors of the adjacency matrix or the graph Laplacian. However, in the sparse case these operators get confused by localized eigenvectors, focusing on high-degree nodes or dangling trees rather than large-scale communities. As a result, they fail significantly above the detectability transition. I will describe a new spectral algorithm based on the non-backtracking matrix, which avoids these localized eigenvectors: it appears to be optimal in the sense that it succeeds all the way down to the transition. Making this rigorous will require us to prove an interesting conjecture in the theory of random matrices and random graphs.
This is joint work with Aurelien Decelle, Florent Krzakala, Elchanan Mossel, Joe Neeman, Mark Newman, Allan Sly, Lenka Zdeborova, and Pan Zhang.
Cristopher Moore is a Professor at the Santa Fe Institute. He received his B.A. in Physics, Mathematics, and Integrated Science from Northwestern University, and his Ph.D. in Physics from Cornell. In 2000, he joined the University of New Mexico faculty, with joint appointments in Computer Science, and Physics and Astronomy. In 2012, Moore left the University of New Mexico and became full-time resident faculty at the Santa Fe Institute. He has published over 120 papers at the boundary between physics and computer science, ranging from quantum computing, to phase transitions in NP-complete problems, to the theory of social networks and efficient algorithms for analyzing their structure. With Stephan Mertens, he is the author of The Nature of Computation, published by Oxford University Press.