Center for Applied Mathematics Colloquium

Sid BanerjeeCornell University

Fast bidirectional estimation in Markov chains

Friday, September 11, 2015 - 3:30pm

Rhodes 655

A fundamental problem in Markov Chains is of estimating the probability of transitioning from a given starting state to a given terminal state in k steps. This has received a lot of attention in recent years due to the success of PageRank and related Markov-Chain based centrality measures. Standard approaches to this problem use either linear-algebraic methods (such as the power iteration) or Monte Carlo. Both however need 1/p samples for estimating probabilities on the order of p. This turns out to be too slow for large networks - it is particularly problematic for estimating Personalized PageRank, an ego-centric generalization of PageRank, which has long been recognized as an effective measure for ranking search results, but is rarely used in practice as existing algorithms are too slow.

I'll present a new bi-directional technique which combines linear algebraic techniques with random walks, and show that in sparse graphs, it returns an accurate estimate of transition probabilities of the order of p in time O(1/\sqrt{p}). In particular, our approach provides the first algorithm for PageRank estimation which has sublinear running-time guarantees in theory, and which is much faster than existing algorithms in practice (for example, it is 100 times faster on the Twitter 2010 network). Moreover, our approach extends to general Markov chains -- this may have applications in many diverse settings, and I look forward to some suggestions from the audience.

Joint work with Peter Lofgren, Ashish Goel and C. Seshadri.