Title: A structural result for Personalized PageRank and its
algorithmic consequences
Abstract: Many systems, including the Internet, social networks, and the power
grid, can be represented as graphs. When analyzing graphs, it is often useful
to compute scores describing the relative importance or distance between nodes.
One example is Personalized PageRank (PPR), which assigns to each node v a
vector whose i-th entry describes the importance of
the i-th node from the perspective of v. PPR has
proven useful in many applications, such as recommending who users should
follow on social networks (if this i-th entry is
large, v may be interested in following the i-th
user). Unfortunately, computing n such PPR vectors (where n is the number of
nodes) is infeasible for many graphs of interest.
In this work, we argue that the situation is not so dire. Our main result shows
that the dimensionality of the set of PPR vectors scales sublinearly
in n with high probability, for a certain class of random graphs and for a
notion of dimensionality similar to rank. Put differently, we argue that the
effective dimension of this set is much less than n, despite the fact that the
matrix containing these vectors has rank n. Furthermore, we show this
dimensionality measure relates closely to the complexity of a PPR estimation
scheme that was proposed (but not analyzed) by Jeh
and Widom. This allows us to argue that accurately
estimating all n PPR vectors amounts to computing a vanishing fraction of the
n^2 vector elements (when the technical assumptions of our main result are
satisfied). Finally, we demonstrate empirically that similar conclusions hold
when considering real-world networks, despite the assumptions of our theory not
holding.
This is joint work with Daniel Vial, University of Michigan and will appear in
ACM Sigmetrics 2019.
Bio: Vijay Subramanian is an Associate Professor in the EECS Department at the
University of Michigan since 2014. After graduating with his Ph.D. from UIUC in
1999, he did a few stints in industry, research institutes and universities in
the US and Europe before his current position . His main research interests are
in stochastic modeling, communications, information theory and applied
mathematics. A large portion of his past work has been on probabilistic
analysis of communication networks, especially analysis of scheduling and routing
algorithms. In the past he has also done some work with applications in
immunology and coding of stochastic processes. His current research interests
are on game theoretic and economic modeling of socio-technological systems and
networks, and the analysis of associated stochastic processes.