Web Graphs — PageRank-like Importance Measures

Appendix B — A Short Study of PageRank on the INRIA Website

Starting from a snapshot of the graph of the website http://www.inria.fr, we applied a PageRank-type algorithm to determine which pages received the highest ranking. Several interesting observations emerged:

Results. The top ten URLs returned by our PageRank algorithm are worth examining (see Table 2.1). The resulting ranking is naturally correlated with the in-degree ranking, but it differs from it significantly (compare Table 2.1 and Table 2.2).

URL (http://www.inria.fr/…)           Local PR  Google PR  In-deg
index.fr.html                                                 608
rapportsactivite/RA94/RA94.kw.html                            327
actualites/index.fr.html                                      367
fonctions/plan.fr.html                                        297
valorisation/index.fr.html                                    302
travailler/index.fr.html                                      312
recherche/index.fr.html                                       297
publications/index.fr.html                                    294
inria/index.fr.html                                           229
rapportsactivite/RA94/RA94.pers.html                          320
Table 2.1: The top ten URLs of www.inria.fr according to a local PageRank. Comparison with Google.

In terms of relevance, the pages returned by our PageRank appear to be well chosen overall (homepage in first place, “index” or “site map” type pages), with the notable exception of two pages:

Upon verification, and as one might expect, these two pages turn out to be the two main nodes of a near-sink, namely rapportsactivite/RA94/. Because they combine a high in-degree with a position inside a near-sink, they appear very difficult to filter out using only a local PageRank.
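The near-sink effect can be illustrated on a toy graph. The sketch below is not the algorithm or the data used in this study: the damping factor 0.85, the uniform teleportation, and the six page names (home, a, b, kw, pers, idx) are all assumptions chosen for illustration, and "near-sink" is taken here to mean a group of pages with many internal links and few links leading out.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Plain power iteration on a dict {page: [pages it links to]}.

    Assumes every page has at least one outgoing link (true for the
    toy graph below), so no special handling of dangling pages is done.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Teleportation term, shared uniformly among all pages.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            # Each page splits its damped rank evenly among its links.
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy site: kw, pers and idx link mostly to each other,
# and only idx has a single link back to the rest of the site.
toy_links = {
    "home": ["a", "b", "kw"],
    "a": ["home"],
    "b": ["home"],
    "kw": ["pers", "idx"],
    "pers": ["kw", "idx"],
    "idx": ["kw", "pers", "home"],
}

ranks = pagerank(toy_links)
ranking = sorted(ranks, key=ranks.get, reverse=True)
near_sink_share = sum(ranks[p] for p in ("kw", "pers", "idx"))
in_degree = {
    p: sum(p in targets for targets in toy_links.values()) for p in toy_links
}

print("PageRank order:", ranking)
print("Share of rank held by the near-sink:", round(near_sink_share, 3))
print("In-degrees:", in_degree)
```

In this toy setting the home page still comes first, but the three near-sink pages together hold more than half of the total rank, which is the kind of inflation that a purely local PageRank cannot easily correct.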

URL (http://www.inria.fr/…)           In-deg
index.fr.html                            608
index.en.html                            391
actualites/index.fr.html                 367
rapportsactivite/RA94/RA94.kw.html       327
rapportsactivite/RA94/RA94.pers.html     320
travailler/index.fr.html                 312
valorisation/index.fr.html               302
fonctions/recherche.fr.html              299
fonctions/annuaire.fr.html               297
fonctions/plan.fr.html                   297
Table 2.2: The ten URLs with the highest in-degree
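The difference between the two rankings can be checked directly from the URLs listed in Table 2.1 and Table 2.2. The following sketch only compares set membership; the variable names are illustrative, and the URL lists are copied from the two tables.

```python
# URLs from Table 2.1 (top ten by local PageRank), relative to http://www.inria.fr/
local_pr_top10 = [
    "index.fr.html",
    "rapportsactivite/RA94/RA94.kw.html",
    "actualites/index.fr.html",
    "fonctions/plan.fr.html",
    "valorisation/index.fr.html",
    "travailler/index.fr.html",
    "recherche/index.fr.html",
    "publications/index.fr.html",
    "inria/index.fr.html",
    "rapportsactivite/RA94/RA94.pers.html",
]

# URLs from Table 2.2 (top ten by in-degree)
in_degree_top10 = [
    "index.fr.html",
    "index.en.html",
    "actualites/index.fr.html",
    "rapportsactivite/RA94/RA94.kw.html",
    "rapportsactivite/RA94/RA94.pers.html",
    "travailler/index.fr.html",
    "valorisation/index.fr.html",
    "fonctions/recherche.fr.html",
    "fonctions/annuaire.fr.html",
    "fonctions/plan.fr.html",
]

# Pages present in both lists, and pages specific to each list.
shared = [u for u in local_pr_top10 if u in in_degree_top10]
only_local_pr = [u for u in local_pr_top10 if u not in in_degree_top10]
only_in_degree = [u for u in in_degree_top10 if u not in local_pr_top10]

print(len(shared), "URLs appear in both top-ten lists")
print("Only in the local PageRank list:", only_local_pr)
print("Only in the in-degree list:", only_in_degree)
```

Seven URLs appear in both lists. The local PageRank promotes three index pages (recherche/, publications/, inria/) that are absent from the in-degree top ten, while index.en.html and two fonctions/ pages drop out.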

Comparison with Google. Google assigns a ranking of 9/10 to the INRIA homepage and 8/10 to the other top ten pages of the local PageRank, with the exception of

which receive a score of 6/10. Two main observations:
