PageRank

PageRank is the link-analysis algorithm on which Google was founded. Published by Larry Page and Sergey Brin in 1998, it treats links between web pages as votes to calculate each page's relative importance. The public toolbar score was retired in 2016, but Google says it still uses PageRank as an internal ranking signal.

  • PageRank was introduced in a 1998 paper by Larry Page and Sergey Brin at Stanford University and became the starting point for Google Search.
  • Its core idea treats a link to a page as a recommendation vote, but weights each vote by the importance of the linking page and the number of links that page sends out, rather than counting every link equally.
  • It models the entire web as one giant link graph and uses iterative computation to find the probability that a hypothetical visitor clicking links at random ends up on a given page.
  • The once-public 0-10 toolbar PageRank score was retired in 2016, but Google has confirmed it still uses PageRank internally for ranking.
  • The PageRank patent was held by Stanford University and expired in September 2019.

Overview

PageRank is an algorithm that analyzes the link structure between web pages and assigns each page a numeric score for its relative importance. The underlying idea is simple: the more links a page receives, and the more those links come from important pages, the more important that page is considered to be. In other words, a link is read as a kind of recommendation vote, with votes cast by influential pages carrying greater weight.

The algorithm was introduced in the 1998 paper "The Anatomy of a Large-Scale Hypertextual Web Search Engine" by Larry Page and Sergey Brin at Stanford University, and it went on to form the foundation of the Google search engine. The name refers at once to a "web page" and to co-founder Larry Page.

How It Works

PageRank models the entire web as a massive graph made up of nodes (pages) and edges (links). Intuitively, it is described through the "random surfer" model. Imagine a visitor who moves from page to page by clicking links at random and who, with some probability, stops clicking and jumps to an arbitrary page instead. Each page's PageRank value corresponds to the probability of finding this visitor on that page in the long run.
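The random surfer can be simulated directly. The sketch below is illustrative only: the three-page graph `web`, the function name `simulate_surfer`, the step count, and the jump probability of 0.15 are all assumptions chosen for the example, not values taken from Google. It counts the fraction of steps the surfer spends on each page; with enough steps, those fractions approximate each page's PageRank.

```python
import random

# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}


def simulate_surfer(outlinks, steps=200_000, jump_prob=0.15, seed=0):
    """Estimate long-run visit shares by simulating a random surfer.

    Assumes every page has at least one outbound link.
    jump_prob is an assumed value for the chance of abandoning the links.
    """
    rng = random.Random(seed)          # seeded so the run is repeatable
    pages = list(outlinks)
    visits = {page: 0 for page in pages}
    current = rng.choice(pages)

    for _ in range(steps):
        visits[current] += 1
        if rng.random() < jump_prob:
            # Stop clicking and jump to an arbitrary page.
            current = rng.choice(pages)
        else:
            # Follow one of the current page's links at random.
            current = rng.choice(outlinks[current])

    return {page: count / steps for page, count in visits.items()}


visit_share = simulate_surfer(web)
```

In this toy graph, page B receives only one link, and that link comes from a page that splits its vote two ways, so B should collect the smallest share of visits.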

The probability that the surfer keeps following links is captured by a damping factor d, commonly set to 0.85. The probability of jumping to a random page is then 1 - d. The PageRank of page A can be expressed as follows.

PR(A) = (1 - d) / N + d * ( PR(B)/L(B) + PR(C)/L(C) + ... )

  B, C, ... = pages that link to A
  d         = damping factor (typically 0.85)
  N         = total number of pages
  L(x)      = number of outbound links from page x
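As a worked example, the formula can be evaluated once for a single page. This is a minimal sketch under stated assumptions: the three-page graph, the helper name `pagerank_once`, and the uniform starting values of 1/N are hypothetical choices made for illustration.

```python
# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
outlinks = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

d = 0.85                                  # damping factor
N = len(outlinks)                         # total number of pages
pr = {page: 1 / N for page in outlinks}   # assumed starting values: uniform


def pagerank_once(target):
    """Evaluate the formula once for `target` using the current values in `pr`."""
    incoming = sum(
        pr[src] / len(links)              # PR(x) / L(x)
        for src, links in outlinks.items()
        if target in links                # only pages that link to the target
    )
    return (1 - d) / N + d * incoming


# C is linked from A (two outbound links) and from B (one outbound link):
# 0.15/3 + 0.85 * ((1/3)/2 + (1/3)/1) = 0.05 + 0.85 * 0.5, roughly 0.475
pr_c = pagerank_once("C")
```

Starting from equal values, one evaluation already lifts C above 1/3, because it receives B's full vote plus half of A's.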

The key point is that each page passes along its PageRank divided by the number of links it sends out. A link from a page that sends out few links therefore transfers more importance to its target than a link from an equally ranked page that sends out many. Because every page's value depends on the others, the answer cannot be found in a single pass. Instead, the values are computed iteratively, using what is known as the power method: the same calculation is repeated until the values stabilize. Convergence is fast even on large graphs. The original PageRank paper reported roughly 52 iterations on a database of 322 million links, and roughly 45 on half that data.
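The iterative calculation described above can be sketched in a few lines. This is an illustration, not Google's implementation: the function name `pagerank`, the tolerance, the iteration cap, and the three-page graph are assumptions, and the sketch assumes every page has at least one outbound link and that each page's links are distinct.

```python
def pagerank(outlinks, d=0.85, tol=1e-10, max_iter=200):
    """Repeat the PageRank update until the values stop changing.

    outlinks maps each page to the list of pages it links to.
    Returns the final values and the number of iterations performed.
    """
    pages = list(outlinks)
    n = len(pages)
    pr = {page: 1 / n for page in pages}      # start from a uniform guess
    iterations = 0

    for iterations in range(1, max_iter + 1):
        # Every page begins each round with the random-jump share (1 - d) / N.
        new_pr = {page: (1 - d) / n for page in pages}
        for src, targets in outlinks.items():
            share = d * pr[src] / len(targets)  # d * PR(src) / L(src)
            for dst in targets:
                new_pr[dst] += share

        # Total absolute change between rounds is the stopping criterion.
        change = sum(abs(new_pr[page] - pr[page]) for page in pages)
        pr = new_pr
        if change < tol:
            break

    return pr, iterations


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks, iterations = pagerank(graph)
```

For this toy graph the values settle at approximately 0.388 for A, 0.215 for B, and 0.397 for C, and they sum to 1, consistent with reading them as probabilities.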

PageRank can also be seen as an extension of traditional academic citation analysis. It differs from a simple citation count in that it does not weigh every link equally: each link is weighted by the importance of the page that made it and normalized by the number of links on that page.

History and Present Status

The PageRank patent belonged to Stanford University rather than Google, and all related patents expired on September 24, 2019. For a time, Google made the 0-10 PageRank score publicly available through the Google Toolbar, and the SEO industry treated that number as a headline indicator of a page's authority.

However, in early 2016 Google confirmed that it was discontinuing the public display of toolbar PageRank data. According to a Search Engine Land report (March 2016), Google was only removing the public score display and stated that it "still uses PageRank data within its internal ranking algorithm." In other words, the simple 0-to-10 external score is gone, but the PageRank actually used internally lives on in a far more complex form. Wikipedia likewise notes that PageRank continues to provide the basis underlying Google's web search tools.
