The eigenvectors corresponding to the second eigenvalue of the Google matrix and their relation to link spamming
Authors:
Abstract
Google uses the PageRank algorithm to determine the relative importance of a website. Link spamming is the practice of placing links between websites for no purpose other than to increase a website's PageRank value. To return fair results for a search query, it is important to detect whether a website is link spammed, so that it can be filtered out of the search results. While the dominant eigenvector of the Google matrix determines the PageRank value, the second eigenvector can be used to detect a certain type of link spamming. We describe an efficient algorithm for computing a complete set of independent eigenvectors for the second eigenvalue, and explain how this algorithm can be used to detect link spamming. We illustrate the performance of the algorithm on web crawls of millions of pages.
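The keywords tie the second eigenvalue to irreducible closed subsets of the underlying Markov chain. As a rough illustration only, and not the algorithm the paper describes, the sketch below finds such subsets in a small link graph using the standard fact that they are exactly the strongly connected components with no link leaving them. The function name `closed_irreducible_subsets`, the dictionary input format, and the toy graph are assumptions made for this example. Pages without out-links are reported as singleton subsets, so any dangling-node correction should be applied to the graph beforehand.

```python
from collections import defaultdict


def closed_irreducible_subsets(links):
    """Return the strongly connected components that have no outgoing link.

    `links` maps each page to an iterable of pages it links to
    (a hypothetical input format chosen for this sketch).
    """
    # Collect every page, including those that only appear as link targets.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    graph = {u: list(links.get(u, ())) for u in nodes}

    # Pass 1 (Kosaraju): iterative DFS recording the finishing order.
    order, seen = [], set()
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph in decreasing finishing order.
    reverse = defaultdict(list)
    for u, targets in graph.items():
        for v in targets:
            reverse[v].append(u)
    component = {}
    for root in reversed(order):
        if root in component:
            continue
        component[root] = root
        stack = [root]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in component:
                    component[prev] = root
                    stack.append(prev)

    # A component is closed if no link leads from it to another component.
    members = defaultdict(set)
    for node, root in component.items():
        members[root].add(node)
    leaky = {
        component[u]
        for u, targets in graph.items()
        for v in targets
        if component[v] != component[u]
    }
    return [members[root] for root in members if root not in leaky]


# Toy graph (made up for illustration): two tightly linked page pairs,
# plus a page "c" that links into both but receives no links back.
toy_links = {"a": ["b"], "b": ["a"], "c": ["a", "d"], "d": ["e"], "e": ["d"]}
subsets = closed_irreducible_subsets(toy_links)
```

On the toy graph, the pairs `a`/`b` and `d`/`e` are closed, while `c` is not, because its links leave its own component. At the scale of millions of pages, a sparse-matrix or graph-library implementation would likely be preferable to plain dictionaries.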
Keywords: Google PageRank, Link spamming, Second eigenvector, Markov chains, Irreducible closed subsets
Review history: Received 10 January 2014, Revised 26 July 2014, Available online 28 September 2014.
Official paper URL: https://doi.org/10.1016/j.cam.2014.09.014