<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Is PageRank just good marketing?</title>
	<atom:link href="http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/</link>
	<description>Daniel Lemire's blog is about life in academia, research in Computer Science, wondering how we can reconcile fast databases and algorithms with the informal and asemantic nature of the world around us. It is broadcasted from Montreal (Canada).</description>
	<pubDate>Fri, 21 Nov 2008 19:25:41 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.3</generator>
		<item>
		<title>By: Jean Véronis</title>
		<link>http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49606</link>
		<dc:creator>Jean Véronis</dc:creator>
		<pubDate>Mon, 03 Dec 2007 08:35:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49606</guid>
		<description>True. My comment was not in defence of PageRank. The simple fact that Google needs to supplement it with several dozen other criteria shows that it is not ideal ;-) In a way, Upstill said something right with a disputable methodology.</description>
		<content:encoded><![CDATA[<p>True. My comment was not in defence of PageRank. The simple fact that Google needs to supplement it with several dozen other criteria shows that it is not ideal <img src='http://www.daniel-lemire.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> In a way, Upstill said something right with a disputable methodology.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Lemire</title>
		<link>http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49602</link>
		<dc:creator>Daniel Lemire</dc:creator>
		<pubDate>Mon, 03 Dec 2007 00:02:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49602</guid>
		<description>Interesting observation, Jean, but the paper by Najork et al. (HITS on the Web: How does it Compare?) supports the claim that PageRank is not even as accurate as in-degree.</description>
		<content:encoded><![CDATA[<p>Interesting observation, Jean, but the paper by Najork et al. (HITS on the Web: How does it Compare?) supports the claim that PageRank is not even as accurate as in-degree.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jean Véronis</title>
		<link>http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49601</link>
		<dc:creator>Jean Véronis</dc:creator>
		<pubDate>Sun, 02 Dec 2007 19:49:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49601</guid>
		<description>I agree that PageRank has become mainly a marketing tool. However, there is a flaw in Upstill's work. He doesn't compare in-degree with PageRank but with the score given in Google's Toolbar, called "PageRank". Nobody knows what this score is exactly. In particular, nothing proves that it is the real "pure" PageRank as described in the original PageRank paper. I suspect that it is (a downgraded version of) the score that Google uses for ranking, which is a mixture of many factors, in which PageRank plays some (unknown) role.</description>
		<content:encoded><![CDATA[<p>I agree that PageRank has become mainly a marketing tool. However, there is a flaw in Upstill&#8217;s work. He doesn&#8217;t compare in-degree with PageRank but with the score given in Google&#8217;s Toolbar, called &#8220;PageRank&#8221;. Nobody knows what this score is exactly. In particular, nothing proves that it is the real &#8220;pure&#8221; PageRank as described in the original PageRank paper. I suspect that it is (a downgraded version of) the score that Google uses for ranking, which is a mixture of many factors, in which PageRank plays some (unknown) role.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Panos Ipeirotis</title>
		<link>http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49600</link>
		<dc:creator>Panos Ipeirotis</dc:creator>
		<pubDate>Sun, 02 Dec 2007 03:44:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49600</guid>
		<description>Just to offer an anecdotal (and unconfirmed) piece of information: it is claimed that the original PageRank was not exactly the one described in the WWW97 paper.

In the plain vanilla implementation, the underlying model of PageRank corresponds to a "random surfer" that follows hyperlinks with probability 0.85 and, with the remaining probability 0.15, gets bored and jumps to a random page. I have heard that in the actual implementation, the random surfer jumps only to pages in the "edu" domain. (This idea is similar to the TrustRank algorithm.)

Of course, since 1996 many things have changed and today there are so many other factors that are taken into consideration during ranking that it is almost certain that PageRank is mainly a marketing tool.</description>
		<content:encoded><![CDATA[<p>Just to offer an anecdotal (and unconfirmed) piece of information: it is claimed that the original PageRank was not exactly the one described in the WWW97 paper.</p>
<p>In the plain vanilla implementation, the underlying model of PageRank corresponds to a &#8220;random surfer&#8221; that follows hyperlinks with probability 0.85 and, with the remaining probability 0.15, gets bored and jumps to a random page. I have heard that in the actual implementation, the random surfer jumps only to pages in the &#8220;edu&#8221; domain. (This idea is similar to the TrustRank algorithm.)</p>
<p>Of course, since 1996 many things have changed and today there are so many other factors that are taken into consideration during ranking that it is almost certain that PageRank is mainly a marketing tool.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Turney</title>
		<link>http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49595</link>
		<dc:creator>Peter Turney</dc:creator>
		<pubDate>Wed, 28 Nov 2007 21:27:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49595</guid>
		<description>Interesting post. I used Google Scholar to find all citations of "Predicting fame and fortune: Pagerank or indegree". Google found 16 citations:

http://scholar.google.com/scholar?hl=en&#38;lr=&#38;cites=5736996577557537352

I skimmed some of the citations, and two seemed particularly relevant: (1) HITS on the Web: How does it Compare? (2) Beyond PageRank: Machine Learning for Static Ranking. I was about to post this comment when I saw that two previous comments gave exactly the same two references. Now I'm posting this comment anyway, to say that Google PageRank may be bogus, but Google Scholar seems to work just fine. :-)</description>
		<content:encoded><![CDATA[<p>Interesting post. I used Google Scholar to find all citations of &#8220;Predicting fame and fortune: Pagerank or indegree&#8221;. Google found 16 citations:</p>
<p><a href="http://scholar.google.com/scholar?hl=en&amp;lr=&amp;cites=5736996577557537352" rel="nofollow">http://scholar.google.com/scholar?hl=en&amp;lr=&amp;cites=5736996577557537352</a></p>
<p>I skimmed some of the citations, and two seemed particularly relevant: (1) HITS on the Web: How does it Compare? (2) Beyond PageRank: Machine Learning for Static Ranking. I was about to post this comment when I saw that two previous comments gave exactly the same two references. Now I&#8217;m posting this comment anyway, to say that Google PageRank may be bogus, but Google Scholar seems to work just fine. <img src='http://www.daniel-lemire.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fernando Diaz</title>
		<link>http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49594</link>
		<dc:creator>Fernando Diaz</dc:creator>
		<pubDate>Wed, 28 Nov 2007 17:31:01 +0000</pubDate>
		<guid isPermaLink="false">http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49594</guid>
		<description>IR folks long suspected PageRank to be a red herring, but this was not confirmed until the last few years. The reference I like to use comes from MSR and was published at WWW06:

M. Richardson, A. Prakash, and E. Brill, "Beyond PageRank: Machine Learning for Static Ranking," in WWW ’06: Proceedings of the 15th international conference on World Wide Web, (New York, NY, USA), pp. 707–715, ACM Press, 2006.

The authors demonstrate that structure-independent features, combined with a page's popularity, significantly outperform PageRank. Informal conversations with engine architects and SEO folks confirm this.

It's helpful to interpret these results in the context of a random walk on the web graph. PageRank is the stationary distribution of a random walk on the web graph. In situations where you have no knowledge about page visitation, this is a reasonable surrogate. However, in the presence of real user data (gathered through a toolbar or OS), the random walk model seems less attractive than models which incorporate visitation data.

That said, it also seems likely that the actual effectiveness of search engines has more to do with using massive amounts of click data to train classic IR features and query triage schemes.</description>
		<content:encoded><![CDATA[<p>IR folks long suspected PageRank to be a red herring, but this was not confirmed until the last few years. The reference I like to use comes from MSR and was published at WWW06:</p>
<p>M. Richardson, A. Prakash, and E. Brill, &#8220;Beyond PageRank: Machine Learning for Static Ranking,&#8221; in WWW ’06: Proceedings of the 15th international conference on World Wide Web, (New York, NY, USA), pp. 707–715, ACM Press, 2006.</p>
<p>The authors demonstrate that structure-independent features, combined with a page&#8217;s popularity, significantly outperform PageRank. Informal conversations with engine architects and SEO folks confirm this.</p>
<p>It&#8217;s helpful to interpret these results in the context of a random walk on the web graph. PageRank is the stationary distribution of a random walk on the web graph. In situations where you have no knowledge about page visitation, this is a reasonable surrogate. However, in the presence of real user data (gathered through a toolbar or OS), the random walk model seems less attractive than models which incorporate visitation data.</p>
<p>That said, it also seems likely that the actual effectiveness of search engines has more to do with using massive amounts of click data to train classic IR features and query triage schemes.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sérgio Nunes</title>
		<link>http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49593</link>
		<dc:creator>Sérgio Nunes</dc:creator>
		<pubDate>Wed, 28 Nov 2007 17:11:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/#comment-49593</guid>
		<description>Hi again,

Sorry for the lack of details about me. My name is Sérgio Nunes and I'm a PhD student in the field of WebIR.

Sorry also for the lack of a proper reference for my statement. Here is a recent experimental study by Marc Najork that delves into this issue:

"HITS on the Web: How does it Compare?"
http://research.microsoft.com/research/pubs/view.aspx?0rc=p&#38;type=Publication&#38;id=1734</description>
		<content:encoded><![CDATA[<p>Hi again,</p>
<p>Sorry for the lack of details about me. My name is Sérgio Nunes and I&#8217;m a PhD student in the field of WebIR.</p>
<p>Sorry also for the lack of a proper reference for my statement. Here is a recent experimental study by Marc Najork that delves into this issue:</p>
<p>&#8220;HITS on the Web: How does it Compare?&#8221;<br />
<a href="http://research.microsoft.com/research/pubs/view.aspx?0rc=p&amp;type=Publication&amp;id=1734" rel="nofollow">http://research.microsoft.com/research/pubs/view.aspx?0rc=p&amp;type=Publication&amp;id=1734</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
