<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: How many users are needed for an efficient collaborative filtering system?</title>
	<atom:link href="http://lemire.me/blog/archives/2008/02/06/ow-many-users-are-needed-for-an-efficient-collaborative-filtering-system/feed/" rel="self" type="application/rss+xml" />
	<link>http://lemire.me/blog/archives/2008/02/06/ow-many-users-are-needed-for-an-efficient-collaborative-filtering-system/</link>
	<description>Computer Scientist and Open Scholar: Databases, Information Retrieval, Business Intelligence.</description>
	<lastBuildDate>Wed, 23 May 2012 16:42:57 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
	<item>
		<title>By: Daniel Lemire</title>
		<link>http://lemire.me/blog/archives/2008/02/06/ow-many-users-are-needed-for-an-efficient-collaborative-filtering-system/comment-page-1/#comment-49727</link>
		<dc:creator>Daniel Lemire</dc:creator>
		<pubDate>Wed, 06 Feb 2008 22:04:55 +0000</pubDate>
		<guid isPermaLink="false">http://www.daniel-lemire.com/blog/archives/2008/02/06/ow-many-users-are-needed-for-an-efficient-collaborative-filtering-system/#comment-49727</guid>
		<description>&lt;i&gt; I am interested in how you use &#039;accuracy&#039; here - since there is no &#039;right&#039; answer for a recommender, accuracy is hard to measure, let alone improve.  I suspect that you are really talking about predicting ratings (such as one can do for the Netflix prize).&lt;/i&gt;

Yes. I am. And I agree with you. A friend of mine, Peter Turney, who also reads this blog, might answer something along the lines that an incomplete metric is better than no metric at all.

 
&lt;i&gt; I think that rating prediction accuracy is a vastly overrated metric for evaluating recommender systems. This metric ignores all sorts of aspects of recommendation that can add to or detract from the quality of a recommendation: novelty, transparency, resistance to hacking and shilling, and diversity all contribute to the quality of a recommendation.&lt;/i&gt;

I agree 100%. I have written about this on my blog in the past.

&lt;i&gt; The canonical wisdom for CF systems is that more data is better - and if you are just predicting ratings, then I agree, but I think we&#039;ve seen many examples of recommendation in the wild where more users result in poorer recommendations. Just look at the diversity of recommendations at sites like Digg or Last.fm. As their user base goes up, the diversity of recommendations goes down, recommender hacks go up, and the overall recommender experience gets worse. Look at the top 10 tracks at Last.fm this week. As the size of the Last.fm user base has increased, it has become a very homogenized music site.&lt;/i&gt;

Very interesting comment. And I agree.</description>
		<content:encoded><![CDATA[<p><i> I am interested in how you use &#8216;accuracy&#8217; here &#8211; since there is no &#8216;right&#8217; answer for a recommender, accuracy is hard to measure, let alone improve.  I suspect that you are really talking about predicting ratings (such as one can do for the Netflix prize).</i></p>
<p>Yes. I am. And I agree with you. A friend of mine, Peter Turney, who also reads this blog, might answer something along the lines that an incomplete metric is better than no metric at all.</p>
<p><i> I think that rating prediction accuracy is a vastly overrated metric for evaluating recommender systems. This metric ignores all sorts of aspects of recommendation that can add to or detract from the quality of a recommendation: novelty, transparency, resistance to hacking and shilling, and diversity all contribute to the quality of a recommendation.</i></p>
<p>I agree 100%. I have written about this on my blog in the past.</p>
<p><i> The canonical wisdom for CF systems is that more data is better &#8211; and if you are just predicting ratings, then I agree, but I think we&#8217;ve seen many examples of recommendation in the wild where more users result in poorer recommendations. Just look at the diversity of recommendations at sites like Digg or Last.fm. As their user base goes up, the diversity of recommendations goes down, recommender hacks go up, and the overall recommender experience gets worse. Look at the top 10 tracks at Last.fm this week. As the size of the Last.fm user base has increased, it has become a very homogenized music site.</i></p>
<p>Very interesting comment. And I agree.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paul</title>
		<link>http://lemire.me/blog/archives/2008/02/06/ow-many-users-are-needed-for-an-efficient-collaborative-filtering-system/comment-page-1/#comment-49726</link>
		<dc:creator>Paul</dc:creator>
		<pubDate>Wed, 06 Feb 2008 21:44:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.daniel-lemire.com/blog/archives/2008/02/06/ow-many-users-are-needed-for-an-efficient-collaborative-filtering-system/#comment-49726</guid>
		<description>Daniel:

I am interested in how you use &#039;accuracy&#039; here - since there is no &#039;right&#039; answer for a recommender, accuracy is hard to measure, let alone improve.  I suspect that you are really talking about predicting ratings (such as one can do for the Netflix prize).  

I think that rating prediction accuracy is a vastly overrated metric for evaluating recommender systems. This metric ignores all sorts of aspects of recommendation that can add to or detract from the quality of a recommendation: novelty, transparency, resistance to hacking and shilling, and diversity all contribute to the quality of a recommendation.

The canonical wisdom for CF systems is that more data is better - and if you are just predicting ratings, then I agree, but I think we&#039;ve seen many examples of recommendation in the wild where more users result in poorer recommendations. Just look at the diversity of recommendations at sites like Digg or Last.fm. As their user base goes up, the diversity of recommendations goes down, recommender hacks go up, and the overall recommender experience gets worse. Look at the top 10 tracks at Last.fm this week. As the size of the Last.fm user base has increased, it has become a very homogenized music site.

http://www.last.fm/music/+charts/track/


(well, sorry for the rant, thanks for the interesting and provocative list).</description>
		<content:encoded><![CDATA[<p>Daniel:</p>
<p>I am interested in how you use &#8216;accuracy&#8217; here &#8211; since there is no &#8216;right&#8217; answer for a recommender, accuracy is hard to measure, let alone improve.  I suspect that you are really talking about predicting ratings (such as one can do for the Netflix prize).  </p>
<p>I think that rating prediction accuracy is a vastly overrated metric for evaluating recommender systems. This metric ignores all sorts of aspects of recommendation that can add to or detract from the quality of a recommendation: novelty, transparency, resistance to hacking and shilling, and diversity all contribute to the quality of a recommendation.</p>
<p>The canonical wisdom for CF systems is that more data is better &#8211; and if you are just predicting ratings, then I agree, but I think we&#8217;ve seen many examples of recommendation in the wild where more users result in poorer recommendations. Just look at the diversity of recommendations at sites like Digg or Last.fm. As their user base goes up, the diversity of recommendations goes down, recommender hacks go up, and the overall recommender experience gets worse. Look at the top 10 tracks at Last.fm this week. As the size of the Last.fm user base has increased, it has become a very homogenized music site.</p>
<p><a href="http://www.last.fm/music/+charts/track/" rel="nofollow">http://www.last.fm/music/+charts/track/</a></p>
<p>(well, sorry for the rant, thanks for the interesting and provocative list).</p>
]]></content:encoded>
	</item>
</channel>
</rss>

