On the big debate over RMSE vs. ranking in recommendation algorithms
No time to write anything long, so I'm excerpting a passage from a blog post I read recently, RecSys 2016 - Part I. I strongly agree with the views below.

Ghosts of the past (10 years)

Before concluding this post, I'd like to highlight one of the long-standing problems of RecSys and recommender systems research in general. Some parts of this community are stuck in the past. In 2016, we still had papers that worked on explicit feedback data, did the rating prediction task and thus evaluated w.r.t. RMSE or MAE. This is the classic task that was popularized by the Netflix Prize 10 years ago.
The goal of a recommender system is to rank the items for the user (or in a situation, or for an item, etc.) and show the most relevant ones. This task is usually referred to as the top-N recommendation task. Several research papers showed that good rating prediction doesn't necessarily mean good top-N recommendation and vice versa. In fact, the order of algorithms on these two tasks can be quite the opposite. Rating prediction is pretty much useless in 99% of the cases because a good recommender has to solve the top-N task. Of course, solving the top-N task is just a part of the whole recommender system; there are also other things to consider. It is also true that the results of offline evaluation, in general, should be taken with a grain of salt; but as I wrote in my post on RecSys 2015: to do research, you need some kind of well-defined evaluation, even if it is just an approximation of the final goal. The thing is: rating prediction is not an approximation of the final goal and is therefore now obsolete. Any paper that focuses on this task in 2016 shows that its authors have no clue about real recommender systems. That's why this practice is constantly called out by researchers in the industry.
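The claim that good rating prediction and good top-N recommendation can disagree is easy to demonstrate. Below is a minimal sketch (my own toy example, not from the quoted blog) with two hypothetical models scoring five items for one user: model B achieves a lower RMSE than model A, yet produces a worse top-2 list, because its errors happen to invert the order of the most relevant items.

```python
import math

def rmse(pred, true):
    # Root-mean-square error over all rated items.
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def precision_at_n(pred, relevant, n):
    # Fraction of the top-n items (ranked by predicted score) that are relevant.
    ranked = sorted(range(len(pred)), key=lambda i: pred[i], reverse=True)
    return sum(1 for i in ranked[:n] if i in relevant) / n

# True ratings for five items; items 0 and 1 (rating >= 4) count as relevant.
true = [5.0, 4.0, 3.0, 2.0, 1.0]
relevant = {0, 1}

# Model A: a constant offset of 1.0 on every item, so the ranking is perfect.
pred_a = [4.0, 3.0, 2.0, 1.0, 0.0]
# Model B: smaller errors overall, but they push item 2 above items 0 and 1.
pred_b = [3.9, 2.9, 4.1, 2.0, 1.0]

print("A:", rmse(pred_a, true), precision_at_n(pred_a, relevant, 2))
print("B:", rmse(pred_b, true), precision_at_n(pred_b, relevant, 2))
```

Model B wins on RMSE (about 0.85 vs. 1.0) but loses on precision@2 (0.5 vs. 1.0), which is exactly the inversion the blog is warning about: optimizing pointwise rating error says nothing about the quality of the ranked list the user actually sees.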
Note that this doesn't mean that explicit feedback is necessarily bad. You can do top-N recommendations based on explicit feedback as well. It will be less interesting for practitioners, because explicit feedback is usually hard to gather in the wild, and even if you have it in large quantities, you will only have it for a small portion of your user base. Now that there are several public implicit feedback datasets, everyone may choose to switch to those. But doing top-N recommendations on explicit data is fine.
The worst thing about this is not that rating prediction papers are written, per se. The authors might be outside of the community or just getting into the field. They might think – based on the vast literature on rating prediction – that this is the problem they should try to solve. The problem is that these papers receive good reviews and get accepted to conferences like RecSys or published in journals. This depends on the reviewers, who should know better. I hope that we won’t see any rating prediction papers next year. I’ll do my part by calling out this practice because I think the whole community would benefit from banishing rating prediction papers.