Item-based
Collaborative Filtering
Yusuke Yamamoto
Lecturer, Faculty of Informatics
yusuke_yamamoto@acm.org
Data Engineering (Recommender Systems 2)
2019.10.28
1. Problems on User-based Collaborative Filtering
User-based Collaborative Filtering
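Predicts a target user’s rating for an item based on the rating tendency of similar users
$$\mathit{predict}(u_a, i) = \bar{r}_{u_a} + \frac{\sum_{u \in U_s} \mathit{sim}(u_a, u) \cdot (r_{u,i} - \bar{r}_{u})}{\sum_{u \in U_s} \mathit{sim}(u_a, u)}$$
$U_s$: the set of users similar to the target user $u_a$
$\bar{r}_{u}$: average rating of user $u$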
        Item5  sim   Average rating
Alice   ?      1     4
User1   3      0.85  2.4             (similar user)
User2   5      0.71  3.8             (similar user)
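The table above can be turned into a prediction for Alice's Item5 rating. The following is a minimal sketch, assuming that each neighbor's rating is taken as a deviation from that neighbor's own average before weighting by similarity; the function name `predict_user_based` and the tuple layout are illustrative choices, not part of the slides.

```python
def predict_user_based(target_mean, neighbors):
    """Mean-centered, similarity-weighted prediction over similar users.

    neighbors: list of (similarity, rating_for_item, neighbor_mean) tuples.
    """
    numerator = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    denominator = sum(sim for sim, _, _ in neighbors)
    if denominator == 0:
        raise ValueError("similarities sum to zero; prediction is undefined")
    return target_mean + numerator / denominator


# Values from the table: Alice's average is 4; User1 and User2 are her similar users.
neighbors = [
    (0.85, 3, 2.4),  # User1: similarity, rating for Item5, average rating
    (0.71, 5, 3.8),  # User2
]
alice_item5 = predict_user_based(4, neighbors)
print(round(alice_item5, 2))  # 4.87
```

Under these assumptions the predicted score ends up above Alice's own average, because both similar users rated Item5 higher than their usual level.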
Computation of similarity between users
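Pearson’s correlation coefficient
$$\mathit{sim}(u_a, u_b) = \frac{\sum_{i \in I} (r_{u_a,i} - \bar{r}_{u_a})(r_{u_b,i} - \bar{r}_{u_b})}{\sqrt{\sum_{i \in I} (r_{u_a,i} - \bar{r}_{u_a})^2} \, \sqrt{\sum_{i \in I} (r_{u_b,i} - \bar{r}_{u_b})^2}}$$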
        Item1  Item2  Item3  Item4
Alice   5      3      4      4
User1   3      1      2      3
User2   4      3      4      3
User3   3      3      1      5
User4   1      5      5      2
sim(Alice, User2) = 0.71
sim(Alice, User4) = -0.79
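The similarities for this table can be reproduced with a few lines of Python. This is a sketch only, assuming that each user's average is taken over the four items shown (Item1 to Item4); the helper name `pearson` is illustrative.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equally long rating lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = sqrt(sum(x * x for x in dev_a)) * sqrt(sum(y * y for y in dev_b))
    # A user who gives every item the same score has zero variance.
    return numerator / denominator if denominator else 0.0


ratings = {
    "Alice": [5, 3, 4, 4],
    "User1": [3, 1, 2, 3],
    "User2": [4, 3, 4, 3],
    "User3": [3, 3, 1, 5],
    "User4": [1, 5, 5, 2],
}
sims = {
    user: pearson(ratings["Alice"], row)
    for user, row in ratings.items()
    if user != "Alice"
}
for user, sim in sims.items():
    print(user, round(sim, 2))
```

With these inputs, User2 and User4 come out at 0.71 and -0.79, matching the values on the slide, and User1 at 0.85, matching the similarity used in the earlier prediction table.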
Problems on User-based Collaborative Filtering (1/2)
Item1 Item2 Item3 Item4 Item5 Item6
Bob 3 2
User1 3 1 2 3
User2 4 3 4 3
User3 3 3 1 5
User4 1 5 5 2 5
• It is rare for two users to have rated the same items
• User similarity changes drastically when only a few ratings are added
If two users have not yet rated any item in common, their similarity cannot be computed (impossible to compute similarity)
Is it possible to compute user similarity precisely from the rating scores for only one common item?
Problems on User-based Collaborative Filtering (2/2)
#Users >> #Items
• In general, the number of users is much larger than the number of items
• Finding nearest neighbors (similar users) is computationally expensive
Unstable user preference
User preferences (user features) often change, while item features rarely change
2. Item-based Collaborative Filtering
Idea about Item-based Collaborative Filtering
        Item1  Item2  Item3  Item4  Item5
Alice   5      3      4      4      ?
User1   3      1      2      3      3
User2   4      3      4      3      5
User3   3      3      1      5      4
User4   1      5      5      2      1
Predicts unknown scores based on the rating tendency for similar items
Advantages of Item-based Collaborative Filtering
Computational cost
In general, the number of items is much smaller than the number of users, so the computational cost of item-based CF is much lower than that of user-based CF
Stable similarity computation
• Item features (vectors) rarely change and are stable
• Compared to user features (vectors) in a rating matrix, item features (vectors) have fewer N/A (missing) dimensions
• Item similarity can therefore be computed from a sufficient amount of information
Computation of Similarity between Items (1/2)
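Cosine similarity
$$\mathit{sim}(i_a, i_b) = \cos\theta = \frac{\mathbf{v}_{i_a} \cdot \mathbf{v}_{i_b}}{|\mathbf{v}_{i_a}| \, |\mathbf{v}_{i_b}|}$$
$i_a, i_b$: items a and b
$\mathbf{v}_{i_a}, \mathbf{v}_{i_b}$: rating vectors of items a and b
$\theta$: angle between $\mathbf{v}_{i_a}$ and $\mathbf{v}_{i_b}$
$|\mathbf{v}|$: length of vector $\mathbf{v}$
• Focuses on the angle between two vectors
• For non-negative rating vectors, the similarity ranges between 0 and 1
• Known to give the best performance for item similarity calculation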
Computation of Similarity between Items (2/2)
        Item1  Item2  Item3  Item4  Item5
Alice   5      3      4      4      ?
User1   3      1      2      3      3
User2   4      3      4      3      5
User3   3      3      1      5      4
User4   1      5      5      2      1
sim(Item1, Item5) = ?
$$\mathit{sim}(i_1, i_5) = \frac{3 \times 3 + 4 \times 5 + 3 \times 4 + 1 \times 1}{\sqrt{3^2 + 4^2 + 3^2 + 1^2} \times \sqrt{3^2 + 5^2 + 4^2 + 1^2}} = 0.99$$
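The worked example for Item1 and Item5 can be checked with a short script. This is a minimal sketch; the function name `cosine` is illustrative, and Alice is left out of both vectors because her Item5 rating is unknown.

```python
from math import sqrt


def cosine(v, w):
    """Cosine of the angle between two equally long rating vectors."""
    dot = sum(x * y for x, y in zip(v, w))
    norm_v = sqrt(sum(x * x for x in v))
    norm_w = sqrt(sum(y * y for y in w))
    return dot / (norm_v * norm_w)


# Ratings given by User1, User2, User3, User4 (in that order).
item1 = [3, 4, 3, 1]
item5 = [3, 5, 4, 1]
sim_1_5 = cosine(item1, item5)
print(round(sim_1_5, 2))  # 0.99
```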
Problem of using basic cosine similarity
[Figure: rating scores (axis from 0 to 6) given by Alice and User1 to Item1, Item2, Item3, and Item4]
Basic cosine similarity does not take into account differences in the average rating behavior of users
Alice rates generously and User1 rates strictly. However, when each rating is measured as the difference from that user’s own average, their ratings for Item1 hardly differ
Adjusted Cosine Similarity (1/3)
Item1 Item2 Item3 Item4 Item5 Avg.
Alice 5 3 4 4 ? 4
User1 3 1 2 3 3 2.4
User2 4 3 4 3 5 3.8
User3 3 3 1 5 4 3.2
User4 1 5 5 2 1 2.8
Subtracts the user average from the ratings
and calculates cosine similarity using the
adjusted rating matrix
Adjusted Cosine Similarity (2/3)
Subtracts the user average from the ratings
and calculates cosine similarity using the
adjusted rating matrix
        Item1    Item2    Item3    Item4    Item5    Avg.
Alice   5 - 4    3 - 4    4 - 4    4 - 4    ?        4
User1   3 - 2.4  1 - 2.4  2 - 2.4  3 - 2.4  3 - 2.4  2.4
User2   4 - 3.8  3 - 3.8  4 - 3.8  3 - 3.8  5 - 3.8  3.8
User3   3 - 3.2  3 - 3.2  1 - 3.2  5 - 3.2  4 - 3.2  3.2
User4   1 - 2.8  5 - 2.8  5 - 2.8  2 - 2.8  1 - 2.8  2.8
Adjusted Cosine Similarity (3/3)
Subtracts the user average from the ratings
and calculates cosine similarity using the
adjusted rating matrix
$$\mathit{sim}(i_1, i_5) = \frac{0.6 \times 0.6 + 0.2 \times 1.2 + (-0.2) \times 0.8 + (-1.8) \times (-1.8)}{\sqrt{0.6^2 + 0.2^2 + (-0.2)^2 + (-1.8)^2} \times \sqrt{0.6^2 + 1.2^2 + 0.8^2 + (-1.8)^2}} = 0.80$$
        Item1  Item2  Item3  Item4  Item5  Avg.
Alice   1.0    -1.0   0.0    0.0    ?      4
User1   0.6    -1.4   -0.4   0.6    0.6    2.4
User2   0.2    -0.8   0.2    -0.8   1.2    3.8
User3   -0.2   -0.2   -2.2   1.8    0.8    3.2
User4   -1.8   2.2    2.2    -0.8   -1.8   2.8
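The adjusted cosine computation can be sketched end to end as follows. The helper names `mean_center` and `adjusted_cosine` and the use of `None` for a missing rating are assumptions made for this illustration; each user's average is taken over the items that user actually rated, as in the Avg. column.

```python
from math import sqrt

# user -> ratings for Item1..Item5 (None marks an unknown rating)
ratings = {
    "Alice": [5, 3, 4, 4, None],
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}


def mean_center(row):
    """Subtract the user's average (over rated items) from each rating."""
    rated = [r for r in row if r is not None]
    mean = sum(rated) / len(rated)
    return [None if r is None else r - mean for r in row]


def adjusted_cosine(centered, a, b):
    """Cosine similarity of item columns a and b over users who rated both."""
    pairs = [
        (row[a], row[b])
        for row in centered.values()
        if row[a] is not None and row[b] is not None
    ]
    dot = sum(x * y for x, y in pairs)
    norm_a = sqrt(sum(x * x for x, _ in pairs))
    norm_b = sqrt(sum(y * y for _, y in pairs))
    return dot / (norm_a * norm_b)


centered = {user: mean_center(row) for user, row in ratings.items()}
sim_1_5 = adjusted_cosine(centered, 0, 4)  # Item1 vs. Item5
print(round(sim_1_5, 2))  # 0.8
```

Mean-centering lowers the Item1 and Item5 similarity from 0.99 (basic cosine) to 0.80, because the shared tendency of every user to give positive scores no longer contributes.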
Rating Prediction based on Item Similarity
Prediction Function (predicted scores are adjusted)
$$\mathit{predict}(u_a, i_t) = \frac{\sum_{i \in I_s} \mathit{sim}(i_t, i) \cdot r_{u_a,i}}{\sum_{i \in I_s} \mathit{sim}(i_t, i)}$$
$u_a$: target user a
$r_{u,i}$: rating score of user u for item i
$i_t$: target item t
$I_s$: the set of items similar to the target item
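As a rough illustration of the prediction function, the sketch below predicts Alice's score for Item5 from her ratings of similar items. The similarity 0.80 for Item1 is the adjusted cosine value from the slides; the value 0.43 for Item4 and the choice of Item1 and Item4 as the similar-item set are assumptions for this example (0.43 is what the same adjusted cosine computation yields for Item4 and Item5). The function name `predict_item_based` is illustrative.

```python
def predict_item_based(user_ratings, similarities):
    """Similarity-weighted average of the user's ratings for similar items.

    user_ratings: item -> rating given by the target user
    similarities: item -> similarity to the target item (similar items only)
    """
    numerator = sum(sim * user_ratings[item] for item, sim in similarities.items())
    denominator = sum(similarities.values())
    if denominator == 0:
        raise ValueError("similarities sum to zero; prediction is undefined")
    return numerator / denominator


alice = {"Item1": 5, "Item2": 3, "Item3": 4, "Item4": 4}
# Assumed similar-item set for Item5 (see the note above).
similar_to_item5 = {"Item1": 0.80, "Item4": 0.43}
prediction = predict_item_based(alice, similar_to_item5)
print(round(prediction, 2))  # 4.65
```

Because the result is a weighted average of Alice's own ratings, it always falls between the lowest and highest rating she gave to the selected similar items when all weights are positive.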
Selection of Similar Items (nearest neighbor items)
Set a threshold for item similarity
• If an item’s similarity is higher than the threshold, it can be regarded as a “similar” item
Focus on the top K similar items (kNN method)
• If an item ranks within the top K by similarity, it can be regarded as a similar item
• K is often set between 50 and 200
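Both selection strategies can be sketched in a few lines. The function names and the similarity values below are hypothetical, chosen only to illustrate the two rules; with four candidate items, K is set to 2 here rather than a realistic value such as 50.

```python
def select_by_threshold(similarities, threshold):
    """Keep items whose similarity is strictly higher than the threshold."""
    return {item: sim for item, sim in similarities.items() if sim > threshold}


def select_top_k(similarities, k):
    """Keep the k items with the highest similarity."""
    ranked = sorted(similarities.items(), key=lambda pair: pair[1], reverse=True)
    return dict(ranked[:k])


# Hypothetical similarities between a target item and four candidate items.
sims_to_target = {"Item1": 0.80, "Item2": -0.91, "Item3": -0.76, "Item4": 0.43}

above_threshold = select_by_threshold(sims_to_target, 0.5)
top_two = select_top_k(sims_to_target, 2)
print(above_threshold)  # {'Item1': 0.8}
print(top_two)          # {'Item1': 0.8, 'Item4': 0.43}
```

A threshold can return very few items (or none) for a target with weak neighbors, whereas top K always returns K items, even when some of them have low similarity.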
Summary of Item-based Collaborative Filtering
Basic Approach
• Item similarities are obtained from a rating matrix
• Based on the target user’s rating scores for similar items, the system predicts the target user’s rating score for a target item
Similarity Calculation
Cosine similarity is known to work best in practice
Selection of Similar Items
The top K items with the highest similarity are often selected as similar items
