A Scalable Collaborative Filtering
Framework based on Co-clustering
Authors: Thomas George, Srujana Merugu (ICDM’05).
Presenter: Rei-Zhe Liu. Date: 2010/10/26.
Outline
 Introduction
 System architecture
 Experiments and results
 Conclusion
Introduction
 We propose a dynamic collaborative filtering
approach that can support the entry of new users,
items and ratings using a hybrid of incremental
and batch versions of the co-clustering algorithm.
 Empirical comparison of our approach with SVD,
NNMF and correlation-based collaborative filtering
techniques indicates comparable accuracy at a
much lower computational effort.
System architecture
Problem definition(1/2)
 The approximate matrix for prediction is given by
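The formula on this slide was an image and did not survive as text. As a hedged reconstruction, the approximation in the George and Merugu paper combines a co-cluster average with user and item deviations from their cluster averages. The symbol names below are a reconstruction of that notation and should be checked against the paper.

```latex
\hat{A}_{ij} = A^{COC}_{gh} + \left(A^{R}_{i} - A^{RC}_{g}\right) + \left(A^{C}_{j} - A^{CC}_{h}\right),
\qquad g = \rho(i), \quad h = \gamma(j)
```

Here \rho and \gamma map users and items to their clusters, A^{COC}_{gh} is the average rating in co-cluster (g, h), A^{R}_{i} and A^{C}_{j} are the average ratings of user i and item j, and A^{RC}_{g} and A^{CC}_{h} are the averages of user cluster g and item cluster h.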
Problem definition(2/2)
 We can now pose the prediction of unknown ratings
as a co-clustering problem where we seek to find the
optimal user and item clustering such that the
approximation error with respect to the known
ratings of A is minimized,
 where W, a 0/1 indicator of the known ratings, ensures that only the
known ratings contribute to the loss function.
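A minimal sketch of this loss, assuming the prediction is the co-cluster average plus the user and item deviations from their cluster averages, as in the paper. The function names, the one-hot membership trick and the global-average fallback for empty clusters are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def cocluster_predict(A, W, rho, gamma, k, l):
    """Approximate A from co-cluster, user, item, and cluster averages.

    A: (m, n) ratings; W: (m, n) 0/1 mask of known ratings;
    rho: (m,) user-cluster ids in [0, k); gamma: (n,) item-cluster ids in [0, l).
    """
    def safe_div(num, den, fallback):
        # Average over known ratings; use the fallback where none are known.
        return np.where(den > 0, num / np.maximum(den, 1), fallback)

    AW = A * W                                # zero out unknown ratings
    global_avg = AW.sum() / max(W.sum(), 1)   # assumed fallback value

    R = np.eye(k)[rho]      # (m, k) one-hot user-cluster membership
    C = np.eye(l)[gamma]    # (n, l) one-hot item-cluster membership

    coc = safe_div(R.T @ AW @ C, R.T @ W @ C, global_avg)       # co-cluster averages
    row = safe_div(AW.sum(1), W.sum(1), global_avg)             # per-user averages
    col = safe_div(AW.sum(0), W.sum(0), global_avg)             # per-item averages
    rc = safe_div(R.T @ AW.sum(1), R.T @ W.sum(1), global_avg)  # user-cluster averages
    cc = safe_div(C.T @ AW.sum(0), C.T @ W.sum(0), global_avg)  # item-cluster averages

    return (coc[rho][:, gamma]
            + (row - rc[rho])[:, None]
            + (col - cc[gamma])[None, :])

def masked_squared_error(A, W, A_hat):
    # Only entries with W == 1 (known ratings) contribute to the loss.
    return float((W * (A - A_hat) ** 2).sum())
```

A co-clustering algorithm would search over rho and gamma to make masked_squared_error small; this sketch only evaluates the loss for a given pair of assignments.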
Algorithm(1/3) to Algorithm(3/3) [slide figures not reproduced]
System description
 P1 handles the prediction and
incremental training.
 P2 is responsible for the static
training.
 During incremental training, P1
also updates the raw ratings.
 P2 performs co-clustering
repeatedly by reading A (the
current ratings matrix) and
updating S (summary statistics)
when done.
 The data objects A and S are each
stored in two parts: (a) a stable
part and (b) an increment part.
 At the end of each co-clustering
run, the two parts are merged to
obtain a new set of stable values.
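The stable/increment split can be sketched as follows. This is a simplified illustration under stated assumptions: the summary statistics are taken to be per-key sums and counts, and the class and method names (RunningStats, SummaryStore, record, merge) are hypothetical rather than the authors' design. In particular, a real co-clustering run would also recompute the stable statistics under new cluster assignments, which is omitted here.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class RunningStats:
    """Sum and count of ratings per key (e.g. per user, item, or co-cluster)."""
    total: defaultdict = field(default_factory=lambda: defaultdict(float))
    count: defaultdict = field(default_factory=lambda: defaultdict(int))

    def add(self, key, rating):
        self.total[key] += rating
        self.count[key] += 1

    def merge_from(self, other):
        for key, value in other.total.items():
            self.total[key] += value
        for key, value in other.count.items():
            self.count[key] += value

class SummaryStore:
    """Hypothetical two-part store: a stable part plus an increment part."""

    def __init__(self):
        self.stable = RunningStats()
        self.increment = RunningStats()

    def record(self, key, rating):
        # Incremental training (P1) writes only to the increment part.
        self.increment.add(key, rating)

    def mean(self, key):
        # Reads combine both parts, so new ratings are visible immediately.
        n = self.stable.count.get(key, 0) + self.increment.count.get(key, 0)
        if n == 0:
            return None
        s = self.stable.total.get(key, 0.0) + self.increment.total.get(key, 0.0)
        return s / n

    def merge(self):
        # End of a co-clustering run (P2): fold the increment into the stable part.
        self.stable.merge_from(self.increment)
        self.increment = RunningStats()
```

Keeping the increment separate means P1 can keep accepting ratings while P2 works from the stable part, and the merge is a cheap addition of sums and counts.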
Experiments and results
Data sets and Exp. Settings(1/2)
 Data set
 MovieLens: 943 × 1682 user-by-movie matrix with 100,000
ratings in total, each on a scale of 1 to 5.
 Evaluation methodology
 The prediction accuracy was measured using the mean
absolute error (MAE), which is the average of the
absolute values of the errors over all the predictions.
 The static training time was estimated in terms of the
CPU time taken for the core training routines (viz.
co-clustering and SVD).
 The prediction time was estimated by averaging over the
response time taken for all the predictions.
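The MAE defined above can be computed as in this short sketch. The function name and the list-based inputs are illustrative choices.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all predictions."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one prediction")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

For example, true ratings [4, 3, 5] against predictions [3.5, 3, 4] have absolute errors 0.5, 0 and 1, so the MAE is 0.5.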
Data sets and Exp. Settings(2/2)
 For evaluating the prediction accuracy, we created ten
80-20% random train-test splits of the datasets and
averaged the results over the various splits.
 We considered two scenarios: (i) static testing, where
the known ratings do not change, and (ii) dynamic
testing, where the ratings are updated incrementally.
 Algorithms
 We compared the performance of our co-clustering
based approach with SVD [13], NNMF [10] and classic
correlation-based collaborative filtering [12].
 An incremental SVD-based approach [14] using a
folding-in technique was also implemented in order to
evaluate the prediction accuracy in dynamic scenarios
with changing ratings.
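The ten random 80-20% train-test splits can be generated along the following lines. This is a sketch, assuming the ratings are indexed 0 to num_ratings - 1; the function name and the seed parameter are illustrative and not part of the original experimental setup.

```python
import numpy as np

def random_splits(num_ratings, num_splits=10, train_fraction=0.8, seed=0):
    """Yield (train_idx, test_idx) index arrays for repeated random splits."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_fraction * num_ratings))
    for _ in range(num_splits):
        perm = rng.permutation(num_ratings)   # fresh shuffle for every split
        yield perm[:n_train], perm[n_train:]
```

Each method would be trained on train_idx, scored on test_idx, and the resulting errors averaged over the ten splits.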
Evaluation(1/3)
[Figures not reproduced. Panel settings: k = l = SVD rank = NNMF rank = 3; k = l = SVD rank = 3]
Evaluation(2/3)
[Figures not reproduced. Datasets: Mov1; MovieLens. Labels: CoC: (m+n+kl-k-l); NNMF, SVD: (m+n)(k+l)]
Evaluation(3/3)
[Figure not reproduced. Dataset: MovieLens]
Conclusion
 In this paper, we presented a new dynamic
collaborative filtering approach based on simultaneous
clustering of users and items.
 Empirical results indicate that our approach can provide
high quality predictions at a much lower computational
cost compared to traditional correlation and SVD-based
approaches.