Content-based recommendation
• The requirement
– some information about the available items, such as the genre (the "content")
– some sort of user profile describing what the user likes (the preferences)
• "Similarity" is computed from item attributes, e.g.,
– Similarity of movies by actors, director, genre
– Similarity of text by words, topics
– Similarity of music by genre, year
• The task:
– learn user preferences
– locate/recommend items that are "similar" to the user preferences
"Show me more of the same as what I've liked"
• Most content-based recommendation techniques were applied to recommending text documents, such as web pages or newsgroup messages.
• The content of items can also be represented as text documents, with textual descriptions of their basic characteristics.
• Structured: each item is described by the same set of attributes

| Title | Genre | Author | Type | Price | Keywords |
|---|---|---|---|---|---|
| The Night of the Gun | Memoir | David Carr | Paperback | 29.90 | Press and journalism, drug addiction, personal memoirs, New York |
| The Lace Reader | Fiction, Mystery | Brunonia Barry | Hardcover | 49.90 | American contemporary fiction, detective, historical |
| Into the Fire | Romance, Suspense | Suzanne Brockmann | Hardcover | 45.90 | American fiction, murder, neo-Nazism |

Content representation and item similarities
• Item representation: the structured table above, one row per book
• User profile

| Title | Genre | Author | Type | Price | Keywords |
|---|---|---|---|---|---|
| … | Fiction | Brunonia Barry, Ken Follett | Paperback | 25.65 | Detective, murder, New York |

• Approach
– Compute the similarity of an unseen item with the user profile based on the keyword overlap (e.g. using the Dice coefficient)
– Or use and combine multiple metrics
• Dice coefficient, where $keywords(b_j)$ describes book $b_j$ with a set of keywords:

$$sim(b_i, b_j) = \frac{2 \times |keywords(b_i) \cap keywords(b_j)|}{|keywords(b_i)| + |keywords(b_j)|}$$
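The keyword-overlap approach can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own implementation: the function name `dice_coefficient`, the lower-casing of keywords, and the treatment of two empty sets are assumptions; the keyword sets are taken from the item table and the user profile.

```python
def dice_coefficient(keywords_a, keywords_b):
    """Dice coefficient of two keyword sets: 2 * |A ∩ B| / (|A| + |B|)."""
    if not keywords_a and not keywords_b:
        return 0.0  # assumption: two empty keyword sets count as dissimilar
    overlap = len(keywords_a & keywords_b)
    return 2 * overlap / (len(keywords_a) + len(keywords_b))


# Keywords from the user profile row (lower-cased for matching).
profile = {"detective", "murder", "new york"}

# Keywords from the item table.
books = {
    "The Night of the Gun": {"press and journalism", "drug addiction",
                             "personal memoirs", "new york"},
    "The Lace Reader": {"american contemporary fiction", "detective",
                        "historical"},
    "Into the Fire": {"american fiction", "murder", "neo-nazism"},
}

# Score every book against the profile and rank by similarity.
scores = {title: dice_coefficient(profile, kw) for title, kw in books.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for title in ranked:
    print(f"{title}: {scores[title]:.3f}")
```

Each book shares exactly one keyword with the profile, so the score differences here come only from the size of the keyword sets. This is one reason the deck moves on to weighted representations.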
Term Frequency - Inverse Document Frequency (TF-IDF)
• Simple keyword representation has its problems, in particular when keywords are extracted automatically, because
– not every word has similar importance
– longer documents have a higher chance of overlapping with the user profile
• Standard measure: TF-IDF
– Encodes text documents in a multi-dimensional Euclidean space
– as weighted term vectors
• TF: measures how often a term appears (its density in a document)
– assuming that important terms appear more often
– normalization has to be done in order to take document length into account
• IDF: aims to reduce the weight of terms that appear in all documents
• Given a keyword $i$ and a document $j$
• $TF(i, j)$
– term frequency of keyword $i$ in document $j$
• $IDF(i)$
– inverse document frequency, calculated as $IDF(i) = \log \frac{N}{n(i)}$
– $N$: number of all recommendable documents
– $n(i)$: number of documents from $N$ in which keyword $i$ appears
• $TF\text{-}IDF$
– is calculated as $TF\text{-}IDF(i, j) = TF(i, j) \cdot IDF(i)$
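The definitions above translate fairly directly into code. The sketch below is only illustrative: the slides say that TF must be normalized but do not say how, so dividing the raw count by the document length is an assumption, as are the function name `tf_idf` and the toy documents.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)  # N: number of all documents
    # n(i): number of documents in which keyword i appears
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # TF(i, j): count normalized by document length (assumed scheme)
            # IDF(i):   log(N / n(i))
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical, already tokenized documents.
docs = [
    ["detective", "murder", "detective"],
    ["murder", "romance"],
    ["murder", "history", "history", "fiction"],
]
weights = tf_idf(docs)
print(weights[0])
```

Note that "murder" appears in every document, so its IDF is log(3/3) = 0 and it contributes nothing to any vector, which is the effect the IDF factor is meant to have.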
Cosine similarity
• Usual similarity metric to compare vectors: cosine similarity, calculated based on the angle between the vectors
• $sim(\vec{a}, \vec{b}) = \frac{\vec{a} \cdot \vec{b}}{|\vec{a}| \cdot |\vec{b}|}$
• Adjusted cosine similarity
– takes the average user ratings ($\bar{r}_u$) into account and transforms the original ratings
– $U$: set of users who have rated both items $a$ and $b$
– $sim(a, b) = \frac{\sum_{u \in U} (r_{u,a} - \bar{r}_u)(r_{u,b} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,a} - \bar{r}_u)^2} \, \sqrt{\sum_{u \in U} (r_{u,b} - \bar{r}_u)^2}}$
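As a rough illustration of the adjusted cosine formula, the following sketch compares Item1 and Item5 using the ratings of User1 to User4 from the rating table that appears later in the deck. The function name `adjusted_cosine`, the nested-dict layout, and returning 0.0 when a denominator is zero are assumptions made for this example.

```python
import math

# Ratings of User1..User4, copied from the deck's rating table.
ratings = {
    "User1": {"Item1": 2, "Item2": 4, "Item3": 2, "Item4": 2, "Item5": 4},
    "User2": {"Item1": 1, "Item2": 3, "Item3": 3, "Item4": 5, "Item5": 1},
    "User3": {"Item1": 4, "Item2": 5, "Item3": 2, "Item4": 3, "Item5": 3},
    "User4": {"Item1": 1, "Item2": 1, "Item3": 5, "Item4": 2, "Item5": 1},
}


def adjusted_cosine(ratings, item_a, item_b):
    """Adjusted cosine similarity of two items over users who rated both."""
    numerator = sq_a = sq_b = 0.0
    for user_ratings in ratings.values():
        if item_a not in user_ratings or item_b not in user_ratings:
            continue  # user is not in U
        mean = sum(user_ratings.values()) / len(user_ratings)  # average rating of u
        dev_a = user_ratings[item_a] - mean
        dev_b = user_ratings[item_b] - mean
        numerator += dev_a * dev_b
        sq_a += dev_a ** 2
        sq_b += dev_b ** 2
    if sq_a == 0 or sq_b == 0:
        return 0.0  # assumption: undefined similarity is reported as 0
    return numerator / (math.sqrt(sq_a) * math.sqrt(sq_b))


sim_1_5 = adjusted_cosine(ratings, "Item1", "Item5")
print(f"sim(Item1, Item5) = {sim_1_5:.3f}")
```

Subtracting each user's mean first means that a user who rates everything high and a user who rates everything low can still agree on which of the two items is better.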
An example of computing the cosine similarity of annotations
To calculate the cosine similarity between two texts t1 and t2, they are transformed into vectors as shown in the table.
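The transformation of two texts into vectors, followed by the cosine calculation, might look like the sketch below. The texts `t1` and `t2` are invented for illustration (the deck's own example table is not reproduced here), and whitespace tokenization with raw term counts is an assumption; TF-IDF weights could be used instead.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts represented as term-count vectors."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Dot product only needs the terms that occur in both texts.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count ** 2 for count in vec_a.values()))
    norm_b = math.sqrt(sum(count ** 2 for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty text is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical example texts.
t1 = "detective murder new york"
t2 = "murder mystery new orleans"
sim = cosine_similarity(t1, t2)
print(sim)
```

The two texts share two of their four terms, and every term occurs once, so the dot product is 2 and both vector lengths are 2.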
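Probabilistic methods
Calculation of probabilities in a simplistic approach

|       | Item1 | Item2 | Item3 | Item4 | Item5 |
|---|---|---|---|---|---|
| Alice | 1 | 3 | 3 | 2 | ? |
| User1 | 2 | 4 | 2 | 2 | 4 |
| User2 | 1 | 3 | 3 | 5 | 1 |
| User3 | 4 | 5 | 2 | 3 | 3 |
| User4 | 1 | 1 | 5 | 2 | 1 |

X = (Item1 = 1, Item2 = 3, Item3 = … )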
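The slide does not spell out the calculation. One common reading, assumed here, is a naive Bayes estimate: for each possible value v of Item5, multiply the prior P(Item5 = v) by the conditional probabilities P(Item_k = x_k | Item5 = v) of Alice's observed ratings X. The sketch below follows that assumption, uses plain relative frequencies without smoothing, and all names are illustrative.

```python
from collections import defaultdict

# Rows of the rating table: ratings for Item1..Item4, then Item5.
ratings = {
    "User1": [2, 4, 2, 2, 4],
    "User2": [1, 3, 3, 5, 1],
    "User3": [4, 5, 2, 3, 3],
    "User4": [1, 1, 5, 2, 1],
}
alice = [1, 3, 3, 2]  # X: Alice's ratings for Item1..Item4


def naive_bayes_scores(ratings, observed):
    """Return {item5_value: P(value) * product of P(x_k | value)}."""
    by_value = defaultdict(list)
    for row in ratings.values():
        by_value[row[-1]].append(row[:-1])  # group users by their Item5 rating
    scores = {}
    for value, rows in by_value.items():
        prior = len(rows) / len(ratings)
        likelihood = 1.0
        for k, x_k in enumerate(observed):
            matches = sum(1 for row in rows if row[k] == x_k)
            likelihood *= matches / len(rows)  # no smoothing (assumption)
        scores[value] = prior * likelihood
    return scores


scores = naive_bayes_scores(ratings, alice)
best = max(scores, key=scores.get)
print(scores, "-> predicted Item5 rating:", best)
```

With so few users, most conditional probabilities are zero, which is why practical variants usually add some form of smoothing.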
Slope One predictors
• The idea of Slope One predictors is simple and is based on a popularity differential between items for users
• Example:

|       | Item1 | Item5 |
|---|---|---|
| Alice | 2 | ? |
| User1 | 1 | 2 |

– p(Alice, Item5) = 2 + (2 - 1) = 3
• Basic scheme: take the average of these differences of the co-ratings to make the prediction
• In general: find a function of the form f(x) = x + b
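The basic scheme can be sketched as follows. This is the simple, unweighted variant (each co-rated item contributes equally), which is an assumption; the function name `slope_one_predict` and the nested-dict layout are also illustrative.

```python
def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating for `target` from average rating differences."""
    predictions = []
    for item, user_rating in ratings[user].items():
        if item == target:
            continue
        # Differences target - item over all other users who rated both items.
        diffs = [
            other[target] - other[item]
            for name, other in ratings.items()
            if name != user and target in other and item in other
        ]
        if diffs:
            # f(x) = x + b, with b the average difference of the co-ratings
            predictions.append(user_rating + sum(diffs) / len(diffs))
    if not predictions:
        return None  # no co-ratings available
    return sum(predictions) / len(predictions)


# The example from the slide.
ratings = {
    "Alice": {"Item1": 2},
    "User1": {"Item1": 1, "Item5": 2},
}
prediction = slope_one_predict(ratings, "Alice", "Item5")
print(prediction)
```

For the slide's data this reproduces 2 + (2 - 1) = 3.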
Linear classifiers
[Figure: two-dimensional plot of Relevant and Nonrelevant documents]
• Most learning methods aim to find the coefficients of a linear model
• A simplified classifier with only two dimensions can be represented by a line
• The line has the form $w_1 x_1 + w_2 x_2 = b$
– $x_1$ and $x_2$ correspond to the vector representation of a document (using e.g. TF-IDF weights)
– $w_1$, $w_2$ and $b$ are parameters to be learned
– Classification of a document is based on checking $w_1 x_1 + w_2 x_2 > b$
• In n-dimensional space the classification function is $\vec{w}^T \vec{x} = b$
• Other linear classifiers:
– Naive Bayes classifier, Rocchio method, Widrow-Hoff algorithm, support vector machines
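Once the parameters are known, the decision rule itself is a one-line check. The sketch below only applies the rule; it does not learn anything. The weights `w`, the threshold `b`, and the two document vectors are hypothetical values chosen for illustration.

```python
def is_relevant(x, w, b):
    """Linear decision rule: the document is relevant if w^T x > b."""
    return sum(w_k * x_k for w_k, x_k in zip(w, x)) > b


# Hypothetical learned parameters for a two-dimensional document space.
w = [0.8, 0.4]
b = 0.5

# Hypothetical documents, e.g. TF-IDF weights of two terms.
doc_a = [0.9, 0.1]  # 0.8 * 0.9 + 0.4 * 0.1 = 0.76
doc_b = [0.1, 0.3]  # 0.8 * 0.1 + 0.4 * 0.3 = 0.20

print(is_relevant(doc_a, w, b), is_relevant(doc_b, w, b))
```

The same function works unchanged in n dimensions, since it only relies on the dot product of `w` and `x`.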
Metrics: measuring the error rate
– Mean Absolute Error (MAE) computes the deviation between predicted ratings and actual ratings
– Root Mean Square Error (RMSE) is similar to MAE, but places more emphasis on larger deviations
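Both metrics are short enough to write out. The sketch below uses the standard definitions; the predicted and actual ratings are invented for illustration.

```python
import math


def mae(predicted, actual):
    """Mean Absolute Error: average absolute deviation."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root Mean Square Error: squaring penalizes large deviations more."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )


# Hypothetical predicted and actual ratings for four items.
predicted = [3, 4, 2, 5]
actual = [3, 2, 2, 4]

print(f"MAE  = {mae(predicted, actual):.3f}")
print(f"RMSE = {rmse(predicted, actual):.3f}")
```

The single deviation of 2 dominates the RMSE, which therefore comes out larger than the MAE on the same data.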
Next …
• Hybrid recommendation systems
• More theories
• Boolean and Vector Space Retrieval Models
• Clustering
• Data mining
• And so on