From the course: Recommendation Systems: A Practical Hands-On Introduction
Coding: Collaborative filtering algorithm - Python Tutorial
- [Instructor] Let's take a look at how to implement SAR. Here in the 02 SAR MovieLens notebook, I'm going to run it first and I'm going to hide this. As we discussed, SAR is one of the best algorithms you can use if you're starting your journey in recommendation systems. The idea is that SAR recommends the items most similar to the ones the user already has an affinity for, and in this notebook we're going to see how to program SAR using Python.

The first thing we do is the imports, as always, and we're going to use MovieLens data. The schema SAR uses is user ID, item ID, timestamp, and optionally an event type or a weight. This is very useful if, for example, you're in e-commerce and you have different interactions: a buy is the strongest interaction and has the highest affinity, so you give it a large weight, whereas other interactions like click or view get a lower weight. For this notebook we're going to use the MovieLens dataset, and the first thing we do is load it. This is what MovieLens looks like: user, item, rating, and timestamp. Then we do a stratified split. In this case, we are stratifying by user, so you see here that we have the same number of users.

The structure we follow is that first we instantiate the algorithm with the different parameters, and then we just train. During training, it first calculates the affinity matrix and then computes the similarity, based on co-occurrence or whichever similarity metric you're using. You see that this is quite fast, and it is quite fast because SAR is actually not a machine learning algorithm. It's more like a matrix multiplication, so it's an algebraic algorithm. And then we predict on the test set, and this is what it looks like.
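To make the steps above concrete, here is a minimal sketch of the idea in plain Python. It is not the code from the notebook or from the Recommenders library: the toy events, the weights in `EVENT_WEIGHTS`, and the function names are all assumptions made up for illustration, the similarity is raw co-occurrence only, and timestamps are ignored.

```python
from collections import defaultdict

# Hypothetical event weights: a buy counts more than a click (assumed values).
EVENT_WEIGHTS = {"buy": 3.0, "click": 1.0}

# Toy interaction log of (user, item, event type), invented for this sketch.
events = [
    ("u1", "A", "buy"), ("u1", "B", "click"),
    ("u2", "A", "click"), ("u2", "B", "buy"), ("u2", "C", "click"),
    ("u3", "B", "click"), ("u3", "C", "buy"),
]

def build_affinity(events, weights):
    """User-item affinity: sum of event weights per (user, item) pair."""
    affinity = defaultdict(lambda: defaultdict(float))
    for user, item, event in events:
        affinity[user][item] += weights[event]
    return affinity

def build_cooccurrence(affinity):
    """Item-item similarity: number of users who interacted with both items."""
    cooc = defaultdict(lambda: defaultdict(int))
    for items in affinity.values():
        for i in items:
            for j in items:
                cooc[i][j] += 1
    return cooc

def recommend(user, affinity, cooc):
    """Score unseen items as (affinity row) x (similarity matrix), best first."""
    seen = affinity.get(user, {})
    scores = defaultdict(float)
    for i, weight in seen.items():
        for j, similarity in cooc[i].items():
            if j not in seen:  # do not recommend items the user already has
                scores[j] += weight * similarity
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

affinity = build_affinity(events, EVENT_WEIGHTS)
cooc = build_cooccurrence(affinity)
print(recommend("u1", affinity, cooc))
```

There is no iterative optimization here, only counting and a weighted sum, which is consistent with the point that training is fast. The output is a list of (item, score) pairs rather than predicted ratings.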
Now the last thing we're going to do is the offline evaluation, and here I added both the ranking metrics and the rating metrics. Typically, for this use case, recommendation systems, the ranking metrics are more interesting: we're not just looking at how accurately the model predicts the exact rating in the dataset, we are more interested in the order. Here we see the results, so we see MAP, NDCG, precision, recall, et cetera. These are the metrics we're going to use when we want to compare algorithms offline.

And now let's quickly look at one of the users. Here what we did is we just took the test set for one of the users and compared the ground truth with the prediction. The first thing you'll notice is that the prediction of SAR is not a rating, it's just a score. It measures the strength of the interest between a specific user and a specific item. So we're more interested in the order than in the absolute value. This algorithm is quite fast and scalable. If the dataset is quite large, there is a Spark implementation in Recommenders that you can use, and it is definitely one of the first algorithms I would use if I were starting in a new team.
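As a rough illustration of why only the order of the scores matters, the sketch below computes three ranking metrics for a single user in plain Python. The scores, the set of relevant items, and the function names are hypothetical, and these are the common binary-relevance definitions, which may differ in detail from the ones the notebook's evaluation functions use.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: hits near the top of the list count more."""
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical model scores for one user (scores, not ratings).
scores = {"A": 5.0, "B": 0.3, "C": 2.1, "D": 1.2}
# Hypothetical ground truth: items this user interacted with in the test set.
relevant = {"A", "D"}

# Only the ordering is used; rescaling the scores would not change the metrics.
ranked = sorted(scores, key=scores.get, reverse=True)

print(ranked)
print(precision_at_k(ranked, relevant, 3))
print(recall_at_k(ranked, relevant, 3))
print(ndcg_at_k(ranked, relevant, 3))
```

Multiplying every score by a constant leaves `ranked` unchanged, so all three metrics stay the same, which is why the absolute value of a SAR score is less important than its rank.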