
From the course: Recommendation Systems: A Practical Hands-On Introduction

Recommendation system architectures - Python Tutorial


- To become a top data scientist in your company, it is not enough to know a lot about recommendation systems. You need to generate business value with them. Since this is a practical course, in this video, we're going to talk about the techniques that companies use today to deploy recommendation systems. In recommendation systems, there are three general architectures in use.

The batch architecture is the simplest way to deploy our recommendation system. The idea is as follows. We gather the data from whatever storage or data process we have and load it into a compute resource. The compute can be a Spark cluster or a virtual machine. The next step is to try different machine learning algorithms. Unfortunately, as we talked about, in recommendation systems no single algorithm is best all the time. We need to run a lot of experiments with different algorithms to find the one that works with our data. Once you select the algorithm that provides the best performance, you're going to score the top-k recommendations for each user. The output you get is a table with one user per row and a set of 10 to 50 recommended items for each user. Next, you store that table in a database. Finally, the recommendations are served from the front end by just doing a query to the database. Optionally, we can have a set of business rules to filter the recommendations. Examples of business rules are removing items that are out of stock, removing items that are not available in the user's country, et cetera.

A real-time recommender is similar to the batch architecture, with a small variation. The data ingestion is the same. The machine learning computation is the same. But the way we deploy the solution is different. Instead of scoring the recommendations and storing them in a database, in the real-time architecture, we deploy the machine learning model in a production cluster. Every time a user enters the front end, we're going to make a query to the machine learning model that is loaded in memory.
The model is going to score in real time, and the recommended items are going to be shown to the user in the front end.

If we compare the batch architecture with the real-time architecture, we see the following. The batch architecture is simpler to implement. It can easily handle any machine learning algorithm, no matter the size, and it provides the fastest request time. Its main disadvantage is that the computation of the model is typically done once a day, so the list of items the user gets won't change during that day, no matter how the user behaves during the session. The real-time architecture is able to adapt in real time to the behavior of the user. However, it is more difficult to implement, and a large machine learning model will produce slow request times.

Imagine you find yourself in this situation. Your company asks you to develop a recommender that is able to adapt to the user's behavior in real time, and that is also served with less than 20 milliseconds of latency. Is there a way to use a complex machine learning algorithm and at the same time have a small request time? Well, yes. The hybrid architecture, or two-step recommender, has the best of both worlds. The idea is to use a combination of the batch and real-time architectures. In the first step, we use a heavy machine learning algorithm to generate a set of candidates. For each user, we score a list of 100 to 500 candidate items that we store in a database. In this phase, we want to optimize for recall: we want to capture the relevant items for each user. In the second phase, we're going to use a real-time algorithm that just reranks the list of candidates for each user in real time. Since this is a light algorithm, the request time is much faster. And at the same time, we are able to capture the recent behavior of the user and provide a more relevant response.

So, there are three ways of deploying our recommendation system. The first solution I would use, if I were starting on a new team, is the batch architecture.
It is easy to implement and provides a fast request time. Once the solution matures, I would move to either the real-time architecture or the hybrid.
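
As a rough illustration of the two-step idea described above, here is a minimal Python sketch. It is not code from the course. The toy data and the names `build_candidate_table` and `recommend` are invented for this example. Simple item co-occurrence counting stands in for the heavy algorithm, a category match on the user's latest click stands in for the light reranker, and a plain dictionary stands in for the database. The candidate list is also far shorter than the 100 to 500 items mentioned in the video.

```python
from collections import defaultdict

# Toy data, invented for illustration only.
HISTORY = {
    "u1": ["tent", "novel"],
    "u2": ["tent", "boots", "atlas"],
    "u3": ["novel", "atlas", "lamp"],
    "u4": ["tent", "stove", "novel"],
}
CATEGORY = {
    "tent": "outdoor", "boots": "outdoor", "stove": "outdoor",
    "novel": "books", "atlas": "books", "lamp": "home",
}
OUT_OF_STOCK = {"stove"}


def build_candidate_table(history, n_candidates=4):
    """Step 1 (batch, e.g. once a day): precompute candidates per user.

    Item co-occurrence counting is a stand-in for the heavy model.
    """
    cooc = defaultdict(lambda: defaultdict(int))
    for items in history.values():
        for i in items:
            for j in items:
                if i != j:
                    cooc[i][j] += 1

    table = {}
    for user, items in history.items():
        seen = set(items)
        scores = defaultdict(int)
        for i in items:
            for j, count in cooc[i].items():
                if j not in seen:
                    scores[j] += count
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda j: (-scores[j], j))
        table[user] = ranked[:n_candidates]
    return table


def recommend(user, session_clicks, table, k=2):
    """Step 2 (request time): look up, filter, and rerank stored candidates."""
    candidates = table.get(user, [])
    # Business rule: remove items that are out of stock.
    candidates = [item for item in candidates if item not in OUT_OF_STOCK]
    if session_clicks:
        recent_category = CATEGORY[session_clicks[-1]]
        # sorted() is stable, so the batch order is kept within each group.
        candidates = sorted(
            candidates, key=lambda item: CATEGORY[item] != recent_category
        )
    return candidates[:k]


# In production this table would be written to a database by the batch job.
CANDIDATES = build_candidate_table(HISTORY)

print(recommend("u1", [], CANDIDATES))        # ['atlas', 'boots']
print(recommend("u1", ["tent"], CANDIDATES))  # ['boots', 'atlas']
```

With no session activity, the user gets the batch order after the business rule is applied, which is what a pure batch deployment would serve. After a click on an outdoor item, the same stored candidates are reordered at request time, without rerunning the heavy step.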
