The exciting world of Reinforcement Learning: a case for consumer businesses
Ever since I got curious about and hooked on reinforcement learning and its numerous applications in industry, my excitement for the field has only grown stronger by the day. Here, I’d like to share some of what I have learned about the potential applications of reinforcement learning (RL) for consumer businesses. But before I dive into the details, here is a quick introduction to RL for ML practitioners who are new to the subject.
RL is a branch of machine learning in which a smart agent learns to achieve a goal through trial and error in an environment: it takes actions, observes the outcome, and receives a reward signal that tells it how well it is doing. At the end of training, we have an agent that can pursue the goal independently in real life. If you are familiar with the other types of ML, supervised and unsupervised learning, this might sound very similar to supervised learning. The big difference between the two (among others) is that RL does not require explicit labels, unlike supervised learning (although there are exceptions to this rule); the agent learns from the rewards its own actions generate. For more details and context, there are many blogs and articles on RL. You could also look up some of the groundbreaking work done by DeepMind and OpenAI to learn about the accomplishments of the past several years, and read the book ‘Reinforcement Learning: An Introduction’ by Richard S. Sutton and Andrew G. Barto to learn how the field came about.
Among the many super exciting applications of RL, my search focussed on personalisation use cases for consumer businesses. While my use cases come from the media & publishing industry, they could easily be extended to other consumer-focussed industries such as e-tailers, travel and hospitality, etc.
a) Newsletter (NL) delivery personalisation: Newsletters are one of the primary sources of traffic for any media & publishing firm, and RL could be used to drive a personalised experience for readers. Newsletters from our favourite daily or weekly newspapers and magazines often reach us at the same time, regardless of when we want to read them. In other words, it’s not uncommon for newsletters to be blasted out to all users at the same time of day and on the same day of the week. In the current era of digitisation, this need not be the case. The ideal solution would be to send each newsletter at the time when that user is most likely to open it.
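One simple way to frame the send-time problem is as a multi-armed bandit, a single-state special case of RL: each candidate send hour is an action and an open is the reward. The sketch below uses Thompson sampling with a Beta prior per hour. The class name `SendTimeBandit`, the `simulate` helper, the candidate hours, and the open rates passed to the simulation are all illustrative assumptions, not figures from a real campaign.

```python
import random


class SendTimeBandit:
    """Beta-Bernoulli Thompson sampling over candidate send hours for one user."""

    def __init__(self, hours, seed=None):
        self.hours = list(hours)
        # Beta(1, 1) prior per hour, stored as [opens + 1, non-opens + 1].
        self.params = {h: [1.0, 1.0] for h in self.hours}
        self.rng = random.Random(seed)

    def choose_hour(self):
        # Draw one plausible open rate per hour and send at the best draw.
        samples = {h: self.rng.betavariate(a, b) for h, (a, b) in self.params.items()}
        return max(samples, key=samples.get)

    def update(self, hour, opened):
        # An open increments the first count, a non-open the second.
        self.params[hour][0 if opened else 1] += 1

    def best_hour(self):
        # Hour with the highest posterior mean open rate.
        return max(self.hours, key=lambda h: self.params[h][0] / sum(self.params[h]))


def simulate(true_open_rates, n_sends=2000, seed=0):
    """Run the bandit against a made-up user whose open rate depends on the hour."""
    rng = random.Random(seed)
    bandit = SendTimeBandit(true_open_rates.keys(), seed=seed + 1)
    for _ in range(n_sends):
        hour = bandit.choose_hour()
        opened = rng.random() < true_open_rates[hour]
        bandit.update(hour, opened)
    return bandit
```

In a simulated run where one hour has a clearly higher open rate than the others, the bandit should concentrate most of its sends on that hour while still occasionally trying the alternatives. A production version would likely need one bandit per user or shared context features, which this sketch leaves out.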
b) NL capacity identification: Another issue marketers often grapple with is identifying the optimum number of newsletters to send to a subscriber: how many emails per user is too many? It’s common knowledge that the appetite for newsletters varies from user to user and does not stay constant. Yet we are used to sending the same number of emails to all subscribers all the time. I admit that there is no easy way to dynamically determine this magic number for every user. But with the application of RL techniques, this is a problem that could be solved.
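The frequency question can be sketched as a bandit problem too, with the number of newsletters per week as the action. The reward has to penalise unsubscribes, otherwise the agent would simply learn to send more. Below is a minimal epsilon-greedy sketch; the `FrequencyBandit` class, the `weekly_reward` function, and the unsubscribe penalty of 20 are hypothetical choices for illustration only.

```python
import random


class FrequencyBandit:
    """Epsilon-greedy choice of newsletters per week for one subscriber."""

    def __init__(self, frequencies, epsilon=0.1, seed=None):
        self.frequencies = list(frequencies)
        self.epsilon = epsilon
        self.counts = {f: 0 for f in self.frequencies}
        # Running mean reward per frequency; untried options start at 0.
        self.values = {f: 0.0 for f in self.frequencies}
        self.rng = random.Random(seed)

    def choose(self):
        # Explore a random frequency with probability epsilon, else exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.frequencies)
        return max(self.frequencies, key=lambda f: self.values[f])

    def update(self, frequency, reward):
        # Incremental update of the mean reward observed for this frequency.
        self.counts[frequency] += 1
        n = self.counts[frequency]
        self.values[frequency] += (reward - self.values[frequency]) / n


def weekly_reward(opens, unsubscribed, unsubscribe_penalty=20.0):
    """Assumed reward: opens this week, minus a large penalty for an unsubscribe."""
    return opens - (unsubscribe_penalty if unsubscribed else 0.0)
```

Each week the marketer would call `choose()`, send that many newsletters, and feed the observed `weekly_reward(...)` back through `update()`. The size of the penalty encodes how much an unsubscribe hurts relative to an open, which is a business decision rather than something the algorithm can infer.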
c) Personalised box subscriptions: Box subscriptions are subscription products designed so that each issue consists of a certain assortment of products. For example, a monthly beauty box would contain a random assortment of beauty products for the face, skin, hair, etc. The next month’s issue could be a totally different assortment. Please note that in this model the subscriber has no choice in selecting the products, and the only feedback from the user is whether they renew the subscription. The major challenge is identifying the right product mix to maximise subscriber retention.
Formulating this as an RL problem, with the assortment as the action and the renewal as the reward, we could determine for every subscription issue the assortment that is most personalised for a user while maximising that subscriber’s retention.
d) Dynamic paywall metering: In the digital media & publishing industry, one of the key decisions publishers need to make is the trade-off between letting users read articles for free while earning revenue from ads, and blocking free access with a digital paywall after a certain number of articles to induce the user to subscribe. The call to action on a paywall could be either to subscribe or to register in order to read further. Usually the paywall meter is set to a fixed number of free articles (2, 4 or 6 per month) for all users.
But such an implementation is not optimal. A loyal reader of a brand would continue to read more articles and contribute more ad revenue, so capping this user at just 4 articles per month cuts off potential ad revenue prematurely. Ideally, we would bring up the paywall for such a user after maybe 6–7 articles per month. On the other hand, a less engaged user who is unlikely to return to read a second article need not have a 4-article limit; as this user is unlikely to generate much ad revenue, we could show the paywall as early as the second visit and push this user to subscribe.
Instead of setting up such manual rules for each user, RL could learn the reading pattern of each user and recommend a paywall limit that maximises the revenue potential of that user. Not just that, its learning adapts to the changing reading behaviour of each user over time and automatically adjusts the paywall limit to maximise the revenue potential for the brand.
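To make the paywall idea concrete, here is a small tabular Q-learning sketch. The state is the number of free articles a user has read this month, the actions are to serve another free article or to show the paywall, and the reward is ad revenue or subscription revenue. The user model (`p_return`, `p_subscribe`, `ad_revenue`, `subscription_value`), the `LOYAL` and `CASUAL` profiles, and every number in them are made up for illustration; in practice the agent would learn from real reading logs rather than a simulator. The step size is 1/n per state-action pair, a simplification chosen to keep the estimates stable.

```python
import random

FREE, PAYWALL = 0, 1

# Hypothetical user profiles; all values are illustrative assumptions.
LOYAL = {"p_return": 0.9, "p_subscribe": 0.3, "ad_revenue": 1.0, "subscription_value": 10.0}
CASUAL = {"p_return": 0.1, "p_subscribe": 0.3, "ad_revenue": 1.0, "subscription_value": 10.0}


def simulate_step(state, action, user, rng, max_articles):
    """Assumed user model. Returns (reward, next_state); next_state is None when the month ends."""
    if action == PAYWALL:
        subscribed = rng.random() < user["p_subscribe"]
        return (user["subscription_value"] if subscribed else 0.0), None
    # Free article: earn ad revenue, then the reader may or may not come back.
    next_state = state + 1
    if next_state >= max_articles or rng.random() >= user["p_return"]:
        return user["ad_revenue"], None
    return user["ad_revenue"], next_state


def learn_meter(user, max_articles=6, episodes=20000, epsilon=0.1, seed=0):
    """Undiscounted tabular Q-learning; one episode is one month for one user."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(max_articles)]
    visits = [[0, 0] for _ in range(max_articles)]
    for _ in range(episodes):
        state = 0
        while state is not None:
            if rng.random() < epsilon:
                action = rng.choice((FREE, PAYWALL))  # explore
            else:
                action = FREE if q[state][FREE] >= q[state][PAYWALL] else PAYWALL
            reward, next_state = simulate_step(state, action, user, rng, max_articles)
            # Q-learning target: reward plus the best estimated value of the next state.
            target = reward + (max(q[next_state]) if next_state is not None else 0.0)
            visits[state][action] += 1
            q[state][action] += (target - q[state][action]) / visits[state][action]
            state = next_state
    return q


def meter_limit(q):
    """Number of free articles served before the greedy policy first shows the paywall."""
    for state, (q_free, q_paywall) in enumerate(q):
        if q_paywall > q_free:
            return state
    return len(q)
```

With these assumed profiles, `meter_limit(learn_meter(LOYAL))` should come out higher than `meter_limit(learn_meter(CASUAL))`: the loyal reader keeps generating ad revenue, so delaying the paywall pays off, while the casual reader rarely returns, so the paywall is worth showing straight away.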
I hope you found the above business use cases for the application of RL insightful and useful. If you liked it, kindly leave a like on the article. I’d like to follow up this article with a few use cases for the ad/commercial business and a few learning resources that I found useful for building RL capability within my team.