Recommendation Systems
What is a Recommendation System?
Every personalized ad on Google, recommended video on YouTube, or Discover Weekly playlist on Spotify is made possible by a set of algorithms and methods known collectively as recommendation systems. These systems give either personalized or general recommendations on a service, in the interest of both the user and the provider. Personalized recommendation systems can be defined as programs that try to recommend relevant products or services to a user based on information previously collected from that user.
There are two interconnected driving forces behind the advance of recommendation systems. The first is known as the long tail phenomenon. Online stores and service providers differ from physical ones in how much they can offer. A physical store is limited by space, so it can only display a certain number of items. An online retailer has no such constraint; it can display millions of items. In this abundance, some items are bound to get lost if they are not popular enough to be known and searched for by name. These less popular niche items make up the long tail of products, and recommendation systems deployed on the website can bring them back to the surface.
The second, related force is that advances in technology have made it possible to shift from the old tradition of mass-producing a single item to personalized and varied production, and companies can no longer ignore this shift. [Joe Pine, Mass Customization, 1993] To keep an edge, companies have to serve many customers with varying interests using many products. Recommendation systems make this possible: every user sees a different landing page on an online retailer or news website.
Older Recommendation Approaches
E-commerce and online shopping, in general, started taking off around the turn of the millennium. By 1999, Amazon and eBay were already in business, albeit on a much smaller scale. Movie websites like Moviefinder.com and Reel.com were already leveraging recommendation systems. However, one of the basic requirements for well-performing algorithms is data.
By June 2000, only 22% of Americans had ever bought a product online. People looked for information online about a product they wanted to buy, but they went to a physical store to buy it. There was a general lack of trust in the Internet. This made it hard for companies to gather the user data needed for personalized recommendations.
The most well-known recommender systems algorithms that emerged around this time were content-based and collaborative filtering.
Content-based recommendations
In this approach, item similarities are taken as the basis for recommendations. Items are profiled according to their attributes. For a book, this can be its genre, author, publish date, country of origin, etc. Then, based on these profiles, items that are similar to the ones that the user bought, or have viewed, are recommended to the user.
However, there are several difficulties with this approach. Profiling items and selecting attributes can be difficult. For a book or a movie, attributes can be selected fairly easily, but for short-lived items like news articles, or for products like furniture, it is usually harder. In addition, it is hard to recommend items outside the category of the selected item. For example, if a user is interested in a book about cameras, the system might not be able to recommend cameras or tripods since they have different feature spaces and categories.
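The profile-matching idea behind content-based recommendations can be sketched in a few lines. This is a minimal illustration, not any particular site's method: the book titles, the attribute tags, and the choice of cosine similarity over binary attribute sets are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical item profiles: each book is described by a set of attribute tags.
ITEM_PROFILES = {
    "dune": {"genre:sci-fi", "author:herbert", "decade:1960s"},
    "foundation": {"genre:sci-fi", "author:asimov", "decade:1950s"},
    "i_robot": {"genre:sci-fi", "author:asimov", "decade:1950s"},
    "emma": {"genre:romance", "author:austen", "decade:1810s"},
}


def cosine_similarity(a, b):
    """Cosine similarity between two attribute sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend_similar(liked_items, profiles, top_n=2):
    """Rank unseen items by their best similarity to any item the user liked."""
    scores = {}
    for candidate, candidate_profile in profiles.items():
        if candidate in liked_items:
            continue
        scores[candidate] = max(
            cosine_similarity(candidate_profile, profiles[liked])
            for liked in liked_items
        )
    # Sort by score (descending), then by name so ties are deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


recommendations = recommend_similar({"foundation"}, ITEM_PROFILES)
print(recommendations)
```

Because every score comes from shared attributes, an item from a different category with no overlapping tags scores zero, which is the cross-category limitation described above.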
Collaborative filtering
This approach takes user behaviors as the basis for recommendations. Assume that we have a user A and a user B. If user A and B have bought the same products, or have rated the same movies similarly, we can recommend user A the products user B likes.
This approach suffers from the cold start problem and from popularity bias. It cannot recommend anything to new users who have not rated anything yet, since it cannot find users similar to them. It also recommends only items that like-minded users have enjoyed, so the more popular items, especially those well liked within the community as a whole, rise to the top of the recommendations.
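A minimal sketch of user-based collaborative filtering follows. The ratings, the user and movie names, the cosine similarity over co-rated items, and the single nearest-neighbour rule are all assumptions chosen to keep the example short; production systems typically combine many neighbours and normalize ratings.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "A": {"alien": 5, "heat": 4, "up": 1},
    "B": {"alien": 5, "heat": 5, "up": 1, "blade_runner": 5},
    "C": {"alien": 1, "heat": 2, "up": 5, "frozen": 5},
}


def user_similarity(r1, r2):
    """Cosine similarity computed over the items both users have rated."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = sqrt(sum(r1[i] ** 2 for i in common))
    norm2 = sqrt(sum(r2[i] ** 2 for i in common))
    return dot / (norm1 * norm2)


def recommend(target, ratings, min_rating=4):
    """Recommend items the most similar user rated highly and the target has not seen."""
    others = [user for user in ratings if user != target]
    if not others:
        return []
    best_score, nearest = max(
        (user_similarity(ratings[target], ratings[user]), user) for user in others
    )
    if best_score == 0.0:
        # Cold start: no overlap with any other user, so nothing to go on.
        return []
    return sorted(
        item
        for item, rating in ratings[nearest].items()
        if rating >= min_rating and item not in ratings[target]
    )


print(recommend("A", RATINGS))
```

Adding a user with an empty rating dictionary and calling `recommend` for that user returns an empty list, which is the cold start problem in miniature.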
Hybrid systems utilize both of these techniques to overcome their shortcomings.
Other than these, non-personalized recommendations were also common with websites showing the most popular items without accounting for the particular user viewing. Another way of non-personalized recommendations was to show the average ratings of items (Amazon) or the average rating for sellers (eBay). Manual searches, such as searching based on keywords, or filtering categories, were utilized on almost every website.
Another aspect to consider is the recommendation interface. After finding certain recommendations, how they are displayed to the user holds great importance. It was common to see practices similar to those today, such as showing similar items on a product page, sending a newsletter via email for new or unrated products the user might be interested in, displaying average ratings for products and sellers, etc. Business and marketing were usually the basis of these choices, with the added capabilities of the Internet.
The algorithms described above had many shortcomings as will be discussed later. Algorithms usually ran offline in batches since data collection was harder and real-time predictions were much less common. However, as the number of customers these websites attracted grew, the algorithms became harder to scale. Also, new players joined the field that wanted to make use of recommendations, such as social media platforms.
Period of Growth
A 2003 paper by Greg Linden, Brent Smith, and Jeremy York from Amazon made accurate recommendations practical at scale. The paper introduced item-to-item collaborative filtering, in which items bought together are considered similar. Computing these item similarities is expensive, but it can be done offline, and the result can be stored as a lookup table. The online computation is therefore quite fast and performs well.
This item-to-item collaborative filtering approach is still in use today, and the paper has been selected by the IEEE Internet Computing committee as the article that stood the “Test of Time”.
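The split between an expensive offline step and a cheap online lookup can be sketched as follows. This is a simplified illustration under stated assumptions: the baskets are invented, and raw co-purchase counts stand in for the similarity measure, which is not necessarily what the paper uses.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical purchase histories: one set of item IDs per customer.
BASKETS = [
    {"camera", "tripod", "sd_card"},
    {"camera", "sd_card"},
    {"camera", "tripod"},
    {"novel", "bookmark"},
]


def build_similarity_table(baskets):
    """Offline step: count how often each pair of items is bought together."""
    co_counts = defaultdict(lambda: defaultdict(int))
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    # Store, for each item, its neighbours ordered by co-purchase count
    # (ties broken alphabetically so the table is deterministic).
    return {
        item: sorted(neighbours, key=lambda n: (-neighbours[n], n))
        for item, neighbours in co_counts.items()
    }


def recommend(item, table, top_n=2):
    """Online step: a single dictionary lookup, independent of catalog size."""
    return table.get(item, [])[:top_n]


SIMILAR_ITEMS = build_similarity_table(BASKETS)
print(recommend("camera", SIMILAR_ITEMS))
```

All pairwise work happens in `build_similarity_table`, which could run as a batch job; serving a recommendation only touches the precomputed table.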
Then came the contest now known as the Netflix Prize. In 2006, Netflix announced that it would award a million dollars for an algorithm that improved the accuracy of its existing system by 10%. The company made an enormous amount of data public, consisting of 6 years’ worth of data: more than 100 million ratings from about 480,000 users.
The competition ended after three years. The BellKor’s Pragmatic Chaos team won the prize by combining numerous algorithms into a layered, multi-scale model that reached a 10.09% improvement over Netflix’s CineMatch algorithm.
This competition boosted the popularity of recommendation systems and popularized Singular Value Decomposition (SVD)-based matrix factorization algorithms. SVD-based matrix factorization made it possible to infer latent factors from ratings. Matrix factorization is still seen as a very beneficial addition to collaborative filtering and is used in numerous settings.
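To make the idea of latent factors concrete, here is a small sketch of matrix factorization. Note the substitution: rather than computing a textbook decomposition, it learns the factors with regularized stochastic gradient descent over the observed ratings only, a common way of handling a rating matrix that is mostly empty. The toy ratings, the hyperparameters, and all function names are assumptions for illustration.

```python
import random

# Hypothetical observed ratings as (user_index, item_index, rating) triples.
RATINGS = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 5.0), (1, 1, 5.0), (1, 3, 1.0),
    (2, 2, 5.0), (2, 3, 4.0), (2, 0, 1.0),
    (3, 2, 4.0), (3, 3, 5.0), (3, 1, 1.0),
]


def predict(user_vector, item_vector):
    """Predicted rating is the dot product of the two latent-factor vectors."""
    return sum(p * q for p, q in zip(user_vector, item_vector))


def factorize(ratings, n_users, n_items, n_factors=2, epochs=200,
              learning_rate=0.01, regularization=0.02, seed=0):
    """Learn user and item factor vectors by SGD on the observed ratings."""
    rng = random.Random(seed)
    users = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_users)]
    items = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, rating in ratings:
            error = rating - predict(users[u], items[i])
            for f in range(n_factors):
                user_f, item_f = users[u][f], items[i][f]
                users[u][f] += learning_rate * (error * item_f - regularization * user_f)
                items[i][f] += learning_rate * (error * user_f - regularization * item_f)
    return users, items


def rmse(ratings, users, items):
    """Root-mean-square error over the given ratings."""
    squared = [(r - predict(users[u], items[i])) ** 2 for u, i, r in ratings]
    return (sum(squared) / len(squared)) ** 0.5


# Same seed, so epochs=0 returns exactly the starting point of the trained run.
initial_users, initial_items = factorize(RATINGS, 4, 4, epochs=0)
users, items = factorize(RATINGS, 4, 4)
rmse_before = rmse(RATINGS, initial_users, initial_items)
rmse_after = rmse(RATINGS, users, items)

# Estimate for a pair that was never observed: user 0, item 3.
estimate = predict(users[0], items[3])
print(rmse_before, rmse_after, estimate)
```

The learned vectors are the latent factors: nothing in the input says what they mean, yet users with similar tastes end up with similar vectors, which is what allows a prediction for an unobserved user and item pair.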
However, in the end, Netflix did not use the winning algorithm in production. The engineering effort required to bring it to the production stage was too great. Netflix had also grown considerably in those three years and had become an instant streaming platform.
Internet usage, online shopping, and instant streaming grew extremely fast within a few years, which brought concerns about scalability and performance to the forefront.
Newer Recommendation Approaches
Now, we are seeing more personalized content than ever before. One important reason is that hardware, and the computing systems built on it, have become cheaper. It is now possible to reserve computation for specific, personal cases, which was not feasible when hardware and resources in general were more expensive.
Traditional algorithms like collaborative filtering have been widely applied in industry, with modifications like matrix factorization. But serving fresh results from these algorithms at shorter intervals is still an issue, and time sensitivity is becoming more important. The Internet moves fast; popular content changes every day.
Numerous algorithms are emerging to deal with the extreme amount of data the companies have and the time constraints. Each use case presents its own unique problems which makes specialized solutions necessary.
Netflix has one of the biggest platforms, serving millions of people. 75% of views on the platform are driven by a recommendation. The rapid globalization of Netflix brought newer concerns for its recommendation systems, including the availability of a title in a certain region or country, cultural awareness and regional popularity, and language barriers and accessibility for movies. These concerns apply to other global companies as well. Amazon needs to keep track of stock in its global warehouses and of general item availability to avoid recommending unavailable products to customers. YouTube surfaces popular videos based on the location of the user.
It’s hard to get information about the details of the algorithms a particular company uses for their recommendations. However, there are new approaches that are known to be utilized by big companies.
Multi-Armed Bandit Approach
One of these approaches is the multi-armed bandit, which Netflix uses to balance exploration and exploitation. In multi-armed bandit algorithms, the aim is to maximize the reward earned from a sequence of decisions; here the reward can be the click-through rate or watch time of a set of personalized shows or movies. The algorithm must keep choosing between exploiting the option that has performed best so far and exploring options it knows less about. Applying it enables scoring and generating better recommendations for the user, and it can scale.
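The exploration and exploitation trade-off can be illustrated with epsilon-greedy, one of the simplest bandit strategies. This is a sketch under stated assumptions and says nothing about which bandit algorithm Netflix actually runs: the show names, the click probabilities, and the simulated environment are all hypothetical.

```python
import random


class EpsilonGreedyBandit:
    """Picks the best-looking arm most of the time, a random arm otherwise."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.pulls = {arm: 0 for arm in self.arms}
        self.mean_reward = {arm: 0.0 for arm in self.arms}

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)  # explore
        return max(self.arms, key=lambda arm: self.mean_reward[arm])  # exploit

    def update(self, arm, reward):
        """Incrementally update the running mean reward of the chosen arm."""
        self.pulls[arm] += 1
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / self.pulls[arm]


# Hypothetical click probabilities per title; the bandit never sees these directly.
TRUE_CLICK_RATE = {"show_a": 0.05, "show_b": 0.10, "show_c": 0.30}


def simulate(rounds=5000, seed=0):
    bandit = EpsilonGreedyBandit(TRUE_CLICK_RATE, epsilon=0.1, seed=seed)
    environment = random.Random(seed + 1)
    for _ in range(rounds):
        arm = bandit.select()
        clicked = environment.random() < TRUE_CLICK_RATE[arm]
        bandit.update(arm, 1.0 if clicked else 0.0)
    return bandit


bandit = simulate()
print(bandit.pulls)
print(bandit.mean_reward)
```

The `epsilon` parameter controls the trade-off: a larger value spends more impressions learning about uncertain titles, a smaller one commits sooner to the current favourite.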
Deep Neural Networks Approach
YouTube employs a different approach, using deep neural networks to solve two problems: candidate generation and ranking. Given YouTube’s scale and dynamic content, it needs a robust method to recommend relevant videos to each user. In the candidate generation step, recommendation is treated as extreme multiclass classification to select a subset of videos that the user might be interested in. In the ranking step, these candidates are scored and shown to the user as a sorted list. Deep neural networks are used in both steps.
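The two-stage structure itself, a cheap pass over the whole catalog followed by a richer pass over a short list, can be shown without any neural network. In the sketch below, simple dot products stand in for the learned models, and the embeddings, the freshness feature, and its weight are invented for illustration; only the pipeline shape reflects the description above.

```python
# Hypothetical embeddings; in a real system these would be learned by neural networks.
VIDEO_EMBEDDINGS = {
    "cooking_101": (0.9, 0.1),
    "knife_skills": (0.8, 0.2),
    "guitar_basics": (0.1, 0.9),
    "jazz_chords": (0.2, 0.8),
    "cat_compilation": (0.5, 0.5),
}

# Hypothetical extra feature used only at ranking time (1.0 means newest).
FRESHNESS = {
    "cooking_101": 0.1,
    "knife_skills": 0.9,
    "guitar_basics": 0.5,
    "jazz_chords": 0.5,
    "cat_compilation": 0.2,
}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(user_embedding, video_embeddings, k=3):
    """Stage 1: cheap scoring over the whole corpus, keeping only the top k."""
    scored = sorted(
        video_embeddings,
        key=lambda video: (-dot(user_embedding, video_embeddings[video]), video),
    )
    return scored[:k]


def rank(candidates, user_embedding, video_embeddings, freshness, freshness_weight=0.5):
    """Stage 2: richer scoring, affordable because the candidate set is small."""
    def score(video):
        relevance = dot(user_embedding, video_embeddings[video])
        return relevance + freshness_weight * freshness[video]

    return sorted(candidates, key=lambda video: (-score(video), video))


user = (1.0, 0.0)  # hypothetical user who mostly watches cooking videos
candidates = generate_candidates(user, VIDEO_EMBEDDINGS)
ranked = rank(candidates, user, VIDEO_EMBEDDINGS, FRESHNESS)
print(candidates)
print(ranked)
```

Splitting the work this way lets the first stage stay simple enough to scan a very large catalog, while the second stage can use additional features because it only scores a handful of videos.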
Deep Learning with DSSTNE
Amazon also open sourced the Deep Scalable Sparse Tensor Network Engine, DSSTNE, for building deep learning models for recommendations. Their approach offers better scalability and greater resource utilization leading to better performance.
Continued Advancements
Other than new approaches, deploying machine learning models for recommendations is getting easier with machine learning platforms of cloud providers like Azure ML Studio, Google Cloud Compute Engine, and AWS. Because of the increased accessibility and ready-to-use models in these platforms, recommendation systems are easier to implement and deploy.
As for the recommendation interfaces, they have evolved quite a lot, with recommendations taking place in multiple locations on a website. A fascinating example of this is the artwork personalization at Netflix. They generate different stills to represent each show or movie based on the user information that they have and display different stills to different users.
The Future of Recommendation Systems
Recommendation systems still face a multitude of problems that range from accuracy and scale to user privacy and compliance with regulations. On the privacy side, users can enable or disable recommendations, request that their data not be collected for advertising purposes, and request that their data be deleted.
Further challenges come from social implications. One of these is the concept of filter bubbles. Recommendation systems can greatly reinforce personal opinions by not showing content outside of one’s point of view. They need to be designed so that they do not narrow a person’s perspective.
Another issue is the perceived radicalization on platforms like YouTube. More extreme and eye-catching videos are usually recommended more, and these can sometimes lead to highly radical content.
Recommender systems are still developing, and as popular as they are, they have come under even more scrutiny. We have no doubt that they will continue evolving toward a more personalized and hopefully more socially responsible direction.