Recommendation systems are deployed in applications related to e-learning, online shopping, social media, etc., to provide personalized experiences to their users while harnessing the power of big data and machine learning.
Recommendation systems are broadly categorized into the following three types:

- Content-based filtering
- Collaborative filtering
  - User-based collaborative filtering
  - Item-based collaborative filtering
- Hybrid filtering
Collaborative filtering considers a user’s past behavior and builds a recommendation model on top of it. For any target user, this past behavior can be exploited in two ways: through user-user similarity or item-item similarity.
User-based collaborative filtering: This technique identifies users who have similar tastes to the target user and suggests items that similar users have enjoyed.
Item-based collaborative filtering: This method finds items that share similarities with the ones the target user has shown interest in and then suggests those similar items to the user.
Let’s suppose there is a dataset of user-item interactions where users have rated movies on a scale of 1 (lowest) to 5 (highest). The data looks as shown in the following table.
| | Movie1 | Movie2 | Movie3 | Movie4 |
|---|---|---|---|---|
| User1 | 4 | 5 | 3 | - |
| User2 | - | 3 | - | 4 |
| User3 | 5 | 4 | - | 2 |
In this matrix, each row represents a user, and each column represents a movie. The users have provided the ratings, and “-” indicates that a user hasn’t rated a particular movie.
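As a sketch, the table above can also be held in a pandas `DataFrame`, with `NaN` standing in for the missing “-” ratings (the variable names here are illustrative):

```python
import numpy as np
import pandas as pd

# User-item interaction matrix from the table; NaN marks an unrated movie
ratings = pd.DataFrame(
    {
        "Movie1": [4, np.nan, 5],
        "Movie2": [5, 3, 4],
        "Movie3": [3, np.nan, np.nan],
        "Movie4": [np.nan, 4, 2],
    },
    index=["User1", "User2", "User3"],
)
print(ratings)
```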
The next step for item-based collaborative filtering is to calculate the similarity between items. One common similarity metric is cosine similarity.
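As a quick sanity check, the cosine similarity between two movies can be computed directly from their rating columns (treating the missing “-” ratings as 0, as the full implementation later in this Answer also does):

```python
import numpy as np

# Rating columns for Movie1 and Movie2 from the table, with "-" treated as 0
movie1 = np.array([4, 0, 5])
movie2 = np.array([5, 3, 4])

# Cosine similarity: dot product divided by the product of the vector norms
cos_sim = movie1 @ movie2 / (np.linalg.norm(movie1) * np.linalg.norm(movie2))
print(round(cos_sim, 2))  # 0.88
```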
The similarity calculations for all pairs of movies result in an item similarity matrix like the one shown below.
| | Movie1 | Movie2 | Movie3 | Movie4 |
|---|---|---|---|---|
| Movie1 | 1 | 0.88 | 0.62 | 0.35 |
| Movie2 | 0.88 | 1 | 0.71 | 0.63 |
| Movie3 | 0.62 | 0.71 | 1 | 0 |
| Movie4 | 0.35 | 0.63 | 0 | 1 |
Finally, to make recommendations for a user, we take their ratings and multiply them by the respective similarity scores. We then sum up the weighted scores and divide by the sum of the absolute similarities.
For example, if we want to recommend movies for User2, who hasn’t seen Movie1 and Movie3, we’ll use the following formula:

$$\hat{r}_{u,i} = \frac{\sum_{j \in R_u} \text{sim}(i, j) \cdot r_{u,j}}{\sum_{j \in R_u} \left|\text{sim}(i, j)\right|}$$

Where:

- $\hat{r}_{u,i}$ is the predicted rating of user $u$ for the unseen item $i$.
- $R_u$ is the set of items user $u$ has already rated.
- $r_{u,j}$ is the rating user $u$ gave to item $j$.
- $\text{sim}(i, j)$ is the cosine similarity between items $i$ and $j$.

Based on the provided ratings and the item similarity matrix, the predicted ratings for User2 will be:
Movie1: ≈ 3.3
Movie3: ≈ 3.0
Based on the predicted ratings, we can now provide recommendations for User2 by ranking the movies in order of their predicted ratings and suggesting the top ones.
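Plugging in the numbers for Movie1 as a quick check (ratings and similarities taken from the tables above):

```python
# Predicted rating of Movie1 for User2: a similarity-weighted average of
# User2's ratings for Movie2 (3) and Movie4 (4)
sims = [0.88, 0.35]   # sim(Movie1, Movie2), sim(Movie1, Movie4)
ratings = [3, 4]      # User2's ratings for Movie2 and Movie4

pred_movie1 = sum(s * r for s, r in zip(sims, ratings)) / sum(abs(s) for s in sims)
print(round(pred_movie1, 1))  # 3.3
```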
```python
import numpy as np
import pandas as pd

# Step 1: Create the user-item interaction matrix (0 marks an unrated movie)
interaction_matrix = np.array([
    [4, 5, 3, 0],
    [0, 3, 0, 4],
    [5, 4, 0, 2]
])

# Step 2: Calculate item similarities (cosine similarity)
def cosine_similarity(item_matrix):
    norm = np.linalg.norm(item_matrix, axis=0)
    similarity_matrix = np.dot(item_matrix.T, item_matrix) / (norm[:, None] * norm[None, :])
    np.fill_diagonal(similarity_matrix, 0)  # Set diagonal elements to 0 to avoid self-similarity
    return similarity_matrix

# Step 3: Generate recommendations for a specific user
def generate_recommendations(user_ratings, item_similarity_matrix):
    # Initialize arrays for the weighted sum and the absolute similarity sum
    weighted_sum = np.zeros(item_similarity_matrix.shape[0])
    abs_similarity_sum = np.zeros(item_similarity_matrix.shape[0])

    # Iterate through each item
    for item_id, rating in enumerate(user_ratings):
        if rating != 0:  # Only items the user has actually rated contribute
            # Find similar items (non-zero similarity) and their similarity scores
            similar_items = np.where(item_similarity_matrix[item_id] != 0)[0]
            sim_scores = item_similarity_matrix[item_id, similar_items]

            # Update the weighted sum and the absolute similarity sum
            weighted_sum[similar_items] += rating * sim_scores
            abs_similarity_sum[similar_items] += np.abs(sim_scores)

    # Calculate the final predicted ratings
    recommendations = np.zeros_like(user_ratings, dtype=float)
    non_zero_indices = np.where(abs_similarity_sum != 0)[0]
    recommendations[non_zero_indices] = (
        weighted_sum[non_zero_indices] / abs_similarity_sum[non_zero_indices]
    )

    # Exclude items already rated by the user
    recommendations[user_ratings != 0] = 0
    return recommendations

# Step 4: Provide recommendations for User2
user2_ratings = interaction_matrix[1]  # User2's ratings
item_similarity_matrix = cosine_similarity(interaction_matrix)
user2_recommendations = generate_recommendations(user2_ratings, item_similarity_matrix)

# Display the non-zero recommendations
movie_names = ["Movie1", "Movie2", "Movie3", "Movie4"]
non_zero_recommendations = pd.DataFrame({
    "Movie": [movie_names[i] for i in np.where(user2_recommendations != 0)[0]],
    "PredictedRating": user2_recommendations[user2_recommendations != 0]
})
print("Movie Recommendations for User2")
print(non_zero_recommendations)
```
- We define an interaction matrix where every row represents a user and every column represents a movie; a 0 stands in for a missing rating.
- The `cosine_similarity()` function calculates the item similarities as described earlier, except that the diagonal elements are set to 0 instead of 1. Zeroing the diagonal avoids self-similarity, ensuring that an item is never considered similar to itself in collaborative filtering. Therefore, it improves the quality of recommendations for unseen items.
- The `generate_recommendations()` function takes the user’s ratings and the item similarity matrix, iterates through each item the user has interacted with, accumulates the weighted sum and the absolute similarity sum over similar items, and generates predicted ratings by dividing the weighted sum by the absolute similarity sum, excluding items already rated by the user.
- Finally, we calculate the recommendations for User2 (as an example) using the functions created above and display the predicted ratings of the unseen movies.
This Answer gives a simple overview without diving into all the details. Real applications can be improved by using more data, applying advanced techniques, and considering additional features. The true strength of item-based collaborative filtering comes from continually refining and optimizing the technique.