Machine Learning / Recommender Systems

Let $Y \in R^{m \times k}$ be the matrix of ratings of $m$ products given by $k$ users. Assign a feature vector $x^{(i)} \in R^n$ to the i-th product.

By minimizing a linear regression cost function, find a parameter vector $\theta^{(j)} \in R^n$ for each user so that

$$y_{predicted}^{(i,j)} = (\theta^{(j)})^T x^{(i)}$$

gives the predicted rating of the j-th user for the i-th product.
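As a minimal sketch of the prediction formula above: if the feature vectors are stacked as rows of a matrix `X` (m x n) and the user parameters as rows of `Theta` (k x n), all predictions can be computed at once as `X @ Theta.T`. The toy values below are illustrative assumptions and are not taken from the data set.

```python
import numpy as np

# Toy example (hypothetical values): m = 3 products, k = 2 users, n = 2 features.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])      # row i is x^(i)
Theta = np.array([[5.0, 1.0],
                  [2.0, 4.0]])  # row j is theta^(j)

# Entry (i, j) equals (theta^(j))^T x^(i).
Y_predicted = X @ Theta.T
print(Y_predicted)
```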

In [1]:
# import libs
import sys
sys.path.append("../")

import matplotlib.pyplot as plt
import numpy as np
import scipy.io
import re
from scipy import optimize
from collaborative_filtering import *
In [2]:
def load_movie_ratings_data():
    return scipy.io.loadmat('../data/ex8_movies.mat')
    

movie_ratings = load_movie_ratings_data()

# (m x k) movie ratings
y = movie_ratings['Y']

# binary (m x k) matrix indicating if user rated movie
r = movie_ratings['R']

print("Y", y.shape)
print("R", r.shape)
Y (1682, 943)
R (1682, 943)

Collaborative Filtering

Minimize the regularized linear regression cost function jointly over $X$ and $\Theta$:

$$J(x^{(1)}, \dots, x^{(m)}, \theta^{(1)}, \dots, \theta^{(k)}) = \frac{1}{2} \sum_{(i,j):r(i,j)=1}{((\theta^{(j)})^T x^{(i)} - y^{(i,j)})^2} + \frac{\lambda}{2} \sum_{j=1}^{k} \sum_{l=1}^{n} (\theta^{(j)}_l)^2 + \frac{\lambda}{2} \sum_{i=1}^{m} \sum_{l=1}^{n} (x^{(i)}_l)^2$$

Here $r(i, j)=1$ when the j-th user has rated the i-th product, and $\lambda$ is the regularization parameter.
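The cost can be evaluated in vectorized form by masking the prediction errors with the rating indicator matrix. The following is a minimal sketch under the assumption that `X` is (m x n), `Theta` is (k x n), and `Y`, `R` are (m x k); the function name `cofi_cost` is hypothetical and is not necessarily what the `collaborative_filtering` module provides.

```python
import numpy as np

def cofi_cost(X, Theta, Y, R, lam):
    """Regularized collaborative filtering cost (illustrative sketch).

    X: (m, n) product features, Theta: (k, n) user parameters,
    Y: (m, k) ratings, R: (m, k) mask with 1 where a rating exists.
    """
    errors = (X @ Theta.T - Y) * R  # unrated entries contribute nothing
    return (0.5 * np.sum(errors ** 2)
            + 0.5 * lam * np.sum(Theta ** 2)
            + 0.5 * lam * np.sum(X ** 2))

# Toy data (hypothetical): 2 products, 2 users, 2 features.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
Theta = np.array([[2.0, 0.0], [0.0, 3.0]])
Y = np.array([[1.0, 5.0], [0.0, 3.0]])
R = np.array([[1, 0], [0, 1]])

cost_unregularized = cofi_cost(X, Theta, Y, R, lam=0.0)
cost_regularized = cofi_cost(X, Theta, Y, R, lam=1.0)
print(cost_unregularized, cost_regularized)
```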

Gradients with respect to the $l$-th component ($l = 1, \dots, n$) of each parameter and feature vector:

$$\frac{\partial J}{\partial \theta^{(j)}_l} = \sum_{i:r(i,j)=1}{((\theta^{(j)})^T x^{(i)} - y^{(i,j)})} x^{(i)}_l + \lambda \theta^{(j)}_l$$

$$\frac{\partial J}{\partial x^{(i)}_l} = \sum_{j:r(i,j)=1}{((\theta^{(j)})^T x^{(i)} - y^{(i,j)})} \theta^{(j)}_l + \lambda x^{(i)}_l$$
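In matrix form, both gradients follow from the masked error matrix. The sketch below computes them and compares one entry of each against a central finite difference of the cost. The function names and the random toy data are assumptions made for illustration only.

```python
import numpy as np

def cost(X, Theta, Y, R, lam):
    errors = (X @ Theta.T - Y) * R
    return 0.5 * np.sum(errors ** 2) + 0.5 * lam * (np.sum(Theta ** 2) + np.sum(X ** 2))

def cofi_gradients(X, Theta, Y, R, lam):
    errors = (X @ Theta.T - Y) * R           # (m, k), zero where unrated
    X_grad = errors @ Theta + lam * X        # (m, n): sum over users j
    Theta_grad = errors.T @ X + lam * Theta  # (k, n): sum over products i
    return X_grad, Theta_grad

rng = np.random.default_rng(0)
m, k, n = 4, 3, 2
X = rng.normal(size=(m, n))
Theta = rng.normal(size=(k, n))
Y = rng.integers(1, 6, size=(m, k)).astype(float)
R = (rng.random((m, k)) > 0.3).astype(float)
lam = 1.5

X_grad, Theta_grad = cofi_gradients(X, Theta, Y, R, lam)

# Central-difference check on a single entry of Theta and of X.
eps = 1e-6
Theta_plus, Theta_minus = Theta.copy(), Theta.copy()
Theta_plus[1, 0] += eps
Theta_minus[1, 0] -= eps
numeric_theta = (cost(X, Theta_plus, Y, R, lam) - cost(X, Theta_minus, Y, R, lam)) / (2 * eps)

X_plus, X_minus = X.copy(), X.copy()
X_plus[2, 1] += eps
X_minus[2, 1] -= eps
numeric_x = (cost(X_plus, Theta, Y, R, lam) - cost(X_minus, Theta, Y, R, lam)) / (2 * eps)

print(abs(numeric_theta - Theta_grad[1, 0]), abs(numeric_x - X_grad[2, 1]))
```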

Movie Rating Prediction

Use the collaborative filtering algorithm to recommend movies to a new user.
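To illustrate what fitting $X$ and $\Theta$ involves, here is a minimal sketch that minimizes the cost with plain gradient descent on small random data. This is an assumption made for illustration: the notebook's `find_new_parameters` helper may well use a different optimizer (for example one from `scipy.optimize`), and the sizes, learning rate, and iteration count below are arbitrary.

```python
import numpy as np

def cost_and_gradients(X, Theta, Y, R, lam):
    errors = (X @ Theta.T - Y) * R
    cost = 0.5 * np.sum(errors ** 2) + 0.5 * lam * (np.sum(X ** 2) + np.sum(Theta ** 2))
    return cost, errors @ Theta + lam * X, errors.T @ X + lam * Theta

rng = np.random.default_rng(42)
m, k, n = 6, 5, 3                               # toy sizes (hypothetical)
Y = rng.integers(1, 6, size=(m, k)).astype(float)
R = (rng.random((m, k)) > 0.4).astype(float)    # 1 where a rating exists
X = 0.1 * rng.normal(size=(m, n))               # small random initialization
Theta = 0.1 * rng.normal(size=(k, n))

lam, learning_rate = 1.0, 0.01
initial_cost, _, _ = cost_and_gradients(X, Theta, Y, R, lam)

for _ in range(500):
    # Compute both gradients first, then update X and Theta simultaneously.
    _, X_grad, Theta_grad = cost_and_gradients(X, Theta, Y, R, lam)
    X -= learning_rate * X_grad
    Theta -= learning_rate * Theta_grad

final_cost, _, _ = cost_and_gradients(X, Theta, Y, R, lam)
print(initial_cost, final_cost)
```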

In [3]:
# load movie names database
def load_movies():
    movies = []
    # raw string avoids an invalid-escape warning for \s
    p = re.compile(r"\s+(.*)\n")
    with open('../data/movie_ids.txt', encoding="iso-8859-1") as f:
        for line in f:
            movies.append(p.search(line).group(1))
            
    return movies


movies = load_movies()
In [4]:
# give ratings to some movies of the data set
my_ratings = np.zeros((y.shape[0], 1))  # m x 1
my_ratings[0] = 4  # Toy Story (1995)
my_ratings[97] = 2  # Silence of the Lambs (1991)
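my_ratings[6] = 3  # Twelve Monkeys (1995)
my_ratings[11] = 5  # Usual Suspects, The (1995)
my_ratings[53] = 4  # Outbreak (1995)
my_ratings[63] = 5  # Shawshank Redemption, The (1994)
my_ratings[65] = 3  # While You Were Sleeping (1995)
my_ratings[68] = 5  # Forrest Gump (1994)
my_ratings[182] = 4  # Alien (1979)
my_ratings[225] = 5  # Die Hard 2 (1990)
my_ratings[354] = 5  # Sphere (1998)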

for m in np.argwhere(my_ratings > 0)[:,0]:
    print(my_ratings[m][0], movies[m])
4.0 Toy Story (1995)
3.0 Twelve Monkeys (1995)
5.0 Usual Suspects, The (1995)
4.0 Outbreak (1995)
5.0 Shawshank Redemption, The (1994)
3.0 While You Were Sleeping (1995)
5.0 Forrest Gump (1994)
2.0 Silence of the Lambs, The (1991)
4.0 Alien (1979)
5.0 Die Hard 2 (1990)
5.0 Sphere (1998)
In [5]:
# add new ratings to Y and R matrices
y = np.hstack((y, my_ratings))
r = np.hstack((r, my_ratings > 0))

y_means, y_norm = mean_normalize_variables(y, r)

# find new set of Theta and X for each user and movie
new_theta, new_x = find_new_parameters(y_norm, r)

print("Theta", new_theta.shape)
print("X", new_x.shape)
Theta (944, 10)
X (1682, 10)
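Mean normalization subtracts each movie's average rating, computed over rated entries only, so that a user with no ratings is predicted the movie mean rather than zero. The following is one common way to do this, written as a sketch; it is an assumption about what a helper such as `mean_normalize_variables` might do, not its actual implementation, and the name `mean_normalize_sketch` is hypothetical.

```python
import numpy as np

def mean_normalize_sketch(Y, R):
    """Subtract per-movie means computed over rated entries only."""
    R = R.astype(bool)
    counts = R.sum(axis=1, keepdims=True)                  # ratings per movie
    sums = np.where(R, Y, 0.0).sum(axis=1, keepdims=True)
    means = sums / np.maximum(counts, 1)                   # avoid division by zero
    Y_norm = np.where(R, Y - means, 0.0)                   # unrated entries stay 0
    return means, Y_norm

# Toy data (hypothetical): 3 movies, 3 users, 0 meaning "not rated".
Y = np.array([[5.0, 3.0, 0.0],
              [0.0, 0.0, 0.0],
              [4.0, 0.0, 2.0]])
R = Y > 0

means, Y_norm = mean_normalize_sketch(Y, R)
print(means.ravel())
print(Y_norm)
```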

Find the top 20 recommendations for the new user.

In [7]:
# take the new user's parameters (last row of Theta) as an (n x 1) column vector
my_coefficients = new_theta[-1, :].reshape((new_theta.shape[1], 1))

recommended_movies = find_recommended_movies(new_x, my_coefficients, y_means)
for index, rating in recommended_movies[0:20,:]:
    print(round(rating, 1), movies[int(index)])
5.0 Great Day in Harlem, A (1994)
5.0 Aiqing wansui (1994)
5.0 Someone Else's America (1995)
5.0 Prefontaine (1997)
5.0 Marlene Dietrich: Shadow and Light (1996) 
5.0 Santa with Muscles (1996)
5.0 Saint of Fort Washington, The (1993)
5.0 They Made Me a Criminal (1939)
5.0 Entertaining Angels: The Dorothy Day Story (1996)
5.0 Star Kid (1997)
4.6 Pather Panchali (1955)
4.6 Star Wars (1977)
4.6 Shawshank Redemption, The (1994)
4.5 Maya Lin: A Strong Clear Vision (1994)
4.5 Anna (1996)
4.5 Wrong Trousers, The (1993)
4.5 Some Mother's Son (1996)
4.5 Everest (1998)
4.5 Schindler's List (1993)
4.5 Raiders of the Lost Ark (1981)
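The ranking step can be sketched as follows: predict a rating for every movie, add back the per-movie mean removed during normalization, and sort in descending order. This is an illustrative assumption about how a helper like `find_recommended_movies` could work; the function name `rank_movies_sketch` and the toy values are hypothetical.

```python
import numpy as np

def rank_movies_sketch(X, theta, means):
    """Return rows of (movie index, predicted rating), best first."""
    predictions = X @ theta + means          # (m, 1): add the means back
    order = np.argsort(-predictions[:, 0])   # indices sorted by descending rating
    return np.column_stack((order, predictions[order, 0]))

# Toy data (hypothetical): 3 movies, 2 features.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
theta = np.array([[0.5], [-0.5]])            # the new user's parameters
means = np.array([[3.0], [4.5], [2.0]])      # per-movie mean ratings

ranked = rank_movies_sketch(X, theta, means)
for index, rating in ranked[:2, :]:
    print(round(rating, 1), int(index))
```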